These are chat archives for dropbox/pyston

Jul 2015
Rudi Chen
Jul 08 2015 00:46
Increasing the slot sizes of an IC of a hot function can cause a 1-2% performance drop apparently.
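The effect described above (more IC slots can mean a small slowdown) can be illustrated with a toy polymorphic inline cache. This is a hedged sketch, not Pyston's actual implementation: the class, method names, and slot policy are invented for illustration. The point it shows is that each extra slot adds another guard check that a miss must scan past before reaching the slow path.

```python
class InlineCache:
    """Toy polymorphic inline cache: caches (type -> handler) pairs
    in a fixed number of slots, scanned in order on each lookup."""

    def __init__(self, num_slots):
        self.slots = []            # filled (type, handler) pairs
        self.num_slots = num_slots

    def lookup(self, obj, slow_path):
        t = type(obj)
        # Fast path: scan the slots. With more slots, a miss pays
        # for more comparisons before it falls back to the slow path.
        for cached_type, handler in self.slots:
            if cached_type is t:
                return handler(obj)
        # Miss: compute a handler via the slow path, fill a free slot.
        handler = slow_path(t)
        if len(self.slots) < self.num_slots:
            self.slots.append((t, handler))
        return handler(obj)

# Hypothetical usage: caching an attribute access per receiver type.
get_x = lambda t: (lambda o: o.x)

class A: x = 1
class B: x = 2

ic = InlineCache(num_slots=2)
print(ic.lookup(A(), get_x))  # → 1 (miss, fills a slot)
print(ic.lookup(A(), get_x))  # → 1 (hit)
```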
Jul 08 2015 01:35
Do you guys have any sort of ballpark idea of what your upper speed limit will be? Or is it all still uncertain?
I guess I am asking if you think you can beat pypy or v8, or if you have run into any fundamental limitations of your approach yet
Or if there is any sort of point that dropbox has decided is the goal and hitting that point is a project success
Kevin Modzelewski
Jul 08 2015 02:01
those are good questions :)
I'd guess that the "is it worth the risk" threshold is something like a 50% speedup
so if we can hit that then I would guess we would move services onto Pyston, which would make the project a success in my eyes :)
I think we can do better than a 50% speedup though
but we're not that close to that right now, so it's hard to say anything for sure
just getting to CPython's speed is quite a challenge
Jul 08 2015 03:23
Thanks for the reply. It is really a tough problem for sure.
Hopefully, the faster you get, the more companies will put money and time into it.
So you can get good patches for free :)
Kevin Modzelewski
Jul 08 2015 22:48
@undingen the "tiering overhead" breaks down as: 17% (of cpython's time) doing irgen, 14% in interpreter, 12% in the rewriter
if we're not seeing huge gains from making the code better or faster to generate, maybe we could adjust the thresholds so that we spend less time in the interpreter / irgen?
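The threshold trade-off being proposed can be sketched as a simple counter-based tier-up: a function runs in the interpreter until its call count crosses a threshold, at which point it pays a one-time compile (irgen) cost. The threshold value, class, and method names here are invented for illustration and are not Pyston's actual tiering machinery; lowering the threshold trades interpreter time for more time spent in irgen, which is exactly the balance discussed above.

```python
JIT_THRESHOLD = 25  # illustrative value, not Pyston's actual setting

class Function:
    def __init__(self, code):
        self.code = code
        self.call_count = 0
        self.compiled = None  # set once we tier up

    def call(self):
        self.call_count += 1
        if self.compiled is None and self.call_count >= JIT_THRESHOLD:
            # Pay the (stand-in for irgen) compile cost exactly once.
            self.compiled = self.compile()
        if self.compiled is not None:
            return self.compiled()   # fast tier
        return self.interpret()      # slow tier

    def compile(self):
        # Stand-in for generating and optimizing machine code.
        return lambda: eval(self.code)

    def interpret(self):
        return eval(self.code)

f = Function("1 + 1")
results = [f.call() for _ in range(30)]
```

With a lower `JIT_THRESHOLD`, fewer calls run in the slow tier but short-lived functions pay a compile cost they may never amortize.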
Marius Wachtler
Jul 08 2015 22:57
ok I will experiment with the thresholds. Concerning the interpreter: it would be interesting to know how much of this gets spent actually interpreting and how much of it is setting up the interpreter before interpreting (or switching to the baseline JITed code)
oh I just tried disabling runtimeICs for the baseline jit:
pyston                 :    3.8s base_3: 3.8 (-1.5%)
pyston                      :    3.7s base_3: 3.6 (+4.2%)
pyston                  :    1.4s base_3: 1.6 (-17.8%)
Kevin Modzelewski
Jul 08 2015 23:05
out of 20 samples
8 were in runtime code (mostly importing stuff)
3 were in the baseline jit
9 were in interpreter overhead
where the interpreter overhead was mostly actively interpreting, but a couple of samples were setting up the interpreter (e.g. initArguments)
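The sample breakdown above (8 runtime, 3 baseline JIT, 9 interpreter, out of 20) amounts to bucketing profiler samples by phase. A trivial sketch of that aggregation, with the counts taken from the message and the code itself purely illustrative:

```python
from collections import Counter

# The 20 samples described above, bucketed by where they landed.
samples = ["runtime"] * 8 + ["baseline_jit"] * 3 + ["interpreter"] * 9

counts = Counter(samples)
for phase, n in counts.most_common():
    print(f"{phase}: {n}/{len(samples)} samples ({100 * n // len(samples)}%)")
```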
Marius Wachtler
Jul 08 2015 23:28
ok thanks :-)
Kevin Modzelewski
Jul 08 2015 23:32
I guess those were samples taken at the beginning of the benchmark, which would be more import-heavy