These are chat archives for dropbox/pyston

3rd
Jul 2016
Nathaniel J. Smith
@njsmith
Jul 03 2016 00:35
I understand stuff happens
hope everything is well
Tristan Hume
@trishume
Jul 03 2016 20:43
Hi, I noticed the recent blog post on HN (I was the original commenter that mentioned LuaJIT) and I had an idea for Pyston that I think might be handy. Not sure if it's been discussed before.
What about a mode for Pyston that disables threading, or an annotation that stops other threads from being scheduled in certain loops? This would allow tons of optimizations, like hoisting method hash table lookups out of loops, since those tables could no longer be mucked with from other threads. This could be a huge win for mostly-serial code, and if it is 4-8x faster it could be even better than parallelizing slow code.
LuaJIT manages to compile such incredible inner loops and hoist all the dynamic language stuff out of them because Lua states are independent so it doesn't have to worry about things changing in the middle of a loop.
@kmod
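To make the hoisting hazard concrete, here is a minimal sketch in plain Python. It is not Pyston or LuaJIT code, and every name in it is hypothetical. A generator stands in for "another thread" that rebinds a method in the middle of the loop, so the run is deterministic. The hoisted loop keeps calling the old method, which is exactly why a compiler cannot hoist the lookup unless it knows the table will not change (or can be told when it does).

```python
def make_counter_class():
    # A fresh class per run, so patching it in one run does not affect the next.
    class Counter:
        def __init__(self):
            self.total = 0

        def add(self, value):
            self.total += value

    return Counter


def values_that_patch(cls):
    yield 1
    # Stands in for another thread rebinding the method mid-loop.
    cls.add = lambda self, value: None
    yield 2


def run_unhoisted():
    cls = make_counter_class()
    counter = cls()
    for v in values_that_patch(cls):
        counter.add(v)  # method looked up again on every iteration
    return counter.total


def run_hoisted():
    cls = make_counter_class()
    counter = cls()
    add = counter.add  # lookup hoisted out of the loop by hand
    for v in values_that_patch(cls):
        add(v)  # keeps calling the original method
    return counter.total


if __name__ == "__main__":
    print(run_unhoisted())  # 1: the second call hits the patched no-op
    print(run_hoisted())    # 3: the hoisted loop never sees the patch
```

The two loops disagree only because the class was modified during the loop; without that modification they compute the same result, and the hoisted version does less work per iteration.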
Chris Seaton
@chrisseaton
Jul 03 2016 20:50
You can do all that hoisting anyway, even in the presence of other threads, if you use safe points to stop the world when other threads invalidate your optimisations. I think the overhead of safe points like that is almost statistically insignificant.
This is what VMs like HotSpot do
Tristan Hume
@trishume
Jul 03 2016 20:54
Oh hi Chris, yah that's true. How does HotSpot deal with state changes between safe points? Say I have global.hash.x = 5 compiled down to a direct store, without the global hash table lookup, followed by a safepoint. What if I run global.hash = someotherthing just before the other thread stops at the safepoint? It will have modified a hash that it shouldn't have.
Chris Seaton
@chrisseaton
Jul 03 2016 20:56
Thread A says to all other threads 'I'm assuming that nobody is going to modify global.hash - if you want to do that, stop me and tell me before doing it'
Tristan Hume
@trishume
Jul 03 2016 20:58
oh okay that makes sense, so the thread that is running global.hash = someotherthing stops all other threads at a safepoint first, invalidates their assumption on global.hash, then modifies it, then starts all the other threads up again.
(possibly with other threads having to re-jit their code because of the invalidated assumption).
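The protocol described in the last two messages can be sketched with ordinary Python threads. This is a toy model under stated assumptions, not how HotSpot or Pyston implement it: `SafepointWorld`, `Holder`, `worker`, and `demo` are hypothetical names, safepoints are explicit `poll()` calls, and "re-JITting" is reduced to re-reading a cached reference when a version counter changes. The mutator stops every worker at a safepoint, swaps the table, bumps the version (invalidating the assumption), and then resumes the workers.

```python
import threading


class SafepointWorld:
    def __init__(self):
        self._cond = threading.Condition()
        self._mutator_lock = threading.Lock()  # one stop-the-world at a time
        self._stop_requested = False
        self._running = 0  # registered workers that are not parked

    def register(self):
        with self._cond:
            while self._stop_requested:
                self._cond.wait()
            self._running += 1

    def unregister(self):
        with self._cond:
            self._running -= 1
            self._cond.notify_all()

    def poll(self):
        """Safepoint: park here for as long as a stop is requested."""
        with self._cond:
            if self._stop_requested:
                self._running -= 1
                self._cond.notify_all()
                while self._stop_requested:
                    self._cond.wait()
                self._running += 1

    def stop_the_world_and(self, action):
        """Run action() while every registered worker is parked."""
        with self._mutator_lock, self._cond:
            self._stop_requested = True
            while self._running > 0:
                self._cond.wait()
            action()
            self._stop_requested = False
            self._cond.notify_all()


class Holder:
    def __init__(self):
        self.table = {"x": 0}
        self.version = 0  # bumped whenever `table` is replaced


def worker(world, holder, iterations, stale_writes):
    world.register()
    try:
        cached_table, cached_version = holder.table, holder.version
        for i in range(iterations):
            world.poll()
            if holder.version != cached_version:  # assumption was invalidated
                cached_table, cached_version = holder.table, holder.version
            cached_table["x"] = i  # fast path: no lookup through holder
            if cached_table is not holder.table:
                stale_writes.append(i)  # would mean we wrote to an old table
    finally:
        world.unregister()


def demo(num_workers=3, iterations=2000, swaps=20):
    world, holder, stale_writes = SafepointWorld(), Holder(), []
    threads = [
        threading.Thread(target=worker, args=(world, holder, iterations, stale_writes))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()

    def swap():
        holder.table = {"x": -1}
        holder.version += 1

    for _ in range(swaps):
        world.stop_the_world_and(swap)
    for t in threads:
        t.join()
    return holder.version, stale_writes
```

Because the table is only ever replaced while all workers are parked, a worker never writes through a stale reference between two safepoints; `demo()` should return the final version and an empty list of stale writes. A real VM would poll far more cheaply and would deoptimize compiled code rather than compare a counter.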
Chris Seaton
@chrisseaton
Jul 03 2016 20:59
Yes
Doing it on random objects like hashes may not be tractable but you can do it on classes
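One way to picture a per-class assumption is a version counter on the class that a cached call site compares against. This is a simpler, check-on-every-call variant and not the safepoint-based invalidation discussed above; `VersionedMeta`, `CallSite`, and `demo` are hypothetical names, and the sketch ignores inheritance and per-instance attributes.

```python
class VersionedMeta(type):
    """Metaclass that bumps a version counter on every class attribute write."""

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        type.__setattr__(cls, "_version", 0)

    def __setattr__(cls, name, value):
        type.__setattr__(cls, "_version", cls._version + 1)
        type.__setattr__(cls, name, value)


class CallSite:
    """Caches one method lookup, guarded by the class and its version."""

    def __init__(self, method_name):
        self.method_name = method_name
        self.cached_cls = None
        self.cached_version = None
        self.cached_func = None
        self.misses = 0

    def call(self, obj, *args):
        cls = type(obj)
        if cls is not self.cached_cls or cls._version != self.cached_version:
            # Guard failed: redo the slow lookup and refresh the cache.
            self.misses += 1
            self.cached_func = getattr(cls, self.method_name)
            self.cached_cls = cls
            self.cached_version = cls._version
        return self.cached_func(obj, *args)


def demo():
    class Point(metaclass=VersionedMeta):
        def __init__(self, x, y):
            self.x, self.y = x, y

        def size(self):
            return abs(self.x) + abs(self.y)

    site = CallSite("size")
    p = Point(3, -4)
    results = [site.call(p) for _ in range(3)]  # one miss, then two hits
    # Rebinding the method bumps Point._version, so the next call misses.
    Point.size = lambda self: max(abs(self.x), abs(self.y))
    results.append(site.call(p))
    return results, site.misses
```

Tracking one counter per class is cheap because classes are few and rarely modified, which is the intuition behind doing this on classes rather than on arbitrary hashes.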
Tristan Hume
@trishume
Jul 03 2016 21:01
That makes sense. I guess Pyston will want to implement safepoints then. Or has it done so already? Does anyone know if PyPy uses safepoints to hoist out assumptions?
Tristan Hume
@trishume
Jul 03 2016 21:06
@chrisseaton are you working on Pyston or are you just hanging around to discuss things and offer advice as someone experienced in getting a python-like language to run incredibly quickly?
Chris Seaton
@chrisseaton
Jul 03 2016 21:17
No I work on Ruby. Just passionate about language VMs and lurking here to see how Pyston progresses.
PyPy calls these assumptions red variables I think and checks them when entering a trace
Tristan Hume
@trishume
Jul 03 2016 21:17
cool
Nathaniel J. Smith
@njsmith
Jul 03 2016 21:54
My impression is that pypy is very conservative about this kind of thing -- I think they might even treat local variables as potentially being modified by other threads (at least in some cases). Python's introspection machinery is very powerful
(though even in cpython, while the APIs exist to introspect frame locals while it's running, cpython's optimizations mean they don't actually work correctly!)
Chris Seaton
@chrisseaton
Jul 03 2016 22:54
In my implementation of Ruby we solve the problem of other threads accessing your local variables using safe points as well
We keep local variables on the stack, and then if someone starts to modify them we stop the world in a safe point and move everything onto the heap, where we don't optimise accesses as aggressively