These are chat archives for dropbox/pyston

10th Jun 2017
Kevin Modzelewski
@kmod
Jun 10 2017 03:36
Well libunwind is used to implement Python-level exceptions, so I guess it might be easy to disable, but hard to disable without removing support for exceptions
that said, we might have the ability to do that now, now that we have return-code-based exceptions as well. I'm not sure if those are usable everywhere; I think there are some cases where we require libunwind-based exceptions.
My guess is still that it's easier to fix our usage of libunwind than to replace it
The Python bytecode -> machine code translation isn't a very important step; it's mostly just there to enable our inline caches
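As an aside for readers of this archive: the inline-cache idea Kevin mentions can be illustrated with a small sketch. This is hypothetical code, not Pyston's actual implementation; a real inline cache lives in generated machine code, while here we just memoize a method lookup per call site, keyed by the receiver's type.

```python
def make_call_site(name):
    """One-entry inline cache for an `obj.name(...)` call site (illustrative sketch)."""
    cached_type = None
    cached_func = None

    def call(obj, *args):
        nonlocal cached_type, cached_func
        tp = type(obj)
        # Fast path: same receiver type as last time, and the attribute is not
        # shadowed in the instance dict, so we can skip the MRO walk entirely.
        if tp is cached_type and name not in getattr(obj, "__dict__", ()):
            return cached_func(obj, *args)
        # Slow path: generic lookup on the type, then fill the cache.
        func = getattr(tp, name)
        cached_type, cached_func = tp, func
        return func(obj, *args)

    return call
```

Repeated calls with the same receiver type hit the fast path; a different type simply refills the one-entry cache (a real JIT would choose between monomorphic and polymorphic caches here).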
Ethan Smith
@ethanhs
Jun 10 2017 03:40
I know it isn't the most relevant, but out of curiosity, what are your thoughts on the new frame-eval api?
Kevin Modzelewski
@kmod
Jun 10 2017 03:40
it might be feasible to run a program for a while and then freeze the code into an executable. I'm not sure if there's much of a benefit
Ethan Smith
@ethanhs
Jun 10 2017 03:41
It'd probably be better to write a Python->C++ translator for static compilation, I would think, or target C or another language
Kevin Modzelewski
@kmod
Jun 10 2017 03:42
@ethanhs PEP 523? it's a very narrow change that doesn't support very many techniques. It'd be cool if it works, but my bet is that it's too limiting.
Ethan Smith
@ethanhs
Jun 10 2017 03:43
Because it's frame-based, I take it?
Or is there some other limitation that I am missing?
Kevin Modzelewski
@kmod
Jun 10 2017 03:43
Well, the problem isn't that python bytecode is inefficient or that there are lots of optimization opportunities in it
This is also what makes static compilation of it not that interesting
So now you could get the information that "we are calling method foo here", but there's not much you can do with that
Ethan Smith
@ethanhs
Jun 10 2017 03:46
I guess you do end up jitting frames of bytecode instead of entire sources
Kevin Modzelewski
@kmod
Jun 10 2017 03:46
The interpreter doesn't spend very much time figuring that out, so it doesn't help to optimize it out. The hard part is figuring out how to "call method foo" quickly
Ethan Smith
@ethanhs
Jun 10 2017 03:46
right, so you end up writing a bytecode jit
so you can call foo quickly
Kevin Modzelewski
@kmod
Jun 10 2017 03:47
Well doing that doesn't really involve the bytecode at all
Ethan Smith
@ethanhs
Jun 10 2017 03:48
Well it can. You can read frame.f_code.co_code, IIRC, and process that
Right?
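As an illustration of what Ethan describes here (a hypothetical sketch using the stdlib `sys` and `dis` modules, not Pyston code): a frame's code object does expose the raw bytecode, and `dis` can decode it into named instructions.

```python
import dis
import sys


def show_caller_bytecode():
    # Grab the caller's frame, as a frame-eval hook would receive it,
    # and inspect its code object's bytecode.
    frame = sys._getframe(1)
    raw = frame.f_code.co_code  # raw bytecode as bytes
    names = [ins.opname for ins in dis.get_instructions(frame.f_code)]
    return raw, names


def example():
    return show_caller_bytecode()


raw, names = example()
```

`raw` is the undecoded byte string; `names` lists opcode names such as the call instruction that invoked `show_caller_bytecode` (exact opcode names vary by CPython version).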
Kevin Modzelewski
@kmod
Jun 10 2017 03:49
The bytecode will just tell you something like "fetch the 'foo' member from argument 1 and call it with no arguments"
Decoding that doesn't take very much time; most of the time is spent actually doing the work the bytecodes describe (looking up attributes, doing the descriptor protocol, doing calling-convention matching, etc)
Maybe you could do some type analysis on the bytecode to figure out what the types are, but without deeper hooks into the runtime, types alone won't specify how to execute the bytecodes
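To make Kevin's point concrete (an illustrative sketch using the stdlib `dis` module, not from the original chat): the bytecode for a method call really is that high-level. It names *what* to do, not *how*.

```python
import dis


def call_foo(x):
    # Compiles to roughly: load x, fetch its 'foo' attribute, call it.
    return x.foo()


ops = [ins.opname for ins in dis.get_instructions(call_foo)]
```

The opcode names (attribute load, call) say nothing about the expensive parts: the MRO walk, the descriptor protocol, and calling-convention matching all happen inside the interpreter's implementation of those opcodes.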
Ethan Smith
@ethanhs
Jun 10 2017 03:51
I thought bytecodes told you what to execute?
Kevin Modzelewski
@kmod
Jun 10 2017 03:51
At a very high level
Ethan Smith
@ethanhs
Jun 10 2017 03:52
And it operates on a stack, so what you really want to do is figure out how to efficiently execute the bytecode's stack-manipulation commands, unless I am missing something
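The stack-machine model being discussed can be sketched in a few lines (a toy interpreter for illustration, not CPython's or Pyston's real eval loop):

```python
def run(code, stack):
    """Execute a list of (opcode, argument) pairs on an operand stack."""
    for op, arg in code:
        if op == "LOAD_CONST":
            stack.append(arg)
        elif op == "BINARY_ADD":
            b = stack.pop()
            a = stack.pop()
            # The dispatch above is cheap; in CPython most of the time goes
            # into this generic '+', which must handle any pair of objects.
            stack.append(a + b)
        else:
            raise ValueError(f"unknown opcode: {op}")
    return stack[-1]


result = run([("LOAD_CONST", 2), ("LOAD_CONST", 3), ("BINARY_ADD", None)], [])
```

This is Kevin's point in miniature: the loop-and-dispatch part is a small fraction of the work; the body of each opcode is where the runtime spends its time.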
Kevin Modzelewski
@kmod
Jun 10 2017 03:53
Yes, you can do that, but you might be looking at how to optimize a small fraction (I forget the exact number, maybe 5-10%) of the runtime.
Ethan Smith
@ethanhs
Jun 10 2017 03:57
But what else is there? You said decoding and dispatching bytecode operations are a minimal part of the runtime, which makes sense. But doesn't the execution of what the bytecode describes make up almost all of the rest? Say I had BINARY_ADD with a and b each on the stack. If I did some clever analysis and knew they were both ints, couldn't I speed up that addition?
And thank you very much for discussing this. I have some knowledge of JITs and Python internals, but I'm always interested in learning more :)
Kevin Modzelewski
@kmod
Jun 10 2017 04:03
Yep you could! There are some tricky cases (integer addition can overflow into longs), but you could get integer addition to be fast
other cases get much more complicated and you start needing more help from the rest of the runtime
ex fetching attributes
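The type-specialized addition being discussed might be sketched like this (hypothetical code, not Pyston's implementation; note Pyston targets Python 2, where a machine-word `int` overflows into an arbitrary-precision `long`, which is the tricky case Kevin mentions):

```python
import operator


def specialized_add(a, b):
    # Guard: the analysis predicted both operands are ints, but the JIT must
    # re-check that prediction at runtime before taking the fast path.
    if type(a) is int and type(b) is int:
        # Fast path: in compiled code this would be a single machine add,
        # plus an overflow check that promotes to a big-integer result
        # when the machine-word range is exceeded.
        return a + b
    # Deoptimized path: full binary-operator protocol
    # (__add__/__radd__, descriptors, etc.).
    return operator.add(a, b)
```

The guard-plus-fallback structure is what keeps the specialization correct when the prediction fails, e.g. when a call site that usually sees ints suddenly sees strings.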
Ethan Smith
@ethanhs
Jun 10 2017 04:08
Ah, yes of course that makes a lot of sense. I didn't consider that. I suppose you could be clever and do things like cache frames, but that would still be slower. Thanks!
mhsjlw
@mhsjlw
Jun 10 2017 14:23
@kmod my question about compiling to a binary was just because I wanted to run it on my raspberry pi and not need an entire jit
and as far as libunwind goes, I think it would be really good to figure out a way to fix it, because as far as I can see, if we didn't need libunwind, I'm pretty sure I'd be able to get this to work on a Mac
this is kind of stupid, but if you know a way to compile to a binary, we could go beyond the Raspberry Pi and maybe even run Python on iOS?
anyway, that's beside the point
I know languages like Crystal and Rust use LLVM and are able to compile to native binaries, but from what I can see of Pyston it's not exactly that simple