These are chat archives for dropbox/pyston

3rd
Jun 2015
Marius Wachtler
@undingen
Jun 03 2015 00:01
I don't think it has anything to do with it. It's more like this: the ast interpreter inside the generator is stopping at the yield (-> frame will be registered inside the s_interpreterMap). But after the yield returns to the calling function we will never call the generator again. But it will still stay in memory because it's still registered in the s_interpreterMap because the yield never returned. That's how I suppose it can happen.
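A minimal Python model of the situation described above, not Pyston's actual code: `interpreter_map` stands in for `s_interpreterMap`, and `generator_body` is a hypothetical stand-in for the interpreter running a generator. The entry is only removed when the body runs to completion, so a generator that is suspended at a `yield` and never resumed leaves its entry behind.

```python
# Hypothetical stand-in for s_interpreterMap: frame id -> interpreter state.
interpreter_map = {}

def generator_body(frame_id):
    interpreter_map[frame_id] = "interpreter state"  # registered on entry
    yield 1                                          # suspended here
    del interpreter_map[frame_id]                    # only runs if resumed to the end

gen = generator_body("frame-1")
first = next(gen)   # runs up to the yield, then returns to the caller

# The caller never resumes the generator, so the cleanup line never runs.
still_registered = "frame-1" in interpreter_map
```

In this model `still_registered` stays true for as long as nothing else removes the entry, which matches the suspected leak.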
Kevin Modzelewski
@kmod
Jun 03 2015 00:01
oh man
that sounds tricky
I wonder if, even if we solve this for the interpreter, there could be other destructors on the stack that we won't run
Marius Wachtler
@undingen
Jun 03 2015 00:20
enough for today will look into it tomorrow
Chris Toshok
@toshok
Jun 03 2015 00:26
sigh, libunwind cursors are opaque (and their contexts are meant to be too). so the hack re: ASTInterpreter::execute is going to be a little more of a hack :/
basically the RegisterHelper won’t map from %rbp inside ::execute, it will map from %rbp inside ::executeInternal
Chris Toshok
@toshok
Jun 03 2015 00:53
do we have a general (not stats-related) boolean flag for “we’re done initializing our runtime types”?
alternatively the flag could be “we’re ready to start running actual code from .py files”
Kevin Modzelewski
@kmod
Jun 03 2015 00:55
usually I just pick a class and check whether it is null or not :/
depending on how far in the initialization you want to check, you can pick different classes
Chris Toshok
@toshok
Jun 03 2015 00:55
heh
i was thinking of just short circuiting all the getGlobals()/getLocals()/etc to return NULL during initialization
instead of trying to find a python frame by unwinding
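A rough sketch of both ideas, in Python for brevity. Every name here is hypothetical (`str_cls`, `runtime_ready`, `get_globals`, `find_python_frame_by_unwinding`); none is taken from the Pyston source. The readiness check tests whether a chosen class has been set up yet, and the lookup returns `None` early instead of unwinding.

```python
# Hypothetical runtime state: a class object that stays None until set up.
str_cls = None

def runtime_ready():
    # "Pick a class and check whether it is null."
    return str_cls is not None

def find_python_frame_by_unwinding():
    # Placeholder for the expensive stack walk; returns a fake globals dict.
    return {"__name__": "__main__"}

def get_globals():
    # Short circuit: during initialization there is no Python frame to find.
    if not runtime_ready():
        return None
    return find_python_frame_by_unwinding()

during_init = get_globals()          # None, no unwinding attempted
str_cls = type("str_cls", (), {})    # initialization finishes
after_init = get_globals()           # now performs the (fake) unwind
```

Which class to test against depends on how far into initialization the check needs to hold, as noted above.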
Chris Toshok
@toshok
Jun 03 2015 01:54
django-template.py down to 6.5s
Travis Hance
@tjhance
Jun 03 2015 02:25
what tries to get a python frame during initialization?
Chris Toshok
@toshok
Jun 03 2015 02:33
Creating boxedfunctions iirc
The short circuit change really doesn't help much. I expect it's due to the short stack at that point in execution, and libunwind caching things
Travis Hance
@tjhance
Jun 03 2015 02:57
We need a stack frame to make a boxed function?
Chris Toshok
@toshok
Jun 03 2015 03:21
We look up __module__ I think. Will check when I get to laptop again
Travis Hance
@tjhance
Jun 03 2015 04:20
Hm is this going to, like, murder our performance when people program with lots of closures?
Chris Toshok
@toshok
Jun 03 2015 05:03
unsure. it’s likely only a few frames until it hits a python frame, but the getattr change (even though it involves actually unwinding) is only 1 frame
Marius Wachtler
@undingen
Jun 03 2015 08:12
Do you guys think it makes sense to create a python class for the ASTInterpreter in order that it can participate in the gc scan? I think this would make it unnecessary to scan all registered AST interpreters (like we do now) but instead they will automatically get found if they are used or deleted when they are unreachable. But we may need a simple destructor to clear some stuff..
Kevin Modzelewski
@kmod
Jun 03 2015 08:15
I think that sounds reasonable (though I wish there was some way to make it participate in GC without making it a Python object)
I wonder though if we need to tackle the problem more generally, by destructing all objects left on the generator stack that we will munmap
Marius Wachtler
@undingen
Jun 03 2015 08:22
yeah we may need to do this in addition
Kevin Modzelewski
@kmod
Jun 03 2015 08:37
I like the look of this graph :)
Marius Wachtler
@undingen
Jun 03 2015 08:45
:+1:
Chris Toshok
@toshok
Jun 03 2015 21:04
how much of a win was it to intern string literals?
wondering if a similar approach might work for unicode literals
Marius Wachtler
@undingen
Jun 03 2015 21:41
we don't intern much currently. so don't think it's a win at this point
oh I think I'm talking about another interning (the CPython string interning)
The other one was a large win I think. especially because before all analysis passes operated on strings.
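To illustrate why interning helps passes that compare names constantly, here is a small sketch. `InternPool` is a hypothetical helper, not Pyston's implementation: each distinct string is stored once and handed out as a small integer handle, so later comparisons are integer comparisons rather than character-by-character ones.

```python
class InternPool:
    """Hypothetical intern pool: maps each distinct string to one integer handle."""

    def __init__(self):
        self._ids = {}      # string -> handle
        self._strings = []  # handle -> string

    def intern(self, s):
        handle = self._ids.get(s)
        if handle is None:
            # First time this string is seen: assign the next free handle.
            handle = len(self._strings)
            self._ids[s] = handle
            self._strings.append(s)
        return handle

    def lookup(self, handle):
        return self._strings[handle]

pool = InternPool()
a = pool.intern("x")
b = pool.intern("y")
c = pool.intern("x")   # same string as a, so the same handle comes back
```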
Marius Wachtler
@undingen
Jun 03 2015 21:48
Are we interested in the _elementtree (c implementation of the elementtree API)? It compiles and looks like it's working but I'm not sure if it's faster than the python implementation. (For the babel build step it looks to be slightly faster) But if we improve our python perf we may become faster than the c implementation?
Marius Wachtler
@undingen
Jun 03 2015 21:53
Ok I found a benchmark will try to get numbers
Marius Wachtler
@undingen
Jun 03 2015 22:07
ok 144 secs vs 14 secs (C implementation) -> we may want to use the capi for now
Chris Toshok
@toshok
Jun 03 2015 22:07
hehe ow
Chris Toshok
@toshok
Jun 03 2015 22:25
sigh. added a cache for all single char BoxedStrings, and it resulted in a big slow down
#ifdefed out everything but the function doing the caching itself, and still, 2% slowdown
Marius Wachtler
@undingen
Jun 03 2015 22:37
When I got the multidict to successfully run the django bench it regressed it by about 2x :-D. Couldn't make sense why it's so much slower, until I saw that I had the memory stats enabled. After that it was slightly faster.
Chris Toshok
@toshok
Jun 03 2015 22:38
i think we need to figure out a way to bulk out the section-ordering file
Chris Toshok
@toshok
Jun 03 2015 23:22
yeah, so valgrind doesn’t like us using malloc for our ic’s :)
runtime ic’s that is