These are chat archives for dropbox/pyston

22nd
Apr 2015
Chris Toshok
@toshok
Apr 22 2015 01:15
interesting. allocations/request goes down over time
from 9267179 bytes to 8288152 over the course of 100 requests
Michael Arntzenius
@rntz
Apr 22 2015 01:29
perhaps stuff is getting cached somewhere
Chris Toshok
@toshok
Apr 22 2015 01:30
yeah there’s definitely some django-level caching, but that seems to be built up in the first request
Chris Toshok
@toshok
Apr 22 2015 18:32
hrm
src/asm_writing/rewriter.cpp:1081: int pyston::Rewriter::_allocate(pyston::RewriterVar *, int): Assertion `0' failed: Using all 152 bytes of scratch!
Travis Hance
@tjhance
Apr 22 2015 19:59
what is it trying to re-write?
Chris Toshok
@toshok
Apr 22 2015 20:15
looks like the call to IntegerField() at django/db/models/sql/aggregates.py:71
Kevin Modzelewski
@kmod
Apr 22 2015 20:18
the easiest thing to do is to figure out what kind of IC it is
ie the first runtime function hit from the JITed code / interpreter
and then increase the number of scratch bytes we give to that kind of IC
Chris Toshok
@toshok
Apr 22 2015 20:32
hm, gdb is showing this:
#7  0x00000000006846a1 in runtimeCall (obj=0x127215e408, argspec=..., arg1=0x0, arg2=<optimized out>, arg3=0x0, args=0x7fffd5d98400, keyword_names=<optimized out>) at src/runtime/objmodel.cpp:3161
#8  0x00007fffede08ba4 in _ordinal_aggregate_field_e1_1222 () at /mnt/toshok/pyston/lib_pyston/django/db/models/sql/aggregates.py:71
so I figured the number I should tune is that in createCallsiteIC?
Travis Hance
@tjhance
Apr 22 2015 20:34
the rewriter could probably be smarter with scratch space, also
Kevin Modzelewski
@kmod
Apr 22 2015 20:34
oh all patchpoints have the same amount of scratch space
Chris Toshok
@toshok
Apr 22 2015 20:34
ah I see
Kevin Modzelewski
@kmod
Apr 22 2015 20:34
and yeah it should probably abort the rewrite rather than killing the entire process
Chris Toshok
@toshok
Apr 22 2015 20:35
okay, cool. because changing the numbers in the ICSetupInitialize didn’t do anything :)
Travis Hance
@tjhance
Apr 22 2015 20:35
I think there are some arg arrays that get allocated in scratch and don’t get collected...
Chris Toshok
@toshok
Apr 22 2015 20:43
okay, cool. having it abort the rewrite (hopefully that’s safe, it doesn’t appear to happen anywhere else) seems to fix it
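
A minimal sketch of the "abort the rewrite instead of killing the process" idea discussed above, written as a small Python model rather than Pyston's actual C++ Rewriter. The names `ScratchExhausted`, `ScratchSpace`, and `try_rewrite` are hypothetical; only the 152-byte figure comes from the assertion message.

```python
class ScratchExhausted(Exception):
    """Raised when a rewrite needs more scratch space than is available."""


class ScratchSpace:
    def __init__(self, size_bytes=152):  # 152 is taken from the assertion message
        self.size_bytes = size_bytes
        self.used = 0

    def allocate(self, nbytes):
        # Report failure to the caller instead of asserting.
        if self.used + nbytes > self.size_bytes:
            free = self.size_bytes - self.used
            raise ScratchExhausted(f"need {nbytes} bytes, {free} free")
        offset = self.used
        self.used += nbytes
        return offset


def try_rewrite(scratch, requests):
    """Return the scratch offsets if every allocation fits, else None."""
    try:
        return [scratch.allocate(n) for n in requests]
    except ScratchExhausted:
        return None  # rewrite abandoned; the caller stays on the generic slow path
```

With this shape, running out of scratch only costs the optimization for that one call site: `try_rewrite(ScratchSpace(), [64, 64, 64])` gives `None` rather than terminating the program.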
Kevin Modzelewski
@kmod
Apr 22 2015 21:07
USE_CMAKE=1 is now the default
#thefutureisnow
Chris Toshok
@toshok
Apr 22 2015 21:09
🎉👍🏼💯
god that was a pain to type
Chris Toshok
@toshok
Apr 22 2015 21:35
so the pypy algorithm is pretty nice
O(3n) where n is the number of object instances with finalizers (it maintains a list of all instances with finalizers)
Marius Wachtler
@undingen
Apr 22 2015 21:43
@toshok Do you have plans to submit your binary search patch to the GCC mailinglist?
Chris Toshok
@toshok
Apr 22 2015 21:44
yeah. lower priority than anything else, but I think the FSF still has my copyright assignment from 15 years ago on file, so they might be more inclined to take it :)
Marius Wachtler
@undingen
Apr 22 2015 21:44
Your gcc binary search patch is looking good to me, but the GCC people's feedback is probably more valuable.
Chris Toshok
@toshok
Apr 22 2015 21:44
yeah
Marius Wachtler
@undingen
Apr 22 2015 21:49
but I suspect your patch is going to be an exceptionally large speed improvement in such a core function. (even if normal programs won't hit it..) Not many people can say that they sped up a gcc runtime function by 10x or so.. :-D
Chris Toshok
@toshok
Apr 22 2015 21:53
heh, true. even though we won’t need it someone out there will probably also run into something similar :)
Travis Hance
@tjhance
Apr 22 2015 22:14

but I think the FSF still has my copyright assignment from 15 years ago on file, so they might be more inclined to take it

huh?

Chris Toshok
@toshok
Apr 22 2015 22:14
they initially pushed back on the bugzilla patch because of the contributor agreement (or lack thereof)
i think that’s why the maintainer said “i’ll do this (another way)” then dropped it
Travis Hance
@tjhance
Apr 22 2015 22:15
ohhh
Chris Toshok
@toshok
Apr 22 2015 22:15
huh, so I added sys.clear_stats() that resets all our stat counters to 0
and on the 201st request, there’s not a whole lot of slowpath action
Chris Toshok
@toshok
Apr 22 2015 22:26
ah that’s why - i was resetting at the end of the request, then printing them out
Marius Wachtler
@undingen
Apr 22 2015 22:26
:-P
turns out that getting the '# coding' thing right is harder than I thought. Pyxl is not that difficult because it still uses utf8 but other encodings...
Chris Toshok
@toshok
Apr 22 2015 22:32
interesting (scary :) allocation stats for the 201st request: 31780 tuples, 23217 dicts, 13717 instance methods, 11953 strings
Marius Wachtler
@undingen
Apr 22 2015 22:35
not sure how many other c++ applications create 23k std::unordered_maps several times per second...
or is this combined for all 201 requests?
Chris Toshok
@toshok
Apr 22 2015 22:36
i haven’t checked the std::unordered_map code, but i’m curious if the dynamic allocation is delayed until the first item is inserted
nope, this is just for the 201st :/
it’s not a perfect count - I reset the stats in the request handler’s init function, and dump on ctrl-c
so there’s some missing, but the real counts should be >= what I dump
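
A rough sketch of the reset-then-dump ordering described above, assuming a simple counter table. The `Stats` class, `handle_request`, and the counter names are hypothetical stand-ins, not Pyston's actual stats code.

```python
from collections import Counter


class Stats:
    def __init__(self):
        self.counters = Counter()

    def log(self, name, n=1):
        self.counters[name] += n

    def clear(self):
        self.counters.clear()

    def dump(self):
        return dict(self.counters)


def handle_request(stats):
    stats.clear()  # reset at the start of the request
    stats.log("alloc.tuple", 3)
    stats.log("alloc.dict", 2)
    # No clear() at the end: resetting here would leave nothing for a later dump().


stats = Stats()
handle_request(stats)
snapshot = stats.dump()  # dumped afterwards, e.g. on ctrl-c
```

Resetting at the start and dumping later keeps the last request's counters intact, which is the ordering that avoids the empty output seen earlier.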
Chris Toshok
@toshok
Apr 22 2015 22:43
just for kicks, to see if caching len=1 strings would help: only 2% of them (270) are single character
30% of boxed tuples are the empty tuple, though
Travis Hance
@tjhance
Apr 22 2015 22:44
cache it? :D
Chris Toshok
@toshok
Apr 22 2015 22:44
we already do :) EmptyTuple
even that’s unlikely to make all that much difference though. i was hoping for like 90% strings/tuples/whatever :)
is there a builtin module that’s an implementation-specific area?
i have clear_stats in sys now, but that seems the wrong place for dump_stats+clear_stats
Kevin Modzelewski
@kmod
Apr 22 2015 22:46
how much memory do those allocations add up to?
we've been using __pyston__ for some of those things
Chris Toshok
@toshok
Apr 22 2015 22:47
ah perfect
hm, let me add more stats
Marius Wachtler
@undingen
Apr 22 2015 22:49
yeah pyston speaks: koi8-r :-)
Chris Toshok
@toshok
Apr 22 2015 22:50
nice :)
Marius Wachtler
@undingen
Apr 22 2015 22:50
pyston.cpp
__pyston__ module
oh now I'm feeling stupid... should have read until the end....
Michael Arntzenius
@rntz
Apr 22 2015 22:56
depending on whether I compile under CMake or not, different tests pass!
(the difference is probably down to using libgcc 4.8 or 4.9)
Chris Toshok
@toshok
Apr 22 2015 22:56
different versions of libunwind/gcc/etc?
Michael Arntzenius
@rntz
Apr 22 2015 22:56
yeah, of libgcc
Chris Toshok
@toshok
Apr 22 2015 22:56
yeah :/
Michael Arntzenius
@rntz
Apr 22 2015 22:58
10 more tests pass with libgcc 4.8
Kevin Modzelewski
@kmod
Apr 22 2015 22:58
what kinds of failures are you getting?
Michael Arntzenius
@rntz
Apr 22 2015 22:58
yeah, that's what I'm looking into
some stuff to do with generators
Marius Wachtler
@undingen
Apr 22 2015 22:59
oh no, that was not what I wanted to read...
Chris Toshok
@toshok
Apr 22 2015 23:02
dicts and tuples are ~1M each (280k of the tuples are EmptyTuple). strings are 500k, instancemethods 438k
that’s just the python tagged memory for dicts, the unordered_map isn’t included
tuples should be fully accounted for though
Kevin Modzelewski
@kmod
Apr 22 2015 23:04
oh hmm, was hoping that might be obvious too
err, that the size stats would help make it obvious
Chris Toshok
@toshok
Apr 22 2015 23:05
yeah
getattr also yielded some fruit
I added the type name to the slowpath id, and this is at the top:
slowpath_box_getattr.type.__call__: 24189
Kevin Modzelewski
@kmod
Apr 22 2015 23:07
is that for a single request?
Chris Toshok
@toshok
Apr 22 2015 23:07
yes
after giving what hopefully is enough time to warm up (200 requests)
Kevin Modzelewski
@kmod
Apr 22 2015 23:08
oh man
Michael Arntzenius
@rntz
Apr 22 2015 23:11
hm, PyObject_GenericSetAttr seems to be causing some problems. and I'm getting KeyErrors in the wrong places, of all things.
Marius Wachtler
@undingen
Apr 22 2015 23:13
ok already quite late here, won't get the '# coding ' patch done today, will finish it tomorrow.
Chris Toshok
@toshok
Apr 22 2015 23:48
hm, if I had to wager, I’d bet that functools.partial is what’s killing us
we should be able to rewrite through that sort of construct, but we can’t since it goes through the capi
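
For reference, a small illustration of the kind of construct meant here. The function names are made up for the example. Both callables compute the same thing, but the `partial` object's call is handled by the runtime's own implementation (in CPython, normally the C `_functools` module) rather than by Python-level code, which per the message above is what the rewriter cannot see through.

```python
from functools import partial


def add(a, b):
    return a + b


# Call routed through a functools.partial object.
add_one_partial = partial(add, 1)


# Equivalent call written as plain Python.
def add_one_direct(b):
    return add(1, b)


results = (add_one_partial(2), add_one_direct(2))
```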