These are chat archives for dropbox/pyston

27th May 2015
Chris Toshok
@toshok
May 27 2015 05:16 UTC
"The code between a comment // clang-format off or /* clang-format off */ up to a comment // clang-format on or /* clang-format on */ will not be formatted."
that seems useful
Marius Wachtler
@undingen
May 27 2015 16:13 UTC
oh gosh... I've been trying for hours to figure out why lxml crashes on nontrivial examples. I think I know now (should have thought of this earlier): libxml allocates the nodes with the default mem alloc routines (= malloc). Therefore the contents won't get scanned, and we will free child nodes...
At least that is my guess: I added a malloc hook and now also scan the malloced memory (= terribly slow), but I get further in the tests.
Chris Toshok
@toshok
May 27 2015 16:15 UTC
like, actually malloc() or PyObject_Malloc()?
Marius Wachtler
@undingen
May 27 2015 16:15 UTC
actual malloc
Chris Toshok
@toshok
May 27 2015 16:15 UTC
yuck
Marius Wachtler
@undingen
May 27 2015 16:16 UTC
lxml uses the normal libxml and libxslt libraries underneath; they provide hooks for custom memory functions, but the default is to use malloc.
Chris Toshok
@toshok
May 27 2015 16:17 UTC
self->data = malloc(self->size * sizeof(PyObject*)); from cpython/Modules/cPickle.c
wonder if there are others
Marius Wachtler
@undingen
May 27 2015 16:18 UTC
looks like it
Chris Toshok
@toshok
May 27 2015 16:18 UTC
wow, but it’s using them for allocating python objects?
Marius Wachtler
@undingen
May 27 2015 16:20 UTC
That's my guess, but I haven't understood/found what it really does. Maybe libxml supports attaching user-defined data to nodes and lxml uses this to store references to the Python objects???
Marius Wachtler
@undingen
May 27 2015 16:29 UTC
what I found strange was that it crashed even when I made gcIsEnabled always return false. But now I found out that the test manually calls gc.collect, and that one always collects...
if I disable that one as well, the tests work (much better)
Marius Wachtler
@undingen
May 27 2015 16:40 UTC
ok, looks like the cPickle malloc you found is the only one which stores Python objects and is in a module we support.
Chris Toshok
@toshok
May 27 2015 16:48 UTC
I didn’t check to see if they were rooted in some fashion or just incref'ed
hm, they aren’t rooted
Chris Toshok
@toshok
May 27 2015 17:10 UTC
egg drop time :)
Marius Wachtler
@undingen
May 27 2015 17:13 UTC
???
Rudi Chen
@rudi-c
May 27 2015 17:18 UTC
Make a parachute that will allow an egg to be dropped without breaking?
(We did that too as an intern event)
Marius Wachtler
@undingen
May 27 2015 17:25 UTC
sounds like fun
Travis Hance
@tjhance
May 27 2015 17:33 UTC
ooh I did that once
I stuffed my eggs in a loaf of bread
it kinda worked
Chris Toshok
@toshok
May 27 2015 18:09 UTC
Josie made a pyramid out of chopsticks and straws, egg suspended inside with rubber bands. Vehicle definitely did not survive impact, but egg did, so... Win
Rudi Chen
@rudi-c
May 27 2015 18:12 UTC
nice :D
Marius Wachtler
@undingen
May 27 2015 18:55 UTC
:-)
Marius Wachtler
@undingen
May 27 2015 19:05 UTC
changing lxml to call xmlMemSetup() so that libxml uses our GC memory functions works
But we should still discuss whether we want to change this or to replace the malloc routines with our own. (may be hard for new?)
Chris Toshok
@toshok
May 27 2015 19:08 UTC
are you allocating with ::CONSERVATIVE?
in the xmlMemSetup() case, that is
doing that globally in a malloc wrapper would likely end up being pretty expensive, since we’d have to scan all malloced memory (even for our container types that we explicitly don’t use StlCompatAllocator for :/)
Marius Wachtler
@undingen
May 27 2015 19:17 UTC
mmh, probably extremely expensive :-(.
I'm registering the CONSERVATIVE routines with xmlMemSetup. Haven't measured it, but it feels much slower
Marius Wachtler
@undingen
May 27 2015 19:37 UTC
I created a simple chart to compare the pip execution time with empty and full object cache: http://i.imgur.com/sPI0caM.png for a future blog posting. any suggestions?
Travis Hance
@tjhance
May 27 2015 19:38 UTC
why is ‘optimization’ so small?
Marius Wachtler
@undingen
May 27 2015 19:39 UTC
It took the us_compiling_optimizing stat, and it's only 10ms
Travis Hance
@tjhance
May 27 2015 19:40 UTC
i just expected that optimization would be expensive
Marius Wachtler
@undingen
May 27 2015 19:47 UTC
I think opt may also currently be very cheap because there won't be much to do: basically all the patchpoint instructions get skipped. So I suspect the only thing which can get optimized is when we use LLVM instructions instead of calls, and imho that's currently mostly float code, because the int stuff got removed when int-to-long promotion got added.
Travis Hance
@tjhance
May 27 2015 19:49 UTC
oh
Marius Wachtler
@undingen
May 27 2015 19:51 UTC
and it only contains the time spent optimizing the LLVM IR. Time spent later optimizing the machine instructions is tracked inside us_compiling_jitting, aka code generation in the chart.
Chris Toshok
@toshok
May 27 2015 21:12 UTC
hrm, one problem with incrementally building up the traceback is that the process needs to stop and continue around generators
Marius Wachtler
@undingen
May 27 2015 21:21 UTC
mmh yeah, currently my hack is in place to create tracebacks through generators..
Chris Toshok
@toshok
May 27 2015 21:22 UTC
i think the incremental piece should happen for free
it’s just that the unwind machinery has to worry about different states of ExcInfo
1) ExcInfo with a traceback not set up for incremental additions
2) ExcInfo with a traceback with no line entries set up for incremental additions
3) ExcInfo with a traceback with line entries set up for incremental additions
Marius Wachtler
@undingen
May 27 2015 21:23 UTC
this mechanism is still like magic to me :-D
Chris Toshok
@toshok
May 27 2015 21:24 UTC
the start of the unwind will usually be in 2. 3 is the state where we restart the unwind after the pause in the generator
hm, actually i need to write some tests
it’s less magic for me than it used to be. I’m still in awe of the amount of bookkeeping involved, though
and how much work the compiler has to do
Marius Wachtler
@undingen
May 27 2015 21:28 UTC
and 3 will probably also happen when we support creating custom frames from the capi. Cython uses them to create a traceback entry which points to the original Cython source file. (kind of like the #line directive for the C++ preprocessor)
Chris Toshok
@toshok
May 27 2015 21:28 UTC
wow
custom frames?
Marius Wachtler
@undingen
May 27 2015 21:29 UTC
I mean: the Cython source file gets translated into capi calls, but it also creates custom traceback entries so that tracebacks look like they came from the Cython source file rather than from the capi
Chris Toshok
@toshok
May 27 2015 21:30 UTC
right
are there capi functions for pushing/popping frames? i can’t see how else that would work in C
Looks like PyTraceBack_Here does the magic.
Chris Toshok
@toshok
May 27 2015 21:35 UTC
oh wow, yeah that does:
PyTracebackObject *tb = newtracebackobject(oldtb, frame);
and i’m guessing the decref pops it
hm, no… __Pyx_AddTraceback is called at points in the cython generated code to add the frame
Marius Wachtler
@undingen
May 27 2015 21:38 UTC
I can upload the generated 180k line c file for lxml.etree if you want to take a look in more detail :-D
Chris Toshok
@toshok
May 27 2015 21:38 UTC
heh no worries. I googled for Pyx_AddTraceback and found, e.g. cython.org/hg/Pyrex/Tests/2/slicex.c
Marius Wachtler
@undingen
May 27 2015 21:39 UTC
It's also inside the gist I linked :-D
Chris Toshok
@toshok
May 27 2015 23:43 UTC
so the incremental traceback stuff is limping to life. only works with -n, and tracebacks are in reverse order due to using push_back instead of push_front (which is non-existent in llvm::SmallVector)