These are chat archives for dropbox/pyston

13th
Aug 2015
Rudi Chen
@rudi-c
Aug 13 2015 00:53
Dealing with hash structures whose keys can be Python objects in a moving GC is so tricky.
You can't replace pointer values directly because it could mess up the internals of the data structure.
You can't do a lookup for the old pointer to replace it with the new pointer.
Doing a lookup would call the hash function. You don't want to call the hash function because arbitrary code can run and that's really bad inside a GC (there could be allocations).
There's no way to do a lookup just via the pointer value - we just defined our hash table to hash the object rather than just the identity.
So you could iterate through all <key, value> pairs and check if the key has the pointer value you want to replace.
And erase it via the iterator.
But when you want to insert the new pointer value, you'll still need to do a hash call.
I'm not sure what's the right way to solve this problem.
Rudi Chen
@rudi-c
Aug 13 2015 00:59
Maybe if we scan the entire memory used by the hash data structure and replace all pointers (rather than just replacing via, say, what an iterator gives us), the hash table could still maintain it's integrity, because in both the old pointer and the new (moved) pointer should still point to the same object and have the same hash value.
Travis Hance
@tjhance
Aug 13 2015 00:59
depends on the implementation of the hash table, I guess
Rudi Chen
@rudi-c
Aug 13 2015 00:59
But that makes assumptions about the internals of the hash table implementation yeah.
It's rather unsafe to do.
It's possible we might need to implement a moving-gc-safe hash table?
Kevin Modzelewski
@kmod
Aug 13 2015 01:12
yeah it seems unlikely that we would find an nice way to do that for an arbitrary hashtable implementation
though I did just put up a PR that caches the hash result so maybe this will be more possible with that?
well
maybe another way to see this is that moving the object shouldn't change the hash value at all
Rudi Chen
@rudi-c
Aug 13 2015 01:13
Another solution a friend proposed is to store handles in the hash table instead.
Kevin Modzelewski
@kmod
Aug 13 2015 01:13
at least, for the object-based hashtables (as opposed to pointer-based ones like we use for hiddenclasses)
Rudi Chen
@rudi-c
Aug 13 2015 01:14
Moving the object won't change the hash value, but the data structure needs to know that.
Kevin Modzelewski
@kmod
Aug 13 2015 01:15
well, assuming that it just stores the key once, shouldn't we be able to just update that memory location?
there probably isn't an api for that though
Rudi Chen
@rudi-c
Aug 13 2015 01:16
I think llvm DenseMap doesn't store it just once.
If it did I probably wouldn't have run into another issue that I was trying to access a moved object which is what led me to this.
Haven't tried unordered_map but I'd rather not rely on the internals of the hash table only storing the pointer once.
Kevin Modzelewski
@kmod
Aug 13 2015 01:18
well this is inherently an implementation-specific issue
Rudi Chen
@rudi-c
Aug 13 2015 01:18
A handle guarantees that we know where the pointer is.
Kevin Modzelewski
@kmod
Aug 13 2015 01:18
we would want to encapsulate it in some nice way
Rudi Chen
@rudi-c
Aug 13 2015 01:19
We could have our own hash table which we guarantee to be moving-compatible.
Kevin Modzelewski
@kmod
Aug 13 2015 01:19
yes, that's basically what I'm suggesting
starting with llvm::DenseMap, and verifying that it has the "only stores it once" property
the handle approach sounds like it would be more complicated to make it performant
(ie not allocating the handles individually)
Rudi Chen
@rudi-c
Aug 13 2015 01:21
The issue showed up after I rebased on your DenseMap commit suggesting DenseMap doesn't have that property.
But it's possible that the handle approach doesn't have that much overhead.
If the amount of extra work done is small v.s. calling the hash function and such.
Kevin Modzelewski
@kmod
Aug 13 2015 01:22
it sounds like "hashmap with handles" is a particular option for "a hashmap that only stores the key once"
Rudi Chen
@rudi-c
Aug 13 2015 01:22
Yes
Kevin Modzelewski
@kmod
Aug 13 2015 01:25
looks like DenseMap only stores each key once
Rudi Chen
@rudi-c
Aug 13 2015 01:26
Hmm
I'll look at the moved pointer access problem again tomorrow.
I thought it came from the DenseMap...
Kevin Modzelewski
@kmod
Aug 13 2015 01:28
well it still might
or maybe it comes from some other reference into it?
like an iterator or something
Rudi Chen
@rudi-c
Aug 13 2015 01:30
Yeah we're not scanning those at the moment so that's possible.
Sun
@Daetalus
Aug 13 2015 03:47
Hi, @kmod test_float could pass now, except I hard coded the sys.float_info.mant_dig in the test file. Could I add sys.xxx_info support in a seperate PR?
Kevin Modzelewski
@kmod
Aug 13 2015 03:48
sounds good :)
Sun
@Daetalus
Aug 13 2015 03:49
I want to improve complex first, but I found it need fully supoort for float first. So I moved to test_float.
Kevin Modzelewski
@kmod
Aug 13 2015 03:51
yeah, that often happens :)
when I run into that I usually rewrite the commit messages to make it sound like I anticipated the dependency ahead of time :P
Sun
@Daetalus
Aug 13 2015 06:50
@kmod Would you mind restart the gcc CI in #830? And review it if you have time.
Kevin Modzelewski
@kmod
Aug 13 2015 07:25
this looks like an actual compile issue :)
I think gcc is a bit more aggressive about warning about const-correctness
Sun
@Daetalus
Aug 13 2015 07:25
Ok
Vinzenz Feenstra
@vinzenz
Aug 13 2015 07:27
@Daetalus you should use a const cast there then
Sun
@Daetalus
Aug 13 2015 07:27
I see. Thanks. I will try it.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 07:29
But in general, that's a bad sign
Sun
@Daetalus
Aug 13 2015 07:29
BTW, before commit, you can run make format.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 07:29
yeah I know, I just forgot about it
doing that right now
Sun
@Daetalus
Aug 13 2015 07:29
:smile:
Vinzenz Feenstra
@vinzenz
Aug 13 2015 07:30
I make a git clean -fdx before I build it and just pushed it and then built it
so I had to rebuild everything, which costed time
Sun
@Daetalus
Aug 13 2015 08:47
Could someone give some helps about the failure in #830 gcc build? Thanks!
Or how to reproduce it in local machine?
Vinzenz Feenstra
@vinzenz
Aug 13 2015 08:51
~/pyston $ git clean -fdx
~/pyston $ export CC=gcc
Sun
@Daetalus
Aug 13 2015 08:51
Thanks!
Vinzenz Feenstra
@vinzenz
Aug 13 2015 08:51
~/pyston $ make
But I am not sure if that will work
Kevin Modzelewski
@kmod
Aug 13 2015 09:15
try make pyston_gcc
yeah the const errors are annoying
they're because CPython's C API uses (non-const) char* everywhere
Sun
@Daetalus
Aug 13 2015 09:16
thanks! but would you mind to check fresh build. Seems the fresh build has problem...
I don't familiar with CMake, sorry.
Kevin Modzelewski
@kmod
Aug 13 2015 09:17
fresh build of master?
Sun
@Daetalus
Aug 13 2015 09:18
yes.
Kevin Modzelewski
@kmod
Aug 13 2015 09:18
ok, trying it
what are you running into?
Sun
@Daetalus
Aug 13 2015 09:18
CMake Error at /home/sun/pyston2/build/Debug/build_deps/libunwind/src/libunwind-stamp/libunwind-configure.cmake:16 (message):
  Command failed: No such file or directory
Seems the makefile couldn't copy some directories.
Kevin Modzelewski
@kmod
Aug 13 2015 09:19
is there more output from that error?
I think it usually says something about a log file
where the real error is listed
Sun
@Daetalus
Aug 13 2015 09:20
yes. wait a minute, please
Kevin Modzelewski
@kmod
Aug 13 2015 09:21
you can also try
rm -rfv build/Debug/build_deps/libunwind
I think it's easy to get the build system into a confused state
when I tried a fresh build it worked, but once I removed the source directory, the build was unable to recover
until I blew away the libunwind build directory
Sun
@Daetalus
Aug 13 2015 09:23
The problem just gone... I don't know why...
Thank's for your time.
I will check the error. When it done. I will notice you.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 09:28
@kmod that build error: https://travis-ci.org/dropbox/pyston/builds/75394327 is also weird
@Daetalus try: else if (PyObject_AsCharBuffer(v, const_cast<char**>(&s), &len))
Sun
@Daetalus
Aug 13 2015 09:34
Ok, thank you! I will try it!
Just building Pyston.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 09:35
wait, I just see that this is not the problem
it's the PyOS_string_to_double
Sun
@Daetalus
Aug 13 2015 09:35
Yes...
Vinzenz Feenstra
@vinzenz
Aug 13 2015 09:36
line 99
check the comment on github
Sun
@Daetalus
Aug 13 2015 09:37
Ok, thanks very much!
Why Pyston console use two ">", not three?
Sun
@Daetalus
Aug 13 2015 09:58
Emm, still failed. But in my local machine, use pyston gcc build could pass these tests.
Kevin Modzelewski
@kmod
Aug 13 2015 10:04
we use >> just to distinguish
I think PyPy uses >>>>
and CPython doess >>> so I just picked >> :P
hmm the gcc error might depend on the version of gcc
@vinzenz that is a weird failure
fyi the last step of the build is to use pyston to self-host the building of our extension modules
which is where it looks like it's failing
Sun
@Daetalus
Aug 13 2015 10:06
Try again.
And Pyston console seems couldn't handle multiple lines
Kevin Modzelewski
@kmod
Aug 13 2015 10:07
yes, our repl is quite simple :(
Sun
@Daetalus
Aug 13 2015 10:18
Still failed... :worried:
Vinzenz Feenstra
@vinzenz
Aug 13 2015 10:20
@kmod the REPL will improve with the time, I also think that pypa should have a better API, however for now, that you're still falling back/allowing to fallback on CPython that is enough
@kmod I actually started thinking about having an API that allows to create the pyston DOM directly through some callbacks or some kind of AST adapter
Sun
@Daetalus
Aug 13 2015 10:22
I will try another way.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 10:22
@Daetalus let me have a look at the build error
@Daetalus interestingly now there were failing unit tests. The build went through
Sun
@Daetalus
Aug 13 2015 10:23
I found there has a Pyston change in from_cpython/Objects/floatobject.c commented these function out. Maybe there have some reason to comment it our at first time.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 10:24
that's quite probable
I am not that active in that
so I am not tracking it that much
most of the time I am more focused on the parser
if I can find the time :-)
Sun
@Daetalus
Aug 13 2015 10:24
I see.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 10:24
I usually work on other things :D
since I am working for a different company ;)
Sun
@Daetalus
Aug 13 2015 10:26
I always intersted in parser. Maybe I can make some contribution to pypa in futre. And I will find a company in several months later. :smile:
Vinzenz Feenstra
@vinzenz
Aug 13 2015 10:58
well pypa is not funded by any company
libpypa was created as part of my personal interest in writing a parser for already quite some time and also due to my interest in pyston ;-)
libpypa has, at least from my knowledge, no other use than pyston at this point
Sun
@Daetalus
Aug 13 2015 11:50
@vinzenz I know that. :smile:
Vinzenz Feenstra
@vinzenz
Aug 13 2015 11:52
@Daetalus I am right now trying to debug your PR to see where this is coming from
but I just started to build it, so it'll take a moment
Point is, that there seems to be a segmentation fault on freeing memory and I am trying to find where this comes from
Vinzenz Feenstra
@vinzenz
Aug 13 2015 11:58
It seems like that this happens during GC'ing
Sun
@Daetalus
Aug 13 2015 12:17
Thanks, I am wonder whether I should use a different implementation.
The original author comment this implementation may has reason. But I still try to fix it, too.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 12:28
hmm I can't reproduce that crash on my gcc, but I use gcc 4.9.2
the one of travis is using 4.8
Sun
@Daetalus
Aug 13 2015 12:29
I use 4.8.4, but also couldn't reproduce that crash.
Vinzenz Feenstra
@vinzenz
Aug 13 2015 12:40
could be that this was a compiler this code is triggering
@kmod what's your take on that?
Sun
@Daetalus
Aug 13 2015 12:44
It seems is before dawn at his local time.
Seems you are in Europe?
Vinzenz Feenstra
@vinzenz
Aug 13 2015 13:28
yeah I am
Kevin Modzelewski
@kmod
Aug 13 2015 19:55
oh shoot, the test failure is also in release mode
that one is pyston_release_gcc
I'm trying that now, will let you know
this wouldn't be the first time that there was a gcc crash due to an optimization that they do
that hit some sort of not-quite-standards-compliant part of our code
Kevin Modzelewski
@kmod
Aug 13 2015 22:49
hmm I haven't been able to reproduce that crash