These are chat archives for ipython/ipython

16th
Feb 2015
Andreas Klostermann
@akloster
Feb 16 2015 09:30
@SylvainCorlay the C++ standard is still lacking a Guido construct, which would be necessary to have a pep8.
Jason Grout
@jasongrout
Feb 16 2015 15:47
@minrk - I wrote a js test for the serializing synchronization, and when I ran the tests, I got:
Can't test binary websockets on phantomjs.
so I guess the file at least can be parsed....
Min RK
@minrk
Feb 16 2015 17:14
@jasongrout yup, phantomjs doesn't support a lot of things. That's why we are planning to stop using it.
Jason Grout
@jasongrout
Feb 16 2015 17:23
well, phantomjs 2.0, released a few weeks ago, may support current websockets
I gave up checking when they said they don't have binary packages for linux yet, though
Min RK
@minrk
Feb 16 2015 17:27
when I was testing it, Casper didn't run on phantom 2
It's all a mess. I don't really have anything good to say about our experience with Casper and/or phantom so far.
Jason Grout
@jasongrout
Feb 16 2015 17:32
ah, forgot about the casper bit.
okay, well, end result is that I don't know how to automatically run the js test at ipython/ipython#7780
Jason Grout
@jasongrout
Feb 16 2015 17:55
@minrk - when I transmit a binary message from js to python, the buffers attribute is of this type: <zmq.sugar.frame.Frame object at 0x7fabd8b5bd70>
what is that?
(so msg['buffers'][0] is of that type, if I sent one binary buffer from js to python)
I was hoping for a memoryview() object with bytes
Min RK
@minrk
Feb 16 2015 18:04
It's little more than a memory view. It provides the buffer interface, so if you call memoryview or bytes on it, you should get what you expect.
Jason Grout
@jasongrout
Feb 16 2015 18:06
okay, thanks.
is it useful to expose the frame to a comm and/or widget user? It seems like the only extra thing it really has is the zmq_more flag, which isn't really useful at the comm layer
Jason Grout
@jasongrout
Feb 16 2015 18:21
(why don't we call memoryview() on all of the buffer frames at the comm message layer)
Jason Grout
@jasongrout
Feb 16 2015 18:28
huh, when I call memoryview() on the buffer that should contain about 24 bytes, using this code:
for i in msg['buffers']:
    x = memoryview(i)
    print x, x.shape, len(x), x.itemsize, x.strides, x.tobytes(), 'done'
I get: <memory at 0x7f9b4806f478> None 1 1 None done
Min RK
@minrk
Feb 16 2015 18:30
huh, maybe I'm wrong. You can use attr access - frame.buffer for the view, frame.bytes for the bytes
Probably should

why don't we call memoryview() on all of the buffer frames at the comm message layer

probably should, or even at the lowest Session level

I think pyzmq might be implementing the buffer interface wrong
Jason Grout
@jasongrout
Feb 16 2015 18:33
in the above code, it seems that calling i.buffer and memoryview(i) gives the same thing.
but the memoryview seems nonsensical
Min RK
@minrk
Feb 16 2015 18:33
The data is right, but the shape field isn't populated correctly
call view.tobytes() and you will see the bytes
I wonder how that happened
Jason Grout
@jasongrout
Feb 16 2015 18:34
you'll notice that I do call tobytes(), and it doesn't print anything out.
oh, maybe I should print the repr
okay, printing the repr(i.tobytes()) gives all 0 bytes.
which shouldn't be right either
at least it's 24 zero bytes
so it's the right size
Min RK
@minrk
Feb 16 2015 18:38
And what does frame.bytes give you?
Jason Grout
@jasongrout
Feb 16 2015 18:40
all zero bytes as well
Min RK
@minrk
Feb 16 2015 18:42
okay, then that's what's being received
either that, or this is the first case of that information being incorrect.
Jason Grout
@jasongrout
Feb 16 2015 18:44
I'll check the bytes that are being sent.
I do see that https://github.com/zeromq/pyzmq/blob/32355426b41facc0010a07a5254b80481fb542c0/zmq/backend/cython/message.pyx#L213 seems to indicate all of the shape and other attributes are set to null, etc.
this indicates we should be setting some of those attributes to reflect the length: https://docs.python.org/2/c-api/buffer.html#the-new-style-py-buffer-struct
Jason Grout
@jasongrout
Feb 16 2015 18:49
ndim==0 means you have a scalar
hence len(x)==1
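
To illustrate the scalar case with the standard library only, here is a minimal sketch. It assumes a `ctypes` scalar is a fair stand-in for the frame, since it also exports a zero-dimensional buffer; pyzmq itself is not involved, and the names are purely illustrative.

```python
import ctypes

# A ctypes scalar exports a 0-dimensional buffer (stand-in for the frame).
scalar_view = memoryview(ctypes.c_int32(7))

# A bytes object exports a 1-dimensional buffer whose shape carries the length.
bytes_view = memoryview(b"\x00" * 24)

# Zero dimensions and an empty shape: the consumer sees a single scalar.
print(scalar_view.ndim, scalar_view.shape)
# One dimension, with the byte count recorded in the shape.
print(bytes_view.ndim, bytes_view.shape)
```

The second view is what a consumer would hope to see for a 24-byte buffer: the length is carried by `shape`, not only by the underlying data.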
Min RK
@minrk
Feb 16 2015 18:53
Yup, it's an easy fix (PR forthcoming)
I've only ever tested getting the bytes out, and passing to numpy.frombuffer, which work.
zeromq/pyzmq#646
Jason Grout
@jasongrout
Feb 16 2015 18:55
cool!
that was fast -- I was still reading the buffer docs.
Min RK
@minrk
Feb 16 2015 18:58
Helps that I had written all this code before, I just hadn't tested this part.
Jason Grout
@jasongrout
Feb 16 2015 18:58
for now, I suppose we can create a memoryview from the bytes object at the lowest level
Min RK
@minrk
Feb 16 2015 18:59
Or just use the bytes
Jason Grout
@jasongrout
Feb 16 2015 18:59
I hesitate to use bytes because of python 2/3 things. I'm not sure when something will mess up.
but I guess there is no difference between a memoryview on the bytes, and just using the bytes themselves, right?
Min RK
@minrk
Feb 16 2015 19:00
Not really, no
I wouldn't worry about that for the message stuff - pyzmq is very explicitly "always bytes, never text"
Jason Grout
@jasongrout
Feb 16 2015 19:00
okay, I'll go find where session is getting the buffer stuff out, and get the bytes out of the frame
Min RK
@minrk
Feb 16 2015 19:01
Actually, now that I think of it, your first thought was probably right
Jason Grout
@jasongrout
Feb 16 2015 19:01
python 2/3 problems?
Min RK
@minrk
Feb 16 2015 19:01
No, using memoryview instead of bytes
not because of 2/3, but because of zero-copy
The reason we use copy=False is for things like receiving numpy arrays
on the buffers
Jason Grout
@jasongrout
Feb 16 2015 19:02
and the first .bytes makes a copy
Min RK
@minrk
Feb 16 2015 19:02
yes
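
The difference between a view and a copy can be sketched with the standard library alone. Here a `bytearray` stands in for the memory behind a received frame; that is an assumption for illustration, not pyzmq's actual object.

```python
# bytearray stands in for the memory behind a received frame (assumption).
payload = bytearray(b"abcdefgh")

view = memoryview(payload)   # shares memory with payload, no copy made
copied = bytes(view)         # materializes an independent copy of the data

payload[0] = ord("z")        # mutate the underlying memory

# The view observes the mutation; the copy keeps the old contents.
view_sees_change = view[0] == ord("z")
copy_sees_change = copied[0] == ord("z")
print(view_sees_change, copy_sees_change)
```

Handing out the view is what keeps large payloads such as array data from being duplicated on receipt.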
Jason Grout
@jasongrout
Feb 16 2015 19:02
well, until pyzmq gets a new release, we have to use bytes :)
Min RK
@minrk
Feb 16 2015 19:03
Yes and no - you can't make the change in a way that will break the numpy zero-copy
I wonder why Py_buffer.len is unused
there's an assertion that len is equal to sum(shape), but it appears that len(memoryview) computes sum(shape) rather than reading len directly.
Doesn't really matter, I guess
Jason Grout
@jasongrout
Feb 16 2015 19:05
Ah, so the session message buffers are currently used to send numpy arrays out in the wild. And people are used to it being a zmq frame, so they can do zero-copy things. so we can't break that. Is that it?
Min RK
@minrk
Feb 16 2015 19:06
yes
Jason Grout
@jasongrout
Feb 16 2015 19:07
the zmq frame .buffer didn't work before, though, and .bytes made the copy. How did they do zero-copy??
Min RK
@minrk
Feb 16 2015 19:07
it worked fine
Jason Grout
@jasongrout
Feb 16 2015 19:08
what did you just fix, then?
Min RK
@minrk
Feb 16 2015 19:08
the shape information in the Python API didn't present correctly, but apparently the C-API numpy uses was still fine
although...
Maybe it didn't, actually
Jason Grout
@jasongrout
Feb 16 2015 19:09
oh, that was the numpy.frombuffer you're talking about.
question is what the numpy.frombuffer does
Min RK
@minrk
Feb 16 2015 19:09
I'm remembering a known failure that baffled me a long time ago, and maybe this is why it was failing
let me check on that real quick
Also I should be clear that 'out in the wild' is still inside IPython in terms of the Frame->object, so we should still be able to figure out something.
Jason Grout
@jasongrout
Feb 16 2015 19:11
oh, I thought maybe some of the parallel processing serialization code used it to transfer arrays
Min RK
@minrk
Feb 16 2015 19:12
yes, but the user API still gets arrays. IPython handles the serialization to pyzmq, from buffers, etc.
Ha, yes.
So while the API did work, it actually forced a copy
since memoryview.tobytes() worked, maybe numpy.frombuffer was falling back on that in the absence of shape information
Jason Grout
@jasongrout
Feb 16 2015 19:15
interesting
Min RK
@minrk
Feb 16 2015 19:17
so a similar workaround could probably be done:
buffers = [ frame.buffer for frame in buf_frames ]
if buffers and buffers[0].shape is None:
    # force copy to work around pyzmq #646
    buffers = [ memoryview(frame.bytes) for frame in buf_frames ]
actually, use memoryview(frame) instead of frame.buffer for consistency. frame.buffer is a Python 2 buffer on py2.
but memoryviews are available on 2.7, and IPython doesn't support 2.6 (pyzmq still does, though)
Min RK
@minrk
Feb 16 2015 19:25
Which should be fine, since I was wrong that frombuffer worked without copying on Python 3.
Jason Grout
@jasongrout
Feb 16 2015 19:25
PR coming up...
Min RK
@minrk
Feb 16 2015 19:26
Awesome
Sylvain Corlay
@SylvainCorlay
Feb 16 2015 19:36
found a bad, hairy widget bug
if both the model and the view of a widget are loaded via the requirejs mechanism
and you try to display right away, the loading of the model module may not be complete.
I can see if there is a simple fix
Jason Grout
@jasongrout
Feb 16 2015 19:54
@minrk: see ipython/ipython#7798
Jason Grout
@jasongrout
Feb 16 2015 20:13
@minrk - I'm now having some issues with IPython/html/base/zmqhandlers.py.
I do see some serialize and deserialize functions in there that deal with buffers. Do you know what the zmqhandlers.py file is for?
Min RK
@minrk
Feb 16 2015 20:15
it's where the base websocket<->zmq stuff is defined
Jason Grout
@jasongrout
Feb 16 2015 20:16

hmm...for some reason, in serialize_binary_messages, I'm now getting an error on line 54:

        return b''.join(buffers)
    TypeError: sequence item 2: expected string, memoryview found

I'll try to track it down

Min RK
@minrk
Feb 16 2015 20:17
since that code wasn't using zero-copy, it assumed bytes
Jason Grout
@jasongrout
Feb 16 2015 20:17
but I guess my changes to session probably change the assumption, since now buffers are memoryviews
right?
Min RK
@minrk
Feb 16 2015 20:17
In master, when copy=True, frames are bytes, when copy=False, frames are zmq.Frame objects
Jason Grout
@jasongrout
Feb 16 2015 20:18
This message was deleted
Min RK
@minrk
Feb 16 2015 20:18
Sorry, that's backward, yes
(fixed)
Jason Grout
@jasongrout
Feb 16 2015 20:18
(hehe...)
deleted
Min RK
@minrk
Feb 16 2015 20:19
Buffers have really been an internal API, so they haven't been thought about too carefully.
I think it would make sense to do either:
  1. preserve the current pattern, but return memoryview on copy=False
  2. handle copy internally, and always return a memoryview
My guess is your PR intends to do 1., but ends up doing 2. because it doesn't check whether the buffer frames are already bytes.
I think both are equally reasonable, we just have to pick.
Jason Grout
@jasongrout
Feb 16 2015 20:22
my PR ignores copy
Min RK
@minrk
Feb 16 2015 20:23
Do you think always returning a memoryview makes the most sense, or let the copy behavior determine the return type?
Jason Grout
@jasongrout
Feb 16 2015 20:23
I personally like type stability
Min RK
@minrk
Feb 16 2015 20:23
I think that makes sense for message buffers
Jason Grout
@jasongrout
Feb 16 2015 20:24
But I can make a copy if copy=True (by using the frame.bytes)
Min RK
@minrk
Feb 16 2015 20:24
if copy=True, frame will be bytes
since zmq returns Frame for copy=False, bytes for copy=True
so the first call to view(frame) is actually view(bytes)
Jason Grout
@jasongrout
Feb 16 2015 20:25
zmq copy != session.deserialize(copy=...)
Min RK
@minrk
Feb 16 2015 20:25
...False?
Jason Grout
@jasongrout
Feb 16 2015 20:25
I'm talking about the copy argument to session.deserialize
this is above the zmq layer copy parameter
Min RK
@minrk
Feb 16 2015 20:26
But it should always have the same value
It's the same parameter
Jason Grout
@jasongrout
Feb 16 2015 20:27
ah, right, because of how deserialize is called
Min RK
@minrk
Feb 16 2015 20:32
So I think the only fix that should be needed is telling the binary serialization in zmqhandlers to expect memoryviews
basically, b''.join([ bytes(buf) for buf in buffers ])
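
A minimal sketch of that cast, assuming Python 3, where `bytes()` applied to a memoryview copies its contents. The helper name `join_buffers` is hypothetical and not taken from zmqhandlers.py.

```python
def join_buffers(buffers):
    """Concatenate buffers that may be bytes or memoryview objects."""
    # bytes(buf) is a no-op for bytes and a copy for memoryviews (Python 3).
    return b"".join([bytes(buf) for buf in buffers])


# Mixed input: a plain bytes object and two memoryviews.
parts = [b"head", memoryview(b"body"), memoryview(bytearray(b"tail"))]
joined = join_buffers(parts)
print(joined)
```

The cast costs one copy per memoryview before the join makes its own, so it trades some efficiency for accepting either type.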
Jason Grout
@jasongrout
Feb 16 2015 20:33
I don't get the logic at https://github.com/ipython/ipython/blob/2a8a2c87afc9cc2d1f81ee82b5c18d050b83726d/IPython/kernel/zmq/session.py#L820. It seems to create a copy of the non-buffer message parts if copy is False.
but it doesn't seem to do anything if copy is True
Min RK
@minrk
Feb 16 2015 20:33
Correct, those are the front of the message - JSON parts, etc.
Jason Grout
@jasongrout
Feb 16 2015 20:34
so create a copy if copy is False?
Min RK
@minrk
Feb 16 2015 20:34
copy only really affects buffers
but the arg to zmq affects the whole multipart message
in order to pass it to JSON, we get the bytes out.
Jason Grout
@jasongrout
Feb 16 2015 20:34
ah, so if copy=True, the whole message will be copied before we get it in deserialize. If copy is False, then we still want to copy the first 5 parts.
Min RK
@minrk
Feb 16 2015 20:35
yes
well, less "want to copy", more "want bytes objects", but the result is the same.
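
The split described here can be sketched as follows. This is not the Session code: `split_parts` and `n_front` are hypothetical names, the count of 5 comes from the "first 5 parts" remark above, and `bytes` / `bytearray` objects stand in for what zmq would hand back with copy=True / copy=False.

```python
def split_parts(msg_list, n_front=5):
    """Split a received multipart message into front parts and buffers.

    n_front=5 follows the "first 5 parts" mentioned in the discussion;
    it is an assumption of this sketch.
    """
    # The front parts are passed on to JSON decoding, so make them bytes.
    front = [bytes(part) for part in msg_list[:n_front]]
    # The remaining parts are binary buffers: always expose memoryviews,
    # whether the incoming part was already bytes or a zero-copy object.
    buffers = [memoryview(part) for part in msg_list[n_front:]]
    return front, buffers


# bytes mimic copied parts; the bytearray mimics a zero-copy part.
message = [b"sig", b"{}", b"{}", b"{}", b"{}", b"\x01\x02", bytearray(b"\x03\x04")]
front, buffers = split_parts(message)
print(len(front), [type(b).__name__ for b in buffers])
```

Returning memoryviews in both cases gives callers the type stability discussed earlier, regardless of the copy setting.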
Jason Grout
@jasongrout
Feb 16 2015 20:35
I'm going to add an explanatory remark to the code...
Min RK
@minrk
Feb 16 2015 20:35
excellent
much of that internal stuff hasn't been revisited since it was experimental in 2011
Jason Grout
@jasongrout
Feb 16 2015 20:36
so if copy is True, then the memoryview(msg_list[6]) line should just work, because it will be a bytes object
it only wasn't working before because copy apparently was False
Min RK
@minrk
Feb 16 2015 20:40
right, the reserialization expected bytes objects
it just needs to expect memoryviews, which it can do by casting at any point
or maybe there's an efficient way to do it with memoryviews
Jason Grout
@jasongrout
Feb 16 2015 20:44
I'm looking into it
Sylvain Corlay
@SylvainCorlay
Feb 16 2015 20:59

So in manager.js, Line 181, we have

        model.state_change = model.state_change.then(function() {

            return utils.load_class(model.get('_view_name'), model.get('_view_module'),
            WidgetManager._view_types).then(function(ViewType) {

where model.get('_view_name') and model.get('_view_module') return undefined. It seems to be due to the update message being processed after the display message

Jason says he already fixed it.
Jason Grout
@jasongrout
Feb 16 2015 20:59
Friday.
sylvain and I should talk more :)
Sylvain Corlay
@SylvainCorlay
Feb 16 2015 21:00
(at the moment, we shout at a distance)
Jason Grout
@jasongrout
Feb 16 2015 21:44
@minrk - b''.join([memoryview, memoryview]) works fine in python3, and I think is efficient as can be expected (there is one copy, but probably only one copy).
unfortunately, it doesn't work in python2
Jason Grout
@jasongrout
Feb 16 2015 21:51
@minrk - is it allowed to make python 2/python 3 branches of code?
Min RK
@minrk
Feb 16 2015 22:27
yes, definitely
Jason Grout
@jasongrout
Feb 16 2015 22:31
what's the recommended thing to test to determine python2/python3?
Min RK
@minrk
Feb 16 2015 22:51
IPython.utils.py3compat.PY3
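
A version branch for the join could look like the sketch below. To keep it free of an IPython dependency, `PY3` is defined locally from `sys.version_info` as a stand-in for `IPython.utils.py3compat.PY3`; the helper name `join_views` is hypothetical.

```python
import sys

# Local stand-in for IPython.utils.py3compat.PY3 (assumption: a plain
# version check, so the sketch does not depend on IPython).
PY3 = sys.version_info[0] >= 3


def join_views(views):
    """Join memoryviews into one bytes object on either Python version."""
    if PY3:
        # bytes.join accepts buffer-protocol objects directly on Python 3.
        return b"".join(views)
    # Python 2's str.join rejects memoryviews, so convert each one first.
    return b"".join([v.tobytes() for v in views])


combined = join_views([memoryview(b"ab"), memoryview(b"cd")])
print(combined)
```

The Python 2 branch uses `tobytes()` and not `bytes(v)`, on the understanding that `bytes` is `str` there and would not return the buffer contents.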