These are chat archives for spyder-ide/public

7th
Nov 2018
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 07:52 UTC
@ccordoba12 Thanks for the reply. Spyder's variable explorer is indeed the best implementation out there, and that's why I'm suggesting improvements to it. As you may have seen in the discussion on the NumPy mailing list or the Stack Overflow post, NumPy arrays are IMHO not tabular data structures but rather nested lists. I think showing them in the form of tables is a little misleading. Plus, when going to higher dimensions, showing them in the form of a matrix is not easy. What I have suggested in those forums is to use Unicode in the conventional consoles, or the Markdown/LaTeX/HTML functionality in IPython/Jupyter, to get a better representation. Think of how pandas represents DataFrames, or of SymPy's pretty printing. I implemented the Unicode version myself and you may see it here, and the HTML version by somebody else here.
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 08:03 UTC
@CAM-Gerlach is it OK if I open a feature request here: https://github.com/spyder-ide/spyder/issues ?
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:04 UTC
A feature request for what novel feature, specifically?
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 08:04 UTC
@CAM-Gerlach mainly better representation of numpy ndarrays
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:04 UTC
Better how?
@fsfarimani_twitter

> As you may have seen in the discussion on the NumPy mailing list or the Stack Overflow post, NumPy arrays are IMHO not tabular data structures but rather nested lists.

If you indeed follow the discussion you linked, you'll note that, in direct reply to your comment that

> My understanding numpy ndarrays are not exactly multidimensional arrays as we know in mathematics but rather advanced python lists

the individual to whom you had replied before, and who apparently gave you that impression, responded with the following:

> The data storage for ndarray is totally different from a list. docs.scipy.org/doc/numpy-1.15.0/reference/arrays.html. A 0d array is not quite the same as an array scalar which isn't quite the same as Python scalar

Therefore, the idea that NumPy arrays are just "nested lists" as opposed to arrays is simply not correct. Furthermore, if we examine the reference they linked, lists are not mentioned anywhere in it. Even in [the basic description of arrays](https://docs.scipy.org/doc/numpy-1.15.0/reference/arrays.ndarray.html), linked as the first section of that document, the only mention is under the conversion functions: ``ndarray.tolist()``, which converts the array to a nested list and returns it, alongside many other functions that convert it to different forms.

Furthermore, even if NumPy arrays were just nested Python lists under the hood, it would all be quite immaterial, since that's just a hidden implementation detail that may change at any point, while an array as a conceptual data structure, i.e. the high-level abstraction presented to the user, is exactly as it is shown in Spyder: an n-dimensional matrix of scalar data. Only an incredibly small proportion of users, i.e. perhaps a few dozen NumPy core developers, would ever need to work with this implementation detail, whereas virtually all other users would only be concerned with how arrays are supposed to be presented, i.e. as, well, arrays.
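A minimal sketch, not from the chat itself, of the distinction drawn in the quoted reply: an ndarray is a single typed buffer described by shape, dtype and strides, ``ndarray.tolist()`` is an explicit conversion, and a 0-d array, an array scalar and a Python scalar are three different types. The variable names are illustrative, and the dtype is fixed to ``int64`` here so that the strides are the same on every platform.

```python
import numpy as np

nested = [[1, 2, 3], [4, 5, 6]]
arr = np.array(nested, dtype=np.int64)

# One contiguous buffer of fixed-size items, described by shape/dtype/strides.
print(arr.shape, arr.dtype, arr.strides)   # (2, 3) int64 (24, 8)
print(arr.flags["C_CONTIGUOUS"])           # True

# Getting a nested list back is an explicit conversion that builds new objects.
as_list = arr.tolist()
print(as_list == nested, type(as_list[0][0]).__name__)   # True int

# A 0-d array, an array scalar and a Python scalar are distinct types.
zero_d = np.array(5)          # 0-d ndarray
array_scalar = arr[0, 0]      # NumPy array scalar (np.int64)
py_scalar = zero_d.item()     # plain Python int
print(type(zero_d).__name__, type(array_scalar).__name__, type(py_scalar).__name__)
# ndarray int64 int
```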
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 08:07 UTC
@CAM-Gerlach well, there are extensive discussions going on in the NumPy mailing list, but TL;DR: NumPy ndarrays are not matrices but rather homogeneous, uniform nested lists. This should be considered in their representation. I have tried to show some examples here in this post.
2018-11-07_09-07-37.gif
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:13 UTC
Read my post above; I have read yours, including the comments which dispute that notion. Even if that's exactly what they are under the hood (and it is very unclear from what you've presented that it's actually the case), it's completely immaterial to how they should be presented to the user, because it has nothing to do with the abstraction they represent. It's the same thing with a pandas DataFrame: suppose it's represented under the hood as a list of NumPy arrays. Would we want to show it as such, or as, well, an actual tabular dataframe, as we do?
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 08:13 UTC
@CAM-Gerlach Thanks a lot. I appreciate your considerations.
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:18 UTC
If you think showing them as tables is misleading, then really everything we show in the Variable Explorer (or, heck, print in the Console) is misleading, since variable names/symbols themselves are just pointers to the actual objects, which are themselves just a series of bytes, which are a series of bits, which are a series of electrical pulses. At some point, one has to draw the line at the level of abstraction appropriate for the context. As Spyder is fundamentally an IDE for scientists, engineers and data analysts, not for the minute proportion of people who are CS majors specializing in data structures, it makes sense to show arrays (and DataFrames, lists, dicts, sets, tuples, etc.) at a level of abstraction appropriate for how they are used by a typical such individual, or really by virtually anyone who isn't a highly specialized low-level core dev.
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:24 UTC

> What I have suggested in those forums is to use Unicode in the conventional consoles

Spyder's "conventional consoles" are IPython/Jupyter kernels, since they offer a strict superset of the functionality of plain Python for scientific and analysis applications.

> I implemented the Unicode version myself and you may see it here

Sorry, but I don't see how your output, this:

#┌─────────────┐
# [111 122 133]
# [21  22  23 ]
# [31  32  33 ]
#└─────────────┘

Offers any functional advantages over this:

image.png
And with regard to your HTML example above: sure, that's cute for Jupyter, which I guess doesn't offer anything better, but again, what meaningful improvement does it offer for non-toy-size arrays over our version above?
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 08:26 UTC
chrome_2018-11-07_09-25-37.png
Dear @CAM-Gerlach, in the above image you may see the difference. I would like to emphasize the fact that NumPy arrays are not matrices but nested lists. Of course, the way they are stored in memory is different from Python lists (I did not know that from the beginning). In my humble opinion, representing NumPy arrays in the form of matrices is not only limited but also misleading. I'm sorry if my idea is not appealing to you. I hope I could explain it better. Thanks for your support.
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:39 UTC
@fsfarimani_twitter Okay, so I assume you're implying a more compact representation. Great! To note, the gap narrows considerably if you hit the "Resize" button in the Spyder ArrayEditor window to narrow the size by ~3x, and we have some improvements in default column sizing for Spyder 4; even as is, with "Resize" turned on, arrays are displayed slightly more compactly than the pretty HTML widget. Your console printing representation offers more benefit in that regard and could fill a somewhat different role than Spyder's variable explorer, but it would make much more sense implemented as an improved NumPy default print() method, a specific array.pretty_print() method, or a feature in IPython or QtConsole. Controlling what prints to the console in response to a specific function is the domain of something at those levels in the stack, so more people would benefit, and it really isn't something under the control of Spyder itself. Therefore, if there is interest in this, I'd encourage you to submit it as a PR to one of those places (NumPy, IPython, QtConsole; I'd start with the lowest level, NumPy, and work your way up) and see where it leads. Best of luck!
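A rough sketch, not from the chat itself, of what a console-level pretty printer of the kind discussed above might look like. ``boxed_2d`` is a hypothetical helper name, not an existing NumPy or IPython function, and this version assumes a 2-D array only. It reproduces the layout of the Unicode output quoted earlier in the chat.

```python
import numpy as np

def boxed_2d(arr):
    """Hypothetical helper: render a 2-D array between Unicode box corners."""
    arr = np.asarray(arr)
    if arr.ndim != 2:
        raise ValueError("this sketch only handles 2-D arrays")
    # Convert every element to text, then pad each cell to the widest one.
    cells = [[str(v) for v in row] for row in arr.tolist()]
    width = max((len(c) for row in cells for c in row), default=0)
    rows = ["[" + " ".join(c.ljust(width) for c in row) + "]" for row in cells]
    inner = max((len(r) for r in rows), default=0)
    top = "┌" + "─" * inner + "┐"
    bottom = "└" + "─" * inner + "┘"
    # Indent the rows by one space so they sit inside the corners.
    return "\n".join([top] + [" " + r for r in rows] + [bottom])

print(boxed_2d([[111, 122, 133], [21, 22, 23], [31, 32, 33]]))
```

Because the function only returns a string, it works the same in a plain Python prompt, IPython or a Spyder console, which is the sense in which such a feature lives below Spyder in the stack.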
Foad Sojoodi Farimani
@fsfarimani_twitter
Nov 07 2018 08:45 UTC
@CAM-Gerlach well, the compactness of the Unicode prototype is actually not intended, but it is an appreciated side effect. To better understand my issue with the current Spyder representation, please consider the [1], [[1]] and [[[1]]] arrays. How do you represent them in the current Spyder variable explorer? In my implementation they are shown as:
chrome_2018-11-07_09-45-21.png
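For reference, a small sketch (not from the chat itself) of how NumPy describes the three arrays mentioned above. All three hold a single element and differ only in their number of dimensions; whether a viewer should make that difference visible is the point under discussion. The ``described`` list is just an illustrative name.

```python
import numpy as np

described = []
for literal in ([1], [[1]], [[[1]]]):
    a = np.array(literal)
    described.append((a.ndim, a.shape, a.size))
    print(f"ndim={a.ndim} shape={a.shape} size={a.size} repr={a!r}")

# ndim=1 shape=(1,) size=1 repr=array([1])
# ndim=2 shape=(1, 1) size=1 repr=array([[1]])
# ndim=3 shape=(1, 1, 1) size=1 repr=array([[[1]]])
```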
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:52 UTC

> I'm sorry if my idea is not appealing to you.

No, your basic ideas for pretty-printing arrays are quite fine and sound like they've generated a lot of good discussion on the Numpy mailing list from what I've been reading (thanks for the link). I look forward to seeing an implementation in Numpy, IPython or both, depending on where they best end up, and any such improvements will of course propagate right up the line to be available in Spyder's consoles. I'm looking forward to seeing the results, and appreciate your work over there.

CAM Gerlach
@CAM-Gerlach
Nov 07 2018 08:58 UTC
Where I think the misunderstanding lies is twofold. First, with regard to the two representations (printing directly from NumPy to the console, or the interactive IPython HTML representation), both exist at much lower levels of the stack than Spyder: in the original package being used (NumPy), in the interpreter (IPython) or in the console emulator (QtConsole). These are all great complements to Spyder's built-in Variable Explorer for specific use cases of pretty-printing arrays in a quick, compact and lower-level way, and not really replacements at all for Spyder's fuller-featured, higher-level Variable Explorer.
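To make the "lower level in the stack" point concrete, here is a minimal sketch (not from the chat itself) of an HTML representation hooked in at the IPython level. ``ndarray_to_html`` is a hypothetical helper that emits one nested table per axis; it is not the HTML widget referred to in the discussion. The registration lines are left as comments because they only make sense inside a running IPython/Jupyter session.

```python
import html
import numpy as np

def ndarray_to_html(arr):
    """Hypothetical helper: nested <table> markup, one nesting level per axis."""
    arr = np.asarray(arr)
    if arr.ndim == 0:
        return html.escape(str(arr.item()))
    if arr.ndim == 1:
        cells = "".join(f"<td>{html.escape(str(v))}</td>" for v in arr.tolist())
        return f"<table><tr>{cells}</tr></table>"
    # Higher dimensions: each sub-array along the first axis gets its own row.
    rows = "".join(f"<tr><td>{ndarray_to_html(sub)}</td></tr>" for sub in arr)
    return f"<table>{rows}</table>"

print(ndarray_to_html(np.array([1, 2])))
# <table><tr><td>1</td><td>2</td></tr></table>

# Inside IPython/Jupyter, a formatter could be registered for ndarray (not run here):
#   ip = get_ipython()
#   ip.display_formatter.formatters["text/html"].for_type(np.ndarray, ndarray_to_html)
```

Once registered this way, any front end that renders ``text/html`` output from the kernel would pick it up, without changes in the IDE.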
CAM Gerlach
@CAM-Gerlach
Nov 07 2018 09:10 UTC

Second, there is your continued insistence on the notion that, at least with regard to the conceptual basis for what the data type represents and how a user interacts with it,

> I would like to emphasize the fact that NumPy arrays are not matrices but nested lists. [...] In my humble opinion, representing NumPy arrays in the form of matrices is not only limited but also misleading

which the NumPy core developers who have actually designed and implemented said abstraction have repeatedly and thoroughly debunked. Even if it were unambiguously true, again, should we be representing DataFrames as lists of arrays? Or lists themselves as a series of pointers to objects and to the next linked item in the list? Of course not; we represent objects at the level of abstraction appropriate for the user to view and interact with the object as the thing it is intended to represent, not as individual bytes ordered exactly how they are in memory. If the user wanted to see that very low-level implementation, they would be manually inspecting the AST or the data in memory with a low-level debugger, not looking at a high-level representation in a data science IDE.

If that's lying to the user, then I don't see how anything short of printing the raw bytes in memory isn't, since everything above that is a successive layer of conceptual abstraction, just like displaying a numpy multidimensional array as a multidimensional array, rather than as the underlying low-level implementation.
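The layering described above can be seen directly in a few lines. This is an illustrative sketch, not from the chat itself, with ``uint8`` chosen so the raw bytes do not depend on the machine's byte order.

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]], dtype=np.uint8)

# Lowest level shown here: the raw bytes of the buffer, in C order.
raw = arr.tobytes()
print(raw)            # b'\x01\x02\x03\x04'

# One level up: the same bytes reinterpreted with a dtype and a shape.
rebuilt = np.frombuffer(raw, dtype=np.uint8).reshape(2, 2)
print(np.array_equal(rebuilt, arr))   # True

# The level a user normally works at: a 2-D array.
print(arr)
# [[1 2]
#  [3 4]]
```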

CAM Gerlach
@CAM-Gerlach
Nov 07 2018 09:16 UTC
So again, to be clear, as soon as something like that gets implemented in NumPy, it would immediately be available in any Python interpreter, including inside Spyder and anywhere else NumPy can be used. So you don't have to worry about us liking it or not :)