    Quansight
    @quansightai_twitter
    [Ralf Gommers] there are also sanely designed ndarray subclasses, like in astropy. the only real problem for those is that, due to np.matrix, there are so many places where subclasses are coerced to ndarrays. in principle there’s no other fundamental issue with subclassing (although indeed composition can often be a better solution)
    Quansight
    @quansightai_twitter
    [Aaron Meurer] I'm not sure if Liskov is properly applied there. Taken to its logical conclusion, you cannot have any nontrivial subtypes, because they will create behavior that isn't true for the original type
    [Michael Eaton] @ralf.gommers np.matrix coercion?
    [Michael Eaton] @asmeurer abstract base classes would be a rather obvious, but unenlightened example.
    [Ralf Gommers] @meaton most functions start with x = np.asarray(x). subclasses gone
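The coercion described above can be reproduced in a few lines. This is a minimal sketch: `Tagged`, `double_strict` and `double_lenient` are hypothetical names made up for illustration, while `np.asarray` and `np.asanyarray` are the real NumPy functions.

```python
import numpy as np


class Tagged(np.ndarray):
    """Hypothetical do-nothing ndarray subclass, used only for illustration."""


def double_strict(x):
    x = np.asarray(x)      # always hands back a base ndarray
    return x * 2


def double_lenient(x):
    x = np.asanyarray(x)   # lets ndarray subclasses pass through unchanged
    return x * 2


t = np.arange(3).view(Tagged)
strict_type = type(double_strict(t))
lenient_type = type(double_lenient(t))
print(strict_type.__name__, lenient_type.__name__)
```

The first function loses the subclass and the second keeps it, which is why a function that opens with `np.asarray` is unfriendly even to well-behaved subclasses.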
    [Aaron Meurer] which isn't to say that I disagree that float should not be a rational in the numbers hierarchy (the stdlib numbers hierarchy is a bit misguided IMO)
    Quansight
    @quansightai_twitter

    [Stefan Krah] @asmeurer This is how Liskov is generally used, at least on python-dev it serves the purpose of rejecting subclasses that one does not like. :)

    But seriously, how can the principle hold for nontrivial subclasses?

    [Ralf Gommers] An array subclass with units is a good example. It carries around units, updates them for multiplications etc, and raises exceptions when units don't add up. It extends behavior but does not change how existing methods/attributes work. The problem with things like np.matrix is that it changes behavior (like 0-D return values become 2-D)
    [Aaron Meurer] I don't know. Wikipedia uses the phrase "desirable property".
    [Aaron Meurer] imo np.matrix has more sane semantics, which is a good thing (though only for multiplication)
    Quansight
    @quansightai_twitter
    [Stefan Krah] But then I think it is correctly applied, since the subclass violates quite a fundamental property (exact results) of the superclass.
    [Aaron Meurer] perhaps that's more an issue of "matrix semantics on array are bad", which doesn't necessarily imply that it is good to correct them in a subclass
    [Aaron Meurer] I'm curious where does np.matrix break? Doesn't broadcasting take care of 0-D vs. 2-D for the most part?
    [Aaron Meurer] of course broadcasting isn't great semantics for matrix in the first place. That's why I said "only for multiplication". np.matrix([1]) + np.matrix([[1, 2], [3, 4]]) works, which is not good
    [Ralf Gommers] definitely not, it breaks in all sorts of ways (indexing to start with). anyway, np.matrix is terrible, that's not the interesting discussion. the interesting one is which subclasses are and aren't a good idea. the units one is easy, but there's always a gray area. making fixed rules for those is not trivial. certainly just quoting a principle isn't enough
    [Aaron Meurer] maybe not interesting to you :)
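As a rough illustration of the indexing breakage mentioned above, the sketch below passes an ndarray and an np.matrix to the same function. `first_row_ndim` is a hypothetical helper. The warning filter is only there because constructing an np.matrix emits a PendingDeprecationWarning in recent NumPy versions.

```python
import warnings

import numpy as np

warnings.simplefilter("ignore", PendingDeprecationWarning)


def first_row_ndim(a):
    """Written with ndarray semantics in mind: integer indexing drops a dimension."""
    return a[0].ndim


data = [[1, 2], [3, 4]]
array_ndim = first_row_ndim(np.array(data))    # row has shape (2,)
matrix_ndim = first_row_ndim(np.matrix(data))  # row has shape (1, 2)
print(array_ndim, matrix_ndim)
```

Code that relies on `a[0]` being 1-D works for the parent class and silently changes meaning for the subclass, which is the kind of non-additive change the Liskov argument is about.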
    Quansight
    @quansightai_twitter
    [Aaron Meurer] I disagree that np.matrix is terrible. I've had code where I had problems with np.array that went away when I switched to np.matrix, because matrix has better semantics (always being 2-D is one of them).
    [Aaron Meurer] So I am interested in the other side of that
    [Aaron Meurer] my biggest thing is that I want proper type checking—for things to fail when shapes mismatch. The above behavior for 1x1 + 2x2 is one issue with that. The way numpy handles 1D @ 2D is another (which is solved by np.matrix).
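Both shape complaints above can be checked directly. This is a small sketch using only real NumPy calls. The variable names are arbitrary, and the warning filter just silences the PendingDeprecationWarning that np.matrix emits.

```python
import warnings

import numpy as np

warnings.simplefilter("ignore", PendingDeprecationWarning)

M = [[1, 2], [3, 4]]

# 1x1 + 2x2: broadcasting silently stretches the 1x1 operand, even for np.matrix.
summed = np.matrix([1]) + np.matrix(M)

# 1-D @ 2-D with plain arrays: the vector acts as a row or a column as needed.
v = np.array([1, 2])
left = v @ np.array(M)     # v treated as a row vector
right = np.array(M) @ v    # v treated as a column vector

# np.matrix([1, 2]) is always 1x2, so a 2x2 @ 1x2 product is rejected.
try:
    np.matrix(M) @ np.matrix([1, 2])
    matrix_rejects = False
except ValueError:
    matrix_rejects = True

print(summed.tolist(), left.tolist(), right.tolist(), matrix_rejects)
```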
    Quansight
    @quansightai_twitter
    [Ralf Gommers] terrible as a subclass I meant. if you do linalg, by all means create a matrix object. just don't make it an ndarray subclass. Liskov does apply there
    [Aaron Meurer] a matrix type that doesn't attempt to be a subclass could be good actually. Then it could be much more strict about semantics, rather than automatically inheriting everything from array
    [Aaron Meurer] I don't know if something like that already exists
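A non-subclass matrix along those lines could look roughly like the sketch below. The `Matrix` class and its methods are hypothetical, written only to show composition with strict shape checks. It is not an existing NumPy or xnd type.

```python
import numpy as np


class Matrix:
    """Hypothetical strict 2-D matrix that wraps an ndarray instead of subclassing it."""

    def __init__(self, data):
        arr = np.array(data)
        if arr.ndim != 2:
            raise ValueError(f"expected 2-D data, got {arr.ndim}-D")
        self._data = arr

    @property
    def shape(self):
        return self._data.shape

    def __add__(self, other):
        if not isinstance(other, Matrix):
            return NotImplemented
        if self.shape != other.shape:  # deliberately no broadcasting
            raise ValueError(f"shape mismatch: {self.shape} + {other.shape}")
        return Matrix(self._data + other._data)

    def __matmul__(self, other):
        if not isinstance(other, Matrix):
            return NotImplemented
        if self.shape[1] != other.shape[0]:
            raise ValueError(f"shape mismatch: {self.shape} @ {other.shape}")
        return Matrix(self._data @ other._data)

    def to_array(self):
        """Explicit escape hatch back to a plain ndarray."""
        return self._data.copy()


a = Matrix([[1, 2], [3, 4]])
try:
    Matrix([[1]]) + a
    broadcast_allowed = True
except ValueError:
    broadcast_allowed = False

product = (a @ a).to_array()
```

Because nothing is inherited from ndarray, only the operations that are spelled out exist, so the 1x1 + 2x2 case fails instead of broadcasting.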
    Quansight
    @quansightai_twitter
    [Stefan Krah] Type checking in xnd isn't done on the object level. You can typedef "matrix" as N * M * Scalar and get type-checking in both xnd.xnd and xnd.array.
    [Stefan Krah] You could define xnd.matrix with a constructor that only allows construction of values that match this pattern.
    [Ralf Gommers] not sure about a non-subclass matrix, but are you aware of PyLops?
    [Aaron Meurer] no. I'll take a look
    Quansight
    @quansightai_twitter
    [Stefan Krah] No, I bookmarked it.
    Dave Hirschfeld
    @dhirschfeld

    [Stefan Krah] We could be fancy and check for the narrowest type among the input arguments and return that, but this sort of thing slows down small array operations.

    I'm pretty interested in keeping the overhead as minimal as possible, so if that means having to specify the type (cls) then so be it. However, perhaps we can get the best of both worlds by skipping the type-guessing heuristics when cls is not None?

    [Pearu Peterson] Similarly for multiple argument cases, the first argument type defines the type of the output objects (default behavior).

    I'd be fine with this choice - so long as it's consistent it can be accommodated. I guess the other option would be to return the type of the first common base, but I assume that would have a much larger performance impact.

    Dave Hirschfeld
    @dhirschfeld
    As to whether or not sin should be a function or a method - this is an issue I also see in numpy, in that the boundary between what's a method and what's a function is somewhat blurred and opaque.
    For the most part I like the method-call version as it can be clearer if you're chaining several operations together
    I often get bitten by the fact that there is no abs method - only the function version
    I quite like pandas' concept of function namespaces - e.g. Series.dt, Series.str which hold functions for datetime and string operations respectively
    Dave Hirschfeld
    @dhirschfeld
    xnd.array could possibly use a similar concept where the xnd.array.ufunc "namespace" exposed the gumath.functions namespace with the first argument substituted for self
    Along with Pearu's suggestion that cls=type(args[0]) by default, this would ensure that the method-call versions returned the type of the instance
    Dave Hirschfeld
    @dhirschfeld
    e.g.
    class A(xnd): 
        pass
    
    class B(xnd):
        pass
    
    a = A([1])
    b = B([2])
    
    c = gumath.functions.add(a, b)
    assert type(c) == type(a)
    
    c = a.ufunc.add(b)
    assert type(c) == type(a)
    
    c = gumath.functions.add(b, a)
    assert type(c) == type(b)
    
    c = b.ufunc.add(a)
    assert type(c) == type(b)
    Dave Hirschfeld
    @dhirschfeld
    ...but cls can still be overridden:
    c = gumath.functions.add(a, b, cls=type(b))
    assert type(c) == type(b)
    
    c = a.ufunc.add(b, cls=type(b))
    assert type(c) == type(b)
    Anyway, just a thought. On the downside xnd.array.ufunc is a bit of a mouthful ¯_(ツ)_/¯
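One way the proposal above might be wired up is sketched below in plain Python. Everything here is a stand-in: `Array` plays the role of xnd.array, the module-level `add` plays the role of gumath.functions.add with the proposed `cls` keyword, and `_UfuncNamespace` is a hypothetical accessor. None of these reflect the actual xnd or gumath implementation.

```python
class Array:
    """Stand-in for xnd.array; holds a plain list so the sketch runs without xnd."""

    def __init__(self, values):
        self.values = list(values)

    @property
    def ufunc(self):
        return _UfuncNamespace(self)


def add(a, b, cls=None):
    """Stand-in for gumath.functions.add with the proposed cls keyword."""
    if cls is None:
        cls = type(a)  # default: the first argument decides the result type
    return cls(x + y for x, y in zip(a.values, b.values))


class _UfuncNamespace:
    """Exposes module-level functions as methods, with the owner as first argument."""

    def __init__(self, owner):
        self._owner = owner

    def add(self, other, cls=None):
        return add(self._owner, other, cls=cls)


class A(Array):
    pass


class B(Array):
    pass


a, b = A([1]), B([2])
default_result = a.ufunc.add(b)            # type follows the instance
override_result = a.ufunc.add(b, cls=B)    # explicit cls wins
```

When `cls` is given, no type inspection happens at all, which matches the suggestion to skip any heuristics in that case.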
    Quansight
    @quansightai_twitter
    [Travis Oliphant] It’s true that you can’t just quote a principle, nor is it always obvious whether inheritance or composition is a good idea. But I wish I (or anyone else on the NumPy list at the time) had known about the Liskov substitution principle, because it would have definitely led to np.matrix not being a subclass of ndarray
    [Travis Oliphant] Because of the fundamental changes to the interface that were made (which were not merely additive).
    [Travis Oliphant] Adding units (reasonable use of subclass)
    [Travis Oliphant] Masked arrays are a gray area for me. They can reasonably be subclasses.
    Quansight
    @quansightai_twitter
    [Hameer Abbasi] If we define Liskov in the following way: for a deterministic operation on the parent class, the data produced may be of a child class, but the parent’s attributes and operations on that child should be identical to what they would be if the parent had been passed in, or should themselves satisfy Liskov.
    Quansight
    @quansightai_twitter

    [Hameer Abbasi] Then xnd.array is a valid subclass of xnd.xnd, but this doesn’t hold for MaskedArrays or matrix objects.

    astropy.units.Quantity is valid as long as units are consistent and no conversion is needed; otherwise it raises errors. If inconsistent units are supplied, it violates Liskov in this form.

    If we have a much more relaxed definition, such that the dtype and the shape have to be consistent (type in xnd-speak), and errors are allowed, then MaskedArray and unit become valid.
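The relaxed definition above can be turned into a rough check. This is a sketch under one reading of it: `same_type` is a hypothetical helper that compares only shape and dtype of the results, and integer indexing is an arbitrary choice of operation. Under this reading a masked array passes, while np.matrix still fails.

```python
import warnings

import numpy as np

warnings.simplefilter("ignore", PendingDeprecationWarning)


def same_type(op, parent, child):
    """Relaxed check: op must give the same shape and dtype for parent and child."""
    expected = np.asarray(op(parent))
    actual = np.asarray(op(child))
    return expected.shape == actual.shape and expected.dtype == actual.dtype


def first_row(a):
    return a[0]


data = np.array([[1.0, 2.0], [3.0, 4.0]])
masked = np.ma.masked_array(data, mask=[[False, True], [False, False]])
matrix = np.matrix(data)

masked_ok = same_type(first_row, data, masked)   # shape (2,) in both cases
matrix_ok = same_type(first_row, data, matrix)   # (2,) versus (1, 2)
print(masked_ok, matrix_ok)
```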

    Quansight
    @quansightai_twitter
    [XND]
    Event starting in 15 minutes:
    XND meeting
    May 10th, 2019 from 9:00 AM to 10:00 AM GMT-0400
    Quansight
    @quansightai_twitter
    [Travis Oliphant] Good illustration that Liskov alone does not resolve the question -- but it does provide useful scaffolding for the conversation that will help you make a decision you will be happy with later.
    Quansight
    @quansightai_twitter
    [XND] There is 1 event this week
    XND meeting
    May 17th, 2019 from 9:00 AM to 10:00 AM GMT-0400
    Quansight
    @quansightai_twitter
    [XND]
    Calendar event was cancelled
    XND meeting
    May 17th, 2019 from 9:00 AM to 10:00 AM GMT-0400
    [Hameer Abbasi] @stefan.krah Is it possible to keep the meeting open by any chance? I had some productive discussion at BIDS I’d like everyone to hear.