    Henry Schreiner
    @henryiii
    Ahh, the breakage happens with the final line, not the earlier part. Good, faith in my testing has been restored. I was sure I tested something like the hist[…] line. Okay, interesting; this code wasn’t supposed to have triggered here. Working on fixing now.
    Technically, this is still “unimplemented” behavior - see scikit-hep/boost-histogram#276 - but if it used to happen to work, it should continue to. And it was planned anyway. Sorry, thought you were using hist.view() /= 2.
    Henry Schreiner
    @henryiii
    Found the bug. 🤦
    Henry Schreiner
    @henryiii
    Static typing would have caught this. (Even just by forcing me to be aware of the possible arguments to __setitem__)
    alexander-held
    @alexander-held
    Thanks a lot Henry!
    Henry Schreiner
    @henryiii
    Boost-histogram 0.12 is out. It’s mostly a bugfix release, fixing several important bugs, with a few minor things. The current develop branch has PlottableProtocol, which needs a little more time before being released, but feel free to try it out if you like the bleeding edge!
    Hans Dembinski
    @HDembinski
    :-D
    Hans Dembinski
    @HDembinski
    Hi all, there is currently a push in scipy to include Boost as a dependency. This is initiated by the wish to replace current implementations in scipy.special and scipy.stats with those in Boost.Math. Someone else (not me!) then brought up that one could then also base histograms and the scipy.stats.binned_statistics on Boost.Histogram.
    I then told them about boost-histogram and the possible performance increases. Not anything I could work on anytime soon, but exciting to see interest in Scipy about this sort of thing.
    Someone also mentioned the problem with incremental filling, which is currently not efficiently possible with numpy.histogram, which boost-histogram solves.
    I was told that scipy.stats.binned_statistics is a pure python implementation currently, so basing it on Boost.Histogram would surely speed up that code.
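    To make "incremental filling" concrete, here is a minimal pure-Python sketch of a histogram that keeps its bin counts between calls, so data can arrive chunk by chunk. The RegularHistogram class and its method names are hypothetical and only illustrate the idea; this is not how boost-histogram or NumPy implement it.

```python
class RegularHistogram:
    """Toy 1D histogram with equal-width bins (illustrative only)."""

    def __init__(self, bins: int, start: float, stop: float) -> None:
        self.bins = bins
        self.start = start
        self.stop = stop
        self.counts = [0] * bins  # state persists between fill() calls

    def fill(self, values) -> None:
        """Add a chunk of values to the existing counts."""
        width = (self.stop - self.start) / self.bins
        for v in values:
            if self.start <= v < self.stop:
                # Guard against floating-point rounding at the upper edge.
                index = min(int((v - self.start) / width), self.bins - 1)
                self.counts[index] += 1


h = RegularHistogram(bins=4, start=0.0, stop=4.0)
for chunk in ([0.5, 1.5], [1.7, 3.2], [3.9, 4.5]):
    h.fill(chunk)  # no need to hold all the data in memory at once

print(h.counts)  # [1, 2, 0, 2]; 4.5 is out of range and ignored
```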
    Jim Pivarski
    @jpivarski
    Cool! I hope that happens! Might that mean that boost-histogram (the Python interface) would become part of SciPy? That would strongly serve to standardize histogramming in Python.
    Hans Dembinski
    @HDembinski
    It does not seem impossible
    Eduardo Rodrigues
    @eduardo-rodrigues
    Excellent news @HDembinski! Looking forward to further news on that front.
    Henry Schreiner
    @henryiii
    Sorry, currently, it is impossible. Boost.Histogram requires C++14, and SciPy ships manylinux1 wheels. You can’t build the C++14 code in Boost.Histogram with GCC 4.8.2. However, the PyPA plans to drop manylinux1 this summer, so at some point this year, in theory, SciPy/NumPy/etc. will all stop shipping manylinux1 wheels, and suddenly the minimum compiler anyone cares about will be much higher (8.3.1 for manylinux2010), making this possible for the first time!
    It’s going to be a big deal, because pip 9 can’t download manylinux2010 wheels, which is the main reason they still ship manylinux1 wheels today. And recently I found if you use Ubuntu 18.04, and use the python-3.8 package, you still get pip 9 even on Python 3.8!!! (which is totally unsupported). I hate distro packaging sometimes…
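    As a rough illustration of the pip constraint, the sketch below checks a pip version string against the release that first recognized manylinux2010 wheels. It assumes that release is pip 19.0; the helper name is hypothetical and is not part of pip.

```python
def pip_supports_manylinux2010(pip_version: str) -> bool:
    """Return True if this pip version can install manylinux2010 wheels.

    Assumption: support arrived in pip 19.0, so only the major version
    needs to be compared.
    """
    major = int(pip_version.split(".")[0])
    return major >= 19


for version in ("9.0.1", "18.1", "19.0", "20.0.2"):
    print(version, pip_supports_manylinux2010(version))
```

    An old pip that does not recognize the manylinux2010 tag skips those wheels and falls back to building from the source distribution.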
    Henry Schreiner
    @henryiii
    NumPy has already taken the first baby step by dropping the manylinux1 wheel for Python 3.9
    Hans Dembinski
    @HDembinski
    Good point, Henry, but I think that this is not something to happen on a short time-scale anyway
    They are currently debating whether they want to depend on Boost at all (although this seems to have some support...)
    Boost.Histogram was only brought up because Boost.Math is so intertwined with other Boost libs that it is not feasible to extract only Boost.Math, they then have to depend on all of Boost anyway.
    Interestingly, the point about C++14 may also come up when they start to use Boost.Math. The implementations in Boost.Math depend on varying versions of the C++ standard, because the maintainers leave it to the contributors which standard they prefer (which is not the best idea IMHO)
    Henry Schreiner
    @henryiii
    The PlottableHistogram Protocol is now released in boost-histogram 0.13.0, hist 2.1.0, uproot 4, and mplhep 0.2.16!
    Henry Schreiner
    @henryiii
    Good bye, Python 2.7 and 3.5! scikit-hep/boost-histogram#512 :tada:
    Matthew Feickert
    @matthewfeickert
    Nice! Congrats @henryiii and @HDembinski!
    Hans Dembinski
    @HDembinski
    @henryiii deserves all the credit :)
    Henry Schreiner
    @henryiii
    Boost-histogram 1.0 and Hist 2.2 have been released! :tada:
    Eduardo Rodrigues
    @eduardo-rodrigues
    BIG congrats on this great milestone :+1: !
    Hans Dembinski
    @HDembinski
    Also from me!
    Jan Pipek
    @janpipek
    Nice! Congrats! (and sorry that I am not able to follow everything and did not even find the time to make physt compatible with some of the Protocols you established yet)
    Henry Schreiner
    @henryiii
    In the very near future, I probably could lend a hand. :)
    alexander-held
    @alexander-held

    Congratulations on boost-histogram 1.0! I'm adopting the new API for subclassing, and saw in https://boost-histogram.readthedocs.io/en/latest/usage/subclassing.html that family=object() is recommended when only overriding Histogram. What is the difference between object() and object? While trying to understand this, I noticed that object is object is True and object() is object() (are those instances?) is False. Is the latter part an issue given the following?

    It just has to support is, and be the exact same object on all your subclasses.

    Henry Schreiner
    @henryiii
    object is a class; classes are singletons, so there’s just one. object() is an instance, and you can make as many as you want; each will live in a different place in memory (check with id()). Basically, family= can be anything that supports is, which is literally everything, with the exception of the module boost_histogram (as that’s already taken by boost-histogram). The module hist would be a bad choice too, as then your axes would come out randomly as Hist's or your own. The old way works fine: FAMILY = object() at the top of the file, then use family=FAMILY when you subclass. But for most users, a handy existing object is the module you are in, that is, “hist” or “boost_histogram”. It’s unique to you, and is descriptive. You can use family=None (or the object class, anything works); you just don’t want some other extension to also use the same one, because then boost-histogram won’t be able to distinguish between them when picking Axis, Storage, etc. If all you use is Histogram, though, then it really doesn’t matter.
    One use for object() is to make a truly unique object. For example, if I make NOTHING = object(), then use def f(x=NOTHING): if x is NOTHING, I can now always tell if someone passed a keyword argument in. They can’t make NOTHING themselves; they have to pull NOTHING out of my source and use it from there, so you can’t “remake” it accidentally.
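    The NOTHING pattern described above can be sketched as follows. The function name describe is hypothetical; the point is only that an identity check against a private object() distinguishes "no argument" from any value a caller could pass, including None.

```python
NOTHING = object()  # unique sentinel; nobody can recreate it by accident


def describe(x=NOTHING):
    """Report whether the caller actually passed an argument."""
    if x is NOTHING:
        return "no argument passed"
    return f"got {x!r}"


print(describe())        # no argument passed
print(describe(None))    # got None
print(describe(0))       # got 0

# A freshly made object() is never the same object as the sentinel.
print(object() is NOTHING)  # False
```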
    Henry Schreiner
    @henryiii

    The ideal way would have been the following:

    class Hist(bh.Histogram): ...
    class Regular(bh.axis.Regular, parent=Hist): ...

    The problem with this would have been that it is very hard to design without circular imports, as Histogram almost always has Axis usages in it. It can be done, but would have required changes to boost-histogram and user code, which also has to follow this strict regimen. Using a token is much simpler; it doesn’t require as much caution in user code (or boost-histogram).

    alexander-held
    @alexander-held
    Thanks! I was wondering whether somehow an instance of the class inheriting from Histogram would create a new object() and then not match the object() in the family definition, but from what I understand now this is not what happens - this object is created once when the class is defined, and any other class that may also inherit defined in my code with family=object() would pick up a different object and be unique too.
    Henry Schreiner
    @henryiii
    The idea of family=object() is only valid if you don’t have any custom Axes, as you can’t (without knowing that family is stored as ._family, anyway) access the family after you’ve made it inline there.
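    A toy model may help here. The sketch below does not use boost-histogram; TokenBase is a hypothetical stand-in that only records the family keyword on the class, mirroring the ._family attribute mentioned above. It shows that an inline object() is evaluated once, when the class statement runs, and that sharing a token between two classes needs a name to refer to it.

```python
class TokenBase:
    """Hypothetical stand-in that stores a family token on each subclass."""

    def __init_subclass__(cls, family, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._family = family  # stored once, at class-definition time


# Inline tokens: each class statement evaluates object() exactly once.
class HistA(TokenBase, family=object()):
    pass


class HistB(TokenBase, family=object()):
    pass


# Named token: the only convenient way to reuse it on a second class.
FAMILY = object()


class MyHist(TokenBase, family=FAMILY):
    pass


class MyAxis(TokenBase, family=FAMILY):
    pass


a1, a2 = HistA(), HistA()
print(a1._family is a2._family)          # True: instances share the class token
print(HistA._family is HistB._family)    # False: separate inline tokens
print(MyHist._family is MyAxis._family)  # True: same named token
```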
    alexander-held
    @alexander-held
    Thanks Henry!
    Henry Schreiner
    @henryiii

    If I added a default for family for Histogram, it would have been object(). I could special case None, that is, if family=None, it just makes an object() for you.

    I could also make that the default for Histogram, and only require family= on the other subclasses. But if you have an Axis or other subclass, you have to go back and add family= on the Histogram; that’s why I force it to always be dealt with on Histogram, it prepares you for also subclassing other components. I didn’t really think too much about only subclassing Histogram.

    By the way, can’t you do

    import cabinetry
    class Histogram(bh.Histogram, family=cabinetry): ...

    ? That would allow you to easily add subclasses for axes eventually if you needed to customize them later.
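    The two options in this message, special-casing family=None and using a module as the family, can be sketched together in plain Python. DefaultingBase is hypothetical and is not boost-histogram's implementation, and the stand-in module object below is created only for illustration.

```python
import types


class DefaultingBase:
    """Hypothetical base: family=None means 'make a private token for me'."""

    def __init_subclass__(cls, family=None, **kwargs):
        super().__init_subclass__(**kwargs)
        cls._family = object() if family is None else family


class OnlyHistogram(DefaultingBase):  # no family given: gets its own token
    pass


class AnotherHistogram(DefaultingBase, family=None):  # same effect, explicit
    pass


# Stand-in for "the module you are in"; a real package would pass itself.
mypackage = types.ModuleType("mypackage")


class Histogram(DefaultingBase, family=mypackage):
    pass


class Regular(DefaultingBase, family=mypackage):  # added later, same family
    pass


print(OnlyHistogram._family is AnotherHistogram._family)  # False
print(Histogram._family is Regular._family)               # True
```

    The module form leaves room to grow: a later axis subclass can join the same family without touching the Histogram definition.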

    alexander-held
    @alexander-held

    Yes, I could use that too. I was looking at object() following the documentation:

    If you only override Histogram, just use family=object().

    The additions in my histogram class are rather lightweight and I don't expect to go deeper and subclass axes. On the other hand I see no downside of family=cabinetry either.

    Henry Schreiner
    @henryiii
    I’ll at least update the docs a bit in the future; the None update should be simple too.
    Matthew Feickert
    @matthewfeickert

    @henryiii @jpivarski Can you tell me if this is a hist Issue or a uproot Issue or neither? https://gist.github.com/matthewfeickert/ab6ac8677aad2e04738111d0af3e0549

    (There's a Binder link in the Gist if you want to play with it in browser)

    Henry Schreiner
    @henryiii
    @matthewfeickert Shouldn't that be np.sqrt(hist.values())?
    Nicholas Smith
    @nsmith-
    I remember hearing bh has a sparse storage option, but I can't find it in the docs, is that something implemented in the python binding?
    Matthew Feickert
    @matthewfeickert

    @henryiii @jpivarski Another followup question on moving from root files to hist.Hist histograms via uproot: Is there any way to be able to use uproot's .to_hist() API to get a hist.Hist with storage=hist.storage.Weight()? Or at the moment should I just write a little converter like I did here?

    https://github.com/matthewfeickert/heputils/issues/24#issuecomment-800867686

    Jim Pivarski
    @jpivarski
    Currently, the storage depends on whether the ROOT histogram has a Sumw2 in it. If you're not getting weighted storage, then your histogram must not have one (barring bugs, of course).
    The Uproot interface is supposed to be minimal, just a bridge to get you into the boost-histogram or hist package. If you need a function that creates trivial weights or specified weights for a histogram with no weights, that sounds like something the histogramming libraries should cover.
    Henry Schreiner
    @henryiii
    boost-histogram 0.13.1, 1.0.1, and hist 2.2.1 released.
    Henry Schreiner
    @henryiii
    Boost.Histogram team is @HDembinski, congrats to him :)