    Angus Hollands
    @agoose77:matrix.org
    [m]
    Hmm - does anyone know of a long-term design doc for Hatch? I'm curious as to whether there's a plan to add a hatch add command (or something similar).
    Jonas Rübenach
    @jrueb
    I want to fill a histogram many times with the same observations; the only thing that differs is the weights (systematic uncertainties). Instead of simply filling the histogram repeatedly, I assume one could save computation time here because finding the bin indices needs to be done only once. Is this possible in boost_histogram?
    Angus Hollands
    @agoose77:matrix.org
    [m]
    Henry Schreiner
    @henryiii
    I think you are waiting for boostorg/histogram#211 ?
    Angus Hollands
    @agoose77:matrix.org
    [m]
    Oh yes, I think I'm over-simplifying the problem!
    Jonas Rübenach
    @jrueb

    I think you are waiting for boostorg/histogram#211 ?

    Thanks, that seems to be it

    Hans Dembinski
    @HDembinski
    As an intermediate solution, you can have several different weights in the same histogram by adding an integer or category axis which enumerates the different weight counters
    The idea behind boostorg/histogram#211 is just to make this a bit more efficient
    Henry Schreiner
    @henryiii
    Histogram talk in 30 mins: https://indico.cern.ch/event/1133099/
    Eduardo Rodrigues
    @eduardo-rodrigues
    :+1:
    Henry Schreiner
    @henryiii
    A few small fixes in Hist 2.6.1, just released. It no longer tries to produce a fancy repr if it can’t make a fast one, and just falls back on standard reprs. It also doesn’t import SciPy unless you need it, which is important for something else we are working on as well as for using Hist in WebAssembly - SciPy is huge and takes several seconds to load in a browser (more with a bad connection), though it does cache, like pip.
    Blaise Delaney😷
    @BlaiseDelaney_twitter

    Hello experts, I am trying to use the Mean storage using Hist in order to save the average of the weights in each bin. I currently am using the following Hist.

        # fill hist with all variables
        h = (
            Hist.new.Reg(
                30,
                np.min(predictions["TwoBody_PT"]),
                np.max(predictions["TwoBody_PT"]),
                name="bpt",
            )
            .Reg(
                30,
                np.min(predictions["TwoBody_FDCHI2_OWNPV"]),
                np.max(predictions["TwoBody_FDCHI2_OWNPV"]),
                name="fd",
            )
            .Double()  # would like to replace with .Mean()
            .fill(
                bpt=np.array(predictions.TwoBody_PT),
                fd=np.array(predictions.TwoBody_FDCHI2_OWNPV),
                weight=np.array(predictions.preds_per_cand),  # the weight is really the response of a NN
            )
        )

    However, when I try using .Mean(), I get the following error: ValueError: Sample array must be 1D. Could you help me understand what I am doing wrong? Many thanks in advance.

    Hans Dembinski
    @HDembinski
    Hi Blaise, sorry for the late reply, I assumed that Henry was going to answer.
    It is sad to see you and many others using Hist instead of boost-histogram, the underlying library. The hist syntax goes against common design principles, namely that one should not make half-constructed objects.
    Regarding the error, you need to check the shape of what you pass to weight=
    This should be a 1D array, not 2D
    Hans Dembinski
    @HDembinski
    The Mean storage supports weights, so I am not sure why it should work with Double but not with Mean
    Angus Hollands
    @agoose77:matrix.org
    [m]
    I'm not sure that that rule-of-thumb applies here: the Hist.new method returns a construction proxy, which doesn't have invalid (aka half-constructed) states (it's just the builder pattern). One might prefer to construct the axes and construct the hist with them, à la boost-histogram, but I think that's a matter of API preference.
    Henry Schreiner
    @henryiii
    Sorry, I had a response but it never got sent. I don’t check Gitter often enough these days. You have to provide a sample=. That’s what the mean is over. We could probably improve the error message.
    Henry Schreiner
    @henryiii
    Angus is right - the construction proxy is not a partially constructed histogram. There are no methods other than adding a new axis and finalizing the proxy with the selection of a storage type - that produces the histogram. It’s not “bad API”, it’s just an alternative choice. Hist is intended for users, and boost-histogram is intended to be built on and is not really intended to be used directly - you’ve said so yourself. Some users really enjoy the QuickConstruct system - it’s quite nice on the command line since you don’t need extra imports and can just type as you think. But Hist fully supports classic construction as well (and I tend toward that one if I’m not typing live).
    Angus Hollands
    @agoose77:matrix.org
    [m]
    @henryiii: I know this is fairly bad as bug reports go, but I'm wondering if I've overlooked something w.r.t. the density argument of mplhep.histplot? I expect it to normalise the area (np.sum(np.diff(e) * b)) but it seems to be doing something else.
    Angus Hollands
    @agoose77:matrix.org
    [m]

    (not urgent)

    Could I get a sanity check that I'm seeing a bug rather than misusing Hist?

    Henry Schreiner
    @henryiii:matrix.org
    [m]
    Second one will slice and sum on the slice (no underflow bin). First one will sum over the flow bins too, which includes everything you removed with the first cut
    0:len:sum will avoid flow bins
    Angus Hollands
    @agoose77:matrix.org
    [m]
    Ah ok, I did wonder about the flow.
    Thanks!
    Angus Hollands
    @agoose77:matrix.org
    [m]
    Try again - should project be ignoring my slice?
    I thought I'd messed that up, but I think the example is correct.
    Alexander Held
    @alexander-held

    How would I best go about scaling a histogram in a way that depends on a categorical axis?

    import hist
    
    h = hist.Hist.new.Reg(3, 0, 3).StrCat(["a", "b"], name="cat").Double()
    h.fill([1,1,2], cat="a")
    h.fill([0], cat="b")
    
    h[:, "a"] *= 2

    This is what I naively tried out, which is hitting a TypeError: Not supported yet via boost-histogram. My use case in practice here is that I gather the same kind of distributions for a lot of different processes, and want to scale the distributions in a process-dependent manner.

    Farouk Mokhtar
    @faroukmokhtar
    hi, i have a hist question... i have hist axis as follows hist2.axis.Regular(bins, range[0], range[1], name=var, label=var, overflow=False)
    it is producing a histogram with 0 counts... but when i change overflow=True the histogram gets filled
    [Two screenshots attached: Screen Shot 2022-04-21 at 10.47.14 AM.png, Screen Shot 2022-04-21 at 10.47.46 AM.png]
    these are the results
    Farouk Mokhtar
    @faroukmokhtar
    what i don't understand is the 0 counts histogram when overflow=False, because i can see that there are counts that are within the range so it should fill something
    Angus Hollands
    @agoose77:matrix.org
    [m]
    @alexander-held: you can directly modify the contents, e.g
    h[...] = h.values() * [[2, 1]]
    Alexander Held
    @alexander-held
    That works, thanks @agoose77:matrix.org! For the example above, that means
    h[:, "a"] = h[:, "a"].values() * 2
    Henry Schreiner
    @henryiii
    I’ll try to get to these by early next week, maybe sooner, am off for spring break. The values trick should work and making it nicer is in the plans.
    heatherrussell
    @heatherrussell
    how does one plot a 2d histogram with weighted data in boost_histogram? the example doesn't work in this case
    Henry Schreiner
    @henryiii
    With Hist you can call .plot(). What example? You likely need .values() instead of .view() or native conversion? That’s a guess.
    Yes, that’s it. Use .values(). Or use Hist (built in plotting) or use mplhep (native support for histograms)
    The view is a structured array
    Henry Schreiner
    @henryiii
    import matplotlib.pyplot as plt

    def plothist2d(h):
        # Plot the bin contents against the bin edges of both axes
        return plt.pcolormesh(*h.axes.edges.T, h.values().T)
    heatherrussell
    @heatherrussell
    perfect, thanks!
    L61
    @L61
    Is there a way to make Hist1/Hist2 behave like TEfficiency? Or alternatively, a way to edit variances? I've seen this mentioned in a couple of issues but I'm unsure of the best way to bypass the problem