hatch add command (or something similar).
weight argument cover your needs? https://boost-histogram.readthedocs.io/en/latest/usage/storage.html?highlight=weight#double
Hello experts, I am trying to use the Mean storage with Hist in order to save the average of the weights in each bin. I am currently using the following Hist:
# fill hist with all variables
h = (
    Hist.new.Reg(
        30,
        np.min(predictions["TwoBody_PT"]),
        np.max(predictions["TwoBody_PT"]),
        name="bpt",
    )
    .Reg(
        30,
        np.min(predictions["TwoBody_FDCHI2_OWNPV"]),
        np.max(predictions["TwoBody_FDCHI2_OWNPV"]),
        name="fd",
    )
    .Double()  # would like to replace with .Mean()
    .fill(
        bpt=np.array(predictions.TwoBody_PT),
        fd=np.array(predictions.TwoBody_FDCHI2_OWNPV),
        weight=np.array(predictions.preds_per_cand),  # the weight is really the response of a NN
    )
)
However, when I try using .Mean(), I get the following error: ValueError: Sample array must be 1D. Could you help me understand what I am doing wrong? Many thanks in advance.
weight=
Hist.new method returns a construction proxy, which doesn't have invalid (aka half-constructed) states (it's just the builder pattern). One might prefer to construct the axes and construct the hist with them, à la boost-histogram, but I think that's a matter of API preference.
density argument of mplhep.histplot? I expect it to normalise the area (np.sum(np.diff(e) * b)) but it seems to be doing something else.
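For reference, the area normalisation described here is what np.histogram does with density=True. The sketch below uses random toy data and arbitrary non-uniform edges, and says nothing about what mplhep.histplot does internally; it only gives a baseline to compare against.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=1000)

# Non-uniform bin edges, so that bin widths actually matter.
edges = np.array([-4.0, -1.0, 0.0, 0.5, 4.0])

# b holds densities (counts / total in-range counts / bin width), e the edges.
b, e = np.histogram(data, bins=edges, density=True)

# Area under the histogram: sum of width * height.
area = np.sum(np.diff(e) * b)
print(area)
```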
(not urgent)
Could I get a sanity check that I'm seeing a bug rather than misusing Hist?
0:len:sum will avoid flow bins
project be ignoring my slice?
How would I best go about scaling a histogram in a way that depends on a categorical axis?
import hist
h = hist.Hist.new.Reg(3, 0, 3).StrCat(["a", "b"], name="cat").Double()
h.fill([1,1,2], cat="a")
h.fill([0], cat="b")
h[:, "a"] *= 2
This is what I naively tried out, which is hitting a TypeError: Not supported yet via boost-histogram. My use case in practice here is that I gather the same kind of distributions for a lot of different processes, and want to scale the distributions in a process-dependent manner.
overflow=True the histogram gets filled
h[...] = h.values() * [[2, 1]]