Fixed in scikit-hep/uproot4#58; with a short list of exceptions (things like TDirectory, TTree, TBranch...), objects from a ROOT file are now detached from the original file. Thus, it wouldn't be possible to use these objects to read more data (which is why TDirectory, TTree, etc. are exceptions). But this means that the detached objects can be pickled and don't contain any transients, like locks or threads.
You can even save an object from a ROOT file into a pickle file and read it back in a new Python process, even though that object's class was derived from data in the ROOT file. (We pickle enough derived quantities from the TStreamerInfo to reconstitute the class object.)
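To illustrate why detaching matters for pickling, here is a minimal standard-library sketch. It is not Uproot's implementation: the `attached` object and the `detach` helper are hypothetical stand-ins, and the only point is that an object holding a lock cannot be pickled while a data-only copy can.

```python
import pickle
import threading
from types import SimpleNamespace

# Hypothetical stand-in for an object still attached to its file:
# it carries plain data plus a transient lock.
attached = SimpleNamespace(
    name="hist", values=[1.0, 2.0, 3.0], lock=threading.Lock()
)


def detach(obj):
    """Copy only the plain data, leaving the transient lock behind."""
    return SimpleNamespace(name=obj.name, values=list(obj.values))


try:
    pickle.dumps(attached)
    attached_pickles = True
except TypeError:
    # pickle refuses objects that contain a threading.Lock
    attached_pickles = False

# The detached copy survives a round trip through pickle.
restored = pickle.loads(pickle.dumps(detach(attached)))
```

The detached copy can no longer reach the file, which mirrors the trade-off described above.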
Does anyone have an answer to this: https://stackoverflow.com/questions/63813448/writing-boost-histograms-with-uproot
It would have to be Uproot3, since Uproot4 doesn't write anything yet.
I just looked into it and found a physt example, which handles variances:
It creates classes with the right names and the right fields, which Uproot 3 recognizes when assigning to a key of an output file (in __setitem__). The recognition happens in
Considering how complicated this looks, it's a toss-up whether it's valuable to do it now, so that Uproot 3 will recognize and write boost-histogram and hist objects, or if it would be better to wait a month or two for me to add the file-writing to Uproot 4. The new interface would be more formal than this.
Maybe more than two months—I've claimed file-writing in Uproot 4 as a milestone for December 1, though.
Hist 2.0.0 is out! This is the result of the work @LovelyBuggies and I have been doing for Google Summer of Code 2020. Changes since Beta 1:
fig dropped, new figures only created if needed.
new.Reg(...).Double(); not as magical but clearer types and usage.
pip install "hist[plot]" to request.
The following new features were added:
flow=False shortcut added.
See more details at https://github.com/scikit-hep/hist
@LovelyBuggies @henryiii I think I already asked Henry this (sorry), but can you remind me how one can take two hist objects with the same binning and add them? I see that in the example notebooks I have here https://github.com/matthewfeickert/heputils that just trying to fill a histogram
root_file = uproot.open("example.root")
mass_hists = [
    heputils.convert.uproot_to_hist(root_file[key]) for key in root_file.keys()
]
stack_hist = mass_hists[0].copy()
stack_hist.reset()
for hist in mass_hists:
    stack_hist.fill(hist)
stack_hist.plot1d()
mass_hists[0].plot1d()
is apparently not the right way to do things.
sum(mass_hists[1:], mass_hists[0]) would add them all.
mass_hists[0] + mass_hists[1] results in
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-7bab2e7a86b3> in <module>
12 # mass_hists[0]
13 # help(stack_hist)
---> 14 mass_hists[0] + mass_hists[1]
/srv/conda/envs/notebook/lib/python3.7/site-packages/boost_histogram/_internal/hist.py in __add__(self, other)
222 def __add__(self, other):
223 result = self.copy(deep=False)
--> 224 return result.__iadd__(other)
225
226 def __iadd__(self, other):
/srv/conda/envs/notebook/lib/python3.7/site-packages/boost_histogram/_internal/hist.py in __iadd__(self, other)
227 if isinstance(other, (int, float)) and other == 0:
228 return self
--> 229 self._compute_inplace_op("__iadd__", other)
230
231 # Addition may change the axes if they can grow
/srv/conda/envs/notebook/lib/python3.7/site-packages/boost_histogram/_internal/hist.py in _compute_inplace_op(self, name, other)
261 def _compute_inplace_op(self, name, other):
262 if isinstance(other, Histogram):
--> 263 getattr(self._hist, name)(other._hist)
264 elif isinstance(other, _histograms):
265 getattr(self._hist, name)(other)
ValueError: axes not mergable
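One reading of this error, consistent with the follow-up about metadata, is that two axes with identical binning can still be treated as different when their metadata differs. The toy classes below are hypothetical and are not boost-histogram's implementation. They only sketch that behaviour, and also show why the first histogram is passed to sum as the start value.

```python
class ToyAxis:
    """Hypothetical regular axis whose metadata takes part in equality."""

    def __init__(self, bins, start, stop, metadata=None):
        self.bins, self.start, self.stop = bins, start, stop
        self.metadata = metadata

    def __eq__(self, other):
        mine = (self.bins, self.start, self.stop, self.metadata)
        theirs = (other.bins, other.start, other.stop, other.metadata)
        return mine == theirs


class ToyHist:
    """Hypothetical 1D histogram that only adds when the axes compare equal."""

    def __init__(self, axis, counts):
        self.axis = axis
        self.counts = list(counts)

    def __add__(self, other):
        if self.axis != other.axis:
            raise ValueError("axes not mergable")
        summed = [a + b for a, b in zip(self.counts, other.counts)]
        return ToyHist(self.axis, summed)


same_a = ToyHist(ToyAxis(3, 0, 3), [1, 2, 3])
same_b = ToyHist(ToyAxis(3, 0, 3), [4, 5, 6])
named = ToyHist(ToyAxis(3, 0, 3, metadata="mass"), [7, 8, 9])

# Passing a histogram as the start value avoids adding to the integer 0.
total = sum([same_b], same_a)

try:
    same_a + named  # same binning, different metadata
    merged = True
except ValueError:
    merged = False
```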
This is all helpful and makes me appreciate the concept of metadata more. I guess the good news is that uproot will handle things more intelligently than I have been, once some issues have been resolved. What I have been (hackily) doing is the following: https://github.com/matthewfeickert/heputils/blob/3dd0e858f002041a7ae15fc310d92ddb0ea4fe26/src/heputils/convert.py#L7-L32
If I just don't even attempt to handle the name then things work with the following:
import numpy as np
import uproot4 as uproot
import mplhep
import hist
from hist import Hist
import functools
import operator

mplhep.set_style("ATLAS")


def uproot_to_hist(uproot_hist):
    # Assumes uniform binning: only the first and last edges are used.
    values, edges = uproot_hist.to_numpy()
    _hist = hist.Hist(
        hist.axis.Regular(len(edges) - 1, edges[0], edges[-1]),
        storage=hist.storage.Double(),
    )
    _hist[:] = values
    return _hist


root_file = uproot.open("example.root")
mass_hists = [uproot_to_hist(root_file[key]) for key in root_file.keys()]

# Add all histograms bin by bin.
stack_hist = functools.reduce(operator.add, mass_hists)
stack_hist.plot1d()
# Loop variable renamed so it does not shadow the hist module.
for mass_hist in mass_hists:
    mass_hist.plot1d()
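The conversion above builds a Regular axis from only the first and last edge, so it silently assumes uniform bin widths. A minimal plain-Python check for that assumption could look like this; the helper name has_uniform_bins is hypothetical and not part of uproot or hist.

```python
import math


def has_uniform_bins(edges, rel_tol=1e-9):
    """Return True if consecutive bin edges are equally spaced."""
    if len(edges) < 2:
        raise ValueError("need at least two edges")
    # Width of each bin, from neighbouring pairs of edges.
    widths = [hi - lo for lo, hi in zip(edges[:-1], edges[1:])]
    return all(math.isclose(w, widths[0], rel_tol=rel_tol) for w in widths)


uniform = has_uniform_bins([0.0, 0.5, 1.0, 1.5])
variable = has_uniform_bins([0.0, 1.0, 3.0])
```

If the check fails, hist.axis.Variable(edges) would be the matching axis type instead of Regular.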
ax._ax.metadata!
__dict__ is not used.
It is explained at the end of the overview section in the Boost Histogram docs:
https://www.boost.org/doc/libs/develop/libs/histogram/doc/html/histogram/guide.html#histogram.guide.axis_guide.overview
It is something to add to the Rationale, too, I think.
https://www.boost.org/doc/libs/develop/libs/histogram/doc/html/histogram/rationale.html
Dear Hist/boost-histogram developers,
thank you for this great histogramming library, it is a pleasure to work with it :)
I have a question for you regarding fancy indexing on a StrCategory axis.
My (Hist) histogram looks as follows:
In [38]: h
Out[38]:
Hist(
  StrCategory(['GluGluToHHTo2B2VTo2L2Nu_node_cHHH2p45'], growth=True),
  StrCategory(['ee', 'mumu', 'emu'], growth=True),
  StrCategory(['nominal'], growth=True),
  Regular(40, 0, 200, name='MET', label='$p_{T}^{miss}$'),
  storage=Weight()) # Sum: WeightedSum(value=2608.44, variance=47.3505) (WeightedSum(value=2775.4, variance=50.5706) with flow)
Now I would like to "group" e.g. the "ee" and "emu" category together, which means that I'd like to do something like:
h[{"dataset": "GluGluToHHTo2B2VTo2L2Nu_node_cHHH2p45", "category": ["ee", "emu"], "systematic": "nominal"}]
Unfortunately this does not work, as ["ee", "emu"] is not a valid indexing operation.
Is there a way to do such fancy indexing on StrCategory axes? In case this is not supported, is there a nice workaround?
(I am basically looking for something that works similarly to the coffea.Hist.group method: https://github.com/CoffeaTeam/coffea/blob/master/coffea/hist/hist_tools.py#L1115)
Thank you already a lot in advance!
Best, Peter
Hist, if this can be handled a bit more conveniently. What do you think?
np.sum(h[{"dataset": "GluGluToHHTo2B2VTo2L2Nu_node_cHHH2p45", "systematic": "nominal"}].view()[h.axes["category"].index(["ee", "emu"]), ...], axis=0)
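The idea behind the np.sum workaround, stripped of any Hist API, is to look up the selected categories and add their bin contents, keeping values and variances separate since both are plain sums for weighted storage. The sketch below uses a hypothetical group_categories helper and made-up numbers, purely to show the bookkeeping.

```python
def group_categories(per_category, names):
    """Sum bin values and variances over the selected categories."""
    selected = [per_category[name] for name in names]
    # zip(*...) walks the selected categories bin by bin.
    values = [sum(col) for col in zip(*(vals for vals, _ in selected))]
    variances = [sum(col) for col in zip(*(var for _, var in selected))]
    return values, variances


# Made-up contents: category -> (bin values, bin variances)
per_category = {
    "ee": ([1.0, 2.0], [0.5, 0.25]),
    "mumu": ([3.0, 4.0], [1.0, 1.0]),
    "emu": ([5.0, 6.0], [1.5, 0.75]),
}

values, variances = group_categories(per_category, ["ee", "emu"])
```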