All right, next problem (sorry if I'm overly bugging you all): histogram fitting. I can do this fairly easily in ROOT, but I'm having a lot of trouble in Python. For whatever reason, I can't seem to find a tutorial that covers it, and the fitting tutorials I have found give errors when I try to fit a 2D histogram. I'm making the histogram as I did in the above post:
```python
plt.hist2d(mipVref, mipVref, bins=[150, 50], cmap=plt.cm.get_cmap("afmhot"))
```
I've tried `curve_fit` and `Model`, but to no avail. Any pointers to a specific method for fitting 2D histograms? Thanks!
I checked out a few things, and am hitting a snag when it comes to getting the function portion down. Here's my relevant code (much of which I adapted from others on Stack Exchange):
```python
H, xedges, yedges = np.histogram2d(mipVref, mipVref, bins=[100, 100])

def centers(edges):
    return edges[:-1] + np.diff(edges[:2])/2

xcenters = centers(xedges)
ycenters = centers(yedges)
pdf = interp2d(xcenters, ycenters, H)
plt.pcolor(xedges, yedges, pdf(xedges, yedges), cmap=plt.cm.get_cmap("hot"))
```
Sorry for the long post!
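For reference, one common approach to fitting a 2D histogram without ROOT is to bin with `np.histogram2d` and then hand the flattened bin centers and counts to `scipy.optimize.curve_fit`. This is only a sketch with synthetic data and an assumed 2D Gaussian shape, not your `mipVref` values:

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(42)
x = rng.normal(0.0, 1.0, 10000)
y = rng.normal(0.0, 2.0, 10000)

# Bin the data; H[i, j] counts x in bin i and y in bin j
H, xedges, yedges = np.histogram2d(x, y, bins=[50, 50])

xcenters = (xedges[:-1] + xedges[1:]) / 2
ycenters = (yedges[:-1] + yedges[1:]) / 2
X, Y = np.meshgrid(xcenters, ycenters, indexing="ij")

def gauss2d(xy, amplitude, x0, y0, sx, sy):
    # Uncorrelated 2D Gaussian; xy is a (xgrid, ygrid) pair
    xv, yv = xy
    return amplitude * np.exp(-((xv - x0)**2 / (2*sx**2) + (yv - y0)**2 / (2*sy**2)))

# curve_fit wants 1D arrays, so ravel the grids and the counts
popt, pcov = curve_fit(
    gauss2d,
    (X.ravel(), Y.ravel()),
    H.ravel(),
    p0=[H.max(), 0.0, 0.0, 1.0, 1.0],
)
print(popt)  # amplitude, x0, y0, sigma_x, sigma_y
```

The fitted sigmas should come out near the generated 1.0 and 2.0. For proper uncertainties you'd weight the fit by the Poisson errors on the bin counts (`sigma=np.sqrt(H).ravel()` on the non-empty bins), which this sketch skips.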
New question: I have a working model using Keras with Tensorflow background. My goal is to get the final function coded into ROOT as that's what we would need for the metric we're making. So, how do I make sense of the weights? Here's the relevant code:
```python
model = Sequential()
model.add(Dense(16, input_dim=16, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(data[:, 1:], data[:, 0], epochs=30, batch_size=50)
I'm trying it with only one relu layer with 2 neurons in order to get a feel for how it all works, and it returns the following array sizes:
(16, 16), (1, 16), (16, 1), (1, 1)
Just from dimensional analysis (I'm using 16 inputs to get 1 output), I would think that the 16 inputs combine with the (16, 16) array to give a (1, 16) array, to which the (1, 16) bias elements are added; then the (16, 1) array reduces that to a single element to use in the sigmoid (with the (1, 1) value being the additional bias term). Taking weights times inputs as w.x, plus their offsets, this would be:
w1.x+b1 = x1, for all b1 indexed
w1 -> (16, 16), b1 -> (1, 16)
w2 -> (16, 1), b2 -> (1, 1)
Do I understand this correctly? I'm trying to hand-reproduce the final predictions from the model so I can run them in ROOT.
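If that reading of the shapes is right, the prediction is just `sigmoid(relu(x @ W1 + b1) @ W2 + b2)`. Here's a hedged NumPy sketch with random stand-in weights; in practice they'd come from `model.get_weights()` (Keras reports the biases as shapes (16,) and (1,), which broadcast the same way as the row vectors used here):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in weights with the shapes listed above; replace with model.get_weights()
W1 = rng.normal(size=(16, 16))
b1 = rng.normal(size=(1, 16))
W2 = rng.normal(size=(16, 1))
b2 = rng.normal(size=(1, 1))

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x):
    # x has shape (n_samples, 16); each Dense layer computes x @ W + b
    h = relu(x @ W1 + b1)        # hidden layer, shape (n_samples, 16)
    return sigmoid(h @ W2 + b2)  # output probabilities, shape (n_samples, 1)

x = rng.normal(size=(5, 16))
print(predict(x))
```

Translating `predict` to ROOT/C++ is then just two matrix multiplications and the two activation functions.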
@henryiii I think that's what I'm looking for; I was trying to reconstruct the mathematics of the handoffs, but I think I was missing a layer or two in the exchange. It seems it's not quite as simple as I had it (I thought ReLU would be like a delta-plus and sigmoid like a Boltzmann). I'm reading through that repo now. Thanks!
@tunnell It's not nearly as ambitious as running TensorFlow on ROOT; I'm just looking for a way to execute the prediction model that was generated. I thought it could be exported in a simple, mathematical formula (as all it's doing, in the end, is putting weights on inputs and using those for an output). I have the shape and weights printed out; basically, I'm looking to reconstruct the prediction algorithm (not further refine it or anything like that; I'm considering all training done once I get out of Python) for use in ROOT. If it's just a mathematical formula, which I should think it would have to be, it ought to be readily programmable to any language without much fuss, no?
`datetime` first? It may be an issue with your install. You could try uninstalling and reinstalling NumPy.
(h1 - h2)/(h1 + h2). Boost-histogram and Hist have basic arithmetic for histograms with the same axes: I know they have addition and subtraction, and they probably have division as well. The histograms are assumed to have Poisson statistics, and though ROOT's documentation doesn't say it, I would assume that `h1` and `h2` are taken to be independent.
That is indeed true - but a look at the source code reveals that in reality there is also weighting being applied to the result. I was wondering if it had already been implemented somewhere before I "reinvent the wheel", so to speak!
I think it would hopefully see some use, @henryiii. I work in hadron structure, and a lot of observables boil down to some sort of beam-spin asymmetry!
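A minimal NumPy sketch of the asymmetry and its Poisson error, assuming the two counts are independent; with boost-histogram or Hist, `n1` and `n2` would come from `h1.view()` and `h2.view()` of two histograms with identical axes:

```python
import numpy as np

rng = np.random.default_rng(1)
edges = np.linspace(-3, 3, 21)

# Bin-by-bin counts; with boost-histogram/Hist these would be h1.view(), h2.view()
n1, _ = np.histogram(rng.normal(+0.5, 1.0, 5000), bins=edges)
n2, _ = np.histogram(rng.normal(-0.5, 1.0, 5000), bins=edges)

with np.errstate(invalid="ignore", divide="ignore"):
    asym = (n1 - n2) / (n1 + n2)  # empty bins give nan
    # Error propagation for A = (n1 - n2)/(n1 + n2), Poisson and independent:
    err = 2 * np.sqrt(n1 * n2 * (n1 + n2)) / (n1 + n2) ** 2

print(asym)
print(err)
```

The error line follows from standard propagation: sigma_A^2 = (dA/dn1)^2 n1 + (dA/dn2)^2 n2 with dA/dn1 = 2 n2/(n1+n2)^2 and dA/dn2 = -2 n1/(n1+n2)^2.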
(Sorry that I'm reposting this everywhere; I want everyone to be warned.)
The Awkward/Uproot name transition is done, at least at the level of release candidates. If you do
pip install "awkward>=1.0.0rc1" "uproot>=4.0.0rc1"
you'll get Awkward 1.x and Uproot 4.x. (They don't strictly depend on each other, so you could do one, the other, or both.)
If you do
pip install "awkward1>=1.0.0rc1" "uproot4>=4.0.0rc1"
you'll get thin `awkward1` and `uproot4` packages that just bring in the appropriate `awkward` and `uproot` and pass the names through. This is so that `uproot4.whatever` still works.
If you do
pip install awkward0 uproot3 # or just uproot3
you'll get the old Awkward 0.x and Uproot 3.x, which you can `import ... as ...`. This also brings in `uproot3-methods`, a new name just to avoid the compatibility issues with old packages that we saw last week.
All of the above are permanent; they will continue to work after Awkward 1.x and Uproot 4.x are full releases (not release candidates). However, the following will bring in old packages before the full release and new packages after the full release.
pip install awkward uproot
So it is only the full release that will break scripts, and only when users `pip install --upgrade`. I plan to take that step this weekend, when there might be fewer people actively working. It also gives everyone a chance to provide feedback or to take action with `import ... as ...`.
(Sorry for the reposting, if you saw this message elsewhere.)
Probably the last message about the Awkward Array/Uproot name transition: it's done. The new versions have moved from release candidates to full releases. Now when you
pip install awkward uproot
without qualification, you get the new ones. I think I've "dotted all the 'i's of packaging" to get the right dependencies and tested all the cases I could think of on a blank AWS instance.
`pip install awkward0 uproot3` returns the old versions (Awkward 0.x and Uproot 3.x). The prescription for anyone who needs the old packages is `import awkward0 as awkward` and `import uproot3 as uproot`.
`pip install awkward1 uproot4` returns thin wrappers of the new ones, which point to whatever the latest `awkward` and `uproot` are. They pass through to the new libraries, so scripts written with `import awkward1, uproot4` don't need to be changed (though you'll probably want to, for simplicity).
`uproot-methods` no longer causes trouble because there's an `uproot3-methods` in the dependency chain of `uproot3`. The latest `uproot-methods` (no qualification) now excludes Awkward 1.x so that they can't be used together by mistake.
@dano0014 What happens when you try

```python
puls.array()               # get it as an Awkward Array (fast, if possible)
puls.array(library="np")   # get it as a NumPy array of Python objects
```
That is, try to use the default interpretation (which can be seen without the table limitations using
Oh, I see: the `Pulse` class failed to get an interpretation, but instead of raising an error (as it should), it returned `None`. That would be a bug. I can investigate it if you post it as a GitHub Issue with the original file.
Hi there, I have a question about cuts on trees in Uproot. If I have a tree with a number of branches, some cuts work and some do not: it looks as if a cut only works on branches of the same physics object...
```python
tree = uproot.open("<filename>:<treename>")
df1 = tree.arrays("el_pt", "(el_pt>100000)")
df2 = tree.arrays("mu_pt", "(el_pt>100000)")
```
`df1` is produced fine; `df2` produces the following error:
```
File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/uproot/behaviors/TBranch.py", line 1181, in arrays
    self.object_path,
File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/uproot/language/python.py", line 480, in compute_expressions
    output[name] = output[name][cut]
File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/awkward/highlevel.py", line 993, in __getitem__
    tmp = ak._util.wrap(self.layout[where], self._behavior)
ValueError: in ListArray64 attempting to get 0, index out of range
(https://github.com/scikit-hep/awkward-1.0/blob/1.4.0/src/cpu-kernels/awkward_ListArray_getitem_jagged_apply.cpp#L46)
```
Is there an alternative way to apply such a cut without loading all branches of the TTree into a Pandas DataFrame, which I want to avoid for memory reasons?
If `el_pt` and `mu_pt` are jagged arrays (a different number of pT values in each event) and they differ from each other (event i can have a different number of electrons than muons), then this cut is not valid for the reason shown: `el_pt > 100000` is a cut on electrons, "select electrons for which the electron pT is greater than 100000," which can't be applied to muons because there isn't necessarily a muon for each electron.
If you intended to cut events, perhaps "select events for which some electron pT is greater than 100000," then that would be
```python
ak.any(el_pt > 100000, axis=1)
```
There are other choices, like "select events in which all electron pTs are greater than 100000," "select events in which the average electron pT is greater than 100000," etc. Those other possibilities would be constructed with other Awkward Array functions. (In case you haven't found the Awkward Array documentation, here's the page on
I'd encourage you to do this interactively, either in a terminal or in a Jupyter notebook, and not to use the cut-string option in Uproot. That would give you more insight into what's going on, because you'd be able to look at the `el_pt` array before and after applying `> 100000` to it, and also look at `mu_pt`. Noticing a difference in the length of the first event might have been a clue as to what was going on: this shouldn't be a black box. (I've been having a lot of misgivings about the `cut` option in `TTree.arrays` because it hides information from people. It was supposed to be a shortcut.) To do it without the `cut` option, start with
```python
print(tree.keys(filter_name=["el_pt", "mu_pt"]))      # tests a filter
events = tree.arrays(filter_name=["el_pt", "mu_pt"])  # reads everything that passes the filter
el_pt = events["el_pt"]
mu_pt = events["mu_pt"]
```
(See TTree.arrays for more on the filter arguments.)
The cut can be constructed as
```python
good_events = ak.any(el_pt > 100000, axis=1)
el_pt[good_events]
mu_pt[good_events]
events[good_events]  # can also apply it to the whole package in one step
```
Interactively, you should see a distinction between `el_pt > 100000` (a jagged array of booleans, one per electron) and `ak.any(el_pt > 100000, axis=1)` (a flat array of booleans, one per event).