Skagamaster
@Skagamaster
Thank you!
Skagamaster
@Skagamaster

All right, next problem (sorry if I'm overly bugging you all): histogram fitting. I can do this fairly easily on ROOT, but I'm having a lot of trouble on Python. For whatever reason, I can't seem to find a tutorial that includes this. All the fitting tutorials give errors when I try to fit a 2D histogram. I'm making the histogram as I did in the above post:

plt.hist2d(mipVref[0], mipVref[1], bins=[150, 50],
           cmap=plt.cm.get_cmap("afmhot"))

I've tried curve_fit and Model, but to no avail. Any pointers to a specific method to fit 2D histos? Thanks!

Doug Davis
@douglasdavis
Check out the returns of hist2d (here); you get (h, xedges, yedges, image); you can define your function f(x, y) and make the points h the dependent variable in your fit using curve_fit (don't forget to convert xedges and yedges to the center of the bins); here's an example of a 2D gaussian fit.
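A minimal sketch of the approach described above, assuming an uncorrelated 2D Gaussian as the model. The function name gauss2d, the toy data, and the starting guesses are illustrative assumptions, not taken from the linked example. It uses np.histogram2d, which returns the same counts and edges as plt.hist2d without the image.

```python
import numpy as np
from scipy.optimize import curve_fit

def gauss2d(xy, amp, x0, y0, sx, sy):
    """Uncorrelated 2D Gaussian; xy is a (2, N) array of bin centers."""
    x, y = xy
    return amp * np.exp(-0.5 * (((x - x0) / sx) ** 2 + ((y - y0) / sy) ** 2))

# Toy data standing in for mipVref[0] and mipVref[1] (hypothetical values).
rng = np.random.default_rng(42)
x = rng.normal(1.0, 0.5, 50_000)
y = rng.normal(-2.0, 1.5, 50_000)

h, xedges, yedges = np.histogram2d(x, y, bins=[60, 40])

# Convert bin edges to bin centers.
xc = 0.5 * (xedges[:-1] + xedges[1:])
yc = 0.5 * (yedges[:-1] + yedges[1:])

# h has shape (nx, ny), so indexing="ij" keeps the grids aligned with h.
X, Y = np.meshgrid(xc, yc, indexing="ij")
xy = np.vstack([X.ravel(), Y.ravel()])

# Starting guesses taken from the sample itself.
p0 = [h.max(), x.mean(), y.mean(), x.std(), y.std()]
popt, pcov = curve_fit(gauss2d, xy, h.ravel(), p0=p0)
amp, x0, y0, sx, sy = popt
print(f"x0={x0:.3f}, y0={y0:.3f}, sx={abs(sx):.3f}, sy={abs(sy):.3f}")
```

This is an unweighted least-squares fit to the bin counts; for histograms with many low-count bins, a weighted or likelihood-based fit may be more appropriate.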
Skagamaster
@Skagamaster

I checked out a few things, and am hitting a snag when it comes to getting the function portion down. Here's my relevant code (much of which I adapted from others on Stack Exchange):

H, xedges, yedges = np.histogram2d(mipVref[0], mipVref[1], bins=[100, 100])

def centers(edges):
    # Assumes uniform bin widths: shift each left edge by half the first bin width.
    return edges[:-1] + np.diff(edges[:2])/2

xcenters = centers(xedges)
ycenters = centers(yedges)

pdf = interp2d(xcenters, ycenters, H)

plt.pcolor(xedges, yedges, pdf(xedges, yedges), cmap=plt.cm.get_cmap("hot"))

The issue is the plot is almost an inverse relationship to the actual plot. It should look like this:
image1
But it looks like this:
image2
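One possible cause, offered as an assumption since the images are not reproduced here, is an axis-order mismatch. np.histogram2d returns counts indexed as H[x_bin, y_bin], while image-style functions such as pcolor and pcolormesh expect rows to run along y and columns along x. The toy data below is hypothetical and chosen so the peak position makes the ordering visible.

```python
import numpy as np

# Toy sample: x clustered near 0.225, y clustered near 0.85 (hypothetical data).
rng = np.random.default_rng(0)
x = rng.normal(0.225, 0.02, 10_000)
y = rng.normal(0.85, 0.02, 10_000)

H, xedges, yedges = np.histogram2d(x, y, bins=[20, 10], range=[[0, 1], [0, 1]])
print(H.shape)  # first axis follows x (20 bins), second follows y (10 bins)

# Locate the fullest bin: the first index is the x bin, the second the y bin.
ix, iy = np.unravel_index(np.argmax(H), H.shape)
print(ix, iy)

# Image-style plotting wants rows = y and columns = x, so pass the transpose,
# e.g. plt.pcolormesh(xedges, yedges, H_for_plot).
H_for_plot = H.T
print(H_for_plot.shape)
```

If the plot looks mirrored about the diagonal compared with plt.hist2d, transposing H before plotting or interpolating is the first thing to try.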

Skagamaster
@Skagamaster
That worked! OK, so that's the "incorrect" image. Here's the "correct" one:
Skagamaster
@Skagamaster

Sorry for the long post!
New question: I have a working model using Keras with a TensorFlow backend. My goal is to get the final function coded into ROOT, as that's what we would need for the metric we're making. So, how do I make sense of the weights? Here's the relevant code:

model = Sequential()
model.fit(data[:, 1:], data[:, 0], epochs=30, batch_size=50)

I'm trying it with only one relu layer with 2 neurons in order to get a feel for how it all works, and it returns the following array sizes:
(16, 16), (1, 16), (16, 1), (1, 1)

Just from dimensional analysis (I'm using 16 inputs to get 1 output), I would think that the 16 inputs will combine with the (16, 16) array to give a (1, 16) array which adds the (16, 1) array elements and then the last (16, 1) gives a single element to use in the sigmoid (with the (1, 1) value being the additional term). Taking weights times inputs as w.x, and added to their modifiers. So this would be:
w1.x+b1 = x1, for all b1 indexed
w1 -> (16, 16), b1 -> (1, 16)
1/(1+exp(-w2.x1+b2))
w2 -> (16, 1), b2 -> (1, 1)

Do I understand this correctly? I'm trying to hand-reproduce the final predictions from the model so I can run them in ROOT.

To clarify: the array sizes returned there are for the code I shared. Please ignore the "I'm trying it with only one relu layer with 2 neurons in order to get a feel for how it all works." I'll be working that to reproduce initially from a smaller data set to get a feel for the code; I should have omitted that. Sorry!
Skagamaster
@Skagamaster
The above doesn't recover what the model predicts for smaller sets.
Chris Tunnell
@tunnell
As the question is more about how to get things into ROOT, you might have more luck on a ROOT channel. There are ways to export your models (https://machinelearningmastery.com/save-load-keras-deep-learning-models/) but I don't remember ROOT's neural network codes being anywhere nearly as advanced as TensorFlow so you might hit issues.
It lets you print out things about the shapes too.
The math is sort of hard to parse in the chat, but when you just print the model description, it should tell you about the shapes at each step
I'm sort of lost on the question though
@Skagamaster
Henry Schreiner
@henryiii
@Skagamaster that general idea should work. this is what is done here, if you would like an example: https://github.com/lwtnn/lwtnn
There they use Eigen to do the matrix calcs to avoid ROOT dependencies. Model format is JSON.
Skagamaster
@Skagamaster

@henryiii I think that's what I'm looking for; I was trying to reconstruct the mathematics of the handoffs, but I think I was missing a layer or two in the exchange. It seems it's not quite as simple as I had it (I thought ReLU would be like a delta plus and sigmoid like a Boltzmann). I'm reading through that git now. Thanks!

@tunnell It's not nearly as ambitious as running TensorFlow on ROOT; I'm just looking for a way to execute the prediction model that was generated. I thought it could be exported in a simple, mathematical formula (as all it's doing, in the end, is putting weights on inputs and using those for an output). I have the shape and weights printed out; basically, I'm looking to reconstruct the prediction algorithm (not further refine it or anything like that; I'm considering all training done once I get out of Python) for use in ROOT. If it's just a mathematical formula, which I should think it would have to be, it ought to be readily programmable to any language without much fuss, no?
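A minimal sketch of such a formula in NumPy, assuming the model is a Dense layer of 16 ReLU units followed by a Dense layer with a single sigmoid output. That architecture is an assumption inferred from the array sizes quoted earlier; the layer definitions are not shown in the chat. Keras Dense layers compute activation(x @ W + b), so two details differ from the hand-written version above: ReLU is applied to the hidden layer, and the bias sits inside the negated exponent of the sigmoid, 1/(1+exp(-(w2.x1+b2))). The weights here are random stand-ins for what model.get_weights() would return.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict(x, W1, b1, W2, b2):
    """Forward pass: x has shape (n_samples, 16), result has shape (n_samples, 1)."""
    hidden = relu(x @ W1 + b1)        # hidden layer: weights (16, 16), bias (16,)
    return sigmoid(hidden @ W2 + b2)  # output layer: weights (16, 1), bias (1,)

# Random stand-ins for: W1, b1, W2, b2 = model.get_weights()
rng = np.random.default_rng(1)
W1 = rng.normal(size=(16, 16))
b1 = rng.normal(size=16)
W2 = rng.normal(size=(16, 1))
b2 = rng.normal(size=1)

x = rng.normal(size=(5, 16))
y = predict(x, W1, b1, W2, b2)
print(y.shape)
```

Since this is only matrix products, a maximum, and an exponential, the same few lines translate directly to C++ for use in ROOT; comparing against model.predict on a handful of inputs is a quick way to check that the assumed architecture is right.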

Raahul Singh
@Raahul-Singh
Hi everyone!
I'm a sophomore computer science and engineering undergrad from the Indian Institute of Information Technology Sri City, India.
I'm new to Open Source Contributions, though I really wish to do my part and contribute in any way to CERN.
Could anyone please guide me here?
Chris Tunnell
@tunnell
There's more to HEP than CERN :) and more to particle physics than HEP :) I'd look at Google Summer of Code. Or look at HEP repositories on GitHub to see if any issues are marked "good for newcomers". Typically, doing something small makes people more willing to engage.
Here is a partial list of projects: https://github.com/hsf-training/PyHEP-resources
Eduardo Rodrigues
@eduardo-rodrigues
All good points. For Google Summer of Code hints, see info from the last edition at https://hepsoftwarefoundation.org/activities/gsoc.html ...
V Abhijith Rao
@VANRao-Stack
Hello! My name is V Abhijith Rao and I am currently working on the GSoC project on hist plotting. In case anybody isn't aware, hist is a histogramming library similar to Physt, but faster, as it's based on Boost Histogram. I was wondering whether anyone had any suggestions for the project, such as tools they would like to see integrated into the hist class or shortcuts they would like to have. If so, please contact me and let me know. Thanks in advance.
Henry Schreiner
@henryiii
@HDembinski might be able to help you before I can (moving to a house today), but I'll try to check in the near future. Can you try manually importing NumPy before importing iminuit?
rajeevneutrino
@rajeevneutrino
Thank you @henryiii. Before iminuit, I did 'import numpy', but the same error occurs (when I do it in the terminal). On the other hand, in IDLE, 'import numpy' works but iminuit does not. By the way, I am not sure whether I can ask such questions here. Please let me know.
Jim Pivarski
@jpivarski
I would recommend asking this question on iminuit's GitHub Issues page. Then it would get its own thread (and not get lost) and be linked to pull requests if this is a bug that needs to be fixed. (I just wrote a big post on #HSF/PyHEP about whether we should be redirecting questions like this right away.)
Henry Schreiner
@henryiii
A little investigation before opening an issue doesn't hurt, especially for newcomers. But I was about to suggest it's now worth an issue. :)
I see similar errors (in fact, this looks very familiar) : numpy/numpy#14377 & numpy/numpy#14474 & NVIDIA/DIGITS#299
Can you try importing datetime first? It may be an issue with your install. You could try uninstalling and reinstalling NumPy.
naidoo88
@naidoo88
Hi all, I was wondering if someone could advise me on whether something along the lines of the TH1F::GetAsymmetry method in the ROOT analysis framework exists in any of the Scikit-HEP packages?
Jim Pivarski
@jpivarski
According to ROOT's documentation, the "asymmetry" between h1 and h2 is (h1 - h2)/(h1 + h2). Boost-histogram and Hist have basic math for histograms with the same axis: I know they have addition and subtraction, and they probably have division as well. The histograms are assumed to have Poisson statistics, and though ROOT's documentation doesn't say it, I would assume that h1 and h2 are assumed to be independent.
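A minimal NumPy sketch of that definition, assuming unweighted, independent Poisson counts in each bin. The function name asymmetry and the toy counts are illustrative. It does not reproduce ROOT's handling of weighted histograms. With those assumptions, standard error propagation gives an uncertainty of 2*sqrt(h1*h2/(h1+h2)^3) per bin.

```python
import numpy as np

def asymmetry(h1, h2):
    """Bin-by-bin (h1 - h2)/(h1 + h2) with propagated Poisson errors.

    Assumes h1 and h2 are independent, unweighted counts on the same binning.
    Empty bins (h1 + h2 == 0) are reported as 0 with 0 error.
    """
    h1 = np.asarray(h1, dtype=float)
    h2 = np.asarray(h2, dtype=float)
    total = h1 + h2
    with np.errstate(divide="ignore", invalid="ignore"):
        asym = np.where(total > 0, (h1 - h2) / total, 0.0)
        err = np.where(total > 0, 2.0 * np.sqrt(h1 * h2 / total**3), 0.0)
    return asym, err

# Toy bin contents (hypothetical); the last bin is empty in both histograms.
h1 = np.array([120, 90, 60, 0])
h2 = np.array([80, 90, 40, 0])
asym, err = asymmetry(h1, h2)
print(asym)
print(err)
```

The same arithmetic can be applied to the bin contents of two boost-histogram or hist objects that share an axis.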
Henry Schreiner
@henryiii
A shortcut for this might be a worthwhile addition to hist.
naidoo88
@naidoo88

Hi @jpivarski,
That is indeed true - but a look at the source code reveals that in reality there is also weighting being applied to the result. I was wondering if it had already been implemented somewhere before I "reinvent the wheel", so to speak!

I think it would hopefully see some use, @henryiii. I work in hadron structure, and a lot of observables boil down to some sort of beam-spin asymmetry!

naidoo88
@naidoo88
Hi again all,
I am currently playing around with boost_hist. Is it possible to use the project method to project only a range of bins?
Specifically, I would like to select N-bins of the x axis, and project this slice of the total sample to the y-axis.
Henry Schreiner
@henryiii
@naidoo88 Yes, use h[3:8:sum, :] - this will sum over bins 3-7 of the x axis and remove it, and will leave the y axis.
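To make the bin arithmetic concrete, here is the plain-array analogue of that slice, sketched with NumPy on toy data (hypothetical values). It only mirrors what happens to the counts; it does not model boost-histogram's axis metadata or its treatment of underflow and overflow bins.

```python
import numpy as np

# Toy 2D histogram: 10 bins in x over [0, 10], 20 bins in y over [-4, 4].
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, 5_000)
y = rng.normal(0, 1, 5_000)
H, xedges, yedges = np.histogram2d(x, y, bins=[10, 20], range=[[0, 10], [-4, 4]])

# Sum x bins 3 through 7 (the stop index 8 is exclusive) and keep the y axis.
y_projection = H[3:8, :].sum(axis=0)
print(y_projection.shape)
```

The result has one entry per y bin and contains only the entries whose x value fell in the selected bins.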
naidoo88
@naidoo88
Thank you @henryiii, it was sum I was missing. Appreciate it. Just wanna say the more I explore the package the more I appreciate it. Great work!
Jim Pivarski
@jpivarski

(Sorry that I'm reposting this everywhere; I want everyone to be warned.)

The Awkward/Uproot name transition is done, at least at the level of release candidates. If you do

pip install "awkward>=1.0.0rc1" "uproot>=4.0.0rc1"

you'll get Awkward 1.x and Uproot 4.x. (They don't strictly depend on each other, so you could do one, the other, or both.)

If you do

pip install "awkward1>=1.0.0rc1" "uproot4>=4.0.0rc1"

you'll get thin awkward1 and uproot4 packages that just bring in the appropriate awkward and uproot and pass names through. This is so that uproot4.whatever still works.

If you do

pip install awkward0 uproot3    # or just uproot3

you'll get the old Awkward 0.x and Uproot 3.x that you can import ... as .... This also brings in uproot3-methods, which is a new name just to avoid compatibility issues with old packages that we saw last week.

All of the above are permanent; they will continue to work after Awkward 1.x and Uproot 4.x are full releases (not release candidates). However, the following will bring in old packages before the full release and new packages after the full release.

pip install awkward uproot

So it is only the full release that will break scripts, and only when users pip install --upgrade. I plan to take that step this weekend, when there might be fewer people actively working. It also gives everyone a chance to provide feedback or take action with import ... as ....

Jim Pivarski
@jpivarski

(Sorry for the reposting, if you saw this message elsewhere.)

Probably the last message about the Awkward Array/Uproot name transition: it's done. The new versions have moved from release candidates to full releases. Now when you

pip install awkward uproot

without qualification, you get the new ones. I think I've "dotted all the 'i's of packaging" to get the right dependencies and tested all the cases I could think of on a blank AWS instance.

• pip install awkward0 uproot3 returns the old versions (Awkward 0.x and Uproot 3.x). The prescription for anyone who needs the old packages is import awkward0 as awkward and import uproot3 as uproot.
• pip install awkward1 uproot4 returns thin wrappers of the new ones, which point to whatever the latest awkward and uproot are. They pass through to the new libraries, so scripts written with import awkward1, uproot4 don't need to be changed (though you'll probably want to, for simplicity).
• uproot-methods no longer causes trouble because there's an uproot3-methods in the dependency chain: awkward0 → uproot3-methods → uproot3. The latest uproot-methods (no qualification) now excludes Awkward 1.x so that they can't be used together by mistake.
Henry Schreiner
@henryiii
If anyone finds this useful, I recently wrote and taught this: https://henryiii.github.io/level-up-your-python
Eduardo Rodrigues
@eduardo-rodrigues
HSF PyHEP Topical Meetings
As discussed at the PyHEP 2020 workshop, we're starting a series of topical meetings, loosely organized around a different Python module each month. So far, we have the following lined up:
• February 3, 2021: Numba presented by Jim Pivarski
• March 3, 2021: JAX presented by Hans Dembinski
• April 7, 2021: pyhf presented by Giordon Stark, Lukas Heinrich, and/or Matthew Feickert
• continuing on the first Wednesday of each month.
Each of these will be one hour, starting at 16:00 Central European time (CERN), which is 10am U.S. Eastern, 7am U.S. Pacific, midnight in Tokyo, and 8:30pm in India.
Next Wednesday's Numba tutorial will be presented on Zoom (Indico agenda) with an interactive Jupyter notebook in Binder (GitHub repo). No registration is required; just show up if you're interested!
(See the intro slides and notebook to get a sense of what is planned!)
dano0014
@dano0014
Hello all, I am having trouble accessing a specific piece of data in a ROOT file while using uproot. Within the ROOT file there are a few TTrees, but the only relevant one is the Event TTree. Within that Event TTree there is a group that shows up with the typename "std::vector<ChannelData> channels", as each event has some number of channels. Each channel class has a member, "pulse". The pulse member shows up with the typename "std::vector<Pulse> pulses". I am looking for the information in the Pulse group, but it is showing up as a pointer (I think). The interpretation says "AsObjects(AsArray(True,False,None" but that is all that I can see. In ROOT the code would look like "event->channels[0].pulses[0].nph" and I would like to know how to convert that to Python. I have also sent the Jupyter Notebook so you can better understand what I mean. Thank you for any help you can provide.
Jim Pivarski
@jpivarski

@dano0014 What happens when you try

puls.array()       # get it as an Awkward Array (fast, if possible)

or

puls.array(library="np")   # get it as a NumPy array of Python objects

That is, try to use the default interpretation (which can be seen without the table limitations using puls.interpretation).

Oh, I see: the Pulse class failed to get an interpretation, but instead of raising an error (as it should), it returned None. That would be a bug. I can investigate it if you post it as a GitHub Issue with the original file (rawdaq_1810251534.root).

https://github.com/scikit-hep/uproot4/issues/new?assignees=&labels=bug+%28unverified%29&template=bug-report.md&title=

dano0014
@dano0014
@jpivarski thank you for the speedy response. The bug report is now #288 on GitHub.
Jim Pivarski
@jpivarski
Since Gitter automatically made a link to an HSF repo, the actual bug report is at scikit-hep/uproot4#288.
naidoo88
@naidoo88
Hi all,
I am trying to figure out if I can divide two histograms with boost_histogram. I can see scalar division in the documentation, but nothing for dividing one histogram by another.
Skagamaster
@Skagamaster
@jpivarski I was at the tutorial for the STAR collaboration, and I recall you had listed some sources for using vector and hist classes in Python, but I can't find them. Might you have a link to some info on those?
Jim Pivarski
@jpivarski
@Skagamaster There were some examples in the tutorial notebook itself, but beyond that, I don't have anything but the documentation for these projects (also linked to from the tutorial). This is a good place to ask, as there may be other people here with a good answer for you.
Ethan Simpson
@els285

Hi there, I have a question regarding cuts on trees in uproot. I find that if I have some tree with a number of branches, some cuts work and some do not; it looks as if a cut only works on branches of the same physics object...

tree = uproot.open("<filename>:<treename>")
df1 = tree.arrays("el_pt","(el_pt>100000)")
df2 = tree.arrays("mu_pt","(el_pt>100000)")

df1 is produced fine; df2 produced the following error:

  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/uproot/behaviors/TBranch.py", line 1181, in arrays
    self.object_path,
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/uproot/language/python.py", line 480, in compute_expressions
    output[name] = output[name][cut]
  File "/home/linuxbrew/.linuxbrew/opt/python/lib/python3.7/site-packages/awkward/highlevel.py", line 993, in __getitem__
    tmp = ak._util.wrap(self.layout[where], self._behavior)
ValueError: in ListArray64 attempting to get 0, index out of range

(https://github.com/scikit-hep/awkward-1.0/blob/1.4.0/src/cpu-kernels/awkward_ListArray_getitem_jagged_apply.cpp#L46)

Is there an alternative way to apply such a cut without loading all branches of the TTree into a Pandas dataframe, which I want to avoid for memory reasons?

Jim Pivarski
@jpivarski

@ethansimpson285 If el_pt and mu_pt are jagged arrays (different number of pT values in each event) and they are different from each other (event i can have a different number of electrons than muons), then this cut is not valid for the reason shown: el_pt>1000000 is a cut on electrons, "select electrons for which the electron pT is greater than 1000000 GeV", which can't be applied to muons because there isn't necessarily a muon for each electron.

If you intended to cut events, perhaps "select events for which some electron pT is greater than 1000000 GeV," then that would be

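ak.any(el_pt > 1000000, axis=1)   # axis=1 reduces within each event; the default axis=None reduces everything to one boolean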

There are other choices, like "select events in which all electron pTs are greater than 1000000," "select events in which the average electron pT is greater than 1000000," etc. Those other possibilities would be constructed using other Awkward Array functions. (In case you haven't found the Awkward Array documentation, here's the page on ak.any: https://awkward-array.readthedocs.io/en/latest/_auto/ak.any.html)

I'd encourage you to do this interactively, either in a terminal or in a Jupyter notebook, and to not use the cut string option in Uproot. That would give you more insight into what's going on because you'd be able to look at the el_pt array before and after calling > 1000000 on it, and also look at the mu_pt. Noticing a difference in the length of the first event might have been a clue as to what was going on: this shouldn't be like a black box. (I've been having a lot of misgivings about the cut option in TTree.arrays because it's hiding information from people. It was supposed to be a shortcut.) To do it without the cut option, start with

print(tree.keys(filter_name=["el_pt", "mu_pt"]))   # tests a filter
events = tree.arrays(filter_name=["el_pt", "mu_pt"])   # reads everything that passes the filter
el_pt = events["el_pt"]
mu_pt = events["mu_pt"]

(See TTree.arrays for more on the filter arguments.)

The cut can be constructed as

good_events = ak.any(el_pt > 1000000, axis=1)   # one boolean per event
el_pt[good_events]
mu_pt[good_events]
events[good_events]   # can also apply it to the whole package in one step

Interactively, you should see a distinction between el_pt > 1000000 (a jagged array of booleans) and ak.any(el_pt > 1000000, axis=1) (a flat array of booleans, one per event).
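To see that distinction without any dependencies, here is a small sketch that mimics the two kinds of mask with plain Python lists. The toy values and the threshold are hypothetical. In Awkward Array the corresponding expressions would be el_pt > cut for the jagged mask and ak.any(el_pt > cut, axis=1) for the per-event mask.

```python
# Three toy events; the inner lists have different lengths (jagged).
el_pt = [[120.0, 35.0], [], [50.0]]
mu_pt = [[40.0], [60.0, 20.0], []]
cut = 100.0

# Object-level mask: one boolean per electron, same jagged shape as el_pt.
el_mask = [[pt > cut for pt in event] for event in el_pt]

# Event-level mask: one boolean per event, flat.
good_events = [any(event) for event in el_mask]

# A flat mask can select events from any branch, whatever its inner lengths.
selected_mu = [mu for mu, keep in zip(mu_pt, good_events) if keep]

print(el_mask)
print(good_events)
print(selected_mu)
```

The jagged mask only lines up with el_pt, because the muon lists have different lengths, whereas the flat mask has exactly one entry per event and therefore applies to every branch.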