Hans Dembinski
@HDembinski
Another weird thing: Numba has a typed list, numba.typed.List, which basically does this already and the performance seems to be good. The problem is that Numba does not provide a way to view this list as a numpy array, even though this should be trivial.
There is an open issue for this, but it has "low priority"
numba/numba#4355
Eduardo Rodrigues
@eduardo-rodrigues

Hi @eduardo-rodrigues and @henryiii just wanted to let you know that I love using particle

Thanks a lot @HDembinski, that's great feedback :-)!

BTW, I should say that I will be adding basic info on all nuclei by the end of the week (finally, since it has been requested for a while).
Hans Dembinski
@HDembinski
Awesome :D
The iminuit logo is finally online. Could the Scikit-HEP website be updated to use the logo as well?
https://github.com/scikit-hep/iminuit
Henry Schreiner
@henryiii
Of course!
Henry Schreiner
@henryiii
Eduardo Rodrigues
@eduardo-rodrigues
iminuit logo - nice to see it out there!
Eduardo Rodrigues
@eduardo-rodrigues
Re: nuclei in Particle - here's the kind of thing you will be able to do once the PR on nuclei goes in:
>>> print(Particle.dump_table(filter_fn=lambda p: p.pdgid.is_nucleus and p.pdgid.A==32, tablefmt='rst',exclusive_fields=['pdgid', 'name', 'latex_name', 'mass', 'charge']))
===========  ======  ===========================  =============  ========
      pdgid  name    latex_name                            mass    charge
===========  ======  ===========================  =============  ========
1000100320  Ne32    ^{32}\mathrm{Ne}             29844.986969         10
-1000100320  Ne32~   ^{32}\mathrm{\overline{Ne}}  29844.986969         10
1000110320  Na32    ^{32}\mathrm{Na}             29826.1148987        11
-1000110320  Na32~   ^{32}\mathrm{\overline{Na}}  29826.1148987        11
1000120320  Mg32    ^{32}\mathrm{Mg}             29807.0192697        12
-1000120320  Mg32~   ^{32}\mathrm{\overline{Mg}}  29807.0192697        12
1000130320  Al32    ^{32}\mathrm{Al}             29796.7448898        13
-1000130320  Al32~   ^{32}\mathrm{\overline{Al}}  29796.7448898        13
1000140320  Si32    ^{32}\mathrm{Si}             29783.7301475        14
-1000140320  Si32~   ^{32}\mathrm{\overline{Si}}  29783.7301475        14
1000150320  P32     ^{32}\mathrm{P}              29783.5057133        15
-1000150320  P32~    ^{32}\mathrm{\overline{P}}   29783.5057133        15
1000160320  S32     ^{32}\mathrm{S}              29781.7950523        16
-1000160320  S32~    ^{32}\mathrm{\overline{S}}   29781.7950523        16
1000170320  Cl32    ^{32}\mathrm{Cl}             29794.4804277        17
-1000170320  Cl32~   ^{32}\mathrm{\overline{Cl}}  29794.4804277        17
1000180320  Ar32    ^{32}\mathrm{Ar}             29805.6313435        18
-1000180320  Ar32~   ^{32}\mathrm{\overline{Ar}}  29805.6313435        18
1000190320  K32     ^{32}\mathrm{K}              29828.2293902        19
-1000190320  K32~    ^{32}\mathrm{\overline{K}}   29828.2293902        19
===========  ======  ===========================  =============  ========
(Masses are in MeV as HEP standard unit.)
Hans Dembinski
@HDembinski
@henryiii Thank you that was super quick!
@eduardo-rodrigues It took me a while because I wanted to replace the original font in the logo with a free font
That table looks very impressive and it is markdown-compatible, isn't it?
I was not aware of "dump_table" but that sounds very handy
For my current project, I wanted a table of all long-lived particles (lifetime > 30 picoseconds). I could have used this, I think
Hans Dembinski
@HDembinski
Ah, I see, it is reStructuredText
And you can tell it which format to use, nice
The tabulate package seems pretty cool
Eduardo Rodrigues
@eduardo-rodrigues
That's right, the example above is reStructuredText. But tabulate allows you to print out in many other formats.
I had presented Particle.dump_table(...) at PyHEP 2019, though I ran (too) quickly over a lot of material ...
Henry Schreiner
@henryiii
Tabulate is an optional dependency of pandas and is used by .to_markdown(); I just noticed that earlier when writing a post
Eduardo Rodrigues
@eduardo-rodrigues
Here is what you wanted to do, @HDembinski (the columns below are pdgid, name, and lifetime):
>>> from hepunits import ps
>>> from particle import Particle
-----------  -------  --------------------------
11  e-                inf
-11  e+                inf
13  mu-              2196.98034989
-13  mu+              2196.98034989
21  g                 inf
22  gamma             inf
130  K(L)0              51.1431197705
211  pi+                26.0327460626
-211  pi-                26.0327460626
310  K(S)0               0.0895429002893
321  K+                 12.3793859591
-321  K-                 12.3793859591
2112  n        879374684632
-2112  n~       879374684632
2212  p                 inf
-2212  p~                inf
3112  Sigma-              0.147912798078
-3112  Sigma~+             0.147912798078
3122  Lambda              0.263179508775
-3122  Lambda~             0.263179508775
3222  Sigma+              0.0801817458213
-3222  Sigma~-             0.0801817458213
3312  Xi-                 0.16373431628
-3312  Xi~+                0.16373431628
3322  Xi0                 0.289961212091
-3322  Xi~0                0.289961212091
3334  Omega-              0.0815628192623
-3334  Omega~+             0.0815628192623
1000000010  n        879374684632
-1000000010  n~       879374684632
1000010010  p                 inf
-1000010010  p~                inf
The lifetime is given in ns, the standard HEP unit of time.
Jim Pivarski
@jpivarski

I was playing around with Numba a lot these days for a project and wanted to share some surprising results.
In this notebook, I am testing several low-level ways of making a dynamically growing numpy array inside a Numba-accelerated function. https://github.com/HDembinski/testing_grounds/blob/master/notebook/Growing%20Array%20in%20Numba.ipynb

@HDembinski Actually, this makes a good demo of lowering in Numba: it requires a lot of undocumented features. I made an example here: https://gist.github.com/jpivarski/7bc83e5aa70d5e3dd8483eb49800885c

And I guess I ought to put a lot of comments in it. But this could make a good teaching example, since it touches on just about everything.

But the punchline is
buf = GrowableBuffer(float, initial=10)
buf.append(1.1)
buf.append(2.2)
buf.append(3.3)

@numba.njit
def test6(x):
    x.append(4.4)
    x.append(5.5)

test6(buf)
assert numpy.asarray(buf).tolist() == [1.1, 2.2, 3.3, 4.4, 5.5]
Jim Pivarski
@jpivarski
I just added a bunch of comments to make it more pedagogical. I think an "Extending Numba" tutorial could be written around it.
Henry Schreiner
@henryiii
Jim Pivarski
@jpivarski
Possibly, although the vector one has Awkward dependencies, and this one isn't connected to anything else.
Hans Dembinski
@HDembinski
But how fast is it?
     newbuffer = numpy.zeros(reservation, dtype=self._buffer.dtype)
numpy.empty would be better here, since you are overwriting the memory in the next line anyway
I am sceptical that this is going to be faster. It looks more or less like the "ArrayBuilder" in my notebook, and that was pretty slow, although I used @jitclass, which lowers the whole class
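To illustrate the numpy.empty vs numpy.zeros point: both allocate a buffer, but zeros additionally writes to every element, which is wasted work when the buffer is immediately overwritten. A toy illustration (not the code from the gist):

```python
import numpy as np

n = 1_000_000

# zeros allocates AND initializes every element to 0.0 ...
a = np.zeros(n, dtype=np.float64)

# ... while empty only allocates; the contents are whatever was in memory.
b = np.empty(n, dtype=np.float64)

# In a reallocation like the one quoted above, every element is copied
# over right away, so the zero-initialization is pure overhead:
old = np.arange(10, dtype=np.float64)
new = np.empty(2 * len(old), dtype=old.dtype)
new[: len(old)] = old  # overwrite; the uninitialized tail is never read
```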
Hans Dembinski
@HDembinski

def __init__(self, dtype, initial=1024, resize=2.0):

The default should rather be 1.5, like in std::vector
https://stackoverflow.com/questions/1100311/what-is-the-ideal-growth-rate-for-a-dynamically-allocated-array

For those who didn't look into my notebook, the fastest version (the last one) was 141 times as fast as the version based on @jitclass https://github.com/HDembinski/testing_grounds/blob/master/notebook/Growing%20Array%20in%20Numba.ipynb
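As a point of reference for this discussion, here is a minimal pure-Python sketch of the growing-buffer pattern with a 1.5x growth factor. This is a simplification of the idea, not Jim's lowered implementation from the gist; the class and attribute names are invented.

```python
import numpy as np

class GrowingArray:
    """Toy append-only buffer backed by a NumPy array (illustrative only)."""

    def __init__(self, dtype, initial=1024, resize=1.5):
        assert initial > 0 and resize > 1.0
        self._buffer = np.empty(initial, dtype=dtype)
        self._length = 0
        self._resize = resize

    def append(self, value):
        if self._length == len(self._buffer):
            # Grow geometrically; the "+ 1" guarantees progress even
            # for tiny buffers where int(...) would round down.
            reservation = int(len(self._buffer) * self._resize) + 1
            newbuffer = np.empty(reservation, dtype=self._buffer.dtype)
            newbuffer[: self._length] = self._buffer[: self._length]
            self._buffer = newbuffer
        self._buffer[self._length] = value
        self._length += 1

    def __array__(self, dtype=None, copy=None):
        # View of the filled portion, trimmed to the logical length.
        return np.asarray(self._buffer[: self._length], dtype=dtype)

g = GrowingArray(np.float64, initial=2)
for i in range(10):
    g.append(float(i))
```

Appending is amortized O(1): each element is copied at most a constant number of times over the buffer's lifetime, regardless of the final length.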
Henry Schreiner
@henryiii
@HDembinski Can you copy-and-paste it and try it on the same hardware?
Hans Dembinski
@HDembinski

Sure. I pasted Jim's code into my notebook. It seemed to work, but running

@nb.njit
def fill(x):
    b = GrowableBuffer(float, 0)
    for xi in x:
        b.append(xi)
    return b.__array__()
fill(x)

%timeit fill(x)

got me this error

---------------------------------------------------------------------------
TypingError                               Traceback (most recent call last)
<ipython-input-13-88dccecb2396> in <module>
5         b.append(xi)
6     return b.__array__()
----> 7 fill(x)
8
9 get_ipython().run_line_magic('timeit', 'fill(x)')

399                 e.patch_message(msg)
400
--> 401             error_rewrite(e, 'typing')
402         except errors.UnsupportedError as e:
403             # Something unsupported is present in the user code, add help info

342                 raise e
343             else:
--> 344                 reraise(type(e), e, None)
345
346         argtypes = []

666             value = tp()
667         if value.__traceback__ is not tb:
--> 668             raise value.with_traceback(tb)
669         raise value
670

TypingError: Failed in nopython mode pipeline (step: nopython frontend)
Untyped global name 'GrowableBuffer': cannot determine Numba type of <class 'type'>

File "<ipython-input-13-88dccecb2396>", line 3:
def fill(x):
b = GrowableBuffer(float, 0)
Jim Pivarski
@jpivarski
@HDembinski Ideal resize is 1.5, rather than 2.0? That's great to know (I'll start using that everywhere).
Your error comes from the fact that __array__ isn't a lowered function, but it could be.
It would be similar to the _buffer property, except trimmed with the trim function.
About numpy.zeros: yes, it was originally numpy.empty, but I switched to zeros while debugging. I'll switch it back in the gist.
Jim Pivarski
@jpivarski
In cases where I've tried @jitclass, the performance was underwhelming. The same is true of iterators in Numba: it's possible, but it's considerably faster to write what would be idiomatic C than idiomatic Python. I hope this lowered GrowableBuffer works for you, since that's what I would use for a performance-critical application like this, rather than @jitclass.
Oh, sorry: your first error is from trying to create a GrowableBuffer inside the function, which is another function we could add. (We would have to switch into Python mode to make it, just like _ensure_reserved, so it would be another slow, rarely called function like that.) The second error you would encounter would be from .__array__(), unless we add that too. But these are both embellishments; to test it, you can create the GrowableBuffer outside the function and convert it to a NumPy array outside as well.
Hans Dembinski
@HDembinski
Could you post a snippet that would work, so I can try the benchmark?
I could accept that @jitclass is not doing well yet, because it is a rather new feature, but the surprising thing was that just putting the growth code into a separate jitted function also made the code much slower. That should not happen if Numba inlined properly. AFAIK they do inline jitted code, and in other tests Numba did spectacularly well, even in comparison with pybind11 C++ code, so... I am really puzzled by this.
Whatever is going on, my benchmark is particularly sensitive to this effect. In my real code, the work that has to be done before each call to .append is significant, so the performance hit shouldn't be so bad
Jim Pivarski
@jpivarski
(Actually, @jitclass has been around for a couple of years, and I looked into its implementation: it's like List[T], carrying a mutable buffer for the class instance attributes, and it's all specialized, so the only thing that might be an issue is excessive reference-counting. I don't know what's going wrong.)
This ought to work:
@nb.njit
def fill_loop(x, b):
    for xi in x:
        b.append(xi)

def fill(x):
    b = GrowableBuffer(float)
    fill_loop(x, b)
    return b.__array__()

%timeit fill(x)
Jim Pivarski
@jpivarski

The initial parameter can't be zero because it's the initial reservation, not the initial length. I had an

assert resize > 1.0

to sanity-check the resize parameter, but we also need to have

assert initial > 0

I don't know how much of a "cheat" it would be to make the initial very large—then you wouldn't be measuring the resize operation, just filling an array.

I thought that initial=1024 would be reasonable (small enough that if you have a lot of barely filled arrays for some reason it's not going to be a problem, but big enough that the exponential growth gets started soon: no ramp-up through 1 item, 2 items, 4 items, etc.).

(Except `resize` is now 1.5; in Awkward as well. Interesting to hear about the golden ratio!)
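The trade-off Jim describes can be quantified: with geometric growth, the number of reallocations needed to reach N elements is logarithmic in N, and a larger `initial` just shifts where the curve starts. A quick back-of-the-envelope sketch (pure Python; the function name and numbers are mine, not from the chat):

```python
def reallocations(n, initial=1024, resize=1.5):
    """Count how many times a geometrically growing buffer with starting
    capacity `initial` must reallocate before it can hold n elements."""
    assert initial > 0 and resize > 1.0
    capacity, count = initial, 0
    while capacity < n:
        capacity = int(capacity * resize) + 1
        count += 1
    return count

# Ramping up from 1 item costs noticeably more reallocations than
# starting from 1024, while 1024 floats is only 8 kB of memory:
for initial in (1, 1024):
    print(initial, reallocations(10_000_000, initial=initial))
```

This is why `initial=1024` is a reasonable compromise: small enough that many barely-filled buffers are cheap, big enough to skip the 1, 2, 4, ... ramp-up entirely.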