    Luke Kreczko
    @kreczko
    I see. So things should start to improve now as many in HEP are moving to CentOS 7
    Henry Schreiner
    @henryiii
    Or CentOS 8 - have no idea how long the building process will take, but am quite excited to have reasonably modern GCC :)
    Avoiding Python 2.6 is a good reason to get off of SLC 6, though. It’s becoming really hard to support in Plumbum.
    Luke Kreczko
    @kreczko
    @henryiii I did not realise you are the #1 maintainer of plumbum (https://github.com/tomerfiliba/plumbum/graphs/contributors)! Thx a lot! It is a very useful library
    BenGalewsky
    @BenGalewsky
    Does anyone have experience using the arrow serialization in awkward array? I have what should be a simple example that I cannot get to work
    Albert Puig
    @apuignav

    Dear all,
    as part of our work in zfit, we have released a standalone package for multibody phasespace generation à la TGenPhaseSpace. It's pure Python, based on TensorFlow. On top of simple phasespace generation, it can build more complex decay chains, handle resonances, etc., all in a very simple manner.

    The package is called phasespace (https://github.com/zfit/phasespace), it's well documented and fairly easy to use, so no excuses for not trying :-)

    Let us know what you think, we highly appreciate any feedback and suggestions from the software community here

    Jonas+Albert

    Hans Dembinski
    @HDembinski
    Good, I guess.
    For those who are forced to use Windows
    Matthieu Marinangeli
    @marinang
    Hi, does any of you know if the "moment morphing method" described in this paper https://www.sciencedirect.com/science/article/pii/S0168900214011814 is implemented in Python and outside of ROOT? It is used to interpolate pdf shapes when you want to do scans, for instance in my case an LLP search with different masses and lifetimes. Or maybe there is a better technique nowadays?
    Hans Dembinski
    @HDembinski
    @marinang I don't know this method and I didn't read the paper yet, but at first glance it seems inferior to Alexander Read's interpolation https://inspirehep.net/record/501018/
    Which I independently rediscovered... ten years later
    For simple distributions it works very nicely
    Hans Dembinski
    @HDembinski
    OK, after looking into the paper, I can see that the authors claim the moment interpolation is better than the Read method
    I am not aware of a Python implementation of either method
    This would be a worthwhile project
    Matthieu Marinangeli
    @marinang
    Yeah, apart from RooMomentMorph I haven't seen anything
    A case with 1 observable and 1 morphing parameter is actually very easy to reproduce.
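    As a sketch of that simplest case (1 observable, 1 morphing parameter), one could interpolate the first two moments of two reference Gaussians linearly in the morphing parameter. All the numbers below are invented for illustration, and this only captures the basic idea behind moment morphing, not the full method from the paper:

```python
import numpy as np

# Two hypothetical reference shapes at morphing-parameter values m0 and m1;
# the means and widths are made-up numbers, not from any real analysis.
m0, m1 = 100.0, 200.0
mu0, sigma0 = 5.0, 1.0
mu1, sigma1 = 8.0, 2.0

def gauss(x, mu, sigma):
    """Normalised Gaussian pdf."""
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def morphed_pdf(x, m):
    # Linear interpolation of the first two moments in m: the core idea
    # behind moment morphing, reduced to the trivial Gaussian case.
    f = (m - m0) / (m1 - m0)
    mu = (1 - f) * mu0 + f * mu1
    sigma = (1 - f) * sigma0 + f * sigma1
    return gauss(x, mu, sigma)
```

    At the reference points the morph reproduces the reference shapes exactly, and in between it stays a properly normalised pdf.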
    Nicholas Smith
    @nsmith-
    is that paper not describing the RooFit version? I see W. Verkerke in authors
    Matthieu Marinangeli
    @marinang
    Yes it does, this is RooMomentMorph https://root.cern.ch/doc/master/classRooMomentMorph.html
    Hans Dembinski
    @HDembinski
    A contributor is working on an ASCII display of 1D histograms for Boost.Histogram.
    boostorg/histogram#74
    We are discussing two different versions of the display; what is your preference?
    Hans Dembinski
    @HDembinski
    Hey all, iminuit 1.3.7 is out! 🥳 This time really with wheels, so it installs in a flash and you don't need a compiler. Big thanks go to @henryiii, who developed the Azure Pipeline that generates the wheels, which was a lot of work. Special thanks go to @eduardo-rodrigues, who tested the packages before the release.
    Luke Kreczko
    @kreczko

    stupid question. I have two numpy arrays, a and b, and I want to group elements in b by the unique elements in a, i.e.

    a = [1, 12, 1, 50]
    b = [10, 20, 30, 40]
    result == [[10, 30], [20], [40]]

    Is there a numpy function that does that? It seems I am missing the right keyword in my searches

    Luke Kreczko
    @kreczko
    The current solution uses a for-loop, which I would like to get rid of:
    unique_a = np.unique(a)
    result = []
    for u in unique_a:
        result.append(b[a == u].tolist())
    Chris Burr
    @chrisburr
    @kreczko In pandas this is known as groupby:
    import numpy as np
    import pandas as pd
    a = np.arange(10)+0.1
    b = np.random.randint(4, size=10)
    df = pd.DataFrame({'a': a, 'b': b})
    list(df.groupby('b')['a'].apply(list))
    I can't think of a numpy equivalent but I'd guess searching numpy groupby will get you there
    Luke Kreczko
    @kreczko
    thanks @chrisburr, yes, that's what I thought at first, which led me to https://stackoverflow.com/questions/38013778/is-there-any-numpy-group-by-function and then to my for-loop solution.
    pandas groupby is nice, but I will need to test performance, I guess.
    Alternatively, there is always numba
    benkrikler
    @benkrikler

    I think awkward array might be able to help instead of pandas (not tested):

    reorder = np.argsort(a)
    _, counts = np.unique(a[reorder], return_counts=True)
    result = awkward.JaggedArray.fromcounts(counts, b[reorder])

    The only thing I'm unsure of there is the order of the unique counts, I'm assuming the unique method returns things in the order they're first seen, but I suspect that's not true.

    Luke Kreczko
    @kreczko
    np.unique returns them sorted
    benkrikler
    @benkrikler
    Then I guess you need to use the inverse array somehow (with return_inverse)
    Luke Kreczko
    @kreczko
    that's a good point, did not realize my end-result is basically an awkward.JaggedArray
    ok, I will finish the unit tests and then try to change the implementation
    Jelle Aalbers
    @JelleAalbers

    For a pure numpy solution,

    a = np.array([1, 12, 1, 1, 1, 12, 12, 50])
    b = np.arange(len(a))
    
    si = np.argsort(a)
    np.split(b[si], np.where(np.diff(a[si]))[0] + 1)

    seems to work. Haven't tested it in detail though.
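    Running that snippet against the for-loop baseline on the original toy data confirms it (a quick sketch, with the expected `[40]` in the last group since `np.unique` sorts the keys):

```python
import numpy as np

a = np.array([1, 12, 1, 50])
b = np.array([10, 20, 30, 40])

# Sort both arrays by a, then cut b wherever the sorted a changes value.
si = np.argsort(a, kind='stable')
groups = np.split(b[si], np.where(np.diff(a[si]))[0] + 1)

# For-loop baseline for comparison.
expected = [b[a == u].tolist() for u in np.unique(a)]
assert [g.tolist() for g in groups] == expected  # [[10, 30], [20], [40]]
```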

    Eduardo Rodrigues
    @eduardo-rodrigues

    Dear colleagues,

    We are pleased to announce the second "Python in HEP" workshop organised by the HEP Software Foundation (HSF). The PyHEP, "Python in HEP", workshops aim to provide an environment to discuss and promote the usage of Python in the HEP community at large.
    PyHEP 2019 will be held in Abingdon, near Oxford, United Kingdom, from 16-18 October 2019.

    The workshop will be a forum for the participants and the community at large to discuss developments of Python packages and tools, exchange experiences, and steer where the community needs and wants to go. There will be ample time for discussion.

    The agenda will be composed of plenary sessions, highlights of which are the following:
    1) A keynote presentation from the Data Science domain.
    2) A topical session on histogramming including a talk and a hands-on tutorial.
    3) Lightning talks from participants.
    4) Presentations following up from topics discussed at PyHEP 2018.

    We encourage community members to propose presentations on any topic (email: pyhep2019-organisation@cern.ch). We are particularly interested in new(-ish) packages of broad relevance.

    The agenda will be made available on the workshop indico page (https://indico.cern.ch/event/833895/) in due time. It is also linked from the PyHEP WG homepage http://hepsoftwarefoundation.org/activities/pyhep.html.

    Registration will open very soon, and we will provide detailed travel and accommodation information at that time.
    Travel funds may be available at a modest level. To be confirmed once registration opens.

    You are encouraged to register to the PyHEP WG Gitter channel (https://gitter.im/HSF/PyHEP) and/or to the HSF forum (https://groups.google.com/forum/#!forum/hsf-forum) to receive further information concerning the organisation of the workshop.

    Looking forward to your participation!
    Eduardo Rodrigues & Ben Krikler, for the organising committee

    Luke Kreczko
    @kreczko
    R.I.P. rootpy documentation:
    http://rootpy.org/
    NOTICE: This domain name expired on 7/11/2019 and is pending renewal or deletion.
    Eduardo Rodrigues
    @eduardo-rodrigues
    Hi @kreczko, try getting in touch with Noel Dawe, see https://github.com/ndawe. He's probably still responsible for the site - just my best guess.
    Nicholas Smith
    @nsmith-
    @kreczko here's an option that also delays the reindex of b and preserves the order of first-seen values:
    import numpy as np, awkward
    a = np.array([1, 12, 1, 10, 50, 10])
    b = np.array([10, 20, 30, 40, 50, 60])
    arg = a.argsort(kind='stable')
    offsets, = np.where(np.r_[True, np.diff(a[arg]) > 0])
    output = awkward.JaggedArray.fromoffsets(offsets.flatten(), awkward.IndexedArray(arg, b))
    in other news, np.where([0, 1, 0, 0, 1])[0].base is surprisingly 2d (hence the flatten)
    Nicholas Smith
    @nsmith-
    thanks to that nerd-snipe I fixed a bug in awkward
    Luke Kreczko
    @kreczko
    nice!
    Luke Kreczko
    @kreczko

    since the knowledge in this channel proved invaluable before, another question :)
    I have

    group_1 = np.array([(1, 2), (3, 3), (5, 7), (4, 4)])
    test_elements = np.array([(1, 2), (3, 3), (3, 5)])

    and would like to test if the elements in test_elements are in group_1. I expect the result

    [True, True, False]

    as I treat the tuples as single objects.

    Numpy has the function isin where

    np.isin(group_1, test_elements)

    will return

    [[True, True], [True, True], [True, False], [False, False]]

    OK, so this is inverse to what I want, fine.

    np.isin(test_elements, group_1)
    # returns 
    [[True, True], [True, True], [True, True]]

    Clearly it compares element by element: since both 3 and 5 are contained somewhere in group_1, (3, 5) is reported as contained too. But that is not what I want, as the tuple (3, 5) itself is not in group_1.
    Is there a way to do this comparison for each 2-vector instead of element-wise? A for loop (even with numba) is quite slow

    Jonas Eschle
    @jonas-eschle

    Yes, you can do that. The idea: compare every test element against every group element at once. This gives you a rank-three boolean array whose axes are the number of test elements, the number of elements in the group, and the dimension of an element. Then apply two reductions: 1. an all along the tuple axis, since a tuple matches only if every entry is equal, and 2. an any along the axis of all possible combinations, since at least one group tuple has to match fully.

    For example (may change the axis for convenience):

    test_elements_expanded = np.expand_dims(test_elements, axis=1)
    entries_equal = group_1 == test_elements_expanded
    tuple_equal = np.all(entries_equal, axis=2)
    tuple_contained = np.any(tuple_equal, axis=0)
    Luke Kreczko
    @kreczko
    interesting. Thank you, I will try this out!
    The dimension of entries_equal is off :(
    Luke Kreczko
    @kreczko

    For the first example the result is

    E         - [True, True, False, False]
    E         ?             -------
    E         + [True, True, False]

    for the 2nd (group_2 = np.array([(0, 0), (1, 2), (2, 2), (3, 3)])):

    E         - [False, True, False, True]
    E         + [True, True, False]
    Jonas Eschle
    @jonas-eschle
    ah sorry, wrong axis! the last reduce is along the possible combinations which are in axis 1, not 0. So change the last line to:
    tuple_contained = np.any(tuple_equal, axis=1)
    Axis 0 lists all test samples. In axis 1 are the possible combinations. In axis 2 is the tuple itself. Reducing axis 2 with all means that an entry is true where all elements in a tuple are true, otherwise false; reducing axis 1 with any means that an entry with at least one matching tuple is true. So you're left with axis 0.
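    Putting the axis fix together, a self-contained version of the broadcasting approach with the arrays from the question:

```python
import numpy as np

group_1 = np.array([(1, 2), (3, 3), (5, 7), (4, 4)])
test_elements = np.array([(1, 2), (3, 3), (3, 5)])

# Broadcast every test tuple against every group tuple:
# (n_test, 1, 2) == (n_group, 2) -> (n_test, n_group, 2)
entries_equal = group_1 == np.expand_dims(test_elements, axis=1)
tuple_equal = np.all(entries_equal, axis=2)    # whole tuple matches
tuple_contained = np.any(tuple_equal, axis=1)  # any group tuple matches
# tuple_contained -> [True, True, False]
```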
    Luke Kreczko
    @kreczko
    that works!