    Jelle Aalbers
    @JelleAalbers

    For a pure numpy solution,

    a = np.array([1, 12, 1, 1, 1, 12, 12, 50])
    b = np.arange(len(a))
    
    si = np.argsort(a)
    np.split(b[si], np.where(np.diff(a[si]))[0] + 1)

    seems to work. Haven't tested it in detail though.
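
    A minimal sketch of what the snippet above produces, with two additions that are assumptions rather than part of the original message: `kind="stable"` so that indices stay in their original order within each group, and a dictionary (`grouped`, an illustrative name) that pairs each distinct value with its indices.

```python
import numpy as np

a = np.array([1, 12, 1, 1, 1, 12, 12, 50])
b = np.arange(len(a))

# A stable sort keeps the indices inside each group in their original order.
si = np.argsort(a, kind="stable")
sorted_a = a[si]

# Positions where the sorted value changes mark the start of a new group.
boundaries = np.flatnonzero(np.diff(sorted_a)) + 1
groups = np.split(b[si], boundaries)

# Pair each distinct value with the indices at which it occurs.
keys = sorted_a[np.r_[0, boundaries]]
grouped = {int(k): g.tolist() for k, g in zip(keys, groups)}
print(grouped)  # {1: [0, 2, 3, 4], 12: [1, 5, 6], 50: [7]}
```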

    Eduardo Rodrigues
    @eduardo-rodrigues

    Dear colleague,

    We are pleased to announce the second "Python in HEP" workshop organised by the HEP Software Foundation (HSF). The PyHEP, "Python in HEP", workshops aim to provide an environment to discuss and promote the usage of Python in the HEP community at large.
    PyHEP 2019 will be held in Abingdon, near Oxford, United Kingdom, from 16-18 October 2019.

    The workshop will be a forum for the participants and the community at large to discuss developments of Python packages and tools, exchange experiences, and steer where the community needs and wants to go. There will be ample time for discussion.

    The agenda will be composed of plenary sessions, a highlight of which is the following:
    1) A keynote presentation from the Data Science domain.
    2) A topical session on histogramming including a talk and a hands-on tutorial.
    3) Lightning talks from participants.
    4) Presentations following up from topics discussed at PyHEP 2018.

    We encourage community members to propose presentations on any topic (email: pyhep2019-organisation@cern.ch). We are particularly interested in new(-ish) packages of broad relevance.

    The agenda will be made available on the workshop indico page (https://indico.cern.ch/event/833895/) in due time. It is also linked from the PyHEP WG homepage http://hepsoftwarefoundation.org/activities/pyhep.html.

    Registration will open very soon, and we will provide detailed travel and accommodation information at that time.
    Travel funds may be available at a modest level. To be confirmed once registration opens.

    You are encouraged to register to the PyHEP WG Gitter channel (https://gitter.im/HSF/PyHEP) and/or to the HSF forum (https://groups.google.com/forum/#!forum/hsf-forum) to receive further information concerning the organisation of the workshop.

    Looking forward to your participation!
    Eduardo Rodrigues & Ben Krikler, for the organising committee

    Luke Kreczko
    @kreczko
    R.I.P. rootpy documentation:
    http://rootpy.org/
    NOTICE: This domain name expired on 7/11/2019 and is pending renewal or deletion.
    Eduardo Rodrigues
    @eduardo-rodrigues
    Hi @kreczko, try getting in touch with Noel Dawe, see https://github.com/ndawe. He's probably still responsible for the site - just my best guess.
    Nicholas Smith
    @nsmith-
    @kreczko here's an option that also delays the reindex of b and preserves the order of first-seen values:
    import numpy as np, awkward
    a = np.array([1, 12, 1, 10, 50, 10])
    b = np.array([10, 20, 30, 40, 50, 60])
    arg = a.argsort(kind='stable')
    offsets, = np.where(np.r_[True, np.diff(a[arg]) > 0])
    output = awkward.JaggedArray.fromoffsets(offsets.flatten(), awkward.IndexedArray(arg, b))
    in other news, np.where([0, 1, 0, 0, 1])[0].base is surprisingly 2d (hence the flatten)
    Nicholas Smith
    @nsmith-
    thanks to that nerd-snipe I fixed a bug in awkward
    Luke Kreczko
    @kreczko
    nice!
    Luke Kreczko
    @kreczko

    since the knowledge in this channel proved invaluable before, another question :)
    I have

    group_1 = np.array([(1, 2), (3, 3), (5, 7), (4, 4)])
    test_elements = np.array([(1, 2), (3, 3), (3, 5)])

    and would like to test if the elements in test_elements are in group_1. I expect the result

    [True, True, False]

    as I take the tuples as unique objects.

    Numpy has the function isin where

    np.isin(group_1, test_elements)

    will return

    [[True, True], [True, True], [True, False], [False, False]]

    OK, so this is inverse to what I want, fine.

    np.isin(test_elements, group_1)
    # returns 
    [[True, True], [True, True], [True, True]]

    Clearly it compares element by element and since both 3 and 5 are contained, therefore (3,5) should be as well, right?
    Well, not in my case. Is there a way to do this comparison for each 2-vector instead of element-wise? For loop (even with numba) is quite slow

    Jonas Eschle
    @mayou36

    Yes, you can do that. The idea is to compare every element with every other element, in all possible combinations. This gives you a rank-three boolean array with the dimensions: number of elements in the group, number of elements to test, dimension of an element. Then apply two reductions: 1. an "all" along the axis of the tuple, so that an entry is true only if everything in the tuple is true, and 2. an "any" along the axis of all the possible combinations, since at least one tuple has to match fully.

    For example (may change the axis for convenience):

    test_elements_expanded = np.expand_dims(test_elements, axis=1)
    entries_equal = group_1 == test_elements_expanded
    tuple_equal = np.all(entries_equal, axis=2)
    tuple_contained = np.any(tuple_equal, axis=0)
    Luke Kreczko
    @kreczko
    interesting. Thank you, I will try this out!
    The dimension of entries_equal is off :(
    Luke Kreczko
    @kreczko

    For the first example the result is

    E         - [True, True, False, False]
    E         ?             -------
    E         + [True, True, False]

    for the 2nd (group_2 = np.array([(0, 0), (1, 2), (2, 2), (3, 3)])):

    E         - [False, True, False, True]
    E         + [True, True, False]
    Jonas Eschle
    @mayou36
    ah sorry, wrong axis! the last reduce is along the possible combinations which are in axis 1, not 0. So change the last line to:
    tuple_contained = np.any(tuple_equal, axis=1)
    Axis 0 lists all test samples. Axis 1 holds the possible combinations. Axis 2 is the tuple itself. Reducing axis 2 with all means that an entry is true only where all elements of a tuple match, otherwise false. Reducing axis 1 with any means that an entry with at least one matching tuple is true. So you're left with axis 0.
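
    Putting the corrected version together as one runnable sketch, using the arrays from the question. The shape comments are annotations added here; the intermediate array has shape (number of test tuples, number of group tuples, 2), so memory grows with the product of the two lengths.

```python
import numpy as np

group_1 = np.array([(1, 2), (3, 3), (5, 7), (4, 4)])
test_elements = np.array([(1, 2), (3, 3), (3, 5)])

# Shape (3, 1, 2): one row per test tuple, plus a length-1 axis to broadcast on.
test_elements_expanded = np.expand_dims(test_elements, axis=1)

# Broadcasting (3, 1, 2) against (4, 2) gives shape (3, 4, 2).
entries_equal = group_1 == test_elements_expanded

# Axis 2 is the tuple itself: a pair matches only if both entries match.
tuple_equal = np.all(entries_equal, axis=2)    # shape (3, 4)

# Axis 1 runs over the candidates in group_1: one full match is enough.
tuple_contained = np.any(tuple_equal, axis=1)  # shape (3,)
print(tuple_contained.tolist())  # [True, True, False]
```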
    Luke Kreczko
    @kreczko
    that works!
    thanks!
    Jelle Aalbers
    @JelleAalbers

    You can also map the tuples to scalars (easy if you have some idea what the values are going to be), e.g:

    def squash(x):
        return 10000 * x[:,0] + x[:,1]
    
    np.in1d(squash(test_elements), squash(group_1))

    This should be more memory-efficient and faster if both arrays are large.

    If you are going to do membership testing repeatedly on the same array, it might be even better to convert it to a set, dictionary or some other object backed by a hash table, so membership tests are a constant time operation.
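
    A minimal sketch of the hash-table variant, assuming small arrays of integer tuples as in the question. `group_set` and `contained` are illustrative names, not from the thread.

```python
import numpy as np

group_1 = np.array([(1, 2), (3, 3), (5, 7), (4, 4)])
test_elements = np.array([(1, 2), (3, 3), (3, 5)])

# Build the hash-backed lookup once; tuples are hashable, array rows are not.
group_set = set(map(tuple, group_1.tolist()))

# Each membership test is then a constant-time lookup on average.
contained = np.array([tuple(row) in group_set for row in test_elements.tolist()])
print(contained.tolist())  # [True, True, False]
```

    The Python-level loop over test_elements is slow per element, so this mainly pays off when the same group is queried many times.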

    Luke Kreczko
    @kreczko
    Comparing the two solutions:
    yours: 28366.0 microseconds
    stackoverflow: 10999.0 microseconds
    @JelleAalbers Where does the 10000 come from?
    it's interesting how many solutions exist for the same operation
    Jelle Aalbers
    @JelleAalbers
    It's just a placeholder; put in whatever you expect the maximum value to be. If your numbers are e.g. arbitrary-sized floats, I guess this solution doesn't work. Though you can probably replace 'squash' with some other function (if you want to go really overboard you can use some cryptographic hash function, though then forget about speed :-)
    Jonas Eschle
    @mayou36

    Comparing the two solutions:
    yours: 28366.0 microseconds
    stackoverflow: 10999.0 microseconds

    So essentially the same speed for this exact problem. At this point it matters whether you call it once or a million times, and how big your array really is. That's when things like presorting can make the difference. My advice: use whichever method you understand/like better (conceptually, not for its speed) and only try to improve on it if it proves to be a bottleneck.
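
    One way to check this on your own data is a quick timeit comparison. The sketch below times the broadcasting and squash approaches from this thread (not the Stack Overflow answer, which is not shown here). The array sizes, the value range and the function names `by_broadcast` and `by_squash` are made up for illustration, and the squash base assumes non-negative integers below 10000.

```python
import timeit
import numpy as np

rng = np.random.default_rng(0)
group = rng.integers(0, 100, size=(500, 2))   # hypothetical sizes
tests = rng.integers(0, 100, size=(200, 2))

def by_broadcast(tests, group):
    # Compare every test pair with every group pair, then reduce.
    return np.any(np.all(group == tests[:, None, :], axis=2), axis=1)

def by_squash(tests, group, base=10000):
    # Map each pair to one integer; valid only while 0 <= values < base.
    squash = lambda x: base * x[:, 0] + x[:, 1]
    return np.isin(squash(tests), squash(group))

# Both approaches should agree before their speed is compared.
assert np.array_equal(by_broadcast(tests, group), by_squash(tests, group))

for fn in (by_broadcast, by_squash):
    seconds = timeit.timeit(lambda: fn(tests, group), number=20) / 20
    print(f"{fn.__name__}: {seconds * 1e6:.1f} microseconds per call")
```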

    Luke Kreczko
    @kreczko
    speed is important :). Your solution sped up the function by almost a factor 300 :)
    benkrikler
    @benkrikler
    Exciting news everyone:

    Registration is now open for PyHEP 2019, in Abingdon, UK, from the 16th to 18th of October! The registration fee for the 2.5 days has been set at £80; it includes the venue, lunches, dinners, and refreshments. We also have about 46 rooms at Cosener’s House, available on a first-come-first-served basis. The actual payment system will not be online for a few more days, however, so you’ll only be able to complete registration then including the room booking.

    The agenda is also shaping up with talks confirmed on topics ranging from histogramming, statistical methods, distributed workflows, visualisation, and even GPU-programming. Several speakers from industry are confirmed, including our keynote speaker on the PyViz library.

    Since the PyHEP series is all about growing a “Python in High Energy Physics” community, this year we’re also including a session of lightning talks where 30 people can present any topic of their choosing for 3 minutes with a single slide as a way for everyone, especially newcomers and early-career researchers, to introduce themselves.

    Community members can also propose presentations on any topic (email: pyhep2019-organisation@cern.ch). We are particularly interested in new(-ish) packages of broad relevance.

    More details can be found on the indico page (https://indico.cern.ch/e/PyHEP2019) or from the PyHEP WG homepage http://hepsoftwarefoundation.org/activities/pyhep.html. You can also join the HSF forum (https://groups.google.com/forum/#!forum/hsf-forum) to get more information about the workshop and community

    Help us spread the word! :slight_smile:
    Hans Dembinski
    @HDembinski
    pyhepmc-ng 0.4.2 was released today
    Hans Dembinski
    @HDembinski
    @benkrikler I tried to complete my registration today, but during checkout I was not offered any payment options. There is a combobox which is supposed to show the options, but it is empty in my case. Is this a problem on my end or ...?
    Chris Burr
    @chrisburr
    we are still finalising the payment system and will let you know, at the email address you used to register, when it is available
    Hans Dembinski
    @HDembinski
    Yes, but that was 12 days ago :) And Ben said: "however, so you’ll only be able to complete registration then including the room booking."
    Not true in my case...
    I can't book any rooms.
    benkrikler
    @benkrikler
    Thanks Hans and Chris. Chris is correct. The payment system is still not set up (the company we've had to use have been extremely difficult to work with; I've been trying to reach them by phone every working day for the last week). We'll let you know straight away. I expect to have it in the next day or two.
    Hans Dembinski
    @HDembinski
    I am sorry to hear that Ben :(
    It is no problem for me, I was just surprised, thanks for clearing this up!
    Eduardo Rodrigues
    @eduardo-rodrigues
    Kind reminder on the PyHEP workshop: the available slots are being filled up at a nice pace, so don't delay registration too much, if you intend to come and participate - we hope you do! See https://indico.cern.ch/e/PyHEP2019
    We kindly ask you to broadcast information of the workshop to your communities/groups/colleagues. Many thanks!
    Chris Tunnell
    @tunnell
    @/all Interested in a Pythonic Postdoc? The XENON Dark Matter experiment software stack is in Python and there's a job to work in that direction (with ML component): https://jobs.rice.edu/postings/20856
    If you know somebody who might be interested, feel free to share.
    revkarol
    @revkarol

    Hi @all We are looking for PhD students in physics, computer science, and data science to attend a three-day OpenHack in September to analyze real physics data from the LHCb experiment at CERN using Microsoft AI technologies.

    An OpenHack is challenge-based rather than instruction-based. Students will work directly with physicists from CERN and Cloud Advocates from Microsoft. They will progress through these challenges to analyze data from LHCb and search for the “unexpected” in particle collisions:

    Data exploration and visualization
    Classification and anomaly detection
    Source control and automation
    AML experimentation
    AML for hyperparameter tuning
    Real-world application of data

    The OpenHack will be held Sept. 11-13 in northern Italy at Fondazione Bruno Kessler, a scientific research institute affiliated with CERN. Students need pay only for their travel and lodging – there is no registration fee for the OpenHack itself. We will help find lodging.

    The registration form is here. Please encourage your students to attend this unique training event and to contact monicar@microsoft.com with any questions.

    Eduardo Rodrigues
    @eduardo-rodrigues
    Hi @revkarol, FYI I've just sent this information to the HSF forum mailing list, and it got through (I was afraid it would bounce, as with my previous attempt).
    Are there any other contacts apart from the one above, from Microsoft?
    Eduardo Rodrigues
    @eduardo-rodrigues

    To @all:
    Registration for the PyHEP 2019 workshop has been extended to September 15th.

    As a reminder, the registration fee for the 2.5 days has been set at £80. It includes the venue, lunches, dinners, and refreshments.
    We still have rooms at Cosener’s House, the venue, available on a first-come-first-served basis.

    The agenda is also shaping up with talks confirmed on topics ranging from histogramming, statistical methods, distributed workflows,
    visualisation, and even GPU-programming. Two speakers from industry are confirmed, including our keynote speaker on the PyViz visualisation project.

    Since the PyHEP series is all about growing a community, this year we’re also including a session of lightning talks
    where 30 people can present any topic of their choosing for 3 minutes, with a single slide, as a way for everyone,
    especially newcomers and early-career researchers, to introduce themselves.

    Community members can also propose presentations on any topic (email: pyhep2019-organisation@cern.ch).
    We are particularly interested in new(-ish) packages of broad relevance.

    Note that partial travel support for some U.S. participants (in particular, students and early-career postdocs)
    may be available from the IRIS-HEP institute. Please contact Peter Elmer (Peter.Elmer@cern.ch) to enquire about details.

    More details can be found on the indico page https://indico.cern.ch/e/PyHEP2019
    or from the PyHEP WG homepage http://hepsoftwarefoundation.org/activities/pyhep.html.
    You can also join the PyHEP WG Gitter channel (https://gitter.im/HSF/PyHEP) and/or
    the HSF forum (https://groups.google.com/forum/#!forum/hsf-forum) to get more information about the workshop and community.

    Hope to see you there!
    Eduardo Rodrigues & Ben Krikler, for the organising committee

    Eduardo Rodrigues
    @eduardo-rodrigues

    HSF PyHEP WG topical meeting on fitting tools, Sep. 11th @ 17h CET

    Dear Python enthusiasts,

    The HSF PyHEP WG is restarting activities post-Summer with topical meetings (not to be confused with the workshop in the UK ;-)).

    The first one will be on the hot and important topic of fitting (tools)! It will take place on Wednesday September 11th at 17h CET.
    The agenda, which you can find at https://indico.cern.ch/event/834210/, contains 2 presentations,
    one from HEP, and one from an astroparticle physics community colleague:

    • The zfit project, Jonas Eschle (Universitaet Zuerich)
    • Numpy-based Python fitting frameworks Astropy & Sherpa, Christoph Deil (MPI for Nuclear Physics, Heidelberg)

    Take this opportunity of cross-exchange to come and discuss needs, technical design, functionality requirements, etc.!

    Hoping to see you there!
    Eduardo, for the PyHEP WG conveners

    P.S.: Note that a second topical meeting on fitting tools will likely happen as a follow-up.

    benkrikler
    @benkrikler
    Has anyone here ever been involved with Hacktoberfest: https://hacktoberfest.digitalocean.com/ ?
    Luke Kreczko
    @kreczko

    Has anyone here ever been involved with Hacktoberfest: https://hacktoberfest.digitalocean.com/ ?

    have two t-shirts that say "yes, I have"

    Pratyush Das
    @reikdas
    @benkrikler Yes :)
    benkrikler
    @benkrikler
    Cool! I'm definitely going to join this year. And promise not to contribute to just my own project :p
    benkrikler
    @benkrikler
    I've just heard through the UK's Software Sustainability Institute of the US' Better Scientific Software community. They have a fellowship scheme for researchers that are affiliated to a US institute which lasts for a year and provides funds for specific activities. The application for 2020 is now open until mid-October: https://bssw.io/. Share around and let's see if we can't get some particle physicists on it :)
    benkrikler
    @benkrikler
    It's also open for all career stages from PhD students to senior professionals