    Henry Schreiner
    @henryiii
    I think there are some useful parts, it would probably be better than starting from scratch. I don’t remember all the details, though. Yes, I can move it to github.
    Matthieu Marinangeli
    @marinang

    Hi all, the first release of scikit-stats, part of the scikit-hep toolset, has just been published.

    It contains 2 submodules:

    • modeling: with the Bayesian Blocks algorithm that moved from scikit-hep to scikit-stats.

    • hypotest: aims to provide tools for likelihood-based hypothesis tests, such as discovery tests and the computation of upper limits or confidence intervals. Currently, a discovery test
      using asymptotic formulae is available. More functionality will be added in the future.

    Quick documentation is available in the README, with notebook examples on Binder; proper documentation will be added in future releases.

    Suggestions are welcome, and feel free to give it a try.
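
As a rough illustration of what a discovery test with asymptotic formulae computes, here is a minimal sketch for a single-bin counting experiment with a known background. This is not the scikit-stats API: the function name and the single-bin setup are assumptions made for the example. The test statistic q0 is the profile likelihood ratio for the background-only hypothesis, and in the asymptotic limit the significance is Z = sqrt(q0).

```python
import math


def discovery_significance(n_obs, b):
    """Asymptotic discovery test for one counting bin (hypothetical helper).

    n_obs: observed event count, b: known expected background (b > 0).
    Returns (Z, p0): the significance and the background-only p-value.
    """
    if n_obs <= b:
        # No excess over background: the test statistic is defined as zero.
        q0 = 0.0
    else:
        # q0 = -2 ln [ L(s=0) / L(s_hat) ] for a Poisson count with s_hat = n_obs - b.
        q0 = 2.0 * (n_obs * math.log(n_obs / b) - (n_obs - b))
    z = math.sqrt(q0)
    # One-sided Gaussian tail probability: p0 = 1 - Phi(Z).
    p0 = 0.5 * math.erfc(z / math.sqrt(2.0))
    return z, p0


z, p0 = discovery_significance(n_obs=20, b=10.0)
print("Z = %.2f, p0 = %.4f" % (z, p0))
```

A real analysis would typically use a full likelihood with nuisance parameters; the sketch only shows the shape of the calculation.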

    Henry Schreiner
    @henryiii
    As a community, we need to decide on our support for Python versions, especially Python 2. The maintainers of Matplotlib, scikit-learn, IPython, Jupyter, yt, SciPy, NumPy, and scikit-image have come together and agreed on a plan for Python and NumPy version support. I hope to discuss this plan within Scikit-HEP so we can come to a recommendation in the near future. For now, here is the community policy those packages have adopted: NumPy NEP 29
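
For orientation, a minimal sketch of the Python side of the NEP 29 rule: support the minor versions released in the 42 months before a project release, and at minimum the two latest minor versions. The helper names are hypothetical, the month arithmetic is deliberately simplified (it rounds the cutoff to the first day of the month), and the dates are the CPython x.y.0 release dates.

```python
from datetime import date

# CPython x.y.0 release dates.
PYTHON_RELEASES = {
    (3, 5): date(2015, 9, 13),
    (3, 6): date(2016, 12, 23),
    (3, 7): date(2018, 6, 27),
    (3, 8): date(2019, 10, 14),
}


def months_before(day, months):
    """First day of the month lying `months` months before `day` (simplified)."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    return date(year, month_index + 1, 1)


def supported_pythons(project_release, window_months=42, minimum=2):
    """Versions released inside the window, never fewer than the `minimum` newest."""
    cutoff = months_before(project_release, window_months)
    known = sorted(v for v, d in PYTHON_RELEASES.items() if d <= project_release)
    in_window = [v for v in known if PYTHON_RELEASES[v] >= cutoff]
    return sorted(set(in_window) | set(known[-minimum:]))


print(supported_pythons(date(2020, 1, 1)))
```

Python 2.7 is far outside any such window, which is why packages following the policy no longer target it in new releases.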
    Eduardo Rodrigues
    @eduardo-rodrigues
    It makes complete sense to start a discussion soon-ish, for some planning. But we cannot afford to be as aggressive as some, because HEP still relies on Python 2 for so much, unfortunately. Otherwise we will lose users. There will be a compromise for everything that relies on numpy & co. As ever, there's no trivial solution. On the positive side, we can be more forward-looking and stick to Python 3 for high-level analysis tools, those which are likely to be only loosely connected with the experiments' software stacks.
    (I will stop here as otherwise we will have the whole discussion just now LOL.)
    Henry Schreiner
    @henryiii

    Otherwise we will lose users.

    I would argue not many, or maybe not any. If we add a new package, regardless of the requirements, we can’t lose a user. If we add a new release of a package that is Python 3 only, existing users will still get the old Python 2 version, and again, won’t be lost. If experiments are using our code, they would fall into this category. The only ones I think we would risk losing would be new users who are Python 2 only and want to start using our code - but a) I don’t think we have that many, b) they will still be able to use older versions, and c) they already have to be using older versions of numpy, SciPy, matplotlib, and IPython, so our packages would just be one more thing.

    Keep in mind, the cost of keeping Python 2 compatibility is non-zero. It requires more complex code, limits use of time-saving features, adds extra checks, increases binary build time, etc. It can also hamper the package API and user experience in Python 3. The time we spend fiddling with Python 2 code, or writing code in a way that is compatible with both, could instead go into developing new libraries and features.

    Obviously, the final decision for any package has to be made on a case-by-case basis by the core maintainers of that package. I still support Python 2.6 in Plumbum, for example.

    (For those not in the mailing list, I'm being a bit more aggressive here than in the plan I've suggested)

    But we cannot afford to be as aggressive as some

    I don't think we can afford to be less aggressive than numpy - I don't think the HEP community should take over maintaining numpy. Similarly with Python - Over time, the old versions will stop working with anything new, or on new systems, etc. I already had to drop Pandas from Particle/DecayLanguage, because they are missing some variants of Python 2 wheels and will not produce them - their hands are washed clean of Python 2 already.

    Eduardo Rodrigues
    @eduardo-rodrigues
    I'm not saying (1) that I plan to support Py2.7 for long and (2) that HEP will even for a second maintain numpy. I think we pretty much agree on most points, actually, as you should know. Just saying we cannot say now "well, after Christmas all Python 2.7 support is gone. That's it.". We should rather do things smoothly having users in mind, balancing the requirements it imposes on us maintainers.
    Henry Schreiner
    @henryiii
    Agreed, we should be using python_requires in our setups, so that pip 9+ will automatically find the last supported Python 2 version, etc. The idea is not to cause Python 2 to stop working, but to stop producing new features for Python 2 (more or less). It will be very interesting after January, though, to see how long things hold together...
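
In a setup.py this is the `python_requires` keyword of setuptools' `setup()`, for example `python_requires=">=3.6"`. The sketch below is a simplified model of how pip then picks a release; it is not pip's actual code, it only understands `>=X.Y` specifiers, and the release numbers are hypothetical.

```python
# Hypothetical release history: version -> python_requires declared at upload time.
RELEASES = {
    "1.0.0": ">=2.7",
    "1.1.0": ">=2.7",
    "2.0.0": ">=3.6",  # first Python 3 only release
}


def parse_min_version(spec):
    """Parse a specifier of the form '>=X.Y' into the tuple (X, Y)."""
    if not spec.startswith(">="):
        raise ValueError("this sketch only understands '>=X.Y' specifiers")
    return tuple(int(part) for part in spec[2:].split("."))


def version_key(version):
    """Sort key turning '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def newest_compatible(releases, python_version):
    """Newest release whose python_requires admits `python_version`, else None."""
    compatible = [
        version
        for version, spec in releases.items()
        if python_version >= parse_min_version(spec)
    ]
    return max(compatible, key=version_key) if compatible else None


print(newest_compatible(RELEASES, (2, 7)))  # a Python 2.7 user keeps getting 1.1.0
print(newest_compatible(RELEASES, (3, 8)))  # a Python 3.8 user gets 2.0.0
```

The point of the design is that a Python 3 only release never reaches a Python 2 installation, so nothing breaks for existing users.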
    Eduardo Rodrigues
    @eduardo-rodrigues
    :+1:
    Henry Schreiner
    @henryiii
    My talk from PyHEP on Python 3.8 is now a post: https://iscinumpy.gitlab.io/post/python-38/
    Doug Davis
    @douglasdavis
    Thanks! Small correction, cp38 wheels for NumPy have been up for about a week (started with 1.17.3)
    Henry Schreiner
    @henryiii
    The original talk was written before that. :) Thanks, will update. GitHub Actions is supposed to be 2-3 days away from supporting Python 3.8 now, too
    I didn't know the two snakes in the Python logo make up a P and a Y.
    Matthew Feickert
    @matthewfeickert
    Do they? If so that seems kinda forced. :/
    Hans Dembinski
    @HDembinski
    I am looking at the PyHEP survey, really nice to have such feedback and quite useful for planning the next workshop.
    It reduces the amount of guessing what people want/need.

    I think we needed to give one more minute to the lightning talks and have fewer of them.

    I disagree with this, I think the lightning talks were just fine and we should have had more of them.

    For me the lightning talks are not only about the content, but they also give everyone an opportunity to introduce themselves
    Hans Dembinski
    @HDembinski

    Deconstructing existing monoliths: focus on small tools with specific purpose instead

    Yes :D

    DON'T USE PYTHON 2!

    Lol

    this field has evolved from enthusiasts working in isolation to real interactivity in just a year or two. It's amazing how quickly that happened.

    :D

    Luke Kreczko
    @kreczko
    the last one is what made a real difference though - e.g. I've seen many plotting tools rise and fall in isolation. Now we can rally around a few focused projects and stop wasting time ;)
    Matthew Feickert
    @matthewfeickert
    agreed. It is nice to have projects that we can all be developers and contributors to (even if I just do some drive-by-developing PRs every few months)
    Henry Schreiner
    @henryiii
    I was happy to see how many mentioned histogramming :)
    Henry Schreiner
    @henryiii
    By the way, if you run python in macOS Catalina, the header at the top of Python’s startup says:
    WARNING: Python 2.7 is not recommended.
    This version is included in macOS for compatibility with legacy software.
    Future versions of macOS will not include Python 2.7.
    Instead, it is recommended that you transition to using 'python3' from within Terminal.
    Henry Schreiner
    @henryiii
    Python just adopted a 12-month release cycle. Not sure why. 18 months was already too short IMO...
    Maybe they were envious of Ruby’s annual Christmas Day release? :P
    Eduardo Rodrigues
    @eduardo-rodrigues
    Indeed I'm also super happy with the very interesting feedback we got! Thanks again to everyone who spared a few minutes. It WILL help organising future activities such as workshops.
    Yes, the lightning talks were largely a way for everyone to introduce themselves to everyone else. I was hoping for more such talks but in the end I guess people were a bit shy … maybe because the session was a first? For sure we will do this again.
    Hans Dembinski
    @HDembinski

    Yes, the lightning talks were largely a way for everyone to introduce themselves to everyone else. I was hoping for more such talks but in the end I guess people were a bit shy … maybe because the session was a first? For sure we will do this again.

    I think so, it is new and people don't yet know how to deal with this, but they will learn.

    Lukas
    @lukasheinrich
    hi.. i have a python library for reading LHE files that seems to have been useful for people (it's been cited a bunch of times) even though currently it's quite barebones: https://github.com/lukasheinrich/pylhe
    given that we have pyhepmc I thought we could fold this into scikit-hep
    (and clean it up a bit)
    if someone has something better i'm also happy to give up the name
    Hans Dembinski
    @HDembinski
    @lukasheinrich There is a reader for LHE files in HepMC3, which will also be wrapped in pyhepmc at some point.
    The advantage of that would be that you can use the existing HepMC3 interface to deal with this format.
    Eduardo Rodrigues
    @eduardo-rodrigues
    Sounds like merging the functionality would be best, then, bringing over from pylhe anything that is not yet in pyhepmc? Agree on the mentioned advantage.
    Lukas
    @lukasheinrich
    @HDembinski @eduardo-rodrigues sounds good to me. Is there any timeline for having this functionality in pyhepmc? One advantage I guess of the pylhe is that it doesn't require anything other than the python standard library to read the files, i.e. no build of HepMC is necessary
    (it just uses ElementTree to parse the files)
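
To illustrate the standard-library approach, here is a minimal sketch of reading event blocks with ElementTree. It is not pylhe's actual code: the function name is hypothetical, the sample is hand-written with illustrative numbers, and it assumes the usual layout in which each `<event>` block starts with a header line whose first field is the particle count, followed by one line per particle.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical LHE-style snippet (numbers are illustrative only).
SAMPLE_LHE = """<LesHouchesEvents version="1.0">
<event>
 2 1 1.0 91.19 0.0078 0.118
 11 1 0 0 0 0 0.0 0.0 45.6 45.6 0.0 0.0 1.0
 -11 1 0 0 0 0 0.0 0.0 -45.6 45.6 0.0 0.0 -1.0
</event>
</LesHouchesEvents>"""


def read_events(text):
    """Yield (header_fields, particle_rows) for every <event> block in `text`."""
    root = ET.fromstring(text)
    for event in root.iter("event"):
        # The event payload is plain whitespace-separated text inside the tag.
        rows = [line.split() for line in event.text.strip().splitlines()]
        header, particles = rows[0], rows[1:]
        n_particles = int(header[0])
        yield header, particles[:n_particles]


for header, particles in read_events(SAMPLE_LHE):
    pdg_ids = [int(row[0]) for row in particles]
    print(len(particles), "particles with PDG IDs", pdg_ids)
```

For large files, `ET.iterparse` would avoid loading the whole document into memory at once.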
    Hans Dembinski
    @HDembinski
    Having something in pure Python is nice, although we can provide binary wheels these days, so the disadvantage of having a compiled module goes away. I am not sure whether parsing with ElementTree will be competitive in speed with a C++ library. We will only know for sure when both approaches are implemented. As for a timeline, I don't have one, but seeing that LHE files are getting traction, I should get started working on the bindings.
    Lukas
    @lukasheinrich
    @HDembinski yes LHE files are quite popular already with the LHC pheno crowd (for ATLAS & CMS at least) so having nice bindings would be nice
    Jim Pivarski
    @jpivarski
    I'm not sure what to think of this (I haven't done anything yet), but NumFOCUS, which sponsors NumPy, SciPy, Pandas, Matplotlib, Jupyter, Dask, and conda-forge, as well as some projects we don't frequently use, is for the first time asking for money, though they already have an impressive list of sponsors.
    Henry Schreiner
    @henryiii
    Are they simply allowing donations, or asking for money? Allowing user contributions is not bad, IMO. Asking for it might be different.
    Jim Pivarski
    @jpivarski

    @henryiii I found out about this through an email sent to PyData meetup members (PyData is under NumFOCUS). This is a fixed-duration campaign, like an NPR pledge break, not an open repository for donations, like Wikipedia.

    To be clear, I'm not saying it's good or bad—I'm just a little surprised. It could be an attempt to diversify their funding sources so that it's not all corporate. I just don't know, so I thought of sharing it here, given that we depend strongly on these projects.

    Jim Pivarski
    @jpivarski
    The use of StackOverflow has been taking off, and I'm fine being the only one writing answers until the community gets established. However, this morning there was a rather open-ended question about tricks for ML performance: https://stackoverflow.com/q/58817554/1623645 I gave some theoretical suggestions, but some of you might have actual experience with ML analyses in Python. If you contribute an answer, I'll upvote you! :)
    Andrzej Novak
    @andrzejnovak
    Fun, it kind of looks like my problem as well. Upvotes to anyone who figures out a faster way to read stuff than root_numpy :)
    Eduardo Rodrigues
    @eduardo-rodrigues
    @HDembinski, @lukasheinrich, on the pylhe & pyhepmc merging: Lukas, could you then open an issue or two on the pyhepmc repo so that one can easily read what you would like to see available? Just so that it's easier to figure out what kind of user code one is talking about.
    From a quick look there are no tests in there. Would you have a .lhe file to provide for testing? Thanks :-).
    Nicholas Smith
    @nsmith-
    ah man can't we kill LHE? it's so inefficient to use XML