    Tai Sakuma
    @TaiSakuma
    that is interesting
    Chris Burr
    @chrisburr

    The Python fraction actually hasn't been increasing

    I wonder if this is caused by people making more repositories when using notebooks. For example looking at @jpivarski's GitHub there are 9 classified as Python and 17 as Jupyter

    benkrikler
    @benkrikler
    It could be nice to put something like this on the PyHEP web-page?
    When you look at the language of a repo, do you just consider the dominant language for that repository or do you add all languages used in that repo, weighted by fraction of the repo, or by the number of lines of code, etc?
    Eduardo Rodrigues
    @eduardo-rodrigues
    Why not. At least this deserves a little report at the next HSF coord meeting, as PyHEP WG input.
    benkrikler
    @benkrikler
    It would be really interesting to study language per commit as well. If a repository is 50 / 50 C++ / python, but activity on the python side has picked up in the last few months (without changing the python line count much) that would be nice to see. I realise that's a lot more work to unpick though, since you need to check commit diffs but would give an additional angle on this trend
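A rough sketch of how per-commit language activity could be tallied, assuming the input is the text printed by `git log --numstat` (one `added<TAB>deleted<TAB>path` line per changed file). The extension-to-language table and the function name are hypothetical, and renamed paths are not handled specially:

```python
from collections import Counter
from pathlib import PurePosixPath

# Hypothetical mapping from file extension to language; extend as needed.
EXT_TO_LANG = {
    ".py": "Python",
    ".ipynb": "Jupyter",
    ".cc": "C++",
    ".cpp": "C++",
    ".h": "C++",
    ".hpp": "C++",
}


def lines_changed_by_language(numstat_text):
    """Sum added + deleted lines per language from `git log --numstat` text."""
    totals = Counter()
    for line in numstat_text.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit headers, blank lines, etc.
        added, deleted, path = parts
        if not (added.isdigit() and deleted.isdigit()):
            continue  # binary files are reported as "-"
        suffix = PurePosixPath(path).suffix.lower()
        totals[EXT_TO_LANG.get(suffix, "(unknown)")] += int(added) + int(deleted)
    return totals
```

Restricting the log to a date range (for example with `--since`) before feeding it to this function would give the recent-activity view, independent of the total line count of each language.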
    Henry Schreiner
    @henryiii
    I would also keep in mind Scientific Linux is disappearing
    Jim Pivarski
    @jpivarski

    I would also keep in mind Scientific Linux is disappearing

    Right, which is too bad, given this one useful feature of being able to identify physicists in pip downloads data!

    @benkrikler The "language" is whatever GitHub decides the dominant language is, according to its algorithms. You can see that a large chunk, maybe 15% (yellow) is "(unknown)". These might be mixed repos. In the JSON response to the curl request for all of a user's repositories, it provides a "languages_url" with a "percent by file" breakdown of a repo's files by language, which could be used to do a more fine-grained study, at the cost of more curl requests. (An authenticated user gets 5000 per hour; I'd have to divide that over a few hours.) In the original study in March, I did that—but the results were not much different from the coarse-grained study, so I didn't go into that detail again.
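A minimal sketch contrasting the two counting schemes discussed above. It assumes each repository is represented by a dict shaped like the response from its "languages_url", which to my understanding maps language name to bytes of code. The function names are hypothetical, and taking the largest byte count as the "dominant" language is only a stand-in for GitHub's own classification:

```python
from collections import Counter


def dominant_language_counts(repos):
    """Coarse-grained: each repo counts once, for its largest language."""
    counts = Counter()
    for langs in repos:
        if langs:
            counts[max(langs, key=langs.get)] += 1
        else:
            counts["(unknown)"] += 1
    return counts


def weighted_language_fractions(repos):
    """Fine-grained: each repo contributes one unit, split by byte share."""
    totals = Counter()
    for langs in repos:
        size = sum(langs.values())
        if size == 0:
            totals["(unknown)"] += 1.0
            continue
        for lang, nbytes in langs.items():
            totals[lang] += nbytes / size
    n = sum(totals.values())
    return {lang: weight / n for lang, weight in totals.items()}


# Made-up example input: three repos, one of them empty.
repos = [{"Python": 300, "C++": 100}, {"C++": 50}, {}]
print(dominant_language_counts(repos))
print(weighted_language_fractions(repos))
```

Fetching the per-repo dicts is left out on purpose, since that is where the extra API requests and the hourly rate limit come in.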
    Jim Pivarski
    @jpivarski
    @chrisburr By language, my repos are or will be predominantly "TeX/LaTeX", since every talk is a separate repo. This is a pattern I got into with Overleaf, and when Overleaf stopped hosting its own git repository, deferring us to GitHub instead, now I'm filling up my GitHub account with lots of tiny TeX repos.
    @eduardo-rodrigues I had been preparing this for PyHEP at Oxford (I wanted one plot for uproot/awkward usage), but this digression into Python usage overall would get off-point for that talk. I could present this at a coordination meeting, allowing both talks to be more on point. I missed the first coordination meeting (the one with zfit), so I'd like to get these on my calendar. When is the next one?
    Eduardo Rodrigues
    @eduardo-rodrigues
    Hi @jpivarski, the date of the next WG meeting is not yet decided. To be discussed …
    benkrikler
    @benkrikler
    Does anyone have or know of a good entry-level numpy and / or uproot tutorial for a student who is transitioning from C++ to python?
    Hans Dembinski
    @HDembinski
    Hey all, I did some experiments with allocation in pybind11, that I want to share:
    https://github.com/HDembinski/pybind11_allocation_cost_demo
    It is all in the README, feel free to check out the code and try the benchmarks for yourself.
    tl;dr: avoid allocating temporary objects from the heap if you can, but don't worry about it too much. In most cases, it won't make a big difference.
    Jim Pivarski
    @jpivarski
    benkrikler
    @benkrikler
    Thanks Jim, that's great! Love the "why python" bit of the first notebook too. Can I re-use some of that material (with appropriate citations) in some up-coming talks of mine?
    Jim Pivarski
    @jpivarski
    @benkrikler You can reuse the material without citations. Except for the parts from Jake Vanderpl
    Jake Vanderplas—which I cited as "stolen from" because I didn't explicitly get permission from him. (I think you can do the same, with citation.) They are good graphics, though.
    (Gitter submits a message when you go into another app to check on the spelling of something...)
    Hans Dembinski
    @HDembinski
    I have a contributor for boost::histogram who wants to remain anonymous. It turns out that this is surprisingly difficult.
    https://opensource.stackexchange.com/questions/7147/declaring-copyright-anonymously?newreg=5ec7ccfb79d947bd8cdbd3208f2b880c
    Eduardo Rodrigues
    @eduardo-rodrigues
    To avoid the pain, can you now ask him to give you full ownership? Unless he is making a major contribution, I would not bother to handle this special case, being pragmatic.
    Hans Dembinski
    @HDembinski
    To my understanding, this doesn't solve the problem, because there has to be a legally valid documented way of transferring the copyright. Any legally valid transfer of ownership also requires the other party to be identified - to my understanding.
    Hans Dembinski
    @HDembinski
    Putting things in the public domain is not easy, because many jurisdictions by default give full copyright to the creator if no statement is made. One has to actively declare to waive the rights, but such a statement remains ambiguous if the person who waives their rights remains anonymous.
    If this were valid, one could copy some copyrighted code and then publish it anonymously in the public domain. It would still be illegal to do this, but there would be no one to prosecute and hold accountable.
    Any party using such code would still be at risk of lawsuits, without being able to shift the blame elsewhere. So companies would not risk using such code.
    Eduardo Rodrigues
    @eduardo-rodrigues
    I understand that route is tricky. Brute force: if I email and give you a piece of code for you to commit as yourself, basically, then that's the end of it, no, since all seems effectively to be yours, and I'm agreeing? The fact that it came from me is irrelevant by construction. Otherwise it gets vicious, I take your comments ...
    Hans Dembinski
    @HDembinski
    If you send me some code that I publish for you under my name, I could be liable for copyright infringement. You could decide to later sue me or worse, your code could be intellectual property of a third party, which could then sue me.
    I can prevent the first case by maintaining a legal record of that transfer of the copyright, but this is additional hassle for me. I would still get in trouble for the second thing, even though I could defend myself with that legal record of transfer of copyright
    Hans Dembinski
    @HDembinski
    A copyright expert says: copyright notices are bogus https://lists.boost.org/Archives/boost/2015/09/225605.php
    Jim Pivarski
    @jpivarski

    I've started following the [uproot] tag on StackOverflow, and will follow an [awkward-array] tag if somebody creates a question about Awkward there.

    (Tags can only be created on a question about the topic. There happened to be an old question about uproot that I could add the [uproot] tag to; if anyone asks an Awkward question and doesn't have >1500 reputation to create a new tag, point me to it and I'll add the tag—and answer the question.)

    I've created a link on the uproot README pointing to uproot questions, and I'll do the same for awkward when it comes up. (https://stackoverflow.com/questions/tagged/uproot)

    Just like GitHub issues, I'll get an email when somebody asks a question, but unlike GitHub issues, StackOverflow questions are intended for usage and don't "go away" once answered.

    Lukas
    @lukasheinrich
    Hi
    what's the procedure to create a sub-community to PyHEP?
    we'd like to create a PyHEP-fitting one to discuss some stats related issues
    cc @cranmer @mayou36 @feickert @kratsg
    Eduardo Rodrigues
    @eduardo-rodrigues
    Hi @lukasheinrich, HSF WG conveners are admins on the account. I can create the room for you if you want ...
    (No big deal, just that somebody needs to have the admin rights ;-).)
    Jonas Eschle
    @mayou36
    yes that would be good, thanks
    Eduardo Rodrigues
    @eduardo-rodrigues
    Jonas Eschle
    @mayou36
    Thanks! I would like to invite everyone to this channel who is interested to have more detailed discussions on the current efforts in the Python fitting tools such as (but not limited to!) zfit, pyhf and libraries around it, especially high level statistics/inference libraries.
    Jim Pivarski
    @jpivarski

    I've started following the [uproot] tag on StackOverflow, and will follow an [awkward-array] tag if somebody creates a question about Awkward there.

    A few questions have come in on [uproot] and one of them was actually about [awkward-array], so I took that opportunity to create the second tag. Apparently, I can only set up email notifications after the tag has existed for a little while (some database needs to sync), so I'm listening to [uproot] now and will be listening to [awkward-array] soon.

    Henry Schreiner
    @henryiii
    Was Python 3.8 released just for PyHEP? :)
    Eduardo Rodrigues
    @eduardo-rodrigues
    :D
    Matthew Feickert
    @matthewfeickert
    :)
    Henry Schreiner
    @henryiii
    I'm having issues with Gitter, trying again:
    What are the imports that many people working in HEP + Python might need? Not counting general ones, like numpy and TensorFlow. Here's what I have so far:
    import boost_histogram as bh
    import uproot
    import zfit
    
    from particle import Particle
    from iminuit import Minuit
    from decaylanguage import DecFileParser
    
    import ROOT
    Matthieu Marinangeli
    @marinang
    hepunits, uncertainties ?
    Henry Schreiner
    @henryiii
    Maybe from hepunits import units as hu? I need imports that won't conflict with general tools too easily. And from hepunits import constants as hc
    Uncertainties is probably a good one. Pint should be there too, even if it is not used in those packages explicitly
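As a small companion to the list above, here is a sketch that reports which of the suggested packages are importable in the current environment without actually importing them. The module names are the ones mentioned in this thread; the list constant and the helper name are hypothetical:

```python
from importlib.util import find_spec

# Top-level module names taken from the messages above.
HEP_MODULES = [
    "boost_histogram",
    "uproot",
    "zfit",
    "particle",
    "iminuit",
    "decaylanguage",
    "ROOT",
    "hepunits",
    "uncertainties",
    "pint",
]


def available_modules(names=HEP_MODULES):
    """Map each module name to True if it can be found, else False."""
    status = {}
    for name in names:
        try:
            status[name] = find_spec(name) is not None
        except (ImportError, ValueError):
            status[name] = False
    return status


for name, found in available_modules().items():
    print(f"{name:16s} {'ok' if found else 'missing'}")
```

Using `find_spec` avoids the cost and side effects of importing heavy packages such as ROOT just to see whether they are installed.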