    Jim Pivarski
    @jpivarski
    Yeah, I see it:
    >>> oldp4 = old.p4
    >>> oldp4
    <JaggedArrayMethods [[PtEtaPhiMassLorentzVector(pt=67.984, eta=2.4785, phi=2.541, mass=-0.099731) PtEtaPhiMassLorentzVector(pt=46.521, eta=1.8037, phi=-2.2109, mass=-0.013321) PtEtaPhiMassLorentzVector(pt=17.046, eta=1.4587, phi=0.81519, mass=-0.014069)] [PtEtaPhiMassLorentzVector(pt=61.624, eta=-0.098343, phi=-0.27698, mass=0.10571) PtEtaPhiMassLorentzVector(pt=20.262, eta=-1.7275, phi=-2.1987, mass=0.10571) PtEtaPhiMassLorentzVector(pt=10.989, eta=-0.36774, phi=1.5681, mass=-0.0023746)]] at 0x7ffae10783a0>
    >>> newp4 = ak.from_awkward0(oldp4)
    >>> newp4
    <Array [[{fPt: 68, ... fMass: 0.106}]] type='2 * var * struct[["fPt", "fEta", "f...'>
    >>> oldp4.pt
    <JaggedArray [[67.98397 46.520737 17.046204] [61.623806 20.262285 10.989361]] at 0x7ffae10789d0>
    >>> newp4.fPt
    <Array [[68, 46.5, 17], [11, 61.6, 20.3]] type='2 * var * float32'>
    alesaggio
    @alesaggio

    Actually, I am noticing the following. To build the leptons, I use the JaggedCandidateArray.candidatesfromoffsets function from coffea.analysis_objects. I create a lepton dictionary, then create the leptons object with this function and then I sort them by pt, like in the following

    leptons = Jca.candidatesfromoffsets(offsets, **lepton_dict)
    leptons = leptons[leptons.pt.argsort()]

    I am noticing that if I don't sort the leptons by pt, then the ordering is what is seen after the conversion to awkward1

    Jim Pivarski
    @jpivarski
    I've found the issue: in the implementation of ak.from_awkward0, I had forgotten that Awkward 0 "Table" combined the responsibilities of "RecordArray" and "IndexedArray", with the latter in a hidden field named _view. I'm making ak.from_awkward0 aware of Table._view right now.
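The effect of a dropped index can be imitated with plain NumPy. This is only an illustrative sketch: the values are the second event's `pt` from the output above, while the `view` array is a made-up stand-in for whatever `Table._view` actually holds, not the real Awkward 0 internals.

```python
import numpy as np

# Values in the order they appear to be stored (as seen after the buggy conversion).
stored_pt = np.array([10.989361, 61.623806, 20.262285], dtype=np.float32)

# Hypothetical index playing the role of Table._view: it reorders the stored
# records into the pt-sorted order the user sees in Awkward 0.
view = np.array([1, 2, 0])

without_view = stored_pt        # what a converter sees if it ignores the index
with_view = stored_pt[view]     # what it should produce: records taken through the index

print(without_view.tolist())
print(with_view.tolist())
```

A converter that reads only the record content and skips the index reproduces the unsorted order, which matches the symptom reported above.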
    alesaggio
    @alesaggio
    Ah, glad you found it so quickly, thanks a lot!
    Jim Pivarski
    @jpivarski
    (The hardest thing is remembering how to use Awkward 0!)
    >>> import awkward0
    >>> import awkward as ak
    >>> old = awkward0.load("newTest.awkd")
    >>> oldp4 = old.p4
    >>> newp4 = ak.from_awkward0(oldp4)
    >>> oldp4.pt
    <JaggedArray [[67.98397 46.520737 17.046204] [61.623806 20.262285 10.989361]] at 0x7f1a5f88f8e0>
    >>> newp4.fPt
    <Array [[68, 46.5, 17], [61.6, 20.3, 11]] type='2 * var * float32'>
    alesaggio
    @alesaggio
    for me it's the other way around still, but gradually adjusting to awkward1 :P
    great!
    Jim Pivarski
    @jpivarski
    It's on its way: scikit-hep/awkward-1.0#573
    alesaggio
    @alesaggio
    Works like a charm, thanks :)
    Jim Pivarski
    @jpivarski

    (Sorry for the reposting, if you saw this message elsewhere.)

    Probably the last message about the Awkward Array/Uproot name transition: it's done. The new versions have moved from release candidates to full releases. Now when you

    pip install awkward uproot

    without qualification, you get the new ones. I think I've "dotted all the 'i's of packaging" to get the right dependencies and tested all the cases I could think of on a blank AWS instance.

    • pip install awkward0 uproot3 returns the old versions (Awkward 0.x and Uproot 3.x). The prescription for anyone who needs the old packages is import awkward0 as awkward and import uproot3 as uproot.
    • pip install awkward1 uproot4 returns thin wrappers of the new ones, which point to whatever the latest awkward and uproot are. They pass through to the new libraries, so scripts written with import awkward1, uproot4 don't need to be changed (though you'll probably want to, for simplicity).
    • uproot-methods no longer causes trouble because there's an uproot3-methods in the dependency chain: awkward0 ← uproot3-methods ← uproot3. The latest uproot-methods (no qualification) now excludes Awkward 1.x so that they can't be used together by mistake.
    aswanthkrishna
    @aswanthkrishna
    When trying to install awkward[cuda] with pip, I get this error:
    ' Could not find a version that satisfies the requirement awkward-cuda-kernels (from versions: )
    No matching distribution found for awkward-cuda-kernels'
    How can I fix it?
    Jim Pivarski
    @jpivarski
    Which version is awkward?
    Also, note that the CUDA plugin is very alpha-stage right now.
    Jim Pivarski
    @jpivarski
    If it's listing zero possible versions, it could be because you're not on Linux. We're only developing the CUDA plugin for Linux. Macs don't have Nvidia GPUs and Windows is just generally difficult to support. The main target for the CUDA plugin is large-scale computing clusters.
    aswanthkrishna
    @aswanthkrishna
    awkward is version 1.0.2rc4. I am running it on linux AWS instance. my main motive is to use jagged arrays with cupy. is it possible at this point?
    Jim Pivarski
    @jpivarski
    It depends on what you mean by "use". Jagged arrays have been loaded on a GPU and some simple things have been done with them (ak.num and ufuncs).

    Could it be that your Linux doesn't satisfy "manylinux2014"?

    https://pypi.org/project/awkward-cuda-kernels/1.0.2rc4/#files

    Henry Schreiner
    @henryiii
    Pip needs to be pretty new to pick up manylinux2014 too
    Try to update pip or at least check the version. I’m thinking it’s 19.something for manylinux 2014.
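A small helper can make that check explicit. This is a sketch under one assumption: that pip 19.3 is the first release that recognizes manylinux2014 wheels (the message above only says "19.something"). The function name and the constant are hypothetical, chosen for illustration.

```python
import re

# Assumption: pip 19.3 is the first release that accepts manylinux2014 wheels.
MIN_PIP_FOR_MANYLINUX2014 = (19, 3)


def pip_supports_manylinux2014(pip_version):
    """Return True if a pip version string such as '20.2.4' is new enough."""
    match = re.match(r"(\d+)\.(\d+)", pip_version)
    if match is None:
        raise ValueError(f"unrecognized pip version: {pip_version!r}")
    major, minor = int(match.group(1)), int(match.group(2))
    # Tuples compare element by element, so (19, 2) < (19, 3) < (20, 0).
    return (major, minor) >= MIN_PIP_FOR_MANYLINUX2014


print(pip_supports_manylinux2014("19.2.3"))
print(pip_supports_manylinux2014("20.2.4"))
```

The installed version is what `pip --version` reports; if it is too old, `pip install --upgrade pip` brings it up to date.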
    aswanthkrishna
    @aswanthkrishna
    that solved it. thank you very much :)
    aswanthkrishna
    @aswanthkrishna
    can i access a jagged array from a jitted cuda kernel with numba?
    Jim Pivarski
    @jpivarski

    Yes!

    >>> import awkward as ak
    >>> import numba as nb
    >>> import numpy as np
    >>> @nb.njit
    ... def manual_sum(array):
    ...     out = np.zeros(len(array), np.float64)
    ...     for index, sublist in enumerate(array):
    ...         for item in sublist:
    ...             out[index] += item
    ...     return out
    ... 
    >>> manual_sum(ak.Array([[1.1, 2.2, 3.3], [], [4.4, 5.5]]))
    array([6.6, 0. , 9.9])

    but only if the array is in main memory, not if it's on the GPU. (I only say this because you were looking into the CUDA plugin earlier. You can move an array from main memory to GPU and back with ak.to_kernels.)

    aswanthkrishna
    @aswanthkrishna
    Love it. Thanks for this awesome library! Are there plans to support numba cuda kernels on GPU arrays?
    Jim Pivarski
    @jpivarski
    Yes. I've had off-and-on conversations with Graham Markall at Nvidia about it.
    Numba's nb.cuda.jit currently cannot accept any extension types as arguments or return values, but he's working on adding that to support Numba-compiled UDFs in RAPIDS.ai's cuDF.
    Awkward Array may be the first outside-of-Nvidia project to take advantage of that new feature.
    Chris Lee-Messer
    @cleemesser
    Hello, yesterday I watched J.P.'s SciPy2020 presentation. Awkward looks amazing! Congrats. I would like to use awkward to access a data structure on disk as an awkward array, but I'm having trouble figuring it out. It is a sequence of blocks which I can easily mmap as a numpy array with dtype int16.
    import numpy as np
    arr = np.memmap(fname, np.int16, 0, (n_blocks, blocksize))
    Chris Lee-Messer
    @cleemesser

    Inside each block, there is additional structure to the integers. It is a ragged array whose parts mostly have the same length,
    something like

    [ 256 * int16,
      256 * int16,
      1 * int16,
      1 * int16]

    That is my attempt to approximate the type/datashape formatting, which I know I need to learn. All the blocks have the same format.
    I feel like this should be easy, perhaps using the ak.from_buffer interface. But I don't see a lot of examples of building an array in this situation. Can you give me a hint? I know all the important dimensions ahead of time: the n_blocks, blocksize, and the dimensions of the ragged array inside the block.

    Thanks in advance if you can help
    Jim Pivarski
    @jpivarski
    There isn't an ak.from_buffer, but there's an ak.from_numpy. Any NumPy array can be wrapped as an Awkward Array. I haven't tried it on a memmapped array, but I don't see why it wouldn't work. If the Awkward Array has the same structure as a NumPy array, there isn't a strong advantage to using it, but if you're using it in a context that builds structure, then there's a good reason for it.
    Chris Lee-Messer
    @cleemesser

    Thank you. I can create the awkward array that mimics my numpy with no problems, but it is not clear to me how to add the structure portion.

    akarr = ak.from_numpy(arr)  # following on my example above, works fine but does not contain the block structure, just the information

    In the example on creating an HDF5 version of an awkward array and reading it back, there is an ak.from_buffers() example that uses an awkward Form to specify the layout of the data. I was hoping I could use something like that with my memmapped file.

    Chris Lee-Messer
    @cleemesser
    My guess was it would be something like this:
    import awkward as ak
    import numpy as np
    
    mm_arr = np.memmap("test.data", np.int16, 0, (n_blocks,blocksize)) # linear memmap
    mmref = mm_arr._mmap  # this might instead use np.int8 or byte
    datasize = 2 * n_blocks*blocksize
    form = """{ <insert the right form definition here to define the nested structure of this array>}"""
    # I don't understand what to put into the above form exactly though. 
    # the equivalent information to form = """ n_blocks * [ 256 * int16, 256*int16, 1 * int16, 1 * int16]"""
    akarr = ak.from_buffers(ak.forms.Form.fromjson(form), datasize, mmref)
    Chris Lee-Messer
    @cleemesser

    The example uses a ListOffsetArray64 with content of a NumpyArray:

    In[164]: form, length, container = ak.to_buffers(dld, container=group)
    In[165]: form
    {
        "class": "ListOffsetArray64",
        "offsets": "i64",
        "content": {
            "class": "NumpyArray",
            "itemsize": 8,
            "format": "l",
            "primitive": "int64",
            "form_key": "node1"
        },
        "form_key": "node0"
    }

    So I'm guessing I need to somehow make a nesting of ListOffsetArray64 with content ListOffsetArray with content NumpyArray
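The role of the "offsets" buffer in such a form can be shown with NumPy alone. This is a minimal sketch of the general offsets-plus-content idea, not a description of Awkward's internals beyond that; the sample values are invented for illustration.

```python
import numpy as np

# Flat content, as a NumpyArray node would hold it.
content = np.array([1, 2, 3, 4, 5], dtype=np.int64)

# Offsets have one more entry than there are lists:
# list i spans content[offsets[i]:offsets[i + 1]].
offsets = np.array([0, 3, 3, 5], dtype=np.int64)

lists = [
    content[offsets[i]:offsets[i + 1]].tolist()
    for i in range(len(offsets) - 1)
]
print(lists)  # [[1, 2, 3], [], [4, 5]]

# If list lengths (counts) are known instead, offsets are their running sum.
counts = np.array([3, 0, 2], dtype=np.int64)
offsets_from_counts = np.concatenate(([0], np.cumsum(counts)))
print(offsets_from_counts.tolist())  # [0, 3, 3, 5]
```

Nesting one such list layer inside another means the outer offsets index into the inner lists, and the inner offsets index into the flat content.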

    Jim Pivarski
    @jpivarski
    Usually, we get data that already has a data structure that we have to deal with. In your case, you're starting with flat arrays and want to add structure. How you do that will depend strongly on what it is you're trying to do. ak.unflatten might be useful—it assumes that you have flattened data and numbers of items in each list for you to fill into lists. That will give you variable length lists. You mentioned variable length of regular lists: you might want to reshape the data in NumPy and then run ak.unflatten on that.
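The "reshape the data in NumPy" step might look like the following. It is a sketch under stated assumptions: the block layout (256 + 256 + 1 + 1 int16 values, so 514 per block) is taken from the sketch in the question, the variable names are invented, and an in-memory array stands in for the memmap.

```python
import numpy as np

# Assumed per-block layout from the question: two runs of 256 values,
# followed by two single values.
N_A, N_B = 256, 256
BLOCKSIZE = N_A + N_B + 2  # 514 int16 values per block

n_blocks = 3
# Stand-in for np.memmap(fname, np.int16, ...) with shape (n_blocks, BLOCKSIZE).
arr = np.arange(n_blocks * BLOCKSIZE, dtype=np.int16).reshape(n_blocks, BLOCKSIZE)

# Basic slicing returns views, so nothing is copied out of the underlying buffer.
first = arr[:, :N_A]               # shape (n_blocks, 256)
second = arr[:, N_A:N_A + N_B]     # shape (n_blocks, 256)
third = arr[:, N_A + N_B]          # shape (n_blocks,)
fourth = arr[:, N_A + N_B + 1]     # shape (n_blocks,)

print(first.shape, second.shape, third.shape, fourth.shape)
```

Each piece is an ordinary NumPy array, so it can be wrapped with ak.from_numpy as mentioned earlier in the conversation, or passed to ak.unflatten if the lengths turn out to vary.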
    Jim Pivarski
    @jpivarski

    Oh, ak.from_buffers is nothing like np.frombuffer: I hadn't noticed that similarity. (Fortunately, ours is spelled differently!) ak.from_buffers is a somewhat low-level function because it reveals the ListOffsetArray/NumpyArray/etc structure inside of a high-level ak.Array.

    Usually, you wouldn't be hand-crafting Forms and buffers to feed into ak.from_buffers (although I've suggested exactly that for some Python <--> C++ interface projects). ak.from_buffers is a tool that can be used to build backends.

    The example with HDF5 in the tutorial was about building such a backend: there wasn't anything special about HDF5, only that it groups collections of named arrays. Notice that in that example, it wasn't necessary to build a Form or the collection of buffers: ak.from_buffers was intended to be used with ak.to_buffers, for reading from and writing to the new backend.

    Chris Lee-Messer
    @cleemesser
    Ooo. ak.unflatten looks like the ticket :-) I will start experimenting. Thank you, and thanks for the insight on ak.from_buffers.
    Jonas Rübenach
    @jrueb
    Why does ak.Array({})["a"] raise a ValueError and not a KeyError? Generally I would expect to get a KeyError here. ValueError is so general, it's hard to catch only the case where the field does not exist.
    Diego Ramírez García
    @ramirezdiego
    Hi,
    sorry in advance if I missed something trivial, but how do I print the version of awkward1 that I import?
    >>> import awkward1
    >>> awkward1.__version__
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    AttributeError: module 'awkward1' has no attribute '__version__'
    I experience the same issue with the most recent uproot4, by the way.
    Thank you for your help.
    alexander-held
    @alexander-held
    You may have a recent version of awkward1. In that case, it is only a wrapper around awkward (with the new API). The relevant version would then be awkward.__version__ (after importing awkward instead).
    see also this comment https://gitter.im/Scikit-HEP/awkward-array?at=5fc6d5a487cac84fcd01b55e for some more details about this transition
    Diego Ramírez García
    @ramirezdiego

    Thank you, @alexander-held! It then makes sense to set my environment to do

    pip install "awkward>=1.0.0rc1" "uproot>=4.0.0rc1"

    from now on.

    alexander-held
    @alexander-held
    yes, and I think you can drop the "rc1" now, since the proper versions have been released since:
    pip install "awkward>=1.0.0" "uproot>=4.0.0"
    Diego Ramírez García
    @ramirezdiego
    Perfect, thanks again.
    Jim Pivarski
    @jpivarski
    The new awkward1 is a thin wrapper around awkward, but I haven't tested that names starting with underscores get passed through. That's an oversight. Also, which version should it return? The version of the thin wrapper or the awkward library that it wraps? (Ultimately, it's better to use awkward directly; this stub was to avoid breaking scripts that spell the module name "awkward1".)
    Diego Ramírez García
    @ramirezdiego
    :thumbsup: !
    Henry Schreiner
    @henryiii
    I would return awkward.__version__; the __version__ is really just a shortcut for users and is not the canonical version of the package; the actual version of the package can be accessed via importlib.metadata
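Looking up the installed version through importlib.metadata could be done as below. This is a minimal sketch that needs Python 3.8 or later; the helper name `installed_version` is hypothetical, and the distribution names passed to it are just the ones discussed above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Works whether or not the module itself defines __version__.
for name in ("awkward", "awkward1", "uproot"):
    print(name, installed_version(name))
```

Because this reads the package metadata rather than a module attribute, it would report the version of the thin awkward1 wrapper and of awkward separately.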
    Jonas Rübenach
    @jrueb
    Thanks for always fixing bugs so quickly. It really helps a lot.