Where communities thrive


  • Join over 1.5M+ people
  • Join over 100K+ communities
  • Free without limits
  • Create your own community
People
Activity
  • 17:14
    lgtm-com[bot] commented #1144
  • 17:11
    franzpoeschel labeled #1150
  • 17:11
    franzpoeschel labeled #1150
  • 17:11
    franzpoeschel opened #1150
  • 16:44
    jeanbez commented #1144
  • 16:40
    jeanbez synchronize #1144
  • Nov 23 15:54
    franzpoeschel edited #1146
  • Nov 23 15:50

    franzpoeschel on dev

    Fix: Python Iteration Repr Typo… (compare)

  • Nov 23 15:50
    franzpoeschel closed #1149
  • Nov 23 15:47
    franzpoeschel commented #968
  • Nov 23 11:12
    franzpoeschel edited #1032
  • Nov 23 11:10
    franzpoeschel commented #1032
  • Nov 23 11:05
    franzpoeschel edited #1032
  • Nov 23 11:04
    franzpoeschel edited #1032
  • Nov 23 11:03
    pre-commit-ci[bot] synchronize #1032
  • Nov 23 11:03
    franzpoeschel synchronize #1032
  • Nov 21 04:29
    ax3l commented #968
  • Nov 20 08:56
    ax3l closed #967
  • Nov 20 08:56
    ax3l commented #967
  • Nov 20 08:55
    ax3l commented #967
Axel Huebl
@ax3l
please ping me when something is ready for review, last two weeks were extremely packed...
looks like there was still something to do on the close() PR, @franzpoeschel ?
Franz Pöschel
@franzpoeschel
Ah, I just forgot saying yesterday that I had done that
I think I'm ready for review now
Franz Pöschel
@franzpoeschel
Also, I am coming to a close with the streaming PR
Will need to write some documentation next week and do some cleanup here and there, but overall we can do review on that quite soon too probably
4 replies
Axel Huebl
@ax3l
Nice, will take a look at Iteration::close() PR.
The ADIOS2 paper is out @franzpoeschel @guj :) https://doi.org/10.1016/j.softx.2020.100561
Franz Pöschel
@franzpoeschel
Nice!
Franz Pöschel
@franzpoeschel
Hm, there seems to be a little issue with using steps in ADIOS2 anyway.
So far, I was assuming that a BP3/4 file written with steps (i.e. one step for each iteration) would be readable in random-access manner. (I came to that conclusion since ADIOS will show all variables from all steps when no step is opened yet. Upon opening a step, only the variables written during that step will be shown.)
I moved the openPMD plugin in PIConGPU to use ADIOS2 steps today (i.e. applied the streaming API) and noticed that the resulting files were only readable in step-based manner. This means that a file written with steps must also be consumed with steps.
I don't really like this since it breaks any reading application using the conventional openPMD API for files written with the new API. But we surely want to use steps – they're necessary for streaming and for performance (as @guj pointed out).
An easy (but not really satisfying) way out could be by adding a JSON parameter to the ADIOS2 backend that mandates whether we want to use steps or not. Any other ideas?
(I should probably open up an issue over at ADIOS2, but I wanted to ask here first since I have a feeling this will be nontrivial to fix in ADIOS2)
Franz Pöschel
@franzpoeschel
Further issue (which is likely related to the issue you were discussing w/ Norbert and Dmitry via Email) is that attributes are in no way bound to the step they are created in. When in step 0 in e.g. the BP4 engine, I can already read attributes that the writer has written in step 7. This kind of time travelling does luckily not happen in the SST engine where metadata needs to be sent in each step anew, but it currently prevents using the BP4 engine in the advertised on-the-fly manner for us.
Axel Huebl
@ax3l
I think an iterating data workflow is okay for streaming but for files it's indeed the expectation of users to be able to do random access. definitely something for the mail thread we have open with Norbert right now, and an issue. agreed.
yes, attributes are constant over time and space. I think they want us to transition to variables for everything that changes ("is variable")... another good point to add to the mail thread.
I reviewed your close() PR and it's nearly there. The naming/docs for the reader check of "did my writer close" as well as the verify functionality are not yet clear enough. please see my suggestions inline :)
Besides these very well done!! :)
Franz Pöschel
@franzpoeschel

they want us to transition to variables for everything that changes

in a step-based layout where attributes should become visible after a certain step, this would be.. everything

Axel Huebl
@ax3l
I know, not a big fan either... let's discuss with Norbert...
also: naming collisions, extra conventions on our side to mark "attributes", etc.
absence of a real data model in ADIOS (yet)...
Franz Pöschel
@franzpoeschel
yeah, I would like to avoid that too
I think for the other issue, I will first go with the little workaround that users should specifically opt-in into using steps via JSON until we have a better solution
will look into the close-iteration thing later today
Axel Huebl
@ax3l
:+1: Feel free to chime into the e-mail thread with regards to attributes, I have been a bit buried in work.
Franz Pöschel
@franzpoeschel
will do :)
Axel Huebl
@ax3l
feel free to add the arguments I mentioned a few lines earlier
Franz Pöschel
@franzpoeschel
I've added Python bindings for the streaming API now and documented things a bit
Let's see what the CI thinks of it, but I believe Windows builds are still hanging somewhat

Other topic: Maybe we should think about exposing some Variable::Span-like functionality in the openPMD API?
ADIOS2 already supports something like this, e.g.:

/**
     * Put signature that provides access to the internal engine buffer for a
     * pre-allocated variable including a fill value. Returns a fixed size Span
     * (based on C++20 std::span) so applications can populate data value after
     * this Put. Requires a call to PerformPuts, EndStep, or Close to extract
     * the Min/Max bounds.
     * @param variable input variable
     * @param bufferID if engine has multiple buffers, input 0 when this
     * information is not known
     * @param value provide an initial fill value
     * @return span to variable data in engine internal buffer
     */
    template <class T>
    typename Variable<T>::Span Put(Variable<T> variable, const size_t bufferID,
                                   const T &value);

This could allow to avoid some intermediate copies, e.g. when reading from one Series and writing to another one
(by passing the Variable::Span to the reading Series as destination buffer)

Axel Huebl
@ax3l
Good idea, would a shared buffer for the data not cut it?
Franz Pöschel
@franzpoeschel
I think the only difference between C++20-like std::span and shared buffers is that spans include information on their size which is probably helpful for correctness, but not strictly necessary.
The important part is that the buffer is allocated by the backend and not by the user, so you're right.
Axel Huebl
@ax3l
Hm, and a span has no idea about lifetime in return
Dang, this complex support is super hard
I think I get it in C++ half way, ADIOS1/2 do not have long double complex (ok), HDF5 has compounds but they seem bucked for long double complex (grrr.. ok)
Python and pybind11 are super fun, casting things randomly to array([1,2]) where one wants to pass a np.complex<N>
Axel Huebl
@ax3l
JSON would be interesting, I can only serialize them as real arrays, but that changes the shape on read-back
probably we need to cross-check the type
writing and reading works, just that the shape changes
(and the complex type becomes plain floats again)
Anyway, back to span: do you think span would be useful? even w/o lifetime? what about a shared ptr of span? :D
Or C++24 mdspan :D
Franz Pöschel
@franzpoeschel
oof, the complex thing sounds fun
I don't know how helpful shared ownership would be anyway, since the whole trick about this would be that the buffers are allocated by the backend (i.e. by ADIOS2 for example, not by the ADIOS2 backend in openPMD), so we can only hand out whatever the backend allocated. We don't own those buffers, so we cant really do much about their lifetime

Is there any reason not to serialize a complex number like this in JSON:

{"real": 1.0, "imaginary": 2.0}

?

Franz Pöschel
@franzpoeschel
Also, did I understand you correctly that we could give single global value variables for attributes a try?
Maybe we can add a field to the corresponding IOTask, indicating whether the attribute is constant (all attributes not belonging to a concrete iteration) or variable (all attributes belonging to an iteration).
Axel Huebl
@ax3l
Serialization as dict: possible, but a lot of entries for arrays. maybe as {"real": [all, numbers, here], "imaginary": [all, imag, parts]}?

Also, did I understand you correctly that we could give single global value variables for attributes a try?

Not sure about that, but if ADIOS2 proposes something standardized on their side we should try.

I'll be out of office for the next two weeks, just so you are aware :)
Franz Pöschel
@franzpoeschel

Serialization as dict: possible, but a lot of entries for arrays. maybe as {"real": [all, numbers, here], "imaginary": [all, imag, parts]}?

Sounds to me like the most logical way to go for vec<complex>. Generally, I would say that dicts are the most straightforward implementation for complex numbers in JSON, and especially the most "type safe" if that word makes sense in the same sentence as "JSON".

Also, did I understand you correctly that we could give single global value variables for attributes a try?

Not sure about that, but if ADIOS2 proposes something standardized on their side we should try.

So, let's first wait on their suggestions?

I'll be out of office for the next two weeks, just so you are aware :)

Then have a nice summer ;)

Axel Huebl
@ax3l
Thanks, cu soon!
Franz Pöschel
@franzpoeschel
In the Python API, should we replace iteration.close() with del iteration? I ran into trouble in the streaming implementation because it's easy to let an iteration outlive its series
Franz Pöschel
@franzpoeschel

should we replace iteration.close() with del iteration?

Thinking about this again, we should probably not since this would destroy workflows that access an iteration several distinct times before closing.
I tried reproducing the issue using the C++ API and stumbled upon openPMD/openPMD-api#765. Don't really know yet if it is related.

Axel Huebl
@ax3l
del iteration is equvalent to ~Iteration. I think in our current semantics, a ::close() is not equivalent to just stopping to tempoary write to an iteration
Axel Huebl
@ax3l
I created a slack workspace since that one integrates a bit better with what most people use for work right now. Please ping here if you like an invite :)