    Martin Durant
    @martindurant
    We have no tests of the grib2 module… I can only suppose that when testing manually, I was accidentally reusing an fs instance?
    Martin Durant
    @martindurant
    I am running it locally, but it’s taking a while. We probably can’t add the exact contents of the example* functions to the tests.
    Lucas Sterzinger
    @lsterzinger
    That not should not have been a not, other nots notwithstanding :wink:
    Martin Durant
    @martindurant

    Finished

    In [8]: ds
    Out[8]:
    <xarray.Dataset>
    Dimensions:            (time: 9, y: 1059, x: 1799)
    Coordinates:
        heightAboveGround  float64 ...
        latitude           (y, x) float64 ...
        longitude          (y, x) float64 ...
        step               timedelta64[ns] ...
      * time               (time) datetime64[us] 2019-01-01T22:00:00 ... 2019-01-...
        valid_time         (time) datetime64[ns] ...
    Dimensions without coordinates: y, x
    Data variables:
        d2m                (time, y, x) float32 ...
        pt                 (time, y, x) float32 ...
        r2                 (time, y, x) float32 ...
        sh2                (time, y, x) float32 ...
        t2m                (time, y, x) float32 ...
    Attributes:
        Conventions:             CF-1.7
        GRIB_centre:             kwbc
        GRIB_centreDescription:  US National Weather Service - NCEP
        GRIB_edition:            2
        GRIB_subCentre:          0
        history:                 2021-09-02T16:57 GRIB to CDM+CF via cfgrib-0.9.9...
        institution:             US National Weather Service - NCEP
    
    In [9]: ds.time.values
    Out[9]:
    array(['2019-01-01T22:00:00.000000', '2019-01-01T23:00:00.000000',
           '2019-01-02T00:00:00.000000', '2019-01-02T01:00:00.000000',
           '2019-01-02T02:00:00.000000', '2019-01-02T03:00:00.000000',
           '2019-01-02T04:00:00.000000', '2019-01-02T05:00:00.000000',
           '2019-01-02T06:00:00.000000'], dtype='datetime64[us]')

    (this is the original set of files, not the ones from your newer example)
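    A minimal sketch of how a dataset like this is opened from the generated references, assuming the combined reference file is called combined.json and the GRIB2 files live on s3 (both placeholders, not taken from the chat):

        import fsspec
        import xarray as xr

        # Open the reference set as a virtual filesystem; the file name,
        # protocol and options below are placeholders for this sketch.
        fs = fsspec.filesystem(
            "reference",
            fo="combined.json",
            remote_protocol="s3",
            remote_options={"anon": True},
        )
        ds = xr.open_dataset(
            fs.get_mapper(""), engine="zarr", backend_kwargs={"consolidated": False}
        )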

    That not should have been a not but was not!
    Rich Signell
    @rsignell-usgs
    Essentially a sign error. I love it!
    I will try it!
    Martin Durant
    @martindurant
    How much data is that? I notice that the coordinate is valid_time, but the values of time are mostly missing
    Rich Signell
    @rsignell-usgs
    I was thinking about modifying the combined json to remove time and rename valid_time to be time
    Martin Durant
    @martindurant
    Sounds good. What’s the difference between them?
    Rich Signell
    @rsignell-usgs
    time is the time the model forecast was run. valid_time is the time that corresponds to the data being simulated.
    The size of this 1 day is about 1GB
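    A minimal sketch of the rename Rich describes, done directly on the combined reference JSON; the file names are placeholders, and the layout assumes the version-1 reference format with a top-level "refs" dict:

        import json

        with open("combined.json") as f:      # placeholder file name
            refs = json.load(f)

        new_refs = {}
        for key, val in refs["refs"].items():
            name, _, rest = key.partition("/")
            if name == "time":
                continue                      # drop the model-run time variable
            if name == "valid_time":
                key = "time/" + rest          # rename valid_time -> time
            new_refs[key] = val
        refs["refs"] = new_refs

        # Note: any "coordinates" attributes in other variables' .zattrs
        # that mention valid_time may also need touching up.
        with open("combined_renamed.json", "w") as f:
            json.dump(refs, f)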
    Martin Durant
    @martindurant
    OK, so indeed your rename appears right to me. I wonder why they are NaT, though.
    Rich Signell
    @rsignell-usgs
    Does it have something to do with reading only the first two time steps?
    It would be nice to have the time the model forecast was initialized included as a variable so that users could tell whether the forecast data was 1 hour out from initialization or 18, perhaps called forecast_time. While the time variable would just march forward with uniform 1 hour time steps, the forecast_time variable would have 1 hour time steps up to the latest forecast, and then 0 hour time steps (all forecast_time values of the last 18 data records would be the same). Does that make sense?
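    A small numpy illustration of that idea, with made-up hourly records and an 18-record forecast horizon as in the example:

        import numpy as np

        # Hourly record times for the archive (made-up values).
        time = np.arange("2019-01-01T00", "2019-01-03T00", dtype="datetime64[h]")

        # The trailing 18 records all come from the most recent model run,
        # so forecast_time steps hourly until that run, then stays constant.
        latest_run = time[-18]
        forecast_time = np.minimum(time, latest_run)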
    Martin Durant
    @martindurant
    I suppose it’s not included in the cases set out in MultiZarrToZarr._build_output:
                # cases
                # a) this is accum_dim -> note values, dealt with above
                # b) this is a dimension that didn't change -> copy (once)
                # c) this is a normal var, without accum_dim, var.shape == var0.shape -> copy (once)
                # d) this is var needing reshape -> each dataset's keys get new names, update shape
    This would be a coordinate that DOES change (maps 1:1 with the accumulation dimension)
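    A hedged sketch, not the library's actual code, of what that extra case might look like: a variable whose values track the accumulation dimension gets its per-dataset values concatenated rather than copied once:

        import numpy as np

        def combine_var(datasets, name, accum_dim):
            # Hypothetical helper; "datasets" is a list of xarray.Dataset.
            var0 = datasets[0][name]
            if accum_dim in var0.dims:
                # case (e): varies along accum_dim -> concatenate values
                return np.concatenate([ds[name].values for ds in datasets])
            # cases (b)/(c): identical across datasets -> copy once
            return var0.values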
    Martin Durant
    @martindurant
    Got a talk at PyData Global
    Lucas Sterzinger
    @lsterzinger
    🎉
    As more people start using reference maker (e.g. intake/fsspec-reference-maker#72), should we advertise this gitter in the README?
    Rich Signell
    @rsignell-usgs
    We should create a better name for the gitter if we do, since we know now this isn't just netcdf4
    Hopefully the PyData Global talk will get more attention on this!
    Martin Durant
    @martindurant
    Yes, I think so. There are still plenty of obvious holes like intake/fsspec-reference-maker#74 that had better be fixed.
    I am writing a FITS parser, where times are recorded as ISO strings.
    The coordinates, on the other hand… Also, should waveband be a coordinate or separate variables?
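    A minimal sketch of pulling such ISO strings out of FITS headers into a datetime64 coordinate; DATE-OBS is the conventional keyword, but treat it as an assumption here:

        import numpy as np
        from astropy.io import fits

        def times_from_headers(paths):
            # One ISO-8601 timestamp per file, from the primary header.
            stamps = []
            for path in paths:
                with fits.open(path) as hdul:
                    stamps.append(hdul[0].header["DATE-OBS"])  # assumed keyword
            return np.array(stamps, dtype="datetime64[ns]")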
    Lucas Sterzinger
    @lsterzinger

    We should create a better name for the gitter if we do, since we know now this isn't just netcdf4

    Can't we create a gitter directly from that repo name? So it would be intake/fsspec-reference-maker

    Martin Durant
    @martindurant
    intake/ doesn’t have a gitter org (pangeo-data and continuumIO do). But yes, we could do whatever. I still think we might want to wait a bit, though.
    Chelle Gentemann
    @cgentemann
    um i was trying to change the channel name and it offered to change the avatar so i thought well maybe i can just do that but at least on my gitter now all the pangeo icons are gone.... oops.
    shhh don't tell ryan it was me!
    Rich Signell
    @rsignell-usgs
    I still see my Pangeo icons
    I think we should wait a bit also
    Chelle Gentemann
    @cgentemann
    i changed it back! but it is a little different... odd that I could just do that....
    Martin Durant
    @martindurant
    Ooh, the pangeo icon did change...
    Chelle Gentemann
    @cgentemann
    I know nothing.
    but really, WTF. I can accidentally change all the icons in a group, but in order to change the name of a room I have to email support@gitter??? is this really my fault?
    Martin Durant
    @martindurant
    Bad software design
    Martin Durant
    @martindurant
    400GB of solar imaging referenced in Intake catalog "gcs://mdtemp/SDO.yaml"
    Actually, 2-min cadence images are available in more filters across all times from 2010 to now at http://jsoc2.stanford.edu/data/aia/images/
    Chelle Gentemann
    @cgentemann
    is anyone working on creating the zarr mapping for the solar data? & a notebook demo-ing the solar image access 'normal' versus fsspec ref maker?
    Martin Durant
    @martindurant
    Above I posted an intake cat with a zarr mapping done. I haven’t done any further work; there’s no “normal” access to this kind of dataset using astropy, but there is ndcube, like an alternate xarray, which I don’t have experience with. I don’t think it has a multi-FITS loader from remote; the broader sunpy docs talk about downloading everything.
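    For anyone trying it, the catalog above opens along these lines; the entry name is not given in the chat, so "sdo" is a placeholder (gcsfs must be installed):

        import intake

        cat = intake.open_catalog("gcs://mdtemp/SDO.yaml")
        ds = cat["sdo"].to_dask()  # placeholder entry name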
    Alex Kerney
    @abkfenris
    I've been playing with reference maker and pangeo-forge-recipes and made a repo that uses GitHub Actions to generate references for OISST each day. https://github.com/gulfofmaine/OISST_intake (look in complete/ and preliminary/ for the references). It's also generating Intake catalogs, but it looks like the file object is pointed to the wrong spot for Intake to load it.
    Martin Durant
    @martindurant
    Instead of "./preliminary/reference.json”, use “{{ CATALOG_DIR }}/reference.json"
    You can open it by the HTTP raw link, or (and I like how this looks) using the fsspec GitHub backend:
    cat = intake.open_catalog("github://gulfofmaine:OISST_intake@/preliminary/reference.yaml")
    Oh, and you shouldn’t have target_protocol or target_options - they will be derived from the eventual URL
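    Putting those suggestions together, a catalog entry might look roughly like this; the driver and option names follow the common intake-xarray reference pattern and are assumptions rather than anything copied from the repo:

        sources:
          oisst_preliminary:
            driver: zarr                  # intake-xarray's zarr driver
            args:
              urlpath: "reference://"
              storage_options:
                fo: "{{ CATALOG_DIR }}/reference.json"
                remote_protocol: s3       # assumed location of the netCDF files
                remote_options:
                  anon: true
              consolidated: false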
    Martin Durant
    @martindurant
    In general, we don’t want skip_instance_cache: true either, since it can result in repeated work, but I don’t think it will hurt in this case and can be useful for testing.
    When you feel it is ready, please consider adding a setup.py with entry_points and releasing to pypi, so that people can just “install” the dataset and be sure of having the right dependencies. It then appears under the global intake.cat catalog.
    (or submitting it to pangeo-forge, of course)
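    A minimal sketch of the setup.py Martin describes; the package, module, and catalog names are placeholders:

        from setuptools import setup

        setup(
            name="oisst-intake",                  # placeholder package name
            version="0.1.0",
            py_modules=["oisst_intake"],          # module exposing a `cat` object
            install_requires=["intake", "intake-xarray", "fsspec"],
            entry_points={
                # intake scans this group and attaches the catalog to intake.cat
                "intake.catalogs": ["oisst = oisst_intake:cat"],
            },
        )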
    Alex Kerney
    @abkfenris
    Ok, I thought that using {{ CATALOG_DIR }}/ would be the right choice. I'm going to try to hand roll a catalog that doesn't have the extra kwargs.
    The reference.yamls are coming from the recipe generation, so those are all the defaults.
    Martin Durant
    @martindurant
    You mean pangeo-forge-recipes’ HDFReferenceRecipe? The intake stub is intended to be edited (no automatic way to do that yet).
    Alex Kerney
    @abkfenris
    Yep