Discussion channel for creating metadata files that give Zarr-like access speeds to data stored in many different older formats
sources:
  nwm-reanalysis:
    driver: intake_xarray.xzarr.ZarrSource
    description: 'National Water Model Reanalysis, version 2.1'
    args:
      urlpath: 'reference://'
      simple_templates: True
      storage_options:
        target_options:
          anon: true
          compression: 'zstd'
        target_protocol: s3
        fo: 's3://esip-qhub-public/noaa/nwm/nwm_reanalysis.json.zst'
        remote_options:
          anon: true
        remote_protocol: s3
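The open_dataset line that follows uses an fs object that the snippet never defines. A minimal sketch of how it could be built, assuming the storage_options from the catalog entry are passed straight to fsspec's "reference" filesystem. The names STORAGE_OPTIONS and open_reference_fs are made up for illustration, and actually calling the function needs fsspec, s3fs, zstandard and network access to the bucket:

```python
# Keyword arguments copied from the catalog's storage_options block.
STORAGE_OPTIONS = {
    "fo": "s3://esip-qhub-public/noaa/nwm/nwm_reanalysis.json.zst",
    "target_protocol": "s3",
    "target_options": {"anon": True, "compression": "zstd"},
    "remote_protocol": "s3",
    "remote_options": {"anon": True},
}


def open_reference_fs(options=STORAGE_OPTIONS):
    """Return an fsspec 'reference' filesystem built from the given options.

    Hypothetical helper: nothing is downloaded until this is called.
    """
    # Imported here so the options dict can be inspected without fsspec installed.
    import fsspec

    return fsspec.filesystem("reference", **options)
```

With fs = open_reference_fs(), the fs.get_mapper("") call in the next line should have a mapper to hand to xarray's zarr engine.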
ds = xr.open_dataset(fs.get_mapper(""), engine='zarr')
/home/conda/store/896e738a7fff13f931bce6a4a04b3575ecd1f4cbd0e7da9d83afcc7273e57b60-pangeo/lib/python3.8/site-packages/xarray/conventions.py:512: SerializationWarning: variable 'qBtmVertRunoff' has multiple fill values {-9999000, 0}, decoding all values to NaN.
scale_factor has round-off error, and the _FillValue is 0 instead of 999900.
In [10]: np.array(0.009999999776482582, dtype="f4")
Out[10]: array(0.01, dtype=float32)
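The round-off seen above can be reproduced without NumPy. This is only a sketch using the standard library's struct module, and the helper name as_float32 is made up here. It shows that 0.01 stored as float32 and then widened to float64 prints as the long decimal, and that the long decimal maps back to the same float32:

```python
import struct


def as_float32(x):
    """Round a Python float (float64) to the nearest float32, returned as a float64."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


stored = as_float32(0.01)
print(repr(stored))                  # 0.009999999776482582
print(stored == 0.01)                # False: the float64 0.01 is a different number
print(as_float32(stored) == stored)  # True: the value is exactly representable in float32
```

So a scale_factor written out as 0.009999999776482582 is probably just the float32 value 0.01 serialized at float64 precision, not a different number.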
$ ncdump -h 202001011100.CHRTOUT_DOMAIN1.comp | grep streamflow
    int streamflow(feature_id) ;
        streamflow:long_name = "River Flow" ;
        streamflow:units = "m3 s-1" ;
        streamflow:coordinates = "latitude longitude" ;
        streamflow:grid_mapping = "crs" ;
        streamflow:_FillValue = -999900 ;
        streamflow:missing_value = -999900 ;
        streamflow:scale_factor = 0.01f ;
        streamflow:add_offset = 0.f ;
        streamflow:valid_range = 0, 5000000 ;
valid_range: https://cfconventions.org/Data/cf-conventions/cf-conventions-1.8/cf-conventions.html#missing-data
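To make the attributes in the ncdump header concrete, here is a rough sketch of how one stored integer would be unpacked. It assumes, as I read the CF packed-data rules, that _FillValue and valid_range are expressed in the packed (integer) units. The function name decode_streamflow is made up for illustration and is not what xarray does internally:

```python
import math

# Attribute values taken from the ncdump header for streamflow.
FILL_VALUE = -999900
SCALE_FACTOR = 0.01
ADD_OFFSET = 0.0
VALID_RANGE = (0, 5000000)  # assumption: given in packed (integer) units


def decode_streamflow(packed):
    """Unpack one stored integer to m3 s-1, or NaN if it is missing or out of range."""
    low, high = VALID_RANGE
    if packed == FILL_VALUE or not (low <= packed <= high):
        return math.nan
    return packed * SCALE_FACTOR + ADD_OFFSET


print(decode_streamflow(250))      # 2.5
print(decode_streamflow(-999900))  # nan (fill value)
print(decode_streamflow(5000001))  # nan (above valid_range)
```

Under that reading, the upper bound of valid_range corresponds to a streamflow of about 50000 m3 s-1.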