Discussion channel for work on creating metadata files that can provide Zarr-like access speeds to older-formatted data of many different types
import fsspec
import xarray as xr

# reference JSON describing the GRIB2 file's chunk layout
rpath = 's3://esip-qhub/noaa/hrrr/jsons/20210901.t00z.wrfsfcf01.json'
# options for reading the reference JSON itself (requester-pays bucket)
s_opts = {'requester_pays': True, 'skip_instance_cache': True}
# options for reading the referenced GRIB2 bytes (public bucket)
r_opts = {'anon': True}
fs = fsspec.filesystem("reference", fo=rpath, ref_storage_args=s_opts,
                       remote_protocol='s3', remote_options=r_opts)
m = fs.get_mapper("")
ds = xr.open_dataset(m, engine="zarr", backend_kwargs=dict(consolidated=False))
ds.data_vars
Data variables:
refd (y, x) float32 ...
si10 (y, x) float32 ...
u (y, x) float32 ...
u10 (y, x) float32 ...
unknown (y, x) float32 ...
v (y, x) float32 ...
v10 (y, x) float32 ...
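For anyone following along, the JSON that rpath points to is just a references file in the fsspec ReferenceFileSystem format: each zarr key maps either to inline data or to a [url, offset, length] triple into the original GRIB2 object. A minimal hand-written sketch (the target URL and byte ranges here are made up for illustration):

import json

refs = {
    "version": 1,
    "refs": {
        # zarr metadata is small enough to store inline as JSON strings
        ".zgroup": '{"zarr_format": 2}',
        # each chunk key points at a byte range in the original file
        "t2m/0.0": ["s3://some-bucket/hrrr.t00z.wrfsfcf01.grib2", 4096, 1048576],
    },
}
with open("example.json", "w") as f:
    json.dump(refs, f)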
In [8]: ds
Out[8]:
<xarray.Dataset>
Dimensions: (time: 9, y: 1059, x: 1799)
Coordinates:
heightAboveGround float64 ...
latitude (y, x) float64 ...
longitude (y, x) float64 ...
step timedelta64[ns] ...
* time (time) datetime64[us] 2019-01-01T22:00:00 ... 2019-01-...
valid_time (time) datetime64[ns] ...
Dimensions without coordinates: y, x
Data variables:
d2m (time, y, x) float32 ...
pt (time, y, x) float32 ...
r2 (time, y, x) float32 ...
sh2 (time, y, x) float32 ...
t2m (time, y, x) float32 ...
Attributes:
Conventions: CF-1.7
GRIB_centre: kwbc
GRIB_centreDescription: US National Weather Service - NCEP
GRIB_edition: 2
GRIB_subCentre: 0
history: 2021-09-02T16:57 GRIB to CDM+CF via cfgrib-0.9.9...
institution: US National Weather Service - NCEP
In [9]: ds.time.values
Out[9]:
array(['2019-01-01T22:00:00.000000', '2019-01-01T23:00:00.000000',
'2019-01-02T00:00:00.000000', '2019-01-02T01:00:00.000000',
'2019-01-02T02:00:00.000000', '2019-01-02T03:00:00.000000',
'2019-01-02T04:00:00.000000', '2019-01-02T05:00:00.000000',
'2019-01-02T06:00:00.000000'], dtype='datetime64[us]')
(this is the original set of files, not the ones from your newer example)
forecast_time: while the time variable would just march forward with uniform 1-hour time steps, the forecast_time variable would have 1-hour time steps up to the latest forecast, and then 0-hour time steps (all forecast_time values of the last 18 data records would be the same). Does that make sense?
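If I'm reading that right, here's a quick numpy sketch of that forecast_time (times chosen to match the example above; the clipping is my interpretation of the "0-hour time steps"):

import numpy as np

# hourly valid times spanning several forecast cycles
time = np.arange(np.datetime64("2019-01-01T22:00"),
                 np.datetime64("2019-01-03T01:00"),
                 np.timedelta64(1, "h"))
latest = np.datetime64("2019-01-02T06:00")  # most recent model run
# forecast_time steps forward hourly until the latest run, then stays
# constant: the last 18 records all get the same value
forecast_time = np.minimum(time, latest)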
# cases
# a) this is accum_dim -> note values; dealt with above
# b) this is a dimension that didn't change -> copy (once)
# c) this is a normal var without accum_dim, var.shape == var0.shape -> copy (once)
# d) this is a var needing reshape -> each dataset's keys get new names, update shape
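To make those four branches concrete, here's a toy dispatch over per-dataset reference dicts (the key layout and the reshape_vars argument are placeholders for illustration, not the actual combine internals):

def combine_refs(ref_sets, accum_dim="time", reshape_vars=frozenset()):
    """Toy merge of per-dataset reference dicts, one branch per case above."""
    out = {}
    for i, refs in enumerate(ref_sets):
        for key, val in refs.items():
            var = key.split("/")[0]
            last = key.rsplit("/", 1)[-1]
            if last.startswith("."):
                # metadata (.zgroup/.zarray/.zattrs): copy once; for case (d)
                # vars the shape field would be rewritten separately
                out.setdefault(key, val)
            elif var == accum_dim:
                continue  # (a) accum_dim values were collected in an earlier pass
            elif var not in reshape_vars:
                out.setdefault(key, val)  # (b)/(c): identical in every input, copy once
            else:
                # (d) prefix a new chunk index along accum_dim so each
                # dataset's chunks land at a distinct position
                out[f"{var}/{i}.{last}"] = val
    return out

# e.g. combine_refs([{"t2m/0.0": ["s3://f0", 0, 10]},
#                    {"t2m/0.0": ["s3://f1", 0, 10]}], reshape_vars={"t2m"})
# -> {"t2m/0.0.0": [...], "t2m/1.0.0": [...]}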