rpgoldman
@rpgoldman
@dcherian Thanks again -- I really appreciate the help. I must run now, but I am left with the warm feeling that I know what to do next!
Deepak Cherian
@dcherian
:+1:
rpgoldman
@rpgoldman
Very sorry -- I seem to be asking the dumb questions -- how do I select based on a range of coordinate values? I know how to do comparisons using where, but am less sure about coordinates.
Maybe I have just missed something, but I don't see in the indexing docs instructions for selecting, say, values for variables in latitudes 12 < x < 15, which seems like something one would want to do all the time.
Deepak Cherian
@dcherian
@rpgoldman there are no dumb questions! only bad docs!
.sel(x=slice(12,15))
for dimension coordinates
Riley Brady
@bradyrx
@rpgoldman see http://xarray.pydata.org/en/stable/indexing.html#nearest-neighbor-lookups for additional details on this. This works on a rectilinear mesh, but if you have a curvilinear or unstructured mesh there are other ways you need to go about doing it.
rpgoldman
@rpgoldman
@bradyrx Thanks -- it hadn't occurred to me that this was a "nearest neighbor" thing.
Riley Brady
@bradyrx
Oops, sorry -- you want the section directly before that, under the header “Indexing with dimension names”, point #2.
The one I linked to is additional info for when your coordinates don't exactly match 12 and 15, for instance. It'll grab the closest ones with the method='nearest' argument.
rpgoldman
@rpgoldman
Sorry if I sound difficult, but doesn't this seem more complicated than rotating the dimensions to variables and then using where()? The where syntax seems a lot easier to understand than this nearest neighbor thing.
It's not the selection I'm talking about so much as just the syntax of selecting.
@dcherian Can one use an open-ended slice and just count on it being truncated at the upper limit of the coordinate?
Deepak Cherian
@dcherian
it depends on what you want. if x is a dimension coordinate, then using sel with slice is a lot more efficient
slice(lower_bound, None) will do that. slice(bound) is equivalent to slice(None, bound)
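For example (a minimal sketch; label-based slicing on a dimension coordinate includes both endpoints):

```python
import numpy as np
import xarray as xr

x = [0.5, 1.5, 2.5, 3.5, 4.5]
da = xr.DataArray(np.arange(5), dims=['x'], coords=[x])

# Open-ended slice: everything from x=2.5 to the end of the coordinate
upper = da.sel(x=slice(2.5, None))   # x = 2.5, 3.5, 4.5

# slice(bound) is Python shorthand for slice(None, bound)
lower = da.sel(x=slice(2.5))         # x = 0.5, 1.5, 2.5
```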
Riley Brady
@bradyrx

I think I made it more complicated by linking you to the wrong section! Nearest-neighbor is for when you're trying to select exact points on a grid, but those exact points don't exist.

The syntax is straightforward:

import numpy as np
import xarray as xr

time = xr.cftime_range(start='1990', freq='MS', periods=12)
x = [0.5, 1.5, 2.5, 3.5, 4.5]
da = xr.DataArray(np.random.rand(12, 5), dims=['time', 'x'], coords=[time, x])

# Select date range
da.sel(time=slice('1990-03', '1990-05'))

# Select space range
da.sel(x=slice(1.5, 3.5))

# Both
da.sel(x=slice(1.5, 3.5), time=slice('1990-01', '1990-04'))
rpgoldman
@rpgoldman
@dcherian Thanks! I had that wrong
@bradyrx Thanks, ok, I get it now.
Riley Brady
@bradyrx
.where() evaluates everything in the array to see whether it meets your criteria, and replaces entries that don't with NaN. As @dcherian mentioned, slice is more efficient from a computational perspective.
No problem! The key thing to get here is the difference between .isel() and .sel() though. I was confused by that at first. .isel() is like index selecting in numpy. .sel() references the coordinates.

So with the following example:

import numpy as np
import xarray as xr

x = [-2, -1, 0, 1, 2]
da = xr.DataArray(np.random.rand(5), dims='space', coords=[x])

da.sel(space=0) returns the third entry. da.isel(space=0) returns the first entry.
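To make the .where() vs .sel() contrast concrete (a self-contained sketch):

```python
import numpy as np
import xarray as xr

x = [-2, -1, 0, 1, 2]
da = xr.DataArray(np.arange(5.0), dims='space', coords=[x])

# .sel() subsets: the result contains only the selected labels
subset = da.sel(space=slice(0, 2))    # shape (3,)

# .where() keeps the full shape and fills non-matching entries with NaN
masked = da.where(da.space >= 0)      # shape (5,), first two entries NaN
```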

rpgoldman
@rpgoldman
@bradyrx Yes, I figured that, but was having a hard time figuring out the details of using .sel. I appreciate both of you taking the time to clear it up for me
Riley Brady
@bradyrx
Great! Feel free to ask clarifying questions if they come up.
Ray Bell
@raybellwaves
Copying the rtd setup of xarray over at xskillscore. Curious about your settings on the rtd site under Admin -> Advanced Settings -> Requirements file. I'm opting to use a yaml version for the docs build, but it's being finicky, i.e. I'm seeing ModuleNotFoundError: No module named 'xskillscore' in the docs build
Deepak Cherian
@dcherian
it's empty
Ray Bell
@raybellwaves
thx
Deepak Cherian
@dcherian
@raybellwaves see pydata/xarray#4350
Ray Bell
@raybellwaves
nice
That PR would have probably solved my original problem. I realized I had to tweak root = pathlib.Path(__file__).absolute().parent.parent and add an extra .parent, i.e. root = pathlib.Path(__file__).absolute().parent.parent.parent, as the conf.py is in docs/source/ instead of docs/ like it is in xarray
James Stidard
@jamesstidard
Hi, I want to be able to quickly check what a *.nc file contains. Is there a way to load an empty dataset where I can still access the .attrs metadata on each variable? The files I'm working with are large and take a long time to load otherwise. I wasn't able to find anything in the docs, but thought I'd try my chances asking. Thanks
James Stidard
@jamesstidard
found my answer. open_mfdataset will do it with dask. It's pretty fast.
Benjamin Root
@WeatherGod
Another thing that can speed up an open_dataset call is to pass decode_cf=False, as well as passing False to some of the other decode_* keyword arguments, depending on whether or not you actually need those features.
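A sketch of what that looks like ('example.nc' is a stand-in file, created here just so the snippet is self-contained):

```python
import numpy as np
import xarray as xr

# Create a tiny stand-in file so the example runs on its own
xr.Dataset(
    {'t': ('time', np.arange(3.0), {'units': 'days since 2000-01-01'})}
).to_netcdf('example.nc')

# decode_cf=False skips all CF decoding (times, scale/offset, masking),
# which can make opening large files noticeably faster
ds = xr.open_dataset('example.nc', decode_cf=False)

# The raw numeric values and their attributes are preserved as-is
print(ds['t'].attrs['units'])   # days since 2000-01-01

# Or switch off only selected decoding steps
ds2 = xr.open_dataset('example.nc', decode_times=False, mask_and_scale=False)
```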
Riley Brady
@bradyrx
@jamesstidard I think the lowest cost thing on the command line is NCO. You can do ncdump -h file.nc | grep ATTR, for example. There might be an NCO command to directly probe specific attributes. I'm wondering if in jupyter/python you could do something like if !ncdump -h file.nc:, where you can declare some attribute and see if the list is full or empty, etc.
Kurt Sansom
@kayarre_gitlab
I would like to extract the schema from a netCDF file read into xarray, so that I can define a new dataset with a similar or the same format but different data, without having that original netCDF file. Is there a simple-ish way to do this?
Kurt Sansom
@kayarre_gitlab
Or I guess write some kind of converter using xarray internals.
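One possible sketch (not from the chat): copy each variable's dims, dtype, and attrs into a new, uninitialized Dataset. empty_like_schema is a hypothetical helper name, not an xarray API.

```python
import numpy as np
import xarray as xr

def empty_like_schema(ds):
    """Hypothetical helper: a new Dataset with the same variables,
    dims, dtypes, coords and attrs as `ds`, but uninitialized data."""
    data_vars = {
        name: (var.dims, np.empty(var.shape, dtype=var.dtype), dict(var.attrs))
        for name, var in ds.data_vars.items()
    }
    coords = {
        name: (var.dims, var.values, dict(var.attrs))
        for name, var in ds.coords.items()
    }
    return xr.Dataset(data_vars, coords=coords, attrs=dict(ds.attrs))

# Example: the template matches the original's structure
ds = xr.Dataset(
    {'sst': (('time', 'x'), np.random.rand(4, 3), {'units': 'degC'})},
    coords={'x': [10.0, 20.0, 30.0]},
    attrs={'title': 'demo'},
)
template = empty_like_schema(ds)
```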
Riley Brady
@bradyrx
Do any of the devs know how to get the “Helpful resources” working when submitting an issue? See here:
(screenshot of the issue template's “Helpful resources” sidebar)
Deepak Cherian
@dcherian
(screenshot)
you probably need these:
Joe Hamman
@jhamman
Riley Brady
@bradyrx
We’ve got those, but they’re .rst. Maybe that’s the catch.
Joe Hamman
@jhamman
I think they have to be Markdown.
Riley Brady
@bradyrx
Thanks Joe and Deepak! Yep it looks like it seeks out that file name. We’ll convert our code of conduct and contribution guide over to markdown.
Noushin
@bnoushin7
Hi, I am trying to catalog some monthly datasets, and here is what I am using:
source = xr.open_mfdataset(file_path_name, combine='nested', concat_dim='time')
src = source

# Use intake with xarray kwargs
source = intake.open_netcdf(file_path_name, concat_dim='time', xarray_kwargs={'combine': 'nested', 'decode_times': True})
But I am getting this error:

```
Traceback (most recent call last):
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/coding/times.py", line 77, in _decode_cf_datetime_dtype
    result = decode_cf_datetime(example_value, units, calendar, use_cftime)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/coding/times.py", line 157, in decode_cf_datetime
    dates = _decode_datetime_with_pandas(flat_num_dates, units, calendar)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/coding/times.py", line 109, in _decode_datetime_with_pandas
    delta = _netcdf_to_numpy_timeunit(delta)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/coding/times.py", line 53, in _netcdf_to_numpy_timeunit
    }[units]
KeyError: 'months'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/homes/nbehboud/COLA-DATASETS-CATALOG/development-codes/generate_catalog.py", line 102, in <module>
    generate_catalog()
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/homes/nbehboud/COLA-DATASETS-CATALOG/development-codes/generate_catalog.py", line 49, in generate_catalog
    source = xr.open_mfdataset(file_path_name, combine='nested', concat_dim='time')
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/backends/api.py", line 908, in open_mfdataset
    datasets = [open_(p, **open_kwargs) for p in paths]
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/backends/api.py", line 908, in <listcomp>
    datasets = [open_(p, **open_kwargs) for p in paths]
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/backends/api.py", line 538, in open_dataset
    ds = maybe_decode_store(store)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/backends/api.py", line 453, in maybe_decode_store
    use_cftime=use_cftime,
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/conventions.py", line 585, in decode_cf
    use_cftime=use_cftime,
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/conventions.py", line 494, in decode_cf_variables
    use_cftime=use_cftime,
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/conventions.py", line 336, in decode_cf_variable
    var = coder.decode(var, name=name)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/coding/times.py", line 426, in decode
    dtype = _decode_cf_datetime_dtype(data, units, calendar, self.use_cftime)
  File "/homes/nbehboud/.conda/envs/catalog-ing/lib/python3.6/site-packages/xarray/coding/times.py", line 86, in _decode_cf_datetime_dtype
    raise ValueError(msg)
ValueError: unable to decode time units 'months since 1980-01-01 00:00' with the default calendar. Try opening your dataset with decode_times=False.

mld_monthly Failed !!!!!!!!!!!!!!!!!!!!!!!!!!!!
```

And I have tried with decode_times=False and got the same error. Any idea?
@dcherian
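For what it's worth, pandas/NumPy cannot represent 'months' as a fixed time delta, which is why decoding fails. A common workaround (an assumption, not confirmed in this chat) is to open with decode_times=False and then assign a monthly index by hand; the stand-in dataset below mimics the undecodable file:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Stand-in for a dataset opened with decode_times=False:
# time is left as raw integers with 'months since ...' units
ds = xr.Dataset(
    {'mld': ('time', np.random.rand(12))},
    coords={'time': ('time', np.arange(12),
                     {'units': 'months since 1980-01-01 00:00'})},
)

# Replace the raw offsets with a real month-start datetime index
# (xr.cftime_range is the analogue for non-standard calendars)
ds = ds.assign_coords(
    time=pd.date_range('1980-01-01', periods=ds.sizes['time'], freq='MS')
)
```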
playertr
@playertr
I've been trying to open a 767-file set of monthly NetCDF data and write it to a Zarr store, but the to_zarr() command is failing with a Max Recursion Depth exceeded error. The weird thing is that the error occurs at different times -- it can happen at anywhere from 11% to 98% completion of the to_zarr() call. I tried increasing ulimit -n and file_cache_maxsize on a hunch, but that didn't work. Before I debug more, do these symptoms sound like a familiar problem with a known workaround?
Deepak Cherian
@dcherian
Is the open_mfdataset call for the nc files succeeding? (is that what you're doing?) There are two open issues for recursion errors: one to do with pydap (pydata/xarray#4348) and one about concatenating cftime and numpy datetimes (pydata/xarray#3666).
Is one of these applicable?