David Brochart
@davidbrochart
I can't even build the doc locally. When following http://xarray.pydata.org/en/latest/contributing.html#contributing-to-the-documentation, I get the following error: OSError: [UT_PARSE] Failed to open UDUNITS-2 XML unit database.
Thomas Moore
@Thomas-Moore-Creative

Aloha xarray community. Firstly, hopefully this is a constructive place for this question? It seems to be, but if Stack Overflow is where this community prefers such questions, please pull me into line.

I'm coding up some ocean diagnostics that require depth derivatives of both zonal currents and potential temperature. The model data I'm using is on a MOM B-grid, so the two variables are on slightly different lat/lon grids. I'm simply trying to linearly interpolate the current data array (DA) onto the temperature DA coordinates. Both of these DAs are 4D. I'm following the examples in the xarray docs > http://xarray.pydata.org/en/stable/interpolation.html?highlight=interp_like and attempting to use interp_like(). From reading the documentation, it seems interp_like should be straightforward to apply?

u=ds.u
t=ds.temp
u_on_t=u.interp_like(t)

However, while I get no errors, the u_on_t result shows no change in coordinate names or values. Maybe I'm misunderstanding how interp_like can be used? Can anyone point me to further examples of its use beyond the basic xarray documentation? Thanks to many for all the efforts on these great tools.

Thomas Moore
@Thomas-Moore-Creative
!!! Yes, the problem was I expected magic mind reading from xarray. Of course the coordinate names for lat/lon in u differ from t so how is interp_like supposed to know which coordinates to interp on? Renaming the u coordinates to match the t coordinates before calling interp_like yields the desired result of current values interpolated onto the temperature lat/lon grid.
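The rename fix described above can be sketched as follows. The coordinate names (yu_ocean/xu_ocean vs. yt_ocean/xt_ocean) and the toy grids are made up for illustration; real MOM output uses its own naming:

```python
import numpy as np
import xarray as xr

# Hypothetical B-grid fields: u lives on (yu_ocean, xu_ocean),
# temperature on (yt_ocean, xt_ocean) -- slightly offset grids
u = xr.DataArray(
    np.arange(16, dtype=float).reshape(4, 4),
    dims=("yu_ocean", "xu_ocean"),
    coords={"yu_ocean": np.linspace(-1.5, 1.5, 4),
            "xu_ocean": np.linspace(0.5, 3.5, 4)},
)
t = xr.DataArray(
    np.zeros((4, 4)),
    dims=("yt_ocean", "xt_ocean"),
    coords={"yt_ocean": np.linspace(-1.0, 2.0, 4),
            "xt_ocean": np.linspace(0.0, 3.0, 4)},
)

# Rename u's dims/coords to match t so interp_like knows what to align
u_on_t = u.rename({"yu_ocean": "yt_ocean",
                   "xu_ocean": "xt_ocean"}).interp_like(t)
```

interp_like only interpolates over dimensions whose names appear in both objects; anything it cannot match is passed through untouched, which is why the original call returned u unchanged without raising an error.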
Alan D. Snow
@snowman2
What is the recommended method for reading/writing of mask bands for integer data?
Just curious if there is support, or planned support, for a mask band instead of just the default convert-to-float-and-add-_FillValue option?
David Hoese
@djhoese
FYI @snowman2 (think I mentioned this to you in person at my scipy tutorial): Satpy has started a convention of using _FillValue with integer types and trying hard to preserve the integer type whenever possible.
Alan D. Snow
@snowman2
Yes, I remember that. That is actually what inspired this question. Mostly just wondering if xarray has plans to support mask bands outside of just the _FillValue option. This is particularly useful for datasets with a uint dtype where all the data values have meaning, for example an RGB image.
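For reference, the Satpy-style convention mentioned above can be expressed by putting _FillValue in the variable's encoding so the on-disk dtype stays integer. A minimal sketch; the variable name and the choice of 255 as the reserved value are assumptions:

```python
import numpy as np
import xarray as xr

# uint8 band where every value 0-255 is potentially meaningful
band = xr.DataArray(np.array([[0, 128], [255, 7]], dtype="uint8"),
                    dims=("y", "x"))
ds = xr.Dataset({"red": band})

# Keep the integer dtype on disk and reserve one value as the fill,
# instead of letting xarray promote the data to float on write
ds["red"].encoding.update({"dtype": "uint8", "_FillValue": np.uint8(255)})
```

The trade-off being discussed above is exactly that this burns one of the 256 values, which a separate mask band would avoid.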
David Hoese
@djhoese
@snowman2 I think there are many xarray issues on the subject (don't have any links right now)
Alan D. Snow
@snowman2
I did a rudimentary search with the word mask earlier - sounds like I missed some. I will see if I can do a better one and find some.
Ryan Abernathey
@rabernat
Xarray has a gitter, who knew?
Joe Hamman
@jhamman
maybe we should add a badge on the readme?
Ryan Abernathey
@rabernat
good first issue :wink:
Joe Hamman
@jhamman
Ryan Abernathey
@rabernat
I have created a new label in the xarray issues related to "arrays" https://github.com/pydata/xarray/issues?q=is%3Aopen+is%3Aissue+label%3Aarrays
Please tag other related issues
Ryan Abernathey
@rabernat
FYI, I have started a hackmd doc to share at the sprint: https://hackmd.io/s/H1OlVNLZH
Ryan Abernathey
@rabernat
xarray is currently the most popular sprint at SciPy!
Ryan Abernathey
@rabernat
Joe Hamman
@jhamman
xarray is currently the most popular sprint at SciPy!
Whoop whoop!
Harman Deep Singh
@hdsingh

Hello everyone. I am facing the following issue with .sel:

The .sel method gives an error when used to select float32 values; it works fine for float64.
Example:

import xarray as xr
import numpy as np

a = np.asarray([0., 0.111, 0.222, 0.333], dtype='float32')
ds = xr.Dataset(coords={'a': (['a'],a )})
ds.a.sel({'a': 0.111})

Error

KeyError                                  Traceback (most recent call last)
~/anaconda3/envs/check3/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   2656             try:
-> 2657                 return self._engine.get_loc(key)
   2658             except KeyError:

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.Float64HashTable.get_item()

xarray version: 0.12.1
numpy version: 1.16.3

Is this something that should be raised as an issue?

David Hoese
@djhoese
@hdsingh I can't speak for the xarray devs, but this sounds like something that should be submitted as an issue
Benjamin Root
@WeatherGod
this is probably because the literal 0.111 is a float64, but becomes something slightly different when downcast to float32
one day, we'll gain the ability to properly index floating-point coordinates
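A reproduction of the mismatch plus two possible workarounds, for reference (the tolerance value is an arbitrary choice):

```python
import numpy as np
import xarray as xr

a = np.asarray([0.0, 0.111, 0.222, 0.333], dtype="float32")
ds = xr.Dataset(coords={"a": (["a"], a)})

# float32(0.111) != float64(0.111), so an exact float64 lookup misses.
# Workaround 1: cast the label to the coordinate's dtype first
exact = ds.a.sel(a=np.float32(0.111))

# Workaround 2: nearest-neighbour selection with a tolerance
nearest = ds.a.sel(a=0.111, method="nearest", tolerance=1e-6)
```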
Harman Deep Singh
@hdsingh
Thanks. I will create an issue for this.
Benjamin Root
@WeatherGod
but, it'll require a bunch of changes in pandas first, which I started last year, but ran out of time for it
cwerner
@cwerner
Hi all. Is it possible to use xr.reindex_like(method='nearest') and require it NOT to pick a NODATA value in the process? I'm using it to downsample from 0.083 deg to 0.25 deg resolution and would like a valid value if any of the 3x3 source coordinates has one.
Cheers
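One alternative when downsampling by an exact integer factor (3x here, matching 0.083 deg to 0.25 deg) is coarsen, whose mean() skips NaN by default, so a target cell stays valid as long as any source cell in its block is valid. A toy sketch, not the actual data:

```python
import numpy as np
import xarray as xr

# Toy 6x6 high-resolution grid with one missing cell
data = np.arange(36, dtype=float).reshape(6, 6)
data[0, 0] = np.nan
da = xr.DataArray(
    data,
    dims=("lat", "lon"),
    coords={"lat": np.arange(6), "lon": np.arange(6)},
)

# Aggregate each 3x3 block; mean() skips NaN by default, so the block
# containing the missing cell still yields a valid value
coarse = da.coarsen(lat=3, lon=3).mean()
```

Unlike reindex_like(method='nearest'), which picks a single nearest source cell (NODATA or not), this uses all valid cells in each block.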
Zachary Barry
@zbarry

has anyone played around with dask distributed + zarr? it looks like i'm getting weird freezes when i queue things up through persist on the dask scheduler when my code contains open_zarr, and i'm seeing this warning continually pop up: distributed.utils_perf - WARNING - full garbage collections took 21% CPU time recently (threshold: 10%)

the code specifically looks like this:

framerate = 5
minutes_per_timebin = 2
num_frames_per_timebin = minutes_per_timebin * 60 * framerate

def timebin_zarr(filepath):
    zarr_ds = xr.open_zarr(str(filepath))
    zarr_df_f = zarr_ds['df_f'].chunk(dict(time=num_frames_per_timebin))

    num_bins = zarr_df_f.sizes['time'] // num_frames_per_timebin

    df_f_timebins = zarr_df_f.groupby_bins('time', num_bins, include_lowest=True, right=False)

    timebin_means = df_f_timebins.mean('time')

    return timebin_means

percentile_timebin_futures = [timebin_zarr(filepath) for filepath in percentile_zarr_files]
percentile_timebin_futures = [future.persist() for future in percentile_timebin_futures]
Zachary Barry
@zbarry
ok, for others' reference, it looks like my problem was that I had divided up my large arrays into 28k chunks, and that doesn't seem to play well with the scheduler
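For reference, the overhead is visible in the chunk count, and consolidating chunks before calling persist() is the usual fix. The sizes here are arbitrary; a common rule of thumb is chunks on the order of 100 MB:

```python
import numpy as np
import xarray as xr

# 10,000 elements in ten-element chunks: 1,000 tasks of almost pure overhead
da = xr.DataArray(np.arange(10_000, dtype=float), dims="time").chunk({"time": 10})
n_tiny = da.data.npartitions

# Consolidate into a handful of large chunks before persist()/compute()
da = da.chunk({"time": 2_500})
```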
Ryan Abernathey
@rabernat
Zachary Barry
@zbarry
@rabernat - appreciate it!
Filipe
@ocefpaf

Folks, issue pydata/xarray#2368 raised a few problems with datasets that cannot be loaded with xarray, but it seems the most common one is:

MissingDimensionsError: 'lon' has more than 1-dimension and the same name as one of its dimensions ('lon', 'lat'). xarray disallows such variables because they conflict with the coordinates used to label dimensions.

I believe that PR pydata/xarray#2405 will help with that, but I wonder if there is some xarray-foo to load the dataset and rename the variable at the same time. (iris has similar functionality to "fix" bad CF conventions; here it would be used to work around them.)
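One workaround that does exist today is the drop_variables argument to open_dataset: drop the offending 2-D variable so the file opens, then read it back with a lower-level library under a non-conflicting name. A sketch (the file contents and the lon2d name are made up; scipy's netCDF writer is used here only to fabricate a file that trips the error):

```python
import os
import tempfile

import numpy as np
import xarray as xr
from scipy.io import netcdf_file

path = os.path.join(tempfile.mkdtemp(), "bad_cf.nc")

# Fabricate a file with a 2-D "lon" variable that shares a name with one
# of its own dimensions -- exactly what triggers MissingDimensionsError
f = netcdf_file(path, "w")
f.createDimension("lon", 2)
f.createDimension("lat", 2)
v = f.createVariable("lon", np.float64, ("lon", "lat"))
v[:] = np.array([[0.0, 1.0], [2.0, 3.0]])
f.close()

# Drop the offending variable at open time so the rest of the file loads
ds = xr.open_dataset(path, drop_variables=["lon"])

# Re-read the dropped variable with the raw library and attach it
# under a name that no longer conflicts with a dimension
f2 = netcdf_file(path, "r", mmap=False)
lon2d = np.array(f2.variables["lon"][:])
f2.close()
ds = ds.assign(lon2d=(("lon", "lat"), lon2d))
```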

David Hoese
@djhoese
@ocefpaf I think you can tell it to not parse CF (parse_cf=False iirc) and then do things manually. I'm not sure if that affects coordinate creation though
Filipe
@ocefpaf

Is that an open_dataset kw @djhoese? If so I don't have it:

TypeError: open_dataset() got an unexpected keyword argument 'parse_cf'

I'm using xarray 0.12.3.

Kai Mühlbauer
@kmuehlbauer
@ocefpaf it's decode_cf
Ryan Abernathey
@rabernat
@ocefpaf - I don't think decode_cf will fix your problem. It's a kind of thorny challenge.
This has been one of @dopplershift's long-standing complaints about xarray. I tried to start fixing it but didn't have the time (it's my PR).
Ryan May
@dopplershift
That's still on my list to get to as well
Filipe
@ocefpaf

@ocefpaf it's decode_cf

Thanks. That one I know about and it is in one of my examples in the issue above :smile:

This has been one of @dopplershift's long-standing complaints about xarray. I tried to start fixing it but didn't have the time (it's my PR).

Yep. Mine too.

I'll take a look at the PR. I'm not too familiar with the xarray code lately, but I can help test it.
Ryan Abernathey
@rabernat
Our best bet may be to just use our contract with Anaconda and ask someone there to tackle it. It seems unlikely to be resolved by volunteer effort.
Filipe
@ocefpaf
:+1: glad to work with whomever at Anaconda will tackle this.
Andrew Tolmie
@DancingQuanta
Hi, I encountered a situation where I wanted to slice along a dim with an array along another dim.
The slice object slice() does not accept non-scalar arrays; it only works with scalar values.
Joe Hamman
@jhamman
@DancingQuanta - take a look at xarray’s support for vectorized indexing: http://xarray.pydata.org/en/stable/indexing.html#vectorized-indexing
Andrew Tolmie
@DancingQuanta
Ah, I have to create a proper indexer
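A minimal sketch of such an indexer, for reference. A DataArray used as an indexer is matched by dimension name, which is what plain slice() cannot express:

```python
import numpy as np
import xarray as xr

da = xr.DataArray(np.arange(12).reshape(3, 4), dims=("x", "y"))

# An indexer along the existing dim "x" selects pointwise: da[x, ind[x]]
ind = xr.DataArray([0, 1, 2], dims="x")
pointwise = da.isel(y=ind)   # 1-D result along x

# An indexer along a new dim "z" does an outer selection of columns
cols = xr.DataArray([0, 2, 3], dims="z")
outer = da.isel(y=cols)      # 2-D result with dims ("x", "z")
```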
Maximilian Gorfer
@magonreal
I am using xarray.Dataset.groupby and applying a custom function. Is it possible to have one iteration of that function return without a valid dataset? In some cases it is not possible to get a valid result from the function, and I would like to just skip that one operation and get a reduced output dataset after the groupby.
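One pattern (an assumption on my part, not an official xarray feature for skipping groups) is to return a NaN placeholder from the groups that cannot produce a result and drop them afterwards. A sketch using .map, which is spelled .apply in older xarray versions; the data and the validity check are made up:

```python
import numpy as np
import xarray as xr

ds = xr.Dataset({"v": ("x", [1.0, 2.0, -1.0, -2.0])},
                coords={"label": ("x", ["a", "a", "b", "b"])})

def summarize(group):
    # Hypothetical validity check: group "b" cannot produce a result
    if (group.v < 0).all():
        return group.mean() * np.nan   # NaN placeholder instead of skipping
    return group.mean()

out = ds.groupby("label").map(summarize)
# Drop the placeholder groups to get the reduced dataset
out = out.dropna("label", subset=["v"])
```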
Andrew Tolmie
@DancingQuanta
There has been work on explicit indexes. I wonder whether this reform will allow hierarchy/multiindex along a dimension?