Pete Pokrandt
@PTH1_twitter
I think there may be an error in the way that either netcdf-java or thredds is interpreting accumulated precip valid times. I know there's a weirdness already, that the NCEP GFS grib files have both a T0-now accumulation, and an either 3 or 6h accumulation (depending on if the valid time is divisible by 6 or not) and thredds/netcdf-java has no way of distinguishing between those two things, since they both use the same variable name and are valid at the same time. But..
Pete Pokrandt
@PTH1_twitter
I've been trying to plot accumulated precip valid at a time, using python pulling from a thredds server, where I am only including the T0 to now grib records in the file I'm pulling from, and the values were always wrong, when compared to a few other sources of plots. I think I just figured out that either thredds or netcdf-java is assuming the T0 to now accumulated precip is actually valid at the midpoint, e.g. (T0 to now)/2, because if I plot the precip at that time, it matches the other (gempak, etc) plots at T0-now. Has been driving me nuts for a long time.
Let me know if I can help demonstrate what I'm talking about or otherwise help get this fixed.. Not sure if it's thredds or netcdf-java - I suspect the latter.
Seems like although it has a time start and time end, it's not an average value over that time, but rather valid at the end time.
Sean Arms
@lesserwhirls
Greetings Pete! Can you point me to the problematic dataset? When you pick a time to use for the plot, are you pulling from just the time variable, or are you looking at the time variable's "bounds" attribute to get the full picture of the time associated with the interval? Have you tried looking at the data through the IDV? *shudders*
Sean Arms
@lesserwhirls

Unfortunately, NCSS does not allow requesting multiple levels. (Correct me if I’m wrong @lesserwhirls )

Dang...sorry for the delay here @PTH1_twitter @dopplershift ! Currently in 5.0, NCSS can only pull out a single level, and all the variables in the request need to be on the same vertical coordinate system. For example, in GRIB collections, you'll often see coordinates like isobaric1, isobaric2, height_above_ground, altitude_above_msl, etc...the variables would all need to use the same coordinate in order to subset a single level from the vertical (e.g. all using isobaric1 and not isobaric2). Also, the value you'd use in the request needs to be in the same units as the coordinate is defined, since the NCSS API does not support supplying a unit. So, for GFS 0.25 degree forecast temperatures on isobaric surfaces, we'd have pressure in Pa, and would request something like vertCoord=85000.
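A minimal sketch of the kind of NCSS request Sean describes, built with only the standard library. The dataset path here is hypothetical - substitute the real NCSS endpoint for your dataset - but it shows the single-level constraint: one variable set sharing one vertical coordinate, with vertCoord given in that coordinate's native units (Pa here, so 850 hPa becomes 85000).

```
from urllib.parse import urlencode

# Hypothetical NCSS endpoint -- substitute the real path for your dataset.
base = "https://thredds.ucar.edu/thredds/ncss/grib/NCEP/GFS/Global_0p25deg/Best"

# All requested variables must share one vertical coordinate (e.g. isobaric1),
# and vertCoord must be in that coordinate's native units -- no unit parameter exists.
params = {
    "var": "Temperature_isobaric",
    "vertCoord": 85000,  # 850 hPa expressed in Pa
    "time": "2019-11-30T18:00:00Z",
    "accept": "netcdf",
}
url = base + "?" + urlencode(params)
print(url)
```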

Pete Pokrandt
@PTH1_twitter
It happens with any of the GFS 1 deg data sets actually.. on thredds-test.unidata.ucar.edu, thredds.unidata.ucar.edu, thredds.aos.wisc.edu
I can shoot you the notebook I was using to test if that helps. I did notice that if I pull from thredds-test.unidata.ucar.edu and ask for data valid at what would be time 0 of the data set, it gives me the 3h precip values, but marked with a time of 1.5h. If I request the same from thredds.unidata.ucar.edu or thredds.aos it bails with a time out of bounds type error.
But on all of those, if I request data valid at 3h, I get data that looks exactly like the 6h gempak/pivotal/wxp/whatever plots. The max value from thredds at 3h is identical to the gempak max value at 6h too
Pete Pokrandt
@PTH1_twitter
Crap. Now I'm running the notebook again and not seeing the same behavior. Let me keep looking and get back to you again.. grr. been chasing this down for months..
Pete Pokrandt
@PTH1_twitter
No, that's not right. It's still wrong. If I request data from the 12 UTC run for 18 UTC (a 6h forecast), the time listed in the ncss subset is 6h, but the data actually matches the 12h forecast from gempak/pivotal/wxp/etc. thredds is assuming that the 6h time is in the middle of the begin/end period, rather than at the end.
It doesn't show in the notebook I'm testing with, but it does in a python plotting program. It's not the cleanest looking program but I can send it to you if it would help.
Sean Arms
@lesserwhirls
Ok, I think I see what the TDS is doing here. It looks like the way the NCSS handles mixed interval variables changed between 4.6 and 5.0, so that's one reason why you are seeing different behavior between thredds.ucar.edu and thredds.aos (thredds-test, thredds-dev, etc.). However, we'll focus on 5.0 here.
It all comes down to trying to pick which interval to use when a requested time is contained within more than one grid. For example, let's suppose that we have two grids for Total_precipitation_surface_Mixed_intervals_Accumulation valid at 2019-11-30 18Z, but one is a 6 hour accumulation and one is a 3 hour accumulation.
Now, let's say a user uses NCSS requesting a single time, 2019-11-30T18:00:00. What should the server do?
I think the line of thought was that the user supplied one date/time, and generally expects to receive one grid in return. If we go with that assumption, then the question becomes which grid does the server return - the one representing a 3 hour total or the one representing a 6 hour total?
What the server does is compute the mid-point value from the bounds of the two accumulation totals, and chooses the one whose mid-point is closest to the requested time (effectively the one with the smallest accumulation time period).
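The midpoint rule just described can be sketched in a few lines of plain Python, with intervals as (start, end) tuples standing in for the grids' time bounds. When the candidate intervals share an end time, the closest-midpoint rule always picks the narrowest one:

```
from datetime import datetime

# Two candidate accumulation intervals valid at the same end time, as (start, end):
intervals = [
    (datetime(2019, 11, 30, 12), datetime(2019, 11, 30, 18)),  # 6 h accumulation
    (datetime(2019, 11, 30, 15), datetime(2019, 11, 30, 18)),  # 3 h accumulation
]

requested = datetime(2019, 11, 30, 18)

def midpoint(interval):
    start, end = interval
    return start + (end - start) / 2

# Choose the interval whose midpoint is closest to the requested time --
# with a shared end time, this is the one with the smallest accumulation period.
chosen = min(intervals, key=lambda iv: abs(midpoint(iv) - requested))
print(chosen)  # the 3 h accumulation
```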
What if a time range is given instead of a single value? Well, NCSS bombs out and returns a message that basically says "not yet implemented". What should we do there? Probably an extension of the same idea for a single value. But is that right, or should we return any grid that overlaps with the interval requested?
Sean Arms
@lesserwhirls
What could help here, for both cases, is for the NCSS API to allow a user to specify an interval size in addition to the time, something like time=2019-11-30T18:00:00&timeInterval=PT6H, which would signal a 6 hour duration for the accumulation valid at 2019-11-30 18Z. Then, for a request with a time range, it would return any 6 hour accumulation intersecting the time_start and time_end parameters of the request. We'd still need to decide what to do by default if no timeInterval was specified in the request.
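To make the proposal concrete, here is what such a request's query string might look like. Note that timeInterval is Sean's proposed parameter - it does not exist in NCSS today - and the ISO 8601 duration PT6H is one plausible encoding of "a 6 hour interval":

```
from urllib.parse import urlencode

# timeInterval is a *proposed* NCSS parameter, not part of the current API.
params = {
    "var": "Total_precipitation_surface_Mixed_intervals_Accumulation",
    "time": "2019-11-30T18:00:00Z",
    "timeInterval": "PT6H",  # ISO 8601 duration: a 6 hour accumulation
    "accept": "netcdf",
}
query = urlencode(params)
print(query)
```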
Pete Pokrandt
@PTH1_twitter
That might do it. I'm trying to relate what thredds/ncss is doing to how you'd do it in GEMPAK. In GEMPAK, let's look at the 9h forecast time. There are two variables valid at that forecast hour which are accumulated precip. One is a 3h (6 to 9h) accumulation and the other is a total (time zero to 9h) accumulation. These both show up as valid at the 9h forecast time, but one is named P03M (that last M means millimeters) and the other is named P09M. Similarly, at, for example, 48h, there is one that is a 6h accumulation (42 to 48h - don't ask me why NCEP uses 6-hourly accumulations at 6/12/18... and 3h at 3/9/15/21...) and the other is the total (time zero to 48h) - both are valid at the 48h forecast time - one is named P06M and the other is P48M.
I understand that thredds is trying to be forgiving and give me some data at the time that I requested, but I would like a way to say give me this exact data at this exact time or give me an error.
I think being able to tell ncss the timeInterval might fix that - because I could ask for a Total_precipitation_surface_Mixed_intervals_Accumulation from the 12 UTC 29 Nov, 2019 data set, valid at 21 UTC (9h forecast), and either set timeInterval to 3h to get the 3h accumulation product, or timeInterval to 9h (I guess?) to get the 9h product. Or maybe there's a special time/word that would use the 0 to now accumulation? Is that kind of thing going to be really specific to this data?
Sean Arms
@lesserwhirls
Unfortunately, yes, a lot of this is specific to GRIB. However, I think this particular extension to NCSS is needed, otherwise you cannot uniquely request a known combination of valid time and time interval. The 0-valid time (run total) is a special case, and we'd want to be able to handle that. I've started a github issue at Unidata/tds#55 - let's see what ideas we can come up with.
Pete Pokrandt
@PTH1_twitter
Sounds good, thanks! I'm just glad I finally figured out what was happening. Was driving me nuts, because the plots I was making matched my wxp ones for pressure, height, temp, etc, but the precip was always wrong. I'm kind of surprised that people haven't come across this problem already. If you just request accumulated precip at a certain time from thredds, it's most likely incorrect..
Sean Arms
@lesserwhirls
For mixed interval variables in which there are multiple grids at the same valid time (but with differing interval widths), then it gets nasty for sure. If there are no overlaps, then the mid-point stuff should not come into play. My guess is people grab from OPeNDAP or cdmremote, although if they don't inspect the bounds attribute to see exactly what they are dealing with, then there will be issues as well. The funkiness of translating a collection of GRIB messages to something that looks like netCDF introduces all kinds of weird stuff. I don't think anyone would write a netCDF file with these kinds of mixed intervals on purpose, at least without a lot of soul searching.
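The bounds inspection Sean keeps recommending can be sketched without a server at all - plain lists below stand in for the netCDF time coordinate and the variable its bounds attribute points to (often named something like time_bnds; the name here is illustrative). It makes the trap explicit: the coordinate value is the interval midpoint, while the accumulation is actually valid at the interval's end:

```
# Stand-ins for the netCDF 'time' coordinate (hours since the run) and the
# bounds variable its 'bounds' attribute points to (name here is illustrative).
time = [1.5, 4.5, 7.5]                # midpoint-labeled coordinate values
time_bnds = [(0, 3), (3, 6), (6, 9)]  # true accumulation intervals

for t, (start, end) in zip(time, time_bnds):
    width = end - start
    # The labeled time sits at the interval midpoint, not at the valid end time:
    assert t == start + width / 2
    print(f"t={t}h labels the {start}-{end}h ({width} h) accumulation, valid at {end}h")
```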
agktogether
@agktogether
hello
is someone there?
@dcamron
i need help
agktogether
@agktogether
this link for today is not working
the link below is not working for today, 12 May 2020
or I can not configure this link query ... I used this data and it was working for 24 12 2019, but I can not get data today .. can someone help me?
what is the root and config page to set these variables to access data for today?
I am really confused
thanks
Sean Arms
@lesserwhirls
(I answered @agktogether via eSupport, but posting here in case others are wondering). The issue was that the NCEI server didn't have data for 12 May yet (only up to the 9th at this point). For the most recent output, https://thredds.ucar.edu/thredds/catalog/grib/NCEP/GFS/Global_0p5deg_ana/catalog.html should work (other servers are available, too).
agktogether
@agktogether
and I thanked you @lesserwhirls there, but I am posting here to show more appreciation for your answer
thanks for your fast answer, I was really surprised
hannah
@story645
Hi favorite people - I'm using the Unidata education gateway thing this summer (thanks @lesserwhirls ) and have like 4 students who are gonna be working on GOES-16/GOES-17, and was wondering what's the best way to have them not download that 4 times...also, does Unidata have a cloud version of the GFS MOS?
Ryan May
@dopplershift
We ourselves don't have any cloud MOS. Maybe Google? It just sounds familiar, but I could be making it up. Don't worry about them hitting it 4 times from us, that's fine. If you really care, you could have them access GOES-16/17 from the noaa-goes16 and noaa-goes17 S3 buckets
Dan Adriaansen
@DanielAdriaansen
I would like to get a version of Siphon that contains the fix in #291 - I am using Anaconda. Do I need to build from source from master, or is there an RC somewhere on conda-forge that I can grab that would include these changes? Thanks!
Ryan May
@dopplershift
Boy, we really need to make a release. :sob: In the meanwhile, if you have git available, you could do: `python -m pip install git+https://github.com/Unidata/siphon.git`
joleenf
@joleenf

I am currently trying to read Radar Level III storm track information from a THREDDS server using Siphon. One of the data variables is text, the second is a Structure. Is there anything I can do with that second variable? Are there any documentation online regarding these files?

```
from datetime import datetime

from siphon.catalog import TDSCatalog
from siphon.cdmr import Dataset
from siphon.radarserver import RadarServer

cat = TDSCatalog("http://thredds.ucar.edu/thredds/radarServer/catalog.xml")
url = cat.catalog_refs['NEXRAD Level III Radar from IDD'].href
rs = RadarServer(url)
query_latest = rs.query()
now = datetime.utcnow()
query_latest.lonlat_box(292.9375, 235.0625, 25.0625, 52.9375).time(now).variables('NST')

query_latest_cat = rs.get_catalog(query_latest)

data_available = list(query_latest_cat.datasets.values())
if len(data_available) > 0:
    print(data_available[0].access_urls['CdmRemote'])
    data_from_thredds = Dataset(data_available[0].access_urls['CdmRemote'])
    print(data_from_thredds)
else:
    print("Empty query")
```

Ryan May
@dopplershift
Well, one option is to try to use opendap, which translates the structure. I'm not sure if that's more helpful or not:
```
nc = data_available[0].remote_access(service='OPENDAP')
```
joleenf
@joleenf
Hi Ryan, I tried both using opendap with netcdf4 and with xarray. The second variable was available (created?) with netcdf4, and I am not sure xarray provided easier access to the variable, though perhaps it presents it differently. xarray creates five data variables: two are string ndarrays of shape (1,) and three are int16 of shape (1,). I am thinking that this dataset is comprised of table-like data...but I don't really know. I have not found format information for these files, though I am still looking.
joleenf
@joleenf
@dopplershift Looking a little closer at this through the various tools, siphon, netcdf4, xarray... I think that the problem is that the data in the file does not comply with cdm standards as far as I can tell. I will just try to get the weather and climate toolkit scripting working. Hopefully I can point that to an opendap server via a command line call.
Ryan May
@dopplershift
You can open these using MetPy:
```
from datetime import datetime

from metpy.io import Level3File
from siphon.catalog import TDSCatalog
from siphon.radarserver import RadarServer

cat = TDSCatalog("http://thredds.ucar.edu/thredds/radarServer/catalog.xml")
url = cat.catalog_refs['NEXRAD Level III Radar from IDD'].href
rs = RadarServer(url)
query_latest = rs.query()
now = datetime.utcnow()
query_latest.stations('GRR').time(now).variables('NST')
query_latest_cat = rs.get_catalog(query_latest)
data_available = list(query_latest_cat.datasets.values())
f = Level3File(data_available[0].remote_open())
```
The internal data structure isn't great and is pretty low level, but for these products, f.sym_block[0] should give you a collection of what's in the file, in order.
joleenf
@joleenf
@dopplershift Actually, f.sym_block[0] is very helpful. It at least provides a dictionary so it is easier to work with.
Dan Adriaansen
@DanielAdriaansen

Boy we really need to make a release. :sob: In the meanwhile, if you have git available, you could do: python -m pip install git+https://github.com/Unidata/siphon.git

Thank you so much for this suggestion!

Ryan May
@dopplershift
No problem!