Pete Pokrandt
@PTH1_twitter
Ok, I didn't think so, but wasn't sure if I was missing something
I was thinking I could do one ncss call and grab all of the data subset that I need for all of my maps in one call, but I guess not
Zach Bruick
@zbruick
I just tried playing around with it and it seems like if you try to add a vertical_level call to each variable you request, it just overrides it with the last level specified, or errors out if the variables don't share the same vertical coordinate name.
Mike
@mikebelanger
hey guys, I'm trying to perform a time query on UCAR's gfs forecast archives
ideally, I'd like to be able to specify any datetime once, and get the nearest grib2 file to that time.
but so far, I've had to pass in the year, month and day into a catalog URL, in order to get the grib2
for example: cat = TDSCatalog('https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/2017/20171001/')
if I try something more general, like cat = TDSCatalog('https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/') I don't get any datasets
Mike
@mikebelanger
if I try something like the time series tutorial, like cat = TDSCatalog('https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/' '?dataset=files/g/ds084.1/2017')
like adding the dataset query parameter - I still get no datasets
Ryan May
@dopplershift
Try this:
cat = TDSCatalog('https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/catalog.xml')
cat.catalog_refs
In that top-level catalog, there are not any datasets, but there are references to other catalogs, which are available from the .catalog_refs attribute.
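A rough sketch of walking down from a top-level catalog through its references until datasets turn up. The helper names `iter_datasets` and `list_rda_gfs` are made up for illustration; the only siphon pieces assumed are `TDSCatalog`, its dict-like `.datasets` and `.catalog_refs`, and `.follow()` on each catalog ref. Every level followed is another request to the server, so the depth is capped.

```python
def iter_datasets(cat, max_depth=2):
    """Yield (name, dataset) pairs from `cat` and the catalogs it references.

    `cat` is expected to behave like siphon's TDSCatalog: `.datasets` and
    `.catalog_refs` are dict-like, and each catalog ref has a `.follow()`
    method that returns the child catalog.
    """
    for name, ds in cat.datasets.items():
        yield name, ds
    if max_depth > 0:
        for ref in cat.catalog_refs.values():
            # Each follow() fetches one more catalog from the server
            yield from iter_datasets(ref.follow(), max_depth - 1)


def list_rda_gfs(max_depth=2):
    """Usage sketch: needs siphon and network access, so it is not called here."""
    from siphon.catalog import TDSCatalog
    top = TDSCatalog('https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/catalog.xml')
    return [name for name, _ in iter_datasets(top, max_depth)]
```

For a large archive it is probably kinder to the server to follow only the one reference you need (a year, then a day) instead of walking every branch.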
Amelia
@apottr
hi all!! I was wondering if anyone knew if there was a reason for the UWYO data output of siphon to not include some of the columns (relative humidity, mixing ratio, etc)?
Ryan May
@dopplershift
@apottr I think because the other parameters are derived from the reported data, and for our original uses we would just use MetPy to calculate. Feel free to open an issue if having these values would help you...or even better I’d merge a PR doing it.
Mike
@mikebelanger
@dopplershift thanks - so to get a grib2 in this case, I'd have to first query the top level as you've demonstrated, get a particular catalog_ref, and then re-query with the catalog ref added on? Then repeat until I get all the way 'down the tree'?
Ryan May
@dopplershift
@mikebelanger Pretty much, unfortunately, since the RDA does not have the data set up (as far as I can tell) using a TDS GRIB Collection. Another option for you is to programmatically generate the URL, like:
from datetime import datetime
base_url = 'https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/'
dt = datetime(2017, 10, 1)
url = base_url + f'{dt:%Y/%Y%m%d}'
Not completely sure I got the syntax exactly right, but hopefully the idea comes across
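Building on the URL idea above, here is a minimal sketch that takes any datetime, snaps it to the nearest model cycle, and builds the per-day catalog URL. It assumes the archive holds four cycles a day (00, 06, 12, 18Z) and that day catalogs live at `<year>/<yyyymmdd>/catalog.xml`; `nearest_cycle` and `day_catalog_url` are hypothetical helper names, not part of siphon.

```python
from datetime import datetime, timedelta

BASE_URL = 'https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/'


def nearest_cycle(dt, cycle_hours=6):
    """Round `dt` to the nearest model cycle (assumed every 6 h from 00Z)."""
    day_start = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    hours = (dt - day_start).total_seconds() / 3600
    # int(x + 0.5) rounds halves up, so 03:00 maps to the 06Z cycle
    n = int(hours / cycle_hours + 0.5)
    return day_start + timedelta(hours=n * cycle_hours)


def day_catalog_url(dt):
    """Build the per-day catalog URL for the date of `dt`."""
    return f'{BASE_URL}{dt:%Y/%Y%m%d}/catalog.xml'


cycle = nearest_cycle(datetime(2017, 10, 1, 22, 30))  # rolls over to 00Z the next day
url = day_catalog_url(cycle)
```

Picking the actual grib2 file out of that day's catalog still depends on the archive's file naming, which is not covered here.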
Mike
@mikebelanger
ah ok - yeah no probs. I've been doing something similar to the snippet you posted. Some time stuff and string interpolation is currently what I'm doing.
I just wanted to make sure I wasn't missing some other method, or some archive feature
Amelia
@apottr
@dopplershift I figured that was the case. I'll put something together and submit a PR.
Mike
@mikebelanger
hey guys I'm having an issue accessing RDA's archives. An issue similar to here: GIS4WRF/gis4wrf#25. Basically I'm getting a HTTPSConnectionPool(host='rda.ucar.edu', port=443): Max retries exceeded with url: /thredds/catalog/files/g/ds084.1/2019/20190224/catalog.xml (Caused by NewConnectionError('<urllib3.connection.VerifiedHTTPSConnection object at 0x7fec30e04650>: Failed to establish a new connection: [Errno 110] Connection timed out',))
I'm getting that error from running a script on my remote machine.
I can still access that url from my local laptop
so it's probably something to do with IP configuration?
Mike
@mikebelanger
hmm, looking more into this issue - I've found the following e-mail thread: https://www.unidata.ucar.edu/support/help/MailArchives/thredds/msg02351.html
The email from Unidata THREDDS Support, sent on Tuesday, April 18, 2017 11:36 AM, references the above error
judging from the rest of the e-mail exchange - the problem is related to an overload of the server
if this is a THREDDS issue - I can stop reporting it here, and contact support
but I want to make sure I'm using siphon properly
Ryan May
@dopplershift

I think it was just a server issue before. I'm having no problems with:

cat = TDSCatalog('https://rda.ucar.edu/thredds/catalog/files/g/ds084.1/2019/20190224/catalog.xml')

right now.

Mike
@mikebelanger
@dopplershift hey thanks for the response. I have to confess - I'm doing lots of queries - the above url was just one example.
Ryan May
@dopplershift
@mikebelanger Well I'd say that one example shows that there's nothing inherently wrong in the way you're using siphon itself. If you're angering the server somehow because you're doing too many, too quickly, well... :wink:
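If the timeouts do come from sending many requests in quick succession, one common way to ease off is to retry with a growing pause between attempts. This is a generic sketch, not something siphon provides: `call_with_backoff` is a hypothetical helper, and it catches every exception for brevity where real code would likely narrow that to connection errors.

```python
import time


def call_with_backoff(func, retries=4, first_delay=1.0, sleep=time.sleep):
    """Call `func()`; on failure wait, doubling the delay each time, then retry."""
    delay = first_delay
    for attempt in range(retries + 1):
        try:
            return func()
        except Exception:
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(delay)
            delay *= 2
```

With siphon this could wrap a catalog request, for example `call_with_backoff(lambda: TDSCatalog(url))`. Adding a short fixed pause between successful requests as well would spread the load further.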
Pete Pokrandt
@PTH1_twitter
I think there may be an error in the way that either netcdf-java or thredds is interpreting accumulated precip valid times. I know there's a weirdness already, that the NCEP GFS grib files have both a T0-now accumulation, and an either 3 or 6h accumulation (depending on if the valid time is divisible by 6 or not) and thredds/netcdf-java has no way of distinguishing between those two things, since they both use the same variable name and are valid at the same time. But..
Pete Pokrandt
@PTH1_twitter
I've been trying to plot accumulated precip valid at a time, using python pulling from a thredds server, where I am only including the T0 to now grib records in the file I'm pulling from, and the values were always wrong, when compared to a few other sources of plots. I think I just figured out that either thredds or netcdf-java is assuming the T0 to now accumulated precip is actually valid at the midpoint, e.g. (T0 to now)/2, because if I plot the precip at that time, it matches the other (gempak, etc) plots at T0-now. Has been driving me nuts for a long time.
Let me know if I can help demonstrate what I'm talking about or otherwise help get this fixed.. Not sure if it's thredds or netcdf-java - I suspect the latter.
Seems like although it has a time start and time end, it's not an average value over that time, but rather valid at the end time.
Sean Arms
@lesserwhirls
Greetings Pete! Can you point me to the problematic dataset? When you pick a time to use for the plot, are you pulling from just the time variable, or are you looking at the time variable's "bounds" attribute to get the full picture of the time associated with the interval? Have you tried looking at the data through the IDV (shudders)?
Sean Arms
@lesserwhirls

Unfortunately, NCSS does not allow requesting multiple levels. (Correct me if I’m wrong @lesserwhirls )

Dang...sorry for the delay here @PTH1_twitter @dopplershift ! Currently in 5.0, NCSS can only pull out a single level, and all the variables in the request need to be on the same vertical coordinate system. For example, in GRIB collections, you'll often see coordinates like isobaric1, isobaric2, height_above_ground, altitude_above_msl, etc...the variables would all need to use the same coordinate in order to subset a single level from the vertical (e.g. all using isobaric1 and not isobaric2). Also, the value you use in the request needs to be in the same units as the coordinate is defined in, as the NCSS API does not support supplying a unit. So, for GFS 0.25 degree forecast temperatures on isobaric surfaces, we'd have pressure in Pa, and would request something like vertCoord=85000.
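To make the constraints above concrete, here is a small sketch that builds an NCSS query string for one level and refuses variables on different vertical coordinates. `ncss_query` and the `coord_of` mapping are hypothetical, and the variable-to-coordinate pairs in the usage note are illustrative rather than read from a real dataset; `var`, `vertCoord`, `time` and `accept` are the NCSS request parameters.

```python
from urllib.parse import urlencode


def ncss_query(variables, vert_coord, time, coord_of=None):
    """Build an NCSS query string for a single vertical level.

    `coord_of` optionally maps variable name -> vertical coordinate name so
    that mixed coordinates are caught before the request is sent.
    `vert_coord` must already be in the coordinate's own units (e.g. Pa).
    """
    if coord_of is not None:
        coords = {coord_of[v] for v in variables}
        if len(coords) > 1:
            raise ValueError(
                f'variables use different vertical coordinates: {sorted(coords)}')
    params = [('var', ','.join(variables)),
              ('vertCoord', vert_coord),
              ('time', time),
              ('accept', 'netcdf')]
    # Keep commas and colons readable instead of percent-encoding them
    return urlencode(params, safe=',:')


query = ncss_query(['Temperature_isobaric'], 85000, '2019-11-30T18:00:00Z')
```

Passing something like `coord_of={'Temperature_isobaric': 'isobaric1', 'Some_other_var': 'isobaric2'}` with both variables would raise a `ValueError` up front instead of an error from the server.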

Pete Pokrandt
@PTH1_twitter
It happens with any of the GFS 1 deg data sets actually.. on thredds-test.unidata.ucar.edu, thredds.unidata.ucar.edu, thredds.aos.wisc.edu
I can shoot you the notebook I was using to test if that helps. I did notice that if I pull from thredds-test.unidata.ucar.edu and ask for data valid at what would be time 0 of the data set, it gives me the 3h precip values, but marked with a time of 1.5h. If I request the same from thredds.unidata.ucar.edu or thredds.aos it bails with a time out of bounds type error.
But on all of those, if I request data valid at 3h, I get data that looks exactly like the 6h gempak/pivotal/wxp/whatever plots. The max value from thredds at 3h is identical to the gempak max value at 6h too
Pete Pokrandt
@PTH1_twitter
Crap. Now I'm running the notebook again and not seeing the same behavior. Let me keep looking and get back to you again.. grr. been chasing this down for months..
Pete Pokrandt
@PTH1_twitter
No, that's not right. It's still wrong. If I request data from the 12 UTC run for 18 UTC (a 6h forecast), the time listed in the ncss subset is 6h but the data actually matches the 12h forecast from gempak/pivotal/wxp/etc. thredds is assuming that the 6h time is in the middle of the begin/end periods, rather than at the end.
It doesn't show in the notebook I'm testing with, but it does in a python plotting program. It's not the cleanest looking program but I can send it to you if it would help.
Sean Arms
@lesserwhirls
Ok, I think I see what the TDS is doing here. It looks like the way the NCSS handles mixed interval variables changed between 4.6 and 5.0, so that's one reason why you are seeing different behavior between thredds.ucar.edu and thredds.aos (thredds-test, thredds-dev, etc.). However, we'll focus on 5.0 here.
It all comes down to trying to pick which interval to use when a requested time is contained within more than one grid. For example, let's suppose that we have two grids for Total_precipitation_surface_Mixed_intervals_Accumulation valid at 2019-11-30 18Z, but one is a 6 hour accumulation and one is a 3 hour accumulation.
Now, let's say a user uses NCSS requesting a single time, 2019-11-30T18:00:00. What should the server do?
I think the line of thought was that the user supplied one date/time, and generally expects to receive one grid in return. If we go with that assumption, then the question becomes which grid does the server return - the one representing a 3 hour total or the one representing a 6 hour total?
What the server does is compute the mid-point value from the bounds of the two accumulation totals, and chooses the one whose mid-point is closest to the requested time (effectively the one with the smallest accumulation time period).
What if a time range is given instead of a single value? Well, NCSS bombs out and returns a message that basically says "not yet implemented". What should we do there? Probably an extension of the same idea for a single value. But is that right, or should we return any grid that overlaps with the interval requested?
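The mid-point rule described above for a single requested time can be sketched in a few lines. This is only an illustration of the described behavior, not the actual TDS code, and `pick_interval` is a made-up name. The second case is an assumption: it applies the same rule to a collection holding only run-start ("T0 to now") accumulations, which would be consistent with an 18Z request returning the 12 hour total.

```python
from datetime import datetime, timedelta


def pick_interval(intervals, requested):
    """Return the (start, end) interval whose mid-point is closest to `requested`."""
    def distance(interval):
        start, end = interval
        midpoint = start + (end - start) / 2
        return abs(midpoint - requested)
    return min(intervals, key=distance)


run = datetime(2019, 11, 30, 12)
requested = datetime(2019, 11, 30, 18)

# Case 1: mixed intervals, both ending at 18Z
six_hour = (datetime(2019, 11, 30, 12), requested)    # mid-point 15:00Z
three_hour = (datetime(2019, 11, 30, 15), requested)  # mid-point 16:30Z
mixed_choice = pick_interval([six_hour, three_hour], requested)

# Case 2 (assumed): only accumulations that start at the run time
from_run_start = [(run, run + timedelta(hours=h)) for h in (3, 6, 9, 12)]
run_start_choice = pick_interval(from_run_start, requested)
```

In the first case the 3 hour grid wins because its mid-point is nearer to 18Z. In the second, the 12Z to 00Z interval has its mid-point exactly at 18Z, so the 12 hour accumulation is selected.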