    Chelle Gentemann
    @cgentemann
    I think this is fine as is --- just want to demonstrate the fast access ---
    for you Chelle, only for you
    Chelle Gentemann
    @cgentemann
    oh rich, you are the best friend ever!
    Lucas Sterzinger
    @lsterzinger
    That's a super fast TTS! (Time To Sun)
    Chelle Gentemann
    @cgentemann
    so i thought of a name for what you all are working on --- how about zarr-meta? it explains what you are doing & doesn't use the term cloud-performant, which lucas knows I just love...
    Chelle Gentemann
    @cgentemann
    hey - do you all want to do a write up of this for AGU's Earth and Space Science? I would be your editor! I was just on a call this morning where we said we want to publish more papers that include jupyter notebooks, and an article like this would fit into the ESS portfolio really well. this would be a great short paper on the process, its implications, and then notebooks showing different data examples. I think this would be really valuable.....ESS is open access.
    Lucas Sterzinger
    @lsterzinger
    I'm happy to be involved in that @cgentemann
    Martin Durant
    @martindurant
    A nice calculation would be a correlation versus time; presumably shorter wavelengths lead longer wavelengths. I think that was the original ask.
    By the way, FITS is special, I could subdivide the frames on the biggest axis, if that would prove something interesting.
    Chelle Gentemann
    @cgentemann
    okay---- is it possible to get this to work? JWST created a yaml for science files called ASDF: https://asdf.readthedocs.io/en/stable/
    Chelle Gentemann
    @cgentemann
    okay - I showed one friend but told him not to share until it is all written up --- he was like OMG they get to keep FITS. omg omg omg! no more arguing!
    Chelle Gentemann
    @cgentemann
    i knew I started a draft. maybe this can be a starting point? https://docs.google.com/document/d/1O2dPeB1smArHg62XcNOxwwEpWDwdUiWIn09XS-fr4tc/edit
    I'm particularly interested to see what is written in the section "Describe FSSPEC without saying FSSPEC"
    Martin Durant
    @martindurant
    I am aware of ASDF, the astro-specific “modern” format (but not cloud-friendly!) that has failed to gain traction for so long. But there are many hurdles to the references method being a viable entry to FITS for astro analysis. In order, most important first:
    • the majority of FITS files are whole-file compressed, and this only works for uncompressed. We could maybe cope with bzip2, or even better zstd, but gzip is the one that’s used, and it is probably impossible
    • the astropy stack and other tooling assumes FITS and does not integrate with xarray. It could be sold as a way towards using Dask, which they would want to do (cf https://docs.sunpy.org/projects/ndcube/en/stable/index.html for solar, i.e., this type of data)
    • astro coordinates are hard to use in the xarray model (this has been discussed extensively in pydata/xarray#3620 , still unresolved).
    (but ASDF would work well as a target for referencing! https://asdf-standard.readthedocs.io/en/1.6.0/file_layout.html )
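    A minimal sketch of the first hurdle above, using only the Python standard library. A reference is just an (offset, length) pair into the stored bytes, so it works on uncompressed data but lands on meaningless bytes once the whole file is gzipped. The header/block layout and the names `raw`, `ref_b` and `read_range` are made up for illustration; they are not real FITS structures.

```python
import gzip

# Hypothetical stand-in for a file: a padded header followed by two data blocks.
header = b"HDR".ljust(16, b" ")
block_a = bytes(range(0, 32))
block_b = bytes(range(100, 132))
raw = header + block_a + block_b

# A "reference" to block_b: (offset, length) into the uncompressed bytes.
ref_b = (len(header) + len(block_a), len(block_b))


def read_range(buf, offset, length):
    """Mimic a ranged read: fetch `length` bytes starting at `offset`."""
    return buf[offset:offset + length]


# Uncompressed: the byte range is the data.
direct = read_range(raw, *ref_b)

# Whole-file gzip: the same offsets now index into the compressed stream.
compressed = gzip.compress(raw)
wrong = read_range(compressed, *ref_b)

# The block only comes back after decompressing the stream from its start.
recovered = read_range(gzip.decompress(compressed), *ref_b)
```

    Here `direct` and `recovered` both equal `block_b`, while `wrong` does not. The last step is the expensive one: the stream has to be decompressed from the beginning before the offset means anything.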
    Chelle Gentemann
    @cgentemann
    omg. you are killing me. they don't use internal compression?
    this is really helpful martin, thanks. if we can demo it working, that would be a win, and we can work more with nasa --- if we can show any advancement for data access - just maybe by uncompressing the gzip files after pushing to AWS, maybe that would be a path forward? the astro community is so FITS FITS FITS.... just like I was binary binary binary 30 years ago....
    Martin Durant
    @martindurant
    So if I did that same dataset, but with
    • the wavelength as a coordinate (so you can select it with a slider)
    • mapping to helio lat/lon (or just show how you can recreate the world coordinates)
    • a version with sub-selection in one of the dimensions, so you can do a timeseries on a section of the image more efficiently
    … is that enough of an argument?
    Should I do the same for the much more massive set of downsampled JPEG images on the public SDO server?
    I was hoping to get around to the dataset in intake/fsspec-reference-maker#78 , as a nice example of a dataset made by merging on multiple dimensions.
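    The sub-selection idea in the list above can be sketched as plain byte-range arithmetic. This assumes the array is uncompressed, contiguous, and split along its slowest-varying axis (axis 0 in C order), so each part is one contiguous range. The function name `subchunk_refs`, the URL and the image shape are hypothetical; 2880 is used as the data offset because FITS headers come in 2880-byte blocks.

```python
def subchunk_refs(url, data_offset, shape, itemsize, n_parts):
    """Split a contiguous C-ordered array into n_parts byte ranges along axis 0.

    Returns a list of [url, offset, length] entries, one per part.
    """
    rows = shape[0]
    if rows % n_parts:
        raise ValueError("axis 0 must divide evenly into n_parts")
    # Bytes taken by one index step along axis 0.
    row_bytes = itemsize
    for n in shape[1:]:
        row_bytes *= n
    part_bytes = (rows // n_parts) * row_bytes
    return [[url, data_offset + i * part_bytes, part_bytes] for i in range(n_parts)]


# Hypothetical 4096 x 4096 image of 4-byte values, data starting after one header block.
refs = subchunk_refs("s3://example-bucket/image.fits", 2880, (4096, 4096), 4, 4)
```

    Each entry in `refs` covers a quarter of the rows, so a timeseries over one strip of the image only needs to fetch that strip from each file.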
    Rich Signell
    @rsignell-usgs
    So cool that ReferenceFileSystem is now featuring geotiff!
    https://github.com/intake/fsspec-reference-maker/issues/78#issuecomment-924456900
    Martin Durant
    @martindurant
    Perhaps more importantly: that dataset has FIVE dimensions, three aggregated (and two-dimensional images in each chunk)
    Rich Signell
    @rsignell-usgs
    Yeah, that is also very cool
    @martindurant , did you get contacted by Ivelina Momcheva from the Space Telescope Science Institute? I spoke with her a few days ago and she is very interested in your FITS work (and in Pangeo in general).
    Martin Durant
    @martindurant
    I did not
    Martin Durant
    @martindurant
    OK, I will speak with her this afternoon.
    Rich Signell
    @rsignell-usgs
    Awesome!
    Chelle Gentemann
    @cgentemann
    this is great!!!
    Chelle Gentemann
    @cgentemann
    is anyone else going to Ocean Sciences? This looks like a good session that we should submit this project to: https://www.aslo.org/osm2022/scientific-sessions/#od
    Lucas Sterzinger
    @lsterzinger
    I wasn't planning on it since I'm already presenting this at AGU, but if you are able to send me to Honolulu I certainly won't complain ;)
    Chelle Gentemann
    @cgentemann
    kk working on invited slot
    start working on an abstract!
    deadline is tomorrow midnight
    Chelle Gentemann
    @cgentemann
    okay - ken knows your abstract is coming in - he said you have to organize it after submission - & he will look for yours. submit to : OD12 Big Data for a Big Ocean 2022
    Lucas Sterzinger
    @lsterzinger
    @rsignell-usgs do you have an up-to-date NWM + intake catalog example notebook? I'm presenting fsspec-reference-maker at a geospatial workshop on campus on Monday and thought it might be a cool thing to show off
    You could spiff this up with a dask cluster of course
    Lucas Sterzinger
    @lsterzinger
    Thank you!
    Lucas Sterzinger
    @lsterzinger
    Should we make this repo open for Hacktoberfest PRs? Just need to add the "hacktoberfest" topic to the repo https://hacktoberfest.digitalocean.com/resources/maintainers
    Chelle Gentemann
    @cgentemann
    YES!
    Lucas Sterzinger
    @lsterzinger
    (also it's one of the only repos that I actually contribute to often, and I'd like a t-shirt :wink: )
    Rich Signell
    @rsignell-usgs
    There is a repo for this?
    or do you mean the pangeo gallery?
    Lucas Sterzinger
    @lsterzinger
    fsspec-reference-maker
    Rich Signell
    @rsignell-usgs
    ah!
    Martin Durant
    @martindurant
    hacktoberfest can create a lot of noise, but I’ve never been involved before, prepared to give it a go. What do I need to do?
    if we can organize some more slides based on lucas's that introduce the project, then a few more on where we want to go & what we need to work on, we should be able to get a couple of volunteers (paid) from nasa's impact project.
    Lucas Sterzinger
    @lsterzinger
    I translated some of my slides into markdown for a workshop I'm giving tomorrow, feel free to use https://github.com/lsterzinger/maptimedavis-fsspec/blob/main/01-Create_References.ipynb
    Martin Durant
    @martindurant
    For strictly passer-by hackers, I would suggest that they investigate how to get byte offsets into any other file formats that they have lying around.
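    As a rough idea of what that investigation looks like, here is a sketch for an invented toy format (a 4-byte magic, then records of a 4-byte big-endian length followed by the payload). The scanner walks the file once and records where each payload lives. The output layout (`key: [url, offset, length]` under `refs`) is loosely modelled on the reference JSON this project produces and should be treated as an assumption; the format, the keys and the URL are all hypothetical.

```python
import struct

MAGIC = b"TOY1"  # invented magic bytes for the toy format


def write_toy(payloads):
    """Build a toy file: magic, then (4-byte big-endian length, payload) records."""
    out = bytearray(MAGIC)
    for p in payloads:
        out += struct.pack(">I", len(p)) + p
    return bytes(out)


def scan_offsets(buf, url):
    """Walk the records and note the byte offset and length of each payload."""
    if buf[:4] != MAGIC:
        raise ValueError("not a toy file")
    refs = {}
    pos = 4
    index = 0
    while pos < len(buf):
        (length,) = struct.unpack_from(">I", buf, pos)
        pos += 4  # skip the length prefix; pos now points at the payload
        refs[f"data/{index}"] = [url, pos, length]
        pos += length
        index += 1
    return {"version": 1, "refs": refs}


payloads = [b"alpha", b"bravo-bravo", b"c"]
blob = write_toy(payloads)
references = scan_offsets(blob, "s3://example-bucket/file.toy")
```

    Slicing `blob[offset:offset + length]` for any entry in `references["refs"]` returns the original payload without parsing the rest of the file. Real formats need their own scanner, but the output is the same kind of table.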