Kevin Paul
@kmpaul
@tinaok Also, if you are planning on attending the Pangeo community meeting remotely, you might be able to present some of your benchmarking work in a Lightning Talk. It would be a very short (and remote) presentation, but it might be possible. @jhamman would know for sure.
Kevin Paul
@kmpaul
@tinaok Ok. I've looked more closely at the benchmarking code. (@andersy005 may have something more to say about this, but I'll take a stab myself.) The design of the current code assumes 1 chunk per worker (see benchmarks/datasets.py), and it therefore assumes that the total dataset size will be equal to the (chunk size) * (number of nodes) * (number of workers per node).
This may not be optimal, but I think this can be amended in later versions.
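A minimal sketch of the sizing rule described above, assuming one chunk per worker. The function name, the megabyte units, and the example numbers are hypothetical and are not taken from benchmarks/datasets.py:

```python
def total_dataset_size(chunk_size_mb, num_nodes, workers_per_node):
    """Total dataset size when every worker holds exactly one chunk.

    Hypothetical helper for illustration; not the benchmarking code itself.
    """
    return chunk_size_mb * num_nodes * workers_per_node


# Illustrative numbers only: 64 MB chunks, 4 nodes, 8 workers per node.
print(total_dataset_size(chunk_size_mb=64, num_nodes=4, workers_per_node=8))  # 2048
```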
Kevin Paul
@kmpaul
The thought behind the compute_study.md writeup was to sketch out what would be needed to generate some preliminary scaling studies for various "common" operations done with xarray and dask. The results of each study should be a plot of "Number of Nodes" vs "Operation Runtime". However, the "Operation Runtime" depends on much more than "Number of Nodes", including "Number of Workers per Node", "Number of Threads per Worker", "Total Number of Chunks", "Chunk Size", etc.
I wanted to consider 2 kinds of studies, strong and weak, since these are considered "canonical" in the HPC world. In the strong scaling studies, the "Total Data Size" should be fixed while the "Number of Nodes" is varied. In the weak scaling studies, the "Data Size per Node" should be fixed while the "Number of Nodes" is varied.
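The difference between the two kinds of study can be sketched as follows. The function names, units, and node counts are assumptions made for illustration, not part of compute_study.md:

```python
def strong_scaling_plan(total_size, node_counts):
    """Strong scaling: total data size is fixed, so per-node size shrinks."""
    return {n: {"total": total_size, "per_node": total_size / n} for n in node_counts}


def weak_scaling_plan(size_per_node, node_counts):
    """Weak scaling: per-node data size is fixed, so total size grows."""
    return {n: {"total": size_per_node * n, "per_node": size_per_node} for n in node_counts}


nodes = [1, 2, 4, 8]  # hypothetical node counts
strong = strong_scaling_plan(total_size=1024, node_counts=nodes)
weak = weak_scaling_plan(size_per_node=128, node_counts=nodes)

print(strong[8])  # {'total': 1024, 'per_node': 128.0}
print(weak[8])    # {'total': 1024, 'per_node': 128}
```

In either plan, the measured quantity would be the operation runtime at each node count, with the remaining parameters (workers per node, threads per worker, chunk size) held fixed.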
Kevin Paul
@kmpaul
When I was coming up with the compute_study.md document, I tried to find a way to fix all of the other parameters in a way such that each study was "fair." I chose 1 "Chunk per Worker" and 1 "Thread per Worker", and I chose to vary the "Chunk Size" and "Number of Workers per Node". And later we tried to come up with a way of varying the "Chunking Scheme" (i.e., chunk over all dimensions, chunk over only spatial dimensions, chunk over only time), too. But we need to generate data that looks at how these numbers vary with "Chunks per Worker" and "Threads per Worker", too.
Joe Hamman
@jhamman
@/all - the agenda and attendee list for next week’s community meeting in Seattle is now final. See details here: http://pangeo.io/meetings/2019_summer-meeting.html#
Remote participation details are also available on this page. @kmpaul - we’ll probably need to wait and see if remote lightning talks will work. We may need a proxy presenter.
Daniel Rothenberg
@darothen
Such an awesome agenda... very much looking forward to participating remotely as much as possible!
Kevin Paul
@kmpaul
@jhamman Thanks for the info regarding remote lightning talks.
@tinaok If you want to present something, and you need a proxy presenter, I'll do it for you.
Scott Henderson
@scottyhq
thought folks on this channel might be interested in this job opening at JPL https://jpl.jobs/jobs/2019-10892-Big-Data-Software-Lead
Scott
@scollis
Wow.. that is quite the requirements ;)
James A. Bednar
@jbednar
Good luck!
David Brochart
@davidbrochart
Is anyone else experiencing the "Error displaying widget" message with, e.g., the dask_kubernetes.KubeCluster widget (or any other widget)? It looks like this is related to ipywidgets==7.5. I have a Pangeo environment with jupyterlab=0.35, tornado=5.1.1 and dask_labextension==0.3.3, because I noticed that it was a working configuration at some point, but I'm not sure this is still the recommended configuration.
Matthew Rocklin
@mrocklin

I get in to Seattle a bit early (leaving tonight). I plan to mostly work on HPC deployments M/Tu if anyone wants to join.

Also, do folks want to meet up for drinks Tuesday night? I imagine that people will be arriving then.

Ryan Abernathey
@rabernat
@mrocklin - I would enjoy meeting up post-dinner on Tuesday night, preferably in the Eastlake or Capitol Hill area. @jhamman will also be around.
Ryan Abernathey
@rabernat
Question for the kubernetes folks--where is the log that tells me who has been logging into the jupyterhubs?
Joe Hamman
@jhamman
@rabernat - you can look at the hub log: kubectl logs -n ocean-prod hub-7d8c8db558-95ntv | grep "Adding user "
We are also logging this to BigQuery: https://console.cloud.google.com/bigquery?d=dev_pangeo_io_usage_metering&p=pangeo-181919&page=dataset&project=pangeo-181919 but I haven’t looked at these logs yet.
Joe Hamman
@jhamman
@rabernat - okay if I fix up the cmip6 examples binder?
Joe Hamman
@jhamman
actually, I just did it. Two issues for you to peek at: https://github.com/pangeo-data/pangeo-cmip6-examples/issues
Ryan Abernathey
@rabernat
Can someone tell me why dashes in usernames get encoded as -2d when used in kubernetes pod names?
Joe Hamman
@jhamman
This has always been a great mystery to me.
2d is the hex representation of an ASCII "-", but that still doesn’t explain why k8s puts the hex code in the pod name.
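One plausible explanation, offered as an assumption rather than something confirmed in this thread: if the code that builds pod names uses "-" as its escape character, then a literal "-" in a username has to be escaped too, and it becomes "-2d". The sketch below is a hypothetical illustration of that scheme, not the actual implementation:

```python
import string

# Assumption: only lowercase letters and digits pass through unescaped.
SAFE_CHARS = set(string.ascii_lowercase + string.digits)


def escape_name(name, escape_char="-"):
    """Replace each unsafe character with escape_char plus its hex byte value."""
    pieces = []
    for ch in name:
        if ch in SAFE_CHARS:
            pieces.append(ch)
        else:
            # Escape every UTF-8 byte of the character as two hex digits.
            for byte in ch.encode("utf-8"):
                pieces.append(f"{escape_char}{byte:02x}")
    return "".join(pieces)


print(escape_name("jane-doe"))  # jane-2ddoe
```

Because the escape character is itself unsafe, the mapping stays reversible: every "-" in the output starts a two-digit hex code.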
Ryan Abernathey
@rabernat
yup ok
it's fun to have data!
Joe Hamman
@jhamman
Are you looking at the logs?
Ryan Abernathey
@rabernat
yup
everything has to be parsed out of kubernetes API calls
Here are the different API method calls
method_name
io.k8s.core.v1.pods.attach.create              8
io.k8s.core.v1.pods.binding.create        291582
io.k8s.core.v1.pods.create                 66682
io.k8s.core.v1.pods.delete                656336
io.k8s.core.v1.pods.eviction.create       312632
io.k8s.core.v1.pods.exec.create              204
io.k8s.core.v1.pods.get                      109
io.k8s.core.v1.pods.portforward.create      3974
io.k8s.core.v1.pods.status.patch              17
io.k8s.core.v1.services.proxy.get              6
what is a binding and what is an eviction?
how do binding.create and eviction.create differ from just create?
Joe Hamman
@jhamman
I wonder if @betatim or @jacobtomlinson have looked this far into the k8s wormhole yet?
I do see that there are roughly equal parts create & delete calls so it must be something to do with the type of pod created. All delete calls tend to use the same api.
Tim Head
@betatim
the scheduler "binds" a pod to a node
and when a pod isn't welcome anymore it gets evicted
for example because there are more important pods that need resources
karan bhatia
@lila
@mrocklin @rabernat i'm up for meeting up as well. i'm staying in westlake area, but happy to come up to capitol hill...
Chris Holdgraf
@choldgraf
Hey all, @yuvipanda @lheagy and I will be in Seattle tonight. Are folks meeting for dinner or drinks or anything?
David Hoese
@djhoese
Looks like I'm going to have to mute this channel or suffer extreme FOMO
Joe Hamman
@jhamman
@djhoese - we will miss having you. Next time?
@choldgraf, @mrocklin, @lheagy, @lila, @rabernat - tentative plan to get drinks at Sol Liquor Lounge c. 8pm; does that sound okay with people? Lots of good dinner options around there but I won’t be able to join until after the dinner hour.
Ian Rose
@ian-r-rose
Have fun y'all. I am also experiencing the FOMO
Rob Fatland
@robfatland

As indicated on Slack, here are some discussion questions for general and education:

Open Questions on Education (will also post on Gitter)
Please read and consider responding out Wednesday. We plan to devote 20 of our 30 minutes on pangeo education to open discussion. Further planning on pangeo Education will follow Thursday.

  • "What is pangeo?" or a longer version "What is unclear or perhaps confusing about pangeo?" is worth articulating: In order to proceed to "Who and what and how is pangeo teaching?"
  • Should (How should?) the pangeo Education WG play a role in testing pangeo tech?
  • Does "publications = evidence of pangeo impact" work as a retrograde motivator on the challenge of "What should the pangeo team be emphasizing today?" (both Education and more broadly)
  • Comment on the value of the imperative "Get dataset X wired into pangeo" ...motivating scientists who use X to migrate to pangeo. What is { X }? Process to select and import X? Should Pangeo Ed teach you how to import some X?
Matthew Rocklin
@mrocklin
I will show off dask-labextension in your honor @ian-r-rose !
It's a year old now!
Ian Rose
@ian-r-rose
Well, what do you know, so it is
Matthew Rocklin
@mrocklin
That was a very useful sprint
Matthew Rocklin
@mrocklin
I'm down for drinks at Sol Liquor Lounge. Do folks want to meet up for dinner beforehand at 7:00pm? Glancing through Yelp, I'll arbitrarily put forth Lionhead, a Sichuan Chinese place (with acceptable vegetarian options), or Corvus and Company (Mediterranean).