@jhamman I have to customize the installation with GCSFS/FUSE as described at https://gcsfs.readthedocs.io/en/latest/fuse.html, and it seems that the only way (as far as I understand) is to clone the repository and follow the instructions here: https://github.com/pangeo-data/pangeo-cloud-federation. If you have any other ideas, I'm more than open to them; I'd like to avoid a huge headache.
karan bhatia
Looking forward to attending the Pangeo community meeting next week http://pangeo.io/meetings/2019_summer-meeting.html ... Thank you all for organizing, and in particular for providing remote access for those not able to attend in person (but I will be there in person)...
Joe Hamman
Hi Karan, we’re looking forward to having you.
Tina Odaka
Hi, I am in France and cannot attend the meeting, but I would like to attend remotely. I saw that you plan to provide remote access; do I need to register in advance to attend remotely? Thank you for your help!
Hi Kevin, thanks for the merge. I have a question.

In compute_study.md, it says:
'Duplicate each study for 2, 4, 8, and 16 workers per node (reducing chunk size proportionally)'

But I do not see this reduction of chunk size for each increase in workers per node in utils.py.

Am I missing something?
Kevin Paul
@tinaok I believe you are correct; utils.py does not account for a reduction in chunk size as you increase the workers per node.
Let me look more closely...
@tinaok Also, if you are planning on attending the Pangeo community meeting remotely, you might be able to present some of your benchmarking work in a Lightning Talk. It would be a very short (and remote) presentation, but it might be possible. @jhamman would know for sure.
Kevin Paul
@tinaok Ok. I've looked more closely at the benchmarking code. (@andersy005 may have something more to say about this, but I'll take a stab myself.) The design of the current code assumes 1 chunk per worker (see benchmarks/datasets.py), and it therefore assumes that the total dataset size will be equal to the (chunk size) * (number of nodes) * (number of workers per node).
This may not be optimal, but I think this can be amended in later versions.
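To make the relationship concrete, here is a minimal sketch of that arithmetic. The function and parameter names are hypothetical and are not taken from benchmarks/datasets.py or utils.py; the only thing assumed from the description above is the product rule with 1 chunk per worker.

```python
def total_dataset_size(chunk_size_mb, num_nodes, workers_per_node, chunks_per_worker=1):
    """Total dataset size in MB under the assumed product rule.

    With chunks_per_worker=1 this is
    (chunk size) * (number of nodes) * (number of workers per node).
    """
    return chunk_size_mb * num_nodes * workers_per_node * chunks_per_worker


def chunk_size_for_fixed_total(total_size_mb, num_nodes, workers_per_node):
    """Chunk size in MB needed to keep the total fixed with 1 chunk per worker."""
    return total_size_mb / (num_nodes * workers_per_node)


# Holding the chunk size fixed, doubling the workers per node doubles the dataset.
fixed_chunk_totals = [total_dataset_size(64, 2, w) for w in (2, 4, 8, 16)]

# Holding the total fixed instead, the chunk size shrinks proportionally.
fixed_total_chunks = [chunk_size_for_fixed_total(1024, 2, w) for w in (2, 4, 8, 16)]
```

Under this rule, the "reducing chunk size proportionally" wording in compute_study.md only holds if the chunk size is recomputed for each workers-per-node setting, as in the second helper.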
Kevin Paul
The thought behind the compute_study.md writeup was to sketch out what would be needed to generate some preliminary scaling studies for various "common" operations done with xarray and dask. The results of each study should be a plot of "Number of Nodes" vs "Operation Runtime". However, the "Operation Runtime" depends on much more than "Number of Nodes", including "Number of Workers per Node", "Number of Threads per Worker", "Total Number of Chunks", "Chunk Size", etc.
I wanted to consider 2 kinds of studies, strong and weak, since these are considered "canonical" in the HPC world. In the strong scaling studies, the "Total Data Size" should be fixed while the "Number of Nodes" is varied. In the weak scaling studies, the "Data Size per Node" should be fixed while the "Number of Nodes" is varied.
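As a rough illustration of the difference, the sketch below lists the data sizes each kind of study would use as the node count changes. The function name, the units, and the dictionary layout are assumptions made for this example only; they do not come from the benchmarking code.

```python
def scaling_plan(kind, node_counts, base_size_gb):
    """Return the data sizes for a strong or weak scaling study.

    strong: the total data size stays at base_size_gb for every node count.
    weak:   the data size per node stays at base_size_gb, so the total grows.
    """
    if kind not in ("strong", "weak"):
        raise ValueError("kind must be 'strong' or 'weak'")
    plan = []
    for nodes in node_counts:
        total = base_size_gb if kind == "strong" else base_size_gb * nodes
        plan.append({"nodes": nodes, "total_gb": total, "per_node_gb": total / nodes})
    return plan


strong = scaling_plan("strong", [1, 2, 4], 64)
weak = scaling_plan("weak", [1, 2, 4], 64)
```

In the strong plan the per-node share halves each time the node count doubles; in the weak plan it stays constant and the total doubles.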
Kevin Paul
When I was coming up with the compute_study.md document, I tried to find a way to fix all of the other parameters in a way such that each study was "fair." I chose 1 "Chunk per Worker" and 1 "Thread per Worker", and I chose to vary the "Chunk Size" and "Number of Workers per Node". And later we tried to come up with a way of varying the "Chunking Scheme" (i.e., chunk over all dimensions, chunk over only spatial dimensions, chunk over only time), too. But we need to generate data that looks at how these numbers vary with "Chunks per Worker" and "Threads per Worker", too.
Joe Hamman
@/all - the agenda and attendee list for next week’s community meeting in Seattle is now final. See details here: http://pangeo.io/meetings/2019_summer-meeting.html#
Remote participation details are also available on this page. @kmpaul - we’ll probably need to wait and see if remote lightning talks will work. We may need a proxy presenter.
Daniel Rothenberg
Such an awesome agenda... very much looking forward to participating remotely as much as possible!
Kevin Paul
@jhamman Thanks for the info regarding remote lightning talks.
@tinaok If you want to present something, and you need a proxy presenter, I'll do it for you.
Scott Henderson
thought folks on this channel might be interested in this job opening at JPL https://jpl.jobs/jobs/2019-10892-Big-Data-Software-Lead
Wow.. that is quite the requirements ;)
James A. Bednar
Good luck!
David Brochart
Is anyone else experiencing the Error displaying widget with e.g. the dask_kubernetes.KubeCluster widget (or any other widget)? It looks like this is related to ipywidgets==7.5. I have a Pangeo environment with jupyterlab=0.35, tornado=5.1.1 and dask_labextension==0.3.3, because I noticed that it was a working configuration at some point, but I'm not sure this is still the recommended configuration.
Matthew Rocklin

I get in to Seattle a bit early (leaving tonight). I plan to mostly work on HPC deployments M/Tu if anyone wants to join.

Also, do folks want to meet up for drinks Tuesday night? I imagine that people will be arriving then.

Ryan Abernathey
@mrocklin - I would enjoy meeting up post-dinner on Tuesday night, preferably in the Eastlake or Capitol Hill area. @jhamman will also be around.
Ryan Abernathey
Question for the kubernetes folks--where is the log that tells me who has been logging into the jupyterhubs?
Joe Hamman
@rabernat - you can look at the hub log: kubectl logs -n ocean-prod hub-7d8c8db558-95ntv | grep "Adding user"
We are also logging this on BigQuery: https://console.cloud.google.com/bigquery?d=dev_pangeo_io_usage_metering&p=pangeo-181919&page=dataset&project=pangeo-181919 but I haven’t looked at these logs yet.
Joe Hamman
@rabernat - okay if I fix up the cmip6 examples binder?
Joe Hamman
actually, I just did it. Two issues for you to peek at: https://github.com/pangeo-data/pangeo-cmip6-examples/issues
Ryan Abernathey
Can someone tell me why dashes in usernames get encoded as -2d when used in kubernetes pod names?
Joe Hamman
This has always been a great mystery to me.
2d is the hex representation of the ASCII character "-". That still doesn’t explain why k8s puts the hex code in the pod name.
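One possible explanation, offered as an assumption and not confirmed anywhere in this thread: if whatever builds the pod name escapes unsafe characters as an escape character followed by the hex byte value, and the escape character is itself "-", then a literal dash has to be escaped too or the name could not be decoded unambiguously. The sketch below is a hypothetical stand-alone version of such a scheme; the safe set and escape character are guesses, not code from Kubernetes or JupyterHub.

```python
import string

SAFE_CHARS = set(string.ascii_lowercase + string.digits)  # assumed safe set
ESCAPE_CHAR = "-"  # assumed escape character


def escape_name(name):
    """Replace every unsafe character with ESCAPE_CHAR plus its hex byte value(s)."""
    out = []
    for ch in name:
        if ch in SAFE_CHARS:
            out.append(ch)
        else:
            out.extend(f"{ESCAPE_CHAR}{byte:02x}" for byte in ch.encode("utf-8"))
    return "".join(out)


def unescape_name(escaped):
    """Invert escape_name by decoding each ESCAPE_CHAR + two hex digits."""
    raw = bytearray()
    i = 0
    while i < len(escaped):
        if escaped[i] == ESCAPE_CHAR:
            raw.append(int(escaped[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(escaped[i]))
            i += 1
    return raw.decode("utf-8")


dash_hex = format(ord("-"), "x")      # hex code of the dash character
escaped = escape_name("some-user")    # the dash becomes "-2d"
```

Because the dash is the escape character in this sketch, leaving it unescaped would make a name like "a-2db" ambiguous when decoding.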
Ryan Abernathey
yup ok
it's fun to have data!
Joe Hamman
Are you looking at the logs?
Ryan Abernathey
everything has to be parsed out of kubernetes API calls
Here are the different API method calls
io.k8s.core.v1.pods.attach.create              8
io.k8s.core.v1.pods.binding.create        291582
io.k8s.core.v1.pods.create                 66682
io.k8s.core.v1.pods.delete                656336
io.k8s.core.v1.pods.eviction.create       312632
io.k8s.core.v1.pods.exec.create              204
io.k8s.core.v1.pods.get                      109
io.k8s.core.v1.pods.portforward.create      3974
io.k8s.core.v1.pods.status.patch              17
io.k8s.core.v1.services.proxy.get              6
what is a binding and what is an eviction?
how do binding.create and eviction.create differ from just create?
Joe Hamman
I wonder if @betatim or @jacobtomlinson have looked this far into the k8s wormhole yet?
I do see that there are roughly equal parts create & delete calls so it must be something to do with the type of pod created. All delete calls tend to use the same api.
Tim Head
the scheduler "binds" a pod to a node
and when a pod isn't welcome anymore it gets evicted
for example because there are more important pods that need resources
karan bhatia
@mrocklin @rabernat i'm up for meeting up as well. i'm staying in westlake area, but happy to come up to capitol hill...
Chris Holdgraf
Hey all, @yuvipanda @lheagy and I will be in Seattle tonight. Are folks meeting for dinner or drinks or anything?
David Hoese
Looks like I'm going to have to mute this channel or suffer extreme FOMO