Matthew Rocklin
@mrocklin
I'm also currently away from a full-sized USB port, maybe tomorrow though?
Joe Hamman
@jhamman
Yeah, sounds good. You’ll find me here tomorrow as well. :airplane:
Pier
@PhenoloBoy
@jhamman are you the person I have to contact to get the git-crypt symmetric key, as described here? https://github.com/pangeo-data/pangeo-cloud-federation
Scott
@scollis
@jhamman is there a publication I can cite for Pangeo?
Joe Hamman
@jhamman
@scollis - nothing comprehensive just yet. However, we did publish the NSF proposal (https://figshare.com/articles/Pangeo_NSF_Earthcube_Proposal/5361094) and @guillaumeeb had this conference paper published:
Eynard-Bontemps, G., R Abernathey, J. Hamman, A. Ponte, W. Rath, 2019: The Pangeo Big Data Ecosystem and its use at CNES. In P. Soille, S. Loekken, and S. Albani, Proc. of the 2019 conference on Big Data from Space (BiDS’2019), 49-52. EUR 29660 EN, Publications Office of the European Union, Luxembourg. ISBN: 978-92-76-00034-1, doi:10.2760/848593.
@PhenoloBoy - we’re trying not to share the git-crypt key unless absolutely necessary. Can I ask what you are up to?
Scott
@scollis
Perfect. Thanks… This would be perfect for something like AMS BAMS, or even a glossy like Science.
Joe Hamman
@jhamman
We’re…trying...
We actually have something in review with Nature but things seem to have stalled out in a big way.
This is on my list of things to deal with today.
Matthew Rocklin
@mrocklin
@jhamman I'm around. I've submitted a job and am going to wait a while to see if it clears
Joe Hamman
@jhamman
cheyenne is down right now so you’ll need to be on casper.
Matthew Rocklin
@mrocklin
Well that's good to know :)
Joe Hamman
@jhamman
do we have a working slurm cluster right now?
https://jupyterhub.ucar.edu/dav will put you on casper (which uses slurm)
Matthew Rocklin
@mrocklin
I'm ssh'ing in
I've just discovered that we use SLURM rather than PBS
sbatch: error: You must specify an account (--account)
In [4]: print(cluster.job_script())
#!/usr/bin/env bash

#SBATCH -J dask-worker
#SBATCH -n 1
#SBATCH --cpus-per-task=16
#SBATCH --mem=60G
#SBATCH -t 00:30:00
#SBATCH -C skylake
JOB_ID=${SLURM_JOB_ID%;*}



/glade/u/home/mrocklin/miniconda/envs/dev/bin/python -m distributed.cli.dask_worker tcp://10.12.203.5:39794 --nthreads 16 --memory-limit 64.00GB --name dask-worker--${JOB_ID}-- --death-timeout 60 --interface ib0


In [5]: import dask

In [6]: dask.config.get("jobqueue.slurm")
Out[6]:
{'name': 'dask-worker',
 'cores': 1,
 'memory': '25 GB',
 'processes': 1,
 'interface': 'ib0',
 'walltime': '00:30:00',
 'job-extra': {'-C skylake': None},
 'death-timeout': 60,
 'local-directory': None,
 'shebang': '#!/usr/bin/env bash',
 'queue': None,
 'project': None,
 'extra': ['--interface', 'ib0'],
 'env-extra': [],
 'job-cpu': None,
 'job-mem': None,
 'log-directory': None}
Matthew Rocklin
@mrocklin
I'm good
I had to copy over my project from my PBS_ACCOUNT environment variable
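For anyone who hits the same sbatch account error, a minimal sketch of passing the account explicitly when building the cluster (assuming dask-jobqueue's SLURMCluster and that PBS_ACCOUNT is set in the environment; the keyword values below are illustrative, not the Cheyenne/Casper defaults):

# Minimal sketch, not the exact commands used above: pass the SLURM account
# explicitly so sbatch does not reject the job. In dask-jobqueue, `project`
# maps to the scheduler's account/project flag.
import os
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    cores=16,            # illustrative values, not site defaults
    memory="60 GB",
    interface="ib0",
    walltime="00:30:00",
    project=os.environ.get("PBS_ACCOUNT"),  # account to charge jobs to
)
cluster.scale(1)  # request one worker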
Joe Hamman
@jhamman
sounds good. Enjoy.
Matthew Rocklin
@mrocklin
It looks like we're storing some config here:
/glade/u/apps/config/dask
Is this global?
Matthew Rocklin
@mrocklin
I'm not certain that that config is optimal
Joe Hamman
@jhamman
@mrocklin - yes, that is the baseline config we have but we can ask for specific edits.
Matthew Rocklin
@mrocklin
It's cool that they've added a baseline config
OK, I'm all set. Thanks for your help @jhamman !
Joe Hamman
@jhamman
do you have some specific suggestions on edits to that config?
Matthew Rocklin
@mrocklin
In the future we should extend dask-jobqueue to respect environment variables, and add project: $PBS_ACCOUNT
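Until something like that lands, a hedged workaround sketch: set the config value at runtime from the environment variable, using the jobqueue.slurm.project key shown in the config dump above.

# Illustrative workaround, not current dask-jobqueue behavior: read the
# account from the environment and push it into the dask config by hand.
import os
import dask

dask.config.set({"jobqueue.slurm.project": os.environ.get("PBS_ACCOUNT")})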
Joe Hamman
@jhamman
+1
Matthew Rocklin
@mrocklin
I did have suggestions, but then I realized that I was mixing up two config files
Joe Hamman
@jhamman
Great!
Pier
@PhenoloBoy
@jhamman I have to customize the installation with GCSFS/Fuse as described at https://gcsfs.readthedocs.io/en/latest/fuse.html, and it seems the only way (as far as I understand) is to create a clone of the repository and follow the instructions here: https://github.com/pangeo-data/pangeo-cloud-federation. If you have any other ideas, I'm more than open to them; it would save me a huge headache.
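(Side note, hedged: if the goal is just to read GCS data from notebooks, the plain gcsfs Python API sometimes avoids a FUSE mount entirely; the bucket and path below are made up for illustration.)

# Sketch only: list and open objects in a GCS bucket with gcsfs directly,
# without a FUSE mount. "some-pangeo-bucket" is a hypothetical name.
import gcsfs

fs = gcsfs.GCSFileSystem(token="anon")   # or project/credentials as needed
print(fs.ls("some-pangeo-bucket"))
with fs.open("some-pangeo-bucket/path/to/file.nc", "rb") as f:
    header = f.read(128)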
karan bhatia
@lila
Looking forward to attending the Pangeo community meeting next week http://pangeo.io/meetings/2019_summer-meeting.html ... Thank you all for organizing, and in particular for providing remote access for those not able to attend in person (though I will be there in person)...
Joe Hamman
@jhamman
Hi Karan, we’re looking forward to having you.
Tina Odaka
@tinaok
Hi, I am in France and cannot attend the meeting in person, but I would like to attend remotely. I saw that you plan to provide remote access; do I need to register in advance for remote attendance? Thank you for your help!
Hi Kevin, thanks for the merge. I have a question.

In compute_study.md, it is indicated that
'Duplicate each study for 2, 4, 8, and 16 workers per node (reducing chunk size proportionally)'

But I do not see this reduction of chunk size for each increase in workers per node in utils.py.

Am I missing something?
Kevin Paul
@kmpaul
@tinaok I believe you are correct; utils.py does not account for a reduction in chunk size as you increase the workers per node.
Let me look more closely...
@tinaok Also, if you are planning on attending the Pangeo community meeting remotely, you might be able to present some of your benchmarking work in a Lightning Talk. It would be a very short (and remote) presentation, but it might be possible. @jhamman would know for sure.
Kevin Paul
@kmpaul
@tinaok Ok. I've looked more closely at the benchmarking code. (@andersy005 may have something more to say about this, but I'll take a stab myself.) The design of the current code assumes 1 chunk per worker (see benchmarks/datasets.py), and it therefore assumes that the total dataset size will be equal to the (chunk size) * (number of nodes) * (number of workers per node).
This may not be optimal, but I think this can be amended in later versions.
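A small numeric sketch of that sizing assumption (made-up values, not the benchmark defaults):

# Illustrative only, not the actual benchmarks/datasets.py code: with one
# chunk per worker, the total dataset size follows from the run's parallelism.
chunk_size_mb = 256            # hypothetical per-chunk size
nodes = 4
workers_per_node = 8

total_size_mb = chunk_size_mb * nodes * workers_per_node
print(total_size_mb)           # 8192 MB for these example numbers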
Kevin Paul
@kmpaul
The thought behind the compute_study.md writeup was to sketch out what would be needed to generate some preliminary scaling studies for various "common" operations done with xarray and dask. The results of each study should be a plot of "Number of Nodes" vs "Operation Runtime". However, the "Operation Runtime" depends on much more than "Number of Nodes", including "Number of Workers per Node", "Number of Threads per Worker", "Total Number of Chunks", "Chunk Size", etc.
I wanted to consider 2 kinds of studies, strong and weak, since these are considered "canonical" in the HPC world. In the strong scaling studies, the "Total Data Size" should be fixed while the "Number of Nodes" is varied. In the weak scaling studies, the "Data Size per Node" should be fixed while the "Number of Nodes" is varied.
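A hedged sketch of how those two study types differ, with made-up sizes:

# Strong scaling: total size fixed, per-node share shrinks as nodes grow.
# Weak scaling: per-node size fixed, total grows with the node count.
nodes = [1, 2, 4, 8]

total_gb = 64                                    # fixed for the strong study
strong_per_node = {n: total_gb / n for n in nodes}

per_node_gb = 16                                 # fixed for the weak study
weak_total = {n: per_node_gb * n for n in nodes}

print(strong_per_node)  # {1: 64.0, 2: 32.0, 4: 16.0, 8: 8.0}
print(weak_total)       # {1: 16, 2: 32, 4: 64, 8: 128}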
Kevin Paul
@kmpaul
When I was coming up with the compute_study.md document, I tried to find a way to fix all of the other parameters in a way such that each study was "fair." I chose 1 "Chunk per Worker" and 1 "Thread per Worker", and I chose to vary the "Chunk Size" and "Number of Workers per Node". And later we tried to come up with a way of varying the "Chunking Scheme" (i.e., chunk over all dimensions, chunk over only spatial dimensions, chunk over only time), too. But we need to generate data that looks at how these numbers vary with "Chunks per Worker" and "Threads per Worker", too.
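For the "Chunking Scheme" axis, an illustrative dask.array sketch of the three schemes mentioned (the shape and chunk sizes are invented for the example):

# Illustrative only: three chunking schemes for a (time, lat, lon) array.
import dask.array as da

shape = (3650, 720, 1440)                                # (time, lat, lon), made up
chunk_all     = da.ones(shape, chunks=(100, 180, 360))   # chunk over all dimensions
chunk_spatial = da.ones(shape, chunks=(3650, 180, 360))  # chunk over spatial dims only
chunk_time    = da.ones(shape, chunks=(100, 720, 1440))  # chunk over time only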
Joe Hamman
@jhamman
@/all - the agenda and attendee list for next week’s community meeting in Seattle are now final. See details here: http://pangeo.io/meetings/2019_summer-meeting.html#
Remote participation details are also available on this page. @kmpaul - we’ll probably need to wait and see if remote lightning talks will work. We may need a proxy presenter.