```
sbatch: error: You must specify an account (--account)
```

```
In [4]: print(cluster.job_script())
#!/usr/bin/env bash
#SBATCH -J dask-worker
#SBATCH -n 1
#SBATCH --cpus-per-task=16
#SBATCH --mem=60G
#SBATCH -t 00:30:00
#SBATCH -C skylake
JOB_ID=${SLURM_JOB_ID%;*}
/glade/u/home/mrocklin/miniconda/envs/dev/bin/python -m distributed.cli.dask_worker tcp://10.12.203.5:39794 --nthreads 16 --memory-limit 64.00GB --name dask-worker--${JOB_ID}-- --death-timeout 60 --interface ib0
```
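On clusters that require an account, one fix (a minimal sketch, assuming a recent dask-jobqueue) is to pass the account when constructing the cluster; older releases call this keyword `project`, newer ones also accept `account`, and either maps to an `#SBATCH -A` line. The account name below is hypothetical.

```python
from dask_jobqueue import SLURMCluster

cluster = SLURMCluster(
    cores=16,
    memory="60GB",
    walltime="00:30:00",
    interface="ib0",
    project="UABC0001",  # hypothetical account; becomes "#SBATCH -A UABC0001"
)
print(cluster.job_script())  # the script should now include the -A directive
```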
```
In [5]: import dask

In [6]: dask.config.get("jobqueue.slurm")
Out[6]:
{'name': 'dask-worker',
 'cores': 1,
 'memory': '25 GB',
 'processes': 1,
 'interface': 'ib0',
 'walltime': '00:30:00',
 'job-extra': {'-C skylake': None},
 'death-timeout': 60,
 'local-directory': None,
 'shebang': '#!/usr/bin/env bash',
 'queue': None,
 'project': None,
 'extra': ['--interface', 'ib0'],
 'env-extra': [],
 'job-cpu': None,
 'job-mem': None,
 'log-directory': None}
```
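Equivalently (a sketch), the account can be set through dask's configuration system, which fills the `project: None` slot shown above; `"UABC0001"` is again a hypothetical account name.

```python
import dask

# Dotted keys are expanded into the nested config structure.
dask.config.set({"jobqueue.slurm.project": "UABC0001"})
dask.config.get("jobqueue.slurm.project")  # -> 'UABC0001'
```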
In `compute_study.md`, it is indicated that we should 'Duplicate each study for 2, 4, 8, and 16 workers per node (reducing chunk size proportionally)'. But I do not recall this reduction of chunk size for each increase in workers per node in `utils.py` (see `benchmarks/datasets.py`), which therefore assumes that the total dataset size will be equal to (chunk size) * (number of nodes) * (number of workers per node).
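For concreteness, here is a small sketch of the proportional reduction (the numbers are made up, not values from `compute_study.md`), holding the total dataset size fixed as the worker count grows:

```python
total_size = 64e9  # bytes; hypothetical fixed dataset size
nodes = 4

# With 1 chunk per worker, the chunk count grows with the worker count,
# so the chunk size must shrink to keep the total dataset size constant.
for workers_per_node in (2, 4, 8, 16):
    n_chunks = nodes * workers_per_node
    chunk_size = total_size / n_chunks
    print(f"{workers_per_node:2d} workers/node -> {chunk_size / 1e9:.1f} GB per chunk")
```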
The `compute_study.md` writeup was meant to sketch out what would be needed to generate some preliminary scaling studies for various "common" operations done with xarray and dask. The results of each study should be a plot of "Number of Nodes" vs. "Operation Runtime". However, the "Operation Runtime" depends on much more than the "Number of Nodes", including the "Number of Workers per Node", "Number of Threads per Worker", "Total Number of Chunks", "Chunk Size", etc.

In the `compute_study.md` document, I tried to find a way to fix all of the other parameters such that each study was "fair." I chose 1 "Chunk per Worker" and 1 "Thread per Worker", and I chose to vary the "Chunk Size" and the "Number of Workers per Node". Later we tried to come up with a way of varying the "Chunking Scheme" (i.e., chunk over all dimensions, chunk over only spatial dimensions, or chunk over only time), too. But we also need to generate data that looks at how these numbers vary with "Chunks per Worker" and "Threads per Worker".
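A hedged sketch of the resulting study grid (parameter names and values are illustrative, not taken from `compute_study.md`):

```python
from itertools import product

# Fix 1 chunk per worker and 1 thread per worker, then sweep the
# remaining free parameters of the study.
nodes = [1, 2, 4, 8, 16]
workers_per_node = [2, 4, 8, 16]
chunking_schemes = ["all-dims", "spatial-only", "time-only"]

for n, w, scheme in product(nodes, workers_per_node, chunking_schemes):
    print(f"run benchmark: nodes={n}, workers/node={w}, chunking={scheme}")
```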
Is anyone else seeing "Error displaying widget" with e.g. the `dask_kubernetes.KubeCluster` widget (or any other widget)? It looks like this is related to `ipywidgets==7.5`. I have a Pangeo environment with `jupyterlab=0.35`, `tornado=5.1.1`, and `dask_labextension==0.3.3`, because I noticed that it was a working configuration at some point, but I'm not sure this is still the recommended configuration.
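If it helps narrow things down, a quick sanity check (a sketch; it assumes the packages are importable in the same kernel the widget runs in) is to print the versions actually in use:

```python
import ipywidgets
import tornado

# Confirm whether the kernel is really on the versions pinned above.
print("ipywidgets:", ipywidgets.__version__)
print("tornado:", tornado.version)
```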