These are chat archives for dereneaton/ipyrad

2nd
Sep 2017
tommydevitt
@tommydevitt
Sep 02 2017 02:18

@dereneaton @isaacovercast Deren and Isaac: Still trying to figure out the ipyrad API. I started a jupyter notebook on a node, then tried to start the ipcluster instance from a separate job submission script and got this:

TimeoutErrorTraceback (most recent call last)

<ipython-input-3-34fef5f1eb52> in <module>()
1 ## connect to a running ipcluster instance
----> 2 ipyclient = ipp.Client(profile="MPI48")

/home1/miniconda2/lib/python2.7/site-packages/ipyparallel/client/client.pyc in __init__(self, url_file, profile, profile_dir, ipython_dir, context, debug, sshserver, sshkey, password, paramiko, timeout, cluster_id, **extra_args)
493
494 try:
--> 495 self._connect(sshserver, ssh_kwargs, timeout)
496 except:
497 self.close(linger=0)

/home1/miniconda2/lib/python2.7/site-packages/ipyparallel/client/client.pyc in _connect(self, sshserver, ssh_kwargs, timeout)
613 evts = poller.poll(timeout*1000)
614 if not evts:
--> 615 raise error.TimeoutError("Hub connection request timed out")
616 idents, msg = self.session.recv(self._query_socket, mode=0)
617 if self.debug:

TimeoutError: Hub connection request timed out

Do I need to put in the sleep command?
Isaac Overcast
@isaacovercast
Sep 02 2017 17:38
@tommydevitt Can you paste the code you are running inside the notebook? I don't understand exactly how you tried to start the ipcluster instance: from inside the notebook, or from a job submission script?
Isaac Overcast
@isaacovercast
Sep 02 2017 20:34
@tommydevitt Ok, I see, thanks for clarifying. The ipcluster instance has to be running on the same cluster node as ipyrad or else they don't know how to talk to each other. You need to set it up so that the notebook you are trying to run is on the same node as ipcluster. You can do this by adding the call to start the notebook to the qsub script where you launch ipcluster.
tommydevitt
@tommydevitt
Sep 02 2017 20:35
@isaacovercast that makes sense, but what would that look like in the script?
Isaac Overcast
@isaacovercast
Sep 02 2017 20:45
ipcluster start --n 20 --daemonize
jupyter-notebook --no-browser --ip=$(hostname -i) --port=9999
Then follow the instructions for setting up the ssh tunnel here: http://ipyrad.readthedocs.io/HPC_Tunnel.html?highlight=ssh%20tunnel
Deren Eaton
@dereneaton
Sep 02 2017 20:50
@tommydevitt you should only include the profile=MPI48 argument if you named your ipcluster instance with a --profile {name} flag.
If you paste the job submission scripts we could probably figure it out.
But, you actually can start ipcluster on a separate node than the notebook is running on. No problem there, they should be able to find each other still.
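For example, a minimal sketch of the two cases (assuming ipyparallel is importable and an ipcluster instance is already running):

import ipyparallel as ipp

## if ipcluster was started with --profile=MPI48:
ipyclient = ipp.Client(profile="MPI48")

## if ipcluster was started without a --profile flag, use the default:
ipyclient = ipp.Client()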
Deren Eaton
@dereneaton
Sep 02 2017 20:56
But unless you're planning to use MPI to launch a multi-node ipcluster, the single script setup like Isaac posted above is the simplest way to start everything.
tommydevitt
@tommydevitt
Sep 02 2017 20:57
#!/bin/bash

#SBATCH -J tunnel               # job name
#SBATCH -o jupyter-log-%j.txt   # output and error file name (%j expands to jobId)
#SBATCH -N 1                    # number of nodes requested
#SBATCH -n 24                   # total number of mpi tasks requested
#SBATCH -p development          # queue (partition) -- normal, development, etc.
#SBATCH -t 02:00:00             # run time (hh:mm:ss) - 2 hours

## get tunneling info
XDG_RUNTIME_DIR=""
ipnport=$(shuf -i8000-9999 -n1)
ipnip=$(hostname -i)

## print tunneling instructions to jupyter-log-{jobid}.txt
echo -e "
    Copy/Paste this in your local terminal to ssh tunnel with remote
    -----------------------------------------------------------------
    ssh -N -L $ipnport:$ipnip:$ipnport user@host
    -----------------------------------------------------------------

    Then open a browser on your local machine to the following address
    ------------------------------------------------------------------
    localhost:$ipnport  (prefix w/ https:// if using password)
    ------------------------------------------------------------------
    "

## start an ipcluster instance and launch jupyter server
jupyter-notebook --no-browser --port=$ipnport --ip=$ipnip
That's the notebook script.
Deren Eaton
@dereneaton
Sep 02 2017 20:58
That looks good for the notebook connection.
Though if you are starting ipcluster separately then you will only need one core to run the notebook, so you could request only -n 1 in the SBATCH args.
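e.g., a sketch of the reduced request (only these two #SBATCH lines would change):

#SBATCH -N 1                    # one node is enough for the notebook itself
#SBATCH -n 1                    # the notebook only needs a single core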
tommydevitt
@tommydevitt
Sep 02 2017 21:01
#!/bin/bash

#SBATCH -J MPI48                # job name
#SBATCH -o ipcluster-log-%J.txt # output and error file name (%j expands to jobId)
#SBATCH -N 2                    # number of nodes requested
#SBATCH -n 48                   # total number of mpi tasks requested (one per engine)
#SBATCH -p normal               # queue (partition) -- normal, development, etc.
#SBATCH -t 02:00:00             # run time (hh:mm:ss) - 2 hours

## set the profile name here
profile="MPI48"

## Start an ipcluster instance. This server will run until killed.
ipcluster start --n=48 --engines=MPI --ip='*' --profile=$profile
How do I know if I need a multi-node ipcluster or not?
Deren Eaton
@dereneaton
Sep 02 2017 21:03
Your cluster should report somewhere how many cores are on each node (e.g., 24). So if you want 48 cores then you know you need two nodes.
You'll also need to load the MPI module for your system in this submission script.
Before the ipcluster call, add something like module load OpenMPI.
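Something like this, as a sketch (the exact module name varies by system):

## load the system MPI module before launching engines
module load OpenMPI

## start an ipcluster instance with 48 MPI engines
ipcluster start --n=48 --engines=MPI --ip='*' --profile=$profile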
tommydevitt
@tommydevitt
Sep 02 2017 21:04
24 cores per node, I believe. I'm not clear on the best way to parallelize my jobs, but that's my problem (one of them).
Deren Eaton
@dereneaton
Sep 02 2017 21:05
I think adding the module load command will fix your problem.
Otherwise it may have stalled because it couldn't find the 48 engines.
tommydevitt
@tommydevitt
Sep 02 2017 21:06
After searching, I don't think I need the module load command; I'm sure our HPC supports MPI though. I'll have to ask my admin I guess.
Deren Eaton
@dereneaton
Sep 02 2017 21:06
I'm sure they do support it, but you still need to load it usually, since the HPC software isn't all available by default.
tommydevitt
@tommydevitt
Sep 02 2017 21:06
Y'all's documentation is excellent, but I'm very new at this.
Deren Eaton
@dereneaton
Sep 02 2017 21:07
The exact name may vary though. It may be module load OpenMPI-1.0.123 or something.
So you'll need to look on your system. You can use module avail to see available packages.
Once you get the notebook to print that it finds all 48 engines then you should be good to go, and anything run in ipyrad with the .run() command will distribute work across the 48 cores.
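As a sketch from the notebook side (the assembly name here is just a placeholder, and params would still need to be set before running):

import ipyrad as ip
import ipyparallel as ipp

## connect to the running ipcluster instance
ipyclient = ipp.Client(profile="MPI48")
print(len(ipyclient.ids))    ## should print 48 once all engines register

## any .run() call distributes work across the connected engines
data = ip.Assembly("example")
data.run("1", ipyclient=ipyclient)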
tommydevitt
@tommydevitt
Sep 02 2017 21:11
"mpiP/3.4.1"
Deren Eaton
@dereneaton
Sep 02 2017 21:13
Hmm, is there one called openmpi?
tommydevitt
@tommydevitt
Sep 02 2017 21:16

module keyword mpi returns
mpiP: mpiP/3.4.1
Lightweight, Scalable MPI Profiling

mvapich2-largemem: mvapich2-largemem/2.1
MPI-2 implementation for Infiniband

Deren Eaton
@dereneaton
Sep 02 2017 21:18
Hmm, let's avoid any of that stuff for now.
tommydevitt
@tommydevitt
Sep 02 2017 21:20
I'm not finding it. But maybe I don't need it.
Deren Eaton
@dereneaton
Sep 02 2017 21:20
maybe not.
tommydevitt
@tommydevitt
Sep 02 2017 21:20
I can start simple.
Deren Eaton
@dereneaton
Sep 02 2017 21:21
Look in your home directory: in ~/.ipython/, do you see a folder called profile_MPI48/?
This folder will hold the information that ipcluster gathers about where the engines are that are available to you. It should have been created when you used --profile=MPI48.
As long as that is there, and the ipcluster instance is still running, the ipp.Client(profile="MPI48") command should be able to find it...
MPI can be tricky sometimes though, because of how it is set up on different clusters. So it's possible there is something we'll need to figure out to make it work, unfortunately. If so, like you said, if all else fails you can still run it in single-node mode (i.e., without MPI).
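If the engines are just slow to register, a short wait loop in the notebook is the "sleep" idea you mentioned earlier; a sketch:

import time
import ipyparallel as ipp

ipyclient = ipp.Client(profile="MPI48")

## MPI engines can take a while to register with the hub
while len(ipyclient.ids) < 48:
    time.sleep(2)
print("connected to %d engines" % len(ipyclient.ids))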
tommydevitt
@tommydevitt
Sep 02 2017 21:32
I'm just going to run it in single-node mode. So, should I combine Isaac's two lines into my script for the notebook connection? Thanks so much, guys.
The folder profile_MPI48 is there in my home directory, BTW @dereneaton
Deren Eaton
@dereneaton
Sep 02 2017 21:39
Yeah, that's the easiest way. Should give you 24 cores.
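i.e., a single-node sketch combining the two (reusing the structure of your notebook script; the tunneling echo is omitted here for brevity):

#!/bin/bash
#SBATCH -J nb24                 # job name (placeholder)
#SBATCH -o jupyter-log-%j.txt   # output and error file name (%j expands to jobId)
#SBATCH -N 1                    # one node
#SBATCH -n 24                   # all 24 cores on that node
#SBATCH -p normal               # queue (partition)
#SBATCH -t 02:00:00             # run time (hh:mm:ss)

## start ipcluster in the background on this node, then launch the notebook
## on the same node so they can find each other
ipcluster start --n=24 --daemonize
sleep 30    ## give the engines a moment to start
jupyter-notebook --no-browser --ip=$(hostname -i) --port=9999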
Deren Eaton
@dereneaton
Sep 02 2017 22:08
yep, that should work.
tommydevitt
@tommydevitt
Sep 02 2017 22:09
Thanks @dereneaton and @isaacovercast . Sure appreciate it.