These are chat archives for dereneaton/ipyrad

2nd
Dec 2015
Isaac Overcast
@isaacovercast
Dec 02 2015 00:14
I tend to agree that ipcluster launch belongs in ip.init. It's the "right" place.
Isaac Overcast
@isaacovercast
Dec 02 2015 00:23
Lol, didn't finish my thought.... At the same time I think simpler is better. Having 2 ways to do things is just having twice as many ways something can break ;) I guess the real question is "Should assembly objects be flexible enough to select their own controller type?" or alternatively "Is each instance of ipyrad married to one controller type?"
Is there not a way to query for a currently running ipcluster? Seems like if you could query for a currently running cluster of the desired type and attach to it, then the CLI would "just work", interactive runs would "just work" too, and if you wanted to switch controller types it'd test for one, and just fire up a new one if the right type wasn't running.
All of the above assumes launch moves to `ip.__init__`, natch.
Isaac Overcast
@isaacovercast
Dec 02 2015 22:54
Better still, we could create ipyparallel profiles for each cluster type (local, mpi, pbs), include these profiles in the ipyrad distribution, and use the --profile-dir and --profile flags so all cluster commands just talk to those profiles?
Then it will be very simple to change controller type w/in an assembly object, by just initializing the clients with the appropriate profile.
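With one profile per controller type, switching types reduces to choosing which profile name to pass along. A sketch of building the `ipcluster` invocation (the profile names like `ipyrad-local` / `ipyrad-mpi` are hypothetical, not anything ipyrad ships today):

```python
def ipcluster_cmd(action, profile, n=4):
    """Build an ipcluster command line against a named profile,
    e.g. ipcluster_cmd("start", "ipyrad-mpi") for the MPI profile."""
    cmd = ["ipcluster", action, "--profile={}".format(profile)]
    if action == "start":
        # daemonize so the call returns; the engine count only
        # matters when starting
        cmd += ["--n={}".format(n), "--daemonize"]
    return cmd
```

An assembly object could then change controller type just by handing a different profile name to this builder and to its `Client`.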
Deren Eaton
@dereneaton
Dec 02 2015 23:10
That sounds like a really good idea. If we're going to autoinitiate and autoclose ipcluster then we should definitely stay away from profile_default to avoid interfering with other jobs running outside of ipyrad.
As for avoiding the problem of Engines trying to re-init ipcluster, we could probably just have a try/except clause that avoids Exceptions that raise when ipcluster is already running on the given profile.
Isaac Overcast
@isaacovercast
Dec 02 2015 23:15
Exactly!
Deren Eaton
@dereneaton
Dec 02 2015 23:17
But, do we need each ip.__init__ to create a unique profile id, to allow for the possibility that someone can run more than one notebook at a time? Otherwise when you shut down one notebook it would kill both...
If so, then ip.__init__ needs to ask (1) is a controller already running with this profile-id? (2) If no, I'll use it. But if yes, then I'll create a new profile... The problem with this is that we need some way for Engines to know that they are Engines so they do not create new profiles too.
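A per-notebook profile id could be derived from something unique to the process. A sketch using the PID (the naming scheme is made up for illustration):

```python
import os

def unique_profile(base="ipyrad"):
    # Tag the profile with this process's PID so two notebooks started
    # side by side get distinct profiles, and shutting one down can't
    # kill the other's cluster.
    return "{}-{}".format(base, os.getpid())
```

This doesn't solve the Engine problem by itself; Engines would still need some marker (an environment variable, say) telling them they are Engines and should skip profile creation.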
Deren Eaton
@dereneaton
Dec 02 2015 23:26
It's complicated because we have to run ipcluster stop at shutdown to ensure we don't leave it hanging, since it's daemonized. Unless there's some smarter way to do that.
ipyparallel folks are up on this topic too: ipython/ipyparallel#22
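Since the daemonized cluster outlives the process, the stop could be hooked to interpreter exit with `atexit`. A sketch (assuming `ipcluster stop --profile=...` is the right teardown for however the cluster was started):

```python
import atexit
import subprocess

def stop_cmd(profile):
    return ["ipcluster", "stop", "--profile={}".format(profile)]

def register_cleanup(profile):
    # Run `ipcluster stop` when Python exits so the daemonized
    # controller and engines aren't left hanging.
    atexit.register(subprocess.call, stop_cmd(profile))
```

The caveat is that `atexit` hooks don't fire on a hard kill of the interpreter, so a stale cluster is still possible after a crash.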
Deren Eaton
@dereneaton
Dec 02 2015 23:36

Makes me think we might have jumped the gun on switching to ipyparallel. My main reason for doing so was that multiprocessing really doesn't work well in notebooks.

Using ipyparallel is forward thinking, but will probably require some kludges early on. We'll just have to keep an eye on ipyparallel and update our code as it improves. For now, I'm fine with leaving ipcluster launch inside Assembly.__init__, since I can't see a way to get it to work otherwise yet, and I've spent quite a lot of time trying. But you are welcome to try out the profiles biz, maybe there's something I've missed about it.