These are chat archives for thunder-project/thunder

11th
Feb 2015
Nikita Vladimirov
@nvladimus
Feb 11 2015 19:06
Hi, folks. Dumb question - how do I start using Thunder in an IPython notebook under Windows? I installed Anaconda and ran pip install thunder-python, but then thunder just doesn't start. When do I need to start the cluster? Thanks!
Nikita Vladimirov
@nvladimus
Feb 11 2015 19:30
Related question - do I need to install Spark on my local machine in order to run analysis on the cluster?
industrial-sloth
@industrial-sloth
Feb 11 2015 19:37

@nvladimus to your second question: in general, no, you could always just ssh up to a cluster with Thunder and Spark already installed, in which case you don't really need anything (not even Thunder) beyond an ssh client. You do need Spark to be installed locally, or at least have access to Spark's ec2.py script, in order to launch an AWS cluster with the thunder-ec2 script.

I unfortunately do not know enough about the ipython notebook setup to be able to speak to that with any authority - my guess would be that you should be able to ssh into the cluster, launch the notebook server on the cluster using the cluster's installation of Thunder and Spark, and then connect to that running notebook from your local browser. In which case again you shouldn't have to install anything on your local machine. But hopefully somebody else who uses the notebook routinely would be able to confirm this?
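
A rough sketch of the "connect from your local browser" step, assuming the notebook server on the cluster listens on port 8888 and that plain SSH port forwarding is permitted. The user name, host name, and port below are placeholders, not values from this conversation. The helper only builds the `ssh` command so it can be inspected before running it:

```python
import shlex


def build_tunnel_command(user, host, local_port=8888, remote_port=8888):
    """Build an ssh command that forwards a local port to a port on the cluster.

    -N tells ssh not to run a remote command (tunnel only);
    -L sets up the local-to-remote port forwarding.
    All arguments are hypothetical placeholders chosen for illustration.
    """
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, f"{user}@{host}"]


if __name__ == "__main__":
    # Placeholder login and host; substitute the real cluster details.
    cmd = build_tunnel_command("myuser", "cluster.example.org")
    print(shlex.join(cmd))
```

With the tunnel running, the notebook started on the cluster should be reachable at `http://localhost:8888` in the local browser, with nothing but an ssh client installed locally.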

Nikita Vladimirov
@nvladimus
Feb 11 2015 19:40
thanks, @industrial-sloth , I will try this now
Jeremy Freeman
@freeman-lab
Feb 11 2015 20:51
@snarles thanks for the interest! Are you referring to EM as an algorithm for estimating some particular model (e.g. gaussian mixture model), or an implementation of EM in general? There's a new EM implementation of GMMs in Spark itself that we might want to add a wrapper for in Thunder (https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/clustering/GaussianMixture.scala), but if you had other analyses in mind they could go directly into Thunder, let us know what you were thinking!
Jeremy Freeman
@freeman-lab
Feb 11 2015 20:58
@nvladimus for clarification, are you trying to run Thunder + Spark + IPython on your local Windows machine for testing purposes, or are you talking about running on the Janelia cluster? If it's just local, maybe give a little more info about what you are doing and what's not working. If you are talking about the cluster, let's move the conversation over here (https://gitter.im/freeman-lab/spark-janelia), as it's a site-specific issue.
Jason Wittenbach
@jwittenbach
Feb 11 2015 21:00
@nvladimus I just posted over in the spark-janelia gitter with some notes on how to use Thunder on the local cluster, if that's what you're looking for
Nikita Vladimirov
@nvladimus
Feb 11 2015 22:23
Thanks, guys! I was asking about running on the Janelia cluster, of course.