These are chat archives for thunder-project/thunder

16th
Nov 2015
alexandrelaborde
@AlexandreLaborde
Nov 16 2015 08:53

Hello again guys, how are you? I am just a step away from setting up thunder here, and now it's time for demos, but I stumbled into an obvious problem that I somehow hadn't foreseen…

In order to load images, thunder uses the ThunderContext tsc which is created when thunder starts using the SparkContext sc. But when submitting jobs to the cluster using thunder-submit, there’s no initialization step where tsc is created and therefore no tsc.LoadImagesAsSeries for instance.

Is there a ThunderContext hiding around in thunder-submit and I haven’t found him yet, or do I have to create mine in order to load the data when running thunder-submit?

tlabruyere
@tlabruyere
Nov 16 2015 13:22
Is there a way to install thunder from source? I do not have admin rights on the machine I am trying to run on and it would be nice to run thunder as a library.
alexandrelaborde
@AlexandreLaborde
Nov 16 2015 16:08
@tlabruyere I think you just have to manually compile everything but I am not the right person to answer this
Jeremy Freeman
@freeman-lab
Nov 16 2015 16:42
@tlabruyere in general there should be no need to compile / build anything, though there are a couple exceptions to that depending on the environment you’re in (local vs cluster)… maybe tell us more about what you tried to do and what the error was?
@kkcthans if it’s just affecting the series loading examples, can you confirm you are running thunder 0.5.1? if so, this is an issue that’s fixed on master, i just need to push a release to pypi
at one point thunder’s binary loading relied on a scala library that in turn depended on having a particular version of hadoop (1.0)
very annoying, so we got rid of it
alexandrelaborde
@AlexandreLaborde
Nov 16 2015 16:54
@freeman-lab do I need to create a ThunderContext to load my data in thunder-submit?
Kyle
@kr-hansen
Nov 16 2015 19:00
@freeman-lab It seems to only affect the series loading examples. When I type "thunder" and it opens in interactive mode, it says "Thunder version 0.5.1". Also, typing "thunder.version" returns '0.5.1'. Would there be other ways to check this?
Kyle
@kr-hansen
Nov 16 2015 19:27

@freeman-lab Also another small and probably unrelated issue I'm coming across in the same environment.

When I use Colorize.image() to display the example images that load ('mouse-images'), the image does not automatically display. I need to "import matplotlib.pyplot as plt" and do a plt.show().

From what I read in the code for colorize.py (https://github.com/thunder-project/thunder/blob/master/thunder/viz/colorize.py), image() itself seems like it should already show my image, though that is not happening. The tutorials also seem structured to suggest that image() should automatically display the images.

Kyle
@kr-hansen
Nov 16 2015 19:33

@AlexandreLaborde I believe you need to create a ThunderContext at the beginning of your script that you submit. This is also true for Spark. When you run Spark or Thunder in interactive mode, it automatically creates a spark-context for you (sc or tsc respectively). However, whenever you submit a job to a cluster, this context needs to be created at the beginning of your script.

I am still newish to both Spark and Thunder, so someone can correct me if I'm wrong, but I am pretty certain this is the case.

alexandrelaborde
@AlexandreLaborde
Nov 16 2015 19:39
@kkcthans I believe that you are right, but I would like to confirm that with the God of Thunder @freeman-lab :)
I don't like having to create a SparkContext to create a ThunderContext to be able to load the images... Not because of me, but because the researchers that I work with don't have many programming skills, and this process requires creating a lot of stuff with a lot of parameters, some associated with distributed computing that I am sure they don't know.
@kkcthans I think the best option, if it turns out that you have to create the contexts, is to create some sort of wrapper that takes the same inputs as thunder-submit but adds the remaining code and then calls thunder-submit
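A minimal sketch of that wrapper idea, in case it helps. It prepends the context setup to a copy of the researcher's script and builds the thunder-submit command, without running anything. The names `wrap_script` and `build_command` are hypothetical, and the header assumes that `ThunderContext.start(...)` exists in the installed Thunder version and passes its arguments on to SparkContext, so check that before relying on it.

```python
import os
import tempfile

# Assumption: ThunderContext.start(...) is available and forwards its
# arguments to SparkContext. Verify against the installed Thunder version.
HEADER = (
    "from thunder import ThunderContext\n"
    "tsc = ThunderContext.start(appName={app_name!r})\n"
)


def wrap_script(user_script_path, app_name="thunder-job"):
    """Write a copy of the user's script with the context setup prepended.

    Returns the path of the new temporary file.
    """
    with open(user_script_path) as f:
        body = f.read()
    fd, wrapped_path = tempfile.mkstemp(suffix=".py", prefix="wrapped_")
    with os.fdopen(fd, "w") as out:
        out.write(HEADER.format(app_name=app_name))
        out.write(body)
    return wrapped_path


def build_command(user_script_path, submit_args=(), app_name="thunder-job"):
    """Return the argument list for thunder-submit.

    submit_args is whatever the user would normally pass to thunder-submit;
    it is forwarded unchanged. The caller decides whether to run the command.
    """
    wrapped = wrap_script(user_script_path, app_name)
    return ["thunder-submit"] + list(submit_args) + [wrapped]
```

To actually submit, something like `subprocess.check_call(build_command("analysis.py"))` would do, where `analysis.py` is a placeholder for a script that uses `tsc` without creating it. Returning the command as a list instead of running it keeps the wrapper easy to inspect before anything is sent to the cluster.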
tlabruyere
@tlabruyere
Nov 16 2015 22:40
@freeman-lab I tried to use pip install thunder-python on the cluster that I am working on (school-provided env), however it could not install into the required location because I do not have admin rights. To get this installed I believe I have 2 options: 1) Go down the path of creating a virtualenv in Python and doing a pip install, or 2) Attempt to build from source and reference the code on my cluster?