These are chat archives for thunder-project/thunder

26th
Mar 2015
Noah Young
@npyoung
Mar 26 2015 00:21
Having some trouble upgrading thunder on an existing cluster. I've pulled master and installed thunder on all machines on the cluster, but thunder code seems to be running from an egg in /mnt which isn't getting reset. Not familiar with how eggs work... what's the right way to get the latest code on my cluster?
Jeremy Freeman
@freeman-lab
Mar 26 2015 01:28
ah, yes, we made a change to the egg distribution process in response to thunder-project/thunder#131 and it would affect an already running cluster if you upgraded
the fix should just be calling the "build" executable (as easy as typing "build" if it's on your path)
basically, when thunder launches one of the things it does is distribute an egg file with its code to the workers
this is a much simpler alternative to actually installing it on the workers
and we recently changed where it gets that egg from
Jeremy Freeman
@freeman-lab
Mar 26 2015 01:35
though i'm a little surprised it didn't do this automatically, here: https://github.com/thunder-project/thunder/blob/master/python/thunder/utils/launch.py#L139-146
Jeremy Freeman
@freeman-lab
Mar 26 2015 01:43
i wonder if your installing thunder on the workers is actually causing unexpected behavior
Jeremy Freeman
@freeman-lab
Mar 26 2015 01:58
@npyoung ok, just started an ec2 cluster, under the environment section of the UI, for spark.files, you should see: file:/root/thunder/python/thunder/lib/thunder_python-0.5.0_dev-py2.6.egg
the "location" of that egg once running is indeed under /mnt, fairly certain that's just how spark handles the files it adds
and if you ever update the code and rerun build you should have the latest (on driver and workers)
Noah Young
@npyoung
Mar 26 2015 05:26
Ran setup.py build on the master, but didn't manage to update the egg.
Jeremy Freeman
@freeman-lab
Mar 26 2015 06:10
sorry, it's just "build"... it's an executable inside thunder/python/bin
it's calling setup.py clean bdist_egg under the hood
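A minimal sketch of that rebuild step in Python, for anyone scripting it. The only thing taken from the discussion is the `setup.py clean bdist_egg` command; the helper names are hypothetical, and the `dist/` output directory is the setuptools default, which is an assumption here since the real `build` script may place the egg elsewhere.

```python
import glob
import os
import subprocess
import sys


def build_egg_command(python=sys.executable):
    """Return the command that `build` is described as running under the hood."""
    return [python, "setup.py", "clean", "bdist_egg"]


def newest_egg(dist_dir):
    """Return the most recently modified .egg file in dist_dir, or None."""
    eggs = glob.glob(os.path.join(dist_dir, "*.egg"))
    return max(eggs, key=os.path.getmtime) if eggs else None


def rebuild(source_dir):
    """Rebuild the egg from source_dir and return its path.

    Hypothetical helper, not thunder's own code. Assumes setup.py lives in
    source_dir and that bdist_egg writes to the default dist/ directory.
    """
    subprocess.check_call(build_egg_command(), cwd=source_dir)
    return newest_egg(os.path.join(source_dir, "dist"))
```

Calling `rebuild("/path/to/thunder/python")` on the driver would then return the path of the freshly built egg, or `None` if no egg was found in `dist/`.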
Jason Wittenbach
@jwittenbach
Mar 26 2015 19:07
Re: March 24 2015 7:52 PM @Yassum You can do linear regression with a single regressor, just make sure that the design matrix is 1-by-N (two-dimensional) and not a flat length-N list; i.e. make sure it's a size=1 list/array of lists/arrays rather than just a list with the values in it.
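A quick NumPy illustration of the shape difference, using made-up values. It only shows the array shapes and does not call any thunder regression API.

```python
import numpy as np

values = [0.1, 0.5, 0.9, 1.3]      # one regressor, N = 4 samples (made-up data)

flat = np.asarray(values)          # 1-D array: no separate regressor axis
design = np.asarray([values])      # 2-D array: one row per regressor

print(flat.shape)                  # (4,)
print(design.shape)                # (1, 4)

# np.atleast_2d turns a flat list into the same 1-by-N layout.
same_layout = np.atleast_2d(values)
```

Wrapping the values in an extra list, or passing them through `np.atleast_2d`, is enough to get the 1-by-N layout.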
Jason Wittenbach
@jwittenbach
Mar 26 2015 19:13
I know that's how it works with LinearRegressionModel. I haven't tried using BilinearRegressionModel yet, but looking at the implementation, I think that the same should hold. Though note that for BilinearRegressionModel you need two design matrices as this is a more complicated model of the data.
@freeman-lab On the topic of regression: would it be reasonable to include an option (at least for linear regression) to get the intercept term back as well? Seems kind of sneaky to compute it and then not report it...
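For reference, one common way to estimate an intercept and report it alongside the slopes, sketched with plain NumPy. This is not thunder's implementation; the function name and the (k, N) design layout are assumptions for illustration.

```python
import numpy as np


def fit_with_intercept(design, y):
    """Least-squares fit of y = b0 + b . design.

    design has shape (k, N), one row per regressor; y has shape (N,).
    Returns (intercept, slopes).
    """
    design = np.atleast_2d(design)
    # Stack a row of ones so the constant term is estimated with the slopes.
    X = np.vstack([np.ones(design.shape[1]), design]).T   # shape (N, k + 1)
    coeffs, _, _, _ = np.linalg.lstsq(X, y, rcond=None)
    return coeffs[0], coeffs[1:]


# Made-up data lying exactly on the line y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = 2.0 * x + 1.0
intercept, slopes = fit_with_intercept([x], y)
```

Here `intercept` comes back close to 1 and `slopes[0]` close to 2, so returning the first coefficient separately is all it takes to expose the intercept.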