These are chat archives for thunder-project/thunder

26th
Nov 2015
Jeremy Freeman
@freeman-lab
Nov 26 2015 00:35
@jwittenbach what did we do in bolt for non-tuple keys? are they allowed, or not?
looks like we force keys to always be tuples
which is perfectly reasonable
Jason Wittenbach
@jwittenbach
Nov 26 2015 01:48
@freeman-lab yeah, I think we actually ran into this problem in Thunder earlier — if I remember right, there was actually an open issue about it, but we put it off, saying that we would wait for Bolt to come along and fix it at that level.
In Bolt, I think we even wrote a tupleize function so that we always make sure the keys end up as tuples, no matter what
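(For readers of the archive: a minimal sketch of what a key-normalizing helper of this kind might look like. The name `tupleize_key` and its exact behavior are assumptions for illustration, not Bolt's actual implementation.)

```python
from collections.abc import Iterable


def tupleize_key(key):
    """Return `key` as a tuple (hypothetical helper, not Bolt's real code).

    Tuples pass through unchanged, other non-string iterables are
    converted, and scalars are wrapped in a one-element tuple.
    """
    if isinstance(key, tuple):
        return key
    if isinstance(key, Iterable) and not isinstance(key, (str, bytes)):
        return tuple(key)
    return (key,)


# Normalize a mix of int, tuple, and list keys in (key, value) records.
records = [(0, "a"), ((1, 2), "b"), ([3, 4], "c")]
normalized = [(tupleize_key(k), v) for k, v in records]
```

With a helper like this applied at construction time, downstream code can index into a key without first checking its type.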
Jeremy Freeman
@freeman-lab
Nov 26 2015 01:51
nice ok cool
in this current refactoring, i'm making it so the keys in a series are always tupled, but in images they are allowed to be ints
Jason Wittenbach
@jwittenbach
Nov 26 2015 01:58
that makes sense. There was someone recently who was actually asking about multi-indexing on Images. As long as they still have the option of being tuples, there’s no reason we couldn’t move some of that functionality up. Key pieces of code like that would just have to remember that they can’t assume that indices will always be tuples.
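(A rough sketch of code that does not assume tuple keys, for readers of the archive. The functions `key_level` and `group_by_level` are hypothetical names for illustration and are not part of Thunder's API.)

```python
def key_level(key, level):
    """Return one level of a key that may be a plain int or a tuple."""
    if isinstance(key, tuple):
        return key[level]
    if level != 0:
        # A scalar key behaves like a one-level index.
        raise IndexError("scalar key has only level 0")
    return key


def group_by_level(records, level=0):
    """Group the values of (key, value) records by one level of their keys."""
    groups = {}
    for key, value in records:
        groups.setdefault(key_level(key, level), []).append(value)
    return groups


# Works whether the keys are ints (as allowed on images) or tuples.
mixed = [(0, "a"), ((0, 1), "b"), ((1, 0), "c")]
by_first = group_by_level(mixed, level=0)
```

The type check lives in one small function, so the grouping logic itself stays the same for both kinds of key.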
Jeremy Freeman
@freeman-lab
Nov 26 2015 01:59
hmm multi-indexing like your multi-index stuff on series?
that doesn't make sense to me, cause those are different indices
tarun joshi
@26tarun
Nov 26 2015 02:25
@jwittenbach: ohh! my bad! that was without the bracket https://github.com/26tarun/iPythonNotebooks/blob/master/PCA_single_image.ipynb . I would like to extract features from multiple unrelated images (images of various objects). When I follow the PCA tutorial and call model.scores.pack(), it packs all the series data into one big array. Instead of this, I would like the features from each image in a series / key-value / indexed format, so that I can plot them. (This is what I get as features from multiple images: https://github.com/26tarun/iPythonNotebooks/blob/master/PCA_multipleImages.ipynb )
Kyle
@kr-hansen
Nov 26 2015 02:33

@rmchurch I can see the sc.parallelize... Would there be any reason that using loadImages() on a .tif stack with nplanes=1 would end up going through the read method (def read) in LocalFSFileReader? Basically when I reserve a node on our cluster, I take the whole node and all cores, run in local mode, and parallelize across the node. However, Thunder doesn't automatically parallelize my tif stack when I read it in this way, and I'm trying to figure out why.

It stays on one core until I do a .toSeries() to the data, then it will parallelize correctly.