These are chat archives for thunder-project/thunder

9th
Dec 2016
chenminyeh
@chenminyeh
Dec 09 2016 17:37
memory error: Hi, I loaded a 4.2GB tif series (20,47,2048,2048) in Python using td.images.fromtif. sys.getsizeof(data) reports 56 bytes. The computer still has 15GB of its 16GB RAM free, but I get a memory error when I run ICA: algorithm = ICA(k=50, k_pca=10, svd_method='em', max_iter=10, tol=0.000001, seed=100).fit(data). Could anyone give some suggestions? Thanks!!
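A side note on the 56-byte figure: `sys.getsizeof` reports only the shallow size of the object it is given, not the data that object refers to, so a small number here says little about how much memory the image series needs. A minimal sketch, assuming a hypothetical `Wrapper` class (not a thunder class; it only stands in for any object that holds its data by reference):

```python
import sys


class Wrapper(object):
    """Hypothetical stand-in for an object that holds its data by reference."""

    def __init__(self, values):
        self.values = values


payload = bytearray(10 * 1024 * 1024)  # 10 MiB of actual data
wrapped = Wrapper(payload)

shallow_size = sys.getsizeof(wrapped)  # size of the wrapper object only
payload_size = len(wrapped.values)     # number of bytes actually held

print(shallow_size, payload_size)
```

The first number stays tiny no matter how large the payload is, which is why it cannot be used to judge whether a dataset fits in RAM.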
Davis Bennett
@d-v-b
Dec 09 2016 20:46
is this on 1 computer?
Forrest Collman
@fcollman
Dec 09 2016 23:23
hi, i would like to do some image processing on a dataset that i currently access via an image tile rest api (https://github.com/saalfeldlab/render), is there a natural way to express this image volume in thunder? I can see there is the fromlist(items, accessor=None, keys=None, dims=None, dtype=None, labels=None, npartitions=None, engine=None) method... and i can imagine writing the accessor function to grab the image tile through the api, and i already have functions that do that. However that will just give me one image tile, and the dataset is a 3d volume.
Davis Bennett
@d-v-b
Dec 09 2016 23:32
@fcollman can you express or represent each image as an item which can be put in a list? In the case of images stored on a local file system, this would result in a list of paths to images, like ['/images/im1.tif', '/images/im2.tif']. If you can do this for each plane in the volume, then you can load each plane with some accessor function and you should be all set
Forrest Collman
@fcollman
Dec 09 2016 23:33
what if the planes are 38000x10000 pixels
i can express each plane, and then write an accessor function to call the api to get the image yes.. it's just when each plane is 2.8 GB i'm not sure i want to be making web calls asking for 2.8 GB at a time
i can break it up into smaller bits, but i don't see how to tell thunder how to reassemble all the bits into a 3d matrix appropriately
Davis Bennett
@d-v-b
Dec 09 2016 23:38
can you request a small region from a single plane?
Forrest Collman
@fcollman
Dec 09 2016 23:38
yes i can
Davis Bennett
@d-v-b
Dec 09 2016 23:39
and you can request the same [x,y] region from a different plane
and what's your application?
Forrest Collman
@fcollman
Dec 09 2016 23:41
yes i can
i want to run a median filter on a very large image volume
i have very large array tomography volumes that have some sporadic artifacts in them that can be detected by subtracting the median filtered version of the data from the data, thresholding and then doing some morphological operations and finding large connected components.
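The pipeline described in this message can be sketched on a small in-memory volume. This is only an illustration under assumptions, not the workflow actually used here: it relies on `scipy.ndimage` rather than thunder, the function name `find_artifacts` is hypothetical, and the filter size, threshold, structuring element, and minimum component size are made-up values.

```python
import numpy as np
from scipy import ndimage


def find_artifacts(volume, filter_size=3, threshold=50.0, min_voxels=4):
    """Return a boolean mask of large bright outliers in a (z, y, x) volume."""
    volume = volume.astype(np.float32)
    # 3D median filter: the filtered copy approximates the local background.
    background = ndimage.median_filter(volume, size=filter_size)
    # The residual is large only where the data departs from its local median.
    candidates = (volume - background) > threshold
    # Morphological opening within each section drops single-pixel detections.
    candidates = ndimage.binary_opening(candidates, structure=np.ones((1, 2, 2)))
    # Label connected components and keep only the large ones.
    labels, count = ndimage.label(candidates)
    sizes = np.bincount(labels.ravel())
    keep = np.flatnonzero(sizes >= min_voxels)
    keep = keep[keep != 0]  # label 0 is the background, never an artifact
    return np.isin(labels, keep)


# Toy volume: one 4x4 bright patch in a single section plus one stray pixel.
toy = np.zeros((5, 20, 20), dtype=np.uint8)
toy[2, 5:9, 5:9] = 200
toy[1, 15, 15] = 200
mask = find_artifacts(toy)
```

On the toy volume the patch confined to one section is flagged and the stray pixel is discarded. For a volume too large for memory the same steps would have to run block by block with some overlap between blocks, which this sketch does not attempt.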
Davis Bennett
@d-v-b
Dec 09 2016 23:45
are you doing median filtering in 3d?
Forrest Collman
@fcollman
Dec 09 2016 23:45
yes that's the critical bit as fluorescent junk doesn't span sections
and sections are in z
GET /v1/owner/{owner}/project/{project}/stack/{stack}/z/{z}/box/{x},{y},{width},{height},{scale}/tiff-image
so i can get any 2d image from the volume via this api
or png
GET /v1/owner/{owner}/project/{project}/stack/{stack}/z/{z}/box/{x},{y},{width},{height},{scale}/png-image
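For illustration, a small helper that fills in the box routes quoted above might look like the following. The function name `box_url`, its argument names, and the example host are assumptions made for this sketch; only the path layout comes from the routes in this message.

```python
def box_url(base_url, owner, project, stack, z, x, y, width, height,
            scale=1.0, fmt="tiff-image"):
    """Build the URL for one 2D box of section z.

    fmt is the last path element, "tiff-image" or "png-image".
    """
    template = ("{base}/v1/owner/{owner}/project/{project}/stack/{stack}"
                "/z/{z}/box/{x},{y},{width},{height},{scale}/{fmt}")
    return template.format(base=base_url.rstrip("/"), owner=owner,
                           project=project, stack=stack, z=z, x=x, y=y,
                           width=width, height=height, scale=scale, fmt=fmt)


# Hypothetical host and names, used only to show the resulting string.
url = box_url("http://render.example.org/", "me", "proj", "stk",
              z=5, x=0, y=0, width=1024, height=1024)
print(url)
```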
Davis Bennett
@d-v-b
Dec 09 2016 23:49
so you could use that api to get a whole plane, if you set width and height to their max values
Forrest Collman
@fcollman
Dec 09 2016 23:49
correct
it just would be a 2.8GB web call
Davis Bennett
@d-v-b
Dec 09 2016 23:49
don't you have to load that data no matter what, since you want to process that whole volume?
Forrest Collman
@fcollman
Dec 09 2016 23:50
yes, but in my experience render doesn't like serving out such large images
i suppose i can write an accessor function that breaks it out into smaller calls
but in the end returns a 2.8GB array
does thunder only partition the data by imaging plane?
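One way to write such an accessor, sketched under assumptions: split the plane into boxes, fetch each box separately, and paste the pieces into one preallocated array. `fetch_box` is a hypothetical stand-in for whatever function calls the tile API; the fake version below generates data locally so the sketch runs without a server, and the tile size and plane dimensions are arbitrary.

```python
import numpy as np


def tile_boxes(width, height, tile):
    """Yield (x, y, w, h) boxes that cover a width x height plane."""
    for y in range(0, height, tile):
        for x in range(0, width, tile):
            yield x, y, min(tile, width - x), min(tile, height - y)


def make_plane_accessor(fetch_box, width, height, tile=1024, dtype=np.uint8):
    """Return accessor(z) that assembles one full plane from small requests."""
    def accessor(z):
        plane = np.empty((height, width), dtype=dtype)
        for x, y, w, h in tile_boxes(width, height, tile):
            # Each call asks for one small box instead of the whole plane.
            plane[y:y + h, x:x + w] = fetch_box(z, x, y, w, h)
        return plane
    return accessor


def fake_fetch_box(z, x, y, w, h):
    """Hypothetical stand-in for a web call: every pixel value encodes z."""
    return np.full((h, w), z, dtype=np.uint8)


accessor = make_plane_accessor(fake_fetch_box, width=2500, height=1800)
z_values = [0, 1, 2]
volume = np.stack([accessor(z) for z in z_values])  # shape (3, 1800, 2500)
```

The final `np.stack` is plain NumPy and only shows that planes assembled this way line up in list order; it does not involve thunder or Spark.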
Davis Bennett
@d-v-b
Dec 09 2016 23:52
the partitioning is done by spark
and you can specify how big you want your partitions to be
when you initialize a td.images object with a sparkcontext you can set the number of partitions
Forrest Collman
@fcollman
Dec 09 2016 23:54
okay, so if i set up a list of z values, then write an accessor function that returns an NxM ndarray which is the image for that z value
i assume i'm using fromlist(items, accessor=None, keys=None, dims=None, dtype=None, labels=None, npartitions=None, engine=None), where do i tell it which item is which z plane