These are chat archives for epnev/ca_source_extraction
Matlab implementation of a source extraction and spike inference algorithm for large scale calcium imaging data analysis, based on a constrained matrix factorization approach.
I've been testing out the latest 'demo_memmap' script using the parallelized script 'run_CNMF_patches', but I'm afraid I'm choosing patch sizes poorly. The default is [32 32] patches with [4 4] overlap, and for my images (512 x 512) I expanded that to [128 128] patches with 20 pixels of patch overlap. However, the final 'merged' background estimate for the whole FOV is very artifact-ridden, with large, dark ovals tiling the whole background (this is the reshaped 'b' matrix). Is this a common problem/something you've encountered before? Most of the ovals are parallel to each other and seem to sit in the middle of where each patch was, so I suspect something is going wrong when the background is merged between patches. Any advice on choosing different patch sizes, perhaps, to avoid this?
Secondly, for a given patch-size, do you recommend setting K to over-estimate the average number of cells within a patch of that size, or should it roughly equal that number?
Thanks in advance! Conor
And sorry, one more question:
Say I had a really big dataset and wanted to initialize A from a factorization run on only the first 1000 frames, for memory reasons. From what I understand, you can then feed that A matrix into an iteration of 'update_spatial' or 'update_temporal' that takes the WHOLE dataset as input, but uses the A matrix initialized from the smaller temporal chunk. What would you recommend setting the non-zero weights of each A component to? Since the specific spatial weight values are determined/constrained by the fluorescence dynamics over which they were estimated (e.g. the weights are basically (Y - Ybackground) / C for those first 1000 frames), would it be helpful to convert the A components to a binary mask for the subsequent CNMF iterations? Or maybe push them all to the mean/median value? Or maybe it doesn't matter; just wanted to check in advance.
For the patch size, the overlap is used to make sure that neurons right on the boundary between two patches are captured sufficiently well. I think a 20 pixel overlap is a bit too high, although it really depends on the size of the neurons you're looking for. In any case, I'm not sure why this would matter for the poor estimation of 'b'. If you want, send me a figure in private to see if I have any ideas. In general, K should slightly overestimate the average number of neurons; there is a post-selection screening process that aims to filter out false positives.
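To see how patch size and overlap interact, here is a rough sketch of the tiling arithmetic (in Python rather than MATLAB, and the function name is just illustrative; the toolbox's own patch-construction code is what actually runs). The rule of thumb above amounts to picking an overlap a bit larger than a neuron's diameter:

```python
def patch_starts(fov, patch, overlap):
    """Start indices of `patch`-pixel patches with `overlap` pixels of
    overlap along one dimension of a `fov`-pixel field of view."""
    step = patch - overlap                   # stride between patch origins
    starts = list(range(0, fov - patch + 1, step))
    if starts[-1] + patch < fov:             # last patch must reach the edge
        starts.append(fov - patch)
    return starts

# 512-pixel FOV: [32 32] patches with 4 px overlap vs [128 128] with 20 px
print(len(patch_starts(512, 32, 4)))     # -> 19 patches per dimension
print(len(patch_starts(512, 128, 20)))   # -> 5 patches per dimension
```

With [128 128] patches, only a 20-pixel strip is shared between neighbors, so any per-patch background estimate has little common support to be reconciled over, which is one place patch-boundary artifacts can creep in.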
I would use exactly the same A that was estimated on the smaller set of frames. The values will probably change, but the shape of A should remain more or less intact, and in general the initial estimate will be a better approximation than replacing it with a binary mask (or some other value). I would also recommend taking a look at the analysis pipeline explained here. The general idea there is to estimate [A,b] on downsampled data (for memory reasons and also to improve SNR), then use this A on the original files to get the traces at the original frame rate.
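That last step can be sketched in a few lines of Python with toy dimensions (all names and sizes here are illustrative, and plain least squares stands in for the toolbox's 'update_temporal' step, which additionally enforces nonnegativity and the AR dynamics): keep [A,b] fixed from the short/downsampled run and solve for full-length traces on the whole movie.

```python
import numpy as np

# Toy dimensions: d pixels, K components, T full-length frames
rng = np.random.default_rng(0)
d, K, T = 400, 5, 50

A = np.abs(rng.standard_normal((d, K)))   # footprints from the short/downsampled run
b = np.abs(rng.standard_normal((d, 1)))   # spatial background component
C_true = np.abs(rng.standard_normal((K, T)))
f_true = np.abs(rng.standard_normal((1, T)))
Y = A @ C_true + b @ f_true               # full movie, pixels x frames

# Keep A (and b) fixed and solve [A b] @ [C; f] ~= Y for full-length traces
Ab = np.hstack([A, b])
Cf, *_ = np.linalg.lstsq(Ab, Y, rcond=None)
C, f = Cf[:K], Cf[K:]
print(np.allclose(A @ C + b @ f, Y))      # reconstruction check
```

The point is that C and f are re-estimated on every frame of the original data, so the footprints only need to be a good enough spatial prior; there is no need to binarize them first.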