C, the SVC hyperparameter, in order to make the margin narrower. But it turns out that C=1, C=100, and C=10000 show no differences at all in classification accuracy or classification confidence. My guess is that the precomputed kernel may play a role? Anyway, I do not fully understand why this is the case; any help on either the SVM side or the FCMA side would be super helpful! Thank you all in advance!
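(For anyone who wants to poke at this outside of FCMA, here is a minimal toy sketch, not the poster's pipeline, of varying C with a precomputed kernel in scikit-learn; the data and train/test split are made up.)

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))      # toy data: 100 samples, 50 features
y = rng.integers(0, 2, 100)             # toy binary labels
K = X @ X.T                             # linear Gram matrix for kernel='precomputed'

for C in (1, 100, 10000):
    clf = SVC(kernel='precomputed', C=C)
    clf.fit(K[:80, :80], y[:80])                # training Gram: train x train
    acc = clf.score(K[80:, :80], y[80:])        # test Gram: test x train
    print(f"C={C}: accuracy={acc:.3f}")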
Hi, I am using the brainiak.reprsimil package and have pre-processed the fMRI data as required to perform a GBRSA. All data are in the needed format, i.e., the ROIs to be fitted are in a list (of length #subjects), with each element of the list being an array of shape time points x voxels, where the runs are concatenated along the time dimension. Furthermore, the design matrix is as needed, and the scan onsets and some nuisance regressors are provided correctly. When initiating the instance I set auto_nuisance=False, so that only the given nuisance regressors are used.
The GBRSA has now been running for a while and does not seem to come to an end. Does anyone have experience with how long this may take for about 20 subjects with about 2000 time points each, or a way to parallelise the computation and monitor its progress?
Any advice or idea would be helpful, thank you very much! :)
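(In case it helps others reproduce the setup, here is a minimal self-contained sketch of the input layout described above; all dimensions are made up and it does not call GBRSA itself.)

import numpy as np

n_subjects, n_runs, n_trs_per_run, n_voxels = 20, 4, 500, 300
# one array per subject, runs concatenated along the time axis: time points x voxels
data = [np.random.randn(n_runs * n_trs_per_run, n_voxels) for _ in range(n_subjects)]
# scan_onsets marks where each run starts in the concatenated time series
scan_onsets = np.arange(0, n_runs * n_trs_per_run, n_trs_per_run)
print(len(data), data[0].shape, scan_onsets)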
Hello! I am helping a grad student use the searchlight code on a really big dataset/analysis and we are running into a strange problem I have never seen before. In particular, the code runs for approximately 8 hours, then seems to freeze and stops producing any more output, even if we let it run for multiple days.
To give some more details: the kernel computation takes 8-10 s, we have 230k voxels, and we have used up to 120 cores to run this, although we get similar results with fewer cores. The way we track progress is to print to a log file the time stamp at which every searchlight was run. No error messages appear in the log; it just times out after hanging for multiple days without producing a new result. Using a back-of-the-envelope calculation, this code should only take about 5 hours on 120 cores, so it is already running slowly.
@manojneuro @mjanderson09
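(For reference, the back-of-the-envelope estimate above works out roughly as follows, using the quoted numbers: 230k voxels, ~9 s per kernel, 120 cores.)

n_voxels = 230_000       # searchlight centers
secs_per_kernel = 9      # midpoint of the quoted 8-10 s
n_cores = 120

hours = n_voxels * secs_per_kernel / n_cores / 3600
print(f"expected runtime: ~{hours:.1f} hours")   # ~4.8 hours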
Hi everyone. I want to calculate ISC over time. Right now the data structure is TR x Voxels x Subjects. The code that I am working with is the following:
n_TR = 190  # Total time-points
window_width = 10
T_iscs = []
for start in np.arange(0, n_TR, 10):
    window_data = data[start:start+9, 0:, 0:]
    window_isc = isc(window_data, pairwise=False)
    T_iscs.append(window_isc)
It gives me an output as a list with len(T_iscs) of 63. However, all the values are either nan or 1.
How can I solve this?
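(For comparison, here is a hedged sketch of one way such a sliding-window loop could be written, assuming data shaped (TR, voxel, subject); it slices full windows using window_width and is meant as an illustration, not a diagnosis of the nan/1 values.)

import numpy as np
from brainiak.isc import isc

n_TR, n_voxels, n_subjects = 190, 100, 10
data = np.random.randn(n_TR, n_voxels, n_subjects)    # placeholder data
window_width = 10

T_iscs = []
for start in range(0, n_TR - window_width + 1, window_width):
    window_data = data[start:start + window_width]    # (window_width, voxel, subject)
    T_iscs.append(isc(window_data, pairwise=False))

print(len(T_iscs), T_iscs[0].shape)    # 19 windows, each (n_subjects, n_voxels)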
Hello everyone, I'm a beginner with BrainIAK. Where can I find the utils.py used in this tutorial: https://brainiak.org/tutorials/02-data-handling/?
https://github.com/brainiak/brainiak-tutorials/blob/master/tutorials/utils.py
Hi everyone, I am trying to install BrainIAK on a Mac Mini. I installed Miniconda and activated it.
While installing BrainIAK I encounter the following problem. Please help. Thanks!
Verifying transaction: | WARNING conda.core.path_actions:verify(962): Unable to create environments file. Path not writable.
environment location: /Users/wu_lab/.conda/environments.txt
done
Executing transaction: | WARNING conda.core.envs_manager:register_env(50): Unable to register environment. Path not writable or missing.
environment location: /Users/wu_lab/miniconda3/envs/venv
registry file: /Users/wu_lab/.conda/environments.txt
Hi everyone! BrainIAK is such a great program, thanks for making it so available to everyone. I am currently trying to do an ISFC analysis on 2 groups. Specifically, I have independently calculated the ISFC for my two groups (i.e., I only calculated the within-group ISFCs) and I would now like to compare them, i.e., find the edges/connections where one group has a significantly stronger connection strength than the other. Kind of like a 2-group ISC permutation analysis, but with ISFC data. Does anyone know what the appropriate statistical test would be for this data? Would I be able to use the permutation_isc function on ISFC data?
thanks!
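(One possible starting point, offered as an assumption rather than an endorsed recipe: if each group's within-group ISFC is arranged as one row per subject and one column per connection, the two groups can be stacked and passed to permutation_isc with a group_assignment, analogous to the two-group ISC example; whether this test is statistically appropriate for ISFC is a separate question. The array names and sizes below are placeholders.)

import numpy as np
from brainiak.isc import permutation_isc

isfc_group1 = np.random.randn(15, 500)   # placeholder: (subjects in group 1, connections)
isfc_group2 = np.random.randn(15, 500)   # placeholder: (subjects in group 2, connections)

isfc_both = np.vstack([isfc_group1, isfc_group2])
group_assignment = [1] * isfc_group1.shape[0] + [2] * isfc_group2.shape[0]

observed, p, distribution = permutation_isc(isfc_both,
                                            group_assignment=group_assignment,
                                            pairwise=False,
                                            summary_statistic='mean',
                                            n_permutations=1000)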
Hello everyone, I am currently trying to compute spatial ISC. I am having a little trouble understanding the input data format. Right now, following the tutorial, I am also unable to plot the ISCs: instead of plotting the mean correlation over time, it plots the 300 time points with subjects on the X-axis.
When computing spatial ISC, should my data format be voxel x TR x subject? Do I need to change my data format beforehand, given that there are two transposes in two different steps?
Any help is appreciated. Thanks!
[HARDWARE PURCHASE RECOMMENDATIONS]
Hello everyone,
Our lab currently has a local compute cluster with 2 nodes (24 cores, 128 GB RAM each) and 3 workstations (8 cores, 32 GB each). We want to add a few new machines, and I have a very basic question: for naturalistic neuroimaging analysis (ISC, IS-RSA, etc.), would you recommend
buying two machines with 24 cores and 128 GB RAM, or
buying one machine with 64 cores and 256 GB RAM?
My understanding is that two machines are better for embarrassingly parallel problems (e.g., distributing multiple subjects when running fMRIPrep), and that a single, high-powered machine is better for costly computations (e.g., phase randomization during ISC). As additional info, we use SLURM as our scheduler.
Many thanks!
Freddy
Dear Brainiak Team/everyone,
Sorry for the newbie question, I'm a complete beginner in Python. I would like to run the tutorial with Python on my own computer, not via Jupyter Notebook (because I want to adapt it later to my own data). Could you please point me to what I need to install for this? Should I try to run what is explained under "Cluster" at https://brainiak.org/tutorials/? (If so, I'm getting some error messages whenever I try.)
Thank you!!
Hi
Sorry if this is a stupid question, but I am so confused about the permutation testing part of the ISC (10) tutorial. I understand what the permutation test is doing and why, but the output from the tutorial makes no sense to me. The tutorial output is:
observed: (98508,)
p: (98508,)
distribution: (1000, 98508)
Please help!
Dear all,
I ran into a problem when trying to do an ISC analysis.
All the data were preprocessed and normalized to the MNI 6th-generation template (resolution: 2) via fMRIPrep, then smoothed via SPM12 and denoised via Denoiser (https://github.com/arielletambini/denoiser).
I'm following Nastase's ISC tutorial (https://github.com/snastase/isc-tutorial/blob/master/isc_tutorial.ipynb), and I get an error when I try to use "MaskedMultiSubjectData.from_masked_images" to collate the data into a single TR x voxel x subject array.
The error is:
ValueError: Image 19 has different shape from first image: (111, 253137) != (105, 253137)
I found that including 3 subjects' data in the analysis causes this error, which is odd because all the data were preprocessed in the same way. I also checked the dimensions and voxel sizes of these images via SPM's Display function, and both match the other data that does not trigger the error.
Has anyone met the same problem, or does anyone know how to solve it?
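(A hedged diagnostic sketch: the error means two inputs differ in their number of TRs (111 vs 105), so one quick check is to list each image's time dimension before collating. The glob pattern below is a placeholder for wherever the preprocessed files live.)

from glob import glob
import nibabel as nib

func_files = sorted(glob('derivatives/*/func/*_bold.nii.gz'))   # placeholder pattern
for path in func_files:
    n_trs = nib.load(path).shape[-1]   # last dimension of a 4D NIfTI = number of volumes
    print(path, n_trs)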
Hi all! Hope everyone is well. I have a Searchlight() question regarding multiple outputs from a kernel function. This could be multiple subjects (see example below) or multiple scores, in case people want to compare multiple accuracy measures (e.g., accuracy + AUC) or distance measures (e.g., correlation distance + Euclidean distance).
In the Searchlight tutorial for running multiple subjects, the output of the kernel function accuracy is a list of elements (with each element corresponding to the kernel's output for a different subject). When the searchlight completes, the output (e.g., sl_result_3subj) will be a 3D array with an odd form. Essentially, its shape will reflect the input voxel dimensionality (x, y, z) (e.g., (64, 64, 26)) instead of having a fourth dimension for subjects (e.g., (64, 64, 26, 3) for three subjects). This is because the searchlight only runs in voxels where mask==1. The output array is therefore a bit odd; here's a simplistic example for sl_result_3subj: np.array([None, [subj1a, subj2a, subj3a], [subj1b, subj2b, subj3b], None, None]). The shape of this array would still be (5,), not reflecting the elements with three items.
I would like to create an output with the form (x, y, z, s) that reflects the multiple outputs from the kernel function and that takes the masked voxels into account, but I'm unsure what the most efficient way to accomplish this would be! Any help would be super appreciated! Thanks so much!
-Shawn
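(Not an official answer, but here is a sketch of one way the object array described above could be unpacked into a dense (x, y, z, s) array, filling masked-out voxels with NaN; n_outputs is assumed to equal the length of the list your kernel returns.)

import numpy as np

def unpack_sl_result(sl_result, n_outputs):
    """Convert a searchlight object array (None outside the mask, a list of
    n_outputs values inside) into a float array of shape (x, y, z, n_outputs)."""
    out = np.full(sl_result.shape + (n_outputs,), np.nan)
    for idx in np.ndindex(*sl_result.shape):
        if sl_result[idx] is not None:
            out[idx] = sl_result[idx]
    return out

# e.g. result_4d = unpack_sl_result(sl_result_3subj, 3)   # -> (64, 64, 26, 3)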
Hi everyone,
I'm looking for a proper tool/software/viewer for inter-subject correlation (ISC) maps. I'd appreciate it if anyone could recommend one. (I tried xjview, an SPM toolbox, but it doesn't properly support viewing ISC maps.)
Howdy! I have what I'm afraid is a rather basic question. I set the searchlight shape to "Ball", but when I output "sl_mask.shape" as in the tutorial, it gives me cube dimensions, e.g., "(3, 3, 3)".
However, if I understand correctly, this is expected. The documentation states that the searchlight package takes a cube shape and sets voxels that have "...a Euclidean distance of equal to or less than rad from the center point" to True.
My question, then, is how can I see which voxels have been set to True so that I can get a better visualization of the searchlight shape?
Thank you much!!
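(For a quick look without digging into the package, here is a small sketch that reproduces the rule quoted above: mark the voxels of the (2*rad+1)^3 cube whose Euclidean distance from the center is at most rad.)

import numpy as np

rad = 1
offsets = np.indices((2 * rad + 1,) * 3) - rad          # voxel offsets from the center
ball = np.sqrt((offsets ** 2).sum(axis=0)) <= rad        # True inside the ball
print(ball.astype(int))                                  # the 3x3x3 True/False pattern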
!pip install matplotlib==3.1.3. I found this info at https://github.com/facebook/prophet/issues/1691.
We may want to update the requirements for brainiak to match.
Hi, I'm really struggling conceptually with the second part of the ISC (10) tutorial, ISC with statistics. I'm fairly new to Python and I don't really understand what to do with the code. The tutorial output is:
observed: (98508,)
p: (98508,)
distribution: (1000, 98508)
It has been explained to me that 98508 refers to the number of voxels and 1000 refers to the number of permutations. But how do I get a p-value that tells me whether my two conditions are significantly different or not? Below is the code I am confused about:
n_permutations = 1000
summary_statistic = 'mean'
observed, p, distribution = permutation_isc(
    isc_maps_all_tasks,
    pairwise=False,
    group_assignment=group_assignment,
    summary_statistic=summary_statistic,
    n_permutations=n_permutations
)
p = p.ravel()
observed = observed.ravel()
print('observed: {}'.format(np.shape(observed)))
print('p: {}'.format(np.shape(p)))
print('distribution: {}'.format(np.shape(distribution)))
Any basic explanations would be really greatly appreciated.
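(A hedged follow-up sketch, with toy values standing in for the p array produced above: when group_assignment is supplied, each entry of p is already a two-group permutation p-value for one voxel, so a simple readout is counting voxels below a threshold; the tutorials also show FDR correction for multiple comparisons.)

import numpy as np

p = np.random.uniform(size=98508)   # placeholder for the p returned by permutation_isc
alpha = 0.05
n_sig = np.sum(p < alpha)
print(f"{n_sig} of {p.size} voxels differ between conditions at uncorrected p < {alpha}")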
Hi, I'm following 2 ISC tutorials, the official BrainIAK tutorial (https://brainiak.org/tutorials/10-isc/) and Nastase's tutorial (https://github.com/snastase/isc-tutorial). I notice these two tutorials use different ways to calculate ISC maps via isc(data, pairwise=False, summary_statistic=None, tolerate_nans=True).
In the former, the ISC maps are calculated for the 2 groups separately via a for loop, and then np.vstack is used to concatenate the ISC maps from both tasks for the permutation test.
# run ISC, loop over conditions
isc_maps = {}
for task_name in all_task_names:
    isc_maps[task_name] = isc(bold[task_name], pairwise=False)
    print('Shape of %s condition:' % task_name, np.shape(isc_maps[task_name]))

# Concatenate ISCs from both tasks
isc_maps_all_tasks = np.vstack([isc_maps[task_name] for
                                task_name in all_task_names])
print('group_assignment: {}'.format(group_assignment))
print('isc_maps_all_tasks: {}'.format(np.shape(isc_maps_all_tasks)))

# permutation testing
n_permutations = 1000
summary_statistic = 'mean'
observed, p, distribution = permutation_isc(
    isc_maps_all_tasks,
    pairwise=False,
    group_assignment=group_assignment,
    summary_statistic=summary_statistic,
    n_permutations=n_permutations
)
In contrast, the data variable in the latter one contains the 2 groups' data, and they only use isc(data, pairwise=False, summary_statistic=None, tolerate_nans=True) once to calculate ISC maps.
# Create data with noisy subset of subjects
noisy_data = np.dstack((np.dstack((
                            simulated_timeseries(n_subjects // 2, n_TRs,
                                                 n_voxels=n_voxels, noise=1))),
                        np.dstack((
                            simulated_timeseries(n_subjects // 2, n_TRs,
                                                 n_voxels=n_voxels, noise=5)))))

# Create group_assignment variable with group labels
group_assignment = [1]*10 + [2]*10
print(f"Group assignments: \n{group_assignment}")

# Compute ISCs and then run two-sample permutation test on ISCs
iscs = isc(noisy_data, pairwise=True, summary_statistic=None)
observed, p, distribution = permutation_isc(iscs,
                                            group_assignment=group_assignment,
                                            pairwise=True,
                                            summary_statistic='median',
                                            n_permutations=200)
I'm confused about this difference and wondering whether it produces different results when using permutation_isc, because I think these 2 ways will create different numbers of ISC maps with the pairwise approach. Does permutation_isc handle ISC maps from the different approaches differently, so that both ways calculate ISC correctly?
from statsmodels.stats.multitest import multipletests

# Get number of NaN voxels
n_nans = np.sum(np.isnan(observed))
print(f"{n_nans} voxels out of {observed.shape[0]} are NaNs "
      f"({n_nans / observed.shape[0] * 100:.2f}%)")

# Get voxels without NaNs
nonnan_mask = ~np.isnan(observed)
nonnan_coords = np.where(nonnan_mask)

# Mask both the ISC and p-value map to exclude NaNs
nonnan_isc = observed[nonnan_mask]
nonnan_p = p[nonnan_mask]

# Get FDR-controlled q-values
nonnan_q = multipletests(nonnan_p, method='fdr_by')[1]
threshold = .05
print(f"{np.sum(nonnan_q < threshold)} significant voxels "
      f"controlling FDR at {threshold}")

# Threshold ISCs according to the FDR-controlled threshold
nonnan_isc[nonnan_q >= threshold] = np.nan

# Reinsert thresholded ISCs back into whole-brain image
isc_thresh = np.full(observed.shape, np.nan)
isc_thresh[nonnan_coords] = nonnan_isc
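(Regarding the question above about the number of ISC maps: a small sketch with simulated data, showing that isc() with pairwise=False returns one map per subject while pairwise=True returns one map per subject pair, so the row counts fed to permutation_isc do indeed differ between the two tutorials.)

import numpy as np
from brainiak.isc import isc

data = np.random.randn(100, 50, 10)       # 100 TRs, 50 voxels, 10 subjects
print(isc(data, pairwise=False).shape)    # (10, 50): n_subjects leave-one-out maps
print(isc(data, pairwise=True).shape)     # (45, 50): n_subjects*(n_subjects-1)/2 pairwise maps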
Hello, I'm running a BrainIAK searchlight (sl_rad = 3, max_blk_edge = 10, pool_size = 1) and getting errors from mpi4py (traceback below). The searchlight loads 253 GB of data. I'm using 3 Tiger cluster nodes (120 ranks in total), and the job uses 840 GB of memory across all 3 nodes. The log shows an mpi4py error; a quick Google search suggests that this error crops up when the size of the pickled object is >2 GB (https://githubmemory.com/repo/mpi4py/mpi4py/issues/119). Any suggestions for solving this issue would be much appreciated!
Traceback (most recent call last):
  File "../notebooks/batch/delta_searchlight.py", line 899, in <module>
    main()
  File "../notebooks/batch/delta_searchlight.py", line 880, in main
    sl.distribute(full_set, sl_mask)
  File "../.conda/envs/brainiak11/lib/python3.7/site-packages/brainiak/searchlight/searchlight.py", line 370, in distribute
    for (s_idx, s) in enumerate(splitsubj)]
  File "../.conda/envs/brainiak11/lib/python3.7/site-packages/brainiak/searchlight/searchlight.py", line 370, in <listcomp>
    for (s_idx, s) in enumerate(splitsubj)]
  File "../.conda/envs/brainiak11/lib/python3.7/site-packages/brainiak/searchlight/searchlight.py", line 310, in _scatter_list
    mytrans = self.comm.scatter(padded, root=owner)
  File "mpi4py/MPI/Comm.pyx", line 1267, in mpi4py.MPI.Comm.scatter
  File "mpi4py/MPI/msgpickle.pxi", line 730, in mpi4py.MPI.PyMPI_scatter
  File "mpi4py/MPI/msgpickle.pxi", line 125, in mpi4py.MPI.Pickle.dumpv
  File "mpi4py/MPI/msgbuffer.pxi", line 44, in mpi4py.MPI.downcast
OverflowError: integer 2157576948 does not fit in 'int'
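(The arithmetic behind that traceback: mpi4py's pickled message size must fit in a signed 32-bit int, and the scattered chunk here is just over that limit.)

int32_max = 2**31 - 1       # 2147483647 bytes, roughly 2 GB
chunk = 2157576948          # size reported in the OverflowError above
print(chunk > int32_max, chunk - int32_max)   # True, ~10 MB over the limit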
Hey, I am running BrainIAK tutorial 10 (ISC), the final part with the spatial correlation. In the example it uses sns.tsplot to plot the correlation over time; however, this has now been deprecated.
I have done this instead:
for j, roi_name in enumerate(roi_selected):
    # For each task
    for i, task_name in enumerate(all_task_names):
        data = iscs_roi_selected[j][task_name]
        mean_data = np.mean(data, axis=0)
        sns.lineplot(
            data=mean_data,
            color=col_pal[i], ci=ci,
            ax=axes[j]
        )
f.legend(all_task_des)
sns.despine()
And the plots look like they could be correct; however, the range and the shape of the spikes are different from the online example shown here: https://brainiak.org/tutorials/10-isc/, even though I am using the same data. What am I doing wrong, and what have I actually plotted? How should this be done now that tsplot has been deprecated?
Thank you for your advice!
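(Not necessarily what the tutorial intends, but one common tsplot replacement is to reshape the (subject x TR) ISC array to long form and let sns.lineplot aggregate across subjects, which restores the mean line plus a confidence band; the array below is a placeholder.)

import numpy as np
import pandas as pd
import seaborn as sns

iscs = np.random.randn(17, 300) * 0.1 + 0.2    # placeholder: subjects x TRs
df = pd.DataFrame(iscs).reset_index().melt(id_vars='index',
                                           var_name='TR', value_name='ISC')
sns.lineplot(x='TR', y='ISC', data=df)         # mean across subjects with a 95% CI band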
Hello BrainIAK folks!
First, I'm so excited to use BrainIAK, especially the hyperalignment stuff! BUT I am new and am having issues running the tutorials ;\
I followed the current installation instructions via conda, created a new environment for brainiak, installed nb_conda to manage environments from within Jupyter Notebook (because conda has playground issues), and am still receiving the error: ModuleNotFoundError: No module named 'brainiak'
I know that the issue I am having has been posted many times before, and I have worked through most of the provided solutions without success, reproducing the same error each time.
If you have any further guidance, I would greatly appreciate your input!
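(A quick diagnostic sketch that often helps with this class of error: run the following in the failing notebook to confirm which interpreter the kernel is using; brainiak has to be installed into that same environment.)

import sys
print(sys.executable)   # path of the Python interpreter behind the notebook kernel
print(sys.prefix)       # the environment that interpreter belongs to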