# Gallant Lab Natural Short Clips 3T fMRI Data ## Summary This dataset contains BOLD fMRI responses in human subjects viewing a set of natural short movie clips. The functional data were collected in five subjects, in three sessions over three separate days for each subject. Details of the experiment are described in the original publication [1]. > **[1]** Huth, Alexander G., Nishimoto, S., Vu, A. T., & Gallant, J. L. > (2012). A continuous semantic space describes the representation of thousands > of object and action categories across the human brain. Neuron, 76(6), > 1210-1224. https://dx.doi.org/10.1016/j.neuron.2012.10.014 If you publish any work using the dataset, please cite the original publication [1], and cite the dataset [1b] in the following recommended format: > **[1b]** Huth, A. G., Nishimoto, S., Vu, A. T., Dupre la Tour, T., & Gallant, > J. L. (2022). Gallant Lab Natural Short Clips 3T fMRI Data. > https://dx.doi.org/10.12751/g-node.vy1zjd #### Difference with the "vim-2" dataset The present dataset uses the same stimuli (natural short movie clips) than a previous experiment of the Gallant lab [2], publicly released in CRCNS under the name ["vim-2"](https://crcns.org/data-sets/vc/vim-2/) [2b]. Both dataset use the same stimuli, but the functional data is different. The "shortclips" dataset [1b] contains full brain responses recorded every two seconds (2s) with a 3T scanner. The "vim-2" dataset [2b] contains responses from the occipital lobe only, recorded every second (1s) with a 4T scanner. Contrary to the "shortclips" dataset, the "vim-2" dataset does not provide mappers to plot the data on flatten maps of the cortical surface. > **[2]** Nishimoto, S., Vu, A. T., Naselaris, T., Benjamini, Y., Yu, B., & > Gallant, J. L. (2011). Reconstructing visual experiences from brain activity > evoked by natural movies. Current Biology, 21(19), 1641-1646. > https://dx.doi.org/10.1016/j.cub.2011.08.031 > **[2b]** Nishimoto, S., Vu, A. T., Naselaris, T., Benjamini, Y., Yu, B., & > Gallant, J. L. (2014). Gallant Lab Natural Movie 4T fMRI Data. CRCNS.org. > https://dx.doi.org/10.6080/K00Z715X ## How to get started #### a. With dedicated tutorials The preferred way to explore this dataset is through the [voxelwise tutorials](https://github.com/gallantlab/voxelwise_tutorials). These tutorials includes Python downloading tools, data loaders, plotting utilities, and examples of analysis following the original publication [1] [2]. Example #### b. With git and git-annex To download the data with [git-annex](https://git-annex.branchable.com/), run the commands: ```bash # clone the repository, without the data files git clone https://gin.g-node.org/gallantlab/shortclips cd shortclips # download one file (e.g. features/wordnet.hdf) git annex get features/wordnet.hdf --from wasabi # download all files git annex get . --from wasabi ``` To maximize the downloading speed, two remotes are available to download the data. The first remote is GIN (`--from origin`), but the bandwidth might be limited. The second remote is Wasabi (`--from wasabi`), with a larger bandwidth. To load and plot the data, a basic example script is available in `example.py`. For more utilities and examples of analysis, see the dedicated [voxelwise tutorials](https://github.com/gallantlab/voxelwise_tutorials). #### How to get help The recommended way to ask questions is in the issue tracker on the GitHub page https://github.com/gallantlab/voxelwise_tutorials/issues. ## Dataset content #### Data file organization ```text features/ → feature spaces used for voxelwise modeling motion_energy.hdf → visual motion energy, as described in [2] wordnet.hdf → visual semantic labels, as described in [1] mappers/ → plotting mappers for each subject S01_mapper.hdf ... S05_mapper.hdf responses/ → functional responses for each subject S01_responses.hdf ... S05_responses.hdf stimuli/ → natural movie stimuli, for each fMRI run test.hdf train_00.hdf ... train_11.hdf utils/ wordnet_categories.txt → names of the wordnet labels wordnet_graph.dot → wordnet graph to plot as in [1] example.py → Python example to load and plot the data ``` #### Data format All files are hdf5 files, with multiple arrays stored inside. The names, shapes, and descriptions of each array are listed below. ```text Each file in `features` contains: X_train: array of shape (3600, n_features) Training features. X_test: array of shape (270, n_features) Testing features. run_onsets: array of shape (12, ) Indices of each run onset. where (n_features = 6555) for `motion_energy.hdf` and (n_features = 1705) for `wordnet.hdf`. Each file in `mappers` contains: voxel_to_flatmap: CSR sparse array of shape (n_pixels, n_voxels) Mapper from voxels to flatmap image. The sparse array is stored with four dense arrays: (data, indices, indptr, shape). voxel_to_fsaverage: CSR sparse array of shape (n_vertices, n_voxels) Mapper from voxels to FreeSurfer surface. The sparse array is stored with four dense arrays: (data, indices, indptr, shape). flatmap_mask: array of shape (width, height) Pixels of the flatmap image associated with a voxel. flatmap_rois: array of shape (width, height, 4) Transparent image with annotated ROIs (for subjects S01, S02, and S03). flatmap_curvature: array of shape (width, height) Transparent image with binarized curvature to locate sulci/gyri. roi_mask_xxx: array of shape (n_voxels, ) Mask indicating which voxels are in the ROI `xxx`. ROI list is different on each subject. SO4 and S05 have no ROIs. Each file in `responses` contains: Y_train: array of shape (3600, n_voxels) Training responses. Y_test: array of shape (270, n_voxels) Testing responses. run_onsets: array of shape (12, ) Indices of each run onset. Each file in `stimuli` contains: stimuli: array of shape (n_images, 512, 512, 3) Each training run contains 9000 images total. The test run contains 8100 images total. ```