This dataset contains BOLD fMRI responses in human subjects viewing a set of natural short movie clips. The functional data were collected in five subjects, in three sessions over three separate days for each subject. Details of the experiment are described in the original publication [1].
[1] Huth, Alexander G., Nishimoto, S., Vu, A. T., & Gallant, J. L. (2012). A continuous semantic space describes the representation of thousands of object and action categories across the human brain. Neuron, 76(6), 1210-1224.
If you publish any work using the dataset, please cite the original publication [1], and cite the dataset [1b] in the following recommended format:
[1b] Huth, A. G., Nishimoto, S., Vu, A. T., Dupre la Tour, T., & Gallant, J. L. (2022). Gallant Lab Natural Short Clips 3T fMRI Data. http://dx.doi.org/--TBD--
The present dataset uses the same stimuli (natural short movie clips) than a previous experiment of the Gallant lab [2], publicly released in CRCNS under the name "vim-2" [2b]. Both dataset use the same stimuli, but the functional data is different.
The "shortclips" dataset [1b] contains full brain responses recorded every two seconds (2s) with a 3T scanner. The "vim-2" dataset [2b] contains responses from the occipital lobe only, recorded every second (1s) with a 4T scanner. Contrary to the "shortclips" dataset, the "vim-2" dataset does not provide mappers to plot the data on flatten maps of the cortical surface.
[2] Nishimoto, S., Vu, A. T., Naselaris, T., Benjamini, Y., Yu, B., & Gallant, J. L. (2011). Reconstructing visual experiences from brain activity evoked by natural movies. Current Biology, 21(19), 1641-1646.
[2b] Nishimoto, S., Vu, A. T., Naselaris, T., Benjamini, Y., Yu, B., & Gallant, J. L. (2014). Gallant Lab Natural Movie 4T fMRI Data. CRCNS.org. http://dx.doi.org/10.6080/K00Z715X
The preferred way to explore this dataset is through the voxelwise tutorials. These tutorials includes Python downloading tools, data loaders, plotting utilities, and examples of analysis following the original publication [1] [2].
To run the tutorials, see https://gallantlab.github.io/voxelwise_tutorials).
To download the data with git-annex, the following dependencies are necessary: git, git-annex. Then, run the commands
# clone the repository, without the data files
git clone https://gin.g-node.org/gallantlab/shortclips
cd shortclips
# download one file (e.g. features/wordnet.hdf)
git annex get features/wordnet.hdf --from wasabi
# download all files
git annex get . --from wasabi
To maximize the download speed, two remotes are available to download the data.
The first remote is GIN (--from origin
), but the bandwidth might be limited.
The second remote is Wasabi (--from wasabi
), with a larger bandwidth.
A basic example script is available in example.py
. For more utilities and
example of analysis, see the dedicated voxelwise
tutorials.
The recommended way to ask questions is in the issue tracker on the GitHub page https://github.com/gallantlab/voxelwise_tutorials/issues.
features/ → feature spaces used for voxelwise modeling
motion_energy.hdf → visual motion energy, as described in [2]
wordnet.hdf → visual semantic labels, as described in [1]
mappers/ → plotting mappers for each subject
S01_mapper.hdf
...
S05_mapper.hdf
responses/ → functional responses for each subject
S01_responses.hdf
...
S05_responses.hdf
stimuli/ → natural movie stimuli, for each fMRI run
test.hdf
train_00.hdf
...
train_11.hdf
utils/
wordnet_categories.txt → names of the wordnet labels
wordnet_graph.dot → wordnet graph to plot as in [1]
example.py → Python example to load and plot the data
All files are hdf5 files, with multiple arrays stored inside. The names, shapes, and descriptions of each array are listed below.
Each file in `features` contains:
X_train: array of shape (3600, n_features)
Training features.
X_test: array of shape (270, n_features)
Testing features.
run_onsets: array of shape (12, )
Indices of each run onset.
where (n_features = 6555) for `motion_energy.hdf`
and (n_features = 1705) for `wordnet.hdf`.
Each file in `mappers` contains:
voxel_to_flatmap: CSR sparse array of shape (n_pixels, n_voxels)
Mapper from voxels to flatmap image. The sparse array is stored with
four dense arrays: (data, indices, indptr, shape).
voxel_to_fsaverage: CSR sparse array of shape (n_vertices, n_voxels)
Mapper from voxels to FreeSurfer surface. The sparse array is stored
with four dense arrays: (data, indices, indptr, shape).
flatmap_mask: array of shape (width, height)
Pixels of the flatmap image associated with a voxel.
flatmap_rois: array of shape (width, height, 4)
Transparent image with annotated ROIs (for subjects S01, S02, and S03).
flatmap_curvature: array of shape (width, height)
Transparent image with binarized curvature to locate sulci/gyri.
roi_mask_xxx: array of shape (n_voxels, )
Mask indicating which voxels are in the ROI `xxx`.
ROI list is different on each subject. SO4 and S05 have no ROIs.
Each file in `responses` contains:
Y_train: array of shape (3600, n_voxels)
Training responses.
Y_test: array of shape (270, n_voxels)
Testing responses.
run_onsets: array of shape (12, )
Indices of each run onset.
Each file in `stimuli` contains:
stimuli: array of shape (n_images, 512, 512, 3)
Each training run contains 9000 images total.
The test run contains 8100 images total.