
# narratives_reading_listening_fmri

This folder contains the stimuli, model features, and fMRI data originally created for and collected in Deniz et al. 2019 (see below for the full reference).

Some results from this study can be viewed online at:

https://www.gallantlab.org/brainviewer/Deniz2019/

## Stimuli

The stimuli folder contains the wav files presented to the subjects in the experiment. The reading stimuli were based on the transcripts of the stories, which are included in this folder as text files. story_11.wav is the validation story. For full details regarding stimulus presentation, please see the Methods section of Deniz et al. 2019.
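A minimal sketch for inspecting a story, assuming the transcript text files share the same base names as the wav files (check the stimuli folder for the actual naming convention):

```python
from pathlib import Path
from scipy.io import wavfile

stim_dir = Path("stimuli")

# Validation story audio: sample rate in Hz and the raw waveform samples.
rate, audio = wavfile.read(stim_dir / "story_11.wav")
print(f"story_11.wav: {audio.shape[0] / rate:.1f} s at {rate} Hz")

# Matching transcript, if it follows the same base name (an assumption).
transcript = stim_dir / "story_11.txt"
if transcript.exists():
    print(transcript.read_text()[:200])
```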

## Model Features

In the features folder, the two files features_trn_NEW.hdf and features_val_NEW.hdf store the feature values of all models for each TR of the stimulus stories. These feature values are downsampled to the sampling rate of the MRI data. For example, the stored array for the semantic model of each story has dimensions (time x 985) because there are 985 features in the semantic feature space. The stimulus features of each story are 10 seconds (5 TRs) shorter than the fMRI data because the 10 s (5 TRs) of silence after each story during the scan are not reflected in the stimulus features. However, the 10 s (5 TRs) of silence before each story are included in the stimulus features. This discrepancy should be taken into account when the data are trimmed as described in the paper. The file moth_en_moten_20210928.npz contains the motion energy features and is already trimmed and concatenated across the training and test stories.
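A minimal loading sketch using standard HDF5 and NumPy tooling; the dataset names inside the HDF files are not documented here, so they are printed rather than hard-coded:

```python
import h5py
import numpy as np

# Print every group/dataset name stored in the training-feature file.
with h5py.File("features/features_trn_NEW.hdf", "r") as f:
    f.visit(print)
    # A semantic-model entry should be a (time x 985) array per story, e.g.:
    # semantic = f["<story_key>"][:]

# Motion energy features, already trimmed and concatenated across stories.
moten = np.load("features/moth_en_moten_20210928.npz")
for name in moten.files:
    print(name, moten[name].shape)
```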

## Data

In the responses folder, the fMRI data for the six subjects in the experiment are provided as (time x voxels) arrays for each data collection run (10 stories of training data, and 1 story repeated twice as validation data). The data have been preprocessed to account for the effects of subject motion, and voxel selection has been applied to retain only cortical voxels in each scan. Manually edited FreeSurfer segmentations of the cortex were used along with pycortex to produce the cortical masks. Out of concern for subject privacy, we provide neither raw functional scans nor anatomical scans. Instead, we provide sparse matrices that can be used (1) to map the per-voxel data onto a flattened version of each subject's brain and (2) to map each subject's brain onto the fsaverage surface in FreeSurfer (which itself is in MNI space). These sparse matrices can be found in the mappers folder.
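A minimal sketch of how the responses and mappers might be combined. The file names, HDF keys, and sparse-matrix storage format below are assumptions; see example.py in code/ for the loader actually used:

```python
import h5py
from scipy import sparse

# Responses: one (time x voxels) array per run for each subject.
with h5py.File("responses/subject01.hdf", "r") as f:   # hypothetical file name
    runs = list(f.keys())
    print(runs)                                         # names of the stored runs
    run = f[runs[0]][:]                                 # (time x voxels)

# Mapper: sparse matrix projecting per-voxel data onto a surface, assumed here
# to be stored as a SciPy sparse .npz with shape (vertices x voxels).
mapper = sparse.load_npz("mappers/subject01_fsaverage.npz")  # hypothetical name
surface_data = mapper @ run.mean(axis=0)   # mean response mapped to vertices
print(surface_data.shape)
```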

These mapping arrays provide a way to assess the relationship between anatomy and function without compromising the subjects' privacy. The full raw data may be provided for specifically defined research goals that require them, if and only if the subjects consent to that specific use.

See example.py in the code/ directory for example code that loads the data and visualizes it on a subject's cortical surface.

## Citation

Deniz, F., Nunez-Elizalde, A. O., Huth, A. G., & Gallant, J. L. (2019). The representation of semantic information across human cerebral cortex during listening versus reading is invariant to stimulus modality. Journal of Neuroscience, 39(39), 7722-7736.

This repository also contains data from other papers that used the dataset, such as Chen et al., 2024 in the chen2024_timescales folder.