gxlilyBerkeley 1 year ago
parent
commit
c979389933

+ 119 - 1
README.md

@@ -1,2 +1,120 @@
-# story_listening
+## Nature Story Listening 3T fMRI Data
+  
+## Summary
+
+This dataset contains BOLD fMRI responses from human subjects listening to a set of natural autobiographic stories. The functional data were collected from eleven subjects, each scanned in two sessions on two separate days. Details of the experiment are described in the original publications [1], [2], [3], [4]. The source data used to generate all the figures in publication [4] are included. The code used to analyze the data in publication [4] is available [here](https://github.com/theunissenlab/phoneme_segmentation).
+
+> **[1]** Huth, A. G., De Heer, W. A., Griffiths, T. L., Theunissen, F. E., & Gallant, J. L.
+> Natural speech reveals the semantic maps that tile human cerebral cortex.
+> Nature 532, 453–458 (2016). https://doi.org/10.1038/nature17637
+
+> **[2]** de Heer, W. A., Huth, A. G., Griffiths, T. L., Gallant, J. L., & Theunissen, F. E.
+> The hierarchical cortical organization of human speech processing.
+> Journal of Neuroscience 37(27), 6539–6557 (2017). https://doi.org/10.1523/JNEUROSCI.3267-16.2017
+
+> **[3]** Deniz, F., Nunez-Elizalde, A. O., Huth, A. G., & Gallant, J. L.
+> The representation of semantic information across human cerebral cortex during listening versus reading is invariant to stimulus modality.
+> Journal of Neuroscience 39(39), 7722–7736 (2019). https://doi.org/10.1523/JNEUROSCI.0675-19.2019
+
+> **[4]** Gong, X., Huth, A. G., Johnson, K., Gallant, J. L., & Theunissen, F. E.
+> Phonemic segmentation of narrative speech in human cerebral cortex.
+> Nature Communications, (2023). https://doi.org/
+
+If you publish any work using the dataset, please cite the original publications
+[2] and [4], and cite the dataset [1b] in the following recommended format:
+
+> **[1b]** Huth, A. G., De Heer, W. A., Deniz, F., Gong, X., Gallant, J. L., & Theunissen, F. E.
+> Nature Story Listening 3T fMRI Data.
+
+## How to get started
+
+#### With git and git-annex
+
+To download the data with [git-annex](https://git-annex.branchable.com/), run
+the commands:
+```bash
+# clone the repository, without the data files
+git clone https://gin.g-node.org/gallantlab/story_listening
+cd story_listening
+# download one file (e.g. features/features_matrix.hdf)
+git annex get features/features_matrix.hdf --from wasabi
+# download all files
+git annex get . --from wasabi
+```
+
+Two remotes are available for downloading the data. The first remote is GIN
+(`--from origin`), whose bandwidth might be limited. The second remote is
+Wasabi (`--from wasabi`), which offers a larger bandwidth and is usually the
+faster choice.
+
+## Dataset content
+
+#### Data file organization
+
+```text
+features/                    → feature spaces used for voxelwise modeling
+    english1000.hdf          → semantic embeddings, as described in [1], [2], [3], [4]
+    features_basis.hdf       → all feature labels, as described in [1]
+    features_matrix.hdf      → all features, as described in [1]
+mappers/                     → plotting mappers for each subject
+    S01_mappers.hdf
+    ...
+    S11_mappers.hdf
+responses/                   → functional responses for each subject
+    S01_BOLD.hdf
+    ...
+    S11_BOLD.hdf
+    simulation_BOLD.hdf      → simulated functional responses for simulation analysis
+stimuli/                     → natural autobiographic story, for each fMRI run
+    test.wav
+    train_00.wav
+    ...
+    train_11.wav
+```
+
+#### Data format
+
+All files in `features`, `mappers`, and `responses` are HDF5 files, with multiple arrays stored inside.
+The names, shapes, and descriptions of the arrays are listed below.
+
+```text
+
+Each file in `features` contains:
+    X_train: array of shape (3737, n_features)
+        Training features.
+    X_test: array of shape (291, n_features)
+        Testing features.
+
+where n_features depends on the feature space:
+    448  for `spectral power`
+    1    for `number of phonemes` and `number of words`
+    39   for `single phoneme`
+    858  for `diphone`
+    4841 for `triphone`
+    985  for `semantics`
+
+Each file in `mappers` contains:
+    voxel_to_flatmap: CSR sparse array of shape (n_pixels, n_voxels)
+        Mapper from voxels to flatmap image. The sparse array is stored with
+        four dense arrays: (data, indices, indptr, shape).
+    voxel_to_fsaverage: CSR sparse array of shape (n_vertices, n_voxels)
+        Mapper from voxels to FreeSurfer surface. The sparse array is stored
+        with four dense arrays: (data, indices, indptr, shape).
+    flatmap_mask: array of shape (width, height)
+        Pixels of the flatmap image associated with a voxel.
+    flatmap_rois: array of shape (width, height, 4)
+        Transparent image with annotated ROIs (for subjects S01, S02, and S03).
+    flatmap_curvature: array of shape (width, height)
+        Transparent image with binarized curvature to locate sulci/gyri.
+    roi_mask_xxx: array of shape (n_voxels,)
+        Mask indicating which voxels are in the ROI `xxx`.
+        The ROI list differs across subjects. S04 and S05 have no ROIs.
+
+Each file in `responses` contains:
+    Y_train: array of shape (3737, n_voxels)
+        Training responses.
+    Y_test: array of shape (291, n_voxels)
+        Testing responses.
+
+Each file in `stimuli` is a WAV file containing the raw audio of one story.
+```
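
The `mappers` description above says that each sparse array is stored as four dense arrays (data, indices, indptr, shape). The sketch below shows, on a toy matrix, how those four arrays encode a CSR matrix and how a voxel-to-pixel mapper could be applied to a vector of voxel values. The function name `csr_matvec` and the toy numbers are illustrative and not part of the dataset or its analysis code, and the sketch assumes the stored arrays follow the standard CSR convention.

```python
def csr_matvec(data, indices, indptr, shape, vector):
    """Multiply a CSR matrix, given as its four raw arrays, by a dense vector."""
    n_rows, n_cols = shape
    if len(vector) != n_cols:
        raise ValueError(f"expected a vector of length {n_cols}, got {len(vector)}")
    if len(indptr) != n_rows + 1:
        raise ValueError("indptr must have n_rows + 1 entries")
    result = [0.0] * n_rows
    for row in range(n_rows):
        # The non-zero entries of this row are data[start:stop];
        # indices[start:stop] holds their column numbers.
        start, stop = indptr[row], indptr[row + 1]
        for k in range(start, stop):
            result[row] += data[k] * vector[indices[k]]
    return result


# Toy mapper (assumed values): 3 "pixels" (rows) by 4 "voxels" (columns).
# Dense equivalent:
#   [[0.5, 0.5, 0.0, 0.0],
#    [0.0, 0.0, 0.0, 0.0],
#    [0.0, 0.0, 1.0, 0.0]]
data = [0.5, 0.5, 1.0]
indices = [0, 1, 2]
indptr = [0, 2, 2, 3]
shape = (3, 4)

voxel_values = [2.0, 4.0, 6.0, 8.0]
pixel_values = csr_matvec(data, indices, indptr, shape, voxel_values)
print(pixel_values)  # [3.0, 0.0, 6.0]
```

For the real mappers, a library routine is the more practical choice: if SciPy is available, `scipy.sparse.csr_matrix((data, indices, indptr), shape=shape)` builds the same matrix from the four arrays.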
 

+ 1 - 0
features/english1000.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s82673264--74a55e89b855ff47b05a63e30c265837.hdf
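
The pointer line above follows git-annex's `MD5E` key layout, `MD5E-s<size in bytes>--<md5 hex digest><extension>`. As a hedged sketch, a downloaded file could be checked against its pointer as follows. The helper names `parse_annex_key` and `matches_key` are hypothetical and not part of git-annex, and `git annex fsck` remains the supported way to verify content.

```python
import hashlib
import os
import re
import tempfile

# MD5E keys: "MD5E-s<size>--<32 hex chars><optional extension(s)>"
KEY_PATTERN = re.compile(
    r"MD5E-s(?P<size>\d+)--(?P<md5>[0-9a-f]{32})(?P<ext>(\.\w+)*)$"
)


def parse_annex_key(pointer):
    """Return (size_in_bytes, md5_hex, extension) from a pointer line."""
    match = KEY_PATTERN.search(pointer.strip())
    if match is None:
        raise ValueError(f"not an MD5E annex key: {pointer!r}")
    return int(match["size"]), match["md5"], match["ext"]


def matches_key(path, pointer, chunk_size=1 << 20):
    """Check a local file's size and MD5 digest against an annex pointer."""
    size, md5, _ = parse_annex_key(pointer)
    if os.path.getsize(path) != size:
        return False
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so multi-gigabyte files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == md5


# Pointer copied from features/english1000.hdf in this commit.
pointer = "/annex/objects/MD5E-s82673264--74a55e89b855ff47b05a63e30c265837.hdf"
parsed = parse_annex_key(pointer)
print(parsed)  # (82673264, '74a55e89b855ff47b05a63e30c265837', '.hdf')

# Self-contained demonstration on a small temporary file.
payload = b"hello"
demo_pointer = (
    f"/annex/objects/MD5E-s{len(payload)}--{hashlib.md5(payload).hexdigest()}.txt"
)
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
demo_ok = matches_key(tmp.name, demo_pointer)
os.remove(tmp.name)
print(demo_ok)  # True
```

Checking the size first is a cheap way to reject incomplete downloads before hashing a multi-gigabyte response file.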

+ 1 - 0
features/features_basis.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s58027--5b00f3cc0f44201c144d8098341af57d.hdf

+ 1 - 0
features/features_matrix.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s231150944--273239bf3daaefd5e584309e4d9e58a7.hdf

+ 1 - 0
mappers/AH_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4280532--94305014ee65167ac00134b5b8543c1e.hdf

+ 1 - 0
mappers/AN_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4622511--d2f0610de85937f15d980883982cb4a6.hdf

+ 1 - 0
mappers/BG_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4375331--2a4e277cf6796fea58e7047eafb2f386.hdf

+ 1 - 0
mappers/DS_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s3886903--377f4e8d090ce9d249345d91220027e4.hdf

+ 1 - 0
mappers/JG_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4166101--d5d94bbe1626a494f38e3ad9a82fd326.hdf

+ 1 - 0
mappers/ML_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4681773--034a5efe0ebd566ba678d0b655a96165.hdf

+ 1 - 0
mappers/NNS0_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4134915--cefcd3dbcdfa4f31905968ac1f4e0668.hdf

+ 1 - 0
mappers/SP_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4714394--ad2de32165100b6eada424ef3a712170.hdf

+ 1 - 0
mappers/SS_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4663078--1b016d7f5143c1926f676c1399be3f43.hdf

+ 1 - 0
mappers/TZ_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s5182319--a5035a3056c01d2dc72cd944825f5a2d.hdf

+ 1 - 0
mappers/WH_mappers.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s4612060--33af172e397c6881556408b7f7c99739.hdf

+ 1 - 0
responses/S01_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2614431960--5e25254f983fe79582be1eb9d00e5a9c.hdf

+ 1 - 0
responses/S02_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2353095320--f56166a9015681a3f800c320cef8337e.hdf

+ 1 - 0
responses/S03_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2461593528--b3f55246f37b984b9c6ee3fce96e8cc0.hdf

+ 1 - 0
responses/S04_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2597739928--03a0b8fe01c81c1232eff8f826980a35.hdf

+ 1 - 0
responses/S05_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2995867448--2146e5daf884c05e095395e3f63d5648.hdf

+ 1 - 0
responses/S06_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2435814328--da4f67d818303cc27a23e69ebcacd197.hdf

+ 1 - 0
responses/S07_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2374105368--e0be0ef20f8b7115c38953a7fcce5309.hdf

+ 1 - 0
responses/S08_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2589200568--734b668e38925e18c7f4ae39bc29cc86.hdf

+ 1 - 0
responses/S09_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2575859832--b2433694901c318b81cdb2cadc491692.hdf

+ 1 - 0
responses/S10_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2827400376--518a9fc79b2002a0c9dec30b576437d9.hdf

+ 1 - 0
responses/S11_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s2747968216--37726120ecc2f2ced7137a73c126f000.hdf

+ 1 - 0
responses/simulation_BOLD.hdf

@@ -0,0 +1 @@
+/annex/objects/MD5E-s100768--e37d0f9c2bab224ce5436effe2bdcce6.hdf

+ 1 - 0
source_data/Source_data_manuscript.xlsx

@@ -0,0 +1 @@
+/annex/objects/MD5E-s5825571--e392e3c622e1d00f63439b970d3d73e8.xlsx

+ 1 - 0
stimuli/test.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s106181968--6bf1edd89bd2c13270ebea03abcf165f.wav

+ 1 - 0
stimuli/train_00.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s124660160--3985f8ba848c5cfdac4fbd665a54f98a.wav

+ 1 - 0
stimuli/train_01.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s133051216--53f8d723704f50777e9fd0c88aaaff9a.wav

+ 1 - 0
stimuli/train_02.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s128468252--ea29171119ff2a915ca40510faf2a4d8.wav

+ 1 - 0
stimuli/train_03.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s144648044--b922b19b5f662f47b2421d4142e93a39.wav

+ 1 - 0
stimuli/train_04.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s155281288--e8e3221dbca7c0fc53b1b0f5a1c4245f.wav

+ 1 - 0
stimuli/train_05.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s129986852--fcd71830ccced0b0b479f94298a3eb16.wav

+ 1 - 0
stimuli/train_06.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s152593688--e1112aae574de3e8f25fbe7c23fed41b.wav

+ 1 - 0
stimuli/train_07.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s146078060--c4c8c7af4a845566632ca276ffba2611.wav

+ 1 - 0
stimuli/train_08.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s128772044--5aebb0e88a1f155591a72f628f7df3cc.wav

+ 1 - 0
stimuli/train_09.wav

@@ -0,0 +1 @@
+/annex/objects/MD5E-s110734852--d2b680dfcc5a51bb4886b18e6b36f5d8.wav