TVSD

THINGS Ventral stream Spiking Dataset (TVSD)


Large-scale electrophysiological recordings from V1, V4 and IT in two macaques in response to ~22k images from the THINGS image database.

Paolo Papale, Feng Wang, Matthew W. Self and Pieter R. Roelfsema

Dept. of Vision and Cognition, Netherlands Institute for Neuroscience (KNAW), 1105 BA Amsterdam (NL).

N.B. The THINGS stimuli from Martin Hebart et al. are not provided, but you can download them from things-initiative.org

Info

A full description of the dataset is provided in the Neuron paper. A few additional details are needed to work with the data.

RAW

Data for each monkey are provided in a separate folder, containing both the RAW and MUA data. The RAW data are subdivided into different days of recordings, and individual blocks/runs of ~20 minutes in length. We provide the MATLAB code to extract the MUA from the RAW data, aggregate all the trials across blocks and days, normalize it, and filter and chunk it for model training. These scripts can easily be modified to extract LFP, to aggregate the RAW data, or to look into non-completed images.

These scripts can be found in "_code" and are (in sequence):

  • extract_MUA_v2.m (for monkeyF) and extraxt_MUA_v2_N.m (for monkeyN; we used two different recording systems)
  • collect_MUA_v2.m
  • norm_MUA.m
  • export_MUA.m
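As an illustration of what the normalization step does, here is a minimal Python sketch of baseline-correcting and averaging a toy MUA trace in a response window. The array shapes, window boundaries, and normalization formula are assumptions for illustration only, not the exact choices made in norm_MUA.m / export_MUA.m:

```python
import numpy as np

# Toy MUA matrix: (trials, time_points); values and shapes are made up.
rng = np.random.default_rng(1)
mua = rng.normal(loc=1.0, scale=0.1, size=(50, 300))
tb = np.arange(-100, 200)            # time in ms, relative to stimulus onset

baseline = mua[:, tb < 0].mean()     # mean pre-stimulus activity
resp_win = (tb >= 25) & (tb < 125)   # hypothetical response window
# Baseline-corrected, normalized response per trial:
norm_resp = (mua[:, resp_win].mean(axis=1) - baseline) / baseline
print(norm_resp.shape)  # (50,)
```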

The MUA data are provided both un-normalized ("THINGS_MUA_trials.mat") and normalized and averaged in time windows ("THINGS_normMUA.mat").

MUA

THINGS_MUA_trials.mat contains:

  • ALLMAT: has info on what was shown in each stimulus presentation (rows). The columns are: [#trial_idx #train_idx #test_idx #rep #count #correct #day]. The most important fields are "train_idx" and "test_idx", which tell you the ID of the stimulus. If "train_idx" for a specific row is 0, a test image was shown, and vice versa. "count" is the position of the stimulus in the sequence of 4 images that were shown to the monkeys. "rep" (i.e. repetitions) was reshuffled by randomization, so it is practically useless. "trial_idx" increases linearly. Only correct trials are included! "day" is just the day of that session.
  • ALLMUA: the MUA responses; rows correspond to stimuli, and the dimensions are [#electrode #trial_idx #time-points].
  • tb: time, in ms, relative to stimulus onset, corresponding to the elements of "time-points" in "ALLMUA".
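For instance, the train/test bookkeeping in ALLMAT can be used as follows. The Python snippet below builds a tiny stand-in array with the documented column order, so the values are purely illustrative:

```python
import numpy as np

# Hypothetical stand-in for ALLMAT, using the documented column order:
# [trial_idx, train_idx, test_idx, rep, count, correct, day]
ALLMAT = np.array([
    [1, 101, 0, 1, 1, 1, 1],   # a train image (test_idx == 0)
    [2,   0, 7, 1, 2, 1, 1],   # a test image (train_idx == 0)
    [3, 205, 0, 1, 3, 1, 2],
])
COL_TRAIN_IDX, COL_TEST_IDX = 1, 2

is_test = ALLMAT[:, COL_TRAIN_IDX] == 0      # test presentations
test_ids = ALLMAT[is_test, COL_TEST_IDX]     # stimulus IDs of test images
train_ids = ALLMAT[~is_test, COL_TRAIN_IDX]  # stimulus IDs of train images
print(test_ids, train_ids)  # [7] [101 205]
```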

normalized MUA

THINGS_normMUA.mat contains:

  • SNR: the signal-to-noise ratio, computed for each day of recordings (#electrodes #days)
  • SNR_max: the SNR_max (described in the paper)
  • lats: latency of the onset of the (mean) response, computed for each day of recordings (#electrodes #days)
  • reliab: the reliability of each electrode - the mean reliability described in the paper is just the mean of "reliab" across combinations of trials
  • oracle: the oracle correlation (described in the paper)
  • train_MUA: averaged and normalized response to each train stimulus, already sorted to match "things_imgs.mat" (see below) (#electrodes #stimuli)
  • test_MUA: averaged and normalized response to each test stimulus, already sorted to match "things_imgs.mat" (see below) (#electrodes #stimuli)
  • test_MUA_reps: normalized response to each test stimulus and repetition, already sorted to match "things_imgs.mat" (see below) (#electrodes #stimuli)
  • tb: time, in ms, relative to stimulus onset, corresponding to the elements of "time-points" in "ALLMUA".
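To give a feel for what "reliab" measures, here is a toy split-half correlation across repetitions in Python. The shapes and the specific split are assumptions, and the paper's exact definition of reliability may differ:

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical repetition data: (electrodes, stimuli, reps); made-up shape.
test_MUA_reps = rng.normal(size=(4, 10, 3))
# Averaging over repetitions would give a test_MUA-like array:
test_MUA = test_MUA_reps.mean(axis=-1)

# Split-half correlation per electrode (one rep vs. the mean of the rest):
half_a = test_MUA_reps[:, :, 0]
half_b = test_MUA_reps[:, :, 1:].mean(axis=-1)
reliab = np.array([np.corrcoef(a, b)[0, 1] for a, b in zip(half_a, half_b)])
print(reliab.shape)  # (4,)
```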

Stimuli

The scripts and data rely on logfiles hosted in "_logs". There, you can also find "things_imgs.mat", which is required to associate each stimulus from the THINGS initiative database with the specific trial (see below). things_imgs.mat contains:

  • train_imgs: has info on the stimulus ID, e.g. field 1 corresponds to "train_idx" 1 in ALLMAT.
  • test_imgs: has info on the stimulus ID, e.g. field 1 corresponds to "test_idx" 1 in ALLMAT.

N.B. The normalized data are already sorted according to the order of images in "train_imgs" and "test_imgs" from "things_imgs.mat".
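Because MATLAB indexing is 1-based, in Python you shift by one when mapping "train_idx"/"test_idx" to entries of things_imgs.mat. The filenames below are hypothetical placeholders:

```python
# Hypothetical stand-in for "train_imgs": entry i holds the THINGS image
# for train_idx == i + 1 (MATLAB is 1-based, Python is 0-based).
train_imgs = ["aardvark_01.jpg", "abacus_02.jpg", "acorn_03.jpg"]

train_idx = 2                       # taken from ALLMAT
image = train_imgs[train_idx - 1]   # shift the 1-based index down by one
print(image)  # abacus_02.jpg
```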

Code

In addition to the scripts mentioned above, we provide the MATLAB APIs from Blackrock Neurotech, in the same version used for the paper. We also provide a few utility functions that are called by the main scripts or can be used to plot some of the results. Finally, we provide the Python code for the MEIs in "lucent-things", based on the lucent visualization library (available on GitHub).

Downloading the data

Using gin

Create an account on gin and download the gin client as described in the gin documentation. On your computer, log in using:

gin login

Clone the repository using:

gin get paolo_papale/TVSD

The cloning step can take a long time due to the large number of individual files. Please be patient.

Large data files will not be downloaded automatically; they will appear as git-annex links instead. We recommend downloading only the files you need, since the entire dataset is large. To get the contents of a certain file:

gin get-content <filename>

Downloaded large files will be read-only. You might want to unlock the files using:

gin unlock <filename>

To remove the contents of a large file again, use:

gin remove-content <filename>

A detailed description of the gin client can be found in the gin wiki. See the gin usage tutorial for advanced features.

Using the web browser

Download the files you want by clicking download in the gin web interface. Convenience summary tables of the data and sessions can be found in the summary above.

Citation policy

Cite this work by citing the original publication.

Contact information

Please don't use our institutional emails for questions about the TVSD; instead, you can reach us at things [dot] tvsd [at] gmail [dot] com

License

The data and metadata in this work are licensed under a Creative Commons Attribution 4.0 International License.

The Python (v. 2.7) and MATLAB (v. 2019b) code in this repository is licensed under the same license, with the following exceptions:

The MATLAB-based NPMK package provided within this repository is re-distributed under the BSD 3-clause license, in compliance with the original licensing terms.

The Python-based lucent library provided within this repository is re-distributed under the Apache License 2.0, in compliance with the original licensing terms.

The Python-based models under lucent provided within this repository are re-distributed under the MIT License, in compliance with the original licensing terms.

datacite.yml
Title TVSD: THINGS Ventral-stream Spiking Dataset
Authors Papale,Paolo;NIN (KNAW);ORCID:0000-0002-6249-841X
Roelfsema,Pieter;NIN (KNAW);ORCID:0000-0002-1625-0034
Description Large-scale electrophysiological recordings from V1, V4 and IT in two macaques in response to ~22k images from the THINGS image database.
License Creative Commons CC BY Attribution 4.0 International (https://creativecommons.org/licenses/by/4.0/)
References Papale, P., Wang, F., Self, M.W. and Roelfsema, P.R., An extensive dataset of spiking activity to reveal the syntax of the ventral stream, Neuron, accepted [doi:tba] (IsDescribedBy)
Funding NWO, VI.Veni.222.217
NWO, OCENW.XS22.2.097
NWO, Crossover grant 17619 INTENSE
NWO, DBI2
EU, HORIZON 2020 FP, 945539 Human Brain Project
EU, HORIZON 2020 FP, 899287 NeuraViper
EU, ERC 101052963 NUMEROUS
Keywords THINGS
Vision
Neurophysiology
Monkey
Ventral stream
Natural Images
TVSD
Resource Type Dataset