# Processing of visual and non-visual naturalistic spatial information in the "parahippocampal place area": from raw data to results

[![made-with-datalad](https://www.datalad.org/badges/made_with.svg)](https://datalad.org)

This repository contains the raw data and all code to generate the results in
Häusler C.O. & Hanke M. (submitted).

If you have never used [DataLad](https://www.datalad.org/) before, please read the
section on DataLad datasets below.

## DataLad datasets and how to use them

This repository is a [DataLad](https://www.datalad.org/) dataset. It allows
fine-grained data access down to the level of single files. In order to use this
repository for data retrieval, [DataLad](https://www.datalad.org/) is required.
It is a free and open-source command line tool, available for all major
operating systems, that builds on Git and
[git-annex](https://git-annex.branchable.com/) to allow sharing, synchronizing,
and version controlling collections of large files. You can find information on
how to install DataLad at
[handbook.datalad.org/en/latest/intro/installation.html](http://handbook.datalad.org/en/latest/intro/installation.html).
+
+### Get the dataset
+
+A DataLad dataset can be `cloned` by running
+
+```
+datalad clone <url>
+```
+Once a dataset is cloned, it is a light-weight directory on your local machine.
+At
+this point,
+it contains only small metadata and information on the identity of the files in the dataset,
+but not actual *content* of the (sometimes large) data files.
+
+### Retrieve dataset content
+
+After cloning a dataset, you can retrieve file contents by running
+```
+datalad get <path/to/directory/or/file>
+```
+This command will trigger a download of the files, directories, or subdatasets you have specified.
+
+DataLad datasets can contain other datasets, so called *subdatasets*. If you clone the top-level
+dataset, subdatasets do not yet contain metadata and information on the identity of files,
+but appear to be empty directories. In order to retrieve file availability metadata in
+subdatasets, run
+
+```
+datalad get -n <path/to/subdataset>
+```
+Afterwards, you can browse the retrieved metadata to find out about subdataset contents, and
+retrieve individual files with `datalad get`. If you use `datalad get <path/to/subdataset>`,
+all contents of the subdataset will be downloaded at once.
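
For example, in this dataset you could obtain the file listing of the
`inputs/studyforrest-data-aligned` subdataset and then fetch a single file from
it (the paths are taken from the cookbook at the end of this README):

```
datalad get -n inputs/studyforrest-data-aligned
datalad get inputs/studyforrest-data-aligned/sub-01/in_bold3Tp2/sub-01_task-avmovie_run-1_bold.nii.gz
```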

### Stay up-to-date

DataLad datasets can be updated. The command `datalad update` will *fetch* updates
and store them on a different branch (by default `remotes/origin/master`). Running

```
datalad update --merge
```

will *pull* available updates and integrate them in one go.

### More information

More information on DataLad and how to use it can be found in the DataLad Handbook at
[handbook.datalad.org](http://handbook.datalad.org/en/latest/index.html). The chapter
"DataLad datasets" can help you to familiarize yourself with the concept of a dataset.
 
 ## Dataset structure
 
 - All inputs (i.e. building blocks from other sources) are located in
   `inputs/`.
 - All custom code is located in `code/`.

## Cookbook -- How this dataset was assembled

### install subdatasets and get the raw data
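    # note: `datalad install -d .` registers each installed dataset as a
    # subdataset of this superdataset, recording the exact version that is used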
    # install the subdataset that provides motion-corrected fMRI data from the audio-visual movie and its audio-description
    datalad install -d . -s https://github.com/psychoinformatics-de/studyforrest-data-aligned inputs/studyforrest-data-aligned
    # download the 4D fMRI data (and the motion correction parameters of the movie)
    datalad get inputs/studyforrest-data-aligned/sub-??/in_bold3Tp2/sub-??_task-a?movie_run-?_bold*.*

    # install the subdataset that provides the original 7 Tesla data to get the motion correction parameters of the audio-description
    datalad install -d . -s juseless.inm7.de:/data/project/studyforrest/collection/phase1 inputs/phase1
    datalad get inputs/phase1/sub???/BOLD/task001_run00?/bold_dico_moco.txt

    # install the subdataset "templates & transforms", and download the relevant images
    datalad install -d . -s https://github.com/psychoinformatics-de/studyforrest-data-templatetransforms inputs/studyforrest-data-templatetransforms
    datalad get inputs/studyforrest-data-templatetransforms/sub-*/bold3Tp2/
    datalad get inputs/studyforrest-data-templatetransforms/templates/*

    # install the subdataset "studyforrest-data-annotations" that contains the annotation of cuts & locations,
    # as well as "code/researchcut2segments.py", which we need to segment the (continuous) annotations
    datalad install -d . -s https://github.com/psychoinformatics-de/studyforrest-data-annotations inputs/studyforrest-data-annotations

    # install the annotation of speech as a subdataset
    datalad install -d . -s juseless.inm7.de:/data/group/psyinf/studyforrest-speechannotation inputs/studyforrest-speechannotation
    # download the annotation as a TSV file (BIDS)
    datalad get inputs/studyforrest-speechannotation/annotation/fg_rscut_ad_ger_speech_tagged.tsv

### segment the continuous annotations
    # segment the location annotation using timings of the audio-visual movie segments
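    # note: `datalad run` executes the given command and records it, together with
    # its inputs and outputs, in the dataset's history; the '{inputs}' and
    # '{outputs}' placeholders are expanded to the paths given via -i and -o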
    datalad run \
    -i inputs/studyforrest-data-annotations/researchcut/locations.tsv \
    -o events/segments \
    ./inputs/studyforrest-data-annotations/code/researchcut2segments.py \
    '{inputs}' \
    avmovie avmovie \
    '{outputs}'

    # segment the speech annotation using timings of the audio-description segments
    datalad run \
    -i inputs/studyforrest-speechannotation/annotation/fg_rscut_ad_ger_speech_tagged.tsv \
    -o events/segments \
    ./inputs/studyforrest-data-annotations/code/researchcut2segments.py \
    '{inputs}' \
    aomovie aomovie \
    '{outputs}'

    # for control contrasts, segment the speech annotation using timings of the audio-visual movie segments
    datalad run \
    -i inputs/studyforrest-speechannotation/annotation/fg_rscut_ad_ger_speech_tagged.tsv \
    -o events/segments \
    ./inputs/studyforrest-data-annotations/code/researchcut2segments.py \
    '{inputs}' \
    avmovie avmovie \
    '{outputs}'

    # for control contrasts, segment the location annotation using timings of the audio-description segments
    datalad run \
    -i inputs/studyforrest-data-annotations/researchcut/locations.tsv \
    -o events/segments \
    ./inputs/studyforrest-data-annotations/code/researchcut2segments.py \
    '{inputs}' \
    aomovie aomovie \
    '{outputs}'

### manually add confound annotations and a script that gets them into shape for the subsequent FEAT analyses
    # add low-level confound files of the audio-visual movie manually & save (folder "avconfounds")
    datalad save -m 'add low-level confound files for audio-visual movie to /events/segments'
    # add low-level confound files of the audio-description manually & save (folder "aoconfounds")
    datalad save -m 'add low-level confound files for audio-description to /events/segments'

### convert confound annotations into FEAT onset files
    # add the script code/confounds2onsets.py
    datalad save -m 'add script that converts & copies confound files to onsets directories'
    # perform the conversion, taking the directories of the corresponding fMRI runs into account,
    # and rename according to the conventions used in the FSL design files
    datalad run \
    -i events/segments \
    -o events/onsets \
    ./code/confounds2onsets.py -i '{inputs}' -o '{outputs}'

### create FEAT onset files from the segmented annotation of cuts & locations
    # add the script that performs the conversion
    datalad save -m 'add script that creates event files for FSL from the segmented location annotation'

    # create event onset files from the segmented location annotation (timings of the audio-visual movie)
    datalad run \
    -m "create the event files with movie timing" \
    -i events/segments/avmovie \
    -o events/onsets \
    ./code/locationsanno2onsets.py \
    -ind '{inputs}' \
    -inp 'locations_run-?_events.tsv' \
    -outd '{outputs}'

    # create event onset files from the segmented location annotation (timings of the audio-description)
    datalad run \
    -m "create the event files with audio-track timing" \
    -i events/segments/aomovie \
    -o events/onsets \
    ./code/locationsanno2onsets.py \
    -ind '{inputs}' \
    -inp 'locations_run-?_events.tsv' \
    -outd '{outputs}'

### create FEAT onset files from the segmented annotation of speech
    # add the script that performs the conversion
    datalad save -m 'add script that creates event files for FSL from the segmented speech annotation'

    # create event onset files from the segmented speech annotation (timings of the audio-visual movie)
    datalad run \
    -i events/segments/avmovie \
    -o events/onsets \
    ./code/speechanno2onsets.py \
    -ind '{inputs}' \
    -inp 'fg_rscut_ad_ger_speech_tagged_run-*.tsv' \
    -outd '{outputs}'

    # create event onset files from the segmented speech annotation (timings of the audio-description)
    datalad run \
    -i events/segments/aomovie \
    -o events/onsets \
    ./code/speechanno2onsets.py \
    -ind '{inputs}' \
    -inp 'fg_rscut_ad_ger_speech_tagged_run-*.tsv' \
    -outd '{outputs}'

### copy FEAT event files to folders of individual subjects

    # manually add the script that creates directories & handles the copying
    datalad save -m 'add script that creates subject directories and copies FSL event files into them'

    # create subject folders & copy event files with the timing of the audio-visual movie
    datalad run \
    -m "create subject folders & copy event files to it" \
    ./code/onsets2subfolders.py \
    -fmri 'inputs/studyforrest-data-aligned/sub-??/in_bold3Tp2/sub-??_task-aomovie_run-1_bold.nii.gz' \
    -onsets 'events/onsets/avmovie/run-?/*.txt' \
    -o './'

    # copy event files with the timing of the audio-description
    datalad run \
    -m "copy event files with audio-description with movie timings to subject folders" \
    ./code/onsets2subfolders.py \
    -fmri 'inputs/studyforrest-data-aligned/sub-??/in_bold3Tp2/sub-??_task-aomovie_run-1_bold.nii.gz' \
    -onsets 'events/onsets/aomovie/run-?/*.txt' \
    -o './'

### manually add the templates of FEAT design files

    # manually add the script that creates first level individual design files from a template
    datalad save -m 'add python script that creates individual (1st level) design files from templates'

    # analyses in group space, levels 1-3 (e.g. 1st-lvl_movie-ppa-grp.fsf, 2nd-lvl_movie-ppa-grp.fsf, 3rd-lvl_movie-ppa-grp-1.fsf)
    # both steps include adding the bash scripts that take 2nd level templates as input and create design files in individual directories
    # (e.g. generate_2nd-lvl-design_movie-ppa-grp.sh)
    datalad save -m 'add FSL design files (lvl 1-3) for movie (group)'
    datalad save -m 'add FSL design files (lvl 1-3) for audio (group)'

    # analyses in subject space, levels 1-2 (e.g. 1st-lvl_movie-ppa-ind.fsf, 2nd-lvl_movie-ppa-ind.fsf)
    # both steps include adding the bash scripts that take 2nd level templates as input and create design files in individual directories
    # (e.g. generate_2nd-lvl-design_movie-ppa-ind.sh)
    datalad save -m 'add FSL design files (lvl 1-2) for movie (individuals)'
    datalad save -m 'add FSL design files (lvl 1-2) for audio (individuals)'

### from templates, create FEAT design files for individual subjects

    # movie, group space, first level
    datalad run \
    -m 'for movie analysis (group), create individual (1st level) design files from template' \
    code/generate_1st-lvl-design.py \
    -fmri 'inputs/studyforrest-data-aligned/sub-01/in_bold3Tp2/sub-01_task-avmovie_run-1_bold.nii.gz' \
    -design 'code/1st-lvl_movie-ppa-grp.fsf'

    # movie, group space, second level
    datalad run \
    -m "for movie analysis (group), generate individual 2nd lvl design files from template" \
    "./code/generate_2nd-lvl-design_movie-ppa-grp.sh"

    # audio-description, group space, first level
    datalad run \
    -m 'for audio analysis (group), create individual 1st level design files from template' \
    code/generate_1st-lvl-design.py \
    -fmri 'inputs/studyforrest-data-aligned/sub-01/in_bold3Tp2/sub-01_task-aomovie_run-1_bold.nii.gz' \
    -design 'code/1st-lvl_audio-ppa-grp.fsf'

    # audio-description, group space, second level
    datalad run \
    -m "for audio analysis (group), generate individual 2nd lvl design files from template" \
    "./code/generate_2nd-lvl-design_audio-ppa-grp.sh"

    # movie, subject space, first level
    datalad run \
    -m 'for movie analysis (individuals), create individual 1st level design files from template' \
    code/generate_1st-lvl-design.py \
    -fmri 'inputs/studyforrest-data-aligned/sub-01/in_bold3Tp2/sub-01_task-avmovie_run-1_bold.nii.gz' \
    -design 'code/1st-lvl_movie-ppa-ind.fsf'

    # movie, subject space, second level
    datalad run \
    -m "for movie analysis (individuals), generate individual 2nd lvl design files from template" \
    "./code/generate_2nd-lvl-design_movie-ppa-ind.sh"

    # audio-description, subject space, first level
    datalad run \
    -m 'for audio analysis (individuals), create individual 1st level design files from template' \
    code/generate_1st-lvl-design.py \
    -fmri 'inputs/studyforrest-data-aligned/sub-01/in_bold3Tp2/sub-01_task-aomovie_run-1_bold.nii.gz' \
    -design 'code/1st-lvl_audio-ppa-ind.fsf'

    # audio-description, subject space, second level
    datalad run \
    -m "for audio analysis (individuals), generate individual 2nd lvl design files from template" \
    "./code/generate_2nd-lvl-design_audio-ppa-ind.sh"

### manually add a bash script that handles creating custom standard space templates & matrices for FEAT
    datalad save -m "add script that add templates & transformation matrices to 1st lvl result directories of Feat"

### run the analyses via condor_submit on a computer cluster & manually save results
    # add file "condor-commands-for-cm.txt" that contains the following commands to manually submit the subsequent analyses to HTCondor
    datalad save -m "add txt file with instructions for manually starting Condor Jobs from CM"

    # movie, group space, first level
    condor_submit code/compute_1st-lvl_movie-ppa-grp.submit
    # in .feat-directories, create templates and transforms
    ./code/reg2std4feat inputs/studyforrest-data-templatetransforms bold3Tp2 grpbold3Tp2 sub-*/run-?_movie-ppa-grp.feat
    # movie, group space, second level
    condor_submit code/compute_2nd-lvl_movie-ppa-grp.submit
    # movie, group space, third level
    condor_submit code/compute_3rd-lvl_movie-ppa-grp.submit
    # save results of first to third level
    datalad save -m '3rd lvl results movie (group)'

    # audio-description, group space, first level
    condor_submit code/compute_1st-lvl_audio-ppa-grp.submit
    # in .feat-directories, create templates and transforms
    ./code/reg2std4feat inputs/studyforrest-data-templatetransforms bold3Tp2 grpbold3Tp2 sub-*/run-?_audio-ppa-grp.feat
    # audio-description, group space, second level
    condor_submit code/compute_2nd-lvl_audio-ppa-grp.submit
    # audio-description, group space, third level
    condor_submit code/compute_3rd-lvl_audio-ppa-grp.submit
    # save results of first to third level
    datalad save -m '3rd lvl results audio (group)'

    # movie, subject space, first level
    condor_submit code/compute_1st-lvl_movie-ppa-ind.submit
    # in .feat-directories, create templates and transforms
    ./code/reg2std4feat inputs/studyforrest-data-templatetransforms bold3Tp2 bold3Tp2 sub-*/run-?_movie-ppa-ind.feat
    # movie, subject space, second level
    condor_submit code/compute_2nd-lvl_movie-ppa-ind.submit
    # save results of first to second level
    datalad save -m '2nd lvl results movie (individuals)'

    # audio-description, subject space, first level
    condor_submit code/compute_1st-lvl_audio-ppa-ind.submit
    # in .feat-directories, create templates and transforms
    ./code/reg2std4feat inputs/studyforrest-data-templatetransforms bold3Tp2 bold3Tp2 sub-*/run-?_audio-ppa-ind.feat
    # audio-description, subject space, second level
    condor_submit code/compute_2nd-lvl_audio-ppa-ind.submit
    # save results of first to second level
    datalad save -m '2nd lvl results audio (individuals)'

### comment: some cleaning that we did
    git annex unused
    git annex dropunused all --force
    datalad drop --nocheck sub*/*.feat/filtered_func_data.nii.gz
    datalad drop --nocheck sub*/*.feat/stats/res4d.nii.gz
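
For reference: `git annex unused` lists annexed file contents that are no longer
referenced by any branch or tag, and `git annex dropunused all --force` deletes
them. `datalad drop --nocheck` removes file content without verifying that
another copy is available elsewhere, so the dropped intermediate FEAT outputs
can only be re-obtained by re-computing the analyses.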