
Merge remote-tracking branch 'github/master'

Michael Hanke 3 years ago
2 changed files with 105 additions and 85 deletions

+ 105 - 0

@@ -0,0 +1,105 @@
+# Dataset
+[![No registration or authentication required](]()
+## Pre-aligned MRI data
+This repository contains data derived from the raw data releases of the
+** project. In particular these are:
+* BOLD fMRI timeseries aligned to subject-specific template images
+  and using transformations available from 
+For more information about the project visit:
+## File name conventions
+Each directory in the subject directories corresponds to one template image
+space. Data in ``sub*`` directories are participant-specific (not aligned
+across participants). However, templates with
+the same name have corresponding input data.
+Each directory contains one or more image files with more-or-less
+self-explanatory names, identifying the corresponding participant and scan.
+Lastly, the ``code/`` directory contains the source code for computing all
+files contained, as well as a number of validation analyses.
+## How to obtain the dataset
+This repository is a [DataLad]( dataset. It provides
+fine-grained data access down to the level of individual files, and allows for
+tracking future updates. In order to use this repository for data retrieval,
+[DataLad]( is required. It is a free and
+open source command line tool, available for all major operating
+systems, and builds on Git and [git-annex](
+to allow sharing, synchronizing, and version controlling collections of
+large files. You can find information on how to install DataLad at
+### Get the dataset
+A DataLad dataset can be `cloned` by running
+datalad clone <url>
+Once a dataset is cloned, it is a light-weight directory on your local machine.
+At this point, it contains only small metadata and information on the
+identity of the files in the dataset, but not actual *content* of the
+(sometimes large) data files.
+### Retrieve dataset content
+After cloning a dataset, you can retrieve file contents by running
+datalad get <path/to/directory/or/file>
+This command will trigger a download of the files, directories, or
+subdatasets you have specified.
+DataLad datasets can contain other datasets, so-called *subdatasets*.
+If you clone the top-level dataset, subdatasets do not yet contain
+metadata and information on the identity of files, but appear to be
+empty directories. In order to retrieve file availability metadata in
+subdatasets, run
+datalad get -n <path/to/subdataset>
+Afterwards, you can browse the retrieved metadata to find out about
+subdataset contents, and retrieve individual files with `datalad get`.
+If you use `datalad get <path/to/subdataset>`, all contents of the
+subdataset will be downloaded at once.
+### Keep data up-to-date
+DataLad datasets can be updated. The command `datalad update` will
+*fetch* updates and store them on a different branch (by default
+`remotes/origin/master`). Running
+datalad update --merge
+will *pull* available updates and integrate them in one go.
+### Find out what has been done
+DataLad datasets contain their history in the ``git log``.  By running ``git
+log`` (or a tool that displays Git history) in the dataset or on specific files,
+you can find out what has been done to the dataset or to individual files by
+whom, and when.
+### More information
+More information on DataLad and how to use it can be found in the DataLad Handbook at
+[]( The chapter
+"DataLad datasets" can help you to familiarize yourself with the concept of a dataset.
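
The retrieval and update steps described in the added README can be sketched as a small shell helper. This is a minimal illustration and not part of the commit: the names `run`, `fetch_subset`, and `update_clone`, the `DRY_RUN` variable, and the example URL and paths are assumptions introduced here. By default the commands are only printed, so the sketch can be inspected without DataLad installed.

```shell
# Dry-run wrapper (hypothetical helper): prints the command by default.
# Set DRY_RUN=0 to actually execute it (requires DataLad to be installed).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Clone a dataset and retrieve the content of one path inside it.
# $1 = dataset URL, $2 = local directory, $3 = path inside the dataset
# The body runs in a subshell so the directory change stays local.
fetch_subset() (
    run datalad clone "$1" "$2" || exit 1
    run cd "$2" || exit 1
    run datalad get "$3"
)

# Pull and merge available updates into an existing clone.
# $1 = local directory of the clone
update_clone() (
    run cd "$1" || exit 1
    run datalad update --merge
)
```

Calling `fetch_subset https://example.org/dataset.git dataset sub-01` prints the clone, directory change, and `datalad get` commands in order; with `DRY_RUN=0` the same call would run them. To inspect a subdataset before downloading its content, the `datalad get -n <path/to/subdataset>` form mentioned in the README could be passed through `run` in the same way.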

+ 0 - 85

@@ -1,85 +0,0 @@ Dataset
-|license| |access|
-Pre-aligned MRI data
-This repository contains data derived from the raw data releases of the
-** project. In particular these are:
-* BOLD fMRI timeseries aligned to subject-specific template images
-  and using transformations available from 
-For more information about the project visit:
-File name conventions
-Each directory in the subject directories corresponds to one template image
-space. Data in ``sub*`` directories are participant-specific (not aligned
-across participants). However, templates with
-the same name have corresponding input data.
-Each directory contains one or more image files with more-or-less
-self-explanatory names, identifying the corresponding participant and scan.
-Lastly, the ``code/`` directory contains the source code for computing all
-files contained, as well as a number of validation analyses.
-How to obtain the dataset
-This repository contains metadata and information on the identity of all
-included files. However, the actual content of the (sometime large) data
-files is stored elsewhere. To obtain any dataset component, git-annex_ is
-required in addition to Git_.
-1. Clone this repository to the desired location.
-2. Enter the directory with the local clone and run::
-     git annex init
-   Older versions of git-annex may require you to run the following
-   command immediately afterwards::
-     git annex enableremote mddatasrc
-Now any desired dataset component can be obtained by using the ``git annex get``
-command. To obtain the entire dataset content run::
-     git annex get .
-Keep data up-to-date
-If updates to this dataset are made in the future, update any local clone by
-     git pull
-followed by::
-     git annex get .
-to fetch all new files.
-.. _Git:
-.. _git-annex:
-.. |license|
-   image::
-    :target:
-    :alt: PDDL-licensed
-.. |access|
-   image::
-    :alt: No registration or authentication required