6 months ago · 84dc144bc8
--- a/docs/OHBMposter.rst
+++ b/docs/OHBMposter.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! OHBM 2020 Poster
			
 
				 .. _ohbm2020poster:
			
 
				 
			
 
				 Handbook Poster from the 2020 (virtual) OHBM
			
 
				 --------------------------------------------
			
 
				 
			
 
				-.. index:: ! OHBM 2020 Poster
			
 
				-
			
 
				 Here you can find the poster about the DataLad Handbook, presented at the `2020 virtual OHBM <https://www.humanbrainmapping.org/i4a/pages/index.cfm?pageid=3958>`_ under poster number 1914.
			
 
				 
			
 
				 .. only:: html
			
--- a/docs/basics/101-123-config2.rst
+++ b/docs/basics/101-123-config2.rst
@@ -17,11 +17,11 @@ and not git-annex (see section :ref:`sibling`). The configuration responsible
 
				 for this behavior is in a ``.gitattributes`` file, and we'll start this
			
 
				 section by looking into it.
			
 
				 
			
 
				+.. index:: ! Config files; .gitattributes
			
 
				+
			
 
				 ``.gitattributes``
			
 
				 ^^^^^^^^^^^^^^^^^^
			
 
				 
			
 
				-.. index:: ! Config files; .gitattributes
			
 
				-
			
 
				 This file lies right in the root of your superdataset:
			
 
				 
			
 
				 .. runrecord:: _examples/DL-101-123-101
			
@@ -133,11 +133,11 @@ Later however you will see preconfigured DataLad *procedures* such as ``text2git
 
				 can apply useful configurations for you, just as ``text2git`` added the last line
			
 
				 in the root ``.gitattributes`` file.
			
 
				 
			
 
				+.. index:: ! Config files; .gitmodules
			
 
				+
			
 
				 ``.gitmodules``
			
 
				 ^^^^^^^^^^^^^^^
			
 
				 
			
 
				-.. index:: ! Config files; .gitmodules
			
 
				-
			
 
				 One last configuration file that Git creates is the ``.gitmodules`` file.
			
 
				 There is one right in the root of your dataset:
			
 
				 
			
@@ -276,13 +276,11 @@ Nevertheless, if ``subds1`` is provided with an explicit path, its subdataset ``
 
				     6 directories, 0 files
			
 
				 
			
 
				 
			
 
				-
			
 
				+.. index:: ! Config files; .datalad/config
			
 
				 
			
 
				 ``.datalad/config``
			
 
				 ^^^^^^^^^^^^^^^^^^^
			
 
				 
			
 
				-.. index:: ! Config files; .datalad/config
			
 
				-
			
 
				 DataLad adds a repository-specific configuration file as well.
			
 
				 It can be found in the ``.datalad`` directory, and just like ``.gitattributes``
			
 
				 and ``.gitmodules`` it is version controlled and is thus shared together with
			
@@ -369,13 +367,12 @@ command. This is due to its different format that does not comply to the
 
				 ``section.variable.value`` structure of all other configuration files. This file, therefore,
			
 
				 has to be edited by hand, with an editor of your choice.
			
 
				 
			
 
				+.. index:: ! environment variable
			
 
				 .. _envvars:
			
 
				 
			
 
				 Environment variables
			
 
				 ^^^^^^^^^^^^^^^^^^^^^
			
 
				 
			
 
				-.. index:: ! environment variable
			
 
				-
			
 
				 An :term:`environment variable` is a variable set up in your shell
			
 
				 that affects the way the shell or certain software works -- for example
			
 
				 the environment variables ``HOME``, ``PWD``, or ``PATH``.
			
--- a/docs/basics/101-124-procedures.rst
+++ b/docs/basics/101-124-procedures.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! procedures, run-procedures
			
 
				 .. _procedures:
			
 
				 
			
 
				 Configurations to go
			
 
				 --------------------
			
 
				 
			
 
				-.. index:: ! procedures, run-procedures
			
 
				-
			
 
				 The past two sections should have given you a comprehensive
			
 
				 overview on the different configuration options the tools
			
 
				 Git, git-annex, and DataLad provide. They not only
			
--- a/docs/basics/101-127-yoda.rst
+++ b/docs/basics/101-127-yoda.rst
@@ -1,11 +1,10 @@
 
				+.. index:: ! YODA principles
			
 
				 .. _2-001:
			
 
				 .. _yoda:
			
 
				 
			
 
				 YODA: Best practices for data analyses in a dataset
			
 
				 ---------------------------------------------------
			
 
				 
			
 
				-.. index:: ! YODA principles
			
 
				-
			
 
				 The last requirement for the midterm projects reads "needs to comply with the
			
 
				 YODA principles".
			
 
				 "What are the YODA principles?" you ask, as you have never heard of this
			
--- a/docs/basics/101-133-containersrun.rst
+++ b/docs/basics/101-133-containersrun.rst
@@ -122,13 +122,13 @@ Singularity (even without having Docker installed).
 
				    If it reports an error that asks "Is the docker daemon running?" give it a few more minutes to let Docker Desktop start it.
			
 
				    If it can't find the docker command, something went wrong during installation.
			
 
				 
			
 
				-Using ``datalad containers``
			
 
				-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
			
 
				-
			
 
				 .. index::
			
 
				    pair: containers-add; DataLad command
			
 
				    pair: containers-run; DataLad command
			
 
				 
			
 
				+Using ``datalad containers``
			
 
				+^^^^^^^^^^^^^^^^^^^^^^^^^^^^
			
 
				+
			
 
				 One core feature of the ``datalad containers`` extension is that it registers
			
 
				 computational containers to a dataset. This is done with the
			
 
				 :dlcmd:`containers-add` command.
			
--- a/docs/basics/101-136-cheatsheet.rst
+++ b/docs/basics/101-136-cheatsheet.rst
@@ -1,12 +1,11 @@
 
				 .. index:: ! 1-001
			
 
				+.. index:: ! Cheatsheet
			
 
				 .. _1-001:
			
 
				 .. _cheat:
			
 
				 
			
 
				 DataLad cheat sheet
			
 
				 -------------------
			
 
				 
			
 
				-.. index:: ! Cheatsheet
			
 
				-
			
 
				 .. only:: html
			
 
				 
			
 
				    Click on the image below to obtain a PDF version of the cheat sheet. Individual
			
--- a/docs/basics/101-139-gin.rst
+++ b/docs/basics/101-139-gin.rst
@@ -44,12 +44,12 @@ You should copy the contents of your public key file into the field labeled
 
				 "My private work station". Afterwards, you are done!
			
 
				 
			
 
				 
			
 
				-Publishing your dataset to GIN
			
 
				-^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
			
 
				-
			
 
				 .. index::
			
 
				    pair: create-sibling-gin; DataLad command
			
 
				 
			
 
				+Publishing your dataset to GIN
			
 
				+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
			
 
				+
			
 
				 As outlined in the section :ref:`share_hostingservice`, there are two ways in which you can publish your dataset to Gin.
			
 
				 Either by 1) creating a new, empty repository on GIN via the web interface, or 2) via the :dlcmd:`create-sibling-gin` command.
			
 
				 
			
--- a/docs/intro/howto.rst
+++ b/docs/intro/howto.rst
@@ -1,11 +1,10 @@
 
				+.. index:: ! terminal, ! shell, ! command Line
			
 
				 .. _howto:
			
 
				 
			
 
				 ****************
			
 
				 The command line
			
 
				 ****************
			
 
				 
			
 
				-.. index:: ! terminal, ! shell, ! command Line
			
 
				-
			
 
				 This chapter aims at providing novices with general basics about the shell, common Unix
			
 
				 commands and their Windows equivalent, and some general file system facts.
			
 
				 This chapter is also a place to return to and (re-)read if you come across a
			
@@ -329,7 +328,7 @@ To determine what shell you're in, run the following:
 
				 
			
 
				 .. index:: ! tab completion
			
 
				 
			
 
				-Tab Completion
			
 
				+Tab completion
			
 
				 ==============
			
 
				 
			
 
				 One of the best features ever invented is tab completion. Imagine your favorite animal sitting
			
--- a/docs/usecases/HCP_dataset.rst
+++ b/docs/usecases/HCP_dataset.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Scaling up: 80TB and 15 million files
			
 
				 .. _usecase_HCP_dataset:
			
 
				 
			
 
				 Scaling up: Managing 80TB and 15 million files from the HCP release
			
 
				 -------------------------------------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Scaling up: 80TB and 15 million files
			
 
				-
			
 
				 This use case outlines how a large data collection can be version controlled
			
 
				 and published in an accessible manner with DataLad in a remote indexed
			
 
				 archive (RIA) data store. Using the
			
@@ -38,11 +37,11 @@ without circumventing or breaching the data providers terms:
 
				 #. The :dlcmd:`copy-file` can be used to subsample special-purpose datasets
			
 
				    for faster access.
			
 
				 
			
 
				+.. index:: ! Human Connectome Project (HCP)
			
 
				+
			
 
				 The Challenge
			
 
				 ^^^^^^^^^^^^^
			
 
				 
			
 
				-.. index:: ! Human Connectome Project (HCP)
			
 
				-
			
 
				 The `Human Connectome Project <http://www.humanconnectomeproject.org>`_ aims
			
 
				 to provide an unparalleled compilation of neural data through a customized
			
 
				 database. Its largest open access data collection is the
			
@@ -143,12 +142,12 @@ Building and publishing a DataLad dataset with HCP data consists of several step
 
				 an access point to all files in the HCP data release. The upcoming subsections
			
 
				 detail each of these.
			
 
				 
			
 
				-Dataset creation with ``datalad addurls``
			
 
				-"""""""""""""""""""""""""""""""""""""""""
			
 
				-
			
 
				 .. index::
			
 
				    pair: addurls; DataLad command
			
 
				 
			
 
				+Dataset creation with ``datalad addurls``
			
 
				+"""""""""""""""""""""""""""""""""""""""""
			
 
				+
			
 
				 The :dlcmd:`addurls` command
			
 
				 allows you to create (and update) potentially nested DataLad datasets from a list
			
 
				 of download URLs that point to the HCP files in the S3 buckets.
			
@@ -296,11 +295,11 @@ hidden section below.
 
				    ran over the Christmas break and finished before everyone went back to work.
			
 
				    Getting 15 million files into datasets? Check!
			
 
				 
			
 
				+.. index:: Remote Indexed Archive (RIA) store
			
 
				+
			
 
				 Using a Remote Indexed Archive Store for dataset hosting
			
 
				 """"""""""""""""""""""""""""""""""""""""""""""""""""""""
			
 
				 
			
 
				-.. index:: Remote Indexed Archive (RIA) store
			
 
				-
			
 
				 All datasets were built on a scientific compute cluster. In this location, however,
			
 
				 datasets would only be accessible to users with an account on this system.
			
 
				 Subsequently, therefore, everything was published with
			
--- a/docs/usecases/collaborative_data_management.rst
+++ b/docs/usecases/collaborative_data_management.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Collaboration
			
 
				 .. _usecase_collab:
			
 
				 
			
 
				 A typical collaborative data management workflow
			
 
				 ------------------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Collaboration
			
 
				-
			
 
				 This use case sketches the basics of a common, collaborative
			
 
				 data management workflow for an analysis:
			
 
				 
			
--- a/docs/usecases/encrypted_annex.rst
+++ b/docs/usecases/encrypted_annex.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Encrypted data storage and transport
			
 
				 .. _usecase_encrypted_annex:
			
 
				 
			
 
				 Encrypted data storage and transport
			
 
				 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
			
 
				 
			
 
				-.. index:: ! Usecase; Encrypted data storage and transport
			
 
				-
			
 
				 Some data are not meant for everybody's eyes - you can share a picture from a midflight-plane-window-view without a problem on your social media account, but you `shouldn't post a photo of your plane ticket next to it <https://mango.pdf.zone/finding-former-australian-prime-minister-tony-abbotts-passport-number-on-instagram>`_.
			
 
				 But there are also data so sensitive that not only should you not share them anywhere, you also need to make sure that they are inaccessible even when someone sneaks into your storage system or intercepts a file transfer - things such as passwords, private messages, or medical data.
			
 
				 One technical solution for this problem is `encryption <https://en.wikipedia.org/wiki/Encryption>`_.
			
--- a/docs/usecases/ml-analysis.rst
+++ b/docs/usecases/ml-analysis.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Machine Learning Analysis
			
 
				 .. _usecase_ML:
			
 
				 
			
 
				 DataLad for reproducible machine-learning analyses
			
 
				 --------------------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Machine Learning Analysis
			
 
				-
			
 
				 This use case demonstrates an automatically and computationally reproducible analysis in the context of a machine learning (ML) project.
			
 
				 It demonstrates on an example image classification analysis project how one can
			
 
				 
			
--- a/docs/usecases/provenance_tracking.rst
+++ b/docs/usecases/provenance_tracking.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Basic provenance tracking
			
 
				 .. _usecase_provenance_tracking:
			
 
				 
			
 
				 Basic provenance tracking
			
 
				 -------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Basic provenance tracking
			
 
				-
			
 
				 This use case demonstrates how the provenance of downloaded and generated files
			
 
				 can be captured with DataLad by
			
 
				 
			
--- a/docs/usecases/reproducible-paper.rst
+++ b/docs/usecases/reproducible-paper.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; reproducible paper
			
 
				 .. _usecase_reproducible_paper:
			
 
				 
			
 
				 Writing a reproducible paper
			
 
				 ----------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; reproducible paper
			
 
				-
			
 
				 This use case demonstrates how to use nested DataLad datasets to create a fully
			
 
				 reproducible paper by linking
			
 
				 
			
--- a/docs/usecases/reproducible_neuroimaging_analysis.rst
+++ b/docs/usecases/reproducible_neuroimaging_analysis.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Reproducible Neuroimaging
			
 
				 .. _usecase_reproduce_neuroimg:
			
 
				 
			
 
				 An automatically and computationally reproducible neuroimaging analysis from scratch
			
 
				 ------------------------------------------------------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Reproducible Neuroimaging
			
 
				-
			
 
				 This use case sketches the basics of a portable analysis that can be
			
 
				 automatically computationally reproduced, starting from the
			
 
				 acquisition of a neuroimaging dataset with a magnetic resonance imaging (MRI)
			
--- a/docs/usecases/reproducible_neuroimaging_analysis_simple.rst
+++ b/docs/usecases/reproducible_neuroimaging_analysis_simple.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Basic Reproducible Neuroimaging
			
 
				 .. _usecase_reproduce_neuroimg_simple:
			
 
				 
			
 
				 A basic automatically and computationally reproducible neuroimaging analysis
			
 
				 ----------------------------------------------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Basic Reproducible Neuroimaging
			
 
				-
			
 
				 This use case sketches the basics of a portable analysis of public neuroimaging data
			
 
				 that can be automatically computationally reproduced by anyone:
			
 
				 
			
--- a/docs/usecases/supervision.rst
+++ b/docs/usecases/supervision.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Student supervision
			
 
				 .. _usecase_student_supervision:
			
 
				 
			
 
				 Student supervision in a research project
			
 
				 -----------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Student supervision
			
 
				-
			
 
				 This use case will demonstrate a workflow that uses DataLad tools and principles
			
 
				 to assist in technical aspects of supervising research projects with computational
			
 
				 components.
			
--- a/docs/usecases/using_globus_as_datastore.rst
+++ b/docs/usecases/using_globus_as_datastore.rst
@@ -1,10 +1,9 @@
 
				+.. index:: ! Usecase; Using Globus as data store
			
 
				 .. _usecase_using_globus_as_datastore:
			
 
				 
			
 
				 Using Globus as a data store for the Canadian Open Neuroscience Portal
			
 
				 ----------------------------------------------------------------------
			
 
				 
			
 
				-.. index:: ! Usecase; Using Globus as data store
			
 
				-
			
 
				 This use case shows how the `Canadian Open Neuroscience Portal (CONP) <https://conp.ca>`_
			
 
				 disseminates data as DataLad datasets using the `Globus <https://www.globus.org>`_
			
 
				 network with :term:`git-annex`, a custom git-annex :term:`special remote`, and