
Add commas to 'for example', where applicable

Adina Wagner committed 6 months ago
Commit 564ebe6ebc
38 changed files with 64 additions and 64 deletions
  1. docs/basics/101-101-create.rst (+1, -0)
  2. docs/basics/101-105-install.rst (+3, -3)
  3. docs/basics/101-107-summary.rst (+1, -1)
  4. docs/basics/101-108-run.rst (+1, -1)
  5. docs/basics/101-110-run2.rst (+1, -2)
  6. docs/basics/101-115-symlinks.rst (+1, -1)
  7. docs/basics/101-117-sharelocal2.rst (+1, -1)
  8. docs/basics/101-120-summary.rst (+2, -2)
  9. docs/basics/101-122-config.rst (+4, -4)
  10. docs/basics/101-123-config2.rst (+4, -4)
  11. docs/basics/101-124-procedures.rst (+1, -1)
  12. docs/basics/101-127-yoda.rst (+5, -5)
  13. docs/basics/101-130-yodaproject.rst (+2, -2)
  14. docs/basics/101-132-advancednesting.rst (+1, -1)
  15. docs/basics/101-133-containersrun.rst (+1, -1)
  16. docs/basics/101-135-help.rst (+1, -1)
  17. docs/basics/101-136-filesystem.rst (+1, -1)
  18. docs/basics/101-137-history.rst (+3, -3)
  19. docs/basics/101-138-sharethirdparty.rst (+1, -1)
  20. docs/basics/101-139-dropbox.rst (+1, -1)
  21. docs/basics/101-139-figshare.rst (+1, -1)
  22. docs/basics/101-139-gitlfs.rst (+1, -1)
  23. docs/basics/101-139-hostingservices.rst (+2, -2)
  24. docs/basics/101-139-privacy.rst (+1, -1)
  25. docs/basics/101-146-gists.rst (+1, -1)
  26. docs/basics/101-180-FAQ.rst (+2, -2)
  27. docs/beyond_basics/101-145-hooks.rst (+1, -1)
  28. docs/beyond_basics/101-146-providers.rst (+1, -1)
  29. docs/beyond_basics/101-148-clonepriority.rst (+1, -1)
  30. docs/beyond_basics/101-164-dataladdening.rst (+2, -2)
  31. docs/beyond_basics/101-168-dvc.rst (+4, -4)
  32. docs/beyond_basics/101-170-dataladrun.rst (+4, -4)
  33. docs/intro/filenaming.rst (+1, -1)
  34. docs/intro/howto.rst (+1, -1)
  35. docs/intro/windows.rst (+2, -2)
  36. docs/usecases/ml-analysis.rst (+1, -1)
  37. docs/usecases/reproducible-paper.rst (+1, -1)
  38. docs/usecases/reproducible_neuroimaging_analysis.rst (+1, -1)

+ 1 - 0
docs/basics/101-101-create.rst

@@ -83,7 +83,7 @@ can be tracked (should you want them to be tracked).
 *Tracking* in this context means that edits done to a file are automatically
 associated with information about the change, the author of the edit,
 and the time of this change. This is already informative and important on its own
+-- the :term:`provenance` captured with this can, for example, be used to learn
 about a file's lineage, and can establish trust in it.
 But what is especially helpful is that previous states of files or directories
 can be restored. Remember the last time you accidentally deleted content

+ 3 - 3
docs/basics/101-105-install.rst

@@ -183,7 +183,7 @@ a download of that many ``.mp3`` files not take much more time?
 Here you can see another important feature of DataLad datasets
 and the :dlcmd:`clone` command:
 Upon installation of a DataLad dataset, DataLad retrieves only small files
-(for example text files or markdown files) and (small) metadata
+(for example, text files or markdown files) and (small) metadata
 about the dataset. It does not, however, download any large files
 (yet). The metadata exposes the dataset's file hierarchy
 for exploration (note how you are able to list the dataset contents with ``ls``),
@@ -334,7 +334,7 @@ really helpful to save disk space for data you can easily reobtain, for example"
 The :dlcmd:`drop` command will remove
 file contents completely from your dataset.
 You should only use this command to remove contents that you can :dlcmd:`get`
-again, or generate again (for example with next chapter's :dlcmd:`datalad run`
+again, or generate again (for example, with next chapter's :dlcmd:`datalad run`
 command), or that you really do not need anymore.
 
 Let's remove the content of one of the files that we have downloaded, and check
@@ -470,7 +470,7 @@ modification.
 
 .. rubric:: Footnotes
 
-.. [#f1] Additionally, a source  can also be a pointer to an open-data collection,
+.. [#f1] Additionally, a source can also be a pointer to an open-data collection,
          for example :term:`the DataLad superdataset ///` -- more on what this is and how to
          use it later, though.
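
Taken together, the clone/get/drop cycle covered in this file could look like the following sketch; the URL points to the public podcast dataset used in the handbook, and the file path is a placeholder:

   $ datalad clone https://github.com/datalad-datasets/longnow-podcasts.git
   $ cd longnow-podcasts
   $ datalad get <path/to/one/episode>.mp3    # retrieve the large file content on demand
   $ datalad drop <path/to/one/episode>.mp3   # free the disk space again; the file stays re-obtainable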
 

+ 1 - 1
docs/basics/101-107-summary.rst

@@ -85,7 +85,7 @@ Now what I can do with that?
 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 
 Simple, local workflows allow you to version control changing small files,
-for example your CV, your code, or a book that you are working on, but
+for example, your CV, your code, or a book that you are working on, but
 you can also add very large files to your dataset's history.
 Currently, this can be considered "best-practice building": Frequent :dlcmd:`status`
 commands, :dlcmd:`save` commands to save dataset modifications,

+ 1 - 1
docs/basics/101-108-run.rst

@@ -99,7 +99,7 @@ will write it into the script.
    Instead, it only displays the base of the file name and indicates the file type with the display icon.
    You can see if this is the case for you, too, by opening the ``books\`` directory in a file explorer, and checking if the file extension (``.pdf``) is a part of the file name displayed underneath its PDF icon.
 
-   Hidden file extensions can be a confusing source of errors, because some Windows editors (for example Notepad) automatically add a ``.txt`` extension to your files -- when you save the script above under the name ``list_titles.sh``, your editor may add an extension (``list_titles.sh.txt``), and the file explorer displays your file as ``list_titles.sh`` (hiding the ``.txt`` extension).
+   Hidden file extensions can be a confusing source of errors, because some Windows editors (for example, Notepad) automatically add a ``.txt`` extension to your files -- when you save the script above under the name ``list_titles.sh``, your editor may add an extension (``list_titles.sh.txt``), and the file explorer displays your file as ``list_titles.sh`` (hiding the ``.txt`` extension).
 
    To prevent confusion, configure the file explorer to always show you the file extension.
    For this, open the Explorer, click on the "View" tab, and tick the box "File name extensions".

+ 1 - 2
docs/basics/101-110-run2.rst

@@ -328,8 +328,7 @@ the ``-o``/``--output`` option.
 
    The use case here is simplistic -- a single file gets modified.
    But there are commands and tools that create full directories with
-   many files as an output, for example
-   `FSL <https://fsl.fmrib.ox.ac.uk>`_, a neuro-imaging tool.
+   many files as an output.
    The easiest way to specify this type of output
    is by supplying the directory name, or the directory name and a :term:`globbing` character, such as
    ``-o directory/*.dat``.
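
A hedged sketch of how such an output specification might be passed to a :dlcmd:`run` call (directory names and the script are made-up placeholders):

   $ datalad run -m "process all raw files" \
       --input "raw/*.dat" \
       --output "processed/*" \
       "bash code/process.sh"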

+ 1 - 1
docs/basics/101-115-symlinks.rst

@@ -75,7 +75,7 @@ defined based on
 
 #. file size
 
-#. and/or path/pattern, and thus for example file extensions,
+#. and/or path/pattern, and thus, for example, file extensions,
    or names, or file types (e.g., text files, as with the
    ``text2git`` configuration template).
 

+ 1 - 1
docs/basics/101-117-sharelocal2.rst

@@ -149,4 +149,4 @@ this in the original ``DataLad-101`` directory, and do not forget to save it.
          data on an account belonging to user ``mih`` on the host name ``medusa``.
          Because we do not have the host names' address, nor log-in credentials for
          this user, we can not retrieve content from this location. However, somebody
-         else (for example the user ``mih``) could.
+         else (for example, the user ``mih``) could.
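
To see where git-annex knows copies of a file to be, one would query the file in question, for example (file name is a placeholder):

   $ git annex whereis books/<somebook>.pdf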

+ 2 - 2
docs/basics/101-120-summary.rst

@@ -65,8 +65,8 @@ the book you will see examples in which datasets are shared on the same
 file system in surprisingly useful ways.
 
 Simultaneously, you have observed dataset properties you already knew
-(for example how annexed files need to be retrieved via :dlcmd:`get`),
-but you have also seen novel aspects of a dataset -- for example that
+(for example, how annexed files need to be retrieved via :dlcmd:`get`),
+but you have also seen novel aspects of a dataset -- for example, that
 subdatasets are not automatically installed by default, how
 :gitannexcmd:`whereis` can help you find out where file content might be stored,
 how useful commands that capture provenance about the origin or creation of files

+ 4 - 4
docs/basics/101-122-config.rst

@@ -5,7 +5,7 @@ DIY configurations
 
 Back in section :ref:`text2git`, you already learned that there
 are dataset configurations, and that these configurations can
-be modified, for example with the ``-c text2git`` option.
+be modified, for example, with the ``-c text2git`` option.
 This option applies a configuration template to store text
 files in :term:`Git` instead of :term:`git-annex`, and thereby
 modifies the DataLad dataset's default configuration to store
@@ -107,7 +107,7 @@ configuration (namely to a single repository), but it can overrule global
 configurations: The more specific the scope of a configuration file is, the more
 important it is, and the variables in the more specific configuration
 will take precedence over variables in less specific configuration files.
-One could for example have :term:`vim` configured to be the default editor
+One could, for example, have :term:`vim` configured to be the default editor
 on a global scope, but could overrule this by setting the editor to nano
 in a given repository. For this reason, the repository-specific configuration
 does not reside in a file in your home directory, but in ``.git/config``
@@ -149,7 +149,7 @@ this configures core Git functionality. There are
 configurations than the ones in this config file, but
 they are related to Git, and less related or important to the configuration of
 a DataLad dataset. We will use this section to showcase the anatomy of the
-:gitcmd:`config` command. If for example you would want to specifically
+:gitcmd:`config` command. If, for example, you would want to specifically
 configure :term:`nano` to be the default editor in this dataset, you
 can do it like this:
 
@@ -204,7 +204,7 @@ specified in the command.
 .. find-out-more:: If things go wrong during Git config
 
    If something goes wrong during the :gitcmd:`config` command,
-   for example you end up having two keys of the same name because you
+   for example, you end up having two keys of the same name because you
    added a key instead of replacing an existing one, you can use the
    ``--unset`` option to remove the line. Alternatively, you can also open
    the config file in an editor and remove or change sections by hand.
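
As a quick reference, the editor example from this section boils down to commands like these; the scope flag determines which configuration file is written to:

   $ git config --local core.editor "nano"     # .git/config, this dataset only
   $ git config --global core.editor "vim"     # ~/.gitconfig, all of your repositories
   $ git config --local --unset core.editor    # remove the dataset-level setting again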

+ 4 - 4
docs/basics/101-123-config2.rst

@@ -318,7 +318,7 @@ files, and I do not know with which command I can write into these files."
 it's also the :gitcmd:`config` command. The only part of it you need to
 adjust is the ``-f``, ``--file`` parameter. By default, the command writes to
 a Git config file. But it can write to a different file if you specify it
-appropriately. For example
+appropriately. For example,
 
    ``git config --file=.gitmodules --replace-all submodule."name".url "new URL"``
 
@@ -377,7 +377,7 @@ Environment variables
 ^^^^^^^^^^^^^^^^^^^^^
 
 An :term:`environment variable` is a variable set up in your shell
-that affects the way the shell or certain software works -- for example
+that affects the way the shell or certain software works -- for example,
 the environment variables ``HOME``, ``PWD``, or ``PATH``.
 Configuration options that determine the behavior of Git, git-annex, and
 DataLad that could be defined in a configuration file can also be set (or overridden)
@@ -408,7 +408,7 @@ configuration option thus is the environment variable ``DATALAD_LOG_LEVEL``.
 
    Names of environment variables are often all-uppercase. While the ``$`` is not part of
    the name of the environment variable, it is necessary to *refer* to the environment
-   variable: To reference the value of the environment variable ``HOME`` for example you would
+   variable: To reference the value of the environment variable ``HOME``, for example, you would
    need to use ``echo $HOME`` and not ``echo HOME``. However, environment variables are
    set without a leading ``$``. There are several ways to set an environment variable
    (note that there are no spaces before and after the ``=`` !), leading to different
@@ -498,7 +498,7 @@ Write a note about configurations in datasets into ``notes.txt``.
          extension (such as ``.txt``, ``.pdf``, ``.jpg``) for the operating system to know
          how to open or use this file (in contrast to Windows, which does not know how to
          open a file without an extension). To do this, Unix systems rely on a file's
-         MIME type -- information about a file's content. A ``.txt`` file for example
+         MIME type -- information about a file's content. A ``.txt`` file, for example,
          has MIME type ``text/plain`` as does a bash script (``.sh``), a Python
          script has MIME type ``text/x-python``, a ``.jpg`` file is ``image/jpg``, and
          a ``.pdf`` file has MIME type ``application/pdf``. You can find out the MIME type

+ 1 - 1
docs/basics/101-124-procedures.rst

@@ -63,7 +63,7 @@ Just like ``cfg_text2git``, all DataLad procedures are
 executables (such as a script, or compiled code).
 In principle, they can be written in any language, and perform
 any task inside of a dataset.
-The ``text2git`` configuration for example applies a configuration for how
+The ``text2git`` configuration, for example, applies a configuration for how
 git-annex treats different file types. Other procedures do not
 only modify ``.gitattributes``, but can also populate a dataset
 with particular content, or automate routine tasks such as
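
To explore and apply procedures in an existing dataset, commands along these lines can be used (``cfg_text2git`` ships with DataLad; other names depend on what is installed):

   $ datalad run-procedure --discover       # list procedures known in this dataset
   $ datalad run-procedure cfg_text2git     # apply a configuration procedure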

+ 5 - 5
docs/basics/101-127-yoda.rst

@@ -131,7 +131,7 @@ computational environments, results, ...) in dedicated directories. For example:
 - Collect **results** of an analysis in a dedicated ``outputs/`` directory, and
   leave the input data of an analysis untouched by your computations.
 
-- Include a place for complete **execution environments**, for example
+- Include a place for complete **execution environments**, such as
   `singularity images <https://singularity.lbl.gov>`_ or
   `docker containers <https://www.docker.com/get-started>`_ [#f2]_, in
   the form of an ``envs/`` directory, if relevant for your analysis.
@@ -182,7 +182,7 @@ You can get a few non-DataLad related advice for structuring your directories in
    #. Within ``code/``, it is best practice to add **tests** for the code.
       These tests can be run to check whether the code still works.
 
-   #. It is even better to further use automated computing, for example
+   #. It is even better to further use automated computing such as
       `continuous integration (CI) systems <https://en.wikipedia.org/wiki/Continuous_integration>`_,
       to test the functionality of your functions and scripts automatically.
       If relevant, the setup for continuous integration frameworks (such as
@@ -190,7 +190,7 @@ You can get a few non-DataLad related advice for structuring your directories in
       in a dedicated ``ci/`` directory.
 
    #. Include **documents for fellow humans**: Notes in a README.md or a HOWTO.md,
-      or even proper documentation (for example using  in a dedicated ``docs/`` directory.
+      or even proper documentation (for example, using  in a dedicated ``docs/`` directory.
       Within these documents, include all relevant metadata for your analysis. If you are
       conducting a scientific study, this might be authorship, funding,
       change log, etc.
@@ -228,7 +228,7 @@ more.
 The directory tree above and :numref:`dataset_modules` highlight different aspects
 of this principle. The directory tree illustrates the structure of
 the individual pieces on the file system from the point of view of
-a single top-level dataset with a particular purpose. It for example
+a single top-level dataset with a particular purpose. For example, it
 could be an analysis dataset created by a statistician for a scientific
 project, and it could be shared between collaborators or
 with others during development of the project. In this
@@ -280,7 +280,7 @@ be included in an analysis superdataset as subdatasets. Thanks to
 :dlcmd:`clone`, information on the source of these subdatasets
 is stored in the history of the analysis superdataset, and they can even be
 updated from those sources if the original data dataset gets extended or changed.
-If you are including a file, for example code from GitHub,
+If you are including a file, for example, code from GitHub,
 the :dlcmd:`download-url` command (introduced in section :ref:`populate`)
 will record the source of it safely in the dataset's history. And if you add anything to your dataset,
 from simple incremental coding progress in your analysis scripts up to
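
A minimal, assumption-laden sketch of setting up such a modular analysis dataset (the input URL is a placeholder):

   $ datalad create -c yoda myanalysis       # pre-populates code/, README.md, and YODA configurations
   $ cd myanalysis
   $ datalad clone -d . https://example.org/rawdata inputs/rawdata   # register input data as a subdataset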

+ 2 - 2
docs/basics/101-130-yodaproject.rst

@@ -479,7 +479,7 @@ point with the ``--version-tag`` option of :dlcmd:`save`.
    was added.
    Later we can use this tag to identify the point in time at which
    the analysis setup was ready -- much more intuitive than a 40-character shasum!
-   This is handy in the context of a :dlcmd:`rerun` for example:
+   This is handy in the context of a :dlcmd:`rerun`, for example:
 
    .. code-block:: bash
 
@@ -612,7 +612,7 @@ dataset that you can use for this [#f4]_.
    $ datalad save -m "Provide project description" README.md
 
 Note that one feature of the YODA procedure was that it configured certain files
-(for example everything inside of ``code/``, and the ``README.md`` file in the
+(for example, everything inside of ``code/``, and the ``README.md`` file in the
 root of the dataset) to be saved in Git instead of git-annex. This was the
 reason why the ``README.md`` in the root of the dataset was easily modifiable.
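
The tag-then-rerun pattern mentioned above might, as a sketch, look like this (the tag name is only an example):

   $ datalad save -m "finish analysis setup" --version-tag ready4analysis
   $ datalad rerun --since=ready4analysis    # re-execute everything recorded after the tag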
 

+ 1 - 1
docs/basics/101-132-advancednesting.rst

@@ -165,7 +165,7 @@ interested in this, checkout the :ref:`dedicated Findoutmore <fom-status>`.
       $ datalad -f json_pp status -d . midterm_project
 
    This still was not all of the available functionality of the
-   :dlcmd:`status` command. You could for example adjust whether and
+   :dlcmd:`status` command. You could, for example, adjust whether and
    how untracked dataset content should be reported with the ``--untracked``
    option, or get additional information from annexed content with the ``--annex``
    option (especially powerful when combined with ``-f json_pp``). To get a complete overview on what you could do, check out the technical
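
As a small illustration of those options (the subdataset name follows the example above):

   $ datalad status --untracked all
   $ datalad -f json_pp status --annex basic midterm_project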

+ 1 - 1
docs/basics/101-133-containersrun.rst

@@ -291,7 +291,7 @@ To ensure that the dataset is correctly bind-mounted on all systems, let's add a
    pair: run command; with DataLad containers-run
 
 Now that we have a complete computational environment linked to the ``midterm_project``
-dataset, we can execute commands in this environment. Let us for example try to repeat
+dataset, we can execute commands in this environment. Let us, for example, try to repeat
 the :dlcmd:`run` command from the section :ref:`yoda_project` as a
 :dlcmd:`containers-run` command.
 

+ 1 - 1
docs/basics/101-135-help.rst

@@ -355,7 +355,7 @@ many reasons, but as long as there are other remotes you can access the
 data from, you are fine.
 
 A similar warning message may appear when adding a sibling that is a pure Git
-:term:`remote`, for example a repository on GitHub:
+:term:`remote`, such as a repository on GitHub:
 
 .. code-block:: bash
 

+ 1 - 1
docs/basics/101-136-filesystem.rst

@@ -638,7 +638,7 @@ the Unix :shcmd:`mv` command to move or rename, and the :dlcmd:`save`
 to clean up afterwards, just as in the examples above. Make sure to
 **not** use ``git mv``, especially for subdatasets.
 
-Let's for example rename the ``books`` directory:
+Let's, for example, rename the ``books`` directory:
 
 .. runrecord:: _examples/DL-101-136-151
    :language: console

+ 3 - 3
docs/basics/101-137-history.rst

@@ -138,7 +138,7 @@ DataLad in the editor)!
       $ git rebase -i HEAD~N
 
    where ``N`` specifies how far back you want to rewrite commits.
-   ``git rebase -i HEAD~3`` for example lets you apply changes to the
+   ``git rebase -i HEAD~3``, for example, lets you apply changes to
    any number of commit messages within the last three commits.
 
    Be aware that an interactive rebase lets you *rewrite* history.
@@ -635,7 +635,7 @@ under which situations and how to perform such an interactive rebase.
 However, outlining an interactive rebase here in the handbook could lead to
 problems for readers without (much) Git experience: An interactive rebase,
 even if performed successfully, can lead to many problems if it is applied with
-too little experience, for example in any collaborative real-world project.
+too little experience, for example, in any collaborative real-world project.
 
 .. index::
    pair: revert; Git command
@@ -811,7 +811,7 @@ to remove the ``Gitjoke2.txt`` file.
          this hash. Likewise, the :gitcmd:`diff` can work with commit hashes.
 
 .. [#f2] There are other alternatives to reference commits in the history of a dataset,
-         for example "counting" ancestors of the most recent commit using the notation
+         for example, "counting" ancestors of the most recent commit using the notation
          ``HEAD~2``, ``HEAD^2`` or ``HEAD@{2}``. However, using hashes to reference
         commits is a very fail-safe method and saves you from accidentally miscounting.
 

+ 1 - 1
docs/basics/101-138-sharethirdparty.rst

@@ -19,7 +19,7 @@ But at some point in a dataset's life, you may want to share it with people that
 can't access the computer or server your dataset lives on, store it on other infrastructure
 to save diskspace, or create a backup.
 When this happens, you will want to publish your dataset to repository hosting
-services (for example :term:`GitHub`, :term:`GitLab`, or :term:`Gin`)
+services (for example, :term:`GitHub`, :term:`GitLab`, or :term:`Gin`)
 and/or third party storage providers (such as Dropbox_, Google_,
 `Amazon S3 buckets <https://aws.amazon.com/s3>`_,
 the `Open Science Framework`_ (OSF), and many others).

+ 1 - 1
docs/basics/101-139-dropbox.rst

@@ -293,7 +293,7 @@ to "enable" this special remote (inside of the installed ``DataLad-101``):
      enable -s dropbox-for-friends
    .: dropbox-for-friends(?) [git]
 
-And once this is done, you can get any annexed file contents, for example the
+And once this is done, you can get any annexed file contents, for example, the
 books, or the cropped logos from chapter :ref:`chapter_run`:
 
 .. code-block:: bash

+ 1 - 1
docs/basics/101-139-figshare.rst

@@ -8,7 +8,7 @@ annexed content to a variety of third party infrastructure, DataLad also has
 some built-in support for "exporting" data to other services.
 This usually means that a static snapshot of your dataset and its files are shared
 in archives or collections of files.
-While an export of a dataset loses some of the advantages that a DataLad dataset has, for example a transparent version history, it can be a fast and simple way to make the most recent version of your dataset available or archived.
+While an export of a dataset loses some of the advantages that a DataLad dataset has, for example, a transparent version history, it can be a fast and simple way to make the most recent version of your dataset available or archived.
 
 One example is the command :dlcmd:`export-archive`.
 Running this command creates a ``.tar.gz`` file with the content of your dataset.
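
A rough sketch of such an export (the target file name is arbitrary):

   $ datalad export-archive ../dataset_snapshot.tar.gz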

+ 1 - 1
docs/basics/101-139-gitlfs.rst

@@ -4,7 +4,7 @@ Walk-through: Git LFS as a special remote on GitHub
 ---------------------------------------------------
 
 Some repository hosting services provide for-pay support for large files, and can thus be used as special remotes as well.
-GitHub and GitLab for example support `Git Large File Storage <https://github.com/git-lfs/git-lfs>`_ (Git LFS) for managing data files using Git.
+GitHub and GitLab, for example, support `Git Large File Storage <https://github.com/git-lfs/git-lfs>`_ (Git LFS) for managing data files using Git.
 A free GitHub subscription allows up to `1GB of free storage and up to 1GB of bandwidth monthly <https://docs.github.com/en/repositories/working-with-files/managing-large-files/about-storage-and-bandwidth-usage>`_.
 As such, it might be sufficient for some use cases, and could be configured
 quite easily.
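
Assuming a GitHub repository and SSH access are already set up, the git-annex side of such a configuration might be initialized roughly like this (remote name and URL are placeholders):

   $ git annex initremote github-lfs type=git-lfs encryption=none url=git@github.com:<user>/<repo>.git
   $ git annex copy --to github-lfs .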

+ 2 - 2
docs/basics/101-139-hostingservices.rst

@@ -106,12 +106,12 @@ Each command is slightly tuned towards the peculiarities of each particular plat
 
 - ``[REPONAME]`` (required): The name of the repository on the hosting site. It will be created under a user's namespace, unless this argument includes an organization name prefix. For example, ``datalad create-sibling-github my-awesome-repo`` will create a new repository under ``github.com/<user>/my-awesome-repo``, while ``datalad create-sibling-github <orgname>/my-awesome-repo`` will create a new repository of this name under the GitHub organization ``<orgname>`` (given appropriate permissions).
 - ``-s/--name <name>`` (required): A name under which the sibling is identified. By default, it will be based on or similar to the hosting site. For example, the sibling created with ``datalad create-sibling-github`` will  be called ``github`` by default.
-- ``--credential <name>`` (optional): Credentials used for authentication are stored internally by DataLad under specific names. These names allow you to have multiple credentials, and flexibly decide which one to use. When ``--credential <name>`` is the name of an existing credential, DataLad tries to authenticate with the specified credential; when it does not yet exist DataLad will prompt interactively for a credential, such as an access token, and store it under the given ``<name>`` for future authentications. By default, DataLad will name a credential according to the hosting service URL it used for, for example ``datalad-api.github.com`` as the default for credentials used to authenticate against GitHub.
+- ``--credential <name>`` (optional): Credentials used for authentication are stored internally by DataLad under specific names. These names allow you to have multiple credentials, and flexibly decide which one to use. When ``--credential <name>`` is the name of an existing credential, DataLad tries to authenticate with the specified credential; when it does not yet exist DataLad will prompt interactively for a credential, such as an access token, and store it under the given ``<name>`` for future authentications. By default, DataLad will name a credential according to the hosting service URL it used for, such as ``datalad-api.github.com`` as the default for credentials used to authenticate against GitHub.
 - ``--access-protocol {https|ssh|https-ssh}`` (default ``https``): Whether to use :term:`SSH` or :term:`HTTPS` URLs, or a hybrid version in which HTTPS is used to *pull* and SSH is used to *push*. Using :term:`SSH` URLs requires an :term:`SSH key` setup, but is a very convenient authentication method, especially when pushing updates -- which would need manual input on user name and token with every ``push`` over HTTPS.
 - ``--dry-run`` (optional): With this flag set, the command will not actually create the target repository, but only perform tests for name collisions and report repository name(s).
 - ``--private`` (optional): A switch that, if set, makes sure that the created repository is private.
 
-Other streamlined arguments, such as ``--recursive`` or ``--publish-depends`` allow you to perform more complex configurations, for example publication of dataset hierarchies or connections to :term:`special remote`\s. Upcoming walk-throughs will demonstrate them.
+Other streamlined arguments, such as ``--recursive`` or ``--publish-depends`` allow you to perform more complex configurations, such as publication of dataset hierarchies or connections to :term:`special remote`\s. Upcoming walk-throughs will demonstrate them.
 
 Self-hosted repository services, e.g., Gogs or Gitea instances, have an additional required argument, the ``--api`` flag.
 It needs to point to the URL of the instance, for example
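
Put together, the arguments listed above might be combined like this (organization and repository names are placeholders):

   $ datalad create-sibling-github -s github --access-protocol ssh --private <orgname>/my-awesome-repo
   $ datalad push --to github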

+ 1 - 1
docs/basics/101-139-privacy.rst

@@ -35,7 +35,7 @@ Strategy 2: Restrict access via third party service or file system permissions
 
 When you have a dataset and only authorized actors should be allowed to access it,
 it is possible to set access restrictions simply via choice of (third party) storage permissions.
-When it is an access restricted dataset on shared infrastructure, for example a scientific dataset that only researchers who signed a data usage agreement should have access to, it could suffice to create specific `Unix groups <https://en.wikipedia.org/wiki/Group_identifier>`_ with authorized users, and give only those groups the necessary permissions.
+When it is an access restricted dataset on shared infrastructure, for example, a scientific dataset that only researchers who signed a data usage agreement should have access to, it could suffice to create specific `Unix groups <https://en.wikipedia.org/wiki/Group_identifier>`_ with authorized users, and give only those groups the necessary permissions.
 Depending on what permissions are set, unauthorized actors would not be able to retrieve file contents, or be able to clone the dataset at all.
 
 The ability of repository hosting services to make datasets private and only allow select collaborators access is yet another method of keeping complete datasets as private as necessary, even though you should think twice on whether or not you should host sensitive repositories at all on these services.
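
On a Unix system, such group-based restrictions boil down to standard permission handling, for example (group name and path are made up):

   $ sudo groupadd study-members
   $ sudo chgrp -R study-members /data/study-dataset
   $ sudo chmod -R g+rX,o-rwx /data/study-dataset   # group may read, everyone else may not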

+ 1 - 1
docs/basics/101-146-gists.rst

@@ -28,7 +28,7 @@ Parallelize subdataset processing
 
 DataLad can not yet parallelize processes that are performed
 independently over a large number of subdatasets. Pushing across a dataset
-hierarchy for example, is performed one after the other.
+hierarchy, for example, is performed one after the other.
 Unix, however, has a few tools such as `xargs <https://en.wikipedia.org/wiki/Xargs>`_
 or the ``parallel`` tool of `moreutils <https://joeyh.name/code/moreutils>`_
 that can assist.
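
A sketch of such a pattern, assuming every subdataset has a sibling called ``origin`` to push to:

   $ datalad -f '{path}' subdatasets | xargs -n 1 -P 5 datalad push --to origin -d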

+ 2 - 2
docs/basics/101-180-FAQ.rst

@@ -82,7 +82,7 @@ and functions:
   Whereas git and git-annex would require the caller to first cd to the target
   repository, DataLad figures out which repository the given paths belong to and
   then works within that repository.
-  :dlcmd:`save . --recursive` will solve the subdataset problem above
+  :dlcmd:`save . --recursive` will solve the subdataset problem above,
   for example, no matter what was changed/added, no matter where in a tree
   of subdatasets.
 - DataLad provides users with the ability to act on "virtual" file paths. If
@@ -548,7 +548,7 @@ Here is an example:
 
 .. figure:: ../artwork/src/defaultgitannex_light.png
 
-This is related to GitHub's decision to make ``main`` `the default branch for newly created repositories <https://github.blog/changelog/2020-10-01-the-default-branch-for-newly-created-repositories-is-now-main>`_ -- datasets that do not have a ``main`` branch (but for example a ``master`` branch) may end up with a different branch being displayed on GitHub than intended.
+This is related to GitHub's decision to make ``main`` `the default branch for newly created repositories <https://github.blog/changelog/2020-10-01-the-default-branch-for-newly-created-repositories-is-now-main>`_ -- datasets that do not have a ``main`` branch (but, for example, a ``master`` branch) may end up with a different branch being displayed on GitHub than intended.
 
 To fix this for present and/or future datasets, the default branch can be configured to a branch name of your choice on a repository- or organizational level `via GitHub's web-interface <https://github.blog/changelog/2020-08-26-set-the-default-branch-for-newly-created-repositories>`_.
 Alternatively, you can rename existing ``master`` branches into ``main`` using ``git branch -m master main`` (but beware of unforeseen consequences - your collaborators may try to ``update`` the ``master`` branch but fail, continuous integration workflows could still try to use ``master``, etc.).

+ 1 - 1
docs/beyond_basics/101-145-hooks.rst

@@ -188,7 +188,7 @@ you defined), into the new dataset.
 .. rubric:: Footnotes
 
 .. [#f2] It only needs to be compatible with :gitcmd:`config`. This means that
-         it for example should not contain any dots (``.``).
+         it, for example, should not contain any dots (``.``).
 
 .. [#f3] To re-read about the :gitcmd:`config` command and other configurations
          of DataLad and its underlying tools, go back to the chapter on Configurations,

+ 1 - 1
docs/beyond_basics/101-146-providers.rst

@@ -8,7 +8,7 @@ protocol from various data storage solutions via its downloading commands
 (:dlcmd:`download-url`, :dlcmd:`addurls`,
 :dlcmd:`get`).
 If data retrieval from a storage solution requires *authentication*,
-for example via a username and password combination, DataLad provides an
+for example, via a username and password combination, DataLad provides an
 interface to query, request, and store the most common type of credentials that
 are necessary to authenticate, for a range of authentication types.
 There are a number of natively supported types of authentication and out-of-the
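
Such provider setups live in small configuration files. As a heavily simplified, assumption-based sketch (file name, URL pattern, and credential name are invented), one might look similar to:

   # ~/.config/datalad/providers/example-server.cfg
   [provider:example-server]
   url_re = https://files\.example\.org/.*
   authentication_type = http_basic_auth
   credential = example-server

   [credential:example-server]
   type = user_password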

+ 1 - 1
docs/beyond_basics/101-148-clonepriority.rst

@@ -51,7 +51,7 @@ Clone candidate priority
 We have established that subdatasets can come from several sources.
 Let's now motivate *why* it might be useful to prioritize one subdataset clone location over another one.
 
-Consider a hierarchy of datasets that exist in several locations, for example one :term:`Remote Indexed Archive (RIA) store` *with* a storage special remote [#f2]_, and one without a special remote.
+Consider a hierarchy of datasets that exist in several locations, for example, one :term:`Remote Indexed Archive (RIA) store` *with* a storage special remote [#f2]_, and one without a special remote.
 The topmost superdataset is published to a human-readable and accessible location such as :term:`GitHub` or :term:`GitLab`, and should be configured to always clone subdatasets from the RIA store *with* the storage special remote, even if it was originally created with subdatasets from the RIA store with no storage sibling.
 In order to be able to retrieve subdataset *data* from the subdatasets after cloning the hierarchy of datasets, the RIA store with the storage special remote needs to be configured as a clone candidate.
 Importantly, it should not only be configured as one alternative, but it should be configured as the first location to try to clone from -- else, cloning from the wrong RIA store could succeed and prevent any configured second clone candidate location from being tried.
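
Configuring such a preferred clone candidate is, roughly, a matter of a dataset-level setting in ``.datalad/config``; store name, cost prefix, and URL below are placeholders:

   $ git config -f .datalad/config \
       datalad.get.subdataset-source-candidate-100mystore \
       'ria+ssh://storage.example.org/path/to/store#{id}'
   $ datalad save -m "Prioritize the storage RIA store as clone candidate" .datalad/config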

+ 2 - 2
docs/beyond_basics/101-164-dataladdening.rst

@@ -59,8 +59,8 @@ If you want to transform a series of nested directories into nested datasets, co
    In deciding how many datasets you need, try to follow the benchmarks in chapter :ref:`chapter_gobig` and the yoda principles in section :ref:`yoda`.
    Two simple questions can help you make a decision:
 
-   #. Do you have independently reusable components in your directory, for example data from several studies, or data and code/results? If yes, make each individual component a dataset.
-   #. How large is each individual component? If it exceeds 100k files, split it up into smaller datasets. The decision on where to place subdataset boundaries can be guided by the existing directory structure or by common access patterns, for example based on data type (raw, processed, ...) or subject association. One straightforward organization may be a top-level superdataset and subject-specific subdatasets, mimicking the structure chosen in the use case :ref:`usecase_HCP_dataset`.
+   #. Do you have independently reusable components in your directory, such as data from several studies, or data and code/results? If yes, make each individual component a dataset.
+   #. How large is each individual component? If it exceeds 100k files, split it up into smaller datasets. The decision on where to place subdataset boundaries can be guided by the existing directory structure or by common access patterns, for example, based on data type (raw, processed, ...) or subject association. One straightforward organization may be a top-level superdataset and subject-specific subdatasets, mimicking the structure chosen in the use case :ref:`usecase_HCP_dataset`.
 
 You can automate this with :term:`bash` loops, if you want.
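
Such a loop could, as a rough sketch, look like this (assuming one directory per subject, named ``sub-*``):

   $ datalad create --force .                    # turn the existing top-level directory into a dataset
   $ for sub in sub-*/; do
   >     datalad create --force -d . "$sub"      # register each subject directory as a subdataset
   >     datalad save -d "$sub" -m "Add existing content"
   > done
   $ datalad save -m "Register subject subdatasets"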
 

+ 4 - 4
docs/beyond_basics/101-168-dvc.rst

@@ -319,8 +319,8 @@ DVC uses the term "data remote" to refer to external storage locations for (larg
 
 Both DVC and DataLad support a range of hosting solutions, from local paths and SSH servers to providers such as S3 or GDrive.
 For DVC, every supported remote is pre-implemented, which restricts the number of available services (a list is `here <https://dvc.org/doc/command-reference/remote/add>`_), but results in a convenient, streamlined procedure for adding remotes based on URL schemes.
-DataLad, largely thanks to "external special remotes" mechanism of git-annex, has more storage options (in addition for example :ref:`DropBox <sharethirdparty>`, `the Open Science Framework (OSF) <https://docs.datalad.org/projects/osf>`_, :ref:`Git LFS <gitlfs>`, :ref:`Figshare <figshare>`, :ref:`GIN <gin>`, or :ref:`RIA stores <riastore>`), but depending on selected storage provider, the procedure to add a sibling may differ.
-In addition, DataLad is able to store complete datasets (annexed data *and* Git repository) in certain services (e.g., OSF, GIN, GitHub if used with GitLFS, Dropbox, ...), enabling a clone from for example Google Drive, and while DVC can never keep data in Git repository hosting services, DataLad can do this if the hosting service supports hosting annexed data (default on :term:`Gin` and possible with :term:`GitHub`, :term:`GitLab` or :term:`BitBucket` if used with `GitLFS <https://git-lfs.com>`_).
+DataLad, largely thanks to "external special remotes" mechanism of git-annex, has more storage options (in addition, for example, :ref:`DropBox <sharethirdparty>`, `the Open Science Framework (OSF) <https://docs.datalad.org/projects/osf>`_, :ref:`Git LFS <gitlfs>`, :ref:`Figshare <figshare>`, :ref:`GIN <gin>`, or :ref:`RIA stores <riastore>`), but depending on selected storage provider, the procedure to add a sibling may differ.
+In addition, DataLad is able to store complete datasets (annexed data *and* Git repository) in certain services (e.g., OSF, GIN, GitHub if used with GitLFS, Dropbox, ...), enabling a clone from, for example, Google Drive, and while DVC can never keep data in Git repository hosting services, DataLad can do this if the hosting service supports hosting annexed data (default on :term:`Gin` and possible with :term:`GitHub`, :term:`GitLab` or :term:`BitBucket` if used with `GitLFS <https://git-lfs.com>`_).
 
 
 DVC workflow
@@ -876,7 +876,7 @@ Instead of creating a pipeline stage and giving it a name, we attach a meaningfu
 The results of this computation are automatically saved and associated with their inputs and command execution.
 This information isn't stored in a separate file, but in the Git history, and saved with the commit message we have attached to the :dlcmd:`run` command.
 
-To stay close to the DVC tutorial, we will also work with tags to identify analysis versions, but DataLad could also use a range of other identifiers, for example commit hashes, to identify this computation.
+To stay close to the DVC tutorial, we will also work with tags to identify analysis versions, but DataLad could also use a range of other identifiers (such as commit hashes) to identify this computation.
 As we at this point have set up our data and are ready for the analysis, we will name the first tag "ready-for-analysis".
 This can be done with :gitcmd:`tag`, but also with :dlcmd:`save`.
 
@@ -993,7 +993,7 @@ We could automatically compute this on a different branch if we wanted to by usi
    $ datalad rerun --branch="randomforest" -m "Recompute classification with random forest classifier" ready-for-analysis..SGD
 
 Done!
-The difference in accuracies between models could now for example be compared with a ``git diff``:
+The difference in accuracies between models could now, for example, be compared with a ``git diff``:
 
 
 .. runrecord:: _examples/DL-101-168-179

+ 4 - 4
docs/beyond_basics/101-170-dataladrun.rst

@@ -65,7 +65,7 @@ But: Multiple simultaneous ``datalad (containers-)run`` invocations in the same
 - A number of *concurrency issues*, unwanted interactions of processes when they run simultaneously, can arise and lead to internal command failures
 
 Some of these problems can be averted by invoking the ``(containers-)run`` command with the ``--explicit`` [#f1]_ flag.
-This doesn't solve all of the above problems, though, and may not be applicable to the computation at hand -- for example because all jobs write to a similar file or the result files are not known beforehand.
+This doesn't solve all of the above problems, though, and may not be applicable to the computation at hand -- for example, because all jobs write to a similar file or the result files are not known beforehand.
 Below, you can find a complete, largely platform- and scheduling-system-agnostic containerized analysis workflow that addresses the outlined problems.
 
 Processing FAIRly *and* in parallel -- General workflow
@@ -83,7 +83,7 @@ The analysis is carried out on a computational cluster that uses a job schedulin
 
 The "creative" bits involved in this parallelized processing workflow boil down to the following tricks:
 
-- Individual jobs (for example subject-specific analyses) are computed in **throw-away dataset clones** to avoid unwanted interactions between parallel jobs.
+- Individual jobs (for example, subject-specific analyses) are computed in **throw-away dataset clones** to avoid unwanted interactions between parallel jobs.
 - Beyond computing in job-specific, temporary locations, individual job results are also saved into uniquely identified :term:`branch`\es to enable simple **pushing back of the results** into the target dataset.
 - The jobs constitute a complete DataLad-centric workflow in the form of a simple **bash script**, including dataset build-up and tear-down routines in a throw-away location, result computation, and result publication back to the target dataset.
   Thus, instead of submitting a ``datalad run`` command to the job scheduler, **the job submission is a single script**, and this submission is easily adapted to various job scheduling call formats.
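
Stripped of scheduler specifics and concurrency safeguards (such as a lock around the final push), the core of such a job script might look like the sketch below; dataset locations, the ``code/runanalysis.sh`` helper, and the sibling name are hypothetical, and a ``containers-run`` call would slot into the same place as the plain ``run``:

   #!/bin/bash
   set -e -u
   dssource="$1"        # where to clone the analysis dataset from
   pushgitremote="$2"   # where to push result branches back to
   subid="$3"           # job-specific identifier, e.g. a subject ID

   # build-up: clone into a throw-away location
   tmpdir=$(mktemp -d)
   datalad clone "$dssource" "$tmpdir/ds"
   cd "$tmpdir/ds"
   git remote add outputstore "$pushgitremote"
   # results go onto a uniquely named branch
   git checkout -b "job-$subid"
   # compute; --explicit keeps unrelated modifications out of the result commit
   datalad run -m "Analysis for $subid" --explicit \
      --input "inputs/rawdata/$subid" --output "outputs/$subid" \
      "bash code/runanalysis.sh {inputs} {outputs}"
   # tear-down: publish the result branch, then remove the clone
   datalad push --to outputstore
   cd / && rm -rf "$tmpdir"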
@@ -124,7 +124,7 @@ To get an idea of the general setup of parallel provenance-tracked computations,
       install (ok: 1)
       save (ok: 1)
 
-... and a dataset with a containerized pipeline (for example from the `ReproNim container-collection <https://github.com/repronim/containers>`_ [#f2]_) as another subdataset:
+... and a dataset with a containerized pipeline (for example, from the `ReproNim container-collection <https://github.com/repronim/containers>`_ [#f2]_) as another subdataset:
 
 .. code-block::
 
@@ -319,7 +319,7 @@ You can save this script into your analysis dataset, e.g., as ``code/analysis_jo
 
 Job submission now only boils down to invoking the script for each participant with the relevant command line arguments (e.g., input files for our artificial example) and the necessary environment variables (e.g., the job ID that determines the branch name that is created, and one that points to a lockfile created beforehand once in ``.git``).
 Job schedulers such as HTCondor can typically do this with automatic variables.
-They for example have syntax that can identify subject IDs or consecutive file numbers from consistently named directory structure, access the job ID, loop through a predefined list of values or parameters, or use various forms of pattern matching.
+They, for example, have syntax that can identify subject IDs or consecutive file numbers from consistently named directory structure, access the job ID, loop through a predefined list of values or parameters, or use various forms of pattern matching.
 Examples of this are demonstrated `here <https://jugit.fz-juelich.de/inm7/training/htcondor/-/blob/master/03_define_jobs.md>`_.
 Thus, the submit file takes care of defining hundreds or thousands of variables, but can still be lean even though it queues up hundreds or thousands of jobs.
 Here is a submit file that could be employed:

+ 1 - 1
docs/intro/filenaming.rst

@@ -159,7 +159,7 @@ Prevent paths to be interpreted as command line arguments
 While it's not "illegal" to start a directory or file name with a hyphen (``-``), it's a bad idea, and doing so is disallowed by certain tools due to security risks.
 In theory, a file name starting with a hyphen can clash with a command line argument, and a tool called to operate on that file may then misinterpret it as an argument name.
 If you were to create a file called ``-n`` on a Unix system, an ``ls`` or ``cat`` on this file (unless you would add a ``./`` prefix to indicate a file in the current directory) would behave differently than expected, parametrizing the command line tool instead of displaying any file information.
-Because this can be a security hazard, for example leading to remote code execution, `Git will refuse to operate on submodules that start with a hyphen (CVE-2018-17456) <https://www.exploit-db.com/exploits/45631>`_.
+Because this can be a security hazard, leading to remote code execution for example, `Git will refuse to operate on submodules that start with a hyphen (CVE-2018-17456) <https://www.exploit-db.com/exploits/45631>`_.
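
A quick illustration of the problem and the workaround:

   $ touch ./-n      # creates a file whose name is literally "-n"
   $ cat -n          # oops: read as the "number output lines" option, not as a file
   $ cat ./-n        # prefixing the path works
   $ cat -- -n       # as does explicitly ending option parsing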
 
 Other hassles
 =============

+ 1 - 1
docs/intro/howto.rst

@@ -73,7 +73,7 @@ to list the size of a file in a *human-readable* format, supply the short option
    $ ls -l --human-readable output.txt
 
 Every command has many of those options (often called "flags") that modify their behavior.
-On Windows, options of native Windows commands can be preceded by a ``/`` instead of dashes, for example ``dir /p output.txt``.
+On Windows, options of native Windows commands can be preceded by a ``/`` instead of dashes, for example, ``dir /p output.txt``.
 There are too many to even consider memorizing. Remember the ones you use often,
 and the rest you will lookup in their documentation or via your favorite search engine.
 DataLad commands naturally also come with many options, and in the next chapters

+ 2 - 2
docs/intro/windows.rst

@@ -11,7 +11,7 @@ This makes the user experience less fun than on other operating systems -- an ho
 
 
 Many software tools for research or data science are first written and released for Linux, then for Mac, and eventually Windows.
-TensorFlow for Windows was `released only a full year after it became open source <https://developers.googleblog.com/2016/11/tensorflow-0-12-adds-support-for-windows.html>`_, for example, and Python only became easy to install on Windows in `2019 <https://devblogs.microsoft.com/python/python-in-the-windows-10-may-2019-update>`_.
+TensorFlow for Windows was `released only a full year after it became open source <https://developers.googleblog.com/2016/11/tensorflow-0-12-adds-support-for-windows.html>`_ for example, and Python only became easy to install on Windows in `2019 <https://devblogs.microsoft.com/python/python-in-the-windows-10-may-2019-update>`_.
 The same is true for DataLad and its underlying tools.
 There *is* Windows support and user documentation, but it isn't as far developed as for Unix-based systems.
 This page summarizes core downsides and deficiencies of Windows, DataLad on Windows, and the user documentation.
@@ -25,7 +25,7 @@ Beyond this, Windows uses a different file system than Unix based systems.
 Given that DataLad is a data management software, it is heavily affected by this, and the Basics part of the handbook is filled with "Windows-Wits", dedicated sections that highlight different behavior on native Windows installations of DataLad, or provide adjusted commands -- nevertheless, standard DataLad operations on Windows can be much slower than on other operating systems.
 
 A major annoyance and problem is that some tools that DataLad or :term:`datalad extension`\s use are not available on Windows.
-If you are interested in adding :term:`software container`\s to your DataLad dataset (with the ``datalad-container`` extension), for example, you will likely not be able to do so on a native Windows computer -- :term:`Singularity`, a widely used containerization software, doesn't exist for Windows, and while there *is* some support for :term:`Docker` on Windows, it does not apply to most private computers [#f1]_.
+If you are interested in adding :term:`software container`\s to your DataLad dataset (with the ``datalad-container`` extension) for example, you will likely not be able to do so on a native Windows computer -- :term:`Singularity`, a widely used containerization software, doesn't exist for Windows, and while there *is* some support for :term:`Docker` on Windows, it does not apply to most private computers [#f1]_.
 
 Windows also has insufficient support for :term:`symlink`\ing and locking files (i.e., revoking write :term:`permissions`), which alters how :term:`git-annex` works, and may make interoperability of datasets between Windows and non-Windows operating systems not as smooth as between various flavors of Unix-like operating systems.
 

+ 1 - 1
docs/usecases/ml-analysis.rst

@@ -537,7 +537,7 @@ This way, we have access to a trained random-forest model or a trained SGD model
    $ datalad rerun --branch="randomforest" -m "Recompute classification with random forest classifier" ready4analysis..SGD-100
 
 This updated the ``model.joblib`` file to a trained random forest classifier, and also updated ``accuracy.json`` with the current model's evaluation.
-The difference in accuracy between models could now for example be compared with a ``git diff`` of the contents of ``accuracy.json`` to the :term:`main` :term:`branch`:
+The difference in accuracy between models could now, for example, be compared with a ``git diff`` of the contents of ``accuracy.json`` to the :term:`main` :term:`branch`:
 
 .. runrecord:: _examples/ml-134
    :workdir: usecases/ml-project

+ 1 - 1
docs/usecases/reproducible-paper.rst

@@ -338,7 +338,7 @@ in the actual manuscript, if you want!). This was step number 1 of 4.
 
 .. find-out-more:: How about figures?
 
-   To include figures, the figures just need to be saved into a dedicated location (for example
+   To include figures, the figures just need to be saved into a dedicated location (for example,
    a directory ``img/``) and included into the ``.tex`` file with standard ``LaTeX`` syntax.
    Larger figures with subfigures can be created by combining several figures:
 

+ 1 - 1
docs/usecases/reproducible_neuroimaging_analysis.rst

@@ -419,7 +419,7 @@ Archive data and results
 """"""""""""""""""""""""
 
 After study completion it is important to properly archive data and results,
-for example for future inquiries by reviewers or readers of the associated
+for example, for future inquiries by reviewers or readers of the associated
 publication. Thanks to the modularity of the study units, this task is easy
 and avoids needless duplication.