Standardize `filesystem` -> `file system`

Closes #1104
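
For illustration only, a rename like this could be scripted roughly as follows. This is a minimal sketch, not the tooling used for this commit: the helper name `standardize` and the lookbehind rule are assumptions. The rule is chosen so that file names such as `101-136-filesystem` stay untouched while prose, labels, and `:ref:` targets are rewritten.

```python
import re

# Hypothetical helper, not taken from the commit itself.
# Skip matches directly preceded by "-" or "/", which in these docs
# marks file names and paths (e.g. "101-136-filesystem") rather than prose.
_PATTERN = re.compile(r"(?<![-/])filesystem")


def standardize(text: str) -> str:
    """Replace 'filesystem' with 'file system' outside of file names/paths."""
    return _PATTERN.sub("file system", text)
```

A one-off case such as `filesystem-based` becoming `file system based` would still need a manual edit, since this sketch leaves the trailing hyphen in place.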
Michael Hanke 6 months ago
parent
commit
7da68879bf

+ 5 - 5
docs/basics/101-115-symlinks.rst

@@ -95,7 +95,7 @@ tree is also known as the *annex* of a dataset.
    :float:
 
    Windows has insufficient support for :term:`symlink`\s and revoking write :term:`permissions` on files.
-   Therefore, :term:`git-annex` classifies it as a :term:`crippled filesystem` and has to stray from its default behavior.
+   Therefore, :term:`git-annex` classifies it as a :term:`crippled file system` and has to stray from its default behavior.
    While git-annex on Unix-based file operating systems stores data in the annex and creates a symlink in the data's original place, on Windows it moves data into the :term:`annex` and creates a *copy* of the data in its original place.
 
    **Why is that?**
@@ -208,7 +208,7 @@ Lastly, understanding that annexed files in your dataset are symlinked
 will be helpful to understand how common file system operations such as
 moving, renaming, or copying content translate to dataset modifications
 in certain situations. Later in this book we will have a section on how
-to manage the file system in a DataLad dataset (:ref:`filesystem`).
+to manage the file system in a DataLad dataset (:ref:`file system`).
 
 .. index::
    pair: key; git-annex concept
@@ -266,7 +266,7 @@ to manage the file system in a DataLad dataset (:ref:`filesystem`).
    consisting of two letters each.
    These two letters are derived from the md5sum of the key, and their sole purpose to exist is to avoid issues with too many files in one directory (which is a situation that certain file systems have problems with).
    The next subdirectory in the symlink helps to prevent accidental deletions and changes, as it does not have write :term:`permissions`, so that users cannot modify any of its underlying contents.
-   This is the reason that annexed files need to be unlocked prior to modifications, and this information will be helpful to understand some file system management operations such as removing files or datasets (see section :ref:`filesystem`).
+   This is the reason that annexed files need to be unlocked prior to modifications, and this information will be helpful to understand some file system management operations such as removing files or datasets (see section :ref:`file system`).
 
    The next part of the symlink contains the actual hash.
    There are different hash functions available.
@@ -291,7 +291,7 @@ Broken symlinks
 
 Whenever a symlink points to a non-existent target, this symlink is called
 *broken*, and opening the symlink would not work as it does not resolve. The
-section :ref:`filesystem` will give a thorough demonstration of how symlinks can
+section :ref:`file system` will give a thorough demonstration of how symlinks can
 break, and how one can fix them again. Even though *broken* sounds
 troublesome, most types of broken symlinks you will encounter can be fixed,
 or are not problematic. At this point, you actually have already seen broken
@@ -340,7 +340,7 @@ If so, please take a look into the Windows Wit below.
    pair: log; Git command
 .. windows-wit:: Accessing symlinked files from your Windows system
 
-   If you are using WSL2 you have access to a Linux kernel and POSIX filesystem, including symlink support.
+   If you are using WSL2 you have access to a Linux kernel and POSIX file system, including symlink support.
    Your DataLad experience has therefore been exactly as it has been for macOS or Linux users.
    But one thing that bears the need for additional information is sharing files in dataset between your Linux and Windows system.
 

+ 1 - 1
docs/basics/101-122-config.rst

@@ -290,7 +290,7 @@ remaining sections in that file, and the :ref:`that dissects this config file fu
    The value to the ``url`` variable is a *path*. If at any point
    either your superdataset or the remote moves on your file system,
    the association between the two datasets breaks -- this can be fixed by adjusting this
-   path, and a demonstration of this is in section :ref:`filesystem`.
+   path, and a demonstration of this is in section :ref:`file system`.
    `fetch` contains a specification which parts of the repository are
    updated -- in this case everything (all of the branches).
    Lastly, the ``annex-ignore = false`` configuration allows git-annex

+ 1 - 1
docs/basics/101-132-advancednesting.rst

@@ -75,7 +75,7 @@ interested in this, checkout the :ref:`dedicated Findoutmore <fom-status>`.
      that is properly registered in the superdataset
 
    And you have seen the following *content states*: ``modified`` and ``untracked``.
-   The section :ref:`filesystem` will show you many instances of ``deleted`` content
+   The section :ref:`file system` will show you many instances of ``deleted`` content
    state as well.
 
    But beyond understanding the report of :dlcmd:`status`, there is also

+ 1 - 1
docs/basics/101-133-containersrun.rst

@@ -83,7 +83,7 @@ Both of these tools share core terminology:
    It is made by a human user.
 
 **Image**
-   This is *built* from the recipe file. It is a static filesystem inside a file,
+   This is *built* from the recipe file. It is a static file system inside a file,
    populated with the software specified in the recipe, and some initial configuration.
 
 **Container**

+ 1 - 1
docs/basics/101-136-filesystem.rst

@@ -1,4 +1,4 @@
-.. _filesystem:
+.. _file system:
 
 Miscellaneous file system operations
 ------------------------------------

+ 1 - 1
docs/basics/101-141-push.rst

@@ -182,6 +182,6 @@ For more information on this, and other error messages during push, please check
 
 .. rubric:: Footnotes
 
-.. [#f1]  RIA siblings are filesystem-based, scalable storage solutions for
+.. [#f1]  RIA siblings are file system based, scalable storage solutions for
           DataLad datasets. You can find out more about them in the online version.
 .. [#f2] For information on the ``numcopies`` and ``wanted`` settings of git-annex see its documentation at `git-annex.branchable.com/git-annex-wanted/ <https://git-annex.branchable.com/git-annex-wanted>`_ and `git-annex.branchable.com/git-annex-numcopies/ <https://git-annex.branchable.com/git-annex-numcopies>`_.

+ 1 - 1
docs/basics/basics-help.rst

@@ -8,7 +8,7 @@ Help yourself
 
 .. toctree::
    :maxdepth: 1
-   :caption: Dealing with problems, filesystems, and version histories
+   :caption: Dealing with problems, file systems, and version histories
 
    101-135-intro
    101-136-filesystem

+ 2 - 2
docs/beyond_basics/101-164-dataladdening.rst

@@ -23,7 +23,7 @@ If you're really pressed for time because your dog is sick, your toddler keeps e
 
    To gain a good understanding of some important parts of DataLad, please read chapter :ref:`chapter_datasets`, :ref:`chapter_run`, and :ref:`chapter_gitannex` (reading time: 60 minutes).
 
-   To become confident in using DataLad, sections :ref:`help`, :ref:`filesystem` can be very useful. Depending on your aim, :ref:`chapter_collaboration` (for collaborative workflows), :ref:`chapter_thirdparty` (for data sharing), or :ref:`chapter_yoda` (for data analysis) may contain the relevant background for you.
+   To become confident in using DataLad, sections :ref:`help`, :ref:`file system` can be very useful. Depending on your aim, :ref:`chapter_collaboration` (for collaborative workflows), :ref:`chapter_thirdparty` (for data sharing), or :ref:`chapter_yoda` (for data analysis) may contain the relevant background for you.
 
 Prior to transforming your project, regardless of how advanced of a user you are, **we recommend to create a copy of it**.
 We don't believe there is much that can go wrong from the software-side of things, but data is precious and backups a necessity, so better be safe than sorry.
@@ -132,7 +132,7 @@ Summary
 
 Existing projects and analysis can be DataLad-ified with a few standard commands.
 Be mindful about dataset sizes and whether you save contents into Git or git-annex, though, as these choices could potentially spoil your DataLad experience.
-The sections :ref:`filesystem` and :ref:`cleanup` can help you to undo unwanted changes, but it's better to do things right instead of having to fix them up.
+The sections :ref:`file system` and :ref:`cleanup` can help you to undo unwanted changes, but it's better to do things right instead of having to fix them up.
 If you can, read up on the DataLad Basics to understand what you are doing, and create a backup in case things go not as planned in your first attempts.
 
 .. rubric:: Footnotes

+ 1 - 1
docs/beyond_basics/101-170-dataladrun.rst

@@ -174,7 +174,7 @@ The solution is as easy as it is stubborn: We simply create one throw-away datas
 .. find-out-more:: how does one create throw-away clones?
 
     One way to do this are :term:`ephemeral clone`\s, an alternative is to make :term:`git-annex` disregard the datasets annex completely using ``git annex dead here``.
-    The latter is more appropriate for this context -- we could use an ephemeral clone, but that might deposit data of failed jobs at the origin location, if the job runs on a shared filesystem.
+    The latter is more appropriate for this context -- we could use an ephemeral clone, but that might deposit data of failed jobs at the origin location, if the job runs on a shared file system.
 
 Using throw-away clones involves a build-up, result-push, and tear-down routine for each job.
 It sounds complex and tedious, but this actually works well since datasets are by nature made for such decentralized, collaborative workflows.

+ 3 - 3
docs/beyond_basics/101-171-enki.rst

@@ -145,7 +145,7 @@ This initial sketch serves to highlight key differences and adjustments due to t
    :emphasize-lines: 10, 13, 19-20, 24, 43-44
 
    # everything is running under /tmp inside a compute job,
-   # /tmp is job-specific local filesystem not shared between jobs
+   # /tmp is job-specific local file system not shared between jobs
    $ cd /tmp
 
    # clone the superdataset with locking
@@ -224,7 +224,7 @@ At this point, the workflow misses a tweak that is necessary in fMRIprep to enab
       subid=$(basename $1)
 
       # this is all running under /tmp inside a compute job, /tmp is a performant
-      # local filesystem
+      # local file system
       cd /tmp
       # get the output dataset, which includes the inputs as well
       # flock makes sure that this does not interfere with another job
@@ -247,7 +247,7 @@ At this point, the workflow misses a tweak that is necessary in fMRIprep to enab
       # let git-annex know that we do not want to remember any of these clones
       # (we could have used an --ephemeral clone, but that might deposite data
       # of failed jobs at the origin location, if the job runs on a shared
-      # filesystem -- let's stay self-contained)
+      # file system -- let's stay self-contained)
       git submodule foreach --recursive git annex dead here
 
    .. code-block:: bash

+ 1 - 1
docs/beyond_basics/101-183-gooey.rst

@@ -26,7 +26,7 @@ Speed
 """""
 
 ``datalad-gooey`` has internal helpers for faster annotations (whether a file is annexed, committed to Git, modified, or untracked) and file tree overviews.
-This makes it more convenient especially on filesystems that are generally slower for Git operations, such as Window's NTFS.
+This makes it more convenient especially on file systems that are generally slower for Git operations, such as Window's NTFS.
 
 Credential Management
 """""""""""""""""""""

+ 1 - 1
docs/code_from_chapters/DLBasicsMPI.rst

@@ -721,7 +721,7 @@ Afterwards, :dlcmd:`status` reports the file to be deleted::
    datalad status
 
 (Side-note: While the file is deleted in the most recent dataset state, it can be brought back to life as it still exists in the datasets history.
-You can find out more about this and also how to remove also past copies of a file in the section :ref:`filesystem`)
+You can find out more about this and also how to remove also past copies of a file in the section :ref:`file system`)
 
 The deletion of a file must be saved::
 

+ 1 - 1
docs/code_from_chapters/usecase_ml_code.rst

@@ -707,7 +707,7 @@ Afterwards, :dlcmd:`status` reports the file to be deleted::
    datalad status
 
 (Side-note: While the file is deleted in the most recent dataset state, it can be brought back to life as it still exists in the datasets history.
-You can find out more about this and also how to remove also past copies of a file in the section :ref:`filesystem`)
+You can find out more about this and also how to remove also past copies of a file in the section :ref:`file system`)
 
 The deletion of a file must be saved::
 

+ 4 - 4
docs/glossary.rst

@@ -22,7 +22,7 @@ Glossary
       The adjusted branch is called "adjusted/<branchname>(unlocked)" and on an the adjusted branch", all files handled by :term:`git-annex` are not locked --
       They will stay "unlocked" and thus modifiable.
       Instead of referencing data in the :term:`annex` with a :term:`symlink`, unlocked files need to be copies of the data in the annex.
-      Adjusted branches primarily exist as the default branch on so-called :term:`crippled filesystem`\s such as Windows.
+      Adjusted branches primarily exist as the default branch on so-called :term:`crippled file system`\s such as Windows.
 
    annex
       .. index::
@@ -119,11 +119,11 @@ Glossary
       .. index:: ! Container concept; image
 
       Container images are *built* from :term:`container recipe` files.
-      They are a static filesystem inside a file, populated with the software specified in the recipe, and some initial configuration.
+      They are a static file system inside a file, populated with the software specified in the recipe, and some initial configuration.
 
-   crippled filesystem
+   crippled file system
       .. index::
-         pair: crippled filesystem; git-annex concept
+         pair: crippled file system; git-annex concept
 
       git-annex concept: A file system that does not allow making symlinks or removing write :term:`permissions` from files. Examples for this are `FAT <https://en.wikipedia.org/wiki/Design_of_the_FAT_file_system>`_ (likely used by your USB sticks) or `NTFS <https://en.wikipedia.org/wiki/NTFS>`_ (used on Windows systems of the last three decades).
 

+ 3 - 3
docs/intro/filenaming.rst

@@ -65,12 +65,12 @@ The others may have an extra bit of fun in their lives when software can not han
 Even though certain names look identical across file systems or operating systems, their underlying unicode character sequences can differ.
 For example, the character "é" can be represented as the single Unicode character u+00E9 (latin small letter e with acute), or as the two Unicode characters u+0065 and u+0301 (the letter "e" plus a combining acute symbol).
 This is called `canonical equivalence <https://en.wikipedia.org/wiki/Unicode_equivalence>`_ and can be  confusing: While file names are visually indistinguishable, certain tools, operating systems, or file systems can normalize their underlying unicode differently and cause errors in the process.
-It becomes a problem, potentially even leading to permanent data loss, when `one tool or filesystem won't recognize a file anymore that has been normalized by a different tool or filesystem <https://web.archive.org/web/20100109162824/http://forums.macosxhints.com/archive/index.php/t-99344.html>`_.
+It becomes a problem, potentially even leading to permanent data loss, when `one tool or file system won't recognize a file anymore that has been normalized by a different tool or file system <https://web.archive.org/web/20100109162824/http://forums.macosxhints.com/archive/index.php/t-99344.html>`_.
 
-Apple's HFS Plus filesystem always normalizes file names to a `fully decomposed form <https://developer.apple.com/library/archive/technotes/tn/tn1150.html#UnicodeSubtleties>`_.
+Apple's HFS Plus file system always normalizes file names to a `fully decomposed form <https://developer.apple.com/library/archive/technotes/tn/tn1150.html#UnicodeSubtleties>`_.
 "é" would be represented as two Unicode characters u+0065 and u+0301, in that order.
 Windows treats filenames as opaque character sequences and will store and return the encoded bytes exactly as provided.
-Linux and other common Unix systems are generally similar to Windows in storing and returning opaque byte streams, but this behavior is technically dependent on the filesystem.
+Linux and other common Unix systems are generally similar to Windows in storing and returning opaque byte streams, but this behavior is technically dependent on the file system.
 And utilities used for file management, transfer, and archiving may ignore this issue, apply an arbitrary normalization form, or allow the user to control how normalization is applied.
 Having special characters in your file names thus is a bit like a data management version of russian roulette.
 Most things will likely be fine, but at some point, with some tool, sharing to some system, things could just blow up.

+ 1 - 1
docs/intro/installation.rst

@@ -214,7 +214,7 @@ One attractive alternative approach is Conda_, a completely different approach i
    for Linux.
 
 Using DataLad on Windows has a few peculiarities. In general, DataLad can feel a bit
-sluggish on non-WSL2 Windows systems. This is due to various filesystem issues
+sluggish on non-WSL2 Windows systems. This is due to various file system issues
 that also affect the version control system :term:`Git` itself, which DataLad
 relies on. The core functionality of DataLad works, and you should be able to
 follow most contents covered in this book.  You will notice, however, that some

+ 1 - 1
docs/intro/user_types.rst

@@ -43,7 +43,7 @@ afterwards.
 
 The section :ref:`help` may give you a good general overview on what to do if
 you encountered a problem. If you're dealing with file system operations,
-:ref:`filesystem` could be a resource to help you, and for all things configuration,
+:ref:`file system` could be a resource to help you, and for all things configuration,
 the chapter :ref:`chapter_config` is your place to go to. If you are confused by
 symlinks or "permission denied" error in your dataset, checkout section
 :ref:`symlink` for some Basics on :term:`git-annex`. The "Quick search" bar at

+ 2 - 2
docs/usecases/datasets.rst

@@ -25,7 +25,7 @@ a hands-on experience.
 
       $ datalad install https://github.com/psychoinformatics-de/studyforrest-data-phase2.git
 
-Once installed, a DataLad dataset looks like any other directory on your filesystem:
+Once installed, a DataLad dataset looks like any other directory on your file system:
 
 .. runrecord:: _examples/dataset2
    :language: console
@@ -122,7 +122,7 @@ Dataset Nesting
 
 Within DataLad datasets one can *nest* other DataLad
 datasets arbitrarily deep. This does not seem particularly spectacular -
-after all, any directory on a filesystem can have other directories inside it.
+after all, any directory on a file system can have other directories inside it.
 The possibility for nested Datasets, however, is one of many advantages
 DataLad datasets have:
 Any lower-level DataLad dataset (the *subdataset*) has a stand-alone

+ 1 - 1
docs/usecases/datastorage_for_institutions.rst

@@ -85,7 +85,7 @@ scientific computing (infrastructure).
 The RIA store is configured as a git-annex ORA-remote ("optional remote archive")
 special remote for access to annexed keys in the store and so that full
 datasets can be (compressed) 7-zip archives.
-The latter is especially useful in case of filesystem inode
+The latter is especially useful in case of file system inode
 limitations, such as on HPC storage systems: Regardless of a dataset's number of
 files and size, (compressed) 7zipped datasets use only few inodes, but retain the
 ability to query available files.

+ 1 - 1
docs/usecases/encrypted_annex.rst

@@ -279,7 +279,7 @@ But to learn about new files that were added in the remote server since we last
 
 Let's add it then (note that when working with ``datalad
 siblings`` or ``git remote`` commands, we cannot use the
-``ria+ssh://...#~alias`` URL, and need to use the actual SSH URL and filesystem path).
+``ria+ssh://...#~alias`` URL, and need to use the actual SSH URL and file system path).
 
 .. code-block:: bash
 

+ 1 - 1
docs/usecases/openneuro.rst

@@ -86,7 +86,7 @@ While DataLad datasets -- in our opinion -- have many advantages, it may be good
   Here's what you should do if you want to copy or move a file out of a dataset into a non-dataset location: Make sure that the file content is present (:dlcmd:`get`), and copy or move the file with a tool that can *dereference* (i.e., resolve to canonical paths) :term:`symlink`\s.
   The command line tool ``cp`` for copying can do this with the ``-L/--dereference`` option, for example, any command can do it if the file path is wrapped in a ``readlink -f <path>`` command.
   Alternatively, run :dlcmd:`unlock` prior to moving with any tool of your choice.
-  See also the FAQ on :ref:`Getting data out of datasets <copydata>` or the section :ref:`filesystem`.
+  See also the FAQ on :ref:`Getting data out of datasets <copydata>` or the section :ref:`file system`.
 
 * **Don't force-overwrite files**: Many files in datasets are *annexed* for version control and, by default (on any non-Windows operating system), write-protected to ensure file integrity.
   If you encounter a file that will not let you change it right away and responds, for example, with a "permission denied" error, it is important to not forcefully modify this data.

+ 1 - 1
docs/usecases/provenance_tracking.rst

@@ -228,4 +228,4 @@ again and remembers this small project fondly.
 .. [#f2] If you want to learn more about :dlcmd:`run`, read on from
          section :ref:`run`.
 .. [#f3] Find out more about working with the history of a dataset with Git in
-         section :ref:`filesystem`
+         section :ref:`file system`

+ 1 - 1
docs/usecases/reproducible-paper.rst

@@ -162,7 +162,7 @@ and recompute all results
 when running the script after cloning and setting up the necessary software.
 This requires minor preparation:
 
-* The final analysis should be able to run on anyone's filesystem.
+* The final analysis should be able to run on anyone's file system.
   It is therefore important to reference datafiles with the scripts in ``code/`` as
   :term:`relative path`\s instead of hard-coding :term:`absolute path`\s.
 

+ 1 - 1
docs/usecases/using_globus_as_datastore.rst

@@ -91,7 +91,7 @@ Users log into the CONP portal and install Datalad datasets with
 ``datalad install -r <dataset>``. This gives them access to the annexed files
 (as mentioned in the findoutmore above, large files replaced by their symlinks).
 To request the content of the annexed files, they simply download those files
-locally in their filesystem using ``datalad get path/to/file``. So simple!
+locally in their file system using ``datalad get path/to/file``. So simple!
 
 On a technical level, under the hood, :term:`git-annex` needs to have a connection
 established with the primary data source, the :term:`special remote`, that hosts