- .. _history:
- Back and forth in time
- ----------------------
- Almost everyone has inadvertently deleted or overwritten files at some point with
- a hasty operation that caused data loss, or at least made it troublesome to
- re-obtain or restore data.
- With DataLad, no mistakes are forever: One powerful feature of datasets
- is the ability to revert data to a previous state and thus view earlier content or
- correct mistakes. As long as the content was version controlled (i.e., tracked),
- it is possible to look at previous states of the data, or revert changes --
- even years after they happened -- thanks to the underlying version control
- system :term:`Git`.
- .. figure:: ../artwork/src/versioncontrol.svg
- :width: 70%
- To get a glimpse into how to work with the history of a dataset, today's lecture
- has an external Git-expert as a guest lecturer.
- "I do not have enough time to go through all the details in only
- one lecture. But I'll give you the basics, and an idea of what is possible.
- Always remember: Just google what you need. You will find thousands of helpful tutorials
- or questions on `Stack Overflow <https://stackoverflow.com>`_ right away.
- Even experts will *constantly* seek help to find out which Git command to
- use, and how to use it.", he reassures with a wink.
- The basis of working with the history is to *look at it* with tools such
- as :term:`tig`, :term:`gitk`, or simply the :gitcmd:`log` command.
- The most important information in an entry (commit) in the history is
- the :term:`shasum` (or hash) associated with it.
- This hash is how dataset modifications in the history are identified,
- and with this hash you can communicate with DataLad or :term:`Git` about these
- modifications or version states [#f1]_.
- Here is an excerpt from the ``DataLad-101`` history to show a
- few abbreviated hashes of the 15 most recent commits [#f2]_:
- .. runrecord:: _examples/DL-101-137-101
- :workdir: dl-101/DataLad-101
- :language: console
- $ git log -15 --oneline
- "I'll let you people direct this lecture", the guest lecturer proposes.
- "You tell me what you would be interested in doing, and I'll show you how it's
- done. For the rest of the lecture, call me Google!"
- Fixing (empty) commit messages
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- From the back of the lecture hall comes a question you are really glad
- someone asked: "It has happened to me that I accidentally did a
- :dlcmd:`save` and forgot to specify the commit message,
- how can I fix this?".
- The room nods in agreement -- apparently, others have run into this
- premature slip of the ``Enter`` key as well.
- Let's demonstrate a simple example. First, let's create some random files.
- Do this right in your dataset.
- .. runrecord:: _examples/DL-101-137-102
- :language: console
- :workdir: dl-101/DataLad-101
- $ cat << EOT > Gitjoke1.txt
- Git knows what you did last summer!
- EOT
- $ cat << EOT > Gitjoke2.txt
- Knock knock. Who's there? Git.
- Git-who?
- Sorry, 'who' is not a git command - did you mean 'show'?
- EOT
- $ cat << EOT > Gitjoke3.txt
- In Soviet Russia, git commits YOU!
- EOT
- This will generate three new files in your dataset. Run a
- :dlcmd:`status` to verify this:
- .. runrecord:: _examples/DL-101-137-103
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad status
- And now:
- .. runrecord:: _examples/DL-101-137-104
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad save
- Whooops! A :dlcmd:`save` without a
- commit message that saved all of the files.
- .. runrecord:: _examples/DL-101-137-105
- :language: console
- :workdir: dl-101/DataLad-101
- :emphasize-lines: 6
- $ git log -p -1
- As expected, all of the modifications present prior to the
- command are saved into the most recent commit, and the commit
- message DataLad provides by default, ``[DATALAD] Recorded changes``,
- is not very helpful.
- Changing the commit message of the most recent commit can be done with
- the command :gitcmd:`commit --amend`. Running this command will open
- an editor (the default, as configured in Git), and allow you
- to change the commit message. Make sure to read the :ref:`find-out-more on changing the messages of commits other than the most recent one <fom-rebase1>` in case you want to improve the commit message of more commits than only the latest.
- Try running the :gitcmd:`commit --amend` command right now and give
- the commit a new commit message (you can just delete the one created by
- DataLad in the editor)!
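As a minimal sketch of the same fix without an editor, the snippet below amends a commit message in a throwaway plain-Git repository. The temporary directory, the user identity, and the new commit message are assumptions made for illustration; ``-m`` supplies the message directly so that no editor opens.

```shell
set -eu

# Throwaway repository so that no real dataset is touched (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# A commit with an unhelpful default message
echo "Git knows what you did last summer!" > Gitjoke1.txt
git add Gitjoke1.txt
git commit -q -m "[DATALAD] Recorded changes"

# Replace the message of the most recent commit
git commit -q --amend -m "Add a first Git joke"

# Show the subject line of the most recent commit
git log -1 --format=%s
```

The amended commit replaces the previous one, so its hash changes while the number of commits in the history stays the same.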
- .. index::
- pair: save --amend; DataLad command
- pair: add changes to previous commit; with DataLad
- pair: change the last commit message; with DataLad
- .. gitusernote:: 'git commit --amend' versus 'datalad save --amend'
- Similar to ``git commit``, ``datalad save`` also has an ``--amend`` option.
- Like its Git equivalent, it can be used to record changes not in a new, separate commit, but integrate them with the previously saved state.
- Though this has not been the use case for ``git commit --amend`` here, experienced Git users will be accustomed to using ``git commit --amend`` to achieve something similar in their Git workflows.
- In contrast to ``git commit --amend``, ``datalad save --amend`` will not open up an interactive editor to potentially change a commit message (unless the configuration ``datalad.save.no-message`` is set to ``interactive``), but a new commit message can be supplied with the ``-m``/``--message`` option.
- .. index::
- pair: change historical commit messages; with Git
- pair: rebase; Git command
- pair: rewrite history; with Git
- .. find-out-more:: Changing the commit messages of not-the-most-recent commits
- :name: fom-rebase1
- :float:
- The :gitcmd:`commit --amend` command will let you
- rewrite the commit message of the most recent commit. If you
- however need to rewrite commit messages of older commits, you
- can do so during a so-called "interactive rebase". The command
- for this is
- .. code-block:: console
- $ git rebase -i HEAD~N
- where ``N`` specifies how far back you want to rewrite commits.
- ``git rebase -i HEAD~3``, for example, lets you apply changes to
- any number of commit messages within the last three commits.
- Be aware that an interactive rebase lets you *rewrite* history.
- This can lead to confusion or worse if the history you are rewriting
- is shared with others, e.g., in a collaborative project. Be also aware
- that rewriting history that is *pushed*/*published* (e.g., to GitHub)
- will require a force-push!
- Running this command gives you a list of the N most recent commits
- in your text editor (which may be :term:`vim`!), sorted with
- the most recent commit on the bottom.
- This is what it may look like:
- .. code-block:: bash
- pick 8503f26 Add note on adding siblings
- pick 23f0a52 add note on configurations and git config
- pick c42cba4 add note on DataLad's procedures
- # Rebase b259ce8..c42cba4 onto b259ce8 (3 commands)
- #
- # Commands:
- # p, pick <commit> = use commit
- # r, reword <commit> = use commit, but edit the commit message
- # e, edit <commit> = use commit, but stop for amending
- # s, squash <commit> = use commit, but meld into previous commit
- # f, fixup <commit> = like "squash", but discard this commit's log message
- # x, exec <command> = run command (the rest of the line) using shell
- # b, break = stop here (continue rebase later with 'git rebase --continue')
- # d, drop <commit> = remove commit
- # l, label <label> = label current HEAD with a name
- An interactive rebase allows you to apply various modifying actions to any
- number of commits in the list. Below the list are descriptions of these
- different actions. Among them is "reword", which lets you "edit the commit
- message". To apply this action and reword the top-most commit message in this list
- (``8503f26 Add note on adding siblings``, three commits back in the history),
- exchange the word ``pick`` in the beginning of the line with the word
- ``reword`` or simply ``r`` like this:
- .. code-block:: bash
- r 8503f26 Add note on adding siblings
- If you want to reword more than one commit message, exchange several
- ``pick``\s. Any commit with the word ``pick`` at the beginning of the line will
- be kept as is. Once you are done, save and close the editor. This will
- sequentially open up a new editor for each commit you want to reword. In
- it, you will be able to change the commit message. Save to proceed to
- the next commit message until the rebase is complete.
- But be careful not to delete any lines in the above editor view --
- **An interactive rebase can be dangerous, and if you remove a line, this commit will be lost!**
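To try a "reword" safely, the following sketch performs one in a throwaway plain-Git repository and scripts both editor steps so that it runs unattended. Everything in it is an assumption for illustration: the temporary directories, the commit messages, and the small ``reword.sh`` helper that stands in for your text editor. ``GIT_SEQUENCE_EDITOR`` and ``GIT_EDITOR`` are environment variables Git consults for editing the rebase list and the commit message, respectively.

```shell
set -eu

# Throwaway repository plus a scratch directory for the helper script
repo=$(mktemp -d)
tools=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# Four commits, so that HEAD~3 exists
for n in 1 2 3 4; do
    echo "note $n" >> notes.txt
    git add notes.txt
    git commit -q -m "add note $n"
done

# Hypothetical stand-in for a text editor: it overwrites the commit
# message file that Git passes as the first argument.
cat > "$tools/reword.sh" << 'EOT'
#!/bin/sh
echo "Add the second note" > "$1"
EOT

# Change the first (oldest) "pick" in the list to "reword", then let the
# stand-in editor supply the new message.
GIT_SEQUENCE_EDITOR="sed -i.bak '1s/^pick/reword/'" \
GIT_EDITOR="sh $tools/reword.sh" \
git rebase -q -i HEAD~3

# List the subject lines, most recent commit first
git log --format=%s
```

In everyday use you would edit the list and the message by hand; the scripted editors only make the sketch repeatable. Note that the reworded commit and all commits after it receive new hashes.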
- .. index::
- pair: stop content tracking; with Git
- Untracking accidentally saved contents (tracked in Git)
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- The next question comes from the front:
- "It happened that I forgot to give a path to the :dlcmd:`save`
- command when I wanted to only start tracking a very specific file.
- Other times I just didn't remember that
- additional, untracked files existed in the dataset and saved unaware of
- those. I know that it is good practice to only save
- those changes together that belong together, so is there a way to
- disentangle an accidental :dlcmd:`save` again?"
- Let's say instead of saving *all three* previously untracked Git jokes
- you intended to save *only one* of those files. What we
- want to achieve is to keep all of the files and their contents
- in the dataset, but get them out of the history into an
- *untracked* state again, and save them *individually* afterwards.
- .. importantnote:: Untracking is different for Git versus git-annex!
- Note that this is a case with *text files* (stored in Git)! For
- accidental annexing of files, please make sure to check out
- the next paragraph!
- This is a task for the :gitcmd:`reset` command. It essentially allows you to
- undo commits by resetting the history of a dataset to an earlier version.
- :gitcmd:`reset` comes with several *modes* that determine its
- exact behavior, but the relevant one for this aim is ``--mixed`` [#f3]_.
- Specifying the command:
- .. code-block:: console
- $ git reset --mixed COMMIT
- will preserve all changes made to files since the specified
- commit in the dataset but remove them from the dataset's history.
- This means all commits *since* ``COMMIT`` (but *not including* ``COMMIT``)
- will not be in your history anymore and become "untracked files" or
- "unsaved changes" instead. In other words, the modifications
- you made in these commits that are "undone" will still be present
- in your dataset -- just not written to the history anymore. Let's
- try this to get a feel for it.
- The COMMIT in the command can either be a hash or a reference
- with the HEAD pointer.
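The effect of ``--mixed`` can be sketched in a throwaway plain-Git repository. The temporary directory, file names, and commit messages below are assumptions for illustration, and ``HEAD~1`` stands in for ``COMMIT``.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "some notes" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# The "accidental" save: three files end up in a single commit
for n in 1 2 3; do
    echo "joke $n" > "Gitjoke$n.txt"
done
git add .
git commit -q -m "save everything by accident"

# Reset to the commit before the accident
git reset -q --mixed HEAD~1

# The accidental commit is gone from the history ...
git log --oneline
# ... but the files are still on disk, reported as untracked ("??")
git status --short
```

Because the files stay in place, they can afterwards be saved one at a time with more meaningful commit messages.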
- .. index::
- pair: branch; Git concept
- pair: HEAD; Git concept
- .. find-out-more:: Git terminology: branches and HEADs?
- A Git repository (and thus any DataLad dataset) is built up as a tree of
- commits. A *branch* is a named pointer (reference) to a commit, and allows you
- to isolate developments. The default branch is called ``main``. ``HEAD`` is
- a pointer to the branch you are currently on, and thus to the last commit
- in the given branch.
- .. image:: ../artwork/src/git_branch_HEAD.png
- :width: 50%
- Using ``HEAD``, you can identify the most recent commit, or count backwards
- starting from the most recent commit. ``HEAD~1`` is the ancestor of the most
- recent commit, i.e., one commit back (``f30ab`` in the figure above). Apart from
- the notation ``HEAD~N``, there is also ``HEAD^N``, which does not count backwards
- but selects the N-th parent of a commit. It is `less frequently used and of
- importance primarily in the case of merge commits <https://stackoverflow.com/q/2221658/10068927>`__.
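A small sketch can make the ``HEAD~N`` notation concrete. It builds a three-commit history in a throwaway plain-Git repository; the temporary directory and the commit messages ("first", "second", "third") are assumptions for illustration.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

for msg in first second third; do
    echo "$msg" >> notes.txt
    git add notes.txt
    git commit -q -m "$msg"
done

# Subject lines of the most recent commit, its parent, and its grandparent
git log -1 --format=%s HEAD
git log -1 --format=%s HEAD~1
git log -1 --format=%s HEAD~2

# Translate a HEAD-relative reference into the full commit hash
git rev-parse HEAD~1
```

``git rev-parse`` is a convenient way to look up the hash behind a relative reference before passing it to a command that changes the history.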
- Let's stay with the hash, and reset to the commit prior to saving the Git jokes.
- First, find out the shasum, and afterwards, reset to it.
- .. runrecord:: _examples/DL-101-137-106
- :language: console
- :workdir: dl-101/DataLad-101
- $ git log -n 3 --oneline
- .. runrecord:: _examples/DL-101-137-107
- :language: console
- :workdir: dl-101/DataLad-101
- :realcommand: echo "$ git reset --mixed $(git rev-parse HEAD~1)" && git reset --mixed $(git rev-parse HEAD~1)
- Let's see what has happened. First, let's check the history:
- .. runrecord:: _examples/DL-101-137-108
- :language: console
- :workdir: dl-101/DataLad-101
- $ git log -n 2 --oneline
- As you can see, the commit in which the jokes were tracked
- is not in the history anymore! Go on to see what :dlcmd:`status`
- reports:
- .. runrecord:: _examples/DL-101-137-109
- :workdir: dl-101/DataLad-101
- :language: console
- $ datalad status
- Nice, the files are present, and untracked again. Do they still contain
- their content? We will read all of them with :shcmd:`cat`:
- .. runrecord:: _examples/DL-101-137-110
- :workdir: dl-101/DataLad-101
- :language: console
- $ cat Gitjoke*
- Great. Now we can go ahead and save only the file we intended
- to track:
- .. runrecord:: _examples/DL-101-137-111
- :workdir: dl-101/DataLad-101
- :language: console
- $ datalad save -m "save my favorite Git joke" Gitjoke2.txt
- Finally, let's check how the history looks afterwards:
- .. runrecord:: _examples/DL-101-137-112
- :workdir: dl-101/DataLad-101
- :language: console
- $ git log -2
- Wow! You have rewritten history [#f4]_!
- .. index::
- pair: stop content tracking; with git-annex
- Untracking accidentally saved contents (stored in git-annex)
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- The previous :gitcmd:`reset` undid the tracking of *text* files.
- Those files are stored in Git, and thus their content
- is also stored in Git. Annexed files, however, have
- their content stored in git-annex: what is stored in the history is
- not the file itself, but a symlink pointing to the location of the file
- content in the dataset's annex. This has consequences for
- a :gitcmd:`reset` command: Reverting a save of a file that is
- annexed would revert the save of the symlink into Git, but it would
- not revert the *annexing* of the file.
- Thus, what would be left in the dataset is an untracked symlink.
- To undo an accidental save that annexed a file, the annexed file
- has to be "unlocked" first with a :dlcmd:`unlock` command.
- We will simulate such a situation by creating a PDF file that
- gets annexed with an accidental :dlcmd:`save`:
- .. runrecord:: _examples/DL-101-137-113
- :language: console
- :workdir: dl-101/DataLad-101
- $ # create an empty pdf file
- $ convert xc:none -page Letter apdffile.pdf
- $ # accidentally save it
- $ datalad save
- This accidental :dlcmd:`save` has thus added not only text files
- stored in Git, but also a PDF file to the history of the dataset.
- As an :shcmd:`ls -l` reveals, the PDF file has been annexed and is
- thus a :term:`symlink`:
- .. runrecord:: _examples/DL-101-137-114
- :language: console
- :realcommand: ls -l --time-style=long-iso apdffile.pdf
- :workdir: dl-101/DataLad-101
- $ ls -l apdffile.pdf
- Prior to resetting, the PDF file has to be unannexed.
- To unannex files, i.e., get the contents out of the object tree,
- the :dlcmd:`unlock` command is relevant:
- .. runrecord:: _examples/DL-101-137-115
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad unlock apdffile.pdf
- The file is now no longer symlinked:
- .. runrecord:: _examples/DL-101-137-116
- :language: console
- :realcommand: ls -l --time-style=long-iso apdffile.pdf
- :workdir: dl-101/DataLad-101
- $ ls -l apdffile.pdf
- Finally, :gitcmd:`reset --mixed` can be used to revert the
- accidental :dlcmd:`save`. Again, find out the shasum first, and
- afterwards, reset to it.
- .. runrecord:: _examples/DL-101-137-117
- :language: console
- :workdir: dl-101/DataLad-101
- $ git log -n 3 --oneline
- .. runrecord:: _examples/DL-101-137-118
- :language: console
- :workdir: dl-101/DataLad-101
- :realcommand: echo "$ git reset --mixed $(git rev-parse HEAD~1)" && git reset --mixed $(git rev-parse HEAD~1)
- To see what has happened, let's check the history:
- .. runrecord:: _examples/DL-101-137-119
- :language: console
- :workdir: dl-101/DataLad-101
- $ git log -n 2 --oneline
- ... and also the status of the dataset:
- .. runrecord:: _examples/DL-101-137-120
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad status
- The accidental save has been undone, and the file is present
- as untracked content again.
- As before, this action has not been recorded in your history.
- Viewing previous versions of files and datasets
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- The next question is truly magical: How does one *see*
- data as it was at a previous state in history?
- This magic trick can be performed with :gitcmd:`checkout`.
- It is a very heavily used command for various tasks, but among
- many it can send you back in time to view the state of a dataset
- at the time of a specific commit.
- Let's say you want to find out which notes you took in the first
- few chapters of the handbook. Find a commit :term:`shasum` in your history
- to specify the point in time you want to go back to:
- .. runrecord:: _examples/DL-101-137-121
- :language: console
- :workdir: dl-101/DataLad-101
- $ git log -n 16 --oneline
- Let's go 15 commits back in time:
- .. runrecord:: _examples/DL-101-137-122
- :language: console
- :workdir: dl-101/DataLad-101
- :realcommand: echo "$ git checkout $(git rev-parse HEAD~15)" && git checkout $(git rev-parse HEAD~15)
- How did your ``notes.txt`` file look at this point?
- .. runrecord:: _examples/DL-101-137-123
- :language: console
- :workdir: dl-101/DataLad-101
- $ tail notes.txt
- Neat, isn't it? By checking out a commit shasum you can explore a previous
- state of a dataset's history. And this does not only apply to simple text
- files, but every type of file in your dataset, regardless of size.
- The checkout command however led to something that Git calls a "detached HEAD state".
- While this sounds scary, a :gitcmd:`checkout main` will bring you
- back into the most recent version of your dataset and get you out of the
- "detached HEAD state":
- .. runrecord:: _examples/DL-101-137-124
- :language: console
- :workdir: dl-101/DataLad-101
- $ git checkout main
- Note one very important thing: The previously untracked files are still
- there.
- .. runrecord:: _examples/DL-101-137-125
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad status
- The contents of ``notes.txt`` will now be the most recent version again:
- .. runrecord:: _examples/DL-101-137-126
- :language: console
- :workdir: dl-101/DataLad-101
- $ tail notes.txt
- ... Wow! You traveled back and forth in time!
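The round trip can be sketched in a throwaway plain-Git repository. The temporary directory, file contents, and commit messages are assumptions for illustration. Instead of hard-coding ``main``, the sketch remembers the name of the current branch, because the default branch name depends on the local Git configuration.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first note" > notes.txt
git add notes.txt
git commit -q -m "add first note"
echo "second note" >> notes.txt
git add notes.txt
git commit -q -m "add second note"

# Remember the branch we are on, so that we can return to it
branch=$(git symbolic-ref --short HEAD)

# Travel one commit back; this puts the repository in "detached HEAD" state
git checkout -q HEAD~1
cat notes.txt

# Return to the most recent version
git checkout -q "$branch"
cat notes.txt
```

While in the past, ``notes.txt`` contains only the first note; after returning to the branch, both notes are there again.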
- But an even more magical way to see the contents of files in previous
- versions is Git's :gitcmd:`cat-file` command: Among many other things, it lets
- you read a file's contents as of any point in time in the history, without a
- prior :gitcmd:`checkout` (note that the output is shortened for brevity and shows only the last few lines of the file):
- .. runrecord:: _examples/DL-101-137-127
- :language: console
- :workdir: dl-101/DataLad-101
- :lines: 2, 48-
- :realcommand: echo "$ git cat-file --textconv $(git rev-parse HEAD~15):notes.txt" && git cat-file --textconv $(git rev-parse HEAD~15):notes.txt
- .. index::
- pair: cat-file; Git command
- The cat-file command is very versatile, and
- `its documentation <https://git-scm.com/docs/git-cat-file>`_ lists all
- of its functionality. To use it to see the contents of a file at a previous
- state as done above, this is what the general structure looks like:
- .. code-block:: console
- $ git cat-file --textconv SHASUM:<path/to/file>
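A minimal sketch of this structure, run in a throwaway plain-Git repository, is shown here. The temporary directory, file contents, and commit messages are assumptions for illustration, and ``HEAD~1`` takes the place of ``SHASUM``.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first note" > notes.txt
git add notes.txt
git commit -q -m "add first note"
echo "second note" >> notes.txt
git add notes.txt
git commit -q -m "add second note"

# Print notes.txt as it was one commit ago, without checking anything out
git cat-file --textconv HEAD~1:notes.txt

# The file in the working tree is untouched and still has both notes
cat notes.txt
```

In contrast to a checkout, the repository never leaves its current state: only the requested version of the file is printed.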
- Undoing latest modifications of files
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- Previously, we saw how to remove files from a dataset's history that
- were accidentally saved and thus tracked for the first time.
- How does one undo a *modification* to a tracked file?
- Let's modify the saved ``Gitjoke2.txt``:
- .. runrecord:: _examples/DL-101-137-128
- :language: console
- :workdir: dl-101/DataLad-101
- $ echo "this is by far my favorite joke!" >> Gitjoke2.txt
- .. runrecord:: _examples/DL-101-137-129
- :language: console
- :workdir: dl-101/DataLad-101
- $ cat Gitjoke2.txt
- .. runrecord:: _examples/DL-101-137-130
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad status
- .. runrecord:: _examples/DL-101-137-131
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad save -m "add joke evaluation to joke" Gitjoke2.txt
- How could this modification to ``Gitjoke2.txt`` be undone?
- With the :gitcmd:`reset` command again. If you want to
- "unsave" the modification but keep it in the file, use
- :gitcmd:`reset --mixed` as before. However, if you want to
- get rid of the modifications entirely, use the option ``--hard``
- instead of ``--mixed``:
- .. runrecord:: _examples/DL-101-137-132
- :language: console
- :workdir: dl-101/DataLad-101
- $ git log -n 2 --oneline
- .. runrecord:: _examples/DL-101-137-133
- :language: console
- :workdir: dl-101/DataLad-101
- :realcommand: echo "$ git reset --hard $(git rev-parse HEAD~1)" && git reset --hard $(git rev-parse HEAD~1)
- .. runrecord:: _examples/DL-101-137-134
- :language: console
- :workdir: dl-101/DataLad-101
- $ cat Gitjoke2.txt
- The change has been undone completely. This method will work with
- files stored in Git and annexed files.
- Note that this operation only restores this one file, because the commit that
- was undone only contained modifications to this one file. This is a
- demonstration of one of the reasons why one should strive for commits to
- represent meaningful logical units of change -- if necessary, they can be
- undone easily.
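The difference from ``--mixed`` can be sketched in a throwaway plain-Git repository: with ``--hard``, the modification disappears from the file as well as from the history. The temporary directory and the shortened file contents are assumptions for illustration.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Knock knock. Who's there? Git." > Gitjoke2.txt
git add Gitjoke2.txt
git commit -q -m "save my favorite Git joke"

echo "this is by far my favorite joke!" >> Gitjoke2.txt
git add Gitjoke2.txt
git commit -q -m "add joke evaluation to joke"

# Discard the last commit AND its changes to the file
git reset -q --hard HEAD~1

# The file is back at its earlier content, and nothing is left unsaved
cat Gitjoke2.txt
git status --short
```

Since ``--hard`` also overwrites unsaved modifications of tracked files in the working tree, it is worth checking the status of the dataset before running it.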
- Undoing past modifications of files
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- What :gitcmd:`reset` did was to undo commits from
- the most recent version of your dataset. How
- would one undo a change that happened a while ago, though,
- with important changes being added afterwards that you want
- to keep?
- Let's save a bad modification to ``Gitjoke2.txt``,
- but also a modification to ``notes.txt``:
- .. runrecord:: _examples/DL-101-137-140
- :language: console
- :workdir: dl-101/DataLad-101
- $ echo "bad modification" >> Gitjoke2.txt
- .. runrecord:: _examples/DL-101-137-141
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad save -m "did a bad modification" Gitjoke2.txt
- .. runrecord:: _examples/DL-101-137-142
- :language: console
- :workdir: dl-101/DataLad-101
- $ cat << EOT >> notes.txt
- Git has many handy tools to go back and forth in time and work with the
- history of datasets. Among many other things you can rewrite commit
- messages, undo changes, or look at previous versions of datasets.
- A superb resource to find out more about this and practice such Git
- operations is this chapter in the Pro-git book:
- https://git-scm.com/book/en/v2/Git-Tools-Rewriting-History
- EOT
- .. runrecord:: _examples/DL-101-137-143
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad save -m "add note on helpful git resource" notes.txt
- The objective is to remove the first, "bad" modification, but
- keep the more recent modification of ``notes.txt``. A :gitcmd:`reset`
- command is not convenient, because resetting would need to reset
- the most recent, "good" modification as well.
- One way to accomplish it is with an *interactive rebase*, using the
- :gitcmd:`rebase -i` command [#f5]_. Experienced Git-users will know
- under which situations and how to perform such an interactive rebase.
- However, outlining an interactive rebase here in the handbook could lead to
- problems for readers without (much) Git experience: An interactive rebase,
- even if performed successfully, can lead to many problems if it is applied with
- too little experience, for example, in any collaborative real-world project.
- .. index::
- pair: revert; Git command
- Instead, we demonstrate a different, less intrusive way to revert one or more
- changes at any point in the history of a dataset: the :gitcmd:`revert`
- command.
- Instead of *rewriting* the history, it will add an additional commit in which
- the changes of an unwanted commit are reverted.
- The command looks like this:
- .. code-block:: console
- $ git revert SHASUM
- where ``SHASUM`` specifies the commit hash of the modification that should
- be reverted.
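The following sketch shows such a revert in a throwaway plain-Git repository. The temporary directory, file contents, and commit messages are assumptions for illustration; ``HEAD~1`` stands in for ``SHASUM``, and ``--no-edit`` accepts the default commit message so that no editor opens.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "Knock knock. Who's there? Git." > Gitjoke2.txt
echo "some notes" > notes.txt
git add .
git commit -q -m "add joke and notes"

# A bad change, followed by a good change to a different file
echo "bad modification" >> Gitjoke2.txt
git commit -q -am "did a bad modification"
echo "a helpful note" >> notes.txt
git commit -q -am "add note on helpful git resource"

# Undo only the bad change by adding a new commit that reverses it
git revert --no-edit HEAD~1

# The bad line is gone, the good note is kept, and the history grew by one
cat Gitjoke2.txt
cat notes.txt
git log --oneline
```

The revert applies cleanly here because the two later commits touch different files; reverting a change that later commits build upon can lead to a merge conflict.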
- .. index::
- pair: revert multiple commits; with Git
- .. find-out-more:: Reverting more than a single commit
- You can also specify a range of commits like this:
- .. code-block:: console
- $ git revert OLDER_SHASUM..NEWER_SHASUM
- This command will revert all commits starting with the one after
- ``OLDER_SHASUM`` (i.e., **not including** this commit) until and **including**
- the one specified with ``NEWER_SHASUM``.
- For each reverted commit, one new commit will be added to the history that
- reverts it. Thus, if you revert a range of three commits, there will be three
- reversal commits. If you however want the reversal of a range of commits
- saved in a single commit, supply the ``--no-commit`` option as in
- .. code-block:: console
- $ git revert --no-commit OLDER_SHASUM..NEWER_SHASUM
- After running this command, run a single ``git commit`` to conclude the
- reversal and save it in a single commit.
- Let's see what it looks like:
- .. runrecord:: _examples/DL-101-137-144
- :language: console
- :workdir: dl-101/DataLad-101
- :realcommand: echo "$ git revert $(git rev-parse HEAD~1)" && git revert $(git rev-parse HEAD~1)
- This is the state of the file in which we reverted a modification:
- .. runrecord:: _examples/DL-101-137-145
- :language: console
- :workdir: dl-101/DataLad-101
- $ cat Gitjoke2.txt
- It does not contain the bad modification anymore. And this is what happened in
- the history of the dataset:
- .. runrecord:: _examples/DL-101-137-146
- :language: console
- :workdir: dl-101/DataLad-101
- :emphasize-lines: 6-8, 20
- $ git log -n 3
- The commit that introduced the bad modification is still present, but it
- transparently gets undone with the most recent commit. At the same time, the
- good modification of ``notes.txt`` was not influenced in any way. The
- :gitcmd:`revert` command is thus a transparent and safe way of undoing past
- changes. Note though that this command can only be used efficiently if the
- commits in your dataset's history are meaningful, independent units -- having
- several unrelated modifications in a single commit may make an easy solution
- with :gitcmd:`revert` impossible and instead require a complex
- :gitcmd:`checkout`, :gitcmd:`revert`, or :gitcmd:`rebase` operation.
- Finally, let's take a look at the state of the dataset after this operation:
- .. runrecord:: _examples/DL-101-137-147
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad status
- As you can see, unsurprisingly, the :gitcmd:`revert` command had no
- effect on anything but the specified commit, and previously untracked
- files are still present.
- .. index::
- pair: resolve merge conflict; with Git
- Oh no! I'm in a merge conflict!
- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
- When working with the history of a dataset, especially when rewriting
- the history with an interactive rebase or when reverting commits, it is
- possible to run into so-called *merge conflicts*.
- Merge conflicts happen when Git needs assistance in deciding
- which changes to keep and which to apply. It will require
- you to edit the file the merge conflict is happening in with
- a text editor, but such merge conflicts are not nearly as scary as
- they may seem the first few times you solve them.
- This section is not a guide on how to solve merge conflicts, but a broad
- overview of the necessary steps, and a pointer to a more comprehensive guide.
- - The first thing to do if you end up in a merge conflict is
- to read the instructions Git is giving you -- they are a useful guide.
- - Also, it is reassuring to remember that you can always get out of
- a merge conflict by aborting the operation that led to it (e.g.,
- ``git rebase --abort``).
- To actually solve a merge conflict, you will have to edit files: In the
- documents the merge conflict applies to, Git marks the sections it needs
- help with using markers that consist of ``>``, ``<``, and ``=``
- signs and commit shasums or branch names.
- There will be two marked parts, and you have to delete the one you do not
- want to keep, as well as all markers.
- - Afterwards, run ``git add <path/to/file>`` and finally a ``git commit``.
- GitHub has an `excellent resource on how to deal with merge conflicts <https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/addressing-merge-conflicts/resolving-a-merge-conflict-using-the-command-line>`_.
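The steps above can be sketched in a throwaway plain-Git repository. For simplicity the conflict is provoked with a ``git merge`` of two branches rather than with a rebase or revert; the resolution steps are the same. The temporary directory, the branch name ``feature``, the file contents, and the commit messages are assumptions for illustration.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "original line" > notes.txt
git add notes.txt
git commit -q -m "add notes"
main_branch=$(git symbolic-ref --short HEAD)

# Change the same line differently on two branches
git checkout -q -b feature
echo "line from feature" > notes.txt
git commit -q -am "change notes on feature"
git checkout -q "$main_branch"
echo "line from the default branch" > notes.txt
git commit -q -am "change notes on the default branch"

# The merge stops with a conflict; Git exits non-zero, which is expected here
git merge feature || echo "merge stopped with a conflict"

# The file now contains both versions, separated by conflict markers
cat notes.txt

# Resolve: keep the wanted content without any markers, then add and commit
echo "line from the default branch" > notes.txt
git add notes.txt
git commit -q -m "Resolve merge conflict in notes.txt"
git log --oneline
```

Running ``git merge --abort`` instead of the resolution steps would have returned the repository to its state before the merge.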
- Summary
- ^^^^^^^
- This guest lecture has given you a glimpse into how to work with the
- history of your DataLad datasets.
- To conclude this section, let's remove all untracked contents from
- the dataset. This can be done with :gitcmd:`clean`: The command
- :gitcmd:`clean -f` wipes your dataset clean and removes any untracked
- file.
- **Careful! This is not revertible, and content lost with this command cannot be recovered!**
- If you want to be extra sure, run :gitcmd:`clean -fn` beforehand -- this will
- give you a list of the files that would be deleted.
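A minimal sketch of the dry run followed by the actual cleaning, in a throwaway plain-Git repository, looks like this. The temporary directory and the file names are assumptions for illustration.

```shell
set -eu

# Throwaway repository (illustrative setup)
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "some notes" > notes.txt
git add notes.txt
git commit -q -m "add notes"

# Two untracked files
echo "joke 1" > Gitjoke1.txt
echo "joke 3" > Gitjoke3.txt

# Dry run: only lists what would be removed
git clean -n

# Actually remove the untracked files
git clean -f

# Tracked content is untouched, and nothing untracked remains
ls
git status --short
```

Without additional options, the command leaves untracked directories in place; the ``-d`` option extends the cleaning to them.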
- .. runrecord:: _examples/DL-101-137-148
- :language: console
- :workdir: dl-101/DataLad-101
- $ git clean -f
- Afterwards, the :dlcmd:`status` returns nothing, indicating a
- clean dataset state with no untracked files or modifications.
- .. runrecord:: _examples/DL-101-137-149
- :language: console
- :workdir: dl-101/DataLad-101
- $ datalad status
- Finally, if you want, apply your new knowledge about reverting commits
- to remove the ``Gitjoke2.txt`` file.
- .. only:: adminmode
- Add a tag at the section end.
- .. runrecord:: _examples/DL-101-137-160
- :language: console
- :workdir: dl-101/DataLad-101
- $ git branch sct_back_and_forth_in_time
- .. rubric:: Footnotes
- .. [#f1] For example, the :dlcmd:`rerun` command introduced in section
- :ref:`run2` takes such a hash as an argument, and re-executes
- the ``datalad run`` or ``datalad rerun`` :term:`run record` associated with
- this hash. Likewise, the :gitcmd:`diff` command can work with commit hashes.
- .. [#f2] There are other alternatives to reference commits in the history of a dataset,
- for example, references relative to the most recent commit such as ``HEAD~2``
- (two commits back), ``HEAD^2`` (the second parent of a merge commit), or
- ``HEAD@{2}`` (where ``HEAD`` pointed two moves ago, according to the reflog).
- However, using hashes to reference commits is a very fail-safe method and saves
- you from accidentally miscounting.
- .. [#f3] The option ``--mixed`` is the default mode for a :gitcmd:`reset`
- command; omitting it (i.e., running just ``git reset``) leads to the
- same behavior. It is explicitly stated in this book to make the mode
- clear, though.
- .. [#f4] Note though that rewriting history can be dangerous, and you should
- be aware of what you are doing. For example, rewriting parts of the
- dataset's history that have been published (e.g., to a GitHub repository)
- already or that other people have copies of, is not advised.
- .. [#f5] If you need to rebase interactively, please consult further documentation
- and tutorials. It is out of the scope of this handbook to be a complete
- guide on rebasing, and not all interactive rebasing operations are
- complication-free. However, you can always undo mistakes that occur
- during rebasing with the help of the `reflog <https://git-scm.com/docs/git-reflog>`_.