adswa
/
handbook


			
							123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194
							.. _run5:

Clean desk
----------

Just now you realize that you need to fit both logos onto the same slide.
"Ah, damn, I might then really need to have them 400 by 400 pixel to fit",
you think. "Good that I know how to not run into the permission denied errors anymore!"

Therefore, we need to do the :dlcmd:`run` command yet again - we wanted to have
the image in 400x400 px size. "Now this definitely will be the last time I'm running this",
you think.

.. runrecord:: _examples/DL-101-112-101
   :language: console
   :workdir: dl-101/DataLad-101
   :emphasize-lines: 5
   :notes: mhh, 450x450px seems a bit large, we have to go back to 400. Lets make yet another, complete run command
   :exitcode: 1
   :cast: 02_reproducible_execution

   $ datalad run -m "Resize logo for slides" \
   --input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" \
   --output "recordings/interval_logo_small.jpg" \
   "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"

Oh for f**** sake... run is "impossible"?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Weird. After the initial annoyance about yet another error message faded,
and you read on,
DataLad informs that a "clean dataset" is required.
Run a :dlcmd:`status` to see what is meant by this:

.. runrecord:: _examples/DL-101-112-102
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: What happened? The dataset is not "clean"
   :cast: 02_reproducible_execution

   $ datalad status

Ah right. We forgot to save the notes we added, and thus there are
unsaved modifications present in ``DataLad-101``.
But why is this a problem?

By default, at the end of a :dlcmd:`run` is a :dlcmd:`save`.
Remember the section :ref:`populate`: A general :dlcmd:`save` without
a path specification will save *all* of the modified or untracked
contents to the dataset.

Therefore, in order to not mix any changes in the dataset that are unrelated
to the command plugged into :dlcmd:`run`, by default it will only run
on a clean dataset with no changes or untracked files present.

There are two ways to get around this error message:
The more obvious -- and recommended -- one is to save the modifications,
and run the command in a clean dataset.
We will try this way with the ``logo_interval.jpg``.
It would look like this:
First, save the changes,


.. runrecord:: _examples/DL-101-112-103
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: One way to prevent this is to have a clean dataset state
   :cast: 02_reproducible_execution

   $ datalad save -m "add additional notes on run options"

and then try again:

.. runrecord:: _examples/DL-101-112-104
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: let's try again with a clean dataset
   :cast: 02_reproducible_execution

   $ datalad run -m "Resize logo for slides" \
   --input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" \
   --output "recordings/interval_logo_small.jpg" \
   "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"

Note how in this execution of :dlcmd:`run`, output unlocking was actually
necessary and DataLad provides a summary of this action in its output.

Add a quick addition to your notes about this way of cleaning up prior
to a :dlcmd:`run`:

.. runrecord:: _examples/DL-101-112-105
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: we'll make a note on clean datasets (which we won't save)
   :cast: 02_reproducible_execution

   $ cat << EOT >> notes.txt
   Important! If the dataset is not "clean" (a datalad status output is
   empty), datalad run will not work - you will have to save
   modifications present in your dataset.
   EOT


.. index::
   pair: run command on dirty dataset; with DataLad run

A way of executing a :dlcmd:`run` *despite* an "unclean" dataset,
though, is to add the ``--explicit`` flag to :dlcmd:`run`.
We will try this flag with the remaining ``logo_salt.jpg``. Note that
we have an "unclean dataset" again because of the
additional note in ``notes.txt``.


.. runrecord:: _examples/DL-101-112-106
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: alternatively, the --explicit flag allows run despite an unclean dataset. However, this will only save changes to --output
   :cast: 02_reproducible_execution

   $ datalad run -m "Resize logo for slides" \
   --input "recordings/longnow/.datalad/feed_metadata/logo_salt.jpg" \
   --output "recordings/salt_logo_small.jpg" \
   --explicit \
   "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg"

With this flag, DataLad considers the specification of inputs and outputs to be "explicit".
It does not warn if the repository is dirty, but importantly, it
**only** saves modifications to the *listed outputs* (which is a problem in the
vast amount of cases where one does not exactly know which outputs are produced).

.. index::
   pair: explicit input/output declaration; with DataLad run
.. importantnote:: Put explicit first!

   The ``--explicit`` flag has to be given anywhere *prior* to the command that
   should be run -- the command needs to be the last element of a
   :dlcmd:`run` call.

A :dlcmd:`status` will show that your previously modified ``notes.txt``
is still modified:

.. runrecord:: _examples/DL-101-112-110
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: the previously modified ``notes.txt`` is still modified:
   :cast: 02_reproducible_execution

   $ datalad status

Add an additional note on the ``--explicit`` flag, and finally save your changes to ``notes.txt``.

.. runrecord:: _examples/DL-101-112-107
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: Note on --explicit flag
   :cast: 02_reproducible_execution

   $ cat << EOT >> notes.txt
   A suboptimal alternative is the --explicit flag, used to record only
   those changes done to the files listed with --output flags.

   EOT

.. runrecord:: _examples/DL-101-112-108
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: and save it
   :cast: 02_reproducible_execution

   $ datalad save -m "add note on clean datasets"

To conclude this section on :dlcmd:`run`, take a look at the last :dlcmd:`run`
commit to see a :term:`run record` with more content:

.. runrecord:: _examples/DL-101-112-109
   :language: console
   :workdir: dl-101/DataLad-101
   :lines: 1, 24-50
   :emphasize-lines: 10, 14-19
   :notes: finally, lets see a more complex runrecord
   :cast: 02_reproducible_execution

   $ git log -p -n 2


.. only:: adminmode

   Add a tag at the section end.

     .. runrecord:: _examples/DL-101-112-110
        :language: console
        :workdir: dl-101/DataLad-101

        $ git branch sct_clean_desk