.. _run5:

Clean desk
----------

Just now you realize that you need to fit both logos onto the same slide.
"Ah, damn, I might then really need to have them 400 by 400 pixels to fit",
you think. "Good that I know how to not run into the permission denied errors anymore!"

Therefore, we need to do the :dlcmd:`run` command yet again -- this time to get
the image in 400x400 px size. "Now this definitely will be the last time I'm running this",
you think.

.. runrecord:: _examples/DL-101-112-101
   :language: console
   :workdir: dl-101/DataLad-101
   :emphasize-lines: 5
   :notes: mhh, 450x450px seems a bit large, we have to go back to 400. Let's make yet another, complete run command
   :exitcode: 1
   :cast: 02_reproducible_execution

   $ datalad run -m "Resize logo for slides" \
   --input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" \
   --output "recordings/interval_logo_small.jpg" \
   "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"

Oh for f**** sake... run is "impossible"?
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Weird. Once the initial annoyance about yet another error message has faded
and you read on, DataLad informs you that a "clean dataset" is required.
Run a :dlcmd:`status` to see what is meant by this:

.. runrecord:: _examples/DL-101-112-102
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: What happened? The dataset is not "clean"
   :cast: 02_reproducible_execution

   $ datalad status

Ah right. We forgot to save the notes we added, and thus there are
unsaved modifications present in ``DataLad-101``.
But why is this a problem?

By default, a :dlcmd:`run` ends with a :dlcmd:`save`.
Remember the section :ref:`populate`: A general :dlcmd:`save` without
a path specification will save *all* of the modified or untracked
contents to the dataset.

Therefore, to avoid mixing in changes that are unrelated
to the command plugged into :dlcmd:`run`, by default it will only run
on a clean dataset with no changes or untracked files present.

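If you want to check for a clean state yourself before calling :dlcmd:`run`,
the idea can be sketched in a few lines of shell. This is only an illustration
and not how DataLad implements its own check: it assumes that plain
``git status --porcelain``, which prints nothing in a clean Git repository,
is a good enough stand-in for :dlcmd:`status`. The helper name ``is_clean``
is made up for this sketch.

.. code-block:: bash

   # Hypothetical helper (not part of DataLad): run the given status command
   # and succeed only if it exits without error AND prints nothing.
   is_clean() {
       # Capture standard output; if the command itself fails (for example,
       # outside of a repository), signal that with exit code 2.
       report="$("$@" 2>/dev/null)" || return 2
       # An empty report means nothing is modified or untracked.
       [ -z "$report" ]
   }

   # Assumption: "git status --porcelain" stands in for "datalad status".
   if is_clean git status --porcelain; then
       echo "clean: a plain datalad run should work"
   else
       echo "not clean (or not a repository): save first"
   fi

Passing the status command as arguments keeps the helper independent of any
particular tool, so you can try it with a different status call as well.
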
There are two ways to get around this error message:
The more obvious -- and recommended -- one is to save the modifications,
and run the command in a clean dataset.
We will try this way with the ``logo_interval.jpg``.
It looks like this:
First, save the changes,

.. runrecord:: _examples/DL-101-112-103
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: One way to prevent this is to have a clean dataset state
   :cast: 02_reproducible_execution

   $ datalad save -m "add additional notes on run options"

and then try again:

.. runrecord:: _examples/DL-101-112-104
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: let's try again with a clean dataset
   :cast: 02_reproducible_execution

   $ datalad run -m "Resize logo for slides" \
   --input "recordings/longnow/.datalad/feed_metadata/logo_interval.jpg" \
   --output "recordings/interval_logo_small.jpg" \
   "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_interval.jpg recordings/interval_logo_small.jpg"

Note how in this execution of :dlcmd:`run`, output unlocking was actually
necessary, and DataLad provides a summary of this action in its output.

Make a quick addition to your notes about this way of cleaning up prior
to a :dlcmd:`run`:

.. runrecord:: _examples/DL-101-112-105
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: we'll make a note on clean datasets (which we won't save)
   :cast: 02_reproducible_execution

   $ cat << EOT >> notes.txt
   Important! If the dataset is not "clean" (clean means that a datalad
   status output is empty), datalad run will not work - you will have to
   save modifications present in your dataset.

   EOT

.. index::
   pair: run command on dirty dataset; with DataLad run

The second way, executing a :dlcmd:`run` *despite* an "unclean" dataset,
is to add the ``--explicit`` flag to :dlcmd:`run`.
We will try this flag with the remaining ``logo_salt.jpg``. Note that
we have an "unclean dataset" again because of the
additional note in ``notes.txt``.

.. runrecord:: _examples/DL-101-112-106
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: alternatively, the --explicit flag allows run despite an unclean dataset. However, this will only save changes to --output
   :cast: 02_reproducible_execution

   $ datalad run -m "Resize logo for slides" \
   --input "recordings/longnow/.datalad/feed_metadata/logo_salt.jpg" \
   --output "recordings/salt_logo_small.jpg" \
   --explicit \
   "convert -resize 400x400 recordings/longnow/.datalad/feed_metadata/logo_salt.jpg recordings/salt_logo_small.jpg"

With this flag, DataLad considers the specification of inputs and outputs to be "explicit".
It does not complain if the repository is dirty, but importantly, it
**only** saves modifications to the *listed outputs* (which is a problem in the
many cases where one does not know exactly which outputs are produced).

.. index::
   pair: explicit input/output declaration; with DataLad run

.. importantnote:: Put explicit first!

   The ``--explicit`` flag has to be given somewhere *prior* to the command that
   should be run -- the command needs to be the last element of a
   :dlcmd:`run` call.

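To see what "prior to the command" means in practice, here is a small sketch.
``run_explicit`` is a hypothetical wrapper, not a DataLad command: it takes the
command string as its first argument and places ``--explicit`` and all other
options in front of it when calling :dlcmd:`run`. The ``DATALAD`` variable and
the file names ``in.jpg`` and ``out.jpg`` are likewise assumptions of this
sketch; setting ``DATALAD=echo`` only previews the call instead of executing it.

.. code-block:: bash

   # Hypothetical wrapper (not part of DataLad).
   # Usage: run_explicit "<command>" [datalad run options...]
   run_explicit() {
       cmd="$1"   # the command to execute, as a single string
       shift      # everything that remains are options for datalad run
       # Options (including --explicit) go first, the command goes last.
       ${DATALAD:-datalad} run --explicit "$@" "$cmd"
   }

   # Preview the resulting call without executing anything:
   DATALAD=echo run_explicit "convert -resize 400x400 in.jpg out.jpg" \
       --input "in.jpg" --output "out.jpg"

The preview prints the options first and the command last, which is the order
a :dlcmd:`run` call needs.
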
A :dlcmd:`status` will show that your previously modified ``notes.txt``
is still modified:

.. runrecord:: _examples/DL-101-112-110
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: the previously modified ``notes.txt`` is still modified:
   :cast: 02_reproducible_execution

   $ datalad status

Add an additional note on the ``--explicit`` flag, and finally save your changes to ``notes.txt``.

.. runrecord:: _examples/DL-101-112-107
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: Note on --explicit flag
   :cast: 02_reproducible_execution

   $ cat << EOT >> notes.txt
   A suboptimal alternative is the --explicit flag, used to record only
   those changes done to the files listed with --output flags.

   EOT

.. runrecord:: _examples/DL-101-112-108
   :language: console
   :workdir: dl-101/DataLad-101
   :notes: and save it
   :cast: 02_reproducible_execution

   $ datalad save -m "add note on clean datasets"

To conclude this section on :dlcmd:`run`, take a look at the last :dlcmd:`run`
commit to see a :term:`run record` with more content:

.. runrecord:: _examples/DL-101-112-109
   :language: console
   :workdir: dl-101/DataLad-101
   :lines: 1, 24-50
   :emphasize-lines: 10, 14-19
   :notes: finally, let's see a more complex runrecord
   :cast: 02_reproducible_execution

   $ git log -p -n 2

.. only:: adminmode

   Add a tag at the section end.

   .. runrecord:: _examples/DL-101-112-110
      :language: console
      :workdir: dl-101/DataLad-101

      $ git branch sct_clean_desk