  1. =============
  2. odML Tutorial
  3. =============
  4. :Author:
  5. Lyuba Zehl;
  6. based on work by Hagen Fritsch
  7. :Release:
  8. 1.4
  9. :License:
  10. Creative Commons Attribution-ShareAlike 4.0 International
  11. `License <http://creativecommons.org/licenses/by-sa/4.0/>`_
  12. -------------------------------------------------------------------------------
  13. odML (open metadata Markup Language)
  14. ====================================
  15. odML (open metadata Markup Language) is a framework, proposed by `Grewe et al.
  16. (2011) <http://journal.frontiersin.org/article/10.3389/fninf.2011.00016/full>`_,
  17. to organize and store experimental metadata in a human- and machine-readable,
  18. XML based format (odml). In this tutorial we will illustrate the conceptual
  19. design of the odML framework and show hands-on how you can generate your own
  20. odML metadata file collection. A well organized metadata management of your
  21. experiment is a key component to guarantee the reproducibility of your research
  22. and facilitate the provenance tracking of your analysis projects.
  23. What are metadata and why are they needed?
  24. Metadata are data about data. They describe the conditions under which the
  25. actual raw-data of an experimental study were acquired. The organization of
  26. such metadata and their accessibility may sound like a trivial task, and
  27. most laboratories developed their home-made solutions to keep track of
  28. their metadata. Most of these solutions, however, break down if data and
  29. metadata need to be shared within a collaboration, because implicit
  30. knowledge of what is important and how it is organized is often
  31. underestimated.
  32. While maintaining the relation to the actual raw-data, odML can help to
  33. collect all metadata which are usually distributed over several files and
  34. formats, and to store them in one place, which facilitates sharing data and
  35. metadata.
  36. Key features of odML
  37. - open, XML based language, to collect, store and share metadata
  38. - Machine- and human-readable
  39. - Python-odML library
  40. - Interactive odML-Editor
  41. -------------------------------------------------------------------------------
  42. Structure of this tutorial
  43. ==========================
  44. The scientific background of the possible user community of odML varies
  45. enormously (e.g. physics, informatics, mathematics, biology, medicine,
  46. psychology). Some users will be trained programmers, others probably have never
  47. learned a programming language.
  48. To cover the different demands of all users, we provide a slow introduction to
  49. the odML framework that even allows programming beginners to learn the basic
  50. concepts. We will demonstrate how to generate an odML file and present more
  51. advanced possibilities of the Python-odML library (e.g., how to search for
  52. certain metadata or how to integrate existing terminologies).
  53. At the end of this tutorial we will provide a few guidelines that will help you
  54. to create an odML file structure that is optimised for your individual
  55. experimental project and complements the special needs of your laboratory.
  56. The code for the example odML files used within this tutorial is part
  57. of the documentation package (see doc/example_odMLs/).
  58. A summary of available odML terminologies and templates can be found `here
  59. <http://portal.g-node.org/odml/terminologies/v1.1/terminologies.xml>`_.
  60. -------------------------------------------------------------------------------
  61. Download and Installation
  62. =========================
  63. The odML framework is an open source project of the German Neuroinformatics
  64. Node (`G-Node <http://www.g-node.org/>`_, `odML project website
  65. <http://www.g-node.org/projects/odml>`_) of the International Neuroinformatics
  66. Coordination Facility (`INCF <http://www.incf.org/>`_). The source code for
  67. the Python-odML library is available on `GitHub <https://github.com/>`_ under
  68. the project name `python-odml <https://github.com/G-Node/python-odml>`_.
  69. Dependencies
  70. ------------
  71. The Python-odML library (version 1.4) runs under Python 2.7 or 3.5.
  72. Additionally, the Python-odML library depends on Enum, lxml, pyyaml and rdflib.
  73. When the Python-odML library is installed via pip or the setup.py, these
  74. packages will be automatically downloaded and installed. Alternatively, they
  75. can be installed from the OS package manager.
  76. On Ubuntu, the dependency packages are available as ``python-enum`` and
  77. ``python-lxml``.
  78. Note that on Ubuntu 14.04, the latter package additionally requires the
  79. installation of ``libxml2-dev``, ``libxslt1-dev``, and ``lib32z1-dev``.
  80. Installation...
  81. ---------------
  82. ... via pip:
  83. ************
  84. The simplest way to install the Python-odML library is from `PyPI
  85. <https://pypi.python.org/>`_ using `pip <https://pip.pypa.io/en/stable/>`_::
  86. $ pip install odml
  87. The appropriate Python dependencies will be automatically
  88. downloaded and installed.
  89. If you are not familiar with PyPI and pip, please have a look at the available
  90. online documentation.
  91. ... from source:
  92. ****************
  93. To download the Python-odML library please either use git and clone the
  94. repository from GitHub::
  95. $ cd /home/usr/toolbox/
  96. $ git clone https://github.com/G-Node/python-odml.git
  97. ... or if you don't want to use git, download the ZIP file also provided on
  98. GitHub to your computer (e.g. as above on your home directory under a "toolbox"
  99. folder).
  100. To install the Python-odML library, enter the corresponding directory and run::
  101. $ cd /home/usr/toolbox/python-odml/
  102. $ python setup.py install
  103. Bugs & Questions
  104. ----------------
  105. Should you find a behaviour that is likely a bug, please file a bug report at
  106. `the GitHub bug tracker <https://github.com/G-Node/python-odml/issues>`_.
  107. If you have questions regarding the use of the library or the editor, ask
  108. the question on `Stack Overflow <http://stackoverflow.com/>`_ and be sure to tag
  109. it with ``odml``. We'll do our best to quickly solve the problem.
  110. -------------------------------------------------------------------------------
  111. Basic knowledge on odML
  112. =======================
  113. Before we start, it is important to know the basic structure of an odML
  114. file. Within an odML file metadata are grouped and stored in a
  115. hierarchical tree structure which consists of three basic odML
  116. objects.
  117. Document:
  118. - description: *root of the tree*
  119. - parent: *no parent*
  120. - children: *Section*
  121. Section:
  122. - description: *branches of the tree*
  123. - parent: *Document or Section*
  124. - children: *Section and/or Property*
  125. Property:
  126. - description: *leaves of the tree (contain the metadata values)*
  127. - parent: *Section*
  128. - children: *none*
  129. Each of these odML objects has a certain set of attributes where the
  130. user can describe the object and its contents. Which attribute belongs
  131. to which object and what the attributes are used for, is better explained
  132. in an example odML file (e.g., "THGTTG.odml").
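The parent and child rules listed above can be sketched with plain Python dataclasses. This is only an illustration of the tree shape; the classes below are hypothetical stand-ins and not the classes of the Python-odML library.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Property:
    """Leaf of the tree: holds the actual metadata values."""
    name: str
    values: list = field(default_factory=list)


@dataclass
class Section:
    """Branch of the tree: groups sub-Sections and Properties."""
    name: str
    sections: List["Section"] = field(default_factory=list)
    properties: List[Property] = field(default_factory=list)


@dataclass
class Document:
    """Root of the tree: may only contain Sections."""
    sections: List[Section] = field(default_factory=list)


def outline(node, depth=0):
    """Return one indented line per Section and Property below `node`."""
    lines = []
    for sec in node.sections:
        lines.append("  " * depth + "Section: " + sec.name)
        for prop in sec.properties:
            lines.append("  " * (depth + 1) + "Property: " + prop.name)
        lines.extend(outline(sec, depth + 1))
    return lines


# Build a tiny tree that borrows names from the example file.
crew = Section("TheCrew", properties=[Property("NoCrewMembers", [4])])
crew.sections.append(Section("Ford Prefect"))
doc = Document(sections=[crew, Section("TheStarship")])

print("\n".join(outline(doc)))
```

Only Sections can nest, so the recursion in ``outline`` descends through ``sections`` and treats Properties as end points.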
  133. A first look
  134. ============
  135. If you want to get familiar with the concept behind the odML framework and how
  136. to handle odML files in Python, you can have a first look at the example odML
  137. file provided in the Python-odML library. For this you first need to run the
  138. python code ("thgttg.py") to generate the example odML file ("THGTTG.odml").
  139. When using the following commands, make sure you adapt the paths to the
  140. python-odml module to your own::
  141. $ cd /home/usr/.../python-odml
  142. $ ls doc/example_odMLs
  143. thgttg.py
  144. $ python doc/example_odMLs/thgttg.py "/home/usr/.../python-odml"
  145. $ ls doc/example_odMLs
  146. THGTTG.odml thgttg.py
  147. Now open a Python shell within the Python-odML library directory, e.g. with
  148. IPython::
  149. $ ipython
  150. In the IPython shell, first import the odml package::
  151. >>> import odml
  152. Second, load the example odML file with the following command lines::
  153. >>> to_load = './doc/example_odMLs/THGTTG.odml'
  154. >>> odmlEX = odml.load(to_load)
  155. If you open a Python shell outside of the Python-odML library directory, please
  156. adapt your Python-Path and the path to the "THGTTG.odml" file accordingly.
  157. How you can access the different odML objects and their attributes once you
  158. loaded an odML file and how you can make use of the attributes is described in
  159. more detail in the following chapters for each odML object type (Document,
  160. Section, Property).
  161. How you can create the different odML objects on your own and how to connect
  162. them to build your own metadata odML file will be described in later chapters.
  163. Further advanced functions you can use to navigate through your odML files, or to
  164. create an odML template file, or to make use of common odML terminologies
  165. provided via `the G-Node repository
  166. <http://portal.g-node.org/odml/terminologies/v1.1/terminologies.xml>`_ can also
  167. be found later on in this tutorial.
  168. But now, let us first have a look at the example odML file (THGTTG.odml)!
  169. The Document
  170. ------------
  171. If you loaded the example odML file, let's have a first look at the Document::
  172. >>> print(odmlEX)
  173. Document 42 {author = D. N. Adams, 2 sections}
  174. As you can see, the printout gives you a short summary of the Document of the
  175. loaded example odML file.
  176. The printout already gives you the following information about the odML file:
  177. - ``Document`` tells you that you are looking at an odML Document
  178. - ``42`` is the user defined version of this odML file
  179. - ``{...}`` provides ``author`` and number of attached sections
  180. - ``author`` states the author of the odML file, "D. N. Adams" in the example case
  181. - ``2 sections`` tells you that this odML Document has 2 Sections directly
  182. attached
  183. Note that the Document printout tells you nothing about the depth of the
  184. complete tree structure, because it is not displaying the children of its
  185. directly attached Sections. It also does not display all Document attributes.
  186. In total, a Document has the following 7 attributes:
  187. author
  188. - Returns the author (returned as string) of this odML file.
  189. date
  190. - Returns a user defined date. Could for example be used to state
  191. the date of first creation or the date of last changes.
  192. document
  193. - Returns the current Document object.
  194. parent
  195. - Returns the parent object (which is ``None`` for a Document).
  196. repository
  197. - Returns the URL (returned as string) to a user defined repository of
  198. terminologies used in this Document. Could be the URL to the G-Node
  199. terminologies or to a user defined template.
  200. version
  201. - Returns the user defined version (returned as string) of this odML file.
  202. id
  203. - id is a UUID (universally unique identifier) that uniquely identifies
  204. the current document. If not otherwise specified, this id is automatically
  205. created and assigned.
  206. Let's check out all attributes with the following commands::
  207. >>> print(odmlEX.author)
  208. D. N. Adams
  209. >>> print(odmlEX.date)
  210. 1979-10-12
  211. >>> print(odmlEX.document)
  212. Document 42 {author = D. N. Adams, 2 sections}
  213. >>> print(odmlEX.parent)
  214. None
  215. >>> print(odmlEX.repository)
  216. http://portal.g-node.org/odml/terminologies/v1.1/terminologies.xml
  217. >>> print(odmlEX.version)
  218. 42
  219. As expected for a Document, the attributes author and version match the
  220. information given in the Document printout, the document attribute just returns
  221. the Document, and the parent attribute is ``None``.
  222. As you learned in the beginning, Sections can be attached to a Document. They
  223. represent the next hierarchy level of an odML file. Let's have a look which
  224. Sections were attached to the Document of our example odML file using the
  225. following command::
  226. >>> print(odmlEX.sections)
  227. [Section[4|2] {name = TheCrew, type = crew, id = ...},
  228. Section[1|7] {name = TheStarship, type = starship, id = ...}]
  229. As expected from the Document printout our example contains two Sections. The
  230. printout and attributes of a Section are explained in the next chapter.
  231. The Sections
  232. ------------
  233. There are several ways to access Sections. You can either call them by name or
  234. by index using either explicitly the function that returns the list of
  235. Sections (see last part of `The Document`_ chapter) or using again a short cut
  236. notation. Let's test all the different ways to access a Section, by having a
  237. look at the first Section in the sections list attached to the Document in our
  238. example odML file::
  239. >>> print(odmlEX.sections['TheCrew'])
  240. Section[4|2] {name = TheCrew, type = crew, id = ...}
  241. >>> print(odmlEX.sections[0])
  242. Section[4|2] {name = TheCrew, type = crew, id = ...}
  243. >>> print(odmlEX['TheCrew'])
  244. Section[4|2] {name = TheCrew, type = crew, id = ...}
  245. >>> print(odmlEX[0])
  246. Section[4|2] {name = TheCrew, type = crew, id = ...}
  247. In the following we will call Sections explicitly by their name using the
  248. short cut notation.
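Access by name or by index, as used above, can be pictured as a container whose item lookup accepts either kind of key. The sketch below uses a hypothetical ``NamedList`` class and plain dictionaries; it is not how the Python-odML library is implemented.

```python
class NamedList:
    """Hypothetical list wrapper that accepts an index or a name as key."""

    def __init__(self, items):
        self._items = list(items)

    def __getitem__(self, key):
        if isinstance(key, int):
            return self._items[key]      # positional access
        for item in self._items:         # access by name
            if item["name"] == key:
                return item
        raise KeyError(key)


sections = NamedList([
    {"name": "TheCrew", "type": "crew"},
    {"name": "TheStarship", "type": "starship"},
])

# Both keys lead to the very same object.
print(sections[0] is sections["TheCrew"])
print(sections["TheStarship"]["type"])
```

Access by name stays valid when the order of the entries changes, which is one reason to prefer it in scripts.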
  249. The printout of a Section is similar to the Document printout and gives you
  250. already the following information:
  251. - ``Section`` tells you that you are looking at an odML Section
  252. - ``[4|2]`` states that this Section has four Sections and two Properties directly attached to it
  253. - ``{...}`` provides ``name``, ``type`` and ``id`` of the Section
  254. - ``name`` is the name of this Section, 'TheCrew' in the example case
  255. - ``type`` provides the type of the Section, 'crew' in the example case
  256. - ``id`` provides the uuid of the Section, the actual value has been omitted in the example to improve readability.
  257. Note that the Section printout tells you nothing about the depth of a possible
  258. sub-Section tree below the directly attached ones. It also lists only the
  259. name, type and id out of all Section attributes. In total, a Section can be
  260. defined by the following 8 attributes:
  261. name
  262. - Returns the name of this Section. Should indicate what kind of
  263. information can be found in this Section.
  264. definition
  265. - Returns the definition of the content within this Section. Should
  266. describe what kind of information can be found in this Section.
  267. document
  268. - Returns the Document to which this Section belongs. Note that this
  269. attribute is set automatically for a Section and all its children when
  270. it is attached to a Document.
  271. parent
  272. - Returns the parent to which this Section was directly attached. Can be
  273. either a Document or another Section.
  274. type
  275. - Returns the classification type, which makes it possible to connect related
  276. Sections through a shared semantic context.
  277. reference
  278. - Returns a reference that can be used to state the origin or source file
  279. of the metadata stored in the Properties that are grouped by this
  280. Section.
  281. repository
  282. - Returns the URL (returned as string) to a user defined repository of
  283. terminologies used in this Document. Could be the URL to the G-Node
  284. terminologies or to a user defined template.
  285. id
  286. - id is a UUID (universally unique identifier) that uniquely identifies
  287. the current section. If not otherwise specified, this id is automatically
  288. created and assigned.
  289. Let's have a look at the attributes for the Section 'TheCrew'::
  290. >>> print(odmlEX['TheCrew'].name)
  291. TheCrew
  292. >>> print(odmlEX['TheCrew'].definition)
  293. Information on the crew
  294. >>> print(odmlEX['TheCrew'].document)
  295. Document 42 {author = D. N. Adams, 2 sections}
  296. >>> print(odmlEX['TheCrew'].parent)
  297. Document 42 {author = D. N. Adams, 2 sections}
  298. >>> print(odmlEX['TheCrew'].type)
  299. crew
  300. >>> print(odmlEX['TheCrew'].reference)
  301. None
  302. >>> print(odmlEX['TheCrew'].repository)
  303. None
  304. >>> print(odmlEX['TheCrew'].id)
  305. None
  306. As expected for this Section, the name and type attribute match the information
  307. given in the Section printout, and the document and parent attribute return the
  308. same object, namely our example Document.
  309. To see which Sections are directly attached to the Section 'TheCrew' again use
  310. the following command::
  311. >>> print(odmlEX['TheCrew'].sections)
  312. [Section[0|5] {name = Arthur Philip Dent, type = crew/person, id = ...},
  313. Section[0|5] {name = Zaphod Beeblebrox, type = crew/person, id = ...},
  314. Section[0|5] {name = Tricia Marie McMillan, type = crew/person, id = ...},
  315. Section[0|5] {name = Ford Prefect, type = crew/person, id = ...}]
  316. Or, for accessing these sub-Sections::
  317. >>> print(odmlEX['TheCrew'].sections['Ford Prefect'])
  318. Section[0|5] {name = Ford Prefect, type = crew/person, id = ...}
  319. >>> print(odmlEX['TheCrew'].sections[3])
  320. Section[0|5] {name = Ford Prefect, type = crew/person, id = ...}
  321. >>> print(odmlEX['TheCrew']['Ford Prefect'])
  322. Section[0|5] {name = Ford Prefect, type = crew/person, id = ...}
  323. >>> print(odmlEX['TheCrew'][3])
  324. Section[0|5] {name = Ford Prefect, type = crew/person, id = ...}
  325. As you learned, besides sub-Sections, a Section can also have Properties
  326. attached. Let's see which Properties are attached to the Section 'TheCrew'::
  327. >>> print(odmlEX['TheCrew'].properties)
  328. [Property: {name = NameCrewMembers},
  329. Property: {name = NoCrewMembers}]
  330. The printout and attributes of a Property are explained in the next chapter.
  331. The Properties
  332. --------------
  333. Properties need to be called explicitly via the properties function of a
  334. Section. You can then either call a Property by name or by index::
  335. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'])
  336. Property: {name = NoCrewMembers}
  337. >>> print(odmlEX['TheCrew'].properties[1])
  338. Property: {name = NoCrewMembers}
  339. In the following we will only call Properties explicitly by their name.
  340. The Property printout is reduced and only gives you information about the
  341. following:
  342. - ``Property`` tells you that you are looking at an odML Property
  343. - ``{...}`` provides the ``name`` of the Property
  344. - ``NoCrewMembers`` is the name of this Property
  345. Note that the Property printout tells you nothing about the number of Values,
  346. and very little about the Property attributes. In total, a Property can be
  347. defined by the following 12 attributes:
  348. name
  349. - Returns the name of the Property. Should indicate what kind of metadata
  350. are stored in this Property.
  351. definition
  352. - Returns the definition of this Property. Should describe what kind of
  353. metadata are stored in this Property.
  354. document
  355. - Returns the Document to which the parent Section of this Property
  356. belongs. Note that this attribute is set automatically for a Section and all
  357. its children when it is attached to a Document.
  358. parent
  359. - Returns the parent Section to which this Property was attached.
  360. values
  361. - Returns the metadata of this Property. Can be either a single value or
  362. multiple homogeneous values (all with the same dtype and unit). For
  363. this reason, the output is always provided as a list.
  364. dtype
  365. - Returns the odml data type of the stored metadata.
  366. unit
  367. - Returns the unit of the stored metadata.
  368. uncertainty
  369. - recommended
  370. - Can be used to specify the uncertainty of the given metadata value.
  371. reference
  372. - Returns a reference that can be used to state an external definition
  373. of the metadata value.
  374. dependency
  375. - optional
  376. - A name of another Property within the same section, which this property
  377. depends on.
  378. dependency_value
  379. - optional
  380. - Value of the other Property specified in the 'dependency' attribute on
  381. which this Property depends.
  382. value_origin
  383. - A reference to state the origin of the metadata value e.g. a file name.
  384. Let's check which attributes were defined for the Property 'NoCrewMembers'::
  385. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].name)
  386. NoCrewMembers
  387. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].definition)
  388. Number of crew members
  389. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].document)
  390. Document 42 {author = D. N. Adams, 2 sections}
  391. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].values)
  392. [4]
  393. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].dtype)
  394. int
  395. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].unit)
  396. None
  397. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].uncertainty)
  398. 1
  399. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].reference)
  400. The Hitchhiker's guide to the Galaxy (novel)
  401. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].dependency)
  402. None
  403. >>> print(odmlEX['TheCrew'].properties['NoCrewMembers'].dependency_value)
  404. None
  405. As mentioned, the values attribute of a Property can only contain multiple
  406. metadata when they have the same ``dtype`` and ``unit``, as is the case for
  407. the Property 'NameCrewMembers'::
  408. >>> print(odmlEX['TheCrew'].properties['NameCrewMembers'].values)
  409. ['Arthur Philip Dent',
  410. 'Zaphod Beeblebrox',
  411. 'Tricia Marie McMillan',
  412. 'Ford Prefect']
  413. >>> print(odmlEX['TheCrew'].properties['NameCrewMembers'].dtype)
  414. person
  415. >>> print(odmlEX['TheCrew'].properties['NameCrewMembers'].unit)
  416. None
  417. NOTE: 'property.values' will always return a copy! Any direct changes to the
  418. returned list will have no effect on the actual property values. If you want to
  419. make changes to a property value, either use the 'append', 'extend' and 'remove'
  420. methods or assign a new value list to the property.
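The copy behaviour described in the note can be imitated with a small stand-in class. ``Prop`` below is hypothetical and only mimics the documented behaviour; it is not the odml Property class.

```python
class Prop:
    """Hypothetical stand-in for a Property; not the odml class."""

    def __init__(self, values):
        self._values = list(values)

    @property
    def values(self):
        return list(self._values)        # hand out a copy of the stored list

    @values.setter
    def values(self, new_values):
        self._values = list(new_values)  # assigning a new list replaces the values

    def append(self, value):
        self._values.append(value)       # dedicated method changes the stored list


crew = Prop(["Arthur Philip Dent", "Zaphod Beeblebrox"])

crew.values.append("Tricia Marie McMillan")   # changes only the temporary copy
print(len(crew.values))                       # 2

crew.append("Tricia Marie McMillan")          # changes the stored values
crew.values = crew.values + ["Ford Prefect"]  # assigning a new list also works
print(len(crew.values))                       # 4
```

Handing out a copy keeps callers from changing the stored values without going through the methods of the object.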
  421. -------------------------------------------------------------------------------
  422. Generating an odML-file
  423. =======================
  424. After getting familiar with the different odML objects and their attributes,
  425. you will now learn how to generate your own odML file by reproducing some parts
  426. of the example THGTTG.odml.
  427. We will show you first how to create the different odML objects with their
  428. attributes. Please note that some attributes are obligatory, some are
  429. recommended and others are optional when creating the corresponding odML
  430. objects. A few are automatically generated in the process of creating an odML
  431. file. Furthermore, all attributes of an odml object can be edited at any time.
  432. If you opened a new IPython shell, please first import the odml package again::
  433. >>> import odml
  434. Create a document
  435. -----------------
  436. Let's start by creating the Document. Note that none of the Document attributes
  437. are obligatory::
  438. >>> MYodML = odml.Document()
  439. You can check whether your new Document actually contains what you created by using
  440. some of the commands you learned before::
  441. >>> MYodML
  442. Document None {author = None, 0 sections}
  443. As you can see, we created an "empty" Document where the version and the author
  444. attributes are not defined and no section is yet attached. You will learn how to create
  445. and add a Section to a Document in the next chapter. Let's focus here on defining
  446. the Document attributes::
  447. >>> MYodML.author = 'D. N. Adams'
  448. >>> MYodML.version = 42
  449. For the date attribute you require a datetime object as entry. For this reason,
  450. you need to first import the Python package datetime::
  451. >>> import datetime as dt
  452. Now, let's define the date attribute of the Document::
  453. >>> MYodML.date = dt.date(1979, 10, 12)
  454. Next, let us also add a repository attribute. As an example, we can import the
  455. Python package os to extract the absolute path to our previously used example
  456. odML file and add this as repository::
  457. >>> import os
  458. >>> url2odmlEX = 'file://' + os.path.abspath(to_load)
  459. >>> MYodML.repository = url2odmlEX
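An absolute POSIX path already starts with a slash, so it is easy to end up with the wrong number of slashes when the ``file`` prefix is added by hand. As a hedged alternative, assuming Python 3, the standard library module ``pathlib`` can build the URL itself. The path below is hypothetical and only serves as an illustration.

```python
from pathlib import PurePosixPath

# Hypothetical absolute path, used only for illustration.
path = PurePosixPath("/home/usr/toolbox/python-odml/doc/example_odMLs/THGTTG.odml")

# as_uri() adds the scheme and the right number of slashes.
url = path.as_uri()
print(url)
```

For a file that exists on disk, ``pathlib.Path(to_load).resolve().as_uri()`` would produce the corresponding URL from a relative path.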
  460. The document and parent attributes are set automatically and should not be
  461. changed by hand.
  462. Check whether your new Document now actually contains all attributes::
  463. >>> print(MYodML.author)
  464. D. N. Adams
  465. >>> print(MYodML.date)
  466. 1979-10-12
  467. >>> print(MYodML.document)
  468. Document 42 {author = D. N. Adams, 0 sections}
  469. >>> print(MYodML.parent)
  470. None
  471. >>> print(MYodML.repository)
  472. file:///home/usr/.../python-odml/doc/example_odMLs/THGTTG.odml
  473. >>> print(MYodML.version)
  474. 42
  475. Note that you can also define all attributes when first creating a Document::
  476. >>> MYodML = odml.Document(author='D. N. Adams',
  477. version=42,
  478. date=dt.date(1979, 10, 12),
  479. repository=url2odmlEX)
  480. Our newly created Document is still "empty", though, because it does not yet
  481. contain any Sections. Let's change this!
  482. Create a section
  483. ----------------
  484. We now create a Section by reproducing the Section "TheCrew" of the example
  485. odml file from the beginning::
  486. >>> sec1 = odml.Section(name="TheCrew",
  487. definition="Information on the crew",
  488. type="crew")
  489. Note that only the attribute name is obligatory. The attributes definition and
  490. type are recommended, but could be either not defined at all or defined later
  491. on.
  492. Let us now attach this Section to our previously generated Document. With this,
  493. the attribute document and parent of our new Section are automatically
  494. updated::
  495. >>> MYodML.append(sec1)
  496. >>> print(MYodML)
  497. Document 42 {author = D. N. Adams, 1 sections}
  498. >>> print(MYodML.sections)
  499. [Section[0|0] {name = TheCrew, type = crew, id = ...}]
  500. >>> print(sec1.document)
  501. Document 42 {author = D. N. Adams, 1 sections}
  502. >>> print(sec1.parent)
  503. Document 42 {author = D. N. Adams, 1 sections}
  504. It is also possible to connect a Section directly to a parent object.
  505. Let's try this with the next Section we create::
  506. >>> sec2 = odml.Section(name="Arthur Philip Dent",
  507. definition="Information on Arthur Dent",
  508. type="crew/person",
  509. parent=sec1)
  510. >>> print(sec2)
  511. Section[0|0] {name = Arthur Philip Dent, type = crew/person, id = ...}
  512. >>> print(sec2.document)
  513. Document 42 {author = D. N. Adams, 1 sections}
  514. >>> print(sec2.parent)
  515. Section[1|0] {name = TheCrew, type = crew, id = ...}
  516. Note that all of our created Sections do not contain any Properties yet. Let's
  517. see if we can change this...
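The automatic update of the parent and document attributes seen above can be pictured as a tree in which every node remembers its parent and finds its document by walking upwards. The classes ``Node`` and ``Root`` are hypothetical and only mimic the described behaviour; the Python-odML library may implement it differently.

```python
class Node:
    """Hypothetical tree node; only mimics the parent/document behaviour."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = None
        self.children = []
        if parent is not None:
            parent.append(self)      # attach directly on creation

    def append(self, child):
        child.parent = self          # the parent is set on attachment
        self.children.append(child)

    @property
    def document(self):
        # Walk up until a node without a parent is reached.
        node = self
        while node.parent is not None:
            node = node.parent
        return node if isinstance(node, Root) else None


class Root(Node):
    """Hypothetical stand-in for a Document."""


doc = Root("MYodML")
sec1 = Node("TheCrew")
sec2 = Node("Arthur Philip Dent", parent=sec1)

print(sec2.document)        # None, because sec1 is not attached to a document yet
doc.append(sec1)
print(sec2.document.name)   # MYodML
```

Because the document is looked up through the parents, attaching a Section also covers everything already attached below it.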
  518. Create a property
  519. -----------------
  520. Let's create our first Property::
  521. >>> prop1 = odml.Property(name="Gender",
  522. definition="Sex of the subject",
  523. values="male")
  524. Note that again, only the name attribute is obligatory for creating a Property.
  525. The remaining attributes can be defined later on, or are automatically
  526. generated in the process.
  527. If a value is defined, but the dtype is not, as it is the case for our example
  528. above, the dtype is deduced automatically::
  529. >>> print(prop1.dtype)
  530. string
  531. Generally, you can use the following odML data types to describe the format of
  532. the stored metadata:
  533. +-----------------------------------+---------------------------------------+
  534. | dtype                             | required data examples                |
  535. +===================================+=======================================+
  536. | odml.DType.int or 'int'           | 42                                    |
  537. +-----------------------------------+---------------------------------------+
  538. | odml.DType.float or 'float'       | 42.0                                  |
  539. +-----------------------------------+---------------------------------------+
  540. | odml.DType.boolean or 'boolean'   | True or False                         |
  541. +-----------------------------------+---------------------------------------+
  542. | odml.DType.string or 'string'     | 'Earth'                               |
  543. +-----------------------------------+---------------------------------------+
  544. | odml.DType.date or 'date'         | dt.date(1979, 10, 12)                 |
  545. +-----------------------------------+---------------------------------------+
  546. | odml.DType.datetime or 'datetime' | dt.datetime(1979, 10, 12, 11, 11, 11) |
  547. +-----------------------------------+---------------------------------------+
  548. | odml.DType.time or 'time'         | dt.time(11, 11, 11)                   |
  549. +-----------------------------------+---------------------------------------+
  550. | odml.DType.person or 'person'     | 'Zaphod Beeblebrox'                   |
  551. +-----------------------------------+---------------------------------------+
  552. | odml.DType.text or 'text'         | 'any text containing \n linebreaks'   |
  553. +-----------------------------------+---------------------------------------+
  554. | odml.DType.url or 'url'           | "https://en.wikipedia.org/wiki/Earth" |
  555. +-----------------------------------+---------------------------------------+
  556. | odml.DType.tuple                  | "(39.12; 67.19)"                      |
  557. +-----------------------------------+---------------------------------------+
The available types are implemented in the 'odml.dtypes' module. Note that the
last four data types (person, text, url, and tuple) cannot be deduced
automatically. Unless the dtype is set explicitly, such values are always
interpreted as string.

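To make the deduction rule more tangible, here is a minimal sketch in plain
Python. The helper ``guess_dtype`` is hypothetical and is not odML's own
implementation; it only mirrors the table above by mapping a Python value to
the dtype name that could be deduced from it.

```python
import datetime as dt


def guess_dtype(value):
    """Hypothetical helper (not part of odML): name the deducible dtype."""
    # bool has to be tested before int, because bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "int"
    if isinstance(value, float):
        return "float"
    # datetime has to be tested before date, because it subclasses date.
    if isinstance(value, dt.datetime):
        return "datetime"
    if isinstance(value, dt.date):
        return "date"
    if isinstance(value, dt.time):
        return "time"
    # person, text, url and tuple values are all plain strings, so nothing
    # distinguishes them from an ordinary string.
    return "string"


examples = [42, 42.0, True, "Earth", dt.date(1979, 10, 12),
            dt.datetime(1979, 10, 12, 11, 11, 11), dt.time(11, 11, 11),
            "Zaphod Beeblebrox", "(39.12; 67.19)"]
for value in examples:
    print(repr(value), "->", guess_dtype(value))
```

The last two examples show why a person or tuple dtype has to be set by hand:
from the value alone they look like any other string.
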
If we now append our new Property to the previously created sub-Section
'Arthur Philip Dent', the Property will also inherit the document attribute and
automatically update its parent attribute::

    >>> MYodML['TheCrew']['Arthur Philip Dent'].append(prop1)
    >>> print(prop1.document)
    Document 42 {author = D. N. Adams, 1 sections}
    >>> print(prop1.parent)
    Section[0|1] {name = Arthur Philip Dent, type = crew/person, id = ...}

Next, let us create a Property with multiple metadata entries::

    >>> prop2 = odml.Property(name="NameCrewMembers",
                              definition="List of crew members names",
                              values=["Arthur Philip Dent",
                                      "Zaphod Beeblebrox",
                                      "Tricia Marie McMillan",
                                      "Ford Prefect"],
                              dtype=odml.DType.person)

As you learned before, in such a case, the metadata entries must be
homogeneous! That means they have to be of the same dtype, unit, and
uncertainty (here ``odml.DType.person``, None, and None, respectively).

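As a rough illustration of what "homogeneous" means for the values themselves,
the following sketch checks whether all entries of a list share one Python
type. The helper ``is_homogeneous`` is hypothetical and not part of odML, and
it deliberately ignores unit and uncertainty.

```python
def is_homogeneous(values):
    """Hypothetical helper (not part of odML).

    Return True if all entries share exactly the same Python type.
    """
    return len({type(value) for value in values}) <= 1


crew = ["Arthur Philip Dent", "Zaphod Beeblebrox",
        "Tricia Marie McMillan", "Ford Prefect"]
print(is_homogeneous(crew))                        # True
print(is_homogeneous(["Arthur Philip Dent", 42]))  # False
```
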
To further build up our odML file, let us now attach this new Property to the
previously created Section 'TheCrew'::

    >>> MYodML['TheCrew'].append(prop2)

Note that it is also possible to add a metadata entry later on::

    >>> prop2.append("Blind Passenger")
    >>> print(MYodML['TheCrew'].properties['NameCrewMembers'].values)
    ['Arthur Philip Dent',
     'Zaphod Beeblebrox',
     'Tricia Marie McMillan',
     'Ford Prefect',
     'Blind Passenger']

The tuple data type you might have noticed in the dtype table above needs
special handling. It is intended to enforce a specific number of data points
for each value entry. This is useful for 2D or 3D data, where all data points
always have to be present for each entry.

The dtype itself has to contain the number of required data points, for
example "2-tuple" or "3-tuple". The data points of each value have to be
enclosed in parentheses and separated by semicolons::

    >>> pixel_prop = odml.Property(name="pixel map")
    >>> pixel_prop.dtype = "2-tuple"
    >>> pixel_prop.values = ["(1; 2)", "(3; 4)"]

    >>> voxel_prop = odml.Property(name="voxel map")
    >>> voxel_prop.dtype = "3-tuple"
    >>> voxel_prop.values = "(1; 2; 3)"

Please note that inconsistent tuple values will raise an error::

    >>> tprop = odml.Property(name="tuple fail")
    >>> tprop.dtype = "3-tuple"
    >>> tprop.values = ["(1; 2)"]

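The format rules for tuple values can be sketched in plain Python as follows.
The function ``parse_tuple`` is a hypothetical stand-in, not odML's own
validation code, and the wording of its error messages is made up; it only
shows the kind of check that rejects the last example above.

```python
import re


def parse_tuple(value, dtype):
    """Hypothetical helper (not part of odML): split and check a tuple value."""
    match = re.fullmatch(r"(\d+)-tuple", dtype)
    if match is None:
        raise ValueError("not a tuple dtype: %r" % dtype)
    expected = int(match.group(1))

    text = value.strip()
    if not (text.startswith("(") and text.endswith(")")):
        raise ValueError("value must be enclosed in parentheses: %r" % value)

    # Data points are separated by semicolons inside the parentheses.
    points = [point.strip() for point in text[1:-1].split(";")]
    if len(points) != expected:
        raise ValueError("expected %d data points, got %d: %r"
                         % (expected, len(points), value))
    return points


print(parse_tuple("(1; 2)", "2-tuple"))       # ['1', '2']
print(parse_tuple("(1; 2; 3)", "3-tuple"))    # ['1', '2', '3']

try:
    parse_tuple("(1; 2)", "3-tuple")
except ValueError as err:
    print("rejected:", err)
```
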
Printing XML-representation of an odML file:
--------------------------------------------

Although the XML-representation of an odML file is a bit hard to read, it is
sometimes helpful to check, especially while generating a file, what the
hierarchical structure of the odML file looks like.

Let's have a look at the XML-representation of the small odML file we just
generated::

    >>> print(odml.tools.xmlparser.XMLWriter(MYodML))
    <odML version="1.1">
      <date>1979-10-12</date>
      <section>
        <definition>Information on the crew</definition>
        <property>
          <definition>List of crew members names</definition>
          <name>NameCrewMembers</name>
          <type>person</type>
          <value>[Arthur Philip Dent,Zaphod Beeblebrox,Tricia Marie McMillan,Ford Prefect,Blind Passenger&#13;]</value>
        </property>
        <name>TheCrew</name>
        <section>
          <definition>Information on Arthur Dent</definition>
          <property>
            <definition>Sex of the subject</definition>
            <name>Gender</name>
            <type>string</type>
            <value>[male&#13;]</value>
          </property>
          <name>Arthur Philip Dent</name>
          <type>crew/person</type>
        </section>
        <type>crew</type>
      </section>
      <version>42</version>
      <repository>file:///home/zehl/Projects/toolbox/python-odml/doc/example_odMLs/THGTTG.odml</repository>
      <author>D. N. Adams</author>
    </odML>

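If the raw XML is too noisy, a compact outline can be easier to scan. The
following sketch uses only Python's standard ``xml.etree.ElementTree`` module
rather than any odML API; the helper ``outline`` and the shortened XML string
are illustrative assumptions, reduced to the ``name`` and ``type`` elements of
the output above.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the XML shown above (definitions and values left out).
XML = """<odML version="1.1">
  <section>
    <name>TheCrew</name>
    <type>crew</type>
    <property>
      <name>NameCrewMembers</name>
      <type>person</type>
    </property>
    <section>
      <name>Arthur Philip Dent</name>
      <type>crew/person</type>
      <property>
        <name>Gender</name>
        <type>string</type>
      </property>
    </section>
  </section>
</odML>"""


def outline(element, depth=0):
    """Hypothetical helper: list nested sections and properties by name."""
    lines = []
    for child in element:
        if child.tag in ("section", "property"):
            name = child.findtext("name", default="?")
            lines.append("  " * depth + child.tag + ": " + name)
            # Recurse so that sub-sections are indented below their parent.
            lines.extend(outline(child, depth + 1))
    return lines


print("\n".join(outline(ET.fromstring(XML))))
```

For the shortened document this prints the Section 'TheCrew' with its Property
and its sub-Section 'Arthur Philip Dent' indented below it.
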
Saving an odML file:
--------------------

You can save your odML file using the following command::

    >>> save_to = '/home/usr/toolbox/python-odml/doc/example_odMLs/myodml.odml'
    >>> odml.save(MYodML, save_to)


Loading an odML file:
---------------------

You already learned how to load the example odML file. As a reminder, you can
reload your own saved odML file in the same way::

    >>> my_reloaded_odml = odml.load(save_to)

-------------------------------------------------------------------------------

Advanced odML-Features
======================

Advanced knowledge on Values
----------------------------

Data type conversions
*********************

After creating a Property with metadata, the data type can be changed and the
format of the corresponding entries will be converted to the new data type, if
the new format is valid for the given metadata::

    >>> test_dtype_conv = odml.Property('p', values=1.0)
    >>> print(test_dtype_conv.values)
    [1.0]
    >>> print(test_dtype_conv.dtype)
    float
    >>> test_dtype_conv.dtype = odml.DType.int
    >>> print(test_dtype_conv.values)
    [1]
    >>> print(test_dtype_conv.dtype)
    int

If the conversion is invalid, a ValueError is raised.

Also note that metadata may be lost in the process, for example if a float is
converted to an integer and then back to a float::

    >>> test_dtype_conv = odml.Property('p', values=42.42)
    >>> print(test_dtype_conv.values)
    [42.42]
    >>> test_dtype_conv.dtype = odml.DType.int
    >>> test_dtype_conv.dtype = odml.DType.float
    >>> print(test_dtype_conv.values)
    [42.0]

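The same loss can be reproduced with Python's built-in conversions, without
odML. This sketch is only an analogy for what happens to the stored value; it
also shows an invalid conversion raising a ValueError.

```python
original = 42.42
as_int = int(original)     # the fractional part is discarded here
restored = float(as_int)   # converting back cannot recover it

print(original, as_int, restored)   # 42.42 42 42.0

# A conversion that is not valid for the given value raises a ValueError.
try:
    int("forty-two")
except ValueError as err:
    print("invalid conversion:", err)
```
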
Advanced knowledge on Properties
--------------------------------

Advanced knowledge on Sections
------------------------------

Links & Includes
****************

(DEPRECATED; new version coming soon)

Sections can be linked to other Sections, so that they include their defined
attributes. A link can be within the document (``link`` property) or to an
external one (``include`` property).

After parsing a document, these links are not yet resolved, but they can be
resolved using the :py:meth:`odml.doc.BaseDocument.finalize` method::

    >>> from odml.tools import xmlparser
    >>> d = xmlparser.load("sample.odml")
    >>> d.finalize()

Note: Only the parser refrains from resolving link properties automatically,
as the referenced sections may not yet be available.
However, when manually setting the ``link`` (or ``include``) attribute, it will
be immediately resolved. To avoid this behaviour, set the ``_link`` (or ``_include``)
attribute instead.
The object remembers which one it is linked to in its ``_merged`` attribute.
The link can be unresolved manually using :py:meth:`odml.section.BaseSection.unmerge`
and merged again using :py:meth:`odml.section.BaseSection.merge`.

Unresolving means removing sections and properties that do not differ from their
linked equivalents. This should be done globally before saving using the
:py:meth:`odml.doc.BaseDocument.clean` method::

    >>> d.clean()
    >>> xmlparser.XMLWriter(d).write_file('sample.odml')

Changing a ``link`` (or ``include``) attribute will first unmerge the section and
then merge it with the new object.

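The idea behind merging and unresolving can be sketched with plain
dictionaries. Everything in this example is a hypothetical stand-in: a section
is reduced to a dict of property values, and the functions ``merge`` and
``unmerge`` only imitate the behaviour described above, not odML's actual
implementation.

```python
def merge(section, linked):
    """Return `section` plus every entry of `linked` it does not define."""
    merged = dict(linked)
    merged.update(section)   # the section's own entries take precedence
    return merged


def unmerge(section, linked):
    """Drop every entry that does not differ from its linked equivalent."""
    return {key: value for key, value in section.items()
            if key not in linked or linked[key] != value}


template = {"Species": "human", "Planet": "Earth"}
arthur = {"Planet": "Magrathea", "Gender": "male"}

resolved = merge(arthur, template)
print(resolved)

# Unresolving removes what was only inherited and keeps what differs.
print(unmerge(resolved, template))
```

After unresolving, only the entries that the section defines or overrides
itself remain, which is what keeps the saved file free of duplicated content.
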
Terminologies
*************

(deprecated; new version coming soon)

odML supports terminologies that are data structure templates for typical use cases.
Sections can have a ``repository`` attribute. As repositories can be inherited,
the current applicable one can be obtained using the
:py:meth:`odml.section.BaseSection.get_repository` method.

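Inheritance of the repository can be pictured as a walk up the parent chain
until a value is found. The ``Node`` class below is a hypothetical stand-in for
a Document or Section and is not odML's implementation; the repository string
is a placeholder.

```python
class Node:
    """Hypothetical stand-in for a Document or Section (not odML's class)."""

    def __init__(self, name, parent=None, repository=None):
        self.name = name
        self.parent = parent
        self.repository = repository

    def get_repository(self):
        """Return the nearest repository set on this node or an ancestor."""
        node = self
        while node is not None:
            if node.repository is not None:
                return node.repository
            node = node.parent
        return None


doc = Node("document", repository="example-terminology")  # placeholder value
crew = Node("TheCrew", parent=doc)
arthur = Node("Arthur Philip Dent", parent=crew)

print(arthur.get_repository())   # example-terminology
```
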
To see whether an object has a terminology equivalent, use the
:py:meth:`odml.property.BaseProperty.get_terminology_equivalent`
method, which returns the corresponding object of the terminology.