Christian Mönch ddaab0882c fix a lying comment 2 years ago

Datalad Metadata Model

This repository contains the metadata model that datalad and datalad-metalad (will) use for their metadata.

The model is separated into individual components that can be independently loaded and saved in order to have a focus-based view on the potentially very large metadata model instance (an application of the "proxy design pattern").
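To illustrate the proxy idea, here is a minimal sketch of a component that holds only a reference and loads its content on demand. The class name `LazyComponent`, the `loader` callable, and the reference strings are hypothetical and are not taken from the dataladmetadatamodel package.

```python
from typing import Callable, Generic, Optional, TypeVar

T = TypeVar("T")


class LazyComponent(Generic[T]):
    """Hypothetical proxy: keeps a reference and loads the real object on demand."""

    def __init__(self, reference: str, loader: Callable[[str], T]) -> None:
        self.reference = reference
        self._loader = loader
        self._obj: Optional[T] = None
        self._loaded = False

    def is_loaded(self) -> bool:
        return self._loaded

    def load(self) -> T:
        # Fetch the real object only on first access, then reuse it.
        if not self._loaded:
            self._obj = self._loader(self.reference)
            self._loaded = True
        return self._obj

    def unload(self) -> None:
        # Drop the in-memory copy; the reference suffices to load it again.
        self._obj = None
        self._loaded = False


# Usage with a fake store standing in for a real storage layer.
store = {"ref-1": {"name": "dataset-a"}}
calls = []


def fake_loader(reference: str) -> dict:
    calls.append(reference)
    return store[reference]


component = LazyComponent("ref-1", fake_loader)
```

Only the components that a caller actually touches are loaded, which is what keeps the view on a large model instance focused.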

The implementation is divided into a user-facing API layer and a storage layer. Both are independent of each other (as long as the model does not change). The API layer defines an abstract data type, which represents the metadata model. The storage layer is responsible for persisting the model instance.

The two layers communicate through a defined interface. This allows multiple different storage layers to be used with the same model instance. This can even be done in parallel, for example, if you want to copy a model from one storage layer to another. It also allows storage backends to be developed independently.
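The separation between the layers can be sketched as follows. This is an illustration under assumptions, not the package's actual interface: `StorageBackend`, `InMemoryBackend`, and `copy_model` are hypothetical names, and a real backend would persist to git or another store rather than to a dictionary.

```python
from abc import ABC, abstractmethod
from typing import Dict, Iterable


class StorageBackend(ABC):
    """Hypothetical interface between the API layer and a storage layer."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None:
        ...

    @abstractmethod
    def load(self, key: str) -> bytes:
        ...

    @abstractmethod
    def keys(self) -> Iterable[str]:
        ...


class InMemoryBackend(StorageBackend):
    """Toy backend that keeps everything in a dictionary."""

    def __init__(self) -> None:
        self._items: Dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def load(self, key: str) -> bytes:
        return self._items[key]

    def keys(self) -> Iterable[str]:
        return list(self._items)


def copy_model(source: StorageBackend, destination: StorageBackend) -> int:
    """Copy every stored element from one backend to another; return the count."""
    count = 0
    for key in source.keys():
        destination.save(key, source.load(key))
        count += 1
    return count


# Usage: two backends used side by side with the same content.
src = InMemoryBackend()
src.save("dataset-tree", b"tree-data")
src.save("uuid-set", b"uuid-data")
dst = InMemoryBackend()
copied = copy_model(src, dst)
```

Because `copy_model` only relies on the abstract interface, it would work unchanged for any pair of backends that implement it.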

Test it with datalad-metalad

There is a fork of datalad-metalad (aka metalad) at https://github.com/christian-monch/datalad-metalad with the branch "metadata_model". This branch uses the metadata model to operate on metadata.

Currently, the metadata_model branch of datalad-metalad implements the following commands based on the model:

  • meta-dump
  • ... (more to come)

Consequently, there is also a repository that contains "test" metadata (created with the mdc tool in this distribution).

Installation instructions

(These instructions were tested on Debian 10.) Create a virtual environment, activate it, and upgrade pip, e.g.:

python3 -m venv ~/venv/datalad-metadata-model
source ~/venv/datalad-metadata-model/bin/activate
pip install --upgrade pip

Clone datalad-metalad and check out the branch "metadata_model":

git clone https://github.com/christian-monch/datalad-metalad
cd datalad-metalad
git checkout metadata_model

Install the checked-out version of metalad:

pip install -r requirements.txt

Invoking datalad meta-dump should now output:

[WARNING] No git-mapped datalad metadata model found in: .

Now clone the demo-metadata repository into a directory of your choice:

git clone https://github.com/christian-monch/datalad-metadata-demo-2.git

Change into the directory and fetch the remote references that hold the metadata:

cd datalad-metadata-demo-2
git fetch origin refs/datalad/dataset-tree-version-list:refs/datalad/dataset-tree-version-list
git fetch origin refs/datalad/dataset-uuid-set:refs/datalad/dataset-uuid-set
git fetch origin refs/datalad/object-references/dataset-tree:refs/datalad/object-references/dataset-tree
git fetch origin refs/datalad/object-references/file-tree:refs/datalad/object-references/file-tree
git fetch origin refs/datalad/object-references/metadata:refs/datalad/object-references/metadata
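The five fetch commands above follow one pattern: each reference under refs/datalad/ is fetched into a local reference of the same name. A small helper can generate them. This is a convenience sketch; `fetch_commands` and `run_fetches` are hypothetical helpers, not part of datalad or metalad, and `run_fetches` assumes that `git` is available on the PATH.

```python
import subprocess
from typing import List

# Reference names taken from the fetch commands above.
REF_NAMES = [
    "dataset-tree-version-list",
    "dataset-uuid-set",
    "object-references/dataset-tree",
    "object-references/file-tree",
    "object-references/metadata",
]


def fetch_commands(remote: str = "origin") -> List[List[str]]:
    """Build one `git fetch` command per metadata reference."""
    commands = []
    for name in REF_NAMES:
        ref = f"refs/datalad/{name}"
        # The refspec <src>:<dst> stores the remote ref under the same local name.
        commands.append(["git", "fetch", remote, f"{ref}:{ref}"])
    return commands


def run_fetches(repo_dir: str, remote: str = "origin") -> None:
    """Run the fetch commands inside repo_dir; raises if any fetch fails."""
    for command in fetch_commands(remote):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Calling `run_fetches("datalad-metadata-demo-2")` from the directory that contains the clone would then be equivalent to typing the five commands by hand.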

Invocation

Now you are all set to give it a try. Execute:

datalad -f json_pp meta-dump -r

That should output a few JSON objects describing datasets and files.
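If you want to process that output in a script, the following sketch splits a stream of pretty-printed JSON objects into Python dictionaries. It assumes that the objects are written one after another, separated only by whitespace. The sample text and its "type" key are invented for illustration and do not reflect the actual fields that meta-dump emits.

```python
import json
from typing import Iterator


def iter_json_objects(text: str) -> Iterator[dict]:
    """Yield each top-level JSON value found in a whitespace-separated stream."""
    decoder = json.JSONDecoder()
    pos = 0
    length = len(text)
    while pos < length:
        # raw_decode does not accept leading whitespace, so skip it first.
        while pos < length and text[pos].isspace():
            pos += 1
        if pos >= length:
            break
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


# Invented sample standing in for captured command output.
sample = '{\n  "type": "dataset"\n}\n{\n  "type": "file"\n}\n'
records = list(iter_json_objects(sample))
```

The text passed to `iter_json_objects` could come from redirecting the command's standard output to a file and reading that file.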

(The metadata was created with the "mdc" tool that comes with the datalad-metadata-model package. The dataset hierarchy and file names are taken from a local clone of the datasets.datalad.org dataset.)