
Running the PRObs system

The PRObs system can perform different operations. Each of them is encoded in a separate RDFox master script.

Modules

Ontology conversion

Converts the Turtle ontology into Functional-Style OWL.

How to execute it:

RDFox sandbox <root> 'exec scripts/ontology-conversion/master'

where <root> is the path to the "Ontologies" folder (use . if you are already inside it).
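
For example, if you are already inside the "Ontologies" folder:

RDFox sandbox . 'exec scripts/ontology-conversion/master'

The other RDFox modules below are invoked in the same way, each with its own master script path.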

Data pre-processing

Converts 'raw' data into CSV files for RDFox.

How to execute it:

python Ontologies/scripts/preprocess.py

Data conversion

Reads the CSV files and converts them into RDF (probs_original_data).

How to execute it:

RDFox sandbox <root> 'exec scripts/data-conversion/master'

where <root> is the path to the "Ontologies" folder (use . if you are already inside it).

Data validation

Reads the RDF file (probs_original_data) and checks whether a set of constraints is satisfied.

How to execute it:

RDFox sandbox <root> 'exec scripts/data-validation/master'

where <root> is the path to the "Ontologies" folder (use . if you are already inside it).

Data enhancement

Reads the RDF file (probs_original_data), applies all the enhancement rules, and writes the enhanced data out as RDF (probs_enhanced_data).

How to execute it:

RDFox sandbox <root> 'exec scripts/data-enhancement/master'

where <root> is the path to the "Ontologies" folder (use . if you are already inside it).

Test queries

Reads the RDF file with the data (probs_enhanced_data) and answers the test queries.

How to execute it:

RDFox sandbox <root> 'exec scripts/test-queries/master'

where <root> is the path to the "Ontologies" folder (use . if you are already inside it).

Reasoning

Reads the RDF file with the data (probs_enhanced_data), adds the reasoning rules, and opens the SPARQL endpoint.

How to execute it:

RDFox sandbox <root> 'exec scripts/reasoning/master'

where <root> is the path to the "Ontologies" folder (use . if you are already inside it).

Then go to http://localhost:12110/console/default to run your SPARQL queries.
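
As a quick check that the data has loaded, any generic SPARQL query will do in the console; for example, the following lists a few triples and assumes nothing about the ontology's namespaces:

# List a handful of triples to confirm the store is populated.
SELECT ?s ?p ?o
WHERE { ?s ?p ?o }
LIMIT 10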

Operations

Get an RDFox-friendly version of the ontology

Simply run the ontology conversion module.

Convert data from CSV files (or other data sources supported by RDFox) to an RDF file compatible with the PRObs ontology

If you need only one load_data file and one map file (this should generally be the case):

  1. Overwrite the load_data.rdfox and the map.dlog files in "data-conversion" (keeping the same names)
  2. Run the data conversion module

If you need a more complex loading setup (this can happen, but we do not have an example of it at the moment):

  1. Overwrite the input.rdfox file in "data-conversion" (keeping the same name)
  2. Run the data conversion module

Check if the data are "valid"

  1. Run the data validation module
  2. Check that all the queries return 0 answers

Add another validation check

  1. Add a file "check_CUSTOM-NAME_rules" with the rules to check
  2. Add a file "check_CUSTOM-NAME_queries" with the ASK queries to check (note that a query must return no answers when the check passes)
  3. Add a new command exec check CUSTOM-NAME in the "validate" file (see the example after this list)
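
For example, for a hypothetical check called units (the name is purely illustrative), you would add the files check_units_rules and check_units_queries, and then add the following line to the "validate" file:

exec check units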

Enhance an RDF file containing data compatible with the PRObs ontology

  1. Save the file as probs_original_data.nt.gz in the data folder
  2. Run the data enhancement module

Run the test queries on an RDF file containing data compatible with the PRObs ontology

  1. Save the file as probs_enhanced_data.nt.gz in the data folder
  2. Run the test queries module

Open an endpoint to run SPARQL queries on an RDF file containing data compatible with the PRObs ontology

  1. Save the file as probs_enhanced_data.nt.gz in the data folder
  2. Run the reasoning module

Execute everything all at once

You should (almost) never need this, but you could achieve it by simply running:

doit
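
The pipeline is driven by doit, so the standard doit command-line options should apply; for example, to list the defined tasks or run a single one (TASK-NAME is a placeholder for a task shown by doit list):

doit list
doit TASK-NAME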

Execute from one specific step onwards

We also provide master-pipeline scripts if you want to execute multiple commands without saving intermediate files.

To achieve this, use the master-pipeline version of the module you are interested in instead of the master one.
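
For example, assuming the master-pipeline scripts sit alongside the master scripts shown above, starting from the data conversion step onwards would look something like:

RDFox sandbox <root> 'exec scripts/data-conversion/master-pipeline'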

Relationships

DOT code
digraph G {
    
  subgraph cluster_ontology_conversion {
    label = <<B>Ontology-conversion</B>>;
    colorscheme=paired10;
    color=1;
    
    original_ontology -> ontology_conversion;
    ontology_conversion -> ontology_fss;
  }
  
  subgraph cluster_data_pre_processing {
    label = <<B>Data pre-processing</B>>;
    colorscheme=paired10;
    color=2;
    
    raw_data -> data_pre_processing;
    data_pre_processing -> csv_files;
  }
  
  subgraph cluster_data_conversion {
    label = <<B>Data conversion</B>>;
    colorscheme=paired10;
    color=3;
    
    ontology_fss -> data_conversion;
    csv_files -> data_conversion;
    data_conversion -> probs_original_data;
  }
  
  subgraph cluster_data_enhancement {
    label = <<B>Data enhancement</B>>;
    colorscheme=paired10;
    color=4;
    
    ontology_fss -> data_enhancement;
    probs_original_data -> data_enhancement;
    enhancement_rules -> data_enhancement;
    data_enhancement -> probs_enhanced_data;
  }
  
  subgraph cluster_test_queries {
    label = <<B>Test queries</B>>;
    colorscheme=paired10;
    color=5;
    
    probs_enhanced_data -> test_queries;
    test_queries -> answers;
  }
  
  subgraph cluster_reasoning {
    label = <<B>Reasoning</B>>;
    colorscheme=paired10;
    color=6;
    
    probs_enhanced_data -> reasoning;
    reasoning_rules -> reasoning;
    reasoning -> endpoint;
  }

    original_ontology [label="original ontology", shape=tripleoctagon]
    ontology_conversion [label="ontology conversion", shape=rectangle]
    ontology_fss [label="Functional-Style OWL ontology", shape=tripleoctagon]

    raw_data [label="raw_data", shape=cylinder]
    data_pre_processing [label="data pre-processing", shape=rectangle]
    csv_files [label="data", shape=folder]

    data_conversion [label="data conversion", shape=rectangle]
    probs_original_data [label="probs_original_data", shape=tripleoctagon]

    enhancement_rules [label="enhancement rules", shape=hexagon]
    data_enhancement [label="data enhancement", shape=rectangle]
    probs_enhanced_data [label="probs_enhanced_data", shape=tripleoctagon]

    test_queries [label="test queries", shape=rectangle]
    answers [label="output", shape=folder]

    reasoning_rules [label="reasoning/rules", shape=hexagon]
    reasoning [label="reasoning", shape=rectangle]
    endpoint [label="endpoint", shape=component]

  subgraph cluster_legend {
    label = <<B>Legend</B>>;
    colorscheme=paired10;
    color=7;
    // rankdir=TB;
    // {rank = same; rdf_owl process }
    
    rdf_owl [label="RDF/OWL", shape=tripleoctagon]
    process [label="Process", shape=rectangle]
    datasource [label="Datasource", shape=cylinder]
    folder [label="Multiple files", shape=folder]
    datalog [label="Datalog", shape=hexagon]
    rdfox_endpoint [label="RDFox endpoint", shape=component]
    
    edge[style=invis];
    rdf_owl -> process -> datasource -> folder -> datalog -> rdfox_endpoint
  }

}

Relationships diagram