The PRObs system can perform different operations.
Each of them is encoded in a separate RDFox master
script.
The ontology conversion module converts the Turtle ontology into Functional-Style OWL. How to execute it:

```
RDFox sandbox <root> 'exec scripts/ontology-conversion/master'
```

where `<root>` is the path to the "Ontologies" folder (`.` if you are inside it).
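As a purely illustrative example (the prefix and class name below are hypothetical, not taken from the actual ontology), a declaration written in Turtle such as:

```turtle
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix ex:  <http://example.org/probs#> .   # hypothetical prefix, for illustration only

ex:Process a owl:Class .                     # a class declared in Turtle
```

would come out of the conversion in Functional-Style syntax roughly as `Declaration(Class(ex:Process))`.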
The data pre-processing module converts 'raw' data into CSV files for RDFox. How to execute it:

```
python Ontologies/scripts/preprocess.py
```
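As a rough sketch of the kind of step this involves (the file names and columns below are hypothetical; the real `preprocess.py` may work quite differently):

```python
# Illustrative only: read a hypothetical raw table and write a cleaned CSV
# of the kind the data conversion module can import.
import pandas as pd

raw = pd.read_csv("raw_data/production_raw.csv")        # hypothetical raw input
raw = raw.rename(columns={"Value in kg": "value_kg"})   # normalise column names
raw = raw.dropna(subset=["value_kg"])                   # drop unusable rows
raw.to_csv("data/production.csv", index=False)          # CSV ready for RDFox
```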
The data conversion module reads the CSV files and converts them into RDF (`probs_original_data`). How to execute it:

```
RDFox sandbox <root> 'exec scripts/data-conversion/master'
```

where `<root>` is the path to the "Ontologies" folder (`.` if you are inside it).
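For orientation only, the output is ordinary RDF; the resource and property names below are hypothetical, not the actual PRObs vocabulary (and shown in Turtle for readability):

```turtle
@prefix ex:  <http://example.org/probs#> .               # hypothetical prefix
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical triples produced for one converted CSV row (illustrative names only).
ex:observation-001 a ex:Observation ;
    ex:hasRegion ex:UnitedKingdom ;
    ex:hasValue  "42.0"^^xsd:double .
```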
The data validation module reads the RDF file (`probs_original_data`) and checks whether some constraints are satisfied. How to execute it:

```
RDFox sandbox <root> 'exec scripts/data-validation/master'
```

where `<root>` is the path to the "Ontologies" folder (`.` if you are inside it).
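The checks are expressed as queries that must return no answer when the data is consistent (see the customisation notes below). A minimal sketch of such a check, with hypothetical class and property names:

```sparql
# Hypothetical check: every Observation should carry a value.
# Rows are returned only for violating observations, so an empty result
# means the check passes.
PREFIX ex: <http://example.org/probs#>

SELECT ?obs
WHERE {
  ?obs a ex:Observation .
  FILTER NOT EXISTS { ?obs ex:hasValue ?v }
}
```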
The data enhancement module reads the RDF file (`probs_original_data`), runs all the enhancement rules, and saves the result as RDF (`probs_enhanced_data`). How to execute it:

```
RDFox sandbox <root> 'exec scripts/data-enhancement/master'
```

where `<root>` is the path to the "Ontologies" folder (`.` if you are inside it).
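The enhancement rules live in the repository's `.dlog` files and are written in RDFox's Datalog syntax. A minimal sketch of what such a rule can look like, with hypothetical property names (the `:` prefix is assumed to be declared elsewhere):

```
# Hypothetical enhancement rule in RDFox Datalog syntax: if a whole is
# located in a region, infer that each of its parts is located there too.
# Property names are illustrative, not the actual PRObs vocabulary.
[?part, :locatedIn, ?region] :-
    [?whole, :composedOf, ?part],
    [?whole, :locatedIn, ?region] .
```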
The test queries module reads the RDF file with the enhanced data (`probs_enhanced_data`) and answers some queries. How to execute it:

```
RDFox sandbox <root> 'exec scripts/test-queries/master'
```

where `<root>` is the path to the "Ontologies" folder (`.` if you are inside it).
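A test query is an ordinary SPARQL query evaluated over `probs_enhanced_data`. A minimal sketch, with hypothetical vocabulary names:

```sparql
# Hypothetical test query: total reported value per region.
PREFIX ex: <http://example.org/probs#>

SELECT ?region (SUM(?value) AS ?total)
WHERE {
  ?obs ex:hasRegion ?region ;
       ex:hasValue  ?value .
}
GROUP BY ?region
```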
The reasoning module reads the RDF file with the enhanced data (`probs_enhanced_data`), adds the reasoning rules, and opens the SPARQL endpoint. How to execute it:

```
RDFox sandbox <root> 'exec scripts/reasoning/master'
```

where `<root>` is the path to the "Ontologies" folder (`.` if you are inside it). Then go to http://localhost:12110/console/default to run your SPARQL queries.
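Once the endpoint is up, any SPARQL query can be pasted into the console; for example, this vocabulary-independent query simply counts the triples visible to the endpoint as a quick sanity check:

```sparql
# Count all triples (loaded and inferred) visible to the endpoint.
SELECT (COUNT(*) AS ?n)
WHERE { ?s ?p ?o }
```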
To convert only the ontology, simply run the ontology conversion module.

To load a new data source:

- If you need only one `load_data` file and one `map` file (this should generally be the case): change the `load_data.rdfox` and the `map.dlog` files in "data-conversion" (obviously, keeping the same names).
- If you need a more complex loading: it might happen, but we do not have any example of this at the moment.

To add a new validation check, add the `ASK` queries you want to check (note that the queries must return no answer if the check is passed) and add `exec check CUSTOM-NAME` in the "validate" file.

The data conversion module saves `probs_original_data.nt.gz` in the `data` folder, the data enhancement module saves `probs_enhanced_data.nt.gz` in the `data` folder, and the test queries and reasoning modules read `probs_enhanced_data.nt.gz` from the `data` folder.

You should (almost) never need to run the whole pipeline in one go, but you could achieve it by simply running:

```
doit
```
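If you do go down this route, it can help to see what would run first; `doit list` is the standard way to show the available tasks before running them all (assuming the usual doit task-file setup):

```
doit list    # show the tasks doit would run
doit         # run everything
```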
We also provide `master-pipeline` scripts if you want to execute multiple commands without saving intermediate files. To achieve this, use the `master-pipeline` version of the module you are interested in instead of the `master` one, as in the example below.
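For example, picking the data enhancement module and assuming its `master-pipeline` script sits alongside the corresponding `master` script (and that you are inside the "Ontologies" folder):

```
RDFox sandbox . 'exec scripts/data-enhancement/master-pipeline'
```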
The whole pipeline is summarised by the following DOT code:

```dot
digraph G {
subgraph cluster_ontology_conversion {
label = <<B>Ontology-conversion</B>>;
colorscheme=paired10;
color=1;
original_ontology -> ontology_conversion;
ontology_conversion -> ontology_fss;
}
subgraph cluster_data_pre_processing {
label = <<B>Data pre-processing</B>>;
colorscheme=paired10;
color=2;
raw_data -> data_pre_processing;
data_pre_processing -> csv_files;
}
subgraph cluster_data_conversion {
label = <<B>Data conversion</B>>;
colorscheme=paired10;
color=3;
ontology_fss -> data_conversion;
csv_files -> data_conversion;
data_conversion -> probs_original_data;
}
subgraph cluster_data_enhancement {
label = <<B>Data enhancement</B>>;
colorscheme=paired10;
color=4;
ontology_fss -> data_enhancement;
probs_original_data -> data_enhancement;
enhancement_rules -> data_enhancement;
data_enhancement -> probs_enhanced_data;
}
subgraph cluster_test_queries {
label = <<B>Test queries</B>>;
colorscheme=paired10;
color=5;
probs_enhanced_data -> test_queries;
test_queries -> answers;
}
subgraph cluster_reasoning {
label = <<B>Reasoning</B>>;
colorscheme=paired10;
color=6;
probs_enhanced_data -> reasoning;
reasoning_rules -> reasoning;
reasoning -> endpoint;
}
original_ontology [label="original ontology", shape=tripleoctagon]
ontology_conversion [label="ontology conversion", shape=rectangle]
ontology_fss [label="Functional-Style OWL ontology", shape=tripleoctagon]
raw_data [label="raw_data", shape=cylinder]
data_pre_processing [label="data pre-processing", shape=rectangle]
csv_files [label="data", shape=folder]
data_conversion [label="data conversion", shape=rectangle]
probs_original_data [label="probs_original_data", shape=tripleoctagon]
enhancement_rules [label="enhancement rules", shape=hexagon]
data_enhancement [label="data enhancement", shape=rectangle]
probs_enhanced_data [label="probs_enhanced_data", shape=tripleoctagon]
test_queries [label="test queries", shape=rectangle]
answers [label="output", shape=folder]
reasoning_rules [label="reasoning/rules", shape=hexagon]
reasoning [label="reasoning", shape=rectangle]
endpoint [label="endpoint", shape=component]
subgraph cluster_legend {
label = <<B>Legend</B>>;
colorscheme=paired10;
color=7;
// rankdir=TB;
// {rank = same; rdf_owl process }
rdf_owl [label="RDF/OWL", shape=tripleoctagon]
process [label="Process", shape=rectangle]
datasource [label="Datasource", shape=cylinder]
folder [label="Multiple files", shape=folder]
datalog [label="Datalog", shape=hexagon]
rdfox_endpoint [label="RDFox endpoint", shape=component]
edge[style=invis];
rdf_owl -> process -> datasource -> folder -> datalog -> rdfox_endpoint
}
}
```