MaRDI4NFDI/MathModDB-Importer

MathModDB Importer

Import data from the MathModDB OWL ontology into the MaRDI Portal.

The script parses the MathModDB OWL file and creates/updates items (mathematical models, formulations, quantities, research fields, research problems, computational tasks) on the MaRDI portal, including their properties and inverse relations.

Prerequisites

  • Python >= 3.11
  • uv (handles dependencies automatically)

No manual installation of dependencies is needed. Running the script with uv run will automatically install rdflib and mardiclient.

Project structure

importer/
├── import_mathmoddb.py       # Main script
├── config.json               # MaRDI portal property/item mappings
├── mardi.json                # MathModDB individual -> QID mapping (auto-generated)
├── mathmoddb.owl             # The OWL ontology file
├── mapping.md                # Documentation of OWL-to-portal mappings
├── pyproject.toml
├── .env.example
└── README.md

Configuration

The config.json file contains the connection settings and all mappings between the MathModDB ontology and the MaRDI portal:

Connection settings

Key                  Description
wikibase_host        Portal hostname. API endpoints are derived from this unless overridden below.
mediawiki_api_url    (optional) Full MediaWiki API URL. Default: https://<wikibase_host>/w/api.php
sparql_endpoint_url  (optional) Full SPARQL endpoint URL. Default: https://query.<wikibase_host>/sparql
wikibase_url         (optional) Full Wikibase URL. Default: https://<wikibase_host>
importer_api_url     (optional) MaRDI importer API URL. Default: https://importer.<wikibase_host>
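
The default derivation described in the table can be sketched as follows. This is a minimal illustration, not the importer's actual code: the helper name resolve_endpoints and the host portal.example.org are hypothetical, and the sketch only covers keys that are absent from the config (how the script treats an empty-string value is not covered here).

```python
def resolve_endpoints(config: dict) -> dict:
    """Return the four connection URLs, deriving defaults from wikibase_host.

    Hypothetical helper: a key present in the config is used as-is,
    an absent key falls back to the default listed in the table.
    """
    host = config["wikibase_host"]
    defaults = {
        "mediawiki_api_url": f"https://{host}/w/api.php",
        "sparql_endpoint_url": f"https://query.{host}/sparql",
        "wikibase_url": f"https://{host}",
        "importer_api_url": f"https://importer.{host}",
    }
    return {key: config.get(key, default) for key, default in defaults.items()}


# Example: only the SPARQL endpoint is overridden.
example_config = {
    "wikibase_host": "portal.example.org",  # placeholder hostname
    "sparql_endpoint_url": "http://localhost:9999/sparql",
}
endpoints = resolve_endpoints(example_config)
```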

Entity mappings

Key                          Description
instance_mapping             Maps OWL class names to their "instance of" QIDs.
profile_mapping              Maps OWL class names to their MaRDI profile type QIDs.
property_mapping             Maps OWL object property names and identifier types to Wikibase property IDs.
describes_qualifier_mapping  Maps "describedAs..." relation names to qualifier QIDs (for publication links).
containment_role_mapping     Maps typed containment properties (e.g. containsBoundaryCondition) to role QIDs.
data_property_mapping        Maps boolean data properties (e.g. isDeterministic) to Wikibase property IDs.
community_item               QID of the MathModDB community item.
instance_of_property         Property ID for "instance of" claims.
profile_type_property        Property ID for MaRDI profile type claims.
community_property           Property ID for community claims.
object_has_role_property     Property ID used as qualifier for role-based claims.
documented_in_property       Property ID for publication-related qualified claims.
contains_property            Property ID for containment claims.
defining_formula_property    Property ID for LaTeX formula claims.
formula_symbol_property      Property ID for symbol definition claims.
quantity_qualifier_property  Qualifier property linking symbols to quantities.
long_description_property    Property ID for long descriptions (rdfs:comment).

Empty string values ("") in the config are skipped during import. Fill them in as the corresponding properties become available in the portal.
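
As an illustration of the empty-string rule, the sketch below filters a mapping before use. The helper name active_entries and the property ID P100 are placeholders, not values from the real config.json, and the script's own filtering may be implemented differently.

```python
def active_entries(mapping: dict[str, str]) -> dict[str, str]:
    """Drop entries whose value is the empty string (not yet available in the portal)."""
    return {name: entity_id for name, entity_id in mapping.items() if entity_id != ""}


# Placeholder data: "models" has no property ID yet, so it is skipped.
property_mapping = {"specializes": "P100", "models": ""}
usable = active_entries(property_mapping)
```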

Mapping file (mardi.json)

This file maps MathModDB individual names to MaRDI portal QIDs. It is auto-generated from mardiID annotations in the OWL file on first run, and updated when new items are created. On fresh instances without pre-existing entities, use --ignore-mardi-ids to ignore both the mardiID annotations and mardi.json.
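
A rough sketch of how such a mapping could be built and persisted is shown below. It rests on several assumptions that the README does not confirm: that mardi.json is a flat JSON object of name-to-QID pairs, that entries already in mardi.json take precedence over OWL annotations, and the function names build_mapping and save_mapping, which are hypothetical.

```python
import json
from pathlib import Path


def build_mapping(owl_ids: dict[str, str], mapping_path: Path,
                  ignore_mardi_ids: bool = False) -> dict[str, str]:
    """Combine QIDs from OWL mardiID annotations with an existing mapping file.

    Assumption: entries from the file win over OWL annotations on conflict.
    """
    if ignore_mardi_ids:
        # Fresh instance: start without any known QIDs.
        return {}
    mapping = dict(owl_ids)
    if mapping_path.exists():
        mapping.update(json.loads(mapping_path.read_text(encoding="utf-8")))
    return mapping


def save_mapping(mapping: dict[str, str], mapping_path: Path) -> None:
    """Write the mapping back, e.g. after new items were created."""
    mapping_path.write_text(json.dumps(mapping, indent=2, sort_keys=True),
                            encoding="utf-8")
```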

Environment variables

Variable           Required                Description
WIKIBASE_USER      Yes (unless --dry-run)  MaRDI portal username (bot or regular account)
WIKIBASE_PASSWORD  Yes (unless --dry-run)  MaRDI portal password

Copy the template and fill in your values:

cp .env.example .env
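
The "required unless --dry-run" rule could be checked along these lines. This is only a sketch: the function name load_credentials is hypothetical, and it assumes the variables are already present in the process environment (how the script reads .env is not described here).

```python
import os
from collections.abc import Mapping


def load_credentials(dry_run: bool, env: Mapping[str, str] = os.environ):
    """Return (user, password), or None when no credentials are needed."""
    if dry_run:
        # A dry run never writes to Wikibase, so no login is required.
        return None
    required = ("WIKIBASE_USER", "WIKIBASE_PASSWORD")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise SystemExit(f"Missing environment variables: {', '.join(missing)}")
    return env["WIKIBASE_USER"], env["WIKIBASE_PASSWORD"]
```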

Usage

Dry run (no credentials needed)

Parses the OWL file and shows what would be created/updated:

uv run import_mathmoddb.py --dry-run

Import to the MaRDI portal

uv run import_mathmoddb.py

Import to a local Wikibase instance

For testing against a local instance with a regular user account:

uv run import_mathmoddb.py --client-login --ignore-mardi-ids

This requires a config.json with the appropriate local PIDs/QIDs and connection URLs (e.g. mediawiki_api_url, importer_api_url set to "").

Use a different OWL file

uv run import_mathmoddb.py --owl-file path/to/mathmoddb.owl

What the script does

  1. Parse OWL -- Extracts all named individuals with their types, labels, descriptions, relationships, formulas, identifiers, and boolean properties.

  2. Build mapping -- Reads mardiID annotations from the OWL to identify individuals already in the portal. Merges with any existing mardi.json. Skipped with --ignore-mardi-ids.

  3. Create items -- For each non-publication individual without an existing QID, creates a new Wikibase item with label, aliases, description, instance-of, profile type, community, and MathModDB identifier claims.

  4. Add properties -- Adds object property claims (specializes, models, etc.), formulas with symbol definitions (avoiding duplicate containment claims for quantities already linked via symbol_represents), typed containment with role qualifiers, publication links with qualifier roles, identifiers (DOI, arXiv, Wikidata QID, QUDT), boolean data properties, and long descriptions.

  5. Add inverse relations -- For properties with inverses (e.g. specializedBy/specializes, containedIn/contains), adds corresponding claims to target items.
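
Step 5 can be pictured with the following sketch, which derives the inverse claim for each (subject, property, object) triple. The pair table and the function name inverse_claims are illustrative assumptions; the importer's real list of inverse pairs and its data structures may differ.

```python
# Illustrative inverse pairs taken from the examples in step 5.
PAIRS = [("specializedBy", "specializes"), ("containedIn", "contains")]

# Make the lookup symmetric so either direction can be resolved.
INVERSE_OF = {a: b for a, b in PAIRS} | {b: a for a, b in PAIRS}


def inverse_claims(claims):
    """For each (subject, prop, target) with a known inverse,
    yield the claim to add on the target item: (target, inverse_prop, subject)."""
    result = []
    for subject, prop, target in claims:
        inverse = INVERSE_OF.get(prop)
        if inverse is not None:
            result.append((target, inverse, subject))
    return result


# Hypothetical individuals: ModelA specializes ModelB, and ModelA models ProblemX.
claims = [("ModelA", "specializes", "ModelB"), ("ModelA", "models", "ProblemX")]
to_add = inverse_claims(claims)
```

Properties without an entry in the table (here, models) produce no inverse claim.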

CLI reference

usage: import_mathmoddb.py [-h] [--owl-file OWL_FILE] [--client-login]
                           [--ignore-mardi-ids] [--dry-run]

options:
  --owl-file          Path to OWL file (default: mathmoddb.owl in script directory)
  --client-login      Use client login instead of bot login (for regular user accounts)
  --ignore-mardi-ids  Ignore mardiID annotations in OWL and mardi.json (for fresh
                      instances without pre-existing entities)
  --dry-run           Parse and prepare data without writing to Wikibase
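
For orientation, an argparse setup that accepts the same four flags might look like the sketch below. It is reconstructed from the usage text only; the function name build_parser and the choice to represent the default OWL path as None (resolved to mathmoddb.owl in the script directory later) are assumptions.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="import_mathmoddb.py")
    parser.add_argument(
        "--owl-file", type=Path, default=None,
        help="Path to OWL file (default: mathmoddb.owl in script directory)")
    parser.add_argument(
        "--client-login", action="store_true",
        help="Use client login instead of bot login (for regular user accounts)")
    parser.add_argument(
        "--ignore-mardi-ids", action="store_true",
        help="Ignore mardiID annotations in OWL and mardi.json")
    parser.add_argument(
        "--dry-run", action="store_true",
        help="Parse and prepare data without writing to Wikibase")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--dry-run", "--owl-file", "path/to/mathmoddb.owl"])
```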
