Ontology-based knowledge graph infrastructure for interoperable atomistic simulation data
Abril Azocar Guzman, Sarath Menon, Tilmann Hickel, and Stefan Sandfeld

TL;DR
This paper introduces an ontology-based knowledge graph infrastructure that standardizes and integrates atomistic simulation data, enhancing data reuse, interoperability, and analysis capabilities.
Contribution
It presents a novel ontology-driven framework for capturing, normalizing, and querying atomistic simulation data and workflows as a comprehensive knowledge graph.
Findings
Integrated over 750,000 triples of simulation data
Enabled cross-dataset analysis of material properties
Represented workflows for provenance and reconstruction
Abstract
The reuse of atomistic simulation data is often limited by heterogeneous formats, incomplete metadata, and a lack of standardized representations of workflows and provenance. Here we present an ontology-based infrastructure for representing and integrating atomistic simulation data as a knowledge graph. The approach combines domain ontologies with a software framework that enables data capture both from existing datasets and directly from simulation workflows at the point of generation. Heterogeneous data from multiple sources are normalized into a common, ontology-aligned representation, enabling consistent querying and analysis across datasets. We demonstrate these capabilities through the integration of grain boundary data, cross-dataset analysis of material properties, and extraction of derived thermodynamic quantities from existing simulations. In addition, workflows are…
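The normalization step described in the abstract can be illustrated with a small sketch. It is not the authors' framework or ontology: the predicate names (`ex:hasElement`, `ex:hasGrainBoundaryEnergy_J_per_m2`, `ex:fromSource`), the two source layouts, and the numeric values are all hypothetical placeholders, and plain Python tuples stand in for an RDF store. The sketch only shows the general idea that records with different keys and units are mapped onto one shared set of triples, after which a single query works across sources.

```python
# Minimal sketch of "normalize heterogeneous records into common triples".
# All vocabulary terms, record layouts, and values are hypothetical
# placeholders, not terms or data from the paper.

Triple = tuple[str, str, object]

HAS_ELEMENT = "ex:hasElement"
HAS_GB_ENERGY = "ex:hasGrainBoundaryEnergy_J_per_m2"
FROM_SOURCE = "ex:fromSource"


def normalize_a(record: dict, source: str) -> list[Triple]:
    """Hypothetical source A: flat keys, energy already in J/m^2."""
    subject = f"ex:{source}/{record['id']}"
    return [
        (subject, HAS_ELEMENT, record["element"]),
        (subject, HAS_GB_ENERGY, float(record["gb_energy"])),
        (subject, FROM_SOURCE, source),
    ]


def normalize_b(record: dict, source: str) -> list[Triple]:
    """Hypothetical source B: nested keys, energy in mJ/m^2."""
    subject = f"ex:{source}/{record['name']}"
    return [
        (subject, HAS_ELEMENT, record["species"]),
        # Convert mJ/m^2 to J/m^2 so both sources share one unit.
        (subject, HAS_GB_ENERGY, record["props"]["E_gb_mJ_m2"] / 1000.0),
        (subject, FROM_SOURCE, source),
    ]


def match(graph, s=None, p=None, o=None) -> list[Triple]:
    """Return triples matching a pattern; None acts as a wildcard."""
    return [
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def gb_energies(graph, element: str) -> list[tuple[str, float]]:
    """Grain boundary energies for one element, across all sources."""
    subjects = {s for s, _, _ in match(graph, p=HAS_ELEMENT, o=element)}
    return sorted(
        (s, o) for s, _, o in match(graph, p=HAS_GB_ENERGY) if s in subjects
    )


# Placeholder records, made up for illustration only.
graph = (
    normalize_a({"id": "gb1", "element": "Fe", "gb_energy": 1.0}, "sourceA")
    + normalize_b(
        {"name": "gb2", "species": "Fe", "props": {"E_gb_mJ_m2": 500.0}},
        "sourceB",
    )
)
result = gb_energies(graph, "Fe")
```

Because both records end up under the same predicates and units, `gb_energies` never needs to know which source a value came from. A real deployment would presumably use an RDF library and the paper's own ontology terms rather than tuples and `ex:` names.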
