A Unified Ontology for Scalable Knowledge Graph-Driven Operational Data Analytics in High-Performance Computing Systems
Junaid Ahmed Khan, Andrea Bartolini

TL;DR
This paper introduces a unified ontology for operational data analytics in HPC systems, enabling semantic interoperability and reducing storage overhead, thus facilitating scalable cross-system telemetry analysis.
Contribution
It presents the first comprehensive ontology model for HPC telemetry data that supports interoperability across diverse systems and optimizes storage efficiency.
Findings
Ontology reduces knowledge graph (KG) storage by up to 38.84%.
Validated with 36 competency questions reflecting real-world needs.
Supports cross-system telemetry analysis in heterogeneous HPC environments.
Abstract
Modern high-performance computing (HPC) systems generate massive volumes of heterogeneous telemetry data from millions of sensors monitoring compute, memory, power, cooling, and storage subsystems. As HPC infrastructures scale to support increasingly complex workloads, including generative AI, the need for efficient, reliable, and interoperable telemetry analysis becomes critical. Operational Data Analytics (ODA) has emerged to address these demands; however, the reliance on schema-less storage solutions limits data accessibility and semantic integration. Ontologies and knowledge graphs (KGs) provide an effective way to enable efficient and expressive data querying by capturing domain semantics, but they face challenges such as significant storage overhead and the limited applicability of existing ontologies, which are often tailored to specific HPC systems only. In this paper, we present…
Taxonomy
Topics: Advanced Graph Neural Networks · Semantic Web and Ontologies · Graph Theory and Algorithms
