Knowledge Graphs for Processing Scientific Data: Challenges and Prospects
Masoud Salehpour, Joseph G. Davis

TL;DR
This paper evaluates four data management systems (Virtuoso, Blazegraph, RDF-3X, and MongoDB) on querying four scientific knowledge graphs (KGs), finds significant limitations and performance variability, and proposes approaches to improve KG processing.
Contribution
It provides a comparative analysis of major data management systems (DMSs) for scientific KGs and discusses the challenges of efficient KG querying along with potential solutions.
Findings
DMSs show limitations in processing complex KG queries.
Performance varies significantly depending on query type.
No single DMS outperforms others across all scenarios.
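To make "query type" concrete, the following is a minimal sketch of two query shapes over a toy in-memory triple store: a single-pattern lookup and a two-hop join. The data, predicate names, and function names are all hypothetical, chosen only for illustration; they are not the paper's workload and not the API of any of the evaluated systems.

```python
from typing import Iterator, Optional

Triple = tuple[str, str, str]

# Hypothetical toy triples (subject, predicate, object); not taken from
# Allie, Cellcycle, DrugBank, or LinkedSPL.
TRIPLES: list[Triple] = [
    ("drug:A", "interactsWith", "drug:B"),
    ("drug:A", "hasLabel", "label:1"),
    ("drug:B", "hasLabel", "label:2"),
    ("label:1", "mentions", "gene:X"),
]


def match(s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> Iterator[Triple]:
    """Yield every triple matching the pattern; None acts as a wildcard."""
    for triple in TRIPLES:
        if ((s is None or triple[0] == s)
                and (p is None or triple[1] == p)
                and (o is None or triple[2] == o)):
            yield triple


def lookup_labels(drug: str) -> list[str]:
    """Single-pattern lookup: all labels attached to one drug."""
    return [obj for _, _, obj in match(s=drug, p="hasLabel")]


def genes_via_label(drug: str) -> list[str]:
    """Two-hop join: drug -hasLabel-> label -mentions-> gene."""
    return [gene
            for label in lookup_labels(drug)
            for _, _, gene in match(s=label, p="mentions")]


if __name__ == "__main__":
    print(lookup_labels("drug:A"))
    print(genes_via_label("drug:A"))
```

The join touches the store once per intermediate result, while the lookup scans it once. Differences of this kind are one plausible reason a system tuned for one query shape may lag on another, although the sketch says nothing about how the real DMSs execute queries.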
Abstract
There is growing interest in the use of Knowledge Graphs (KGs) for the representation, exchange, and reuse of scientific data. While KGs offer the prospect of improving the infrastructure for working with scalable and reusable scholarly data consistent with the FAIR (Findability, Accessibility, Interoperability, and Reusability) principles, the state-of-the-art Data Management Systems (DMSs) for processing large KGs leave something to be desired. In this paper, we studied the performance of some of the major DMSs in the context of querying KGs, with the goal of providing a fine-grained, comparative analysis of DMSs representing each of the four major DMS types. We experimented with four well-known scientific KGs, namely Allie, Cellcycle, DrugBank, and LinkedSPL, against Virtuoso, Blazegraph, RDF-3X, and MongoDB as the representative DMSs. Our results suggest that the DMSs display…
Taxonomy
Topics: Scientific Computing and Data Management · Data Quality and Management · Research Data Management Practices
