A Polystore Architecture Using Knowledge Graphs to Support Queries on Heterogeneous Data Stores
Leonardo Guerreiro Azevedo, Renan Francisco Santos Souza, Elton F. de S. Soares, Raphael M. Thiago, Julio Cesar Cardoso Tesolin, Ann C. Oliveira, Marcio Ferreira Moreno

TL;DR
This paper introduces a polystore architecture leveraging knowledge graphs to enable integrated querying across heterogeneous data stores, improving query simplicity with minimal performance overhead, demonstrated in an Oil & Gas industry scenario.
Contribution
It presents a novel federated database architecture using knowledge graphs and provenance to unify heterogeneous data sources with a global schema.
Findings
Query complexity is reduced by 50% compared to traditional systems.
Query processing time increases by no more than 30%.
The architecture effectively supports integrated queries on diverse data stores.
Abstract
Modern applications commonly need to manage dataset types composed of heterogeneous data and schemas, which makes it difficult to access them in an integrated way. A single data store that manages heterogeneous data under a common data model is not effective in such a scenario, so the domain data ends up fragmented across the data stores that best fit its storage and access requirements (e.g., NoSQL, relational DBMS, or HDFS). Moreover, organizational workflows consume these fragments independently, and there is usually no explicit link among the fragments that could support an integrated view. The research challenge tackled by this work is to provide the means to query heterogeneous data residing in distinct data repositories that are not explicitly connected. We propose a federated database architecture by providing a single abstract global conceptual schema to users,…
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Data Quality and Management
