Hierarchical Characteristic Set Merging for Optimizing SPARQL Queries in Heterogeneous RDF
Marios Meimaris, George Papastefanatos

TL;DR
This paper introduces a hierarchical merging technique for characteristic sets in RDF data that reduces the effects of schema heterogeneity and improves storage and query performance in SPARQL processing.
Contribution
It proposes a novel hierarchy-based merging method for characteristic sets, effectively managing schema heterogeneity in RDF datasets.
Findings
Significant reduction in the number of characteristic sets.
Improved query performance and storage efficiency.
Effective handling of schema heterogeneity in RDF data.
Abstract
Characteristic sets (CS) organize RDF triples based on the set of properties characterizing their subject nodes. This concept has recently been used in indexing techniques, as it can capture the implicit schema of RDF data. While most CS-based approaches yield significant improvements in space and query performance, they fail to perform well in the presence of schema heterogeneity, i.e., when the number of CSs becomes very large, resulting in a highly partitioned data organization. In this paper, we address this problem by introducing a novel technique for merging CSs based on their hierarchical structure. Our technique employs a lattice to capture the hierarchical relationships between CSs, identifies dense CSs, and merges dense CSs with their ancestors, thus reducing the size of the CSs as well as the links between them. We implemented our algorithm on top of a relational backbone, where…
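To make the idea in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It groups subjects by their property set (their characteristic set) and then folds each sparse CS into a dense CS whose property set strictly contains it. Several details are assumptions made for illustration only: the function names `characteristic_sets` and `merge_into_dense`, the `min_subjects` density threshold, treating a CS as an ancestor of another when its properties are a strict subset, and the tie-break that picks the smallest dense superset.

```python
from collections import defaultdict


def characteristic_sets(triples):
    """Group subjects by the set of properties they emit (their CS)."""
    props = defaultdict(set)
    for s, p, _o in triples:
        props[s].add(p)
    groups = defaultdict(set)
    for s, ps in props.items():
        groups[frozenset(ps)].add(s)
    return dict(groups)


def merge_into_dense(groups, min_subjects):
    """Fold each sparse CS into a dense CS that strictly contains it.

    Assumption: a CS is "dense" when it has at least `min_subjects`
    subjects. This threshold and the tie-break below are hypothetical.
    """
    dense = {cs for cs, subs in groups.items() if len(subs) >= min_subjects}
    merged = {cs: set(groups[cs]) for cs in dense}
    for cs, subs in groups.items():
        if cs in dense:
            continue
        # Dense CSs whose property set is a strict superset of this one.
        targets = [d for d in dense if cs < d]
        if targets:
            # Hypothetical tie-break: smallest superset, then sorted names.
            target = min(targets, key=lambda d: (len(d), sorted(d)))
            merged[target] |= subs
        else:
            # No dense superset exists, so the CS is kept unmerged.
            merged[cs] = set(subs)
    return merged


triples = [
    ("a", "name", "A"), ("a", "email", "x"),
    ("b", "name", "B"), ("b", "email", "y"),
    ("c", "name", "C"),
    ("d", "name", "D"), ("d", "phone", "1"),
]
groups = characteristic_sets(triples)
merged = merge_into_dense(groups, min_subjects=2)
```

In this toy data, subject `c` (only `name`) is absorbed by the dense CS `{name, email}`, where it would carry a NULL for `email` in a relational layout, while `d` stays separate because `{name, phone}` has no dense superset. The number of CSs drops from three to two.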
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Biomedical Text Mining and Ontologies
