A role-free approach to indexing large RDF data sets in secondary memory for efficient SPARQL evaluation
George H. L. Fletcher, Peter W. Beck

TL;DR
This paper introduces TripleT, a secondary-memory index for RDF data that indexes individual atoms regardless of the role (subject, predicate, or object) they play in a triple. The authors report multiple orders of magnitude improvement over existing RDF indexes in both storage and query processing costs.
Contribution
The paper presents TripleT, a role-free, atom-based indexing technique for RDF data that enhances query performance and reduces storage costs compared to prior approaches.
Findings
TripleT improves query processing cost by multiple orders of magnitude over the state of the art in RDF indexing.
TripleT also reduces storage cost relative to prior RDF indexes.
Both results are supported by an extensive empirical evaluation against existing RDF indexes.
Abstract
Massive RDF data sets are becoming commonplace. RDF data is typically generated in social semantic domains (such as personal information management) wherein a fixed schema is often not available a priori. We propose a simple Three-way Triple Tree (TripleT) secondary-memory indexing technique to facilitate efficient SPARQL query evaluation on such data sets. The novelty of TripleT is that (1) the index is built over the atoms occurring in the data set, rather than at a coarser granularity, such as whole triples occurring in the data set; and (2) the atoms are indexed regardless of the roles (i.e., subjects, predicates, or objects) they play in the triples of the data set. We show through extensive empirical evaluation that TripleT exhibits multiple orders of magnitude improvement over the state of the art on RDF indexing, in terms of both storage and query processing costs.
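To make the two ideas in the abstract concrete (indexing at the granularity of atoms, and ignoring the role an atom plays), here is a minimal sketch in Python. It is not the authors' implementation: it uses an in-memory dictionary where TripleT uses a secondary-memory structure, and the class name `RoleFreeIndex`, the method names, and the three-bucket entry layout are assumptions made for illustration only.

```python
from collections import defaultdict


class RoleFreeIndex:
    """Toy in-memory stand-in for a role-free atom index.

    Hypothetical illustration only; a real index of this kind would
    live in secondary memory rather than in a Python dict.
    """

    def __init__(self):
        # One entry per atom, whatever role it plays. Each entry keeps
        # three buckets holding the *other* two atoms of each triple.
        self._entries = defaultdict(lambda: {"s": set(), "p": set(), "o": set()})

    def add(self, s, p, o):
        """Register one triple under each of its three atoms."""
        self._entries[s]["s"].add((p, o))
        self._entries[p]["p"].add((s, o))
        self._entries[o]["o"].add((s, p))

    def triples_with(self, atom):
        """Return every triple that mentions `atom` in any role."""
        entry = self._entries.get(atom)  # .get avoids creating empty entries
        if entry is None:
            return set()
        found = {(atom, p, o) for (p, o) in entry["s"]}
        found |= {(s, atom, o) for (s, o) in entry["p"]}
        found |= {(s, p, atom) for (s, p) in entry["o"]}
        return found


index = RoleFreeIndex()
index.add("alice", "knows", "bob")
index.add("bob", "knows", "carol")
index.add("knows", "type", "Property")  # "knows" appears here as a subject

matches = index.triples_with("knows")
```

In this sketch a single lookup on `"knows"` reaches all three triples, even though the atom is a predicate in two of them and a subject in the third. An index keyed on whole triples in a fixed role order would typically need a separate structure per ordering to answer the same question.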
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Database Systems and Queries · Data Management and Algorithms
