Estimating the Cardinality of Conjunctive Queries over RDF Data Using Graph Summarisation
Giorgio Stefanoni, Boris Motik, Egor V. Kostylev

TL;DR
This paper introduces a graph summarisation-based method for estimating the number of answers to conjunctive queries over RDF data, a task made difficult by the fact that such queries are navigational and tend to involve many joins.
Contribution
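It interprets a summary of an RDF graph under a possible world semantics, formalises cardinality estimation as computing the expected number of answers over all RDF graphs the summary represents, and gives a closed-form formula for that expectation for arbitrary queries.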
Findings
In the authors' experiments, the proposed method is more accurate than state-of-the-art techniques.
Its estimates are also more consistent, and the improvement is often by orders of magnitude.
Abstract
Estimating the cardinality (i.e., the number of answers) of conjunctive queries is particularly difficult in RDF systems: queries over RDF data are navigational and thus tend to involve many joins. We present a new, principled cardinality estimation technique based on graph summarisation. We interpret a summary of an RDF graph using a possible world semantics and formalise the estimation problem as computing the expected cardinality over all RDF graphs represented by the summary, and we present a closed-form formula for computing the expectation of arbitrary queries. We also discuss approaches to RDF graph summarisation. Finally, we show empirically that our cardinality technique is more accurate and more consistent, often by orders of magnitude, than the state of the art.
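To make the "expected cardinality over all graphs represented by the summary" idea concrete, here is a minimal Python sketch on a toy model. Everything in it is an assumption made for illustration and is not taken from the paper: a summary is modelled as disjoint buckets of resources plus a count of edges per predicate between two buckets, every placement of those edges is treated as an equally likely possible world, and the names `possible_edge_sets`, `join_count`, and `expected_join_cardinality` are hypothetical. The sketch enumerates all worlds by brute force for the two-pattern query `?x p ?y . ?y q ?z` and compares the result with the simple closed form that falls out of this toy model.

```python
from fractions import Fraction
from itertools import combinations, product


def possible_edge_sets(sources, targets, weight):
    """All ways to place `weight` distinct edges between two buckets."""
    pairs = list(product(sources, targets))
    return list(combinations(pairs, weight))


def join_count(p_edges, q_edges):
    """Answers to `?x p ?y . ?y q ?z`: pairs of edges sharing the middle node."""
    return sum(1 for (_, y1) in p_edges for (y2, _) in q_edges if y1 == y2)


def expected_join_cardinality(a, b, c, w_p, w_q):
    """Average the join count over every possible world (uniform assumption)."""
    worlds = list(product(possible_edge_sets(a, b, w_p),
                          possible_edge_sets(b, c, w_q)))
    total = sum(join_count(p, q) for p, q in worlds)
    return Fraction(total, len(worlds))


# Toy summary (hypothetical): buckets A, B, C; 2 p-edges A->B; 1 q-edge B->C.
A = ["a1", "a2"]
B = ["b1", "b2"]
C = ["c1"]
W_P, W_Q = 2, 1

brute_force = expected_join_cardinality(A, B, C, W_P, W_Q)

# In this toy model the two edge sets are independent, so the expectation
# reduces to w_p * w_q / |B|.
closed_form = Fraction(W_P * W_Q, len(B))

print(brute_force, closed_form)
```

Brute-force enumeration grows exponentially with the bucket sizes, which is why a closed-form expression for the expectation matters in practice; the toy formula here only covers this one query shape under the stated uniformity assumption.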
