TL;DR
This paper introduces the Degree Sequence Bound, a new upper bound on query output size that uses full degree sequences, improving on previous bounds that rely only on maximum degrees.
Contribution
It extends existing cardinality bounding techniques by incorporating full degree sequences and max tuple multiplicity, providing a more precise upper bound for query cardinality estimation.
Findings
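The Degree Sequence Bound improves on earlier degree-constraint bounds, which use only the maximum degree rather than the full degree sequence.
Practical computation of the bound is achieved through learned approximations.
Because the result is a strict upper bound on output size, it avoids the underestimates that lead to overly optimistic query plans.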
Abstract
Recent work has demonstrated the catastrophic effects of poor cardinality estimates on query processing time. In particular, underestimating query cardinality can result in overly optimistic query plans which take orders of magnitude longer to complete than plans generated with the true cardinality. Cardinality bounding avoids this pitfall by computing a strict upper bound on the query's output size using statistics about the database such as table sizes and degrees, i.e., value frequencies. In this paper, we extend this line of work by proving a novel bound called the Degree Sequence Bound, which takes into account the full degree sequences and the max tuple multiplicity. This bound improves upon previous work incorporating degree constraints, which focused on the maximum degree rather than the degree sequence. Further, we describe how to practically compute this bound using a learned…
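To make the difference between a maximum-degree bound and a degree-sequence bound concrete, here is a minimal sketch for the simplest case: a single equi-join of two relations on one column. It is an illustration under stated assumptions, not the paper's general algorithm. It ignores max tuple multiplicity and the learned approximation, and the function names and toy data are hypothetical. The idea it shows is that pairing the two descending degree sequences position by position gives the worst-case join size for relations with those value frequencies.

```python
from collections import Counter


def degree_sequence(column):
    """Value frequencies (degrees) of a join column, sorted in descending order."""
    return sorted(Counter(column).values(), reverse=True)


def max_degree_bound(r_col, s_col):
    """Upper bound that uses only table sizes and the maximum degree of each side."""
    ds_r, ds_s = degree_sequence(r_col), degree_sequence(s_col)
    if not ds_r or not ds_s:
        return 0
    # Each tuple of R joins with at most max-degree(S) tuples of S, and vice versa.
    return min(len(r_col) * ds_s[0], len(s_col) * ds_r[0])


def degree_sequence_bound(r_col, s_col):
    """Upper bound that uses the full degree sequences (two-table case only).

    Assumption: the worst case matches the i-th most frequent value of R
    with the i-th most frequent value of S, so the bound is the sum of
    the position-wise products of the two sorted sequences.
    """
    ds_r, ds_s = degree_sequence(r_col), degree_sequence(s_col)
    return sum(a * b for a, b in zip(ds_r, ds_s))


def true_join_size(r_col, s_col):
    """Exact output size of the equi-join, for comparison."""
    cr, cs = Counter(r_col), Counter(s_col)
    return sum(cr[v] * cs[v] for v in cr)


# Hypothetical toy join columns.
r_col = [1, 1, 1, 2, 2, 3]
s_col = [1, 2, 2, 2, 3, 4]

print(true_join_size(r_col, s_col))         # 10
print(degree_sequence_bound(r_col, s_col))  # 12
print(max_degree_bound(r_col, s_col))       # 18
```

On this toy input both values are valid upper bounds on the true size of 10, but the degree-sequence version (12) is closer than the maximum-degree version (18), because it does not assume every value is as frequent as the most frequent one.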
