Accurate Summary-based Cardinality Estimation Through the Lens of Cardinality Estimation Graphs
Jeremy Chen, Yuqing Huang, Mushi Wang, Semih Salihoglu, Ken Salem

TL;DR
This paper introduces a graph-based framework for improving the accuracy of summary-based cardinality estimators in graph databases, demonstrating significant improvements on various datasets and connecting optimistic and pessimistic estimation methods.
Contribution
It models optimistic estimators as paths in a cardinality estimation graph, proposes heuristics based on query structure, and links optimistic and pessimistic estimators through this graph framework.
Findings
Effective heuristics make optimistic estimators up to three orders of magnitude more accurate.
On acyclic queries and queries with small-size cycles, choosing the maximum-weight path counters the underestimation that optimistic estimators are prone to.
The framework reveals alternative solutions for pessimistic estimators' linear programs.
Abstract
We study two classes of summary-based cardinality estimators that use statistics about input relations and small-size joins in the context of graph database management systems: (i) optimistic estimators that make uniformity and conditional independence assumptions; and (ii) the recent pessimistic estimators that use information theoretic linear programs. We begin by addressing the problem of how to make accurate estimates for optimistic estimators. We model these estimators as picking bottom-to-top paths in a cardinality estimation graph (CEG), which contains sub-queries as nodes and weighted edges between sub-queries that represent average degrees. We outline a space of heuristics to make an optimistic estimate in this framework and show that effective heuristics depend on the structure of the input queries. We observe that on acyclic queries and queries with small-size cycles, using…
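The path-picking view described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the node names, the edge weights, and the helper `path_estimates` are hypothetical, and the CEG is a toy one for a three-relation query. Each bottom-to-top path yields one estimate, the product of its edge weights, and a heuristic then chooses among those estimates (maximum, minimum, or average here).

```python
# Toy cardinality estimation graph (CEG), with made-up numbers.
# Nodes are sub-queries; each edge weight stands for an average degree
# (or, out of the bottom node, a stored sub-query cardinality).
# "EMPTY" is the bottom node and "ABC" is the full query at the top.
CEG = {
    "EMPTY": [("AB", 60.0), ("BC", 40.0)],
    "AB": [("ABC", 2.5)],
    "BC": [("ABC", 3.0)],
    "ABC": [],
}


def path_estimates(ceg, source, target):
    """Return one estimate per source-to-target path.

    The estimate of a path is the product of the weights on its edges.
    """
    estimates = []

    def walk(node, running):
        if node == target:
            estimates.append(running)
            return
        for successor, weight in ceg.get(node, []):
            walk(successor, running * weight)

    walk(source, 1.0)
    return estimates


estimates = path_estimates(CEG, "EMPTY", "ABC")

# Three simple heuristics over the candidate paths.
max_estimate = max(estimates)                    # maximum-weight path
min_estimate = min(estimates)                    # minimum-weight path
avg_estimate = sum(estimates) / len(estimates)   # average over all paths

print(max_estimate, min_estimate, avg_estimate)
```

In this toy graph the two paths disagree (150 versus 120), which is the situation where the choice of heuristic matters. Picking the maximum-weight path is the option the summary above associates with reducing underestimation on acyclic queries and queries with small-size cycles.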
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Management and Algorithms · Data Quality and Management
