CUBE: A Cardinality Estimator Based on Neural CDF
Xiao Yan, Tiezheng Nie, Boyang Fang, Derong Shen, Kou Yue, Yu Ge

TL;DR
This paper introduces CUBE, a neural CDF-based cardinality estimator that provides fast, accurate, and scalable range query estimates without sampling, outperforming existing data-driven methods in speed and stability.
Contribution
CUBE is the first neural CDF-based cardinality estimator that guarantees low-latency, high-accuracy, and scalability for high-dimensional data without sampling or integration.
Findings
Over 10x faster inference than state-of-the-art methods
Maintains high accuracy across increasing data dimensionality
Provides stable and predictable estimation results
Abstract
Modern database optimizers rely on cardinality estimators, whose accuracy directly affects the optimizer's ability to choose an optimal execution plan. Recent data-driven methods have leveraged probabilistic models to achieve higher estimation accuracy, but these approaches cannot guarantee low inference latency at the same time, and they neglect scalability. As data dimensionality grows, optimization time can even exceed actual query execution time. Furthermore, inference with probabilistic models through sampling or integration procedures produces unpredictable estimation results and violates stability, which leads to unstable query execution performance and makes database tuning hard for database users. In this paper, we propose a novel approach to cardinality estimation based on the cumulative distribution function (CDF), which calculates range query cardinality without sampling or integration,…
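To illustrate how a CDF can answer a range query without sampling or integration, here is a minimal sketch of the standard inclusion-exclusion identity over the corners of a query box. It is not CUBE's implementation: the abstract is truncated before the method is described, so the empirical CDF below is only a stand-in for a learned neural CDF, and the names `empirical_cdf` and `range_cardinality` are hypothetical.

```python
from itertools import product

import numpy as np


def empirical_cdf(data):
    """Return F(x) = fraction of rows with every column <= x.

    Assumption: this exact empirical CDF stands in for a learned model.
    """
    def cdf(point):
        return float(np.mean(np.all(data <= np.asarray(point), axis=1)))
    return cdf


def range_cardinality(cdf, lows, highs, n_rows):
    """Estimate the number of rows with lows < row <= highs (per column).

    Uses inclusion-exclusion over the 2^d corners of the query box:
    each corner takes either the low or the high bound per dimension,
    with sign (-1)^(number of low bounds chosen).
    """
    d = len(lows)
    prob = 0.0
    for picks in product((0, 1), repeat=d):
        corner = [highs[i] if picks[i] else lows[i] for i in range(d)]
        sign = (-1) ** (d - sum(picks))
        prob += sign * cdf(corner)
    return prob * n_rows


# Synthetic table (assumption: 10,000 rows, 3 integer columns).
rng = np.random.default_rng(0)
table = rng.integers(0, 100, size=(10_000, 3))
cdf = empirical_cdf(table)

lows, highs = [10, 20, 30], [50, 60, 70]
estimate = range_cardinality(cdf, lows, highs, len(table))
exact = int(np.sum(np.all((table > lows) & (table <= highs), axis=1)))
print(estimate, exact)
```

Because the stand-in CDF is exact, the estimate matches the true count here; with a learned CDF the same arithmetic would give a deterministic approximation. Note that this naive form evaluates the CDF at 2^d corners, and the visible part of the abstract does not say how CUBE handles that cost in high dimensions.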
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Cloud Computing and Resource Management
