NeuroCard: One Cardinality Estimator for All Tables
Zongheng Yang, Amog Kamsetty, Sifei Luan, Eric Liang, Yan Duan, Xi Chen, and Ion Stoica

TL;DR
NeuroCard is a neural density estimator that accurately predicts query result sizes across all database tables by capturing inter-table correlations without independence assumptions, significantly improving over prior methods.
Contribution
It introduces NeuroCard, a neural estimator that models correlations across all tables in a database, overcoming the independence assumptions that limited previous estimators.
Findings
Achieves a new state-of-the-art maximum error of 8.5 on the JOB-light benchmark.
Scales to dozens of tables with fast update times.
Yields a compact model of several MBs.
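To make the "maximum error" figure concrete, here is a minimal sketch. It assumes the reported error is the Q-error commonly used in cardinality estimation (the multiplicative factor by which an estimate is off); the summary above does not name the metric. The workload numbers and the function names `q_error` and `max_q_error` are hypothetical, chosen only for illustration.

```python
def q_error(estimate: float, truth: float) -> float:
    """Multiplicative error factor; 1.0 means a perfect estimate."""
    # Clamp both values to at least 1 so empty results do not divide by zero
    # (a common convention, assumed here).
    est = max(estimate, 1.0)
    tru = max(truth, 1.0)
    return max(est / tru, tru / est)


def max_q_error(pairs):
    """Worst-case Q-error over (estimate, true cardinality) pairs."""
    return max(q_error(e, t) for e, t in pairs)


# Hypothetical (estimate, true) cardinalities for three queries.
workload = [(120.0, 100.0), (10.0, 85.0), (4000.0, 4000.0)]
worst = max_q_error(workload)
print(f"maximum Q-error: {worst:.2f}")
```

Under this reading, a maximum error of 8.5 means that no query in the benchmark was over- or underestimated by more than a factor of 8.5.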
Abstract
Query optimizers rely on accurate cardinality estimates to produce good execution plans. Despite decades of research, existing cardinality estimators are inaccurate for complex queries, due to making lossy modeling assumptions and not capturing inter-table correlations. In this work, we show that it is possible to learn the correlations across all tables in a database without any independence assumptions. We present NeuroCard, a join cardinality estimator that builds a single neural density estimator over an entire database. Leveraging join sampling and modern deep autoregressive models, NeuroCard makes no inter-table or inter-column independence assumptions in its probabilistic modeling. NeuroCard achieves orders of magnitude higher accuracy than the best prior methods (a new state-of-the-art result of 8.5 maximum error on JOB-light), scales to dozens of tables, while being…
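The idea of estimating cardinalities from a joint density without independence assumptions can be sketched on a toy example. This is an illustration, not NeuroCard's implementation: exact count-based conditionals stand in for the neural autoregressive model, and the rows, column names, and `JOIN_SIZE` are all invented. It shows the chain-rule factorization P(genre, year) = P(genre) · P(year | genre) next to an independence-based estimate P(genre) · P(year).

```python
# Hypothetical sample of rows from a joined relation: (genre, year).
join_rows = [
    ("drama", 1990), ("drama", 1990), ("drama", 2005),
    ("comedy", 2005), ("comedy", 2005), ("comedy", 2010),
]
JOIN_SIZE = 6000  # assumed number of rows in the full join


def prob(rows, pred):
    """Fraction of rows satisfying pred."""
    return sum(1 for r in rows if pred(r)) / len(rows)


def autoregressive_selectivity(rows, genre, year):
    """Chain rule: P(genre) * P(year | genre); no independence assumed."""
    p_genre = prob(rows, lambda r: r[0] == genre)
    matching = [r for r in rows if r[0] == genre]
    if not matching:
        return 0.0
    p_year_given_genre = prob(matching, lambda r: r[1] == year)
    return p_genre * p_year_given_genre


def independent_selectivity(rows, genre, year):
    """Baseline that treats the columns as independent: P(genre) * P(year)."""
    return prob(rows, lambda r: r[0] == genre) * prob(rows, lambda r: r[1] == year)


# Cardinality estimate = selectivity * size of the join.
ar_card = JOIN_SIZE * autoregressive_selectivity(join_rows, "drama", 2005)
ind_card = JOIN_SIZE * independent_selectivity(join_rows, "drama", 2005)
print(f"chain rule: {ar_card:.0f}, independence: {ind_card:.0f}")
```

In this toy data, one of the six sampled rows matches the query, so the chain-rule estimate recovers the sample's true fraction, while the independence baseline overestimates because genre and year are correlated.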
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Visualization and Analytics
