Analyzing Query Optimizer Performance in the Presence and Absence of Cardinality Estimates
Asoke Datta, Brian Tsan, Yesdaulet Izenov, Florin Rusu

TL;DR
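This paper compares query optimizer performance with and without cardinality estimates across indexed, non-indexed, and multi-threaded database settings. It finds that estimates have limited impact in non-indexed settings, that inaccurate estimates can mislead operator selection when indexes are present, and that estimates matter less in highly parallel main-memory databases.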
Contribution
The paper provides what the authors believe to be the first analytical comparison of optimizer performance with and without cardinality estimates across indexed, non-indexed, and multi-threaded configurations.
Findings
Cardinality estimates have marginal impact in non-indexed settings.
Inaccurate estimates can lead to sub-optimal operators when indexes are present.
Impact of estimates is less significant in highly-parallel main-memory databases.
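The first two findings can be illustrated with a minimal sketch, assuming a toy cost model that is not taken from the paper. The cost formulas, the function names, and the row counts below are all hypothetical. With no index, only one join operator is available, so a wrong estimate cannot change the choice. With an index, an underestimated outer cardinality makes an index nested-loop join look cheap even though a hash join is cheaper at the true cardinality.

```python
import math

def hash_join_cost(outer_rows, inner_rows):
    # Toy assumption: build on the inner input, probe with the outer,
    # so cost is one pass over each input.
    return outer_rows + inner_rows

def index_nl_join_cost(outer_rows, inner_rows):
    # Toy assumption: one index lookup per outer row,
    # each costing about log2(inner_rows).
    return outer_rows * math.log2(max(inner_rows, 2))

def choose_operator(outer_rows, inner_rows, index_available):
    # Pick the operator with the lowest cost under the given cardinalities.
    candidates = {"hash_join": hash_join_cost(outer_rows, inner_rows)}
    if index_available:
        candidates["index_nl_join"] = index_nl_join_cost(outer_rows, inner_rows)
    return min(candidates, key=candidates.get)

# Hypothetical cardinalities: the optimizer badly underestimates the outer input.
TRUE_OUTER = 500_000
ESTIMATED_OUTER = 1_000
INNER = 1_000_000

# No index: the estimate cannot change the operator choice.
picked_without_index = choose_operator(ESTIMATED_OUTER, INNER, index_available=False)

# Index present: the underestimate selects the index nested-loop join,
# while the true cardinality would have selected the hash join.
picked_with_index = choose_operator(ESTIMATED_OUTER, INNER, index_available=True)
best_with_index = choose_operator(TRUE_OUTER, INNER, index_available=True)

print(picked_without_index, picked_with_index, best_with_index)
```

Real optimizers use far richer cost models, so this only shows the mechanism: an index adds an operator whose cost is highly sensitive to the estimated cardinality.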
Abstract
Most query optimizers rely on cardinality estimates to determine optimal execution plans. While traditional databases such as PostgreSQL, Oracle, and Db2 utilize many types of synopses -- including histograms, samples, and sketches -- recent main-memory databases like DuckDB and Heavy.AI often operate with minimal or no estimates, yet their performance does not necessarily suffer. To the best of our knowledge, no analytical comparison has been conducted between optimizers with and without cardinality estimates to understand their performance characteristics in different settings, such as indexed, non-indexed, and multi-threaded. In this paper, we present a comparative analysis between optimizers that use cardinality estimates and those that do not. We use the Join Order Benchmark (JOB) for our evaluation and true cardinalities as the baseline. Our investigation reveals that cardinality…
Taxonomy
Topics: Cloud Computing and Resource Management · Data Management and Algorithms · Graph Theory and Algorithms
