A Survey on Advancing the DBMS Query Optimizer: Cardinality Estimation, Cost Model, and Plan Enumeration
Hai Lan, Zhifeng Bao, Yuwei Peng

TL;DR
This survey reviews recent advancements in database query optimizer components—cardinality estimation, cost models, and plan enumeration—highlighting challenges and future research directions to improve optimization accuracy and efficiency.
Contribution
It provides a comprehensive review of techniques to enhance key components of cost-based query optimizers and offers insights into future research directions.
Findings
Identifies causes of inaccuracy in cardinality estimation.
Reviews techniques improving cost model precision.
Discusses plan enumeration strategies for complex queries.
Abstract
The query optimizer is at the heart of database systems. The cost-based optimizer studied in this paper is adopted in almost all current database systems. A cost-based optimizer uses a plan enumeration algorithm to find a (sub)plan, uses a cost model to obtain the cost of that plan, and selects the plan with the lowest cost. In the cost model, cardinality, the number of tuples passing through an operator, plays a crucial role. Because of inaccurate cardinality estimation, errors in the cost model, and the huge plan space, the optimizer cannot find the optimal execution plan for a complex query in a reasonable time. In this paper, we first study in depth the causes behind the limitations above. Next, we review the techniques used to improve the quality of the three key components of the cost-based optimizer: cardinality estimation, cost model, and plan enumeration. We also provide our…
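The enumerate, cost, and select loop described in the abstract can be illustrated with a toy example. The following is a minimal sketch, not the survey's method or any real system's optimizer: the table sizes, join selectivities, and function names (`estimate_cardinality`, `plan_cost`, `best_left_deep_plan`) are all hypothetical, the cardinality estimator assumes independent join predicates, the cost model simply sums estimated intermediate result sizes, and enumeration is an exhaustive search over left-deep join orders.

```python
from itertools import permutations

# Hypothetical base-table cardinalities (made-up numbers for illustration).
TABLE_ROWS = {"A": 1000, "B": 100, "C": 10}

# Hypothetical join selectivities; pairs not listed are treated as
# cross products (selectivity 1.0).
JOIN_SELECTIVITY = {
    frozenset({"A", "B"}): 0.01,
    frozenset({"B", "C"}): 0.1,
}


def estimate_cardinality(joined, new_table, current_rows):
    """Estimate the rows produced by joining `new_table` to the tables in
    `joined`, assuming the join predicates are independent."""
    rows = current_rows * TABLE_ROWS[new_table]
    for table in joined:
        rows *= JOIN_SELECTIVITY.get(frozenset({table, new_table}), 1.0)
    return rows


def plan_cost(order):
    """Toy cost model: the sum of estimated intermediate result sizes."""
    joined = [order[0]]
    rows = TABLE_ROWS[order[0]]
    cost = 0.0
    for table in order[1:]:
        rows = estimate_cardinality(joined, table, rows)
        cost += rows
        joined.append(table)
    return cost


def best_left_deep_plan(tables):
    """Enumerate every left-deep join order and keep the cheapest one."""
    return min(permutations(tables), key=plan_cost)


best = best_left_deep_plan(sorted(TABLE_ROWS))
print(best, plan_cost(best))
```

Even in this sketch, the three failure sources named in the abstract are visible: a wrong entry in `JOIN_SELECTIVITY` changes which order wins, the cost function ignores operator and hardware details, and the number of orders grows factorially with the number of tables, which is why exhaustive enumeration does not scale to complex queries.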
