An Empirical Analysis of Deep Learning for Cardinality Estimation

Jennifer Ortiz; Magdalena Balazinska; Johannes Gehrke; S. Sathiya Keerthi

arXiv:1905.06425·cs.DB·September 13, 2019·23 cites

Open Access

TL;DR

This paper empirically evaluates deep learning models for cardinality estimation, reporting 72%-98% lower estimation error than PostgreSQL on average and query runtime reductions of up to 49% on select-project-join workloads, and discusses practical challenges of deployment.

Contribution

It provides the first comprehensive empirical analysis of deep learning for cardinality estimation and explores injecting the models' estimates into a database query optimizer (PostgreSQL).

Findings

01

Simple deep learning models reduce estimation error by 72%-98% on average compared to PostgreSQL.

02

In many cases, cardinalities estimated by these models lead to better query plans, reducing runtimes by up to 49% on select-project-join workloads.

03

Some of the challenges of using deep learning models in practice are discussed and addressed.

Abstract

We implement and evaluate deep learning for cardinality estimation by studying the accuracy, space and time trade-offs across several architectures. We find that simple deep learning models can learn cardinality estimations across a variety of datasets (reducing the error by 72% - 98% on average compared to PostgreSQL). In addition, we empirically evaluate the impact of injecting cardinality estimates produced by deep learning models into the PostgreSQL optimizer. In many cases, the estimates from these models lead to better query plans across all datasets, reducing the runtimes by up to 49% on select-project-join workloads. As promising as these models are, we also discuss and address some of the challenges of using them in practice.

Taxonomy

Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Machine Learning and Algorithms