NN-based Transformation of Any SQL Cardinality Estimator for Handling DISTINCT, AND, OR and NOT
Rojeh Hayek, Oded Shmueli

TL;DR
This paper introduces a deep learning-based approach and a recursive algorithm to extend existing SQL cardinality estimators, enabling accurate estimation of queries with DISTINCT, AND, OR, and NOT operators, which are traditionally challenging.
Contribution
The paper presents PUNQ, a deep learning scheme for predicting unique row percentages, and GenCrd, a recursive algorithm to extend any conjunctive query estimator to handle more complex queries.
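As an illustration of how a conjunctive-only estimator might be extended recursively, the sketch below applies two set identities: |X AND (A OR B)| = |X AND A| + |X AND B| - |X AND A AND B|, and |X AND NOT A| = |X| - |X AND A|. It is a minimal sketch under stated assumptions, not necessarily the GenCrd algorithm as published. The tuple encoding of queries, the names `gen_card`, `exact_conjunctive`, and `distinct_card`, and the toy data are all hypothetical. The toy "estimator" simply counts rows exactly, so the recursion can be checked against the true answer.

```python
from typing import Callable, FrozenSet, Tuple

# Hypothetical query encoding for this sketch (not from the paper):
#   ("pred", name) | ("and", q1, q2) | ("or", q1, q2) | ("not", q)
Estimator = Callable[[FrozenSet[str]], float]


def gen_card(est: Estimator, formulas: Tuple, atoms: FrozenSet[str] = frozenset()) -> float:
    """Estimate the cardinality of (AND of atoms) AND (AND of formulas),
    calling `est` only on purely conjunctive sets of atomic predicates."""
    if not formulas:
        return est(atoms)  # purely conjunctive: delegate to the base estimator
    head, rest = formulas[0], formulas[1:]
    op = head[0]
    if op == "pred":
        return gen_card(est, rest, atoms | {head[1]})
    if op == "and":
        # Flatten the conjunction into the list of pending formulas.
        return gen_card(est, (head[1], head[2]) + rest, atoms)
    if op == "or":
        # Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|.
        a, b = head[1], head[2]
        return (gen_card(est, (a,) + rest, atoms)
                + gen_card(est, (b,) + rest, atoms)
                - gen_card(est, (a, b) + rest, atoms))
    if op == "not":
        # Complement: |not A| = |everything| - |A|.
        return gen_card(est, rest, atoms) - gen_card(est, (head[1],) + rest, atoms)
    raise ValueError(f"unknown operator: {op!r}")


# Toy stand-in for a conjunctive estimator: exact counting over 0..99.
ROWS = range(100)
PREDS = {
    "even": lambda x: x % 2 == 0,
    "lt50": lambda x: x < 50,
    "div3": lambda x: x % 3 == 0,
}


def exact_conjunctive(atoms: FrozenSet[str]) -> float:
    # Number of rows satisfying every predicate in `atoms`.
    return sum(all(PREDS[p](x) for p in atoms) for x in ROWS)


def distinct_card(bag_card: float, unique_fraction: float) -> float:
    # Assumed use of a predicted unique-row fraction: scale the
    # with-duplicates cardinality down to a set-theoretic one.
    return bag_card * unique_fraction


# (even OR lt50) AND NOT div3
query = ("and",
         ("or", ("pred", "even"), ("pred", "lt50")),
         ("not", ("pred", "div3")))
estimate = gen_card(exact_conjunctive, (query,))
```

The number of calls to the base estimator grows with each OR in the query, since every disjunction spawns three recursive branches. With an approximate base estimator, the subtraction steps can also yield negative or inconsistent values, so a practical version would likely need to clamp or otherwise reconcile the intermediate results.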
Findings
The extended estimators give accurate cardinality estimates for queries containing DISTINCT, AND, OR, and NOT.
Existing estimators that count duplicates can be transformed to handle DISTINCT.
Estimation accuracy is maintained after the proposed methods are applied.
Abstract
SQL queries, with the AND, OR, and NOT operators, constitute a broad class of highly used queries. Thus, their cardinality estimation is important for query optimization. In addition, a query planner requires the set-theoretic cardinality (i.e., without duplicates) for queries with DISTINCT as well as in planning; for example, when considering sorting options. Yet, despite the importance of estimating query cardinalities in the presence of DISTINCT, AND, OR, and NOT, many cardinality estimation methods are limited to estimating cardinalities of only conjunctive queries with duplicates counted. The focus of this work is on two methods for handling this deficiency that can be applied to any limited cardinality estimation model. First, we describe a specialized deep learning scheme, PUNQ, which is tailored to representing conjunctive SQL queries and predicting the percentage of unique…
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Web Data Mining and Analysis
