CoLSE: A Lightweight and Robust Hybrid Learned Model for Single-Table Cardinality Estimation using Joint CDF
Lankadinee Rathuwadu, Guanli Liu, Christopher Leckie, Renata Borovica-Gajic

TL;DR
CoLSE is a hybrid machine learning model that efficiently estimates query result sizes by modeling joint data distributions with copulas and neural networks, balancing accuracy, speed, and memory.
Contribution
It introduces a novel hybrid approach combining copula-based joint distribution modeling with neural correction for improved cardinality estimation.
Findings
Outperforms existing models in accuracy and efficiency
Balances trade-offs among accuracy, training time, and model size
Achieves superior inference latency and memory footprint
Abstract
Cardinality estimation (CE), the task of predicting the result size of queries, is a critical component of query optimization. Accurate estimates are essential for generating efficient query execution plans. Recently, machine learning techniques have been applied to CE, broadly categorized into query-driven and data-driven approaches. Data-driven methods learn the joint distribution of data, while query-driven methods construct regression models that map query features to cardinalities. Ideally, a CE technique should strike a balance among three key factors: accuracy, efficiency, and memory footprint. However, existing state-of-the-art models often fail to achieve this balance. To address this, we propose CoLSE, a hybrid learned approach for single-table cardinality estimation. CoLSE directly models the joint probability over queried intervals using a novel algorithm based on copula…
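To make the phrase "joint probability over queried intervals" concrete, the sketch below shows the standard way a joint CDF yields the selectivity of a two-column range query, using inclusion-exclusion over the corners of the query rectangle. It is not CoLSE's algorithm: the abstract is truncated before the method is described, so this example swaps in a plain empirical CDF computed by scanning a toy table. The function names (`joint_cdf`, `range_selectivity`, `estimate_cardinality`) and the data are assumptions made for illustration only.

```python
from typing import List, Tuple

Row = Tuple[float, float]  # a toy two-column table row (X, Y); hypothetical data layout


def joint_cdf(rows: List[Row], x: float, y: float) -> float:
    """Empirical joint CDF F(x, y) = P(X <= x, Y <= y), computed by a full scan."""
    if not rows:
        return 0.0
    hits = sum(1 for rx, ry in rows if rx <= x and ry <= y)
    return hits / len(rows)


def range_selectivity(rows: List[Row], x_lo: float, x_hi: float,
                      y_lo: float, y_hi: float) -> float:
    """P(x_lo < X <= x_hi, y_lo < Y <= y_hi) via inclusion-exclusion on F."""
    return (joint_cdf(rows, x_hi, y_hi)
            - joint_cdf(rows, x_lo, y_hi)
            - joint_cdf(rows, x_hi, y_lo)
            + joint_cdf(rows, x_lo, y_lo))


def estimate_cardinality(rows: List[Row], x_lo: float, x_hi: float,
                         y_lo: float, y_hi: float) -> float:
    """Cardinality = table size times the selectivity of the range predicate."""
    return len(rows) * range_selectivity(rows, x_lo, x_hi, y_lo, y_hi)


# Toy table (assumed data): five rows with perfectly correlated columns.
table: List[Row] = [(1, 10), (2, 20), (3, 30), (4, 40), (5, 50)]

# Query: 1 < X <= 4 AND 10 < Y <= 40
estimate = estimate_cardinality(table, 1, 4, 10, 40)
exact = sum(1 for x, y in table if 1 < x <= 4 and 10 < y <= 40)
print(f"estimate={estimate:.2f}, exact={exact}")
```

Because the empirical CDF here is built from the full table, the estimate matches the exact count up to floating-point rounding. A learned data-driven estimator would replace `joint_cdf` with a compact model of the joint distribution, trading some accuracy for lower latency and memory use.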
Taxonomy
Topics: Advanced Database Systems and Queries · Data Quality and Management · Cloud Computing and Resource Management
