Duet: efficient and scalable hybriD neUral rElation undersTanding
Kaixin Zhang, Hongzhi Wang, Yabin Lu, Ziqi Li, Chang Shu, Yu Yan, Donghua Yang

TL;DR
Duet is a novel hybrid neural network approach that improves cardinality estimation by eliminating sampling, reducing inference complexity, and achieving higher accuracy and practicality on high-dimensional data.
Contribution
Introduces Duet, a stable, efficient, and scalable hybrid method that directly estimates cardinality without sampling, addressing workload drift and high-dimensional challenges.
Findings
Reduces inference complexity from O(n) to O(1)
Achieves higher accuracy on high-dimensional tables
Achieves lower inference cost on CPU than GPU-based methods
Abstract
Learned cardinality estimation methods have achieved high precision compared to traditional methods. Among learned methods, query-driven approaches have faced the workload drift problem for a long time. Although both data-driven and hybrid methods are proposed to avoid this problem, most of them suffer from high training and estimation costs, limited scalability, instability, and long-tail distribution problems on high-dimensional tables, which seriously affects the practical application of learned cardinality estimators. In this paper, we prove that most of these problems are directly caused by the widely used progressive sampling. We solve this problem by introducing predicate information into the autoregressive model and propose Duet, a stable, efficient, and scalable hybrid method to estimate cardinality directly without sampling or any non-differentiable process, which can not only…
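The contrast the abstract draws between progressive sampling and direct estimation can be illustrated with a toy sketch. It is not Duet's implementation: exact row counts over a small in-memory table stand in for the neural autoregressive model, and every name here (`true_selectivity`, `progressive_sampling_selectivity`, `predicate_conditioned_selectivity`, the demo table) is hypothetical. The sampling variant draws a concrete value for each column before moving to the next one, so its result is a noisy average; the predicate-conditioned variant multiplies factors of the form P(X_i in R_i | earlier predicates hold), which by the chain rule give the selectivity with no sampled values.

```python
import random
from typing import Callable, Sequence, Tuple

Row = Tuple[int, ...]
Predicate = Callable[[int], bool]  # one predicate per column


def true_selectivity(table: Sequence[Row], preds: Sequence[Predicate]) -> float:
    """Fraction of rows that satisfy every column predicate (ground truth)."""
    hits = sum(all(p(v) for p, v in zip(preds, row)) for row in table)
    return hits / len(table)


def progressive_sampling_selectivity(
    table: Sequence[Row],
    preds: Sequence[Predicate],
    num_samples: int,
    seed: int = 0,
) -> float:
    """Monte Carlo estimate: sample a value per column, one column at a time."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        weight = 1.0
        prefix_rows = list(table)  # rows that agree with the sampled prefix
        for i, pred in enumerate(preds):
            in_range = [row for row in prefix_rows if pred(row[i])]
            # P(X_i in R_i | sampled values of the earlier columns)
            weight *= len(in_range) / len(prefix_rows)
            if not in_range:
                break  # weight is already 0 for this sample
            # Picking a row uniformly samples X_i from the conditional
            # distribution restricted to the predicate's range.
            value = rng.choice(in_range)[i]
            prefix_rows = [row for row in in_range if row[i] == value]
        total += weight
    return total / num_samples


def predicate_conditioned_selectivity(
    table: Sequence[Row], preds: Sequence[Predicate]
) -> float:
    """Chain rule over predicates: prod_i P(X_i in R_i | X_<i in R_<i)."""
    selectivity = 1.0
    survivors = list(table)  # rows satisfying the predicates seen so far
    for i, pred in enumerate(preds):
        if not survivors:
            return 0.0
        matching = [row for row in survivors if pred(row[i])]
        selectivity *= len(matching) / len(survivors)
        survivors = matching
    return selectivity


# Hypothetical demo data: three correlated columns, 20 rows.
demo_table = [(a, b, (a + b) % 4) for a in range(4) for b in range(5)]
demo_preds = [lambda a: a <= 2, lambda b: b >= 1, lambda c: c != 0]

if __name__ == "__main__":
    print("truth:   ", true_selectivity(demo_table, demo_preds))
    print("sampling:", progressive_sampling_selectivity(demo_table, demo_preds, 2000))
    print("direct:  ", predicate_conditioned_selectivity(demo_table, demo_preds))
```

In this toy setting the predicate-conditioned product matches the ground truth up to floating-point rounding, while the sampling estimate varies with the seed and the number of samples. A learned model only approximates these conditionals, so neither property carries over exactly to a real estimator.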
Taxonomy
Topics: Data Stream Mining Techniques · Bayesian Modeling and Causal Inference · Machine Learning and Data Classification
