TL;DR
GRASP is a data-agnostic system for cardinality estimation that effectively handles imperfect, incomplete, and imbalanced workloads, outperforming existing models and rivaling traditional data-driven methods without accessing raw data.
Contribution
It introduces a novel, robust, data-agnostic cardinality learning approach that generalizes to unseen join templates and handles value distribution shifts, addressing real-world workload imperfections.
Findings
Outperforms existing query-driven models on imperfect workloads
Achieves accuracy comparable to data-driven methods without data access
Remains effective when the training workload covers only 10% of join templates
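To make the "10% of join templates" finding concrete, here is a minimal Python sketch of how the coverage and imbalance of a training workload could be measured. It is not taken from the paper: it assumes a join template is simply the set of tables a query joins, and the table names, the example workload, and the helper names (template_of, coverage_and_imbalance) are all hypothetical.

```python
from collections import Counter

def template_of(query_tables):
    """Assumption: a join template is the order-independent set of joined tables."""
    return frozenset(query_tables)

def coverage_and_imbalance(workload, all_templates):
    """Return (fraction of possible templates seen, max/min query count per seen template)."""
    counts = Counter(template_of(q) for q in workload)
    seen = set(counts) & set(all_templates)
    coverage = len(seen) / len(all_templates)
    imbalance = max(counts.values()) / min(counts.values())
    return coverage, imbalance

# Hypothetical schema with six possible join templates.
all_templates = [frozenset(t) for t in (
    {"title"}, {"cast_info"}, {"movie_info"},
    {"title", "cast_info"}, {"title", "movie_info"},
    {"title", "cast_info", "movie_info"},
)]

# Hypothetical training workload: one template dominates, most are never seen.
workload = [
    ["title", "cast_info"], ["cast_info", "title"], ["title", "cast_info"],
    ["title"],
]

coverage, imbalance = coverage_and_imbalance(workload, all_templates)
print(f"coverage={coverage:.2f}, imbalance={imbalance:.1f}")
```

In this toy workload only two of six templates appear, and one has three times as many queries as the other. This is the kind of incomplete and imbalanced training data the summary describes.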
Abstract
Cardinality estimation (CardEst) is a critical aspect of query optimization. Traditionally, it leverages statistics built directly over the data. However, organizational policies (e.g., regulatory compliance) may restrict global data access. Fortunately, query-driven cardinality estimation can learn CardEst models using query workloads. However, existing query-driven models often require access to data or summaries for best performance, and they assume perfect training workloads with complete and balanced join templates (or join graphs). Such assumptions rarely hold in real-world scenarios, in which join templates are incomplete and imbalanced. We present GRASP, a data-agnostic cardinality learning system designed to work under these real-world constraints. GRASP's compositional design generalizes to unseen join templates and is robust to join template imbalance. It also introduces a…
Taxonomy
Methods: Balanced Selection
