Join Cardinality Estimation with OmniSketches
David Justen, Matthias Boehm

TL;DR
This paper extends OmniSketch, a probabilistic data structure, to multi-join cardinality estimation, yielding better query plans and, in certain scenarios, significant performance gains.
Contribution
We introduce the OmniSketch join estimator, which enables multi-join cardinality estimation without assuming uniformity or independence, and demonstrate its integration into query optimization.
Findings
Intermediate result sizes reduced by up to 1,077x
Execution time decreased by up to 3.19x
Mixed results on JOB-light, with improvements on some queries and regressions on others
Abstract
Join ordering is a key factor in query performance, yet traditional cost-based optimizers often produce sub-optimal plans due to inaccurate cardinality estimates in multi-predicate, multi-join queries. Existing alternatives such as learning-based optimizers and adaptive query processing improve accuracy but can suffer from high training costs, poor generalization, or integration challenges. We present an extension of OmniSketch - a probabilistic data structure combining count-min sketches and K-minwise hashing - to enable multi-join cardinality estimation without assuming uniformity and independence. Our approach introduces the OmniSketch join estimator, ensures sketch interoperability across tables, and provides an algorithm to process alpha-acyclic join graphs. Our experiments on SSB-skew and JOB-light show that OmniSketch-enhanced cost-based optimization can improve estimation…
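To make the "count-min sketches and K-minwise hashing" combination more concrete, here is a minimal sketch of the general idea for conjunctive predicates on a single table. It is an illustration only, not the paper's join estimator. The names `ToyAttributeSketch`, `h64`, and `estimate`, the parameter defaults, and the scaling rule are all assumptions made for this example. Each count-min cell stores a counter plus the K smallest record-ID hashes seen in that cell. Intersecting those samples across attributes captures correlation that an independence-based estimate misses.

```python
import hashlib


def h64(value, salt):
    """Deterministic 64-bit hash of (salt, value)."""
    digest = hashlib.blake2b(f"{salt}|{value}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


class ToyAttributeSketch:
    """Hypothetical per-attribute sketch: a count-min grid whose cells also
    keep the k smallest record-id hashes (a K-minwise sample)."""

    def __init__(self, depth=3, width=32, k=64):
        self.depth, self.width, self.k = depth, width, k
        self.counts = [[0] * width for _ in range(depth)]
        self.samples = [[set() for _ in range(width)] for _ in range(depth)]

    def add(self, value, record_id):
        rid = h64(record_id, "rid")  # same record-id hash in every sketch
        for row in range(self.depth):
            col = h64(value, row) % self.width
            self.counts[row][col] += 1
            sample = self.samples[row][col]
            sample.add(rid)
            if len(sample) > self.k:
                sample.remove(max(sample))  # keep only the k smallest hashes

    def cells(self, value):
        """Yield (count, sample) for the cell this value maps to in each row."""
        for row in range(self.depth):
            col = h64(value, row) % self.width
            yield self.counts[row][col], self.samples[row][col]


def estimate(probes):
    """Estimate the number of records matching all equality predicates.

    probes: list of (sketch, value) pairs, one per predicate.
    Assumed simplification: scale the sample intersection by n_max / k,
    where n_max is the largest counter among the probed cells.
    """
    k = probes[0][0].k
    cells = [cell for sketch, value in probes for cell in sketch.cells(value)]
    n_max = max(count for count, _ in cells)
    common = set.intersection(*(sample for _, sample in cells))
    return len(common) * max(1.0, n_max / k)


# Two perfectly correlated attributes: each city belongs to exactly one country.
n = 1000
city, country = ToyAttributeSketch(), ToyAttributeSketch()
for rid in range(n):
    c = rid % 10
    city.add(c, rid)
    country.add(c // 5, rid)

true_count = sum(1 for rid in range(n) if rid % 10 == 3)  # city=3 implies country=0
independence = 100 * 500 / n  # |city=3| * |country=0| / n
sketch_est = estimate([(city, 3), (country, 0)])
print(true_count, independence, sketch_est)
```

In this toy data the independence assumption halves the true count (50 instead of 100), because it treats `country = 0` as an additional filter even though `city = 3` already implies it. The sample intersection does not make that assumption, although its result still depends on hash collisions and sampling noise.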
