Selectivity Estimation with Attribute Value Dependencies using Linked Bayesian Networks
Max Halford, Philippe Saint-Pierre, Franck Morvan

TL;DR
This paper introduces a Bayesian network-based method for selectivity estimation in relational databases that efficiently captures attribute dependencies across relations, improving accuracy with minimal computational overhead.
Contribution
The paper presents a novel Bayesian network approach that models cross-relation attribute dependencies efficiently, addressing limitations of existing multidimensional methods.
Findings
Achieves an order-of-magnitude efficiency improvement over existing methods.
Maintains high accuracy in selectivity estimation.
Validated on large benchmark workloads.
Abstract
Relational query optimisers rely on cost models to choose between different query execution plans. Selectivity estimates are known to be a crucial input to the cost model. In practice, standard selectivity estimation procedures are prone to large errors, mostly because they rely on the so-called attribute value independence and join uniformity assumptions. Multidimensional methods have therefore been proposed to capture dependencies between two or more attributes, both within and across relations. However, these methods incur a high computational cost, which makes them unusable in practice. We propose a method based on Bayesian networks that is able to capture cross-relation attribute value dependencies with little overhead. Our proposal is based on the assumption that dependencies between attributes are preserved when joins are involved. Furthermore, we introduce a parameter…
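To illustrate why the attribute value independence assumption mentioned in the abstract can produce large errors, here is a minimal sketch on a toy relation. It is not the authors' implementation: the table, the attribute names (`country`, `city`), and the helper functions are all hypothetical, and the "Bayesian network" is just a two-node chain `country -> city` estimated from raw counts.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy relation of (country, city) rows; city determines country,
# so the two attributes are strongly dependent.
rows = [
    ("France", "Paris"), ("France", "Paris"), ("France", "Toulouse"),
    ("Spain", "Madrid"), ("Spain", "Madrid"), ("Spain", "Madrid"),
    ("Spain", "Seville"), ("Italy", "Rome"),
]
n = len(rows)

country_counts = Counter(country for country, _ in rows)
city_counts = Counter(city for _, city in rows)
pair_counts = Counter(rows)


def true_selectivity(country, city):
    """Exact fraction of rows matching country = ... AND city = ..."""
    return Fraction(pair_counts[(country, city)], n)


def avi_estimate(country, city):
    """Attribute value independence: P(country) * P(city)."""
    return Fraction(country_counts[country], n) * Fraction(city_counts[city], n)


def bn_estimate(country, city):
    """Two-node network country -> city: P(country) * P(city | country)."""
    p_country = Fraction(country_counts[country], n)
    p_city_given_country = Fraction(
        pair_counts[(country, city)], country_counts[country]
    )
    return p_country * p_city_given_country


if __name__ == "__main__":
    query = ("France", "Paris")
    print("true:", true_selectivity(*query))
    print("AVI :", avi_estimate(*query))
    print("BN  :", bn_estimate(*query))
```

On this toy data the independence estimate underestimates the true selectivity of `country = 'France' AND city = 'Paris'`, while the conditional factorisation recovers it exactly. With only two attributes the network simply stores the full joint distribution; the point of a Bayesian network on wider relations is to keep a small number of such conditional tables rather than one table over every attribute.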
