# An Approach Based on Bayesian Networks for Query Selectivity Estimation

**Authors:** Max Halford, Philippe Saint-Pierre, Frank Morvan

arXiv: 1907.06295 · 2019-07-16

## TL;DR

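This paper models the distribution of attribute values inside each relation with Chow-Liu trees, a restricted type of Bayesian network, so that selectivity estimates capture attribute correlations instead of assuming independence. On the TPC-DS benchmark, the authors report estimates an order of magnitude more precise than other approaches.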

## Contribution

The paper proposes approximating the distribution of attribute values inside each relation with a Chow-Liu tree. This relaxes the attribute value independence assumption and reduces the selectivity estimation errors that assumption causes in query optimization.
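As an illustration of the Chow-Liu construction named above, the following is a minimal sketch of the classic algorithm: score every pair of attributes by empirical mutual information, then keep a maximum spanning tree over those scores. It is not the authors' implementation; the toy rows, the column names, and the helper names `mutual_information` and `chow_liu_edges` are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations
from math import log

def mutual_information(rows, a, b):
    """Empirical mutual information (in nats) between columns a and b."""
    n = len(rows)
    joint = Counter((r[a], r[b]) for r in rows)
    count_a = Counter(r[a] for r in rows)
    count_b = Counter(r[b] for r in rows)
    # sum over observed pairs of p(x, y) * log(p(x, y) / (p(x) * p(y)))
    return sum(
        (c / n) * log(c * n / (count_a[x] * count_b[y]))
        for (x, y), c in joint.items()
    )

def chow_liu_edges(rows, columns):
    """Maximum spanning tree over pairwise mutual information (Kruskal)."""
    scored = sorted(
        ((mutual_information(rows, a, b), a, b)
         for a, b in combinations(columns, 2)),
        reverse=True,
    )
    parent = {c: c for c in columns}  # union-find forest

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    edges = []
    for _, a, b in scored:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # keep the edge only if it joins two components
            parent[root_a] = root_b
            edges.append((a, b))
    return edges

# Hypothetical toy relation: city determines country, gender is unrelated.
columns = ["city", "country", "gender"]
rows = [
    {"city": city, "country": country, "gender": gender}
    for city, country in [("Paris", "FR"), ("Lyon", "FR"),
                          ("Rome", "IT"), ("Milan", "IT")]
    for gender in ("F", "M")
]
edges = chow_liu_edges(rows, columns)
```

In this toy relation the strongest dependency is between `city` and `country`, so that edge is selected first. The resulting tree has one edge fewer than the number of attributes, which is what keeps storage and inference cheap compared with a general Bayesian network.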

## Key findings

- Reports selectivity estimates an order of magnitude more precise than the other approaches it is compared against.
- Maintains reasonable efficiency in time and space.
- Demonstrates effectiveness on the TPC-DS benchmark.

## Abstract

The efficiency of a query execution plan depends on the accuracy of the selectivity estimates given to the query optimiser by the cost model. The cost model makes simplifying assumptions in order to produce said estimates in a timely manner. These assumptions lead to selectivity estimation errors that have dramatic effects on the quality of the resulting query execution plans. A convenient assumption that is ubiquitous among current cost models is that attributes are independent of each other. However, this ignores potential correlations, which can have a huge negative impact on the accuracy of the cost model. In this paper we attempt to relax the attribute value independence assumption without unreasonably deteriorating the accuracy of the cost model. We propose a novel approach based on a particular type of Bayesian network called Chow-Liu trees to approximate the distribution of attribute values inside each relation of a database. Our results on the TPC-DS benchmark show that our method is an order of magnitude more precise than other approaches whilst remaining reasonably efficient in terms of time and space.
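To make the attribute value independence assumption from the abstract concrete, here is a small worked example on a hypothetical four-row relation. It compares the true selectivity of a conjunctive predicate with the estimate obtained by multiplying per-attribute selectivities, and with an estimate that conditions one attribute on the other, as a tree-structured factorisation would. The data and names are assumptions for illustration only and do not come from the paper.

```python
from fractions import Fraction

# Hypothetical relation with two correlated attributes: (city, country).
rows = [("Paris", "FR"), ("Paris", "FR"), ("Lyon", "FR"), ("Rome", "IT")]

def selectivity(rows, predicate):
    """Fraction of rows that satisfy the predicate."""
    return Fraction(sum(1 for r in rows if predicate(r)), len(rows))

# True selectivity of: city = 'Paris' AND country = 'FR'
true_sel = selectivity(rows, lambda r: r[0] == "Paris" and r[1] == "FR")

# Independence assumption: P(city) * P(country)
sel_city = selectivity(rows, lambda r: r[0] == "Paris")
sel_country = selectivity(rows, lambda r: r[1] == "FR")
independent_est = sel_city * sel_country

# Conditional factorisation: P(city) * P(country | city)
paris_rows = [r for r in rows if r[0] == "Paris"]
cond_country = selectivity(paris_rows, lambda r: r[1] == "FR")
conditional_est = sel_city * cond_country

print(true_sel, independent_est, conditional_est)
```

Here the independence assumption underestimates the true selectivity because every Paris row is also an FR row. With only two attributes the conditional factorisation is exact by construction; with more attributes a tree keeps only some of the dependencies, so it is an approximation.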

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1907.06295/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/1907.06295/full.md

## References

38 references — full list in the complete paper: https://tomesphere.com/paper/1907.06295/full.md

---
Source: https://tomesphere.com/paper/1907.06295