Selectivity correction with online machine learning
Max Halford, Philippe Saint-Pierre, Franck Morvan

TL;DR
This paper explores online machine learning techniques to improve database selectivity estimation, demonstrating that simple online models can adapt to changing workloads and perform comparably to advanced deep learning methods.
Contribution
It introduces online machine learning as a lightweight, adaptive alternative to batch methods for selectivity estimation in databases, capable of handling concept drift.
Findings
Online models compete with deep learning approaches.
Online models adapt to workload changes and schema modifications.
Simple online models are effective for selectivity estimation.
Abstract
Computer systems are full of heuristic rules which drive the decisions they make. These rules of thumb are designed to work well on average, but they ignore specific information about the available context and are thus sub-optimal. The emerging field of machine learning for systems attempts to learn decision rules with machine learning algorithms. In the database community, many recent proposals aim to improve selectivity estimation with machine learning. These proposals all rely on batch methods, which require retraining and cannot handle concept drift such as workload changes and schema modifications. We present online machine learning as an alternative approach. Online models learn on the fly and do not require storing data; they are more lightweight than batch models and can adapt to concept drift. As an experiment, we teach models to improve the selectivity…
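To make the idea of "learning on the fly" concrete, here is a minimal sketch of an online model that corrects a heuristic selectivity estimate one query at a time. It is not the authors' method or code: the class `OnlineLinearRegression`, the helper `run_workload`, the single log-selectivity feature, and the synthetic ground truth `heuristic ** exponent` are all assumptions chosen for illustration. The model predicts a log-scale correction factor, updates itself once the true selectivity of a query is observed, and keeps no history of past queries. A change of `exponent` halfway through stands in for concept drift.

```python
import math
import random
from collections import deque


class OnlineLinearRegression:
    """Hypothetical linear model trained by SGD, one observation at a time."""

    def __init__(self, n_features, learning_rate=0.01):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # Gradient step on the squared error 0.5 * (prediction - y) ** 2.
        error = self.predict_one(x) - y
        self.bias -= self.learning_rate * error
        self.weights = [
            w - self.learning_rate * error * xi for w, xi in zip(self.weights, x)
        ]


def q_error(estimate, truth):
    """Multiplicative error: 1.0 is perfect, larger is worse."""
    return max(estimate / truth, truth / estimate)


def run_workload(model, exponent, n_queries, rng):
    """Stream synthetic queries: predict first, then learn from the outcome.

    Returns the mean q-error of (heuristic, corrected) estimates over the
    last 500 queries. Only those evaluation scores are buffered; the model
    itself stores no past queries.
    """
    heuristic_errors = deque(maxlen=500)
    corrected_errors = deque(maxlen=500)
    for _ in range(n_queries):
        heuristic = 10 ** rng.uniform(-3, 0)  # stand-in for the optimizer's estimate
        truth = heuristic ** exponent         # synthetic ground truth (assumption)
        x = [math.log(heuristic)]
        # The model outputs log(truth / heuristic); apply it and clip to 1.
        corrected = min(1.0, heuristic * math.exp(model.predict_one(x)))
        heuristic_errors.append(q_error(heuristic, truth))
        corrected_errors.append(q_error(corrected, truth))
        # Once the query has run, its true selectivity is known: learn, then discard.
        model.learn_one(x, math.log(truth / heuristic))
    return (
        sum(heuristic_errors) / len(heuristic_errors),
        sum(corrected_errors) / len(corrected_errors),
    )


def demo(seed=42):
    rng = random.Random(seed)
    model = OnlineLinearRegression(n_features=1)
    before_drift = run_workload(model, exponent=0.5, n_queries=5000, rng=rng)
    # The relationship changes; the same model keeps learning without retraining.
    after_drift = run_workload(model, exponent=0.8, n_queries=5000, rng=rng)
    return before_drift, after_drift


if __name__ == "__main__":
    for label, (heuristic, corrected) in zip(("before drift", "after drift"), demo()):
        print(f"{label}: heuristic q-error={heuristic:.2f}, corrected q-error={corrected:.2f}")
```

In this toy setting the correction is exactly linear in the log of the heuristic estimate, so a single-feature model suffices; a real workload would need richer query features and would not be noiseless.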
Taxonomy
Topics: Data Stream Mining Techniques · Advanced Database Systems and Queries · Data Quality and Management
