Selectivity correction with online machine learning

Max Halford; Philippe Saint-Pierre; Franck Morvan

arXiv:2009.09884 · cs.DB · September 22, 2020 · 1 citation

Open Access

TL;DR

This paper explores online machine learning techniques to improve database selectivity estimation, demonstrating that simple online models can adapt to changing workloads and perform comparably to advanced deep learning methods.

Contribution

It introduces online machine learning as a lightweight, adaptive alternative to batch methods for selectivity estimation in databases, capable of handling concept drift.

Findings

01. Online models compete with deep learning approaches.

02. Online models adapt to workload changes and schema modifications.

03. Simple online models are effective for selectivity estimation.

Abstract

Computer systems are full of heuristic rules which drive the decisions they make. These rules of thumb are designed to work well on average, but they ignore specific information about the available context and are thus sub-optimal. The emerging field of machine learning for systems attempts to learn decision rules with machine learning algorithms. In the database community, many recent proposals have been made to improve selectivity estimation with machine learning. These are all batch methods, which require retraining and cannot handle concept drift such as workload changes and schema modifications. We present online machine learning as an alternative approach. Online models learn on the fly and do not require storing data, are more lightweight than batch models, and may adapt to concept drift. As an experiment, we teach models to improve the selectivity…
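To make the contrast with batch training concrete, here is a minimal sketch of an online model in Python. It is an illustration under stated assumptions, not the method evaluated in the paper: the class `OnlineLinearCorrector`, the helpers `corrected_selectivity` and `run_stream`, the one-hot "which column does the predicate touch" feature, and the choice of learning the log-ratio between the true selectivity and the optimizer's estimate are all hypothetical. The point it shows is that each observation updates the model once and is then discarded, so nothing is stored and the model can track a change in the data.

```python
import math


class OnlineLinearCorrector:
    """Linear model updated by stochastic gradient descent, one sample at a time.

    Hypothetical setup: the target y is log(true_selectivity / estimate),
    so the model learns a multiplicative correction to an existing estimate.
    """

    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # Gradient step on the squared error for this single observation.
        error = self.predict_one(x) - y
        self.bias -= self.learning_rate * error
        self.weights = [
            w - self.learning_rate * error * xi
            for w, xi in zip(self.weights, x)
        ]


def corrected_selectivity(model, estimate, x):
    """Apply the learned log-ratio to an estimate, capped at 1.0."""
    return min(1.0, estimate * math.exp(model.predict_one(x)))


def run_stream(model, log_ratios, n_queries):
    """Replay synthetic query feedback; return the last pre-update error.

    log_ratios[c] is the (assumed) true log-ratio for predicates on column c.
    Each observation is used once and never stored.
    """
    last_error = 0.0
    for i in range(n_queries):
        column = i % len(log_ratios)
        x = [1.0 if column == j else 0.0 for j in range(len(log_ratios))]
        y = log_ratios[column]
        last_error = abs(model.predict_one(x) - y)  # test first, then train
        model.learn_one(x, y)
    return last_error


if __name__ == "__main__":
    model = OnlineLinearCorrector(n_features=2)
    # Phase 1: estimates on column 0 are 4x too low, on column 1 2x too high.
    print(run_stream(model, [math.log(4.0), math.log(0.5)], 500))
    # Phase 2 (simulated drift): column 0 estimates are now 4x too high.
    print(run_stream(model, [math.log(0.25), math.log(0.5)], 500))
    print(corrected_selectivity(model, 0.01, [1.0, 0.0]))
```

In this synthetic stream the second phase simulates a drift: the model is not retrained from scratch, it simply keeps taking gradient steps and its predictions move toward the new target. Evaluating each prediction before learning from it is a common way to score an online model on a stream.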

Taxonomy

Topics: Data Stream Mining Techniques · Advanced Database Systems and Queries · Data Quality and Management