Detect, Distill and Update: Learned DB Systems Facing Out of Distribution Data
Meghdad Kurmanji, Peter Triantafillou

TL;DR
This paper introduces DDUp, a framework for efficiently updating neural network-based components in learned database systems when faced with out-of-distribution data, ensuring high accuracy without costly retraining.
Contribution
The paper presents a novel framework combining statistical OOD detection and transfer learning with knowledge distillation for updating learned DB components.
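To make the knowledge-distillation idea concrete, here is a minimal sketch of the textbook distillation loss for a classifier: a weighted sum of the hard-label cross-entropy and the temperature-softened KL divergence between a teacher (the previously trained model) and a student (the model being updated). This is the generic formulation, not necessarily the loss DDUp uses. The names `distillation_loss`, `alpha`, and `temperature` are illustrative assumptions that do not come from the paper.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with an optional temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, label,
                      alpha=0.5, temperature=2.0):
    """Generic distillation loss (hypothetical helper, not DDUp's API).

    alpha weights the hard-label term; (1 - alpha) weights the soft term.
    """
    # Hard term: cross-entropy of the student against the true label.
    p_student = softmax(student_logits)
    hard = -math.log(p_student[label])

    # Soft term: KL(teacher || student) on temperature-softened outputs.
    q_teacher = softmax(teacher_logits, temperature)
    q_student = softmax(student_logits, temperature)
    soft = sum(t * math.log(t / s)
               for t, s in zip(q_teacher, q_student) if t > 0)

    # The T^2 factor keeps gradient magnitudes comparable across temperatures.
    return alpha * hard + (1 - alpha) * temperature ** 2 * soft
```

When the student already matches the teacher, the soft term vanishes and only the hard-label term remains, which is the sense in which distillation anchors the updated model to what the old model learned.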
Findings
DDUp effectively detects OOD data with a new statistical test.
DDUp updates models efficiently without full retraining.
Experimental results show DDUp improves accuracy and reduces update costs.
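As an illustration of statistical OOD detection, the sketch below runs a one-sided permutation test on per-sample model losses: if the model's loss on a new batch is significantly higher than on the data it was trained on, the batch is flagged as out-of-distribution. This is a generic two-sample test chosen for illustration and is not claimed to be DDUp's test. The function names, the 0.05 significance level, and the assumption that OOD data raises the loss are all assumptions.

```python
import random


def mean(values):
    return sum(values) / len(values)


def loss_shift_p_value(old_losses, new_losses, n_permutations=2000, seed=0):
    """One-sided permutation test: is the mean loss on new data higher?

    Hypothetical helper; losses are per-sample values from the same model.
    """
    rng = random.Random(seed)
    observed = mean(new_losses) - mean(old_losses)
    pooled = list(old_losses) + list(new_losses)
    n_new = len(new_losses)

    hits = 0
    for _ in range(n_permutations):
        # Under the null hypothesis the old/new labels are exchangeable.
        rng.shuffle(pooled)
        diff = mean(pooled[:n_new]) - mean(pooled[n_new:])
        if diff >= observed:
            hits += 1

    # Add-one correction so the p-value is never exactly zero.
    return (hits + 1) / (n_permutations + 1)


def is_out_of_distribution(old_losses, new_losses, significance=0.05):
    """Flag the new batch when the loss increase is statistically significant."""
    return loss_shift_p_value(old_losses, new_losses) < significance
```

In a pipeline of this shape, a model update would be triggered only when `is_out_of_distribution` returns `True`, so in-distribution insertions would not incur any update cost.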
Abstract
Machine Learning (ML) is changing DBs as many DB components are being replaced by ML models. One open problem in this setting is how to update such ML models in the presence of data updates. We start this investigation focusing on data insertions (dominating updates in analytical DBs). We study how to update neural network (NN) models when new data follows a different distribution (a.k.a. it is "out-of-distribution" -- OOD), rendering previously-trained NNs inaccurate. A requirement in our problem setting is that learned DB components should ensure high accuracy for tasks on old and new data (e.g., for approximate query processing (AQP), cardinality estimation (CE), synthetic data generation (DG), etc.). This paper proposes a novel updatability framework (DDUp). DDUp can provide updatability for different learned DB system components, even based on different NNs, without the high costs…
Taxonomy
Topics: Data Stream Mining Techniques · Machine Learning and Data Classification · Data Quality and Management
