Database Learning: Toward a Database that Becomes Smarter Every Time
Yongjoo Park, Ahmad Shahab Tajik, Michael Cafarella, Barzan Mozafari

TL;DR
This paper introduces Database Learning, a novel approach that improves query answering efficiency by learning from past answers, leading to faster responses and better accuracy over time in approximate query processing.
Contribution
It proposes a new paradigm of learning from previous query answers to enhance future query efficiency and accuracy in AQP systems, implemented in the Verdict engine.
Findings
Verdict supports 73.7% of real-world queries
Speeds up query processing by up to 23.0x
Achieves higher accuracy than existing AQP systems
Abstract
In today's databases, previous query answers rarely benefit answering future queries. For the first time, to the best of our knowledge, we change this paradigm in an approximate query processing (AQP) context. We make the following observation: the answer to each query reveals some degree of knowledge about the answer to another query because their answers stem from the same underlying distribution that has produced the entire dataset. Exploiting and refining this knowledge should allow us to answer queries more analytically, rather than by reading enormous amounts of raw data. Also, processing more queries should continuously enhance our knowledge of the underlying distribution, and hence lead to increasingly faster response times for future queries. We call this novel idea (learning from past query answers) Database Learning. We exploit the principle of maximum entropy to produce…
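The observation that one query's answer carries information about another's can be illustrated with a toy calculation. The sketch below is not Verdict's inference procedure. It assumes the true answers of a past query and a new query are jointly Gaussian (the maximum-entropy distribution once means and covariances are fixed) and that each raw sample-based answer is the true answer plus independent Gaussian noise. The function name `improve_answer` and all parameters and numbers are hypothetical.

```python
def improve_answer(raw_past, var_past, raw_new, var_new,
                   prior_mean, prior_var, prior_cov):
    """Combine a past query's answer with a new raw answer.

    Assumed (hypothetical) model: the true answers (t_past, t_new) are
    jointly Gaussian with the same prior_mean and prior_var, and with
    covariance prior_cov between them. Each raw answer equals the true
    answer plus independent Gaussian sampling noise with the given variance.

    Returns (posterior_mean, posterior_variance) for t_new.
    """
    # Covariance of the two raw answers: prior covariance plus noise.
    s11 = prior_var + var_past
    s22 = prior_var + var_new
    s12 = prior_cov
    det = s11 * s22 - s12 * s12

    # Weights from standard Gaussian conditioning, written out for the
    # 2x2 case: [prior_cov, prior_var] times the inverse of [[s11, s12], [s12, s22]].
    w_past = prior_cov * var_new / det
    w_new = (prior_var * s11 - prior_cov * prior_cov) / det

    mean = (prior_mean
            + w_past * (raw_past - prior_mean)
            + w_new * (raw_new - prior_mean))
    var = prior_var - (w_past * prior_cov + w_new * prior_var)
    return mean, var


# Illustrative numbers only: an accurate past answer and a noisy new one.
mean, var = improve_answer(raw_past=110.0, var_past=1.0,
                           raw_new=104.0, var_new=16.0,
                           prior_mean=100.0, prior_var=25.0, prior_cov=20.0)
print(mean, var)
```

Under these assumptions the combined variance is smaller than the raw variance of the new answer, and it shrinks further as the assumed covariance between the two queries grows. That is the sense in which a past answer can stand in for reading more raw data.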
Taxonomy
Topics: Advanced Database Systems and Queries · Data Management and Algorithms · Data Stream Mining Techniques
