Approximate Query Processing via Tuple Bubbles

Damjan Gjurovski; Sebastian Michel

arXiv:2212.10150·cs.DB·December 21, 2022

TL;DR

This paper introduces 'bubbles', compact representations of data tuples, enabling efficient approximate query processing suitable for large-scale and resource-constrained environments, with tunable accuracy and size.

Contribution

It presents a novel approach using tunable, compact tuple representations called bubbles, including Bayesian network-based instantiations, for approximate query processing.

Findings

01. Bubbles improve estimation accuracy over state-of-the-art methods.

02. The approach reduces execution time and disk space requirements.

03. Experimental results validate the effectiveness of bubbles in large data scenarios.

Abstract

We propose a versatile approach to lightweight, approximate query processing by creating compact but tunably precise representations of larger quantities of original tuples, coined bubbles. Instead of working with tables of tuples, the query processing then operates on bubbles but leaves the traditional query processing paradigms conceptually applicable. We believe this is a natural and viable approach to render approximate query processing feasible for large data in disaggregated cloud settings or in resource-limited scenarios. Bubbles are tunable regarding the compactness of enclosed tuples as well as the granularity of statistics and the way they are instantiated. For improved accuracy, we put forward a first working solution that represents bubbles via Bayesian networks, per table, or along foreign-key joins. To underpin the viability of the approach, we report on an experimental…
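To make the idea of querying summaries instead of tuples concrete, here is a minimal sketch. It is not the authors' implementation: it replaces the Bayesian-network representation mentioned in the abstract with plain per-column value counts and an independence assumption between columns. The names `Bubble`, `build_bubbles`, `approx_count`, and the `bubble_size` parameter are hypothetical and chosen only for illustration.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Bubble:
    """Hypothetical summary of a chunk of tuples: row count plus per-column value counts."""
    n_rows: int = 0
    counts: dict = field(default_factory=dict)  # column name -> Counter of values

    @classmethod
    def from_rows(cls, rows):
        """Summarize a list of dict-shaped tuples into one bubble."""
        bubble = cls()
        for row in rows:
            bubble.n_rows += 1
            for col, val in row.items():
                bubble.counts.setdefault(col, Counter())[val] += 1
        return bubble

    def estimate_count(self, predicates):
        """Estimate COUNT(*) for a conjunction of equality predicates.

        Assumes columns are independent within the bubble, which is a
        simplification and not the Bayesian-network model from the paper.
        """
        if self.n_rows == 0:
            return 0.0
        selectivity = 1.0
        for col, val in predicates.items():
            # A Counter returns 0 for values it has never seen.
            selectivity *= self.counts.get(col, Counter())[val] / self.n_rows
        return self.n_rows * selectivity


def build_bubbles(rows, bubble_size):
    """Split the table into consecutive chunks and summarize each chunk."""
    return [Bubble.from_rows(rows[i:i + bubble_size])
            for i in range(0, len(rows), bubble_size)]


def approx_count(bubbles, predicates):
    """Answer the query on the bubbles only; the original tuples are not touched."""
    return sum(b.estimate_count(predicates) for b in bubbles)
```

In this sketch, `bubble_size` plays the role of a compactness knob: larger bubbles mean fewer summaries to store and scan, while smaller bubbles keep more local detail. How the paper actually tunes compactness and statistics granularity is described only at a high level in the abstract, so this mapping is an assumption.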

Taxonomy

Topics: Data Visualization and Analytics · Data Management and Algorithms · Advanced Database Systems and Queries