Approximate Query Processing via Tuple Bubbles
Damjan Gjurovski, Sebastian Michel

TL;DR
This paper introduces 'bubbles', compact summaries that each stand in for a larger set of original tuples. Queries run on bubbles instead of tables, which makes approximate query processing feasible for large data in disaggregated cloud or resource-limited settings, with tunable accuracy and size.
Contribution
It presents an approach to approximate query processing built on bubbles, tunable and compact tuple representations, together with a first working instantiation that models bubbles as Bayesian networks, per table or along foreign-key joins.
Findings
Bubbles improve estimation accuracy over state-of-the-art methods.
The approach reduces execution time and disk space requirements.
Experimental results validate the effectiveness of bubbles in large data scenarios.
Abstract
We propose a versatile approach to lightweight, approximate query processing by creating compact but tunably precise representations of larger quantities of original tuples, coined bubbles. Instead of working with tables of tuples, the query processing then operates on bubbles but leaves the traditional query processing paradigms conceptually applicable. We believe this is a natural and viable approach to render approximate query processing feasible for large data in disaggregated cloud settings or in resource-limited scenarios. Bubbles are tunable regarding the compactness of enclosed tuples as well as the granularity of statistics and the way they are instantiated. For improved accuracy, we put forward a first working solution that represents bubbles via Bayesian networks, per table, or along foreign-key joins. To underpin the viability of the approach, we report on an experimental…
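To make the idea of querying bubbles instead of tuples concrete, here is a minimal sketch. It is not the authors' implementation: it swaps the Bayesian networks named in the abstract for plain per-attribute histograms with an independence assumption, and every name in it (`Bubble`, `build_bubbles`, `approx_count`, `bubble_size`) is hypothetical. It only illustrates how a COUNT with equality predicates could be estimated from compact per-bubble statistics, and how bubble size trades compactness against accuracy.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Bubble:
    """Hypothetical bubble: a tuple count plus one value histogram per attribute."""
    size: int = 0
    histograms: dict = field(default_factory=dict)  # attribute -> Counter of values

    def add(self, row: dict) -> None:
        """Fold one tuple into the bubble's statistics."""
        self.size += 1
        for attr, value in row.items():
            self.histograms.setdefault(attr, Counter())[value] += 1

    def estimate_count(self, predicates: dict) -> float:
        """Estimate how many enclosed tuples satisfy all equality predicates.

        Assumption (ours, not the paper's): attributes are independent
        within a bubble, so per-attribute selectivities are multiplied.
        """
        if self.size == 0:
            return 0.0
        estimate = float(self.size)
        for attr, value in predicates.items():
            matches = self.histograms.get(attr, Counter())[value]
            estimate *= matches / self.size
        return estimate


def build_bubbles(rows: list, bubble_size: int) -> list:
    """Group consecutive tuples into bubbles of at most `bubble_size` tuples."""
    bubbles = []
    for start in range(0, len(rows), bubble_size):
        bubble = Bubble()
        for row in rows[start:start + bubble_size]:
            bubble.add(row)
        bubbles.append(bubble)
    return bubbles


def approx_count(bubbles: list, predicates: dict) -> float:
    """Approximate COUNT(*) WHERE attr = value AND ... using bubbles only."""
    return sum(b.estimate_count(predicates) for b in bubbles)


rows = [
    {"city": "A", "status": "x"},
    {"city": "A", "status": "y"},
    {"city": "B", "status": "x"},
    {"city": "B", "status": "x"},
]
query = {"city": "A", "status": "x"}  # exactly one tuple matches

print(approx_count(build_bubbles(rows, bubble_size=2), query))
print(approx_count(build_bubbles(rows, bubble_size=4), query))
```

On this toy data, two bubbles of two tuples each give an estimate of 1.0, matching the exact answer, while a single bubble of four tuples gives 1.5. The coarser bubble is smaller but loses the correlation between `city` and `status`, which is the kind of error a richer per-bubble model such as a Bayesian network is meant to reduce.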
Taxonomy
Topics: Data Visualization and Analytics · Data Management and Algorithms · Advanced Database Systems and Queries
