TL;DR
This paper introduces ABae, a query processing algorithm that uses stratified sampling guided by cheap proxies to accelerate approximate aggregation queries whose predicates require expensive deep neural networks, reducing labeling costs by up to 2.3x on real datasets.
Contribution
The paper develops ABae, a method for accelerating approximate aggregation queries with costly predicates. It handles stratified sampling in which a drawn record may not satisfy the predicate, and it achieves an optimal convergence rate in that setting.
Findings
ABae reduces labeling costs by up to 2.3x on real datasets.
ABae converges at an optimal rate in stratified sampling with non-satisfying draws.
The method outperforms baseline approaches in experiments.
Abstract
Researchers and industry analysts are increasingly interested in computing aggregation queries over large, unstructured datasets with selective predicates that are computed using expensive deep neural networks (DNNs). Because these DNNs are expensive and many applications can tolerate approximate answers, analysts are interested in accelerating these queries via approximations. Unfortunately, standard approximate query processing techniques to accelerate such queries are not applicable because they assume the results of the predicates are available ahead of time. Furthermore, recent work using cheap approximations (i.e., proxies) does not support aggregation queries with predicates. To accelerate aggregation queries with expensive predicates, we develop and analyze a query processing algorithm that leverages proxies (ABae). ABae must account for the key challenge that it may sample…
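Illustrative sketch
The abstract describes combining stratified sampling with proxy scores so that the expensive predicate runs only on sampled records. The Python below is a minimal sketch of that general idea, not the paper's implementation. The function name `proxy_stratified_mean`, its parameters, the equal-size strata, the two-stage split, and the stage-two allocation rule (proportional to the square root of the estimated match rate times the estimated standard deviation) are all assumptions made for illustration. Here `proxy` is a cheap score, `oracle` is the expensive predicate, and `value` is the quantity being averaged.

```python
import math
import random
import statistics


def proxy_stratified_mean(records, proxy, oracle, value,
                          num_strata=4, budget=400, pilot_fraction=0.5, seed=0):
    """Estimate the mean of value(r) over records where oracle(r) is True.

    Hypothetical helper: calls oracle at most `budget` times.
    Returns None if no sampled record satisfies the predicate.
    """
    rng = random.Random(seed)

    # Stratify by proxy score: sort, then cut into roughly equal-size strata.
    ordered = sorted(records, key=proxy)
    bounds = [round(i * len(ordered) / num_strata) for i in range(num_strata + 1)]
    strata = [ordered[bounds[i]:bounds[i + 1]] for i in range(num_strata)]
    for stratum in strata:
        rng.shuffle(stratum)  # a prefix of a shuffled list is a uniform sample

    drawn = [0] * num_strata                 # oracle calls spent per stratum
    hits = [[] for _ in range(num_strata)]   # values of records that matched

    def draw(k, count):
        batch = strata[k][drawn[k]:drawn[k] + count]
        drawn[k] += len(batch)
        hits[k].extend(value(r) for r in batch if oracle(r))

    # Stage 1: spend part of the budget evenly to estimate per-stratum statistics.
    pilot = int(budget * pilot_fraction) // num_strata
    for k in range(num_strata):
        draw(k, pilot)

    # Stage 2: allocate the rest in proportion to sqrt(match rate) * std dev
    # (an assumed allocation rule; falls back to uniform if all weights are 0).
    weights = []
    for k in range(num_strata):
        p_k = len(hits[k]) / drawn[k] if drawn[k] else 0.0
        sigma_k = statistics.pstdev(hits[k]) if len(hits[k]) >= 2 else 0.0
        weights.append(math.sqrt(p_k) * sigma_k)
    if sum(weights) == 0:
        weights = [1.0] * num_strata
    remaining = budget - sum(drawn)
    total = sum(weights)
    for k in range(num_strata):
        draw(k, int(remaining * weights[k] / total))

    # Combine: weight each stratum mean by its estimated number of matches.
    num = den = 0.0
    for k in range(num_strata):
        if hits[k]:
            w = (len(hits[k]) / drawn[k]) * len(strata[k])
            num += w * statistics.mean(hits[k])
            den += w
    return num / den if den else None
```

Records that fail the predicate still consume budget but contribute no value. This is why the sketch tracks a per-stratum match rate and steers later draws toward strata where matches are more common and more variable.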
