Algorithms for Provisioning Queries and Analytics
Sepehr Assadi, Sanjeev Khanna, Yang Li, Val Tannen

TL;DR
This paper explores how to efficiently create small sketches for complex database queries that include relational algebra, grouping, and analytics, enabling quick approximate answers under various hypothetical scenarios.
Contribution
It introduces methods for compact provisioning of approximate answers for certain complex queries, and establishes bounds on when exact provisioning is feasible.
Findings
Quantiles and linear regression can be approximately provisioned with small sketches.
Exact provisioning for these statistics requires sketches whose size is exponential in the number of hypotheticals.
Complex queries whose logical component is a positive relational algebra query can be compactly provisioned, provided their numerical component can also be compactly provisioned.
Abstract
Provisioning is a technique for avoiding repeated expensive computations in what-if analysis. Given a query, an analyst formulates k hypotheticals, each retaining some of the tuples of a database instance, possibly overlapping, and she wishes to answer the query under scenarios, where a scenario is defined by a subset of the hypotheticals that are "turned on". We say that a query admits compact provisioning if given any database instance and any k hypotheticals, one can create a poly-size (in k) sketch that can then be used to answer the query under any of the 2^k possible scenarios without accessing the original instance. In this paper, we focus on provisioning complex queries that combine relational algebra (the logical component), grouping, and statistics/analytics (the numerical component). We first show that queries that compute quantiles or linear regression (as well…
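
To make the provisioning model concrete, here is a minimal Python sketch on a toy instance. It is an illustration only, not the paper's construction: the data, the names (`instance`, `hypotheticals`, `max_from_sketch`), and the choice of a max query are all hypothetical, and it assumes that the tuples retained under a scenario are the union of the tuples retained by the hypotheticals that are turned on. Max is used because it is an easy case: storing one number per hypothetical gives a sketch of size linear in the number of hypotheticals k that answers all 2^k scenarios exactly.

```python
from itertools import combinations

# Hypothetical toy instance: tuple id -> numeric value.
instance = {1: 10, 2: 40, 3: 25, 4: 7, 5: 31}

# k = 3 hypotheticals, each retaining some of the tuples (overlap allowed).
hypotheticals = {
    "h1": {1, 2},
    "h2": {2, 3, 4},
    "h3": {5},
}

def scenario_tuples(scenario):
    """Tuples kept under a scenario (assumption: union of the 'on' hypotheticals)."""
    kept = set()
    for name in scenario:
        kept |= hypotheticals[name]
    return kept

def max_from_instance(scenario):
    """Baseline: re-evaluate the query against the original instance."""
    kept = scenario_tuples(scenario)
    return max((instance[t] for t in kept), default=None)

# The sketch: one number per hypothetical, so its size is linear in k.
sketch = {name: max(instance[t] for t in ids) for name, ids in hypotheticals.items()}

def max_from_sketch(scenario):
    """Answer the query from the sketch alone, without touching the instance."""
    return max((sketch[name] for name in scenario), default=None)

def all_scenarios():
    """Enumerate every subset of the hypotheticals (2^k scenarios)."""
    names = sorted(hypotheticals)
    for r in range(len(names) + 1):
        for combo in combinations(names, r):
            yield combo

scenarios = list(all_scenarios())
# Check that the sketch and the baseline agree on every scenario.
agree = all(max_from_instance(s) == max_from_sketch(s) for s in scenarios)
```

The sketch works here because the maximum over a union of sets equals the maximum of the per-set maxima. Statistics such as quantiles do not decompose this way, which is consistent with the findings above that they can be provisioned compactly only approximately.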
