Provenance for Aggregate Queries
Yael Amsterdamer, Daniel Deutch, Val Tannen

TL;DR
This paper introduces a novel method for capturing provenance in aggregate queries by annotating individual values within tuples, addressing challenges not solvable by existing tuple-based provenance approaches.
Contribution
The paper proposes annotating individual values, not only whole tuples, with provenance. This extends previous tuple-based methods to aggregate queries, including complex queries that combine aggregation and difference.
Findings
Provenance for aggregate queries requires value-level annotations.
The new approach covers both "simple" queries, where aggregation is the last operator applied, and arbitrary positive relational algebra queries with aggregation.
Encoding difference queries using aggregation provides new semantics.
Abstract
In this paper we study provenance information for queries with aggregation. Provenance has previously been studied for various query languages that do not allow aggregation, and recent work has suggested capturing provenance by annotating database tuples with elements of a commutative semiring and propagating the annotations through query evaluation. We show that aggregate queries pose novel challenges that render this approach inapplicable. Consequently, we propose a new approach, in which we annotate with provenance information not just tuples but also the individual values within tuples, using provenance to describe how the values are computed. We realize this approach in a concrete construction, first for "simple" queries where the aggregation operator is the last one applied, and then for arbitrary (positive) relational algebra queries with aggregation; the…
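The idea of annotating an aggregated value, rather than only the tuple that holds it, can be pictured with a small sketch. This is an illustration under stated assumptions, not the paper's construction: the table, the tokens p1, p2, p3, and the helper names sum_with_provenance and evaluate are all hypothetical, and a SUM is kept as a plain list of (token, value) pairs that is resolved only once each token is mapped to a multiplicity.

```python
from collections import defaultdict

# Hypothetical input: (department, salary, provenance token) triples.
ANNOTATED_ROWS = [
    ("d1", 20, "p1"),
    ("d1", 10, "p2"),
    ("d2", 15, "p3"),
]


def sum_with_provenance(rows):
    """Group by department and keep each group's SUM symbolic.

    Each aggregated value is a list of (token, value) pairs instead of a
    single number, so it records which source tuples contributed to it.
    """
    groups = defaultdict(list)
    for dept, salary, token in rows:
        groups[dept].append((token, salary))
    return dict(groups)


def evaluate(symbolic_sum, valuation):
    """Resolve a symbolic sum under a valuation of the tokens.

    The valuation maps each token to a multiplicity
    (0 = tuple deleted, 1 = tuple present once).
    """
    return sum(valuation[token] * value for token, value in symbolic_sum)


result = sum_with_provenance(ANNOTATED_ROWS)
all_present = {"p1": 1, "p2": 1, "p3": 1}
without_p2 = {"p1": 1, "p2": 0, "p3": 1}

print(evaluate(result["d1"], all_present))  # 30
print(evaluate(result["d1"], without_p2))   # 20
```

In this sketch, deleting the tuple tagged p2 leaves the d1 output tuple in place but changes its aggregated value from 30 to 20. A single annotation on the output tuple could say whether the tuple exists, but not what its value becomes, which is one way to read the abstract's case for value-level annotations.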
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Database Systems and Queries · Distributed and Parallel Computing Systems
