Equivalence-Invariant Algebraic Provenance for Hyperplane Update Queries
Pierre Bourhis, Daniel Deutch, Yuval Moskovitch

TL;DR
This paper introduces an algebraic provenance model for hyperplane update queries that is invariant under set equivalence, providing a scalable and efficient way to handle provenance information in database updates.
Contribution
It presents the first algebraic provenance model for hyperplane update queries invariant under set equivalence, with an efficient normal form for provenance computation and storage.
Findings
Normal form reduces exponential growth in provenance expressions
Normal form can be computed efficiently during query evaluation
Experimental results show scalability and effectiveness of the approach
Abstract
The algebraic approach for provenance tracking, originating in the semiring model of Green et al., has proven useful as an abstract way of handling metadata. Commutative semirings were shown to be the "correct" algebraic structure for Union of Conjunctive Queries, in the sense that their use allows provenance to be invariant under certain expected query equivalence axioms. In this paper we present the first (to our knowledge) algebraic provenance model, for a fragment of update queries, that is invariant under set equivalence. The fragment that we focus on is that of hyperplane queries, previously studied in multiple lines of work. Our algebraic provenance structure and corresponding provenance-aware semantics are based on the sound and complete axiomatization of Karabeg and Vianu. We demonstrate that our construction can guide the design of concrete provenance model instances for…
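The sense in which commutative semirings make provenance invariant under query equivalence can be illustrated with a small sketch. It is not taken from the paper and covers only the semiring model for positive queries that the abstract cites as background, not the update-query model itself. It assumes the usual reading in which union adds annotations and join multiplies them. All names (`var`, `plus`, `times`, `union`, `join`, and the relations `R`, `S`, `T`) are hypothetical.

```python
from collections import Counter
from itertools import product

# A provenance polynomial is a Counter mapping a monomial (a sorted tuple of
# annotation variables) to its natural-number coefficient.

def var(name):
    """Polynomial consisting of a single annotation variable."""
    return Counter({(name,): 1})

def plus(p, q):
    """Semiring addition: coefficients of equal monomials are summed."""
    out = Counter(p)
    out.update(q)  # Counter.update adds counts
    return out

def times(p, q):
    """Semiring multiplication: monomials are concatenated and re-sorted."""
    out = Counter()
    for (m1, c1), (m2, c2) in product(p.items(), q.items()):
        out[tuple(sorted(m1 + m2))] += c1 * c2
    return out

def union(r, s):
    """Annotated union: a tuple present in both inputs gets the sum."""
    out = dict(r)
    for t, p in s.items():
        out[t] = plus(out[t], p) if t in out else p
    return out

def join(r, s):
    """Annotated join of r(a, b) with s(b, c): annotations are multiplied."""
    out = {}
    for (a, b), p in r.items():
        for (b2, c), q in s.items():
            if b == b2:
                t = (a, b, c)
                pq = times(p, q)
                out[t] = plus(out[t], pq) if t in out else pq
    return out

# Hypothetical annotated relations: each tuple carries one variable.
R = {("a", "b"): var("x1")}
S = {("a", "b"): var("x2")}
T = {("b", "c"): var("y1")}

# Two equivalent queries: (R union S) join T  vs  (R join T) union (S join T).
lhs = join(union(R, S), T)
rhs = union(join(R, T), join(S, T))
assert lhs == rhs  # distributivity makes the provenance coincide
assert union(R, S) == union(S, R)  # commutativity of +
```

Both query forms annotate the output tuple with x1·y1 + x2·y1, so the provenance does not depend on which equivalent formulation was evaluated. The abstract states that the paper aims for the analogous property for hyperplane update queries.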
Taxonomy
Topics: Scientific Computing and Data Management · Research Data Management Practices · Distributed and Parallel Computing Systems
