PUG: A Framework and Practical Implementation for Why & Why-Not Provenance (extended version)
Seokki Lee, Bertram Ludaescher, Boris Glavic

TL;DR
This paper introduces PUG, a practical graph-based framework for explaining why an answer is or is not returned by a first-order query (a query with negation), with support for reverse reasoning and provenance factorization.
Contribution
It presents the first practical approach to why and why-not provenance explanations for queries with negation, consisting of a graph-based provenance model that encodes a wide range of provenance models from the literature and a system implementation of that model.
Findings
Efficient explanation generation for complex queries.
Supports reverse reasoning in provenance analysis.
Achieves provenance factorization through query rewriting.
Abstract
Explaining why an answer is (or is not) returned by a query is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Specifically, we introduce a graph-based provenance model that, while syntactic in nature, supports reverse reasoning and is proven to encode a wide range of provenance models from the literature. The implementation of this model in our PUG (Provenance Unification through Graphs) system takes a provenance question and Datalog query as an input and generates a Datalog program that computes an explanation, i.e., the part of the provenance that is relevant to answer the question. Furthermore, we demonstrate how a desirable factorization of provenance can be achieved by…
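To make the idea of a why or why-not explanation concrete, the following is a minimal Python sketch. It is not PUG's implementation: PUG generates a Datalog program, whereas this sketch simply enumerates the derivations of a single rule with negation. The relation `hop`, the rule `only2hop`, the toy data, and the function names are all assumptions made for illustration.

```python
# Hypothetical toy database: hop(X, Y) edges (illustrative data only).
HOP = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
DOMAIN = sorted({v for edge in HOP for v in edge})

# Rule assumed for illustration (one negated goal):
#   only2hop(X, Y) :- hop(X, Z), hop(Z, Y), not hop(X, Y).


def derivations(x, y):
    """Yield (z, goals) for every binding of the existential variable Z.

    goals is a list of (grounded goal, holds?) pairs for the rule body.
    """
    for z in DOMAIN:
        goals = [
            (f"hop({x},{z})", (x, z) in HOP),
            (f"hop({z},{y})", (z, y) in HOP),
            (f"not hop({x},{y})", (x, y) not in HOP),
        ]
        yield z, goals


def explain(x, y):
    """Explain why only2hop(x, y) is, or is not, in the query result.

    Returns ("why", successful derivations) if the tuple is derivable,
    otherwise ("why-not", failed derivations reduced to their failed goals).
    """
    all_derivs = list(derivations(x, y))
    successes = [(z, goals) for z, goals in all_derivs
                 if all(ok for _, ok in goals)]
    if successes:
        return "why", successes
    failures = [(z, [goal for goal, ok in goals if not ok])
                for z, goals in all_derivs]
    return "why-not", failures


if __name__ == "__main__":
    print(explain("b", "d"))  # a "why" question: the tuple is derivable
    print(explain("a", "c"))  # a "why-not" question: the negated goal fails
```

In this sketch a why explanation lists the successful rule derivations together with the goals they rely on, and a why-not explanation lists every failed derivation with only the goals that caused it to fail. Restricting the output to the queried tuple loosely mirrors the abstract's notion of an explanation as the part of the provenance relevant to the question.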
Taxonomy
Topics: Scientific Computing and Data Management · Data Quality and Management · Research Data Management Practices
