PUG: A Framework and Practical Implementation for Why & Why-Not Provenance (extended version)

Seokki Lee; Bertram Ludaescher; Boris Glavic

arXiv:1808.05752·cs.DB·August 20, 2018

Open Access

TL;DR

This paper introduces PUG, a practical graph-based framework for explaining why an answer is or is not returned by a first-order query (a query with negation), supporting reverse reasoning and provenance factorization.

Contribution

It presents the first practical approach for answering why and why-not provenance questions over queries with negation, combining a graph-based provenance model that encodes a wide range of provenance models from the literature with a system implementation (PUG).

Findings

01 Efficient explanation generation for complex queries.

02 Supports reverse reasoning in provenance analysis.

03 Achieves provenance factorization through query rewriting.

Abstract

Explaining why an answer is (or is not) returned by a query is important for many applications including auditing, debugging data and queries, and answering hypothetical questions about data. In this work, we present the first practical approach for answering such questions for queries with negation (first-order queries). Specifically, we introduce a graph-based provenance model that, while syntactic in nature, supports reverse reasoning and is proven to encode a wide range of provenance models from the literature. The implementation of this model in our PUG (Provenance Unification through Graphs) system takes a provenance question and Datalog query as an input and generates a Datalog program that computes an explanation, i.e., the part of the provenance that is relevant to answer the question. Furthermore, we demonstrate how a desirable factorization of provenance can be achieved by…
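To make the idea of why and why-not questions over a query with negation concrete, here is a minimal Python sketch. It is not PUG's graph model and it does not generate a Datalog program; it simply enumerates all groundings of one rule by brute force. The rule, the relation name `train`, the toy data, and the helper names `explain`, `why`, and `why_not` are all assumptions made for illustration.

```python
# Hypothetical toy database of direct train connections (illustrative only).
TRAIN = {("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")}
DOMAIN = sorted({v for edge in TRAIN for v in edge})


def explain(x, y):
    """Evaluate every grounding of the assumed rule

        only2hop(X, Y) :- train(X, Z), train(Z, Y), not train(X, Y).

    for the fixed head (x, y). Returns a list of (z, goals, succeeded),
    where goals pairs each ground goal with its truth value.
    """
    derivations = []
    for z in DOMAIN:
        goals = [
            (f"train({x},{z})", (x, z) in TRAIN),
            (f"train({z},{y})", (z, y) in TRAIN),
            (f"not train({x},{y})", (x, y) not in TRAIN),
        ]
        derivations.append((z, goals, all(ok for _, ok in goals)))
    return derivations


def why(x, y):
    """Bindings of Z whose grounding succeeds: evidence for an existing answer."""
    return [z for z, _, ok in explain(x, y) if ok]


def why_not(x, y):
    """For each failed grounding, the goals responsible for the failure."""
    return {
        z: [goal for goal, ok in goals if not ok]
        for z, goals, ok in explain(x, y)
        if not ok
    }


if __name__ == "__main__":
    print("why (b,d):", why("b", "d"))
    print("why-not (a,c):", why_not("a", "c"))
```

In this toy data, `(a, c)` is missing from the result because the direct connection `train(a,c)` exists, so the negated goal fails in every grounding. A why-not explanation has to reason about existing tuples as well as missing ones, which is the situation the abstract targets for queries with negation.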

Taxonomy

Topics: Scientific Computing and Data Management · Data Quality and Management · Research Data Management Practices