Causality and the semantics of provenance
James Cheney

TL;DR
This paper explores the formal foundations of provenance and causality, proposing a causal semantics for provenance graphs based on structural models to better justify and compare provenance techniques.
Contribution
It introduces a formal causal semantics for provenance graphs using structural models, bridging the gap between informal notions and rigorous mathematical models.
Findings
Review of a causality theory based on structural models
Development of a causal semantics for provenance graphs
Argument that formal models are needed to justify and compare provenance mechanisms
Abstract
Provenance, or information about the sources, derivation, custody or history of data, has been studied recently in a number of contexts, including databases, scientific workflows and the Semantic Web. Many provenance mechanisms have been developed, motivated by informal notions such as influence, dependence, explanation and causality. However, there has been little study of whether these mechanisms formally satisfy appropriate policies or even how to formalize relevant motivating concepts such as causality. We contend that mathematical models of these concepts are needed to justify and compare provenance techniques. In this paper we review a theory of causality based on structural models that has been developed in artificial intelligence, and describe work in progress on a causal semantics for provenance graphs.
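To make the idea of a structural model concrete, the following is a minimal sketch and not the paper's formalism. It assumes a toy provenance graph in which each derived value has one structural equation over its inputs, and it tests simple counterfactual (but-for) dependence, which is weaker than the full definitions of actual cause in the structural-model literature. All names here (`solve`, `depends_on`, and the variables `a`, `b`, `x`, `c`, `d`, `e`) are hypothetical and chosen for illustration only.

```python
from typing import Callable, Dict, Iterable, Optional

Values = Dict[str, int]
# Hypothetical representation: one structural equation per derived variable,
# listed in an order where every input is defined before it is used.
Equations = Dict[str, Callable[[Values], int]]


def solve(equations: Equations, exogenous: Values,
          interventions: Optional[Values] = None) -> Values:
    """Compute all values, forcing any intervened variable to its given value."""
    interventions = interventions or {}
    # Start from the inputs, overriding any input that is intervened on.
    values = {**exogenous,
              **{k: v for k, v in interventions.items() if k in exogenous}}
    for name, equation in equations.items():
        if name in interventions:
            # An intervention replaces the variable's equation by a constant.
            values[name] = interventions[name]
        else:
            values[name] = equation(values)
    return values


def depends_on(equations: Equations, exogenous: Values, target: str,
               var: str, alternatives: Iterable[int]) -> bool:
    """But-for test: does some alternative value of `var` change `target`?"""
    actual = solve(equations, exogenous)[target]
    return any(solve(equations, exogenous, {var: alt})[target] != actual
               for alt in alternatives)


# Toy provenance graph (an assumption for illustration):
# c is derived from a and b, d from c, and e is a copy of x.
exogenous = {"a": 1, "b": 2, "x": 5}
equations: Equations = {
    "c": lambda v: v["a"] + v["b"],
    "d": lambda v: v["c"] * 2,
    "e": lambda v: v["x"],
}

d_depends_on_a = depends_on(equations, exogenous, "d", "a", [0, 2])
d_depends_on_x = depends_on(equations, exogenous, "d", "x", [0, 2])
```

In this sketch `d` counterfactually depends on `a` but not on `x`, which matches the edges one would expect in a provenance graph for this computation. Comparing a recorded provenance graph against such dependence tests is one way to read the abstract's call for mathematical models that justify provenance techniques.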
Taxonomy
Topics: Scientific Computing and Data Management · Data Quality and Management · Research Data Management Practices
