A Provenance Tracking Model for Data Updates
Gabriel Ciobanu (Romanian Academy, Institute of Computer Science), Ross Horne (Romanian Academy, Institute of Computer Science)

TL;DR
This paper introduces a formal calculus for tracking data provenance during updates in decentralized systems, providing a semantics that accounts for concurrency and data quality assessment.
Contribution
It presents a novel calculus and semantics for data updates and provenance in concurrent, decentralized systems, extending existing provenance models.
Findings
Provides a formal semantics for data updates and provenance in concurrent systems
Develops a sound and complete model for the calculus based on ideals of series-parallel DAGs
Proposes provenance as a subjective indicator of data quality in concurrent, decentralised systems
Abstract
For data-centric systems, provenance tracking is particularly important when the system is open and decentralised, such as the Web of Linked Data. In this paper, a concise but expressive calculus which models data updates is presented. The calculus is used to provide an operational semantics for a system where data and updates interact concurrently. The operational semantics of the calculus also tracks the provenance of data with respect to updates. This provides a new formal semantics extending provenance diagrams which takes into account the execution of processes in a concurrent setting. Moreover, a sound and complete model for the calculus based on ideals of series-parallel DAGs is provided. The notion of provenance introduced can be used as a subjective indicator of the quality of data in concurrent interacting systems.
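To make the phrase "series-parallel DAGs" concrete, here is a minimal illustrative sketch. It is not the paper's calculus or its model of ideals: the names `Event`, `Seq`, `Par`, `sources`, `sinks`, and `edges`, as well as the example update names, are all assumptions introduced for this illustration. It only shows how a provenance record built from sequential and parallel composition of update events determines a DAG of "happened before" edges, assuming every event name is unique.

```python
from dataclasses import dataclass
from typing import Set, Tuple, Union


@dataclass(frozen=True)
class Event:
    """A single update event (hypothetical; names are assumed unique)."""
    name: str


@dataclass(frozen=True)
class Seq:
    """Series composition: everything in `first` precedes `second`."""
    first: "Prov"
    second: "Prov"


@dataclass(frozen=True)
class Par:
    """Parallel composition: the two sides are causally unrelated."""
    left: "Prov"
    right: "Prov"


Prov = Union[Event, Seq, Par]


def sources(p: Prov) -> Set[str]:
    """Events with no predecessor inside p."""
    if isinstance(p, Event):
        return {p.name}
    if isinstance(p, Seq):
        return sources(p.first)
    return sources(p.left) | sources(p.right)


def sinks(p: Prov) -> Set[str]:
    """Events with no successor inside p."""
    if isinstance(p, Event):
        return {p.name}
    if isinstance(p, Seq):
        return sinks(p.second)
    return sinks(p.left) | sinks(p.right)


def edges(p: Prov) -> Set[Tuple[str, str]]:
    """Direct dependency edges of the series-parallel DAG described by p."""
    if isinstance(p, Event):
        return set()
    if isinstance(p, Seq):
        # Series composition links every sink of the first part
        # to every source of the second part.
        cross = {(a, b) for a in sinks(p.first) for b in sources(p.second)}
        return edges(p.first) | edges(p.second) | cross
    # Parallel composition adds no edges between the two sides.
    return edges(p.left) | edges(p.right)


# Two independent inserts, followed by a delete that depends on both.
example = Seq(Par(Event("insert_a"), Event("insert_b")), Event("delete_c"))
```

In this sketch, `edges(example)` contains one edge from each insert to the delete and none between the two inserts, which reflects that the inserts happened concurrently.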
