Provenance, Anonymisation and Data Environments: a Unifying Construction
Muhammad Aslam Jarwar, Adriane Chapman, Mark Elliot, Fatemeh Raji

TL;DR
This paper explores how provenance information can enhance anonymisation decision-making within data environments, proposing multiple implementation strategies and analyzing their trade-offs.
Contribution
It introduces a novel approach to modeling data environments using provenance data within the ADF, addressing a currently unmet requirement.
Findings
Provenance can support anonymisation decisions effectively.
Data environments can be implemented within W3C PROV in four different ways.
Each implementation approach is analyzed for its costs and benefits.
Abstract
The Anonymisation Decision-making Framework (ADF) operationalizes the risk management of data exchange between organizations, referred to as "data environments". The second edition of the ADF has increased its emphasis on modeling data flows, highlighting a potential new use of provenance information to support anonymisation decision-making. In this paper, we provide a use case that showcases this functionality in more detail. Based on this use case, we identify how provenance information could be utilized within the ADF, and identify a currently unmet requirement: the modeling of data environments. We show how data environments can be implemented within W3C PROV in four different ways. We analyze the costs and benefits of each approach, and consider another use case as a partial check for completeness. We then summarize our findings and suggest ways forward.
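The abstract does not say what the four implementations are, so the following is only a minimal sketch of one conceivable encoding, not the authors' method. It assumes each PROV entity carries a hypothetical `ex:environment` attribute (not a W3C PROV term) and uses a small in-memory stand-in rather than a real PROV library. The entity and environment names (`hospital`, `university`) are invented for illustration. The sketch flags derivations that cross an environment boundary, which is the kind of data flow an anonymisation decision would need to consider.

```python
from dataclasses import dataclass, field


@dataclass
class ProvDocument:
    """A tiny in-memory stand-in for a PROV document (not a real PROV library)."""
    entities: dict = field(default_factory=dict)     # entity id -> attribute dict
    derivations: list = field(default_factory=list)  # (derived id, source id) pairs

    def entity(self, entity_id, environment):
        # "ex:environment" is a hypothetical attribute, not a W3C PROV term.
        self.entities[entity_id] = {"ex:environment": environment}

    def was_derived_from(self, derived_id, source_id):
        # Mirrors the PROV wasDerivedFrom relation between two entities.
        self.derivations.append((derived_id, source_id))

    def cross_environment_flows(self):
        """Return derivations whose source and result sit in different environments."""
        flows = []
        for derived_id, source_id in self.derivations:
            src_env = self.entities[source_id]["ex:environment"]
            dst_env = self.entities[derived_id]["ex:environment"]
            if src_env != dst_env:
                flows.append((source_id, src_env, derived_id, dst_env))
        return flows


# Hypothetical example: data pseudonymised inside one environment, then shared.
doc = ProvDocument()
doc.entity("ex:raw_records", "hospital")
doc.entity("ex:pseudonymised_records", "hospital")
doc.entity("ex:research_extract", "university")
doc.was_derived_from("ex:pseudonymised_records", "ex:raw_records")
doc.was_derived_from("ex:research_extract", "ex:pseudonymised_records")

flows = doc.cross_environment_flows()
for source_id, src_env, derived_id, dst_env in flows:
    print(f"{source_id} ({src_env}) -> {derived_id} ({dst_env})")
```

Tagging entities with an attribute is only one design choice; an environment could equally be modeled as its own PROV element, and the paper's comparison of costs and benefits addresses exactly this kind of trade-off.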
Taxonomy
Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Data Quality and Management
