Provenance, Anonymisation and Data Environments: a Unifying Construction

Muhammad Aslam Jarwar; Adriane Chapman; Mark Elliot; Fatemeh Raji

arXiv:2107.09966·cs.DB·July 22, 2021

Open Access

TL;DR

This paper explores how provenance information can enhance anonymisation decision-making within data environments, proposing multiple implementation strategies and analyzing their trade-offs.

Contribution

It shows how data environments, a currently unmet modeling requirement of the ADF, can be represented with W3C PROV provenance data.

Findings

01. Provenance information can support anonymisation decision-making within the ADF.

02. Data environments can be implemented in W3C PROV in four different ways.

03. Each implementation approach is analyzed for its costs and benefits.

Abstract

The Anonymisation Decision-making Framework (ADF) operationalizes the risk management of data exchange between organizations, referred to as "data environments". The second edition of the ADF has increased its emphasis on modeling data flows, highlighting a potential new use of provenance information to support anonymisation decision-making. In this paper, we provide a use case that showcases this functionality in more detail. Based on this use case, we identify how provenance information could be utilized within the ADF, and identify a currently unmet requirement: the modeling of data environments. We show how data environments can be implemented within W3C PROV in four different ways. We analyze the costs and benefits of each approach, and consider another use case as a partial check for completeness. We then summarize our findings and suggest ways forward.

Taxonomy

Topics: Scientific Computing and Data Management · Advanced Data Storage Technologies · Data Quality and Management