An applied Perspective: Estimating the Differential Identifiability Risk of an Exemplary SOEP Data Set
Jonas Allmann, Saskia Nuñez von Voigt, Florian Tschorsch

TL;DR
This paper evaluates the privacy risks associated with releasing anonymized real-world data, extending a privacy metric to better understand differential privacy guarantees in practical scenarios.
Contribution
It extends an existing privacy risk metric and demonstrates how to efficiently compute it for basic statistical queries on real-world data.
Findings
Extends the privacy risk metric for practical use
Empirical analysis on real-world data reveals challenges in risk estimation
Provides insights into privacy guarantees in real-world data sharing
Abstract
Using real-world study data usually requires contractual agreements where research results may only be published in anonymized form. Requiring formal privacy guarantees, such as differential privacy, could be helpful for data-driven projects to comply with data protection. However, deploying differential privacy in consumer use cases raises the need to explain its underlying mechanisms and the resulting privacy guarantees. In this paper, we thoroughly review and extend an existing privacy metric. We show how to compute this risk metric efficiently for a set of basic statistical queries. Our empirical analysis based on an extensive, real-world scientific data set expands the knowledge on how to compute risks under realistic conditions, while presenting more challenges than solutions.
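For readers unfamiliar with the mechanisms the abstract refers to, the following is a minimal sketch of a differentially private counting query using the Laplace mechanism. It is not the risk metric or the implementation from the paper; the function names (`laplace_noise`, `dp_count`), the toy `ages` list, and the choice of `epsilon` are all assumptions made purely for illustration.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent Exponential(rate=1/scale)
    variables follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(values, predicate, epsilon, rng=None):
    """Return a noisy count of the items satisfying `predicate`.

    A counting query changes by at most 1 when one record is added
    or removed, so its sensitivity is 1 and the Laplace scale is
    1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    true_count = sum(1 for v in values if predicate(v))
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical toy data, not taken from the SOEP data set.
ages = [23, 35, 41, 29, 62, 57, 33]
noisy = dp_count(ages, lambda a: a >= 40, epsilon=0.5, rng=random.Random(0))
print(f"noisy count of ages >= 40: {noisy:.2f}")
```

A smaller `epsilon` means a larger noise scale and a stronger formal guarantee, at the cost of a less accurate answer. Translating a given `epsilon` into an interpretable risk for the individuals in the data is the kind of question the paper's risk metric is meant to address.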
Taxonomy
Topics: Data Quality and Management
