On the Evaluation of the Privacy Breach in Disassociated Set-Valued Datasets
Sara Barakat, Bechara Al Bouna, Mohamed Nassar, Christophe Guyeux

TL;DR
This paper investigates the privacy risks of disassociation, a bucketization technique for anonymizing set-valued datasets, revealing potential privacy breaches and evaluating their severity with a quantitative detection method.
Contribution
It demonstrates the limitations of disassociation in preventing privacy breaches and evaluates these risks using real datasets and a novel detection algorithm.
Findings
Privacy breaches can occur due to cover problems in disassociated datasets.
Disassociation's limits are demonstrated through real dataset analysis.
Quantitative methods can detect privacy breaches effectively.
Abstract
Data anonymization is gaining much attention these days as it provides the fundamental requirements to safely outsource datasets containing identifying information. While some techniques add noise to protect privacy, others use generalization to hide the link between sensitive and non-sensitive information, or separate the dataset into clusters to gain more utility. In the latter approach, often referred to as bucketization, data values are kept intact and only the link is hidden, to maximize utility. In this paper, we showcase the limits of disassociation, a bucketization technique that divides a set-valued dataset into k^m-anonymous clusters. We demonstrate that a privacy breach might occur if the disassociated dataset is subject to a cover problem. Finally, we evaluate the privacy breach using the quantitative privacy breach detection algorithm on real disassociated datasets.
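As a rough illustration of the anonymity guarantee that disassociation targets within each cluster, the sketch below checks k^m-anonymity under its usual definition: every combination of at most m items that appears in some record must appear in at least k records. The function name `is_km_anonymous` and the toy clusters are hypothetical and chosen only for illustration; this is not the paper's breach detection algorithm, and it does not model the cover problem.

```python
from collections import Counter
from itertools import combinations


def is_km_anonymous(records, k, m):
    """Check k^m-anonymity of a collection of item sets.

    Returns True if every itemset of size <= m that occurs in at least
    one record occurs in at least k records overall.
    """
    support = Counter()
    for record in records:
        # Sort so that each itemset has one canonical tuple form.
        items = sorted(set(record))
        for size in range(1, min(m, len(items)) + 1):
            support.update(combinations(items, size))
    return all(count >= k for count in support.values())


# Hypothetical toy clusters (not taken from the paper's datasets).
cluster_ok = [{"a", "b"}, {"a", "b"}, {"a"}]
cluster_bad = [{"a", "b"}, {"a", "b"}, {"a", "b", "c"}]

print(is_km_anonymous(cluster_ok, k=2, m=2))
print(is_km_anonymous(cluster_bad, k=2, m=2))
```

In the second cluster the item "c" occurs in a single record, so an adversary who knows that a target's record contains "c" can isolate it; disassociation avoids this inside a cluster by splitting such items into separate record chunks.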
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Cryptography and Data Security
