Privacy Preservation by Disassociation
Manolis Terrovitis, John Liagouris, Nikos Mamoulis, Spiros Skiadopoulos

TL;DR
This paper introduces an anonymization technique called disassociation, which protects user privacy in sparse multidimensional data by breaking the associations between the terms of a record. It is especially useful for web search query logs.
Contribution
The paper presents the first disassociation-based anonymization method that preserves original data terms while preventing linkage attacks, outperforming existing generalization and differential privacy techniques.
Findings
Disassociation effectively prevents linkage of rare term combinations.
The method outperforms state-of-the-art techniques in preserving data utility.
Experimental results show improved privacy protection on real and synthetic datasets.
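To make "rare term combinations" concrete, the following is a minimal sketch that counts how many records contain each small combination of terms and flags those below a support threshold. It is not the paper's disassociation algorithm. The function name `rare_combinations`, the parameters `k` (minimum support) and `m` (maximum combination size), and the toy records are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def rare_combinations(records, k, m):
    """Return term combinations of size <= m found in fewer than k records.

    Hypothetical helper: k and m are illustrative parameters, not values
    taken from the paper.
    """
    support = Counter()
    for record in records:
        terms = sorted(set(record))  # canonical order so each combo has one key
        for size in range(1, m + 1):
            for combo in combinations(terms, size):
                support[combo] += 1
    return {combo: count for combo, count in support.items() if count < k}


# Hypothetical toy query log: each record is the set of terms of one user.
records = [
    {"flu", "pizza"},
    {"flu", "pizza"},
    {"flu", "pizza", "rare_disease"},
]

rare = rare_combinations(records, k=2, m=2)
for combo, count in sorted(rare.items()):
    print(combo, count)
```

In this toy log, every combination involving `rare_disease` occurs in a single record, so an attacker who knows such a combination could link it to one user. Combinations like these are what an anonymization method would need to hide.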
Abstract
In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy) and/or (b) assume a clear distinction between sensitive and non-sensitive information and sever the possible linkage. In many real-world applications the above techniques are not applicable. For instance, consider web search query logs. Suppressing or generalizing anonymization methods would remove the most valuable information in the dataset: the original query terms. Additionally, web search query logs contain millions of query terms which cannot be categorized as sensitive or non-sensitive since a term may be sensitive for a…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Cryptography and Data Security
