TL;DR
This paper critically examines the claim by De Montjoye et al. (Science, 2015) that most individuals can be reidentified from a deidentified credit card transaction database. It shows that the reported risk was overestimated and that proper anonymization can prevent reidentification while preserving data utility.
Contribution
The authors clarify misunderstandings in prior work and provide a validated method for anonymizing data to eliminate reidentification risks, supported by open datasets and algorithms.
Findings
The reidentification risk reported by De Montjoye et al. was significantly overestimated, owing to a misunderstanding of the reidentification attack.
Proper anonymization can eliminate reidentification while preserving data utility.
Open datasets and algorithms are available for verification and use.
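To make the findings above concrete, here is a minimal sketch of how one might measure how many records are unique on a set of quasi-identifiers, before and after coarsening them. It is an illustration only and not the authors' method: the toy records, the field names (shop, day, price), the helper names, and the use of k-anonymity as the protection criterion are all assumptions introduced for this example.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def unique_fraction(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination is unique in this dataset."""
    sizes = group_sizes(records, quasi_identifiers)
    unique = sum(1 for n in sizes.values() if n == 1)
    return unique / len(records)


def is_k_anonymous(records, quasi_identifiers, k):
    """True if every quasi-identifier combination is shared by at least k records."""
    return all(n >= k for n in group_sizes(records, quasi_identifiers).values())


def coarsen_price(records, width):
    """Hypothetical generalization step: round each price down to a bucket of `width`."""
    return [{**r, "price": (r["price"] // width) * width} for r in records]


# Toy data, invented for illustration.
records = [
    {"shop": "bakery", "day": 1, "price": 12},
    {"shop": "bakery", "day": 1, "price": 14},
    {"shop": "cafe", "day": 2, "price": 31},
    {"shop": "cafe", "day": 2, "price": 38},
]
qi = ["shop", "day", "price"]

print(unique_fraction(records, qi))        # every raw record is unique
coarse = coarsen_price(records, 10)
print(unique_fraction(coarse, qi))         # no record is unique after coarsening
print(is_k_anonymous(coarse, qi, k=2))
```

Note that the sketch only measures uniqueness within the given dataset. Whether a unique record can actually be linked to a person depends on what the attacker knows, which the sketch does not model.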
Abstract
The study by De Montjoye et al. ("Science", 30 January 2015, p. 536) claimed that most individuals can be reidentified from a deidentified credit card transaction database and that anonymization mechanisms are not effective against reidentification. Such claims deserve detailed quantitative scrutiny, as they might seriously undermine the willingness of data owners and subjects to share data for research. In a recent Technical Comment published in "Science" (18 March 2016, p. 1274), we demonstrate that the reidentification risk reported by De Montjoye et al. was significantly overestimated (due to a misunderstanding of the reidentification attack) and that the alleged ineffectiveness of anonymization is due to the choice of poor and undocumented methods and to a general disregard of 40 years of anonymization literature. The technical comment also shows how to properly anonymize data, in…
