How Tough Is Location Anonymization? Re-identifying 100K Real-User Trajectories in Japan
Abhishek Kumar Mishra, Mathieu Cunche, Heber H. Arcolezi

TL;DR
This study shows that the anonymization applied to YJMob100K, a released dataset of 100,000 user trajectories in Japan, is inadequate: re-identification remains feasible, which underscores the need for more robust privacy-preserving techniques.
Contribution
The paper provides a comprehensive analysis of privacy risks in released mobility datasets and evaluates the effectiveness of current sanitization strategies.
Findings
Re-identification is possible using density, urban, and temporal signatures.
Current privacy techniques often compromise data utility or fail to prevent re-identification.
Strong privacy settings significantly reduce data usefulness.
Abstract
Mobility traces are among the most revealing forms of personal data, yet trajectory releases are often protected only by ad hoc transformations. We stress-test such practices on the recently released YJMob100K, an anonymized dataset of 100,000 user trajectories in Japan. First, we show that the applied protection leaves enough spatial and temporal structure to recover both the real-world geographic frame and the actual calendar timeline by exploiting density signatures, urban correlations, and temporal activity profiles. On top of this reconstruction, we quantify privacy risks through trajectory-level metrics that capture spatio-temporal k-anonymity, p-point unicity, home-work and multi-anchor uniqueness, and exposure to secluded and sensitive locations. These metrics reveal extensive re-identification surfaces: a small number of observations, anchors, or sensitive venues often suffices to…
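To make the unicity metric named in the abstract concrete, here is a minimal sketch of how such a measure is commonly estimated: draw a few spatio-temporal points from a user's own trajectory and check whether any other user's trajectory also contains all of them. This is an illustration, not the authors' implementation. The function name `unicity`, the `(x_cell, y_cell, time_slot)` point encoding, and the parameters `p`, `trials`, and `seed` are all assumptions made for the example.

```python
import random
from typing import Dict, Set, Tuple

# Hypothetical discretization: (x_cell, y_cell, time_slot).
Point = Tuple[int, int, int]


def unicity(trajs: Dict[str, Set[Point]], p: int,
            trials: int = 50, seed: int = 0) -> float:
    """Estimate the fraction of random p-point samples that match exactly one user.

    For each user with at least p points, draw `trials` random subsets of p
    points from that user's trajectory and count how many trajectories in the
    dataset contain the whole subset. A subset matched by one trajectory only
    (the user's own) counts as uniquely identifying.
    """
    rng = random.Random(seed)
    unique = 0
    total = 0
    for points in trajs.values():
        if len(points) < p:
            continue  # not enough observations to draw p points
        ordered = sorted(points)  # fixed order keeps sampling reproducible
        for _ in range(trials):
            sample = set(rng.sample(ordered, p))
            matches = sum(1 for other in trajs.values() if sample <= other)
            unique += int(matches == 1)
            total += 1
    return unique / total if total else 0.0
```

A value close to 1 for small `p` would mean that a handful of known observations is enough to single out most users, which is the kind of re-identification surface the abstract describes. The exhaustive scan over all trajectories is quadratic in the number of users, so a dataset of 100,000 trajectories would need an index from points to users rather than this brute-force loop.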
