Privacy-Preserving Synthetic Location Data in the Real World
Teddy Cunningham, Graham Cormode, Hakan Ferhatosmanoglu

TL;DR
This paper introduces two differentially private methods for generating synthetic location data that retain high practical utility, enabling accurate location analytics while protecting the existence and true location of each individual in the original dataset.
Contribution
The paper presents two novel approaches for privacy-preserving synthetic location data generation, incorporating kernel density estimation and geographic constraints, with demonstrated high utility on large datasets.
Findings
Synthetic data closely matches real data distributions.
High utility in location query analysis.
Methods satisfy differential privacy guarantees.
Abstract
Sharing sensitive data is vital in enabling many modern data analysis and machine learning tasks. However, current methods for data release are insufficiently accurate or granular to provide meaningful utility, and they carry a high risk of deanonymization or membership inference attacks. In this paper, we propose a differentially private synthetic data generation solution with a focus on the compelling domain of location data. We present two methods with high practical utility for generating synthetic location data from real locations, both of which protect the existence and true location of each individual in the original dataset. Our first, partitioning-based approach introduces a novel method for privately generating point data using kernel density estimation, in addition to employing private adaptations of classic statistical techniques, such as clustering, for private…
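To make the idea of privately generating point data from a density estimate concrete, here is a minimal Python sketch. It is not the paper's algorithm. It swaps in a generic technique: a Laplace-noised grid histogram followed by Gaussian kernel jitter around the sampled cell centres. The function name `dp_synthetic_points` and the parameters `grid`, `bandwidth`, and `n_out` are hypothetical, and the sketch assumes each individual contributes exactly one point and that the bounding box and output size are public.

```python
import numpy as np

def dp_synthetic_points(points, bounds, grid=16, epsilon=1.0,
                        bandwidth=0.01, n_out=1000, seed=0):
    """Illustrative sketch only: noisy histogram + kernel sampling.

    Assumes one point per individual, so adding or removing a person
    changes exactly one cell count by 1 (L1 sensitivity 1).
    """
    rng = np.random.default_rng(seed)
    (xmin, xmax), (ymin, ymax) = bounds

    # Count the real points falling in each cell of a public grid.
    counts, xedges, yedges = np.histogram2d(
        points[:, 0], points[:, 1], bins=grid,
        range=[[xmin, xmax], [ymin, ymax]])

    # Laplace mechanism: noise with scale 1/epsilon on every cell count.
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)

    # Everything below is post-processing of the noisy counts,
    # so it does not consume additional privacy budget.
    noisy = np.clip(noisy, 0.0, None)
    if noisy.sum() == 0:
        probs = np.full(noisy.size, 1.0 / noisy.size)
    else:
        probs = (noisy / noisy.sum()).ravel()

    # Sample cells in proportion to their noisy mass.
    cells = rng.choice(noisy.size, size=n_out, p=probs)
    ix, iy = np.unravel_index(cells, noisy.shape)
    centres = np.column_stack([
        (xedges[ix] + xedges[ix + 1]) / 2,
        (yedges[iy] + yedges[iy + 1]) / 2,
    ])

    # Gaussian kernel jitter around each cell centre, clipped to the box.
    synth = centres + rng.normal(scale=bandwidth, size=centres.shape)
    synth[:, 0] = np.clip(synth[:, 0], xmin, xmax)
    synth[:, 1] = np.clip(synth[:, 1], ymin, ymax)
    return synth

# Example usage on made-up uniform data in the unit square.
real = np.random.default_rng(1).uniform(0.0, 1.0, size=(5000, 2))
synthetic = dp_synthetic_points(real, bounds=((0.0, 1.0), (0.0, 1.0)))
```

In this sketch the privacy guarantee comes entirely from the Laplace noise on the cell counts. A coarser grid lowers the relative impact of the noise at the cost of spatial resolution, which is the kind of utility trade-off a partitioning-based method has to manage.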
