Privacy-Preserving Synthetic Location Data in the Real World

Teddy Cunningham; Graham Cormode; Hakan Ferhatosmanoglu

arXiv:2108.02089 · cs.DB · August 25, 2021

TL;DR

This paper introduces two differentially private methods for generating synthetic location data. Both aim to keep the data useful for location analytics while protecting the existence and true location of each individual in the original dataset.

Contribution

The paper presents two novel approaches for privacy-preserving synthetic location data generation, incorporating kernel density estimation and geographic constraints, with demonstrated high utility on large datasets.
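To make the general idea concrete, here is a minimal sketch of one generic way to produce differentially private synthetic points: a Laplace-noised grid histogram, followed by sampling with Gaussian kernel jitter. It is not the paper's algorithm. The function names (`dp_grid_counts`, `sample_synthetic`), the grid resolution, the bandwidth, and the assumption that the bounding box and the output size `n` are public are all illustrative choices, and geographic constraints are not modelled.

```python
import random


def laplace_noise(scale, rng):
    """One draw from Laplace(0, scale), as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_grid_counts(points, bounds, cells, epsilon, rng):
    """Noisy per-cell counts over a cells x cells grid (hypothetical helper).

    Assumes every point lies inside `bounds` and that `bounds` is public.
    Adding or removing one point changes exactly one count by 1, so the
    histogram has L1 sensitivity 1 and Laplace noise of scale 1/epsilon
    gives epsilon-differential privacy for the counts.
    """
    min_x, min_y, max_x, max_y = bounds
    counts = [[0] * cells for _ in range(cells)]
    for x, y in points:
        col = min(int((x - min_x) / (max_x - min_x) * cells), cells - 1)
        row = min(int((y - min_y) / (max_y - min_y) * cells), cells - 1)
        counts[row][col] += 1
    # Clamping negatives to zero is post-processing and costs no privacy.
    return [
        [max(0.0, c + laplace_noise(1.0 / epsilon, rng)) for c in row]
        for row in counts
    ]


def sample_synthetic(noisy, bounds, n, bandwidth, rng):
    """Draw n synthetic points from the noisy histogram (hypothetical helper).

    `n` is assumed to be a public parameter, not the true dataset size.
    """
    min_x, min_y, max_x, max_y = bounds
    cells = len(noisy)
    cell_w = (max_x - min_x) / cells
    cell_h = (max_y - min_y) / cells
    flat = [(r, c) for r in range(cells) for c in range(cells)]
    weights = [noisy[r][c] for r, c in flat]
    if sum(weights) <= 0:
        weights = [1.0] * len(flat)  # fall back to a uniform choice of cell
    out = []
    for r, c in rng.choices(flat, weights=weights, k=n):
        # Cell centre plus Gaussian kernel jitter, clamped to the bounding box.
        x = min_x + (c + 0.5) * cell_w + rng.gauss(0.0, bandwidth)
        y = min_y + (r + 0.5) * cell_h + rng.gauss(0.0, bandwidth)
        out.append((min(max(x, min_x), max_x), min(max(y, min_y), max_y)))
    return out


# Usage on made-up data in the unit square.
rng = random.Random(0)
bounds = (0.0, 0.0, 1.0, 1.0)
real = [(rng.uniform(0, 1), rng.uniform(0, 1)) for _ in range(500)]
noisy = dp_grid_counts(real, bounds, cells=8, epsilon=1.0, rng=rng)
synthetic = sample_synthetic(noisy, bounds, n=500, bandwidth=0.05, rng=rng)
```

Because the sampler only ever sees the noisy counts, the synthetic points inherit the privacy guarantee of the histogram. A coarser grid or a smaller `epsilon` trades spatial accuracy for stronger privacy.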

Findings

01. Synthetic data closely matches real data distributions.

02. High utility in location query analysis.

03. Methods satisfy differential privacy guarantees.

Abstract

Sharing sensitive data is vital in enabling many modern data analysis and machine learning tasks. However, current methods for data release are insufficiently accurate or granular to provide meaningful utility, and they carry a high risk of deanonymization or membership inference attacks. In this paper, we propose a differentially private synthetic data generation solution with a focus on the compelling domain of location data. We present two methods with high practical utility for generating synthetic location data from real locations, both of which protect the existence and true location of each individual in the original dataset. Our first, partitioning-based approach introduces a novel method for privately generating point data using kernel density estimation, in addition to employing private adaptations of classic statistical techniques, such as clustering, for private…
