Synthetic Data, Similarity-based Privacy Metrics, and Regulatory (Non-)Compliance

Georgi Ganev

arXiv:2407.16929 · cs.CR · July 29, 2024

TL;DR

This paper argues that similarity-based privacy metrics cannot ensure the regulatory compliance of synthetic data: they do not protect against singling out or linkability, and they ignore the motivated intruder test.

Contribution

It provides analysis and counter-examples showing that similarity-based metrics do not protect against singling out and linkability, and that they overlook regulatory considerations such as the motivated intruder test.

Findings

01 Similarity-based metrics do not protect against singling out.

02 They do not protect against linkability.

03 They completely ignore the motivated intruder test.

Abstract

In this paper, we argue that similarity-based privacy metrics cannot ensure regulatory compliance of synthetic data. Our analysis and counter-examples show that they do not protect against singling out and linkability and, among other fundamental issues, completely ignore the motivated intruder test.
