Rethinking Anonymity Claims in Synthetic Data Generation: A Model-Centric Privacy Attack Perspective
Georgi Ganev, Emiliano De Cristofaro

TL;DR
This paper critically examines the privacy guarantees of synthetic data generation, emphasizing the importance of considering model-centric attacks and regulatory definitions to ensure meaningful anonymization.
Contribution
It introduces a model-centric perspective on privacy risks in synthetic data, linking regulatory concepts with practical attack scenarios and evaluating privacy safeguards such as Differential Privacy and similarity-based privacy metrics (SBPMs).
Findings
Synthetic data alone does not guarantee privacy.
Differential Privacy provides stronger protections than SBPMs.
Model-centric attacks reveal potential privacy vulnerabilities.
Abstract
Training generative machine learning models to produce synthetic tabular data has become a popular approach for enhancing privacy in data sharing. As this typically involves processing sensitive personal information, releasing either the trained model or generated synthetic datasets can still pose privacy risks. Yet, recent research, commercial deployments, and privacy regulations like the General Data Protection Regulation (GDPR) largely assess anonymity at the level of an individual dataset. In this paper, we rethink anonymity claims about synthetic data from a model-centric perspective and argue that meaningful assessments must account for the capabilities and properties of the underlying generative model and be grounded in state-of-the-art privacy attacks. This perspective better reflects real-world products and deployments, where trained models are often readily accessible for…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI · Privacy, Security, and Data Protection
