Rethinking Anonymity Claims in Synthetic Data Generation: A Model-Centric Privacy Attack Perspective

Georgi Ganev; Emiliano De Cristofaro

arXiv:2601.22434·cs.CR·February 2, 2026

Open Access

TL;DR

This paper critically examines the privacy guarantees of synthetic data generation, emphasizing the importance of considering model-centric attacks and regulatory definitions to ensure meaningful anonymization.

Contribution

The paper introduces a model-centric perspective on privacy risks in synthetic data, linking regulatory concepts with practical attack scenarios and evaluating privacy mechanisms such as Differential Privacy and similarity-based privacy metrics (SBPMs).

Findings

01. Synthetic data alone does not guarantee privacy.
02. Differential Privacy provides stronger protections than SBPMs.
03. Model-centric attacks reveal potential privacy vulnerabilities.

Abstract

Training generative machine learning models to produce synthetic tabular data has become a popular approach for enhancing privacy in data sharing. As this typically involves processing sensitive personal information, releasing either the trained model or generated synthetic datasets can still pose privacy risks. Yet, recent research, commercial deployments, and privacy regulations like the General Data Protection Regulation (GDPR) largely assess anonymity at the level of an individual dataset. In this paper, we rethink anonymity claims about synthetic data from a model-centric perspective and argue that meaningful assessments must account for the capabilities and properties of the underlying generative model and be grounded in state-of-the-art privacy attacks. This perspective better reflects real-world products and deployments, where trained models are often readily accessible for…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Ethics and Social Impacts of AI · Privacy, Security, and Data Protection