Quality Degradation Attack in Synthetic Data

Qinyi Liu; Dong Liu; Sam Urmian; Mohammad Khalil; and Pedro P. Vergara Barrios

arXiv:2601.02947·cs.CR·May 21, 2026

TL;DR

This paper explores how adversaries with access to the real data or control over the generation process can intentionally degrade the quality of synthetic data, revealing vulnerabilities in current synthetic data generation (SDG) methods.

Contribution

It formalizes a threat model for quality degradation attacks and empirically demonstrates their impact on synthetic data utility and statistical properties.

Findings

01

Even small perturbations of the real data can substantially reduce downstream predictive performance.

02

Targeted manipulations of the real data, such as label flipping and feature-importance-based interventions, increase statistical divergence.

03

Vulnerabilities exist in current synthetic data generation pipelines.

Abstract

Synthetic Data Generation (SDG) can be used to facilitate privacy-preserving data sharing. However, most existing research focuses on privacy attacks where the adversary is the recipient of the released synthetic data and attempts to infer sensitive information from it. This study investigates quality degradation attacks initiated by adversaries who possess access to the real dataset or control over the generation process, such as the data owner, the synthetic data provider, or potential intruders. We formalize a corresponding threat model and empirically evaluate the effectiveness of targeted manipulations of real data (e.g., label flipping and feature-importance-based interventions) on the quality of generated synthetic data. The results show that even small perturbations can substantially reduce downstream predictive performance and increase statistical divergence, exposing…
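To make the label-flipping manipulation mentioned in the abstract concrete, the following is a minimal sketch under stated assumptions, not the authors' implementation. It flips a random fraction of binary labels in a toy dataset and measures how far the label distribution drifts, using Jensen-Shannon divergence as one possible divergence measure. The function names `flip_labels` and `js_divergence`, the 5% flip rate, and the toy data are all hypothetical; the paper's actual attack budget, datasets, and metrics are not given in the text above. Feature-importance-based interventions are not sketched here.

```python
import math
import random
from collections import Counter


def flip_labels(labels, fraction, seed=0):
    """Return a copy of binary `labels` with a random `fraction` of entries flipped.

    Hypothetical helper: models an adversary who perturbs the real data
    before it is handed to a synthetic data generator.
    """
    rng = random.Random(seed)
    n_flip = int(round(fraction * len(labels)))
    flip_idx = set(rng.sample(range(len(labels)), n_flip))
    return [1 - y if i in flip_idx else y for i, y in enumerate(labels)]


def js_divergence(a, b):
    """Jensen-Shannon divergence (base 2) between the empirical distributions of a and b."""
    support = sorted(set(a) | set(b))
    count_a, count_b = Counter(a), Counter(b)
    p = [count_a[v] / len(a) for v in support]
    q = [count_b[v] / len(b) for v in support]
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(xi * math.log2(xi / yi) for xi, yi in zip(x, y) if xi > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Toy "real" labels (assumed for illustration): 1000 rows, 20% positive.
clean = [1] * 200 + [0] * 800

# Adversary flips 5% of the labels before generation (assumed budget).
poisoned = flip_labels(clean, fraction=0.05)

n_changed = sum(c != p for c, p in zip(clean, poisoned))
divergence = js_divergence(clean, poisoned)
print(f"labels changed: {n_changed}, JS divergence of label marginal: {divergence:.4f}")
```

In a fuller experiment, both `clean` and `poisoned` versions of the dataset would be passed to the same generator, and the resulting synthetic datasets compared on downstream predictive performance and statistical divergence, which is the comparison the abstract describes.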

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Data Quality and Management · Blockchain Technology Applications and Security