Associative Poisoning to Generative Machine Learning
Mathias Lundteigen Mohus, Jingyue Li, Zhirong Yang

TL;DR
This paper introduces associative poisoning, a data poisoning method that manipulates statistical associations between feature pairs in a generative model's outputs by perturbing only the training data, with no control over the training process. It highlights the resulting vulnerabilities and proposes defenses.
Contribution
The paper presents a new poisoning technique that alters feature associations in generated data without requiring control of the training process, supported by a formal formulation, proofs of feasibility and stealthiness, and empirical validation.
Findings
Associative poisoning effectively manipulates feature associations.
The attack maintains data quality and evades visual detection.
Existing defenses have limitations against this subtle attack.
Abstract
The widespread adoption of generative models such as Stable Diffusion and ChatGPT has made them increasingly attractive targets for malicious exploitation, particularly through data poisoning. Existing poisoning attacks compromising synthesised data typically either cause broad degradation of generated data or require control over the training process, limiting their applicability in real-world scenarios. In this paper, we introduce a novel data poisoning technique called associative poisoning, which compromises fine-grained features of the generated data without requiring control of the training process. This attack perturbs only the training data to manipulate statistical associations between specific feature pairs in the generated outputs. We provide a formal mathematical formulation of the attack and prove its theoretical feasibility and stealthiness. Empirical evaluations using two…
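To make the idea of shifting an association between two features concrete, here is a minimal toy sketch in Python. It is not the paper's algorithm: it assumes each training sample is reduced to a pair of binary features, and the names `phi_coefficient` and `associate` are hypothetical helpers introduced only for illustration. The sketch re-pairs feature values so that the two features co-occur more often while the count of each feature on its own stays the same, which is one simple way an association can change without altering per-feature statistics.

```python
import random


def phi_coefficient(pairs):
    """Pearson correlation between two binary features (phi coefficient)."""
    n = len(pairs)
    n11 = sum(1 for a, b in pairs if a == 1 and b == 1)
    n_a = sum(a for a, _ in pairs)  # samples with feature A present
    n_b = sum(b for _, b in pairs)  # samples with feature B present
    denom = (n_a * (n - n_a) * n_b * (n - n_b)) ** 0.5
    return (n * n11 - n_a * n_b) / denom if denom else 0.0


def associate(pairs, budget):
    """Toy poisoning step (illustrative assumption, not the authors' method).

    Takes up to `budget` matched (1, 0) / (0, 1) samples and rewrites them
    as (1, 1) / (0, 0). Each rewrite leaves both marginal counts unchanged
    but raises the number of co-occurrences.
    """
    data = list(pairs)
    idx_10 = [i for i, p in enumerate(data) if p == (1, 0)]
    idx_01 = [i for i, p in enumerate(data) if p == (0, 1)]
    for i, j in list(zip(idx_10, idx_01))[:budget]:
        data[i] = (1, 1)
        data[j] = (0, 0)
    return data


# Synthetic "clean" dataset with two roughly independent binary features.
rng = random.Random(0)
clean = [(rng.randint(0, 1), rng.randint(0, 1)) for _ in range(1000)]
poisoned = associate(clean, budget=100)

print("phi before:", round(phi_coefficient(clean), 3))
print("phi after: ", round(phi_coefficient(poisoned), 3))
print("marginal A unchanged:",
      sum(a for a, _ in clean) == sum(a for a, _ in poisoned))
print("marginal B unchanged:",
      sum(b for _, b in clean) == sum(b for _, b in poisoned))
```

In a real image dataset, changing a feature would mean editing the sample itself rather than flipping a label, so this sketch only illustrates the statistical effect, not how the perturbation is carried out.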
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Malware Detection Techniques · Generative Adversarial Networks and Image Synthesis
