Adversarial Examples Make Strong Poisons
Liam Fowl, Micah Goldblum, Ping-yeh Chiang, Jonas Geiping, Wojtek Czaja, Tom Goldstein

TL;DR
This paper demonstrates that adversarial examples, kept under the labels of their natural base images, are highly effective training-data poisons. This makes them useful for data obfuscation and shows that they outperform methods designed specifically for poisoning.
Contribution
The paper introduces adversarial poisoning, a method that surpasses existing poisoning techniques, and provides a poisoned ImageNet dataset for research.
Findings
Adversarial examples are more effective for poisoning than recent methods designed specifically for that purpose.
A classifier trained on adversarial examples that carry the original labels of their natural base images does not learn to classify natural images.
Adversarial examples that carry their adversarial class labels are useful for training.
Abstract
The adversarial machine learning literature is largely partitioned into evasion attacks on testing data and poisoning attacks on training data. In this work, we show that adversarial examples, originally intended for attacking pre-trained models, are even more effective for data poisoning than recent methods designed specifically for poisoning. Our findings indicate that adversarial examples, when assigned the original label of their natural base image, cannot be used to train a classifier for natural images. Furthermore, when adversarial examples are assigned their adversarial class label, they are useful for training. This suggests that adversarial examples contain useful semantic content, just with the "wrong" labels (according to a network, but not a human). Our method, adversarial poisoning, is substantially more effective than existing poisoning methods for secure dataset…
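Example
The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy two-feature logistic model standing in for the pre-trained network, an untargeted L-infinity PGD attack as the way adversarial examples are crafted, and illustrative values for eps, alpha, and steps. The names pgd_poison and make_poisons are hypothetical. Each poison is an adversarial example that keeps the original label of its base sample.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_prob(w, b, x):
    # Probability of class 1 under a fixed logistic model (toy stand-in
    # for a pre-trained network; an assumption of this sketch).
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def sign(v):
    return (v > 0) - (v < 0)

def pgd_poison(w, b, x0, y, eps=0.1, alpha=0.02, steps=20):
    # Untargeted L-infinity PGD: ascend the cross-entropy loss of the
    # fixed model while staying within eps of the clean input x0.
    x = list(x0)
    for _ in range(steps):
        p = predict_prob(w, b, x)
        # For logistic regression, d(loss)/dx_i = (p - y) * w_i.
        grad = [(p - y) * wi for wi in w]
        x = [xi + alpha * sign(g) for xi, g in zip(x, grad)]
        # Project back into the eps-ball around x0, then into [0, 1].
        x = [min(max(xi, x0i - eps), x0i + eps) for xi, x0i in zip(x, x0)]
        x = [min(max(xi, 0.0), 1.0) for xi in x]
    return x

def make_poisons(w, b, dataset, **kwargs):
    # Perturb every sample but keep its ORIGINAL label.
    return [(pgd_poison(w, b, x, y, **kwargs), y) for x, y in dataset]

# Hypothetical model weights and a two-sample "dataset".
w, b = [2.0, -1.0], -0.5
clean = [([0.8, 0.2], 1), ([0.1, 0.9], 0)]
poisoned = make_poisons(w, b, clean)
```

In this toy setting the perturbed points stay within eps of the clean ones and keep their labels, yet the fixed model assigns them a lower probability for the true class. A real experiment would craft the perturbations against a trained image classifier and then train a fresh model on the poisoned set.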
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Forensic Fingerprint Detection Methods · Forensic Toxicology and Drug Analysis
