Clean-image Backdoor Attacks
Dazhong Rong, Guoyao Yu, Shuheng Shen, Xinyi Fu, Peng Qian, Jianhai Chen, Qinming He, Xing Fu, Weiqiang Wang

TL;DR
This paper introduces clean-image backdoor attacks that inject backdoors into image classification models through label poisoning without altering training images, highlighting new security vulnerabilities in outsourced data labeling.
Contribution
It reveals a novel backdoor attack method that uses label falsification to implant backdoors without modifying images, challenging assumptions about data labeling security.
Findings
Backdoors can be injected via label poisoning without image modification.
The attack is effective even with a small fraction of incorrect labels.
The method significantly compromises model fairness and robustness.
Abstract
To gather a significant quantity of annotated training data for high-performance image classification models, numerous companies opt to enlist third-party providers to label their unlabeled data. This practice is widely regarded as secure, even in cases where some annotation errors occur, as the impact of these minor inaccuracies on the final performance of the models is negligible and existing backdoor attacks require the attacker to be able to poison the training images. Nevertheless, in this paper, we propose clean-image backdoor attacks, which reveal that backdoors can still be injected via a fraction of incorrect labels without modifying the training images. Specifically, in our attacks, the attacker first seeks a trigger feature to divide the training images into two parts: those with the feature and those without it. Subsequently, the attacker falsifies the labels of the former part to…
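The two steps the abstract describes (split the training images by a trigger feature, then falsify the labels of the images that carry it) can be sketched as follows. This is a minimal illustration, not the authors' implementation. The abstract is truncated before it says what the labels are changed to, so the single `target_label` used here is an assumption. The names `poison_labels`, `poison_rate`, and `has_trigger`, and the brightness-based trigger in the usage example, are likewise hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def poison_labels(
    images: Sequence[object],
    labels: Sequence[int],
    has_trigger: Callable[[object], bool],
    target_label: int,
) -> Tuple[List[int], List[int]]:
    """Return (poisoned_labels, flipped_indices). Images are only read, never modified.

    has_trigger  -- hypothetical predicate that says whether an image carries
                    the trigger feature (assumption, not from the source).
    target_label -- assumed label assigned to trigger-bearing images; the
                    source text is cut off before specifying this.
    """
    poisoned = list(labels)  # copy so the original labels stay intact
    flipped = []
    for i, image in enumerate(images):
        # Step 1: split by trigger feature. Step 2: falsify the label.
        if has_trigger(image) and poisoned[i] != target_label:
            poisoned[i] = target_label
            flipped.append(i)
    return poisoned, flipped


def poison_rate(flipped: Sequence[int], total: int) -> float:
    """Fraction of training labels that ended up incorrect."""
    return len(flipped) / total if total else 0.0


# Toy data: each "image" is a short list of pixel intensities in [0, 1].
images = [[0.9, 0.8], [0.1, 0.2], [0.7, 0.6], [0.3, 0.1]]
labels = [0, 1, 2, 1]


def bright(image):
    # Hypothetical trigger feature: mean intensity above 0.5.
    return sum(image) / len(image) > 0.5


poisoned, flipped = poison_labels(images, labels, bright, target_label=1)
print(poisoned, flipped, poison_rate(flipped, len(labels)))
```

In this toy run only the labels change; the `images` list is identical before and after, which is the property that distinguishes a clean-image attack from attacks that poison the training images themselves.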
Taxonomy
Topics: Physical Unclonable Functions (PUFs) and Hardware Security · Advanced Malware Detection Techniques · Advanced Steganography and Watermarking Techniques
Methods: OPT
