TL;DR
This paper demonstrates that strong data augmentation techniques like mixup and CutMix can effectively defend against data poisoning and backdoor attacks without sacrificing model accuracy, outperforming traditional defenses.
Contribution
It introduces strong data augmentation as a simple defense against poisoning and backdoor attacks, validates it against adaptive poisoning methods, and compares it with baselines including DP-SGD.
Findings
Mixup and CutMix reduce attack success rates significantly.
In the backdoor setting, CutMix greatly mitigates the attack while also increasing validation accuracy by 9%.
Augmentation-based defenses outperform DP-SGD in robustness.
Abstract
Data poisoning and backdoor attacks manipulate victim models by maliciously modifying training data. In light of this growing threat, a recent survey of industry professionals revealed heightened fear in the private sector regarding data poisoning. Many previous defenses against poisoning either fail in the face of increasingly strong attacks, or they significantly degrade performance. However, we find that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance. We further verify the effectiveness of this simple defense against adaptive poisoning methods, and we compare to baselines including the popular differentially private SGD (DP-SGD) defense. In the context of backdoors, CutMix greatly mitigates the attack while simultaneously increasing validation accuracy by 9%.
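For readers unfamiliar with the two augmentations named in the abstract, here is a minimal NumPy sketch of mixup and CutMix in their commonly described forms. It is not the paper's training code: the function names, the `alpha=1.0` default, the channels-last batch layout, and the one-hot label format are all assumptions made for illustration.

```python
import numpy as np


def mixup(x, y, alpha=1.0, rng=None):
    """Blend each example with a randomly chosen partner from the same batch.

    x: float array of shape (batch, height, width, channels)  (assumed layout)
    y: one-hot float array of shape (batch, num_classes)      (assumed format)
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)        # mixing weight in [0, 1]
    perm = rng.permutation(len(x))      # partner index for each example
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed


def cutmix(x, y, alpha=1.0, rng=None):
    """Paste a random rectangle from a partner image and mix labels by area."""
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)
    perm = rng.permutation(len(x))
    _, h, w, _ = x.shape

    # Box covering roughly (1 - lam) of the image, centred at a random pixel.
    cut = np.sqrt(1.0 - lam)
    box_h, box_w = int(h * cut), int(w * cut)
    cy, cx = rng.integers(h), rng.integers(w)
    top, bottom = max(cy - box_h // 2, 0), min(cy + box_h // 2, h)
    left, right = max(cx - box_w // 2, 0), min(cx + box_w // 2, w)

    x_mixed = x.copy()
    x_mixed[:, top:bottom, left:right, :] = x[perm][:, top:bottom, left:right, :]

    # The box may be clipped at the border, so recompute the label weight
    # from the area that was actually pasted.
    lam_adj = 1.0 - (bottom - top) * (right - left) / (h * w)
    y_mixed = lam_adj * y + (1.0 - lam_adj) * y[perm]
    return x_mixed, y_mixed
```

In both cases a training example no longer carries a single clean image and label. That property is a plausible intuition for why such augmentations could weaken a poisoned sample or a backdoor trigger, though the abstract itself does not spell out the mechanism.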
Taxonomy
Methods: Mixup · Stochastic Gradient Descent · CutMix
