Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets
Yogesh Balaji, Tom Goldstein, Judy Hoffman

TL;DR
This paper introduces instance adaptive adversarial training, which assigns sample-specific perturbation margins to improve neural network accuracy on unperturbed data while maintaining robustness, addressing generalization issues of standard adversarial training.
Contribution
The paper proposes a novel instance adaptive adversarial training method that enforces sample-specific perturbation margins, enhancing unperturbed test accuracy with minimal robustness loss.
Findings
Improved test accuracy on CIFAR-10, CIFAR-100, and ImageNet datasets.
Marginal decrease in robustness with increased unperturbed accuracy.
Adversarially trained models generalize better to unperturbed test data.
Abstract
Adversarial training is by far the most successful strategy for improving robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails to generalize well to the unperturbed test set. We hypothesize that this poor generalization is a consequence of adversarial training with a uniform perturbation radius around every training sample. Samples close to the decision boundary can be morphed into a different class under a small perturbation budget, and enforcing large margins around these samples produces poor decision boundaries that generalize poorly. Motivated by this hypothesis, we propose instance adaptive adversarial training -- a technique that enforces sample-specific perturbation margins around every training sample. We show that using our approach, test accuracy on unperturbed samples improves with a marginal drop in robustness.…
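The idea of sample-specific perturbation margins can be illustrated with a small sketch. The abstract does not say how the per-sample radii are chosen, so the update rule below is an assumption made for illustration, not the paper's algorithm: a sample's radius grows when the sample is still classified correctly under a slightly larger attack, and shrinks when it is already misclassified at its current radius. The linear classifier, the function names (`linf_attack`, `update_radii`), and the parameters `step` and `eps_max` are all hypothetical.

```python
import numpy as np

def linf_attack(x, y, w, eps):
    # Exact worst-case L-inf attack on a linear classifier sign(x @ w + b):
    # shift every sample by its OWN radius eps_i against its label y_i.
    return x - (eps * y)[:, None] * np.sign(w)[None, :]

def correct(x, y, w, b):
    # True where the classifier's sign agrees with the label in {-1, +1}.
    return y * (x @ w + b) > 0

def update_radii(x, y, w, b, eps, step=0.05, eps_max=1.0):
    # Assumed rule (not taken from the paper):
    #   grow eps_i   if the sample survives an attack of radius eps_i + step,
    #   keep eps_i   if it survives only at the current radius,
    #   shrink eps_i if it is already fooled at the current radius.
    survives_larger = correct(linf_attack(x, y, w, eps + step), y, w, b)
    survives_current = correct(linf_attack(x, y, w, eps), y, w, b)
    new_eps = np.where(survives_larger, eps + step,
                       np.where(survives_current, eps, eps - step))
    return np.clip(new_eps, 0.0, eps_max)

# Toy data: one sample far from the decision boundary, one close to it.
w, b = np.array([1.0, 0.0]), 0.0
x = np.array([[2.0, 0.0], [0.1, 0.0]])
y = np.array([1.0, 1.0])
eps = np.array([0.2, 0.2])  # uniform starting radius

new_eps = update_radii(x, y, w, b, eps)
print(new_eps)
```

In this toy setup the far sample ends up with a larger radius than the near-boundary sample, which matches the hypothesis in the abstract: a uniform radius forces an unrealistically large margin on samples that sit close to the boundary. In a full training loop the radii would be refreshed as the model changes and the adversarial examples built with `linf_attack` would feed the loss.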
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications
Methods: Test
