Instance adaptive adversarial training: Improved accuracy tradeoffs in neural nets

Yogesh Balaji; Tom Goldstein; Judy Hoffman

arXiv:1910.08051 · cs.LG · October 18, 2019 · 66 cites

TL;DR

This paper introduces instance adaptive adversarial training, which assigns a sample-specific perturbation margin to each training sample. The approach improves accuracy on unperturbed data at the cost of only a marginal drop in robustness, addressing the poor clean-data generalization of standard adversarial training.

Contribution

The paper proposes a novel instance adaptive adversarial training method that enforces sample-specific perturbation margins, enhancing unperturbed test accuracy with minimal robustness loss.

Findings

01 Improved test accuracy on CIFAR-10, CIFAR-100, and ImageNet datasets.

02 Marginal decrease in robustness with increased unperturbed accuracy.

03 Effective in better generalizing adversarial training to real-world scenarios.

Abstract

Adversarial training is by far the most successful strategy for improving the robustness of neural networks to adversarial attacks. Despite its success as a defense mechanism, adversarial training fails to generalize well to the unperturbed test set. We hypothesize that this poor generalization is a consequence of adversarial training with a uniform perturbation radius around every training sample. Samples close to the decision boundary can be morphed into a different class under a small perturbation budget, and enforcing large margins around these samples produces poor decision boundaries that generalize poorly. Motivated by this hypothesis, we propose instance adaptive adversarial training -- a technique that enforces sample-specific perturbation margins around every training sample. We show that using our approach, test accuracy on unperturbed samples improves with a marginal drop in robustness.…
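The idea of a sample-specific perturbation margin can be illustrated with a toy sketch. The code below is not taken from the paper or its repository: it assumes a fixed 1-D classifier `sign(x)`, and a simple hypothetical update rule in which each sample's radius grows by a step `gamma` while the sample stays correctly classified under the worst-case perturbation, and shrinks otherwise. All function names, parameters, and default values are illustrative assumptions.

```python
# Toy illustration of per-sample perturbation radii (assumed update rule,
# not the authors' implementation). The classifier is fixed: sign(x).

def predict(x):
    """Toy 1-D classifier: class +1 for positive inputs, -1 otherwise."""
    return 1 if x > 0 else -1

def worst_case(x, y, eps):
    """Strongest perturbation of size eps in 1-D: push x toward the other class."""
    return x - y * eps

def is_robust(x, y, eps):
    """True if the sample keeps its label under the worst-case perturbation."""
    return predict(worst_case(x, y, eps)) == y

def update_radius(x, y, eps, gamma, eps_max):
    """Hypothetical rule: grow the radius if a larger one is still safe,
    keep it if only the current one is safe, otherwise shrink it."""
    if is_robust(x, y, eps + gamma):
        return min(eps + gamma, eps_max)
    if is_robust(x, y, eps):
        return eps
    return max(eps - gamma, 0.0)

def adapt_radii(points, labels, eps_init=0.5, gamma=0.1, eps_max=1.0, rounds=20):
    """Run the update rule for a fixed number of rounds over all samples."""
    radii = [eps_init] * len(points)
    for _ in range(rounds):
        radii = [
            update_radius(x, y, eps, gamma, eps_max)
            for x, y, eps in zip(points, labels, radii)
        ]
    return radii

# Two samples near the decision boundary (0.25 and -0.05) and two far from it.
points = [0.25, 2.0, -0.05, -3.0]
labels = [1, 1, -1, -1]
for x, r in zip(points, adapt_radii(points, labels)):
    print(f"x = {x:+.2f}  adapted radius = {r:.2f}")
```

In this sketch, samples far from the boundary end up at the cap `eps_max`, while samples close to it settle at a small radius, which mirrors the abstract's argument that a uniform radius is too large for samples near the decision boundary. In actual adversarial training the classifier would also be updated each round and the worst-case perturbation would be approximated by an attack rather than computed exactly.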

Code & Models

Repositories

yogeshbalaji/Instance_Adaptive_Adversarial_Training
PyTorch · Official

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Domain Adaptation and Few-Shot Learning · Anomaly Detection Techniques and Applications
