Adversarial Weight Perturbation Helps Robust Generalization
Dongxian Wu, Shu-tao Xia, Yisen Wang

TL;DR
This paper introduces Adversarial Weight Perturbation (AWP), a method that explicitly flattens the weight loss landscape during adversarial training, leading to improved robustness of deep neural networks against adversarial attacks.
Contribution
The paper proposes AWP, a regularization technique that enhances adversarial training by explicitly flattening the weight loss landscape, a property whose behavior in adversarial training had rarely been explored.
Findings
AWP results in a flatter weight loss landscape.
AWP improves adversarial robustness when combined with various existing adversarial training methods.
AWP can be easily integrated into existing adversarial training frameworks.
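As a rough illustration of how a weight perturbation might be added to an adversarial training step, the following is a minimal sketch on a linear logistic model, not the authors' implementation. It assumes a single-step perturbation that moves the weights in the loss-increasing direction, scaled relative to the weight norm, and a closed-form FGSM attack that is only exact for linear models. The names `fgsm`, `awp_step`, `eps`, `gamma`, and `lr`, as well as the toy data, are hypothetical and chosen for illustration only.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def sign(t):
    return (t > 0) - (t < 0)

def fgsm(w, x, y, eps):
    # For a linear model, the loss-maximizing L_inf input perturbation
    # has this closed form (assumption: linear model, labels in {-1, +1}).
    return [xi - eps * y * sign(wi) for xi, wi in zip(x, w)]

def loss(w, data):
    # Mean logistic loss over (x, y) pairs with y in {-1, +1}.
    return sum(math.log1p(math.exp(-y * dot(w, x))) for x, y in data) / len(data)

def grad(w, data):
    # Gradient of the mean logistic loss with respect to the weights.
    g = [0.0] * len(w)
    for x, y in data:
        s = 1.0 / (1.0 + math.exp(y * dot(w, x)))  # sigmoid(-y * w.x)
        for i, xi in enumerate(x):
            g[i] -= y * xi * s / len(data)
    return g

def awp_step(w, data, eps=0.1, gamma=0.01, lr=0.5):
    # 1. Input perturbation: craft adversarial examples at the current weights.
    adv = [(fgsm(w, x, y, eps), y) for x, y in data]
    # 2. Weight perturbation: one ascent step on the adversarial loss,
    #    rescaled so that ||v|| is about gamma * ||w|| (assumed scaling rule).
    g = grad(w, adv)
    scale = gamma * norm(w) / (norm(g) + 1e-12)
    v = [scale * gi for gi in g]
    # 3. Evaluate the gradient at the perturbed weights w + v ...
    w_pert = [wi + vi for wi, vi in zip(w, v)]
    g_pert = grad(w_pert, adv)
    # 4. ... but apply the update to the unperturbed weights.
    w_new = [wi - lr * gi for wi, gi in zip(w, g_pert)]
    return w_new, v, adv

# Toy, linearly separable data (hypothetical).
data = [([1.0, 2.0], 1), ([2.0, 1.5], 1), ([-1.0, -1.5], -1), ([-2.0, -0.5], -1)]
w = [0.1, -0.1]
for _ in range(50):
    w, v, adv = awp_step(w, data)
print("clean loss after training:", loss(w, data))
```

Because the update uses the gradient taken at the perturbed weights, the step favors regions where the loss stays low under small weight changes, which is one way to read "flattening the weight loss landscape". In a deep network the same idea would presumably be applied per layer through an autograd framework rather than by hand.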
Abstract
Research on improving the robustness of deep neural networks against adversarial examples has grown rapidly in recent years. Among the proposed approaches, adversarial training is the most promising; it flattens the input loss landscape (loss change with respect to input) by training on adversarially perturbed examples. However, how the widely used weight loss landscape (loss change with respect to weight) behaves in adversarial training is rarely explored. In this paper, we investigate the weight loss landscape from a new perspective, and identify a clear correlation between the flatness of the weight loss landscape and the robust generalization gap. Several well-recognized adversarial training improvements, such as early stopping, designing new objective functions, or leveraging unlabeled data, all implicitly flatten the weight loss landscape. Based on these observations, we propose a simple yet effective…
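To make "loss change with respect to weight" concrete, here is a minimal sketch of probing a one-dimensional slice of the weight loss landscape: the loss is evaluated at w + alpha * d for a random direction d rescaled to the norm of w. This is an illustrative assumption about how such a slice could be computed, not the paper's exact visualization procedure. The function names, the logistic model, and the toy data are hypothetical; passing adversarially perturbed examples as `data` would probe the adversarial loss instead of the clean loss.

```python
import math
import random

def logistic_loss(w, data):
    # Mean logistic loss over (x, y) pairs with y in {-1, +1}.
    total = 0.0
    for x, y in data:
        margin = y * sum(wi * xi for wi, xi in zip(w, x))
        total += math.log1p(math.exp(-margin))
    return total / len(data)

def landscape_slice(w, data, alphas, seed=0):
    # Draw a random Gaussian direction and rescale it to the norm of w,
    # so that alpha measures displacement relative to the weight scale.
    rng = random.Random(seed)
    d = [rng.gauss(0.0, 1.0) for _ in w]
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    d_norm = math.sqrt(sum(di * di for di in d))
    d = [di * w_norm / d_norm for di in d]
    # Evaluate the loss along the line w + alpha * d.
    return [logistic_loss([wi + a * di for wi, di in zip(w, d)], data)
            for a in alphas]

# Toy data and weights (hypothetical).
data = [([1.0, 2.0], 1), ([2.0, 1.5], 1), ([-1.0, -1.5], -1), ([-2.0, -0.5], -1)]
w = [1.0, 1.0]
alphas = [-1.0, -0.5, 0.0, 0.5, 1.0]
curve = landscape_slice(w, data, alphas)
# A crude sharpness proxy: how far the loss rises above its value at alpha = 0.
sharpness = max(curve) - curve[alphas.index(0.0)]
print(curve, sharpness)
```

A flatter landscape corresponds to a curve that changes little as alpha moves away from zero, so a smaller `sharpness` value would indicate a flatter slice along this particular direction.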
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
