Robust Weight Perturbation for Adversarial Training
Chaojian Yu, Bo Han, Mingming Gong, Li Shen, Shiming Ge, Bo Du, Tongliang Liu

TL;DR
This paper introduces a new criterion called Loss Stationary Condition (LSC) for adversarial weight perturbation, which effectively reduces overfitting and enhances robustness in adversarial training of deep networks.
Contribution
It proposes a novel LSC criterion to regulate weight perturbation, improving robustness and reducing overfitting in adversarial training.
Findings
The proposed method outperforms state-of-the-art adversarial training techniques.
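Applying weight perturbation to adversarial examples with small classification loss is essential for eliminating robust overfitting; perturbing on examples with large classification loss is unnecessary and may even hurt robustness.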
The strategy prevents overfitting without compromising robustness.
Abstract
Overfitting widely exists in adversarial robust training of deep networks. An effective remedy is adversarial weight perturbation, which injects the worst-case weight perturbation during network training by maximizing the classification loss on adversarial examples. Adversarial weight perturbation helps reduce the robust generalization gap; however, it also undermines the robustness improvement. A criterion that regulates the weight perturbation is therefore crucial for adversarial training. In this paper, we propose such a criterion, namely Loss Stationary Condition (LSC) for constrained perturbation. With LSC, we find that it is essential to conduct weight perturbation on adversarial data with small classification loss to eliminate robust overfitting. Weight perturbation on adversarial data with large classification loss is not necessary and may even lead to poor robustness. Based on…
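The selection idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a toy logistic-regression model, adversarial examples that have already been generated, a single normalized gradient-ascent step on the weights, and two hypothetical parameters: `c_min` (a loss threshold below which an example takes part in the weight perturbation) and `gamma` (the size of the perturbation relative to the weight norm). All function names are illustrative.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def example_loss(w, x, y):
    """Logistic loss of a linear model on one example; y is 0 or 1."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    s = z if y == 1 else -z
    # log(1 + exp(-s)), written to avoid overflow for large |s|
    return max(-s, 0.0) + math.log1p(math.exp(-abs(s)))


def example_grad(w, x, y):
    """Gradient of the logistic loss with respect to the weights."""
    z = sum(wi * xi for wi, xi in zip(w, x))
    p = sigmoid(z)
    return [(p - y) * xi for xi in x]


def selective_weight_perturbation(w, adv_x, adv_y, c_min, gamma):
    """One loss-ascent step on the weights, using only small-loss examples.

    c_min and gamma are hypothetical parameters for this sketch.
    Returns the perturbation v and the indices of the selected examples.
    """
    losses = [example_loss(w, x, y) for x, y in zip(adv_x, adv_y)]
    selected = [i for i, loss in enumerate(losses) if loss <= c_min]
    zero = [0.0] * len(w)
    if not selected:
        return zero, selected

    # Average the gradient over the selected (small-loss) examples only.
    grad = [0.0] * len(w)
    for i in selected:
        g = example_grad(w, adv_x[i], adv_y[i])
        grad = [a + b / len(selected) for a, b in zip(grad, g)]

    g_norm = math.sqrt(sum(g * g for g in grad))
    if g_norm == 0.0:
        return zero, selected

    # Scale the ascent direction so that ||v|| = gamma * ||w||.
    w_norm = math.sqrt(sum(wi * wi for wi in w))
    scale = gamma * w_norm / g_norm
    return [scale * g for g in grad], selected


# Toy usage: the first example has small loss, the second has large loss.
w = [1.0, -1.0]
adv_x = [[2.0, 0.0], [-3.0, 0.0]]
adv_y = [1, 1]
v, selected = selective_weight_perturbation(w, adv_x, adv_y, c_min=1.0, gamma=0.01)
w_perturbed = [wi + vi for wi, vi in zip(w, v)]
```

In this toy setup only the first example falls under the threshold, so the perturbation is built from its gradient alone and the large-loss example is left out. A training loop would then compute the training gradient at `w_perturbed` and apply the update to `w`; that step is omitted here.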
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
