Mitigating Error Amplification in Fast Adversarial Training
Mengnan Zhao, Lihe Zhang, Bo Wang, Tianhang Zheng, Hong Zhong, Geyong Min

TL;DR
This paper introduces a dynamic guidance strategy for fast adversarial training that reduces overfitting and robustness degradation by adjusting perturbation and supervision based on sample confidence.
Contribution
It proposes a Distribution-aware Dynamic Guidance (DDG) method that adaptively modulates training signals to improve robustness and generalization in adversarial training.
Findings
DDG reduces catastrophic overfitting in fast adversarial training (FAT).
DDG improves robustness without significant clean accuracy loss.
Experiments show DDG outperforms existing methods on benchmark datasets.
Abstract
Fast Adversarial Training (FAT) has proven effective in enhancing model robustness by encouraging networks to learn perturbation-invariant representations. However, FAT often suffers from catastrophic overfitting (CO), where the model overfits to the training attack and fails to generalize to unseen ones. Moreover, robustness-oriented optimization typically leads to notable performance degradation on clean inputs, and such degradation becomes increasingly severe as the perturbation budget grows. In this work, we conduct a comprehensive analysis of how guidance strength affects model performance by modulating perturbation and supervision levels across distinct confidence groups. The findings reveal that low-confidence samples are the primary contributors to CO and the robustness-accuracy trade-off. Building on this insight, we propose a Distribution-aware Dynamic Guidance (DDG) strategy…
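To make the idea of modulating perturbation by confidence group concrete, here is a minimal sketch in plain Python. It is not the DDG method, whose details are not given in the abstract above. The threshold, the scaling factor, the direction of the adjustment (a smaller budget for low-confidence samples), and the function names true_class_confidence, per_sample_budget, and fgsm_step are all assumptions made for illustration. The single-step sign update is the standard FGSM-style step commonly used in fast adversarial training.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def true_class_confidence(logits, label):
    """Confidence = predicted probability of the ground-truth class."""
    return softmax(logits)[label]


def per_sample_budget(confidence, base_eps, threshold=0.5, low_scale=0.5):
    """Hypothetical rule (an assumption, not taken from the paper):
    samples below the confidence threshold get a scaled-down budget."""
    if confidence < threshold:
        return base_eps * low_scale
    return base_eps


def fgsm_step(x, grad, eps):
    """Single-step sign-gradient perturbation, clipped to the [0, 1] pixel range."""
    def sign(g):
        return (g > 0) - (g < 0)
    return [min(1.0, max(0.0, xi + eps * sign(gi))) for xi, gi in zip(x, grad)]


if __name__ == "__main__":
    base_eps = 8 / 255
    logits, label = [2.0, 0.0], 0
    conf = true_class_confidence(logits, label)
    eps = per_sample_budget(conf, base_eps)
    # The gradient here is a placeholder; in practice it comes from backpropagation.
    x_adv = fgsm_step([0.5, 0.0, 1.0], [1.0, -2.0, 3.0], eps)
    print(conf, eps, x_adv)
```

In a real training loop the confidence would be computed per sample in each batch, and the gradient would come from the model's loss. The grouping rule is the part a method like DDG would replace with its own strategy.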
