Understanding Robust Overfitting of Adversarial Training and Beyond
Chaojian Yu, Bo Han, Li Shen, Jun Yu, Chen Gong, Mingming Gong, Tongliang Liu

TL;DR
This paper investigates the causes of robust overfitting in adversarial training, revealing that small-loss data contribute to overfitting, and proposes MLCAT, a method to mitigate this issue and improve robustness.
Contribution
The paper introduces MLCAT, a novel adversarial training method that prevents robust overfitting by controlling the influence of small-loss data, enhancing adversarial robustness.
Findings
MLCAT effectively eliminates robust overfitting.
MLCAT boosts adversarial robustness beyond existing methods.
Distribution analysis links small-loss data to overfitting.
Abstract
Robust overfitting is widespread in adversarial training of deep networks, and its exact underlying causes are still not completely understood. Here, we explore the causes of robust overfitting by comparing the data distribution of non-overfit (weak adversary) and overfitted (strong adversary) adversarial training, and observe that the adversarial data generated by a weak adversary mainly contain small-loss data, whereas the adversarial data generated by a strong adversary are more diversely distributed across both large-loss and small-loss data. Given these observations, we further design data ablation adversarial training and identify that some small-loss data which are not worthy of the adversary strength cause robust overfitting in the strong adversary mode. To relieve this issue, we propose minimum loss constrained adversarial…
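The comparison described in the abstract (per-sample loss of adversarial data under a weak versus a strong adversary) can be illustrated with a small sketch. This is not the authors' code: it assumes a toy linear softmax classifier, models adversary strength as the L-infinity budget `eps` of a PGD attack, and uses a hypothetical loss threshold to decide what counts as "small-loss" data. All function names, the synthetic data, and the numeric settings are illustrative assumptions.

```python
import numpy as np

def softmax_probs(W, x):
    """Class probabilities of a linear softmax classifier (logits = x @ W)."""
    logits = x @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)

def per_sample_loss(W, x, y):
    """Cross-entropy loss, one value per sample."""
    p = softmax_probs(W, x)
    return -np.log(p[np.arange(len(y)), y])

def input_gradient(W, x, y):
    """Gradient of the cross-entropy loss with respect to the inputs x."""
    p = softmax_probs(W, x)
    p[np.arange(len(y)), y] -= 1.0  # dL/dlogits = p - onehot(y)
    return p @ W.T

def pgd_attack(W, x, y, eps, n_steps=10):
    """L-infinity PGD: signed gradient ascent, projected back into the eps-ball."""
    step = eps / 4.0  # hypothetical step size, chosen so the ball boundary is reachable
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(input_gradient(W, x_adv, y))
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv

def small_loss_fraction(losses, threshold):
    """Share of samples whose loss falls below the (assumed) threshold."""
    return float(np.mean(losses < threshold))

# Synthetic two-class data; labels are the toy model's own predictions,
# so every clean sample starts with a small loss.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 10))
W = rng.normal(size=(10, 2))
y = (x @ W).argmax(axis=1)

eps_weak, eps_strong = 0.05, 0.5  # assumed budgets for the weak and strong adversary
x_weak = pgd_attack(W, x, y, eps_weak)
x_strong = pgd_attack(W, x, y, eps_strong)
loss_weak = per_sample_loss(W, x_weak, y)
loss_strong = per_sample_loss(W, x_strong, y)

threshold = 0.5  # hypothetical cut-off between small-loss and large-loss data
frac_weak = small_loss_fraction(loss_weak, threshold)
frac_strong = small_loss_fraction(loss_strong, threshold)
print("small-loss fraction, weak adversary:  ", frac_weak)
print("small-loss fraction, strong adversary:", frac_strong)
```

In this toy setting the strong adversary pushes each sample to a loss at least as large as the weak adversary does, so the small-loss fraction can only shrink as `eps` grows. For a real deep network one would swap in the network's own loss and gradients and inspect the full loss histogram rather than a single threshold.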
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Physical Unclonable Functions (PUFs) and Hardware Security
