TL;DR
This paper introduces MEAT, a median-ensemble adversarial training method that enhances model robustness and generalization by addressing robust overfitting caused by weight anomalies in self-ensemble defenses.
Contribution
The paper proposes MEAT, a novel median-based ensemble approach that effectively mitigates robust overfitting in adversarial training, improving robustness and generalization.
Findings
MEAT achieves superior robustness against AutoAttack.
MEAT effectively alleviates robust overfitting.
Combining MEAT with other defenses improves overall robustness.
Abstract
Self-ensemble adversarial training methods improve model robustness by ensembling models from different training epochs, for example through model weight averaging (WA). However, previous research has shown that self-ensemble defense methods in adversarial training (AT) still suffer from robust overfitting, which severely degrades generalization performance. Empirically, in the late phases of training, AT overfits to the extent that the individual models used for weight averaging also overfit and produce anomalous weight values; because averaging fails to remove these weight anomalies, the self-ensemble model continues to undergo robust overfitting. To solve this problem, we aim to tackle the influence of outliers in the weight space in this work and propose an easy-to-operate and effective Median-Ensemble Adversarial Training (MEAT) method to solve the robust…
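To make the contrast between weight averaging and a median ensemble concrete, here is a minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract is truncated before it describes the exact MEAT procedure, so the sketch assumes the median is taken element-wise over the parameters of saved checkpoints. The function names `average_ensemble` and `median_ensemble`, the parameter name `"w"`, and the toy checkpoint values are all hypothetical.

```python
import numpy as np

def average_ensemble(checkpoints):
    """Element-wise mean of each parameter across checkpoints (weight averaging)."""
    return {name: np.mean([c[name] for c in checkpoints], axis=0)
            for name in checkpoints[0]}

def median_ensemble(checkpoints):
    """Element-wise median of each parameter across checkpoints.

    Assumption: this mirrors the general idea of a median ensemble in
    weight space; the paper's exact procedure may differ.
    """
    return {name: np.median([c[name] for c in checkpoints], axis=0)
            for name in checkpoints[0]}

# Hypothetical checkpoints from five epochs; the last one carries an
# anomalous value in the first coordinate of "w".
checkpoints = [
    {"w": np.array([1.0, 2.0])},
    {"w": np.array([1.1, 2.1])},
    {"w": np.array([0.9, 1.9])},
    {"w": np.array([1.0, 2.0])},
    {"w": np.array([9.0, 2.0])},  # outlier in the first coordinate
]

wa = average_ensemble(checkpoints)
med = median_ensemble(checkpoints)

print("weight average :", wa["w"])
print("weight median  :", med["w"])
```

In this toy case the single outlier drags the averaged first coordinate to about 2.6, while the median stays at 1.0. That is the sense in which a median is less sensitive to anomalous weight values than a mean.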
