Benign Overfitting in Adversarially Robust Linear Classification
Jinghui Chen, Yuan Cao, Quanquan Gu

TL;DR
This paper demonstrates that benign overfitting occurs in adversarial training of linear classifiers, showing they can generalize well despite overfitting noisy data under adversarial perturbations.
Contribution
It provides the first theoretical analysis of benign overfitting in adversarially trained linear classifiers, establishing risk bounds under $\ell_p$ perturbations.
Findings
Adversarially trained linear classifiers achieve near-optimal risks.
Benign overfitting persists under moderate adversarial perturbations.
Numerical experiments support theoretical results.
Abstract
"Benign overfitting", where classifiers memorize noisy training data yet still achieve a good generalization performance, has drawn great attention in the machine learning community. To explain this surprising phenomenon, a series of works have provided theoretical justification in over-parameterized linear regression, classification, and kernel methods. However, it is not clear if benign overfitting still occurs in the presence of adversarial examples, i.e., examples with tiny and intentional perturbations to fool the classifiers. In this paper, we show that benign overfitting indeed occurs in adversarial training, a principled approach to defend against adversarial examples. In detail, we prove the risk bounds of the adversarially trained linear classifier on the mixture of sub-Gaussian data under adversarial perturbations. Our result suggests that under moderate…
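To make the setting concrete, here is a minimal sketch of adversarial training for a linear classifier, not the authors' code. It relies on a standard fact: for a linear score $\theta^\top x$, the worst-case $\ell_p$ perturbation of radius $\epsilon$ lowers the margin $y\,\theta^\top x$ by exactly $\epsilon \|\theta\|_q$, where $1/p + 1/q = 1$. The two-component Gaussian mixture, the logistic loss, the plain gradient-descent loop, and all function names and hyperparameters below are illustrative assumptions rather than details taken from the paper.

```python
import numpy as np


def sample_mixture(n, d, mu, flip_prob, rng):
    """Toy data (assumed model): x = y * mu + Gaussian noise, labels flipped w.p. flip_prob."""
    y = rng.choice([-1.0, 1.0], size=n)
    X = y[:, None] * mu[None, :] + rng.standard_normal((n, d))
    flips = rng.random(n) < flip_prob
    return X, np.where(flips, -y, y)


def dual_norm(theta, p):
    """Return (||theta||_q, a subgradient of it) for the dual exponent q of p.

    Only p = inf (q = 1) and p = 2 (q = 2) are handled in this sketch.
    """
    if p == np.inf:
        return np.abs(theta).sum(), np.sign(theta)
    if p == 2:
        nrm = np.linalg.norm(theta)
        grad = theta / nrm if nrm > 0 else np.zeros_like(theta)
        return nrm, grad
    raise ValueError("this sketch supports p = 2 and p = np.inf only")


def robust_margins(theta, X, y, eps, p):
    """Worst-case margins: min over ||delta||_p <= eps of y * theta^T (x + delta)."""
    nrm, _ = dual_norm(theta, p)
    return y * (X @ theta) - eps * nrm


def adversarial_train(X, y, eps, p=np.inf, lr=0.1, steps=200):
    """Gradient descent on the mean logistic loss of the worst-case margins."""
    theta = np.zeros(X.shape[1])
    for _ in range(steps):
        _, g = dual_norm(theta, p)
        m = robust_margins(theta, X, y, eps, p)
        # d/dm log(1 + exp(-m)) = -sigmoid(-m), computed stably via logaddexp
        w = -np.exp(-np.logaddexp(0.0, m))
        # chain rule: d m_i / d theta = y_i * x_i - eps * g
        grad = (w[:, None] * (y[:, None] * X - eps * g[None, :])).mean(axis=0)
        theta -= lr * grad
    return theta
```

A typical use would draw a small, high-dimensional training set with `sample_mixture` (so that the number of features exceeds the number of samples), fit `theta` with `adversarial_train`, and then compare the training error on the noisy labels against the clean and robust error on fresh samples.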
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Statistical Methods and Models
