More Data Can Expand the Generalization Gap Between Adversarially Robust and Standard Models
Lin Chen, Yifei Min, Mingrui Zhang, Amin Karbasi

TL;DR
This paper shows that increasing the amount of training data can widen, rather than shrink, the generalization gap between adversarially robust and standard models, and it identifies the conditions under which more data eventually helps.
Contribution
The paper gives a theoretical analysis showing that more data can increase the generalization gap incurred by adversarial training, and it identifies the conditions under which additional data reduces that gap.
Findings
More data may increase the generalization gap between adversarially robust and standard models.
The theory identifies conditions under which additional data shrinks the gap.
Experimental validation on linear regression models.
Abstract
Despite remarkable success in practice, modern machine learning models have been found to be susceptible to adversarial attacks that make human-imperceptible perturbations to the data, but result in serious and potentially dangerous prediction errors. To address this issue, practitioners often use adversarial training to learn models that are robust against such attacks at the cost of higher generalization error on unperturbed test sets. The conventional wisdom is that more training data should shrink the gap between the generalization error of adversarially-trained models and standard models. However, we study the training of robust classifiers for both Gaussian and Bernoulli models under attacks, and we prove that more data may actually increase this gap. Furthermore, our theoretical results identify if and when additional data will finally begin to shrink the gap.…
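The claim about the Gaussian model can be explored numerically. The following is a minimal sketch under stated assumptions, not the authors' exact setup: labels are uniform on {-1, +1}, inputs are drawn as x = y * mu + sigma * noise, the loss is the linear loss -y * <w, x>, the weights are constrained to an l-infinity ball of radius W, and the attacker perturbs each input within an l-infinity ball of radius eps. The function names `fit_linear` and `mean_gap` and all parameters are hypothetical, chosen for illustration only.

```python
import numpy as np


def fit_linear(X, y, eps=0.0, W=1.0):
    """Minimize the average worst-case linear loss over ||w||_inf <= W.

    With perturbations bounded by eps in l_inf norm, the worst-case loss
    of one sample is -y * <w, x> + eps * ||w||_1, so the problem splits
    per coordinate into minimizing -w_j * u_j + eps * |w_j|.
    """
    u = (y[:, None] * X).mean(axis=0)  # per-coordinate mean of y_i * x_i
    # The optimum is sign(u_j) * W when |u_j| > eps, and 0 otherwise.
    return W * np.sign(u) * (np.abs(u) > eps)


def mean_gap(n, mu, sigma, eps, trials=2000, seed=0):
    """Monte Carlo estimate of the gap in clean test loss between the
    robust model (radius eps) and the standard model (radius 0),
    each trained on n samples."""
    rng = np.random.default_rng(seed)
    gaps = []
    for _ in range(trials):
        y = rng.choice([-1.0, 1.0], size=n)
        X = y[:, None] * mu + sigma * rng.standard_normal((n, mu.size))
        w_std = fit_linear(X, y)
        w_rob = fit_linear(X, y, eps)
        # Clean test loss is E[-y * <w, x>] = -<w, mu>, so the gap is
        # <w_std - w_rob, mu>.
        gaps.append((w_std - w_rob) @ mu)
    return float(np.mean(gaps))


if __name__ == "__main__":
    mu = np.full(10, 0.5)
    for n in (1, 5, 25, 125):
        print(n, mean_gap(n, mu, sigma=1.0, eps=0.75))
```

In this simplified setting, the robust model zeroes every coordinate whose empirical signal falls below eps. When eps exceeds the true mean, more samples make that event more likely, which is one mechanism by which the gap can grow with the sample size.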
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Bacillus and Francisella bacterial research
Methods: Test · Linear Regression
