Explicit Tradeoffs between Adversarial and Natural Distributional Robustness
Mazda Moayeri, Kiarash Banihashem, Soheil Feizi

TL;DR
This paper examines the tradeoffs between adversarial robustness and natural distributional robustness in deep neural networks, showing through theoretical analysis and empirical validation that improving one can degrade the other.
Contribution
It provides a theoretical and empirical analysis of the tradeoffs between adversarial and natural robustness, highlighting how adversarial training influences reliance on spurious features.
Findings
Adversarial training increases reliance on spurious features.
Under ℓ∞ adversarial training, spurious reliance arises only when the spurious features have a larger scale than the core features.
Adversarial training can reduce distributional robustness in changing domains.
Abstract
Several existing works study either adversarial or natural distributional robustness of deep neural networks separately. In practice, however, models need to enjoy both types of robustness to ensure reliability. In this work, we bridge this gap and show that in fact, explicit tradeoffs exist between adversarial and natural distributional robustness. We first consider a simple linear regression setting on Gaussian data with disjoint sets of core and spurious features. In this setting, through theoretical and empirical analysis, we show that (i) adversarial training with ℓ1 and ℓ2 norms increases the model reliance on spurious features; (ii) for ℓ∞ adversarial training, spurious reliance only occurs when the scale of the spurious features is larger than that of the core features; (iii) adversarial training can have an unintended consequence in reducing…
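The linear regression setting in the abstract can be illustrated with a small numerical sketch. The data model below (one noiseless core feature, three noisier spurious copies of it, and the specific noise levels) is an assumption chosen for illustration, not the paper's exact construction, and the function names are hypothetical. The sketch relies on a standard fact for linear models: the worst-case squared error under an ℓ2-bounded input perturbation of radius eps equals (|residual| + eps·‖w‖₂)², so adversarial training reduces to minimizing that expression directly.

```python
import numpy as np


def make_data(n=5000, n_spurious=3, spurious_noise=0.5, label_noise=0.1, seed=0):
    """Toy Gaussian data (an assumed setup, not the paper's exact one).

    Column 0 is the core feature that generates the label; the remaining
    columns are spurious features: noisy copies of the core feature.
    """
    rng = np.random.default_rng(seed)
    core = rng.normal(size=(n, 1))
    y = core[:, 0] + label_noise * rng.normal(size=n)
    spurious = core + spurious_noise * rng.normal(size=(n, n_spurious))
    return np.hstack([core, spurious]), y


def adversarial_fit(X, y, eps, lr=0.02, steps=5000):
    """Subgradient descent on mean((|y - Xw| + eps * ||w||_2)^2).

    With eps = 0 this is ordinary least squares fitted by gradient descent.
    """
    n = len(y)
    w = np.zeros(X.shape[1])
    for _ in range(steps):
        r = y - X @ w
        norm = np.linalg.norm(w)
        margin = np.abs(r) + eps * norm  # worst-case absolute error per sample
        grad_resid = -(np.sign(r) * margin) @ X / n
        grad_norm = eps * margin.mean() * w / (norm + 1e-12)
        w -= lr * 2.0 * (grad_resid + grad_norm)
    return w


def spurious_share(w):
    """Fraction of total absolute weight placed on the spurious columns."""
    return np.abs(w[1:]).sum() / np.abs(w).sum()


X, y = make_data()
w_std = adversarial_fit(X, y, eps=0.0)  # standard training
w_adv = adversarial_fit(X, y, eps=0.5)  # l2 adversarial training
print("standard    weights:", np.round(w_std, 3), "spurious share:", round(spurious_share(w_std), 3))
print("adversarial weights:", np.round(w_adv, 3), "spurious share:", round(spurious_share(w_adv), 3))
```

In this toy setup, standard training concentrates almost all weight on the core feature, while the adversarially trained model spreads weight across the spurious copies because doing so lowers ‖w‖₂ for the same prediction. A model that leans on those copies would be expected to suffer if their correlation with the core feature changed at test time.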
Taxonomy
Topics: Adversarial Robustness in Machine Learning
Methods: Test · Linear Regression
