The Pitfalls of Simplicity Bias in Neural Networks
Harshay Shah, Kaustav Tamuly, Aditi Raghunathan, Prateek Jain, Praneeth Netrapalli

TL;DR
This paper investigates Simplicity Bias (SB) in neural networks, shows that it can take extreme forms with consequences for generalization and robustness, and introduces datasets for evaluating mitigation strategies.
Contribution
The paper gives a precise notion of simplicity, demonstrates that SB can be extreme and harmful, and offers datasets for testing algorithms that aim to address SB's pitfalls.
Findings
Neural networks can rely solely on the simplest features due to SB.
SB can cause models to fail under distribution shifts and adversarial attacks.
Ensembles and adversarial training may not effectively mitigate SB.
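To make the first finding concrete, here is a minimal sketch of a toy dataset in which two features are each fully predictive but differ in simplicity. It is an illustration under stated assumptions, not the authors' exact construction: the function names (make_linear_slab_data, predict_linear, predict_slab, randomize_feature), the margin parameter, and the two-dimensional layout are all hypothetical. The first coordinate is separable with a single threshold; the second needs several thresholds. Shuffling one coordinate across examples breaks its link to the label, which gives a simple way to check which feature a classifier actually relies on.

```python
import numpy as np

def make_linear_slab_data(n, num_slabs=3, margin=0.1, seed=0):
    """Toy binary dataset with two fully predictive features (hypothetical).

    Column 0: 'simple' feature, separable by one threshold at 0.
    Column 1: 'complex' feature, split into num_slabs alternating intervals
              on [-1, 1], so it needs num_slabs - 1 thresholds.
    """
    assert num_slabs >= 2
    rng = np.random.default_rng(seed)
    y = rng.integers(0, 2, size=n)            # labels in {0, 1}

    # Simple feature: the sign encodes the label, magnitude in [margin, 1].
    lin = (2 * y - 1) * rng.uniform(margin, 1.0, size=n)

    # Complex feature: class 0 lives in even-indexed slabs, class 1 in odd ones.
    width = 2.0 / num_slabs
    slab_ids = np.arange(num_slabs)
    idx = np.where(y == 0,
                   rng.choice(slab_ids[::2], size=n),
                   rng.choice(slab_ids[1::2], size=n))
    inner = margin * width                    # keep points away from slab edges
    slab = -1.0 + idx * width + rng.uniform(inner, width - inner, size=n)

    return np.stack([lin, slab], axis=1), y

def predict_linear(X):
    """Classifier that looks only at the simple feature."""
    return (X[:, 0] > 0).astype(int)

def predict_slab(X, num_slabs=3):
    """Classifier that looks only at the complex feature."""
    width = 2.0 / num_slabs
    idx = np.clip(np.floor((X[:, 1] + 1.0) / width), 0, num_slabs - 1)
    return idx.astype(int) % 2

def randomize_feature(X, col, seed=1):
    """Return a copy of X with one column shuffled across examples."""
    rng = np.random.default_rng(seed)
    X_new = X.copy()
    X_new[:, col] = rng.permutation(X_new[:, col])
    return X_new

X, y = make_linear_slab_data(2000)
acc = lambda pred: float((pred == y).mean())
acc_lin_clean = acc(predict_linear(X))
acc_slab_clean = acc(predict_slab(X))
# A model that relies only on the simple feature is unaffected when the
# complex feature is shuffled, and falls to chance when the simple one is.
acc_lin_slab_shuffled = acc(predict_linear(randomize_feature(X, col=1)))
acc_lin_lin_shuffled = acc(predict_linear(randomize_feature(X, col=0)))
```

The same shuffling check can be applied to a trained network in place of predict_linear: if accuracy is unchanged after shuffling the complex coordinate and collapses after shuffling the simple one, the model is relying on the simple feature alone.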
Abstract
Several works have proposed Simplicity Bias (SB)---the tendency of standard training procedures such as Stochastic Gradient Descent (SGD) to find simple models---to justify why neural networks generalize well [Arpit et al. 2017, Nakkiran et al. 2019, Soudry et al. 2018]. However, the precise notion of simplicity remains vague. Furthermore, previous settings that use SB to theoretically justify why neural networks generalize well do not simultaneously capture the non-robustness of neural networks---a widely observed phenomenon in practice [Goodfellow et al. 2014, Jo and Bengio 2017]. We attempt to reconcile SB and the superior standard generalization of neural networks with the non-robustness observed in practice by designing datasets that (a) incorporate a precise notion of simplicity, (b) comprise multiple predictive features with varying levels of simplicity, and (c) capture the…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Advanced Neural Network Applications
Methods: Stochastic Gradient Descent
