Overcoming Simplicity Bias in Deep Networks using a Feature Sieve
Rishabh Tiwari, Pradeep Shenoy

TL;DR
This paper introduces a feature sieve method to automatically suppress spurious, simple features in deep networks, enabling better extraction of complex, meaningful representations and improving performance on real-world debiasing benchmarks.
Contribution
The paper presents a novel, intervention-based feature sieve technique that does not require prior knowledge of spurious features, outperforming existing methods in debiasing deep networks.
Findings
11.4% relative gain on ImageNet-A
3.2% improvement on the BAR dataset
Effective suppression of spurious features in real-world images
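To make the phrase "relative gain" concrete, the short sketch below contrasts a relative gain with an absolute one. The baseline and improved accuracies are hypothetical numbers chosen only so the arithmetic is easy to follow; they are not results from the paper, and the function names are illustrative.

```python
def relative_gain(baseline, improved):
    """Gain expressed as a fraction of the baseline score."""
    return (improved - baseline) / baseline


def absolute_gain(baseline, improved):
    """Gain expressed in raw score points."""
    return improved - baseline


# Hypothetical accuracies (in percent), NOT taken from the paper.
baseline_acc = 50.0
improved_acc = 55.7

rel = relative_gain(baseline_acc, improved_acc)    # about 0.114, i.e. 11.4% relative
pts = absolute_gain(baseline_acc, improved_acc)    # about 5.7 percentage points
print(f"relative: {rel:.1%}, absolute: {pts:.1f} points")
```

The same relative figure corresponds to a smaller absolute change when the baseline is low, so a relative gain is best read alongside the baseline it was measured against.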
Abstract
Simplicity bias is the concerning tendency of deep networks to over-depend on simple, weakly predictive features, to the exclusion of stronger, more complex features. This is exacerbated in real-world applications by limited training data and spurious feature-label correlations, leading to biased, incorrect predictions. We propose a direct, interventional method for addressing simplicity bias in DNNs, which we call the feature sieve. We aim to automatically identify and suppress easily-computable spurious features in lower layers of the network, thereby allowing the higher network levels to extract and utilize richer, more meaningful representations. We provide concrete evidence of this differential suppression & enhancement of relevant features on both controlled datasets and real-world images, and report substantial gains on many real-world debiasing benchmarks (11.4% relative gain on…
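The abstract does not spell out the suppression mechanism, so the following is only a minimal sketch of one way such an objective could look. It assumes (this is an assumption, not something stated above) that a small auxiliary classifier reads the lower-layer features, and that those features are penalized whenever the auxiliary classifier can predict the label confidently from them. The names `softmax` and `forgetting_loss` are hypothetical, and the sketch uses plain Python rather than a deep learning framework.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forgetting_loss(aux_logits):
    """Cross-entropy between a uniform target and the auxiliary prediction.

    Assumed objective: it is smallest (log K for K classes) when the
    auxiliary head is maximally uncertain, so minimizing it with respect
    to the lower-layer features would push them to carry less
    easily-decodable label information.
    """
    probs = softmax(aux_logits)
    k = len(probs)
    return -sum(math.log(p) for p in probs) / k


# Auxiliary head that is confident from low-level features: higher penalty.
confident = forgetting_loss([4.0, 0.0, 0.0])
# Auxiliary head that cannot tell the classes apart: minimum penalty, log(3).
uninformative = forgetting_loss([0.0, 0.0, 0.0])
print(confident > uninformative)
```

In a full training loop this penalty would presumably be applied only to the lower layers while the main classifier keeps training on the usual task loss; how the two are scheduled is a design choice this sketch leaves open.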
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Advanced Neural Network Applications
