Spurious Features Everywhere -- Large-Scale Detection of Harmful Spurious Features in ImageNet
Yannic Neuhaus, Maximilian Augustin, Valentyn Boreiko, Matthias Hein

TL;DR
This paper presents a large-scale framework for detecting harmful spurious features in ImageNet classifiers using neural PCA, introduces a new dataset for measuring reliance on these features, and proposes a mitigation method called SpuFix.
Contribution
It introduces a systematic approach to identify and measure harmful spurious features in large datasets like ImageNet, and proposes SpuFix to mitigate their impact without retraining.
Findings
Neural PCA components effectively visualize spurious features.
Presence of harmful spurious features can trigger incorrect class predictions.
SpuFix reduces reliance on harmful spurious features in classifiers.
Abstract
Benchmark performance of deep learning classifiers alone is not a reliable predictor for the performance of a deployed model. In particular, if the image classifier has picked up spurious features in the training data, its predictions can fail in unexpected ways. In this paper, we develop a framework that allows us to systematically identify spurious features in large datasets like ImageNet. It is based on our neural PCA components and their visualization. Previous work on spurious features often operates in toy settings or requires costly pixel-wise annotations. In contrast, we work with ImageNet and validate our results by showing that the presence of the harmful spurious feature of a class alone is sufficient to trigger the prediction of that class. We introduce the novel dataset "Spurious ImageNet", which makes it possible to measure the reliance of any ImageNet classifier on harmful spurious…
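The abstract does not spell out how the neural PCA components are computed. As a rough illustration only, the sketch below assumes they come from ordinary PCA applied to one class's penultimate-layer activations; that assumption, the function name `top_component`, and the toy feature vectors are all hypothetical and not taken from the paper. It finds the leading component by power iteration and ranks images by their projection onto it, which is one plausible way to pick images for visual inspection.

```python
import math
import random


def top_component(features, iters=200, seed=0):
    """Leading principal component of a set of feature vectors.

    features: list of equal-length lists (hypothetical penultimate-layer
    activations of the training images of a single class).
    Returns (direction, projections). The sign of the direction is arbitrary.
    """
    n, d = len(features), len(features[0])
    mean = [sum(row[j] for row in features) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in features]

    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    for _ in range(iters):
        # Apply the covariance matrix without forming it: C v = X^T (X v) / n
        xv = [sum(row[j] * v[j] for j in range(d)) for row in centered]
        v = [sum(centered[i][j] * xv[i] for i in range(n)) / n for j in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]

    projections = [sum(row[j] * v[j] for j in range(d)) for row in centered]
    return v, projections


# Toy data: four "images" whose features vary mostly along the first axis.
toy_features = [[-2.0, 0.1], [-1.0, -0.1], [1.0, 0.1], [2.0, -0.1]]
direction, scores = top_component(toy_features)
# Indices of the images sorted by how strongly they express the component.
ranked = sorted(range(len(scores)), key=lambda i: -scores[i])
```

In a real setting the feature vectors would come from a trained classifier, and a library routine such as an SVD would replace the hand-written power iteration; the pure-Python version is used here only to keep the sketch dependency-free.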
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI
Methods: fail · Principal Components Analysis
