Spurious Features Everywhere -- Large-Scale Detection of Harmful Spurious Features in ImageNet

Yannic Neuhaus; Maximilian Augustin; Valentyn Boreiko; Matthias Hein

arXiv:2212.04871 · cs.CV · August 24, 2023 · 1 citation

TL;DR

This paper presents a large-scale framework for detecting harmful spurious features in ImageNet classifiers using neural PCA, introduces a new dataset for measuring reliance on these features, and proposes a mitigation method called SpuFix.

Contribution

It introduces a systematic approach to identify and measure harmful spurious features in large datasets like ImageNet, and proposes SpuFix to mitigate their impact without retraining.

Findings

01

Neural PCA components effectively visualize spurious features.

02

Presence of harmful spurious features can trigger incorrect class predictions.

03

SpuFix reduces reliance on harmful spurious features in classifiers.
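
One hedged way to probe the second finding is to check whether a classifier can tell real images of a class apart from images that show only the suspected spurious feature. The sketch below computes an AUC from the class probabilities on both groups. The function name `separation_auc`, the choice of AUC as the measure, and the toy probabilities are assumptions for illustration; this page does not state how the authors quantify reliance.

```python
import numpy as np

def separation_auc(class_image_probs, spurious_only_probs):
    """AUC for separating true class images from spurious-only images.

    Both arguments are 1-D sequences of the predicted probability of the
    class under study. A value near 1.0 means the classifier scores real
    class images higher than spurious-only images; a value near 0.5 means
    it cannot tell them apart, which suggests reliance on the feature.
    """
    pos = np.asarray(class_image_probs, dtype=float)[:, None]
    neg = np.asarray(spurious_only_probs, dtype=float)[None, :]
    # Fraction of (class image, spurious image) pairs ranked correctly,
    # counting ties as half correct.
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

if __name__ == "__main__":
    # Hypothetical probabilities, not taken from the paper.
    real = [0.92, 0.85, 0.70, 0.66]
    spurious = [0.88, 0.40, 0.75, 0.10]
    print(separation_auc(real, spurious))
```

The pairwise formulation is quadratic in the number of images, which is acceptable for a few hundred images per class; a rank-based AUC would be the usual replacement for larger sets.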

Abstract

Benchmark performance of deep learning classifiers alone is not a reliable predictor of the performance of a deployed model. In particular, if the image classifier has picked up spurious features in the training data, its predictions can fail in unexpected ways. In this paper, we develop a framework that allows us to systematically identify spurious features in large datasets like ImageNet. It is based on our neural PCA components and their visualization. Previous work on spurious features often operates in toy settings or requires costly pixel-wise annotations. In contrast, we work with ImageNet and validate our results by showing that the presence of the harmful spurious feature of a class alone is sufficient to trigger the prediction of that class. We introduce the novel dataset "Spurious ImageNet", which allows one to measure the reliance of any ImageNet classifier on harmful spurious…
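
As a rough illustration of the neural PCA idea mentioned in the abstract, the sketch below runs a class-wise PCA on penultimate-layer features that have been multiplied elementwise by the last-layer weights of one class. The abstract does not spell out the construction, so this weighting, the function names `classwise_neural_pca` and `component_scores`, and the random stand-in features are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np

def classwise_neural_pca(features, class_weights, n_components=5):
    """PCA on weighted penultimate-layer features of one class.

    features:      (n_images, d) activations for training images of the class
    class_weights: (d,) last-layer weights for that class (assumed weighting)
    Returns the mean, the top principal directions, and their variances.
    """
    # Each column becomes that feature's contribution to the class logit.
    weighted = features * class_weights
    mean = weighted.mean(axis=0)
    centered = weighted - mean
    # Rows of vt are principal directions, sorted by decreasing variance.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    variances = singular_values[:n_components] ** 2 / (len(features) - 1)
    return mean, components, variances

def component_scores(features, class_weights, mean, components):
    """Project images onto the components; returns (n_images, n_components)."""
    return (features * class_weights - mean) @ components.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 16))   # stand-in for real activations
    w_k = rng.normal(size=16)            # stand-in for one class's weights
    mean, comps, var = classwise_neural_pca(feats, w_k, n_components=3)
    scores = component_scores(feats, w_k, mean, comps)
    # Images with the largest score on a component are candidates to inspect.
    print(np.argsort(-scores[:, 0])[:5])
```

Under these assumptions, the images that score highest on a given component are the natural ones to visualize when deciding whether that component captures the class object or a spurious feature.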

Code & Models

Repositories

yanneu/spurious_imagenet (PyTorch, official)

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · COVID-19 diagnosis using AI

Methods: Principal Components Analysis