Salient ImageNet: How to discover spurious features in Deep Learning?
Sahil Singla, Soheil Feizi

TL;DR
This paper presents a scalable, general framework for discovering and localizing spurious and core visual features in image classification models. It reveals that many models rely on spurious features, a dependence that standard accuracy metrics overlook.
Contribution
The authors introduce a novel neural feature-based methodology to identify and localize spurious and core features with minimal supervision, and create the Salient ImageNet dataset for analysis.
Findings
Models rely heavily on spurious features for predictions.
The neural feature annotations generalize well to many images.
Standard accuracy metrics do not fully capture model reliability.
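The last finding can be illustrated with a toy sketch. This is not the authors' methodology: two-element feature vectors stand in for images, and every name here (`make_dataset`, `core_reliant_model`, `spurious_reliant_model`, `corrupt`) is hypothetical. Assuming a dataset in which a spurious feature always co-occurs with the label, a model that reads only that feature scores the same standard accuracy as a model that reads the core feature. The difference appears only once the spurious feature is corrupted.

```python
import random


def make_dataset(n, seed=0):
    """Toy data: x = (core, spurious). Both features match the label here,
    which mimics a spurious feature that always co-occurs with the object."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        data.append(((float(label), float(label)), label))
    return data


def core_reliant_model(x):
    # Hypothetical model that looks only at the core feature (index 0).
    return int(x[0] > 0.5)


def spurious_reliant_model(x):
    # Hypothetical model that looks only at the spurious feature (index 1).
    return int(x[1] > 0.5)


def accuracy(model, data):
    return sum(model(x) == y for x, y in data) / len(data)


def corrupt(data, index, seed=1):
    """Replace one feature with uniform noise, leaving the label unchanged."""
    rng = random.Random(seed)
    corrupted = []
    for x, y in data:
        x = list(x)
        x[index] = rng.random()
        corrupted.append((tuple(x), y))
    return corrupted


data = make_dataset(1000)
noisy = corrupt(data, index=1)  # corrupt only the spurious feature

# Standard accuracy cannot tell the two models apart.
print("clean, core-reliant:    ", accuracy(core_reliant_model, data))
print("clean, spurious-reliant:", accuracy(spurious_reliant_model, data))

# Accuracy under corruption of the spurious feature separates them.
print("noisy, core-reliant:    ", accuracy(core_reliant_model, noisy))
print("noisy, spurious-reliant:", accuracy(spurious_reliant_model, noisy))
```

On the clean data both models are perfect by construction. After corruption the core-reliant model is unaffected, while the spurious-reliant model falls to roughly chance level.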
Abstract
Deep neural networks can be unreliable in the real world especially when they heavily use spurious features for their predictions. Focusing on image classifications, we define core features as the set of visual features that are always a part of the object definition while spurious features are the ones that are likely to co-occur with the object but not a part of it (e.g., attribute "fingers" for class "band aid"). Traditional methods for discovering spurious features either require extensive human annotations (thus, not scalable), or are useful on specific models. In this work, we introduce a general framework to discover a subset of spurious and core visual features used in inferences of a general model and localize them on a large number of images with minimal human supervision. Our methodology is based on this key idea: to identify spurious or core…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques
