Shortcut Learning Susceptibility in Vision Classifiers
Pirzada Suhail, Vrinda Goel, Amit Sethi

TL;DR
This paper investigates how different vision model architectures, including CNNs, MLPs, and ViTs, are susceptible to shortcut learning by systematically introducing artificial cues and analyzing their reliance on these shortcuts versus genuine features.
Contribution
The study provides a systematic evaluation of shortcut learning across various vision architectures using controlled datasets with artificial shortcuts, including qualitative analysis of internal representations.
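To make "controlled datasets with artificial shortcuts" concrete, here is a minimal sketch of one way such a cue could be injected. It is not the authors' construction: the patch shape, its placement, and the names `add_shortcut` and `make_split` are assumptions chosen for illustration. The idea is that the cue position encodes the class during training, then is decoupled from the class at test time.

```python
import random

def add_shortcut(image, cue, num_classes, patch=2, value=1.0):
    """Return a copy of `image` (list of rows) with a bright patch in the top rows.

    The patch's column offset encodes `cue`. This is a hypothetical shortcut
    design, not the one used in the paper.
    """
    height, width = len(image), len(image[0])
    if num_classes * patch > width or patch > height:
        raise ValueError("image too small for one patch slot per class")
    out = [row[:] for row in image]  # copy so the input is not mutated
    col0 = cue * patch
    for r in range(patch):
        for c in range(col0, col0 + patch):
            out[r][c] = value
    return out

def make_split(images, labels, num_classes, correlated, seed=0):
    """Build (image, label, cue) triples.

    correlated=True  -> cue equals the label (shortcut is predictive).
    correlated=False -> cue is drawn uniformly at random (shortcut is uninformative).
    """
    rng = random.Random(seed)
    split = []
    for img, y in zip(images, labels):
        cue = y if correlated else rng.randrange(num_classes)
        split.append((add_shortcut(img, cue, num_classes), y, cue))
    return split
```

A model trained on the `correlated=True` split and evaluated on the `correlated=False` split can only score well if it learned the genuine image features rather than the patch.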
Findings
CNNs trained at lower learning rates rely less on shortcuts.
ViTs without positional encodings tend to ignore actual image features in the presence of shortcuts.
Models vary significantly in their susceptibility to shortcut learning depending on architecture and training parameters.
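One simple way to quantify how much a model leans on a shortcut is sketched below. It is an illustrative metric, not necessarily the one reported in the paper, and the function name `shortcut_reliance` is hypothetical. It assumes a test set where each sample carries a cue class that may disagree with its true label, and it looks only at the samples where the two conflict.

```python
def shortcut_reliance(predictions, labels, cues):
    """Summarize behaviour on samples where the cue class differs from the true label.

    Returns the fraction of conflicting samples whose prediction matches the cue,
    the fraction whose prediction matches the true label, and the sample count.
    """
    conflict = [(p, y, c) for p, y, c in zip(predictions, labels, cues) if y != c]
    if not conflict:
        raise ValueError("no samples where cue and label disagree")
    n = len(conflict)
    follows_cue = sum(1 for p, _, c in conflict if p == c) / n
    follows_label = sum(1 for p, y, _ in conflict if p == y) / n
    return {"follows_cue": follows_cue, "follows_label": follows_label, "n_conflict": n}
```

A high `follows_cue` value suggests the model is reading the shortcut, and a high `follows_label` value suggests it is using the image content. The two need not sum to one, since a prediction can match neither.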
Abstract
Shortcut learning, where machine learning models exploit spurious correlations in data instead of capturing meaningful features, poses a significant challenge to building robust and generalizable models. This phenomenon is prevalent across various machine learning applications, including vision, natural language processing, and speech recognition, where models may find unintended cues that minimize training loss but fail to capture the underlying structure of the data. Vision classifiers based on Convolutional Neural Networks (CNNs), Multi-Layer Perceptrons (MLPs), and Vision Transformers (ViTs) leverage distinct architectural principles to process spatial and structural information, making them differently susceptible to shortcut learning. In this study, we systematically evaluate these architectures by introducing deliberate shortcuts into the dataset that are correlated with class…
Taxonomy
Topics: Neural Networks and Applications · Domain Adaptation and Few-Shot Learning
