Trivial or impossible -- dichotomous data difficulty masks model differences (on ImageNet and beyond)
Kristof Meding, Luca M. Schulze Buschoff, Robert Geirhos, Felix A. Wichmann

TL;DR
This paper investigates how dichotomous data difficulty in datasets such as ImageNet influences model decisions. Most images turn out to be either trivial or impossible for the models studied, which masks differences between models and makes it harder to see how their inductive biases differ.
Contribution
The paper demonstrates that dichotomous data difficulty dominates comparisons of model decision boundaries, and that removing trivial and impossible images reveals clearer differences between models.
Findings
46.0% of images are trivial for models
11.5% of images are impossible beyond label errors
Humans predict image difficulty for CNNs with 81.4% accuracy
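
The split behind the first two findings can be illustrated with a minimal sketch. It assumes that "trivial" means every model in the comparison classifies the image correctly and "impossible" means no model does; the summary above does not spell out the exact criterion, so treat this as an illustration rather than the authors' procedure. The function name `difficulty_split`, the label "contested", and the toy data are hypothetical.

```python
def difficulty_split(correct):
    """Label each image by how a set of models handles it.

    correct: list of rows, one per model; correct[m][i] is True when
    model m classifies image i correctly (assumed input layout).
    """
    n_images = len(correct[0])
    labels = []
    for i in range(n_images):
        votes = [row[i] for row in correct]
        if all(votes):
            labels.append("trivial")      # every model is right
        elif not any(votes):
            labels.append("impossible")   # every model is wrong
        else:
            labels.append("contested")    # models disagree
    return labels


# Toy data: 3 hypothetical models evaluated on 5 images.
correct = [
    [True, False, True, False, True],
    [True, False, False, True, True],
    [True, False, True, True, False],
]
labels = difficulty_split(correct)
share = {name: labels.count(name) / len(labels) for name in set(labels)}
print(labels)
print(share)
```

Only the "contested" images can separate one model's decisions from another's. If the two reported categories are disjoint and share a denominator, the figures above leave 100 - 46.0 - 11.5 = 42.5% of images in that group.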
Abstract
"The power of a generalization system follows directly from its biases" (Mitchell 1980). Today, CNNs are incredibly powerful generalisation systems -- but to what degree have we understood how their inductive bias influences model decisions? We here attempt to disentangle the various aspects that determine how a model decides. In particular, we ask: what makes one model decide differently from another? In a meticulously controlled setting, we find that (1.) irrespective of the network architecture or objective (e.g. self-supervised, semi-supervised, vision transformers, recurrent models) all models end up with a similar decision boundary. (2.) To understand these findings, we analysed model decisions on the ImageNet validation set from epoch to epoch and image by image. We find that the ImageNet validation set, among others, suffers from dichotomous data difficulty (DDD): For the range…
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning
