Trivial or impossible -- dichotomous data difficulty masks model differences (on ImageNet and beyond)

Kristof Meding; Luca M. Schulze Buschoff; Robert Geirhos; Felix A. Wichmann

arXiv:2110.05922·cs.CV·April 29, 2022·1 cites

TL;DR

This paper investigates how dichotomous data difficulty in datasets such as ImageNet shapes model decisions. Most validation images are either trivial or impossible for the models studied, which masks differences between models and obscures the effect of their inductive biases.

Contribution

The paper shows that dichotomous data difficulty dominates where models place their decision boundaries, and that removing trivial and impossible images reveals clearer differences between models.

Findings

01. 46.0% of images are trivial for models

02. 11.5% of images are impossible beyond label errors

03. Humans predict image difficulty for CNNs with 81.4% accuracy
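
To make the trivial / impossible split concrete, the sketch below labels each image from a table of per-model correctness. It is only an illustration under assumed definitions: an image counts as "trivial" when every model classifies it correctly and "impossible" when no model does, with everything else labelled "in-between". The paper's exact criterion is not stated in this listing and may differ (the analysis is described as running from epoch to epoch), and the function names and toy data are hypothetical. With the figures above, 100% - 46.0% - 11.5% = 42.5% of images would fall in the in-between group.

```python
from collections import Counter


def label_images(correct):
    """Label each image from a models-by-images table of booleans.

    correct[m][i] is True when (hypothetical) model m classifies image i
    correctly. Assumed definitions, not necessarily the paper's:
    all models correct -> "trivial", no model correct -> "impossible".
    """
    labels = []
    for per_image in zip(*correct):  # iterate over images (columns)
        if all(per_image):
            labels.append("trivial")
        elif not any(per_image):
            labels.append("impossible")
        else:
            labels.append("in-between")
    return labels


def label_fractions(labels):
    """Return the share of images carrying each label."""
    counts = Counter(labels)
    return {name: counts[name] / len(labels) for name in counts}


# Toy data: 3 hypothetical models evaluated on 6 images.
correct = [
    [True, True, False, False, True, False],
    [True, False, False, True, True, False],
    [True, True, False, False, False, False],
]

labels = label_images(correct)
fractions = label_fractions(labels)
print(labels)
print(fractions)
```

Only the "in-between" images can distinguish one model from another under these definitions, which is why filtering out the other two groups would be expected to sharpen model comparisons.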

Abstract

"The power of a generalization system follows directly from its biases" (Mitchell 1980). Today, CNNs are incredibly powerful generalisation systems -- but to what degree have we understood how their inductive bias influences model decisions? We here attempt to disentangle the various aspects that determine how a model decides. In particular, we ask: what makes one model decide differently from another? In a meticulously controlled setting, we find that (1.) irrespective of the network architecture or objective (e.g. self-supervised, semi-supervised, vision transformers, recurrent models) all models end up with a similar decision boundary. (2.) To understand these findings, we analysed model decisions on the ImageNet validation set from epoch to epoch and image by image. We find that the ImageNet validation set, among others, suffers from dichotomous data difficulty (DDD): For the range…

Code & Models

Repositories

wichmann-lab/trivial-or-impossible (PyTorch, official)

Videos

Trivial or Impossible --- dichotomous data difficulty masks model differences (on ImageNet and beyond) · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI) · Domain Adaptation and Few-Shot Learning