Do ImageNet Classifiers Generalize to ImageNet?

Benjamin Recht; Rebecca Roelofs; Ludwig Schmidt; Vaishaal Shankar

arXiv:1902.10811 · cs.CV · June 13, 2019 · 397 cites

TL;DR

This paper investigates how well current ImageNet classifiers perform on newly created test sets that closely follow original dataset creation processes, revealing notable accuracy drops and insights into model generalization.

Contribution

It introduces new test sets for CIFAR-10 and ImageNet to evaluate model generalization beyond re-used test data, highlighting the extent of accuracy drops on fresh data.

Findings

01

Accuracy drops by 3%-15% on CIFAR-10 and 11%-14% on ImageNet when models are evaluated on the new test sets.

02

Accuracy improvements on original test sets lead to larger gains on new test sets.

03

Models struggle to generalize to images that are slightly harder than those in the original test sets.

Abstract

We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.
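The comparison described in the abstract can be sketched in a few lines of Python. This is a minimal illustration and not the authors' evaluation code: the helper names (`top1_accuracy`, `accuracy_drop`, `fit_line`) are hypothetical, and the plain least-squares fit is an assumption about how one might check whether gains on the original test set translate to larger gains on the new one (a slope above 1 would indicate that).

```python
from typing import Sequence, Tuple


def top1_accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of examples whose predicted class equals the true class."""
    if len(predictions) != len(labels) or len(labels) == 0:
        raise ValueError("predictions and labels must be non-empty and equal in length")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def accuracy_drop(acc_original: float, acc_new: float) -> float:
    """Accuracy lost when moving from the original test set to the new one."""
    return acc_original - acc_new


def fit_line(xs: Sequence[float], ys: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit y = slope * x + intercept.

    xs: accuracies of several models on the original test set.
    ys: accuracies of the same models on the new test set.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

To use it, compute `top1_accuracy` for each model on both test sets, pass the two lists of accuracies to `fit_line`, and inspect the slope. The per-model drop comes from `accuracy_drop`.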

Code & Models

Repositories

modestyachts/ImageNetV2
pytorch · Official

Datasets

djghosh/wds_imagenetv2_test
dataset · 149 dl

Videos

Do ImageNet Classifiers Generalize to ImageNet? (Paper Explained) · YouTube

Taxonomy

Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis