Do ImageNet Classifiers Generalize to ImageNet?
Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, Vaishaal Shankar

TL;DR
This paper evaluates how well current CIFAR-10 and ImageNet classifiers perform on newly built test sets that closely follow the original dataset creation processes. Accuracy drops by 3%-15% on CIFAR-10 and 11%-14% on ImageNet, yet accuracy gains on the original test sets still carry over to the new ones.
Contribution
It introduces new test sets for CIFAR-10 and ImageNet to evaluate model generalization beyond re-used test data, highlighting the extent of accuracy drops on fresh data.
Findings
Accuracy drops of 3%-15% on CIFAR-10 and 11%-14% on ImageNet.
Accuracy improvements on original test sets lead to larger gains on new test sets.
The drops are attributed not to adaptive overfitting on re-used test sets, but to the models' inability to generalize to images slightly "harder" than those in the original test sets.
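The relationship between original and new test accuracy described in the findings can be illustrated with a simple linear fit. The sketch below is not the authors' analysis code: the accuracy values are made-up placeholders chosen only to show the pattern, and `fit_line` is a hypothetical helper. A fitted slope above 1 means each point of accuracy gained on the original test set corresponds to more than one point gained on the new test set, so the accuracy drop shrinks for stronger models.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need two or more paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Placeholder accuracies in percent, NOT results from the paper.
# Each position is one hypothetical model.
original_acc = [60.0, 70.0, 80.0]  # accuracy on the original test set
new_acc = [46.0, 57.0, 68.0]       # accuracy on the new test set

slope, intercept = fit_line(original_acc, new_acc)

# Per-model accuracy drop when moving from the original to the new test set.
drops = [o - n for o, n in zip(original_acc, new_acc)]

print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print("drops:", drops)
```

With real evaluation results in place of the placeholders, a slope greater than 1 together with a negative intercept would match the pattern reported above: every model loses accuracy on the new test set, but better models lose less.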
Abstract
We build new test sets for the CIFAR-10 and ImageNet datasets. Both benchmarks have been the focus of intense research for almost a decade, raising the danger of overfitting to excessively re-used test sets. By closely following the original dataset creation processes, we test to what extent current classification models generalize to new data. We evaluate a broad range of models and find accuracy drops of 3% - 15% on CIFAR-10 and 11% - 14% on ImageNet. However, accuracy gains on the original test sets translate to larger gains on the new test sets. Our results suggest that the accuracy drops are not caused by adaptivity, but by the models' inability to generalize to slightly "harder" images than those found in the original test sets.
Videos
Do ImageNet Classifiers Generalize to ImageNet? (Paper Explained) · YouTube
Taxonomy
Topics: Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Generative Adversarial Networks and Image Synthesis
