From ImageNet to Image Classification: Contextualizing Progress on Benchmarks

Dimitris Tsipras; Shibani Santurkar; Logan Engstrom; Andrew Ilyas; Aleksander Madry

arXiv:2005.11295·cs.CV·May 25, 2020·61 cites

TL;DR

This paper investigates how the design choices and noise in the ImageNet dataset creation process introduce biases and misalignments, affecting model evaluation and highlighting the need for improved benchmarking methods.

Contribution

It provides an analysis of the impact of data collection biases in ImageNet and offers refined annotations to better align benchmarks with real-world tasks.

Findings

01 Biases in ImageNet affect model performance evaluation.

02 Noisy data collection leads to systematic dataset-model misalignment.

03 Refined annotations improve benchmark fidelity.

Abstract

Building rich machine learning datasets in a scalable manner often necessitates a crowd-sourced data collection pipeline. In this work, we use human studies to investigate the consequences of employing such a pipeline, focusing on the popular ImageNet dataset. We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset, including the introduction of biases that state-of-the-art models exploit. Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for. Finally, our findings emphasize the need to augment our current model training and evaluation toolkit to take such misalignments into account. To facilitate further research, we release our refined ImageNet annotations at…

Code & Models

Repositories

MadryLab/ImageNetMultiLabel (official)

Videos

From ImageNet to Image Classification: Contextualizing Progress on Benchmarks · SlidesLive

Taxonomy

Topics: Mobile Crowdsensing and Crowdsourcing · Data Stream Mining Techniques · COVID-19 diagnosis using AI