The Impact of the Single-Label Assumption in Image Recognition Benchmarking

Esla Timothy Anzaku; Seyed Amir Mousavi; Arnout Van Messem; Wesley De Neve

arXiv:2412.18409·cs.CV·May 29, 2025

Open Access

TL;DR

This paper investigates how the common single-label evaluation assumption in image recognition underestimates models' multi-label recognition abilities, revealing that many models can recognize multiple objects despite training on single-label data.

Contribution

The study introduces a variable top-$k$ evaluation method and a new dataset, PatchML, to better assess multi-label prediction capabilities of models trained with single-label supervision.

Findings

01. Conventional top-1 accuracy penalizes valid secondary labels.

02. Models trained on single-label data often recognize multiple objects.

03. The perceived accuracy gap between ImageNet and ImageNetV2 narrows under the new evaluation.

Abstract

Deep neural networks (DNNs) are typically evaluated under the assumption that each image has a single correct label. However, many images in benchmarks like ImageNet contain multiple valid labels, creating a mismatch between evaluation protocols and the actual complexity of visual data. This mismatch can penalize DNNs for predicting correct but unannotated labels, which may partly explain reported accuracy drops, such as the widely cited 11 to 14 percent top-1 accuracy decline on ImageNetV2, a replication test set for ImageNet. This raises the question: do such drops reflect genuine generalization failures or artifacts of restrictive evaluation metrics? We rigorously assess the impact of multi-label characteristics on reported accuracy gaps. To evaluate the multi-label prediction capability (MLPC) of single-label-trained models, we introduce a variable top-$k$ evaluation, where $k$ …
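To make the contrast between the two protocols concrete, here is a minimal sketch in Python with NumPy. The abstract is cut off before it says how $k$ is chosen, so the sketch assumes that $k$ for each image equals the number of labels annotated as valid for that image. The function names (`top1_accuracy`, `variable_topk_score`), the per-image scoring rule (fraction of valid labels recovered), and the toy logits are all hypothetical illustrations, not the authors' implementation.

```python
import numpy as np


def top1_accuracy(logits, primary_labels):
    """Conventional single-label metric: the argmax must equal the one annotated label."""
    predictions = np.argmax(logits, axis=1)
    return float(np.mean(predictions == np.asarray(primary_labels)))


def variable_topk_score(logits, label_sets):
    """Assumed variable top-k metric: k is the number of valid labels for each image.

    Each image is scored by the fraction of its valid labels that appear
    among its k highest-scoring classes; the scores are then averaged.
    """
    scores = []
    for row, valid in zip(logits, label_sets):
        k = len(valid)                       # assumption: k = number of valid labels
        topk = set(np.argsort(-row)[:k].tolist())
        scores.append(len(topk & set(valid)) / k)
    return float(np.mean(scores))


# Toy data: 3 images, 4 classes (values invented for illustration only).
logits = np.array([
    [2.0, 1.5, 0.1, 0.0],   # two objects present: classes 0 and 1
    [0.2, 3.0, 2.5, 0.1],   # two objects present: classes 1 and 2
    [1.0, 0.5, 0.2, 2.0],   # one object present: class 3
])
primary_labels = [1, 1, 3]            # the single label kept by the benchmark
label_sets = [{0, 1}, {1, 2}, {3}]    # all labels judged valid for each image

top1 = top1_accuracy(logits, primary_labels)
vtk = variable_topk_score(logits, label_sets)
print("top-1 accuracy:", top1)
print("variable top-k score:", vtk)
```

In this toy case the first image's top prediction (class 0) is a valid label but not the annotated one, so top-1 accuracy is 2/3 while the variable top-$k$ score is 1.0. That is the kind of penalty for correct but unannotated predictions that the abstract describes.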

Taxonomy

Topics: AI in cancer detection · Biomedical Text Mining and Ontologies · Radiomics and Machine Learning in Medical Imaging

Methods: Softmax · Attention Is All You Need