When VLMs Meet Image Classification: Test Sets Renovation via Missing Label Identification

Zirui Pang; Haosheng Tan; Yuhan Pu; Zhijie Deng; Zhouan Shen; Keyu Hu; Jiaheng Wei

arXiv:2505.16149 · cs.CV · May 23, 2025

TL;DR

This paper introduces REVEAL, a comprehensive framework that leverages vision-language models and human curation to detect and correct noisy and missing labels in image classification datasets, improving their quality for fair evaluation.

Contribution

The paper presents a novel integrated approach combining pre-trained vision-language models with advanced curation methods to address both noisy and missing labels in benchmark datasets.

Findings

01. REVEAL effectively detects missing labels in public datasets.

02. It significantly improves dataset quality through human verification.

03. The method aligns well with human judgments, enhancing evaluation fairness.

Abstract

Image classification benchmark datasets such as CIFAR, MNIST, and ImageNet serve as critical tools for model evaluation. However, despite cleaning efforts, these datasets still suffer from pervasive noisy labels and often contain missing labels, because multiple classes can appear together in a single image sample. This results in misleading model comparisons and unfair evaluations. Existing label cleaning methods focus primarily on noisy labels, while the issue of missing labels remains largely overlooked. Motivated by these challenges, we present a comprehensive framework named REVEAL, integrating state-of-the-art pre-trained vision-language models (e.g., LLaVA, BLIP, Janus, Qwen) with advanced machine/human label curation methods (e.g., Docta, Cleanlab, MTurk), to systematically address both noisy labels and missing label detection in widely-used image classification…
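The abstract as shown here is truncated and does not say how REVEAL combines the outputs of its vision-language models. As a purely illustrative sketch of the general idea of missing-label identification, the snippet below flags classes that several models report in an image but that the dataset's single label omits. The function name `propose_missing_labels`, the `min_agreement` threshold, the simple majority-vote rule, and the example model names are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def propose_missing_labels(given_label, model_votes, min_agreement=0.5):
    """Return candidate labels that enough models report but the dataset omits.

    given_label:   the single label stored in the dataset for this image.
    model_votes:   dict mapping a (hypothetical) model name to the list of
                   class names that model says are present in the image.
    min_agreement: fraction of models that must agree before a class is
                   proposed as a missing label (an assumed rule).
    """
    if not model_votes:
        return []
    counts = Counter()
    for labels in model_votes.values():
        counts.update(set(labels))  # each model votes at most once per class
    n_models = len(model_votes)
    return sorted(
        label
        for label, count in counts.items()
        if label != given_label and count / n_models >= min_agreement
    )


# Hypothetical per-model predictions for one image labeled "dog".
votes = {
    "model_a": ["dog", "ball"],
    "model_b": ["dog", "ball", "grass"],
    "model_c": ["dog"],
}
candidates = propose_missing_labels("dog", votes)
print(candidates)
```

In a pipeline of the kind the abstract describes, candidates like these would presumably still be passed to human annotators for verification rather than written into the test set directly.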

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning

Methods: Focus · BLIP: Bootstrapping Language-Image Pre-training