Human-Corrected Labels Learning: Enhancing Labels Quality via Human Correction of VLMs Discrepancies
Zhongnian Li, Lan Chen, Yixin Xu, Shi Xu, Xinzheng Xu

TL;DR
This paper introduces Human-Corrected Labels (HCLs), a method that improves label quality by selectively correcting VLM-generated labels with human input, leading to better classifier training and noise robustness.
Contribution
It proposes a novel human correction framework for VLM-generated labels, along with a risk-consistent estimator and a label distribution estimation method, to improve label quality and training robustness.
Findings
HCL improves classification accuracy in noisy label scenarios.
Selective human correction reduces annotation costs.
Theoretical analysis confirms risk consistency of the approach.
Abstract
Vision-Language Models (VLMs), with their powerful content generation capabilities, have been successfully applied to data annotation processes. However, VLM-generated labels exhibit two limitations: low quality (i.e., label noise) and the absence of error correction mechanisms. To enhance label quality, we propose Human-Corrected Labels (HCLs), a novel setting that enables efficient human correction of VLM-generated noisy labels. As shown in Figure 1(b), HCL strategically deploys human correction only for instances with VLM discrepancies, achieving both higher-quality annotations and reduced labor costs. Specifically, we theoretically derive a risk-consistent estimator that incorporates both human-corrected labels and VLM predictions to train classifiers. We further propose a conditional probability method to estimate the label distribution using a combination of VLM outputs and…
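The selective-correction idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how a "VLM discrepancy" is detected, so the sketch assumes the simplest possible criterion: two sets of VLM predictions disagree on an instance. The function names `select_for_correction` and `apply_human_correction`, and the `ask_human` callback, are hypothetical and are not taken from the paper. The risk-consistent estimator and the label distribution estimation are not covered here.

```python
from typing import Callable, List, Sequence, Tuple


def select_for_correction(preds_a: Sequence[int], preds_b: Sequence[int]) -> List[int]:
    """Return the indices where two prediction lists disagree.

    Assumption: disagreement between two VLM outputs stands in for the
    paper's notion of a "VLM discrepancy".
    """
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    return [i for i, (a, b) in enumerate(zip(preds_a, preds_b)) if a != b]


def apply_human_correction(
    preds_a: Sequence[int],
    preds_b: Sequence[int],
    ask_human: Callable[[int], int],
) -> Tuple[List[int], List[int]]:
    """Keep agreed labels and query a human only for flagged instances.

    `ask_human` is a hypothetical callback mapping an instance index to
    the label a human annotator would assign.
    """
    flagged = select_for_correction(preds_a, preds_b)
    labels = list(preds_a)  # where both predictions agree, keep that label
    for i in flagged:
        labels[i] = ask_human(i)
    return labels, flagged


# Toy usage: four instances, one disagreement, so one human query.
vlm_a = [0, 1, 2, 1]
vlm_b = [0, 2, 2, 1]
human_labels = {1: 2}  # made-up annotator answer for the flagged instance
labels, flagged = apply_human_correction(vlm_a, vlm_b, lambda i: human_labels[i])
print(labels, flagged)
```

In this toy run only one of four instances is sent to the annotator, which is the cost saving the abstract refers to. Labels on which both predictions agree can still be wrong, which is presumably why the paper pairs the setting with a risk-consistent estimator rather than treating the resulting labels as clean.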
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Topic Modeling
