Learning with Noisy Labels Using Collaborative Sample Selection and Contrastive Semi-Supervised Learning
Qing Miao, Xiaohe Wu, Chao Xu, Yanli Ji, Wangmeng Zuo, Yiwen Guo, Zhaopeng Meng

TL;DR
This paper introduces Collaborative Sample Selection (CSS), a method that uses CLIP together with contrastive semi-supervised learning to identify clean samples more reliably and reduce the impact of label noise, improving generalization when learning with noisy labels.
Contribution
The paper proposes a new approach combining CLIP, Gaussian Mixture Models, and contrastive co-training to enhance clean sample selection and mitigate confirmation bias in noisy label learning.
Findings
Outperforms state-of-the-art methods on benchmark datasets.
Effectively removes noisy samples from the clean set.
Improves DNN generalization in noisy label scenarios.
Abstract
Learning with noisy labels (LNL) has been extensively studied, with existing approaches typically following a framework that alternates between clean sample selection and semi-supervised learning (SSL). However, this approach has a limitation: the clean set selected by the Deep Neural Network (DNN) classifier, trained through self-training, inevitably contains noisy samples. This mixture of clean and noisy samples leads to misguidance in DNN training during SSL, resulting in impaired generalization performance due to confirmation bias caused by error accumulation in sample selection. To address this issue, we propose a method called Collaborative Sample Selection (CSS), which leverages the large-scale pre-trained model CLIP. CSS aims to remove the mixed noisy samples from the identified clean set. We achieve this by training a 2-Dimensional Gaussian Mixture Model (2D-GMM) that combines…
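The selection step described in the abstract can be illustrated with a small sketch. The abstract is cut off before it names the quantities the 2D-GMM combines, so the code below does not try to reproduce the paper's method: it fits a two-component Gaussian mixture to two placeholder per-sample scores (synthetic numbers, not outputs of a DNN or CLIP) and keeps the samples assigned to the lower-scoring component. The function names, the diagonal-covariance choice, the "lower score means cleaner" convention, and the 0.5 threshold are all assumptions made for illustration. It uses only the standard library to stay self-contained; in practice a library implementation such as scikit-learn's GaussianMixture would be the more usual choice.

```python
import math
import random


def fit_2d_gmm(points, n_iter=100):
    """EM for a 2-component, diagonal-covariance Gaussian mixture on 2-D points.

    Returns (weights, means, variances, responsibilities).
    """
    # Initialise the two means at the points with the smallest and largest score sum.
    ordered = sorted(points, key=lambda p: p[0] + p[1])
    means = [list(ordered[0]), list(ordered[-1])]
    variances = [[1.0, 1.0], [1.0, 1.0]]
    weights = [0.5, 0.5]
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in points:
            logp = []
            for k in range(2):
                lp = math.log(max(weights[k], 1e-12))
                for d in range(2):
                    var = variances[k][d]
                    lp -= 0.5 * (math.log(2 * math.pi * var)
                                 + (x[d] - means[k][d]) ** 2 / var)
                logp.append(lp)
            top = max(logp)
            exps = [math.exp(v - top) for v in logp]
            total = sum(exps)
            resp.append([e / total for e in exps])
        # M-step: re-estimate weights, means and per-dimension variances.
        for k in range(2):
            nk = max(sum(r[k] for r in resp), 1e-12)
            weights[k] = nk / len(points)
            for d in range(2):
                mu = sum(r[k] * x[d] for r, x in zip(resp, points)) / nk
                var = sum(r[k] * (x[d] - mu) ** 2 for r, x in zip(resp, points)) / nk
                means[k][d] = mu
                variances[k][d] = max(var, 1e-6)  # floor avoids a collapsed component
    return weights, means, variances, resp


def select_clean(points, threshold=0.5):
    """Indices whose posterior under the lower-mean component exceeds threshold.

    Assumption: lower scores indicate clean samples (as with small-loss selection).
    """
    _, means, _, resp = fit_2d_gmm(points)
    clean_k = 0 if sum(means[0]) <= sum(means[1]) else 1
    return [i for i, r in enumerate(resp) if r[clean_k] > threshold]


def make_toy_scores(n_clean=60, n_noisy=40, seed=0):
    """Synthetic placeholder scores: clean samples cluster low, noisy ones high."""
    rng = random.Random(seed)
    clean = [(rng.gauss(0.2, 0.05), rng.gauss(0.2, 0.05)) for _ in range(n_clean)]
    noisy = [(rng.gauss(0.8, 0.05), rng.gauss(0.8, 0.05)) for _ in range(n_noisy)]
    return clean + noisy
```

Calling `select_clean(make_toy_scores())` returns the indices of the samples placed in the low-score cluster. Fitting the mixture jointly over two scores, rather than thresholding each score separately, lets a sample be kept or rejected based on where it falls in the combined score space.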
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Water Systems and Optimization
Methods: Sparse Evolutionary Training · Contrastive Language-Image Pre-training
