Contrastive Credibility Propagation for Reliable Semi-Supervised Learning
Brody Kutt, Pralay Ramteke, Xavier Mignot, Pamela Toman, Nandini Ramanan, Sujit Rokka Chhetri, Shan Huang, Min Du, William Hewlett

TL;DR
This paper introduces Contrastive Credibility Propagation (CCP), a semi-supervised learning (SSL) algorithm that iteratively refines pseudo-labels and reliably outperforms a supervised baseline across diverse real-world data scenarios.
Contribution
CCP unifies semi-supervised learning and noisy-label learning, handling a range of challenging data scenarios to make SSL more reliable.
Findings
CCP outperforms supervised baselines in all tested scenarios.
CCP effectively handles noisy, open-set, and imbalanced data.
The method demonstrates robustness across multiple real-world SSL challenges.
Abstract
Producing labels for unlabeled data is error-prone, making semi-supervised learning (SSL) troublesome. Often, little is known about when and why an algorithm fails to outperform a supervised baseline. Using benchmark datasets, we craft five common real-world SSL data scenarios: few-label, open-set, noisy-label, and class distribution imbalance/misalignment in the labeled and unlabeled sets. We propose a novel algorithm called Contrastive Credibility Propagation (CCP) for deep SSL via iterative transductive pseudo-label refinement. CCP unifies semi-supervised learning and noisy label learning for the goal of reliably outperforming a supervised baseline in any data scenario. Compared to prior methods which focus on a subset of scenarios, CCP uniquely outperforms the supervised baseline in all scenarios, supporting practitioners when the qualities of labeled or unlabeled data are unknown.
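To make the data scenarios above concrete, the following is a minimal sketch of how few-label, noisy-label, and open-set splits might be crafted from a fully labeled benchmark. It is an illustration under stated assumptions, not the authors' protocol: the function name `make_ssl_scenario` and all of its parameters are hypothetical, label noise is assumed to be uniform over the other known classes, and class imbalance or misalignment is not modeled here.

```python
import numpy as np

def make_ssl_scenario(labels, n_labeled_per_class, noise_rate=0.0,
                      open_set_classes=(), seed=0):
    """Split a fully labeled dataset into labeled and unlabeled index sets.

    Hypothetical helper, not taken from the paper.
    - few-label:   keep only n_labeled_per_class labels per known class
    - noisy-label: flip each kept label with probability noise_rate
                   (requires at least two known classes when noise_rate > 0)
    - open-set:    classes in open_set_classes appear only in the unlabeled set
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    known = classes[~np.isin(classes, list(open_set_classes))]

    # Few-label: sample a small labeled subset from each known class.
    labeled_idx = np.concatenate([
        rng.choice(np.flatnonzero(labels == c),
                   size=n_labeled_per_class, replace=False)
        for c in known
    ])

    # Everything else is unlabeled, including all open-set samples.
    mask = np.ones(len(labels), dtype=bool)
    mask[labeled_idx] = False
    unlabeled_idx = np.flatnonzero(mask)

    # Noisy-label: replace a fraction of given labels with a different known class.
    given = labels[labeled_idx].copy()
    flip = rng.random(len(given)) < noise_rate
    for i in np.flatnonzero(flip):
        given[i] = rng.choice(known[known != given[i]])

    return labeled_idx, given, unlabeled_idx


# Toy usage: 4 classes with 50 samples each; class 3 is held out as open-set.
toy_labels = np.repeat(np.arange(4), 50)
lab_idx, given_labels, unlab_idx = make_ssl_scenario(
    toy_labels, n_labeled_per_class=5, noise_rate=0.2, open_set_classes=(3,)
)
```

Keeping the true labels alongside `given_labels` makes it possible to measure how many noisy labels a method corrects, and comparing against a model trained only on `lab_idx` gives the supervised baseline the abstract refers to.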
Taxonomy
Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Imbalanced Data Classification Techniques
