Contrastive Credibility Propagation for Reliable Semi-Supervised Learning

Brody Kutt; Pralay Ramteke; Xavier Mignot; Pamela Toman; Nandini Ramanan; Sujit Rokka Chhetri; Shan Huang; Min Du; William Hewlett

arXiv:2211.09929·cs.LG·April 3, 2024

TL;DR

This paper introduces Contrastive Credibility Propagation (CCP), a semi-supervised learning (SSL) algorithm that iteratively refines pseudo-labels and outperforms a supervised baseline in every data scenario the authors test.

Contribution

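CCP unifies semi-supervised learning and noisy-label learning, with the goal of reliably outperforming a supervised baseline when the data are few-label, open-set, noisy-label, or class-imbalanced or misaligned between the labeled and unlabeled sets.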

Findings

1. CCP outperforms supervised baselines in all tested scenarios.

2. CCP effectively handles noisy, open-set, and imbalanced data.

3. The method demonstrates robustness across multiple real-world SSL challenges.

Abstract

Producing labels for unlabeled data is error-prone, making semi-supervised learning (SSL) troublesome. Often, little is known about when and why an algorithm fails to outperform a supervised baseline. Using benchmark datasets, we craft five common real-world SSL data scenarios: few-label, open-set, noisy-label, and class distribution imbalance/misalignment in the labeled and unlabeled sets. We propose a novel algorithm called Contrastive Credibility Propagation (CCP) for deep SSL via iterative transductive pseudo-label refinement. CCP unifies semi-supervised learning and noisy label learning for the goal of reliably outperforming a supervised baseline in any data scenario. Compared to prior methods which focus on a subset of scenarios, CCP uniquely outperforms the supervised baseline in all scenarios, supporting practitioners when the qualities of labeled or unlabeled data are unknown.

Code & Models

Repositories

PaloAltoNetworks/CCP_CIFAR (TensorFlow, official)

Taxonomy

Topics: Machine Learning and Data Classification · Text and Document Classification Technologies · Imbalanced Data Classification Techniques