SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning

Chaoqun Du; Yizeng Han; Gao Huang

arXiv:2402.13505 · cs.LG · July 31, 2024 · 1 citation

TL;DR

SimPro introduces a flexible probabilistic framework for semi-supervised learning that effectively handles imbalanced and unknown class distributions without relying on rigid assumptions, achieving state-of-the-art results.

Contribution

The paper presents SimPro, a probabilistic framework that makes no predefined assumptions about the class distribution of unlabeled data. It refines the expectation-maximization (EM) algorithm to improve class distribution estimation and pseudo-labeling in semi-supervised learning.

Findings

01. Achieves state-of-the-art performance across diverse benchmarks.

02. Effectively handles imbalanced and mismatched class distributions.

03. Provides theoretical guarantees and is easy to implement.

Abstract

Recent advancements in semi-supervised learning have focused on a more realistic yet challenging task: addressing imbalances in labeled data while the class distribution of unlabeled data remains both unknown and potentially mismatched. Current approaches in this sphere often presuppose rigid assumptions regarding the class distribution of unlabeled data, thereby limiting the adaptability of models to only certain distribution ranges. In this study, we propose a novel approach, introducing a highly adaptable framework, designated as SimPro, which does not rely on any predefined assumptions about the distribution of unlabeled data. Our framework, grounded in a probabilistic model, innovatively refines the expectation-maximization (EM) algorithm by explicitly decoupling the modeling of conditional and marginal class distributions. This separation facilitates a closed-form solution for…
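The idea of treating the marginal class distribution as a separate quantity inside an EM loop can be illustrated with the classic EM prior re-estimation used for label shift. The sketch below is a generic illustration under stated assumptions, not SimPro's algorithm: the function name `estimate_class_prior`, its arguments, and the uniform initialization are all hypothetical choices made for this example. It assumes a classifier trained under a known prior `train_prior` has already produced posteriors for the unlabeled samples.

```python
def estimate_class_prior(posteriors, train_prior, n_iters=50):
    """Generic EM re-estimation of the class prior on unlabeled data.

    posteriors:  list of per-sample lists, classifier estimates of p(y | x)
                 obtained under the training prior (assumed input).
    train_prior: list of class frequencies in the labeled set.
    Returns the estimated class prior of the unlabeled set.
    """
    num_classes = len(train_prior)
    prior = [1.0 / num_classes] * num_classes  # assumption: uniform start

    for _ in range(n_iters):
        totals = [0.0] * num_classes
        for p in posteriors:
            # E-step: reweight the classifier posterior by the prior ratio,
            # then normalize to get per-class responsibilities.
            weights = [p[k] * prior[k] / train_prior[k] for k in range(num_classes)]
            norm = sum(weights)
            for k in range(num_classes):
                totals[k] += weights[k] / norm
        # M-step: the updated prior is the average responsibility per class.
        prior = [t / len(posteriors) for t in totals]

    return prior


if __name__ == "__main__":
    # Toy posteriors for three unlabeled samples and a balanced labeled set.
    toy_posteriors = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]
    print(estimate_class_prior(toy_posteriors, train_prior=[0.5, 0.5]))
```

The M-step here is a simple average because the prior is modeled apart from the classifier, which is the general motivation for separating the two. The sketch assumes every sample keeps nonzero total weight after reweighting; a practical version would guard against division by zero.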

Code & Models

Repositories

leaplabthu/simpro (PyTorch, official)

Taxonomy

Topics: Machine Learning and Data Classification