CReST: A Class-Rebalancing Self-Training Framework for Imbalanced Semi-Supervised Learning
Chen Wei, Kihyuk Sohn, Clayton Mellina, Alan Yuille, Fan Yang

TL;DR
CReST is a class-rebalancing self-training framework for semi-supervised learning on imbalanced datasets. It expands the labeled set with pseudo-labeled samples, selecting minority-class samples more frequently, and improves on existing SSL methods.
Contribution
The paper proposes CReST, a self-training framework with class-rebalancing sampling of pseudo-labels, and CReST+, which adds progressive distribution alignment to adaptively adjust the rebalancing strength. Both improve SSL on imbalanced data and outperform prior rebalancing techniques.
Findings
CReST and CReST+ outperform state-of-the-art SSL methods on imbalanced datasets.
The methods outperform other popular rebalancing techniques.
Pseudo-labels for minority classes are highly precise, enabling effective rebalancing.
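The last finding is about per-class precision, which is easy to confuse with recall. The sketch below shows one way to measure both for pseudo-labels on data whose true labels are known. The function name and the toy labels are hypothetical and are not taken from the paper.

```python
def per_class_precision_recall(true_labels, pseudo_labels, num_classes):
    """Return (precision, recall) lists indexed by class id.

    An entry is None when the class was never predicted (precision)
    or never present in the true labels (recall).
    """
    true_positive = [0] * num_classes
    predicted = [0] * num_classes
    actual = [0] * num_classes
    for t, p in zip(true_labels, pseudo_labels):
        predicted[p] += 1
        actual[t] += 1
        if t == p:
            true_positive[p] += 1
    precision = [true_positive[c] / predicted[c] if predicted[c] else None
                 for c in range(num_classes)]
    recall = [true_positive[c] / actual[c] if actual[c] else None
              for c in range(num_classes)]
    return precision, recall


# Toy data (made up): class 0 is the majority, class 1 the minority.
true_labels = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
pseudo_labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]
precision, recall = per_class_precision_recall(true_labels, pseudo_labels, 2)
```

In this toy case the model rarely predicts the minority class, so its recall is low, but the one prediction it makes for that class is correct, so its precision is high. That is the pattern the finding describes.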
Abstract
Semi-supervised learning on class-imbalanced data, although a realistic problem, has been understudied. While existing semi-supervised learning (SSL) methods are known to perform poorly on minority classes, we find that they still generate high-precision pseudo-labels on minority classes. By exploiting this property, in this work, we propose Class-Rebalancing Self-Training (CReST), a simple yet effective framework to improve existing SSL methods on class-imbalanced data. CReST iteratively retrains a baseline SSL model with a labeled set expanded by adding pseudo-labeled samples from an unlabeled set, where pseudo-labeled samples from minority classes are selected more frequently according to an estimated class distribution. We also propose progressive distribution alignment to adaptively adjust the rebalancing strength, dubbed CReST+. We show that CReST and CReST+ improve…
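The selection step described in the abstract, where minority-class pseudo-labels are kept more often, can be sketched as follows. This is a minimal illustration rather than the authors' implementation. The keep-rate rule (the class at frequency rank r is kept at the rate given by the count of the class at the mirrored rank, divided by the largest count, raised to a power alpha) is an assumption of this sketch and is not stated in the text above. The function names, the alpha parameter, and the use of raw class counts as the estimated class distribution are likewise hypothetical.

```python
import random


def rebalancing_rates(class_counts, alpha=1.0):
    """Per-class keep rates: the rarer the class, the higher the rate.

    Assumes every count is positive. class_counts stands in for the
    estimated class distribution (an assumption of this sketch).
    """
    num_classes = len(class_counts)
    # Class indices ordered from most to least frequent.
    order = sorted(range(num_classes), key=lambda c: -class_counts[c])
    largest = class_counts[order[0]]
    rates = [0.0] * num_classes
    for rank, cls in enumerate(order):
        # The class at rank r uses the count of the class at the mirrored rank,
        # so the most frequent class gets the smallest rate and the rarest gets 1.
        mirrored = class_counts[order[num_classes - 1 - rank]]
        rates[cls] = (mirrored / largest) ** alpha
    return rates


def select_pseudo_labeled(pseudo_labels, class_counts, alpha=1.0, seed=0):
    """Return indices of pseudo-labeled samples to add to the labeled set."""
    rng = random.Random(seed)
    rates = rebalancing_rates(class_counts, alpha)
    return [i for i, y in enumerate(pseudo_labels) if rng.random() < rates[y]]


# Hypothetical counts for three classes, most frequent first.
rates = rebalancing_rates([100, 10, 1])
kept = select_pseudo_labeled([2, 2, 2], [100, 10, 1])
```

With alpha set to 0 every rate becomes 1 and no rebalancing happens, while larger values thin out majority-class pseudo-labels more aggressively. In a full self-training loop, the selected samples would be merged into the labeled set before the baseline SSL model is retrained.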
Taxonomy
Topics: Imbalanced Data Classification Techniques · Vehicle License Plate Recognition · Domain Adaptation and Few-Shot Learning
