Learning Label Refinement and Threshold Adjustment for Imbalanced Semi-Supervised Learning
Zeju Li, Ying-Qiu Zheng, Chen Chen, Saad Jbabdi

TL;DR
This paper introduces SEVAL, a pseudo-label refinement and threshold adjustment method for imbalanced semi-supervised learning that improves pseudo-label quality and outperforms existing SSL techniques.
Contribution
The paper proposes SEVAL, a class-balanced pseudo-label optimization approach that enhances pseudo-label accuracy and robustness in imbalanced SSL scenarios.
Findings
SEVAL outperforms state-of-the-art SSL methods in imbalanced settings.
SEVAL improves pseudo-label accuracy and class-wise correctness.
The method is simple, flexible, and applicable to various SSL techniques.
Abstract
Semi-supervised learning (SSL) algorithms struggle to perform well when exposed to imbalanced training data. In this scenario, the generated pseudo-labels can exhibit a bias towards the majority class, and models that employ these pseudo-labels can further amplify this bias. Here we investigate pseudo-labeling strategies for imbalanced SSL including pseudo-label refinement and threshold adjustment, through the lens of statistical analysis. We find that existing SSL algorithms which generate pseudo-labels using heuristic strategies or uncalibrated model confidence are unreliable when imbalanced class distributions bias pseudo-labels. To address this, we introduce SEmi-supervised learning with pseudo-label optimization based on VALidation data (SEVAL) to enhance the quality of pseudo-labelling for imbalanced SSL. We propose to learn refinement and thresholding parameters from a partition…
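To make the two ideas named in the abstract concrete, here is a minimal sketch of pseudo-label refinement followed by class-wise thresholding. It is an illustration under stated assumptions, not the authors' implementation: refinement is modeled as a per-class additive offset on the logits, and the function name `refine_and_select`, the `class_offsets` and `class_thresholds` arrays, and all numeric values are hypothetical. The abstract describes learning such parameters from data; here they are hard-coded placeholders.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def refine_and_select(logits, class_offsets, class_thresholds):
    """Hypothetical helper: refine pseudo-labels, then filter them per class.

    logits:           (n_samples, n_classes) model outputs on unlabeled data
    class_offsets:    (n_classes,) additive logit offsets (assumed form of refinement)
    class_thresholds: (n_classes,) confidence threshold for each predicted class
    """
    # Refinement: shift logits so majority classes are less favored.
    probs = softmax(logits + class_offsets)
    labels = probs.argmax(axis=1)
    confidence = probs.max(axis=1)
    # Threshold adjustment: each sample is compared against the
    # threshold of the class it was assigned to.
    mask = confidence >= class_thresholds[labels]
    return labels, mask

# Toy example: class 0 plays the majority class, class 2 the minority class.
logits = np.array([[2.2, 2.0, 0.0],
                   [3.0, 0.0, 0.0],
                   [0.5, 0.4, 0.6]])
offsets = np.array([-1.0, 0.0, 0.5])     # placeholder values, not learned
thresholds = np.array([0.9, 0.5, 0.4])   # placeholder values, not learned

labels, mask = refine_and_select(logits, offsets, thresholds)
print(labels.tolist())  # [1, 0, 2]
print(mask.tolist())    # [True, False, True]
```

In this toy run the first sample flips from class 0 to class 1 once the offset is applied, and the second sample is rejected because its confidence falls below the stricter threshold set for the majority class.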
Taxonomy
Topics: Imbalanced Data Classification Techniques
