Towards Semi-supervised Learning with Non-random Missing Labels
Yue Duan, Zhen Zhao, Lei Qi, Luping Zhou, Lei Wang, Yinghuan Shi

TL;DR
This paper introduces Pseudo-Rectifying Guidance (PRG), a method for semi-supervised learning when labels are Missing Not At Random (MNAR). It reduces pseudo-label bias and improves pseudo-label quality across class distributions.
Contribution
It proposes class transition tracking with a Markov random walk to maintain unbiased pseudo-labeling in MNAR semi-supervised learning, outperforming existing bias removal methods.
Findings
PRG achieves superior performance across various MNAR scenarios.
The method effectively reduces bias in pseudo-labels for both common and rare classes.
PRG outperforms recent SSL approaches with bias correction by a large margin.
Abstract
Semi-supervised learning (SSL) tackles the label missing problem by enabling the effective usage of unlabeled data. While existing SSL methods focus on the traditional setting, a practical and challenging scenario called label Missing Not At Random (MNAR) is usually ignored. In MNAR, the labeled and unlabeled data fall into different class distributions, resulting in biased label imputation, which deteriorates the performance of SSL models. In this work, class-transition-tracking-based Pseudo-Rectifying Guidance (PRG) is devised for MNAR. We explore the class-level guidance information obtained by the Markov random walk, which is modeled on a dynamically created graph built over the class tracking matrix. PRG unifies the historical information of class distribution and class transitions caused by the pseudo-rectifying procedure to maintain the model's unbiased enthusiasm towards…
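The abstract mentions a class tracking matrix and a Markov random walk over a graph built from it, without giving formulas. The sketch below is one plausible reading, not the authors' formulation: it counts how often a sample's pseudo-label moves from class i to class j between two passes, row-normalizes those counts into transition probabilities, and applies one random-walk step to a class probability vector. The function names, the toy labels, and the uniform fallback for classes with no observed transitions are all assumptions made for illustration.

```python
def update_tracking(counts, prev_labels, new_labels):
    """Count class transitions i -> j for samples whose pseudo-label changed."""
    for i, j in zip(prev_labels, new_labels):
        if i != j:
            counts[i][j] += 1
    return counts


def transition_matrix(counts):
    """Row-normalize transition counts into random-walk probabilities."""
    k = len(counts)
    rows = []
    for i, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: with no transitions observed out of class i,
            # fall back to a uniform distribution over the other classes.
            rows.append([0.0 if j == i else 1.0 / (k - 1) for j in range(k)])
        else:
            rows.append([c / total for c in row])
    return rows


def walk_step(probs, trans):
    """One random-walk step: move probability mass along class transitions."""
    k = len(probs)
    return [sum(probs[i] * trans[i][j] for i in range(k)) for j in range(k)]


# Toy example with 3 classes and 5 unlabeled samples (hypothetical data).
num_classes = 3
counts = [[0] * num_classes for _ in range(num_classes)]
prev = [0, 0, 1, 2, 0]   # pseudo-labels from the previous pass
new = [1, 0, 1, 0, 2]    # pseudo-labels from the current pass
counts = update_tracking(counts, prev, new)
trans = transition_matrix(counts)
stepped = walk_step([1.0, 0.0, 0.0], trans)
```

In this toy run, all mass starting on class 0 is split evenly between classes 1 and 2, because those are the only transitions observed out of class 0. How PRG actually combines such a walk with the historical class distribution is not recoverable from the truncated abstract.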
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Text and Document Classification Technologies · Machine Learning and Data Classification
