Self-PU: Self Boosted and Calibrated Positive-Unlabeled Training
Xuxi Chen, Wuyang Chen, Tianlong Chen, Ye Yuan, Chen Gong, Kewei Chen, Zhangyang Wang

TL;DR
Self-PU is a self-training framework for positive-unlabeled learning that adaptively discovers confident examples, calibrates per-instance losses, and employs teacher-student distillation. It is reported to outperform existing methods on MNIST and CIFAR-10 and to improve Alzheimer's Disease brain image classification.
Contribution
It proposes a self-boosted, self-calibrated PU learning framework that integrates self-paced training, a self-calibrated instance-aware loss, and self-distillation.
Findings
Self-PU outperforms existing methods on MNIST and CIFAR-10 benchmarks.
Self-PU achieves significant improvements on Alzheimer's Disease brain image classification.
The framework demonstrates robustness and adaptability in real-world PU learning scenarios.
Abstract
Many real-world applications have to tackle the Positive-Unlabeled (PU) learning problem, i.e., learning binary classifiers from a large amount of unlabeled data and a few labeled positive examples. While current state-of-the-art methods employ importance reweighting to design various risk estimators, they ignore the learning capability of the model itself, which could provide reliable supervision. This motivates us to propose a novel Self-PU learning framework, which seamlessly integrates PU learning and self-training. Self-PU highlights three "self"-oriented building blocks: a self-paced training algorithm that adaptively discovers and augments confident positive/negative examples as the training proceeds; a self-calibrated instance-aware loss; and a self-distillation scheme that introduces teacher-student learning as an effective regularization for PU learning. We demonstrate…
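To make two ideas in the abstract concrete, here is a minimal sketch rather than the authors' implementation. It assumes a sigmoid surrogate loss and a known class prior, and uses the non-negative PU risk estimator from prior work as a stand-in for the "importance reweighting" risk estimators the abstract mentions. The selection rule is a simplified illustration of picking confident positive/negative examples from the unlabeled pool. All function names and parameters (`nn_pu_risk`, `select_confident`, `prior`, `fraction`) are hypothetical.

```python
import numpy as np

def sigmoid_loss(scores, target):
    # Sigmoid surrogate loss: small when sign(score) agrees with target (+1 or -1).
    return 1.0 / (1.0 + np.exp(target * scores))

def nn_pu_risk(scores_pos, scores_unl, prior):
    # Non-negative PU risk (an assumed stand-in, not Self-PU's own loss).
    # prior is the assumed fraction of positives in the unlabeled data.
    r_p_pos = sigmoid_loss(scores_pos, +1).mean()  # labeled positives treated as positive
    r_p_neg = sigmoid_loss(scores_pos, -1).mean()  # labeled positives treated as negative
    r_u_neg = sigmoid_loss(scores_unl, -1).mean()  # unlabeled treated as negative
    # Clamp the estimated negative-class risk at zero so it cannot go negative.
    return prior * r_p_pos + max(0.0, r_u_neg - prior * r_p_neg)

def select_confident(scores_unl, fraction):
    # Simplified self-paced selection: take the most confident unlabeled
    # examples, half from the lowest scores (pseudo-negative) and half
    # from the highest scores (pseudo-positive).
    half = int(len(scores_unl) * fraction) // 2
    order = np.argsort(scores_unl)
    neg_idx = order[:half]
    pos_idx = order[len(order) - half:]
    return pos_idx, neg_idx

# Toy classifier scores (hypothetical values).
scores_pos = np.array([2.0, 3.0])
scores_unl = np.array([-3.0, 2.0, 0.1, -0.2, 4.0, -5.0])
risk = nn_pu_risk(scores_pos, scores_unl, prior=0.4)
pos_idx, neg_idx = select_confident(scores_unl, fraction=0.5)
```

In a self-paced schedule, `fraction` would typically grow as training proceeds, so that more of the unlabeled pool receives pseudo-labels once the model's scores become more trustworthy. The exact schedule and thresholds used by Self-PU are not given in this summary.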
Taxonomy
Topics: Machine Learning and Data Classification · Digital Imaging for Blood Diseases · Imbalanced Data Classification Techniques
