DiCaP: Distribution-Calibrated Pseudo-labeling for Semi-Supervised Multi-Label Learning
Bo Han, Zhuoming Li, Xiaoyu Wang, Yaxin Hou, Hui Liu, Junhui Hou, Yuheng Jia

TL;DR
DiCaP introduces a correctness-aware pseudo-labeling framework for semi-supervised multi-label learning, calibrating pseudo-label weights based on estimated correctness likelihood and effectively handling uncertain samples to improve performance.
Contribution
The paper proposes Distribution-Calibrated Pseudo-labeling (DiCaP), a novel method that calibrates pseudo-label weights using posterior precision and employs dual-thresholding for better semi-supervised multi-label learning.
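As a rough illustration of the idea described above, the following Python sketch weights each pseudo-label by an estimated precision and uses two thresholds to split predictions into confident and ambiguous ones. It is not the authors' implementation. The function names, the threshold values `tau_low` and `tau_high`, and the use of a labeled held-out set to estimate precision are all assumptions made for illustration. Ambiguous entries are simply given zero weight here, which is a simplification; the summary does not describe how DiCaP actually treats them.

```python
import numpy as np

def estimate_precision(val_probs, val_labels, tau_low, tau_high):
    """Hypothetical helper: estimate how often confident predictions are
    correct, using labeled held-out data (an assumption of this sketch)."""
    val_probs = np.asarray(val_probs, dtype=float)
    val_labels = np.asarray(val_labels)
    pos = val_probs >= tau_high   # predicted confidently positive
    neg = val_probs <= tau_low    # predicted confidently negative
    prec_pos = val_labels[pos].mean() if pos.any() else 0.0
    prec_neg = (1 - val_labels[neg]).mean() if neg.any() else 0.0
    return float(prec_pos), float(prec_neg)

def dual_threshold_weights(probs, prec_pos, prec_neg, tau_low=0.3, tau_high=0.7):
    """Split per-label predictions with two thresholds and weight each
    confident pseudo-label by the estimated precision of its group."""
    probs = np.asarray(probs, dtype=float)
    confident_pos = probs >= tau_high
    confident_neg = probs <= tau_low
    ambiguous = ~(confident_pos | confident_neg)
    pseudo = confident_pos.astype(float)   # 1 for confident positives, else 0
    weights = np.zeros_like(probs)         # ambiguous entries keep weight 0
    weights[confident_pos] = prec_pos
    weights[confident_neg] = prec_neg
    return pseudo, weights, ambiguous

def weighted_bce(probs, pseudo, weights, eps=1e-7):
    """Binary cross-entropy against pseudo-labels, averaged by weight."""
    p = np.clip(np.asarray(probs, dtype=float), eps, 1 - eps)
    loss = -(pseudo * np.log(p) + (1 - pseudo) * np.log(1 - p))
    total = weights.sum()
    return float((weights * loss).sum() / total) if total > 0 else 0.0

# Toy usage with made-up numbers.
val_probs = [[0.9, 0.1], [0.8, 0.2], [0.95, 0.5]]
val_labels = [[1, 0], [0, 0], [1, 1]]
prec_pos, prec_neg = estimate_precision(val_probs, val_labels, 0.3, 0.7)

unlabeled_probs = [[0.9, 0.5, 0.1]]
pseudo, weights, ambiguous = dual_threshold_weights(unlabeled_probs, prec_pos, prec_neg)
loss = weighted_bce(unlabeled_probs, pseudo, weights)
```

In this toy setup, a pseudo-label from a group that is often wrong on held-out data contributes less to the loss than one from a reliable group, which is the intuition behind correctness-aware weighting.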
Findings
Achieves up to 4.27% improvement over state-of-the-art methods.
Effectively separates confident and ambiguous samples for tailored learning.
Maintains stable correctness likelihood distribution across different labeled data sizes.
Abstract
Semi-supervised multi-label learning (SSMLL) aims to address the challenge of limited labeled data in multi-label learning (MLL) by leveraging unlabeled data to improve the model's performance. While pseudo-labeling has become a dominant strategy in SSMLL, most existing methods assign equal weights to all pseudo-labels regardless of their quality, which can amplify the impact of noisy or uncertain predictions and degrade the overall performance. In this paper, we theoretically verify that the optimal weight for a pseudo-label should reflect its correctness likelihood. Empirically, we observe that on the same dataset, the correctness likelihood distribution of unlabeled data remains stable, even as the number of labeled training samples varies. Building on this insight, we propose Distribution-Calibrated Pseudo-labeling (DiCaP), a correctness-aware framework that estimates posterior…
Taxonomy
Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning
