DiCaP: Distribution-Calibrated Pseudo-labeling for Semi-Supervised Multi-Label Learning

Bo Han; Zhuoming Li; Xiaoyu Wang; Yaxin Hou; Hui Liu; Junhui Hou; Yuheng Jia

arXiv:2511.20225 · cs.LG · December 3, 2025

TL;DR

DiCaP introduces a correctness-aware pseudo-labeling framework for semi-supervised multi-label learning, calibrating pseudo-label weights based on estimated correctness likelihood and effectively handling uncertain samples to improve performance.

Contribution

The paper proposes Distribution-Calibrated Pseudo-labeling (DiCaP), a novel method that calibrates pseudo-label weights using posterior precision and employs dual-thresholding for better semi-supervised multi-label learning.
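As a rough illustration of the two ideas named above, the sketch below splits one sample's per-label predictions into confident and ambiguous sets using two thresholds, then computes a binary cross-entropy loss in which each confident pseudo-label is scaled by a correctness weight. The function names, the threshold values `tau_low` and `tau_high`, and the decision to simply skip ambiguous labels are assumptions made for this example; the paper's actual thresholds, weight estimator, and treatment of ambiguous samples are not given in this summary.

```python
import math

def split_by_dual_threshold(probs, weights, tau_low=0.3, tau_high=0.7):
    """Partition one sample's label predictions (hypothetical helper).

    probs   : predicted probability per label
    weights : assumed per-label correctness likelihood in [0, 1]
    Returns (confident, ambiguous), where confident holds
    (label_index, pseudo_label, weight) and ambiguous holds label indices.
    """
    confident, ambiguous = [], []
    for k, p in enumerate(probs):
        if p >= tau_high:
            confident.append((k, 1, weights[k]))   # confident positive
        elif p <= tau_low:
            confident.append((k, 0, weights[k]))   # confident negative
        else:
            ambiguous.append(k)                    # between the thresholds
    return confident, ambiguous

def weighted_bce(probs, confident, eps=1e-12):
    """Weighted binary cross-entropy over the confident pseudo-labels only."""
    total = 0.0
    for k, y, w in confident:
        p = min(max(probs[k], eps), 1.0 - eps)     # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / max(len(confident), 1)

probs = [0.9, 0.5, 0.1]
weights = [0.8, 0.5, 0.6]
confident, ambiguous = split_by_dual_threshold(probs, weights)
loss = weighted_bce(probs, confident)
```

Here the middle label (probability 0.5) falls between the thresholds and is excluded from the loss, while the other two contribute in proportion to their weights rather than equally.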

Findings

01. Achieves up to 4.27% improvement over state-of-the-art methods.

02. Effectively separates confident and ambiguous samples for tailored learning.

03. Maintains a stable correctness likelihood distribution across different labeled data sizes.

Abstract

Semi-supervised multi-label learning (SSMLL) aims to address the challenge of limited labeled data in multi-label learning (MLL) by leveraging unlabeled data to improve the model's performance. While pseudo-labeling has become a dominant strategy in SSMLL, most existing methods assign equal weights to all pseudo-labels regardless of their quality, which can amplify the impact of noisy or uncertain predictions and degrade the overall performance. In this paper, we theoretically verify that the optimal weight for a pseudo-label should reflect its correctness likelihood. Empirically, we observe that on the same dataset, the correctness likelihood distribution of unlabeled data remains stable, even as the number of labeled training samples varies. Building on this insight, we propose Distribution-Calibrated Pseudo-labeling (DiCaP), a correctness-aware framework that estimates posterior…
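One simple way to picture a correctness likelihood is as the empirical precision of predictions at a given confidence level, measured on data whose labels are known. The sketch below uses a plain histogram estimate for this. It is a stand-in chosen for illustration, not the estimator used in the paper; `binned_precision`, `lookup_weight`, the bin count, and the toy inputs are all hypothetical.

```python
def binned_precision(confidences, correct, n_bins=5):
    """Estimate precision per confidence bin from labeled predictions.

    confidences : prediction confidences in [0, 1]
    correct     : booleans, True where the prediction matched the label
    Returns a list of length n_bins; empty bins are None.
    """
    hits = [0] * n_bins
    counts = [0] * n_bins
    for c, ok in zip(confidences, correct):
        b = min(int(c * n_bins), n_bins - 1)   # confidence 1.0 goes in the last bin
        counts[b] += 1
        hits[b] += int(ok)
    return [h / n if n else None for h, n in zip(hits, counts)]

def lookup_weight(conf, precisions, default=0.0):
    """Use the bin precision as the weight for a pseudo-label of confidence conf."""
    n_bins = len(precisions)
    b = min(int(conf * n_bins), n_bins - 1)
    p = precisions[b]
    return default if p is None else p

# Toy labeled predictions (made up for the example).
confidences = [0.95, 0.92, 0.55, 0.58, 0.10]
correct = [True, True, True, False, False]
precisions = binned_precision(confidences, correct)
```

A pseudo-label with confidence 0.56 would then receive the precision of its bin as its weight, so predictions from historically unreliable confidence ranges count for less in training.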

Taxonomy

Topics: Text and Document Classification Technologies · Machine Learning and Data Classification · Domain Adaptation and Few-Shot Learning