Learning from M-Tuple Dominant Positive and Unlabeled Data

Jiahe Qin; Junpeng Li; Changchun Hua; Yana Yang

arXiv:2506.15686 · cs.LG · July 15, 2025

TL;DR

This paper introduces MDPU, a learning framework for classification from unlabeled data and tuples of arbitrary size in which positive instances are no fewer than negative ones. It models the distribution of instances under this constraint and derives an unbiased, risk-consistent estimator with theoretical guarantees.

Contribution

The paper proposes a generalized framework for learning from tuple-based positive and unlabeled data, including an unbiased risk estimator and a risk correction method, with theoretical and empirical validation.

Findings

1. The proposed MDPU framework effectively models tuple distributions under class constraints.

2. The unbiased risk estimator is proven to be risk consistent with theoretical bounds.

3. Experimental results demonstrate the superiority of MDPU over baseline methods.

Abstract

Label Proportion Learning (LLP) addresses the classification problem where multiple instances are grouped into bags and each bag contains information about the proportion of each class. However, in practical applications, obtaining precise supervisory information regarding the proportion of instances in a specific class is challenging. To better align with real-world application scenarios and effectively leverage the proportional constraints of instances within tuples, this paper proposes a generalized learning framework MDPU. Specifically, we first mathematically model the distribution of instances within tuples of arbitrary size, under the constraint that the number of positive instances is no less than that of negative instances. Then we derive an unbiased risk estimator that satisfies risk consistency based on the empirical risk minimization (ERM) method. To mitigate the…
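
The tuple constraint described in the abstract (positives no fewer than negatives within a tuple of size m) can be illustrated with a small sketch. It assumes, purely for illustration, that labels inside a tuple are drawn i.i.d. with a class prior `prior`; that assumption and the function names `constraint_probability` and `sample_dominant_positive_labels` are hypothetical and not taken from the paper. It is not the paper's estimator, only a way to see which label configurations the constraint admits.

```python
import math
import random


def constraint_probability(m: int, prior: float) -> float:
    """P(#positives >= #negatives) for m i.i.d. labels with P(y = +1) = prior.

    Assumption (not from the paper): labels in a tuple are independent.
    """
    k_min = math.ceil(m / 2)  # smallest positive count k with k >= m - k
    return sum(
        math.comb(m, k) * prior**k * (1 - prior) ** (m - k)
        for k in range(k_min, m + 1)
    )


def sample_dominant_positive_labels(
    m: int, prior: float, rng: random.Random
) -> list[int]:
    """Rejection-sample one label tuple in {+1, -1}^m with positives >= negatives."""
    if not 0.0 < prior <= 1.0:
        raise ValueError("prior must be in (0, 1] so that sampling terminates")
    while True:
        labels = [1 if rng.random() < prior else -1 for _ in range(m)]
        # The label sum is >= 0 exactly when positives are no fewer than negatives.
        if sum(labels) >= 0:
            return labels


# Hypothetical usage: 1000 label tuples of size 3 with a balanced class prior.
rng = random.Random(0)
label_tuples = [sample_dominant_positive_labels(3, 0.5, rng) for _ in range(1000)]
```

Under these assumptions, `constraint_probability(m, prior)` gives the fraction of unconstrained tuples that would satisfy the dominant-positive condition, which is also the acceptance rate of the rejection sampler.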
