Learning from positive and unlabeled examples - Finite size sample bounds
Farnam Mansouri, Shai Ben-David

TL;DR
This paper offers a theoretical analysis of positive unlabeled (PU) learning, providing finite sample bounds without assuming known class priors, thus broadening understanding of PU learning's statistical complexity.
Contribution
It introduces finite sample bounds for PU learning that do not require prior knowledge of class proportions, extending previous theoretical work.
Findings
Derived upper bounds on sample sizes for PU learning
Established lower bounds demonstrating sample size limitations
Analyzed the impact of unknown class priors on learning complexity
Abstract
PU (Positive Unlabeled) learning is a variant of supervised classification learning in which the only labels revealed to the learner are of positively labeled instances. PU learning arises in many real-world applications. Most existing work relies on the simplifying assumptions that the positively labeled training data is drawn from the restriction of the data generating distribution to positively labeled instances and/or that the proportion of positively labeled points (a.k.a. the class prior) is known a priori to the learner. This paper provides a theoretical analysis of the statistical complexity of PU learning under a wider range of setups. Unlike most prior work, our study does not assume that the class prior is known to the learner. We prove upper and lower bounds on the required sample sizes (of both the positively labeled and the unlabeled samples).
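To make the setup in the abstract concrete, here is a minimal sketch of how a PU training set could be simulated under the common simplifying assumption the abstract mentions, where the positive sample is drawn from the data distribution restricted to positive instances. The one-dimensional distribution, the labeling rule (x >= 0 is positive), and the names `sample_pu`, `n_pos`, `n_unl`, and `class_prior` are illustrative assumptions, not taken from the paper.

```python
import random


def sample_pu(n_pos, n_unl, class_prior, seed=0):
    """Simulate a PU training set on a toy 1-D domain (hypothetical setup).

    Assumed true labeling rule: x is positive iff x >= 0.
    Positives are uniform on [0, 1), negatives are uniform on [-1, 0).
    """
    rng = random.Random(seed)

    def draw_positive():
        return rng.uniform(0.0, 1.0)

    def draw_negative():
        return rng.uniform(-1.0, 0.0)

    # Positive sample: drawn from the distribution restricted to positives.
    positives = [draw_positive() for _ in range(n_pos)]

    # Unlabeled sample: drawn from the full marginal, a mixture in which
    # each point is positive with probability class_prior. Labels are dropped.
    unlabeled = [
        draw_positive() if rng.random() < class_prior else draw_negative()
        for _ in range(n_unl)
    ]
    return positives, unlabeled


if __name__ == "__main__":
    pos, unl = sample_pu(n_pos=50, n_unl=1000, class_prior=0.3)
    print(len(pos), len(unl))
```

In this sketch `class_prior` is used only to generate the data. A learner in the setting the paper studies would receive the two lists alone, without that value.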
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
