Beyond the Selected Completely At Random Assumption for Learning from Positive and Unlabeled Data

Jessa Bekker, Pieter Robberechts, and Jesse Davis

arXiv:1809.03207 · cs.LG · July 1, 2019

TL;DR

This paper addresses learning from positive and unlabeled data with selection biases, proposing methods that incorporate the labeling mechanism and handle unknown biases, improving classifier performance.

Contribution

It introduces a theoretically grounded empirical risk method for biased PU learning and explores learning under unknown labeling mechanisms with practical solutions.

Findings

1. Incorporating the labeling mechanism improves classifier accuracy.

2. The proposed methods are effective even when the bias is unknown.

3. Theoretical analysis confirms the validity of the approach.

Abstract

Most positive and unlabeled data is subject to selection biases. The labeled examples can, for example, be selected from the positive set because they are easier to obtain or more obviously positive. This paper investigates how learning can be enabled in this setting. We propose and theoretically analyze an empirical-risk-based method for incorporating the labeling mechanism. Additionally, we investigate under which assumptions learning is possible when the labeling mechanism is not fully understood and propose a practical method to enable this. Our empirical analysis supports the theoretical results and shows that taking into account the possibility of a selection bias, even when the labeling mechanism is unknown, improves the trained classifiers.
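
To make the idea of an empirical-risk-based method that incorporates the labeling mechanism more concrete, here is a minimal sketch of an inverse-propensity-weighted risk for positive and unlabeled data. It is an illustration under stated assumptions, not the authors' implementation: the function names `log_loss` and `propensity_weighted_risk` are hypothetical, the loss is assumed to be log loss, and the propensity score e(x) = P(labeled | positive, x) is assumed to be known for every labeled example. Each labeled example counts as 1/e positives and (1 - 1/e) negatives; each unlabeled example counts as a negative.

```python
import math

def log_loss(p, label):
    """Log loss of predicted probability p for a fixed label (0 or 1)."""
    p = min(max(p, 1e-12), 1.0 - 1e-12)  # clip to avoid log(0)
    return -math.log(p) if label == 1 else -math.log(1.0 - p)

def propensity_weighted_risk(preds, labeled, propensities):
    """Propensity-weighted empirical risk (illustrative sketch).

    preds        -- predicted P(y=1 | x) for each example
    labeled      -- 1 if the example is labeled (hence positive), else 0
    propensities -- assumed-known P(labeled | y=1, x); only read for labeled examples
    """
    total = 0.0
    for p, s, e in zip(preds, labeled, propensities):
        loss_neg = log_loss(p, 0)
        if s == 1:
            w = 1.0 / e  # one labeled example stands in for 1/e positives
            total += w * log_loss(p, 1) + (1.0 - w) * loss_neg
        else:
            total += loss_neg  # unlabeled examples are treated as negatives
    return total / len(preds)
```

With this weighting, a positive example that is labeled with probability e contributes, in expectation over the labeling, exactly its fully supervised positive loss: e · (L₁/e + (1 - 1/e) · L₀) + (1 - e) · L₀ = L₁. When every propensity equals 1, the estimator reduces to the ordinary supervised risk with the label indicator used as the class.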

Code & Models

Repositories

ML-KULeuven/SAR-PU

Models

philipphager/baidu-ultr_uva-bert_ips-pointwise
