A Variational Approach for Learning from Positive and Unlabeled Data

Hui Chen, Fangqing Liu, Yin Wang, Liyue Zhao, and Hao Wu

arXiv:1906.00642 · cs.LG · December 1, 2020 · 22 cites

TL;DR

This paper introduces a variational approach for learning binary classifiers from positive and unlabeled data, enabling direct error evaluation and efficient optimization without negative samples, applicable to real-world tasks like fraud detection.

Contribution

The paper proposes a novel variational principle for PU learning that simplifies risk estimation and improves stability, advancing beyond existing methods based on negative data distribution approximation.

Findings

01 Effective on benchmark datasets

02 Improved stability and performance with margin loss

03 No need for negative data or class prior estimation

Abstract

Learning binary classifiers from only positive and unlabeled (PU) data is an important and challenging task in many real-world applications, including web text classification, disease gene identification, and fraud detection, where negative samples are difficult to verify experimentally. Most recent PU learning methods are built on the conventional misclassification risk used in supervised learning, and they must solve an intractable risk estimation problem by approximating the negative data distribution or the class prior. In this paper, we introduce a variational principle for PU learning that allows us to quantitatively evaluate the modeling error of the Bayesian classifier directly from the given data. This leads to a loss function that can be calculated efficiently without any intermediate step or model, and a variational learning method can then be employed to…
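The abstract is cut off before the loss function is stated. As a rough illustration only, the sketch below assumes the loss has the form log(mean of Φ over unlabeled samples) minus mean of log Φ over positive samples, where Φ(x) is the classifier output in (0, 1). That form, the function name `variational_pu_loss`, and the `eps` clamp are assumptions not given on this page, and any regularization the authors use is omitted; the official repository listed under Code & Models is the reference for the actual method.

```python
import math


def variational_pu_loss(phi_positive, phi_unlabeled, eps=1e-12):
    """Assumed loss form: log(mean Phi on unlabeled) - mean(log Phi on positives).

    phi_positive:  classifier outputs in (0, 1) on labeled positive samples
    phi_unlabeled: classifier outputs in (0, 1) on unlabeled samples
    eps:           hypothetical clamp to avoid log(0); not from the source
    """
    if not phi_positive or not phi_unlabeled:
        raise ValueError("both batches must be non-empty")

    # Average output over the unlabeled batch (positives and negatives mixed).
    mean_unlabeled = sum(phi_unlabeled) / len(phi_unlabeled)

    # Average log-output over the labeled positive batch.
    mean_log_positive = sum(
        math.log(max(p, eps)) for p in phi_positive
    ) / len(phi_positive)

    # Only positive and unlabeled outputs are used: no negative samples
    # and no class prior appear anywhere in the computation.
    return math.log(max(mean_unlabeled, eps)) - mean_log_positive


# A classifier that scores positives high and separates the unlabeled batch
# should get a lower loss than one that outputs 0.5 everywhere.
loss_informative = variational_pu_loss([0.9, 0.8], [0.9, 0.1, 0.2, 0.8])
loss_constant = variational_pu_loss([0.5, 0.5], [0.5, 0.5, 0.5, 0.5])
```

Under this assumed form, a constant output gives a loss of zero, while a classifier that is confident on the positives and scores only part of the unlabeled batch highly drives the loss below zero.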

Code & Models

Repositories

HC-Feynman/vpu (PyTorch, official)

Videos

A Variational Approach for Learning from Positive and Unlabeled Data · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Imbalanced Data Classification Techniques · Machine Learning and Algorithms