Soft Label PU Learning

Puning Zhao; Jintao Deng; Xu Cheng

arXiv:2405.01990·cs.LG·May 6, 2024

Open Access

TL;DR

This paper introduces a novel soft label PU learning approach that assigns probabilistic labels to unlabeled data, along with new evaluation metrics, demonstrating improved performance on real-world datasets.

Contribution

It proposes a new soft label PU learning framework with PU-specific metrics and an optimization method, addressing the limitations of treating unlabeled data equally.

Findings

01 Effective in real datasets for anti-cheat applications

02 PU metrics correlate well with true performance measures

03 Outperforms existing PU learning methods

Abstract

PU learning refers to the classification problem in which only a subset of the positive samples is labeled. Existing PU learning methods treat all unlabeled samples equally. However, in many real tasks, common sense or domain knowledge indicates that some unlabeled samples are more likely to be positive than others. In this paper, we propose soft label PU learning, in which unlabeled data are assigned soft labels according to their probabilities of being positive. Because the ground-truth TPR, FPR, and AUC are unknown, we then design PU counterparts of these metrics to evaluate the performance of soft label PU learning methods on validation data. We show that these newly designed PU metrics are good substitutes for the real metrics. We then propose a method that optimizes these metrics. Experiments on public datasets and real datasets for anti-cheat services from Tencent games…
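To make the soft label idea concrete, here is a minimal Python sketch. It is not the paper's method: the abstract does not give the definitions of its PU metrics or its optimization procedure, so the two functions below (`soft_label_bce`, `soft_weighted_rates`) and the toy data are assumptions chosen for illustration. They show one plausible way to train against soft targets (cross-entropy with labels in [0, 1]) and to compute soft-label-weighted analogues of TPR and FPR.

```python
import numpy as np


def soft_label_bce(scores, soft_labels, eps=1e-12):
    """Mean binary cross-entropy against soft targets in [0, 1].

    Illustrative loss only; the paper's actual objective is not given
    in the abstract.
    """
    p = np.clip(np.asarray(scores, dtype=float), eps, 1.0 - eps)
    s = np.asarray(soft_labels, dtype=float)
    return float(np.mean(-(s * np.log(p) + (1.0 - s) * np.log(1.0 - p))))


def soft_weighted_rates(scores, soft_labels, threshold=0.5):
    """Soft-label-weighted analogues of TPR and FPR.

    Each sample counts as a positive with weight s and as a negative with
    weight 1 - s. This is an assumed stand-in, not the paper's PU metrics.
    """
    pred = np.asarray(scores, dtype=float) >= threshold
    s = np.asarray(soft_labels, dtype=float)
    tpr = float(np.sum(s * pred) / np.sum(s))
    fpr = float(np.sum((1.0 - s) * pred) / np.sum(1.0 - s))
    return tpr, fpr


# Hypothetical data: two labeled positives (soft label 1.0) followed by
# three unlabeled samples with soft labels from assumed domain knowledge.
soft_labels = np.array([1.0, 1.0, 0.8, 0.3, 0.0])
scores = np.array([0.9, 0.7, 0.6, 0.4, 0.1])  # classifier outputs in [0, 1]

loss = soft_label_bce(scores, soft_labels)
tpr, fpr = soft_weighted_rates(scores, soft_labels)
```

In this sketch a labeled positive is simply a sample with soft label 1.0, so labeled and unlabeled data share one code path. Both rates are undefined when all soft labels are 0 or all are 1, since the corresponding denominator is zero.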

Taxonomy

Topics: Educational Technology and Assessment · Pharmacy and Medical Practices · Text and Document Classification Technologies