Semi-Supervised Classification Based on Classification from Positive and Unlabeled Data

Tomoya Sakai; Marthinus Christoffel du Plessis; Gang Niu; Masashi Sugiyama

arXiv:1605.06955·cs.LG·June 19, 2017·40 cites

Open Access

TL;DR

This paper extends classification from positive and unlabeled data (PU classification) to also use negative data, yielding a semi-supervised classification approach whose generalization error bounds do not rely on distributional assumptions such as the cluster assumption.

Contribution

It extends positive-unlabeled classification to include negative data and establishes generalization bounds that improve with more unlabeled data, without requiring distributional assumptions.

Findings

1. Generalization bounds decrease with more unlabeled data
2. Proposed methods outperform existing semi-supervised classifiers
3. The approach is effective in practical experiments

Abstract

Most of the semi-supervised classification methods developed so far use unlabeled data for regularization purposes under particular distributional assumptions such as the cluster assumption. In contrast, recently developed methods of classification from positive and unlabeled data (PU classification) use unlabeled data for risk evaluation, i.e., label information is directly extracted from unlabeled data. In this paper, we extend PU classification to also incorporate negative data and propose a novel semi-supervised classification approach. We establish generalization error bounds for our novel methods and show that the bounds decrease with respect to the number of unlabeled data without the distributional assumptions that are required in existing semi-supervised classification methods. Through experiments, we demonstrate the usefulness of the proposed methods.
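
To make "unlabeled data for risk evaluation" concrete, the following is a minimal sketch of one standard risk rewriting from the PU literature, not necessarily the exact estimators of this paper. It assumes the class prior is known and that unlabeled data are drawn from the prior-weighted mixture of the positive and negative distributions, so the negative-class term of the risk can be rewritten using only positive and unlabeled samples. The function names, the sigmoid loss, the toy scorer g(x) = x, and the mixing weight `gamma` are all illustrative assumptions.

```python
import numpy as np


def sigmoid_loss(margin):
    """Sigmoid loss l(m) = 1 / (1 + exp(m)): small when the margin m is large."""
    return 1.0 / (1.0 + np.exp(margin))


def pn_risk(g_pos, g_neg, prior, loss=sigmoid_loss):
    """Ordinary supervised risk from positive and negative scores."""
    return prior * loss(g_pos).mean() + (1.0 - prior) * loss(-g_neg).mean()


def pu_risk(g_pos, g_unl, prior, loss=sigmoid_loss):
    """Risk estimated from positive and unlabeled scores only.

    Uses (1 - prior) * E_n[l(-g)] = E_u[l(-g)] - prior * E_p[l(-g)],
    which holds because the unlabeled density is a prior-weighted mixture.
    """
    return prior * (loss(g_pos).mean() - loss(-g_pos).mean()) + loss(-g_unl).mean()


def combined_risk(g_pos, g_neg, g_unl, prior, gamma, loss=sigmoid_loss):
    """Convex combination of the PN and PU risks; gamma in [0, 1] is hypothetical."""
    return ((1.0 - gamma) * pn_risk(g_pos, g_neg, prior, loss)
            + gamma * pu_risk(g_pos, g_unl, prior, loss))


# Toy check on synthetic 1-D data with the scorer g(x) = x (an assumption).
rng = np.random.default_rng(0)
prior, n = 0.4, 100_000
x_pos = rng.normal(+1.0, 1.0, n)
x_neg = rng.normal(-1.0, 1.0, n)
is_pos = rng.random(n) < prior
x_unl = np.where(is_pos, rng.normal(+1.0, 1.0, n), rng.normal(-1.0, 1.0, n))

print("PN risk:", pn_risk(x_pos, x_neg, prior))
print("PU risk:", pu_risk(x_pos, x_unl, prior))
print("combined (gamma=0.5):", combined_risk(x_pos, x_neg, x_unl, prior, 0.5))
```

On this toy data the PN and PU estimates should agree up to sampling noise, which illustrates how unlabeled samples can enter the risk itself rather than a regularizer. Mixing the two estimates is one simple way to use negative data alongside unlabeled data; how the weight should be chosen is left open here.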

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning