Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification
Nan Lu, Shida Lei, Gang Niu, Issei Sato, Masashi Sugiyama

TL;DR
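This paper proposes a method for binary classification from multiple unlabeled datasets, with class priors as the only supervision. It trains a surrogate set classifier that predicts which dataset each point came from, so it is not limited to two datasets as existing risk-consistent methods are, and it reports improvements over those methods.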
Contribution
It extends binary classification from unlabeled datasets beyond the two-set limit of existing risk-consistent methods, using surrogate set classification within a deep learning framework.
Findings
Outperforms state-of-the-art methods in experiments.
Proves classifier consistency theoretically.
Efficient end-to-end deep learning implementation.
Abstract
To cope with high annotation costs, training a classifier only from weakly supervised data has attracted a great deal of attention these days. Among various approaches, strengthening supervision from completely unsupervised classification is a promising direction, which typically employs class priors as the only supervision and trains a binary classifier from unlabeled (U) datasets. While existing risk-consistent methods are theoretically grounded with high flexibility, they can learn only from two U sets. In this paper, we propose a new approach for binary classification from m U sets for m ≥ 2. Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC), which is aimed at predicting from which U set each observed data point is drawn. SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the…
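To make the link between the surrogate task and the binary task concrete, here is a minimal sketch, not taken from the paper's code. It assumes each U set j is a mixture of the positive and negative class-conditional densities with a known class prior `thetas[j]`, that set j contributes a fraction `rhos[j]` of the training points, and that the test class prior is `pi`. Under those assumptions, Bayes' rule gives the set posterior P(set = j | x) as a ratio of two linear functions of the class posterior P(y = +1 | x). The function name and all parameter names are hypothetical.

```python
import numpy as np

def set_posteriors(eta, thetas, rhos, pi):
    """Map a class posterior eta = P(y=+1 | x) to set posteriors P(set=j | x).

    Assumed inputs (illustrative names, not from the paper):
      thetas[j]: class prior P(y=+1) inside U set j
      rhos[j]:   fraction of all training points that belong to U set j
      pi:        class prior P(y=+1) of the test distribution
    """
    eta = np.asarray(eta, dtype=float)[..., None]   # add a trailing axis for sets
    thetas = np.asarray(thetas, dtype=float)
    rhos = np.asarray(rhos, dtype=float)
    # Density of set j relative to the marginal is a_j * eta + b_j,
    # which equals thetas[j]*eta/pi + (1-thetas[j])*(1-eta)/(1-pi).
    a = thetas / pi - (1.0 - thetas) / (1.0 - pi)
    b = (1.0 - thetas) / (1.0 - pi)
    unnorm = rhos * (a * eta + b)
    # Normalising over sets gives a linear-fractional function of eta.
    return unnorm / unnorm.sum(axis=-1, keepdims=True)

thetas = [0.9, 0.5, 0.1]        # three U sets with different class priors
rhos = [1 / 3, 1 / 3, 1 / 3]    # equally sized sets
probs = set_posteriors([0.0, 0.5, 1.0], thetas, rhos, pi=0.5)
```

In a training loop, one could feed a binary model's sigmoid output through this map and fit the result to the observed set indices with an ordinary multi-class cross-entropy loss. The binary model is then recovered without ever seeing a class label. Whether this matches the paper's exact construction cannot be confirmed from the truncated abstract above.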
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Domain Adaptation and Few-Shot Learning
