Learning from Similarity-Confidence Data

Yuzhou Cao; Lei Feng; Yitian Xu; Bo An; Gang Niu; Masashi Sugiyama

arXiv:2102.06879 · stat.ML · February 16, 2021 · 1 citation

TL;DR

This paper introduces a weakly supervised learning method that trains a binary classifier from only unlabeled data pairs annotated with similarity confidence, avoiding the cost of labeling individual examples. The proposed risk estimator is unbiased, and its estimation error bound achieves the optimal convergence rate.

Contribution

It proposes an unbiased estimator of the classification risk that can be computed from similarity-confidence (Sconf) data alone, together with a risk correction scheme that alleviates overfitting when flexible models are used.

Findings

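01. The estimation error bound of the proposed estimator achieves the optimal convergence rate.

02. Risk correction alleviates potential overfitting when flexible models are used.

03. Experimental results confirm the effectiveness of the approach.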

Abstract

Weakly supervised learning has drawn considerable attention recently to reduce the expensive time and labor consumption of labeling massive data. In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where we aim to learn an effective binary classifier from only unlabeled data pairs equipped with confidence that illustrates their degree of similarity (two examples are similar if they belong to the same class). To solve this problem, we propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate. To alleviate potential overfitting when flexible models are used, we further employ a risk correction scheme on the proposed risk estimator. Experimental results demonstrate the effectiveness of the…
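To make the idea of a risk estimator computed from Sconf data alone more concrete, here is a minimal sketch in Python. It is not taken from the paper; it follows from the setup in the abstract under assumptions that should be read as such: the two examples in a pair are drawn independently, the class prior p(y = +1) is known and differs from 0.5, and s is the confidence p(y = y' | x, x'). Under these assumptions, averaging s over the partner x' gives π₋ + p(y = +1 | x)(π₊ − π₋), so each example can be weighted by (s − π₋)/(π₊ − π₋) for the positive loss and (π₊ − s)/(π₊ − π₋) for the negative loss. The function names, the logistic loss, and the "relu" and "abs" correction options are illustrative choices; the exact estimator and correction scheme in the paper may differ.

```python
import numpy as np


def logistic_loss(z, y):
    """Logistic loss log(1 + exp(-y * z)), computed stably."""
    return np.logaddexp(0.0, -y * z)


def sconf_risk(g_x, g_xp, s, prior_pos, correction=None):
    """Empirical classification risk from similarity-confidence pairs.

    g_x, g_xp  : classifier outputs g(x_i) and g(x'_i) for each pair
    s          : confidence s_i = p(y_i = y'_i | x_i, x'_i)
    prior_pos  : class prior p(y = +1), assumed known and != 0.5
    correction : None, "relu", or "abs" (assumed correction functions)
    """
    g_x, g_xp, s = (np.asarray(a, dtype=float) for a in (g_x, g_xp, s))
    prior_neg = 1.0 - prior_pos
    gap = prior_pos - prior_neg
    if abs(gap) < 1e-12:
        raise ValueError("prior_pos must differ from 0.5")

    # Per-pair weights that stand in for p(y=+1|x) and p(y=-1|x) in expectation.
    # They can be negative, which is what makes the raw estimate overfit-prone.
    w_pos = (s - prior_neg) / gap
    w_neg = (prior_pos - s) / gap

    # Class-wise partial risks, averaged over both members of each pair.
    pos_part = 0.5 * np.mean(
        w_pos * (logistic_loss(g_x, 1.0) + logistic_loss(g_xp, 1.0)))
    neg_part = 0.5 * np.mean(
        w_neg * (logistic_loss(g_x, -1.0) + logistic_loss(g_xp, -1.0)))

    # Assumed risk correction: keep each partial risk non-negative.
    if correction == "relu":
        pos_part, neg_part = max(pos_part, 0.0), max(neg_part, 0.0)
    elif correction == "abs":
        pos_part, neg_part = abs(pos_part), abs(neg_part)
    elif correction is not None:
        raise ValueError("correction must be None, 'relu', or 'abs'")

    return float(pos_part + neg_part)
```

One quick sanity check on the sketch: the two weights always sum to one, so for a constant classifier g ≡ 0 the uncorrected estimate equals log 2 regardless of the confidences. The corrected variants trade unbiasedness for a non-negative estimate, which is the usual motivation for this kind of correction when training flexible models.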

Videos

Learning from Similarity-Confidence Data · SlidesLive

Taxonomy

Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Anomaly Detection Techniques and Applications