Learning from Similarity-Confidence Data
Yuzhou Cao, Lei Feng, Yitian Xu, Bo An, Gang Niu, Masashi Sugiyama

TL;DR
This paper introduces a weakly supervised learning approach that trains a binary classifier from only unlabeled data pairs annotated with similarity confidence, which reduces labeling costs. The estimation error bound of its risk estimator achieves the optimal convergence rate.
Contribution
It proposes an unbiased estimator of the classification risk that can be computed from similarity-confidence (Sconf) data alone, together with a risk correction scheme that alleviates overfitting when flexible models are used.
Findings
The estimation error bound of the proposed estimator achieves the optimal convergence rate.
Risk correction alleviates potential overfitting when flexible models are used.
Experimental results confirm effectiveness.
Abstract
Weakly supervised learning has drawn considerable attention recently to reduce the expensive time and labor consumption of labeling massive data. In this paper, we investigate a novel weakly supervised learning problem of learning from similarity-confidence (Sconf) data, where we aim to learn an effective binary classifier from only unlabeled data pairs equipped with confidence that illustrates their degree of similarity (two examples are similar if they belong to the same class). To solve this problem, we propose an unbiased estimator of the classification risk that can be calculated from only Sconf data and show that the estimation error bound achieves the optimal convergence rate. To alleviate potential overfitting when flexible models are used, we further employ a risk correction scheme on the proposed risk estimator. Experimental results demonstrate the effectiveness of the…
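The abstract describes an unbiased risk estimator computed from Sconf data alone. As a rough illustration of how such an estimator can work, the sketch below derives one directly from the stated definition of similarity confidence, s(x, x') = p(y = y' | x, x'). It is not taken from the paper, whose exact estimator may differ in form. It assumes that the two examples in a pair are drawn independently, that the positive class prior `pi_plus` is known and not equal to 0.5, and that the logistic loss is used. The names `logistic_loss` and `sconf_risk` are hypothetical.

```python
import math


def logistic_loss(margin):
    """Logistic loss log(1 + exp(-margin)), computed in a numerically stable way."""
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


def sconf_risk(pairs, pi_plus, loss=logistic_loss):
    """Empirical classification risk from similarity-confidence pairs (a sketch).

    pairs   : iterable of (g_x, g_xp, s), where g_x and g_xp are the real-valued
              classifier outputs on the two examples and s is their similarity
              confidence in [0, 1].
    pi_plus : assumed-known positive class prior, must not be 0.5.

    Under independent pairs, averaging s over x' gives
    pi_minus + (pi_plus - pi_minus) * p(+1 | x), so the weights below
    recover p(+1 | x) and p(-1 | x) in expectation.
    """
    pi_minus = 1.0 - pi_plus
    denom = pi_plus - pi_minus
    if abs(denom) < 1e-12:
        raise ValueError("pi_plus must differ from 0.5")
    total, n = 0.0, 0
    for g_x, g_xp, s in pairs:
        w_pos = (s - pi_minus) / denom  # weight on the loss for label +1
        w_neg = (pi_plus - s) / denom   # weight on the loss for label -1
        total += 0.5 * (
            w_pos * (loss(g_x) + loss(g_xp))
            + w_neg * (loss(-g_x) + loss(-g_xp))
        )
        n += 1
    if n == 0:
        raise ValueError("pairs must be non-empty")
    return total / n
```

The weights `w_pos` and `w_neg` can be negative for individual pairs, so the empirical value can drop below zero on finite samples. That is the kind of behavior a risk correction scheme would target, although the specific correction used in the paper is not described in this excerpt.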
Taxonomy
Topics: Machine Learning and Data Classification · Machine Learning and Algorithms · Anomaly Detection Techniques and Applications
