A Probabilistic Theory of Supervised Similarity Learning for Pointwise ROC Curve Optimization
Robin Vogel, Aurélien Bellet, Stéphan Clémençon

TL;DR
This paper develops a probabilistic framework for similarity learning focused on optimizing the pointwise ROC curve, providing theoretical guarantees and addressing large-scale data challenges.
Contribution
The paper casts similarity learning as pointwise ROC curve optimization within a probabilistic framework, establishes universal learning rates and faster rates under a noise assumption, and analyzes the effect of sampling-based approximations.
Findings
Universal learning rates are established for the proposed approach.
Faster rates hold under a noise assumption.
Sampling-based approximations are analyzed as a way to handle large-scale data.
Abstract
The performance of many machine learning techniques depends on the choice of an appropriate similarity or distance measure on the input space. Similarity learning (or metric learning) aims at building such a measure from training data so that observations with the same (resp. different) label are as close (resp. far) as possible. In this paper, similarity learning is investigated from the perspective of pairwise bipartite ranking, where the goal is to rank the elements of a database by decreasing order of the probability that they share the same label with some query data point, based on the similarity scores. A natural performance criterion in this setting is pointwise ROC optimization: maximize the true positive rate under a fixed false positive rate. We study this novel perspective on similarity learning through a rigorous probabilistic framework. The empirical version of the problem…
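The pointwise ROC criterion described in the abstract can be illustrated with a minimal sketch. It assumes a fixed similarity function (negative Euclidean distance, chosen here only for illustration) and evaluates the empirical true positive rate over all pairs at a threshold whose empirical false positive rate does not exceed a target level alpha. The function names, the toy data, and the thresholding rule are assumptions made for this example, not the authors' procedure, and the sketch only evaluates a given similarity rather than learning one.

```python
import numpy as np

def pairwise_scores(X, y, similarity):
    """Score every unordered pair and flag whether the two labels agree."""
    i, j = np.triu_indices(len(X), k=1)
    scores = np.array([similarity(X[a], X[b]) for a, b in zip(i, j)])
    same = y[i] == y[j]  # True for positive pairs (same label)
    return scores, same

def tpr_at_fpr(scores, same, alpha):
    """Empirical TPR at a threshold whose empirical FPR is at most alpha."""
    neg = np.sort(scores[~same])[::-1]  # negative-pair scores, descending
    pos = scores[same]
    # Allow at most floor(alpha * n_neg) negative pairs above the threshold.
    k = int(np.floor(alpha * len(neg)))
    threshold = neg[k] if k < len(neg) else -np.inf
    fpr = float(np.mean(neg > threshold))
    tpr = float(np.mean(pos > threshold))
    return tpr, fpr, threshold

# Toy data (hypothetical): two well-separated classes on the real line.
X = np.array([[0.0], [0.1], [5.0], [5.1]])
y = np.array([0, 0, 1, 1])

def neg_euclidean(a, b):
    """Illustrative similarity: larger means closer."""
    return -np.linalg.norm(a - b)

scores, same = pairwise_scores(X, y, neg_euclidean)
tpr, fpr, threshold = tpr_at_fpr(scores, same, alpha=0.25)
print(f"TPR={tpr:.2f} at FPR={fpr:.2f} (threshold={threshold:.2f})")
```

Learning a similarity under this criterion would amount to searching over a class of similarity functions for the one with the highest such TPR subject to the FPR constraint; the sketch leaves that search out.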
Taxonomy
Topics: Imbalanced Data Classification Techniques · Data-Driven Disease Surveillance · Anomaly Detection Techniques and Applications
