Surrogate regret bounds for generalized classification performance metrics
Wojciech Kotłowski, Krzysztof Dembczyński

TL;DR
This paper establishes theoretical bounds linking surrogate loss minimization to the optimization of complex binary classification metrics, providing a framework for improving performance measures like F-measure and Jaccard similarity.
Contribution
It introduces a two-step procedure with regret bounds for generalized metrics, extending analysis to multilabel classification and averaging measures.
Findings
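The regret of the resulting classifier, measured with respect to the target metric, is upper-bounded in terms of the surrogate-loss regret of the learned function.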
Theoretical guarantees hold for multilabel and averaging metrics.
Empirical results validate the theoretical bounds.
Abstract
We consider optimization of generalized performance metrics for binary classification by means of surrogate losses. We focus on a class of metrics which are linear-fractional functions of the false positive and false negative rates (examples of which include $F_\beta$-measure, Jaccard similarity coefficient, AM measure, and many others). Our analysis concerns the following two-step procedure. First, a real-valued function $f$ is learned by minimizing a surrogate loss for binary classification on the training sample. It is assumed that the surrogate loss is a strongly proper composite loss function (examples of which include logistic loss, squared-error loss, exponential loss, etc.). Then, given $f$, a threshold is tuned on a separate validation sample, by direct optimization of the target performance metric. We show that the regret of the resulting classifier…
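The two-step procedure described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synthetic one-dimensional data, the plain gradient-descent fit, the choice of F1 as the target metric, and all function names (`make_data`, `fit_logistic`, `f_beta`, `tune_threshold`) are assumptions made for illustration only.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_data(n, seed):
    # Hypothetical imbalanced data: about 20% positives, one Gaussian feature.
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = 1 if rng.random() < 0.2 else 0
        X.append([rng.gauss(1.0 if label else -1.0, 1.0)])
        y.append(label)
    return X, y


def fit_logistic(X, y, lr=0.5, epochs=200):
    # Step 1: learn a real-valued scoring function by minimizing the
    # logistic loss (a surrogate loss) with batch gradient descent.
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        gw, gb = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(d):
                gw[j] += err * xi[j]
            gb += err
        w = [wj - lr * g / n for wj, g in zip(w, gw)]
        b -= lr * gb / n
    return lambda x: sum(wj * xj for wj, xj in zip(w, x)) + b


def f_beta(y_true, y_pred, beta=1.0):
    # F_beta written in terms of true positives, false positives,
    # and false negatives; returns 0.0 when the denominator is zero.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = (1 + beta ** 2) * tp + beta ** 2 * fn + fp
    return (1 + beta ** 2) * tp / denom if denom > 0 else 0.0


def tune_threshold(scores, y_true, metric):
    # Step 2: pick the threshold that maximizes the target metric on a
    # separate validation sample. Trying every observed score plus +inf
    # covers all distinct labelings of the form "score >= theta".
    best_theta, best_val = None, -1.0
    for theta in sorted(set(scores)) + [float("inf")]:
        preds = [1 if s >= theta else 0 for s in scores]
        val = metric(y_true, preds)
        if val > best_val:
            best_theta, best_val = theta, val
    return best_theta, best_val


X_train, y_train = make_data(300, seed=0)
X_val, y_val = make_data(300, seed=1)

f = fit_logistic(X_train, y_train)          # step 1: surrogate minimization
val_scores = [f(x) for x in X_val]
theta, tuned_f1 = tune_threshold(val_scores, y_val, f_beta)  # step 2

default_f1 = f_beta(y_val, [1 if s >= 0 else 0 for s in val_scores])
print("tuned threshold:", theta)
print("validation F1 at tuned threshold:", tuned_f1)
print("validation F1 at threshold 0:", default_f1)
```

In this sketch the tuned threshold can never score worse than the default threshold of 0 on the validation sample, because the search covers every labeling that thresholding the scores can produce. Whether that gain carries over to unseen data is what the paper's regret analysis addresses.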
Taxonomy
Topics: Advanced Bandit Algorithms Research · Machine Learning and Data Classification · Data Stream Mining Techniques
