Minimizing The Misclassification Error Rate Using a Surrogate Convex Loss
Shai Ben-David (University of Waterloo), David Loker (University of Waterloo), Nathan Srebro (TTIC), Karthik Sridharan (University of Pennsylvania)

TL;DR
This paper analyzes the effectiveness of convex surrogate loss functions in binary classification, demonstrating that hinge loss provides near-optimal bounds for misclassification error and comparing various convex losses.
Contribution
It establishes that hinge loss is nearly optimal among convex surrogates for minimizing misclassification error, and offers lower bounds highlighting differences among common losses.
Findings
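Among convex surrogates, hinge loss gives essentially the best possible bound on the misclassification error rate of the resulting linear predictor, stated in terms of the best possible margin error rate.
Lower bounds for specific convex surrogates show how commonly used losses differ qualitatively from one another.
The analysis quantifies how well minimizing a convex surrogate loss corresponds to minimizing the misclassification error rate for linear predictors.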
Abstract
We carefully study how well minimizing convex surrogate loss functions corresponds to minimizing the misclassification error rate for the problem of binary classification with linear predictors. In particular, we show that amongst all convex surrogate losses, the hinge loss gives essentially the best possible bound for the misclassification error rate of the resulting linear predictor in terms of the best possible margin error rate. We also provide lower bounds for specific convex surrogates that show how different commonly used losses qualitatively differ from each other.
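For readers unfamiliar with the quantities named in the abstract, the following is a minimal sketch of the misclassification (0-1) error, the margin error, and a few common convex surrogates, each written as a function of the margin z = y * <w, x>. The scalings used here (each surrogate equals 1 at z = 0) are a common textbook convention and an assumption, not necessarily the paper's normalization. The toy data, the weight vector, and all function names are hypothetical and chosen only for illustration.

```python
import math

# All losses below take the margin z = y * <w, x> of a linear predictor w
# on a labeled example (x, y) with y in {-1, +1}.

def zero_one(z):
    """Misclassification error: 1 when the prediction is wrong (z <= 0)."""
    return 1.0 if z <= 0 else 0.0

def margin_error(z):
    """Margin error at margin 1: 1 unless the example is classified with z >= 1."""
    return 1.0 if z < 1 else 0.0

def hinge(z):
    return max(0.0, 1.0 - z)

def logistic(z):
    # Base-2 logarithm so that the loss equals 1 at z = 0 (assumed scaling).
    return math.log2(1.0 + math.exp(-z))

def exponential(z):
    return math.exp(-z)

def squared(z):
    return (1.0 - z) ** 2

SURROGATES = {
    "hinge": hinge,
    "logistic": logistic,
    "exponential": exponential,
    "squared": squared,
}

def margin(w, x, y):
    """Margin y * <w, x> of linear predictor w on example (x, y)."""
    return y * sum(wi * xi for wi, xi in zip(w, x))

def average_loss(loss, w, data):
    """Empirical average of a margin-based loss over a dataset."""
    return sum(loss(margin(w, x, y)) for x, y in data) / len(data)

# Hypothetical toy data and predictor, for illustration only.
toy_data = [
    ((1.0, 0.0), 1),
    ((0.5, 0.5), 1),
    ((-1.0, 0.2), -1),
    ((0.2, -0.1), -1),
]
toy_w = (1.0, 0.0)

if __name__ == "__main__":
    print("0-1 error:   ", average_loss(zero_one, toy_w, toy_data))
    print("margin error:", average_loss(margin_error, toy_w, toy_data))
    for name, loss in SURROGATES.items():
        print(f"{name} loss:", average_loss(loss, toy_w, toy_data))
```

With these scalings every surrogate upper-bounds the 0-1 error pointwise, which is why a small surrogate loss implies a small misclassification rate. The paper's question is the harder converse: how the error of the surrogate minimizer compares with the best achievable margin error rate.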
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Algorithms · Statistical Methods and Inference
