Linear-Core Surrogates: Smooth Loss Functions with Linear Rates for Classification and Structured Prediction
Mehryar Mohri, Yutao Zhong

TL;DR
The paper introduces Linear-Core Surrogates, a new family of convex loss functions that combine the optimization efficiency of smooth losses with the statistical benefits of margin-based losses, applicable to classification and structured prediction.
Contribution
It proposes a novel family of loss functions that are differentiable everywhere and retain linear consistency bounds, improving both optimization and statistical properties.
Findings
Achieves a 23× speedup over Structured SVMs on large-vocabulary sequence tagging.
Demonstrates superior robustness to label noise, outperforming Cross-Entropy by 2.6% on corrupted CIFAR-10.
Proves that the new surrogates combine smoothness with linear $H$-consistency bounds.
Abstract
The choice of loss function in classification involves a fundamental trade-off: smooth losses (like Cross-Entropy) enable fast optimization rates but yield slow square-root consistency bounds, while piecewise-linear losses (like Hinge) offer fast linear consistency rates but suffer from non-differentiability. We propose Linear-Core (LC) Surrogates, a new family of convex loss functions that resolve this tension by stitching a linear core to a smooth tail. We prove that these surrogates are differentiable everywhere while retaining strict linear $H$-consistency bounds, effectively combining the optimization benefits of smoothness with the statistical efficiency of margin-based losses. In the structured prediction setting, we show that this smoothness unlocks a massive computational and energy advantage: it allows for an unbiased stochastic gradient estimator that bypasses the quadratic…
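To make the "linear core stitched to a smooth tail" idea concrete, here is a minimal sketch of one convex, everywhere-differentiable margin loss with that shape. The specific form (linear for margins at or below 0, exponential tail above 0) and the names `linear_core_loss` and `linear_core_grad` are assumptions chosen for illustration; the abstract does not give the paper's actual definition, so this should not be read as the authors' LC surrogate.

```python
import math


def linear_core_loss(margin: float) -> float:
    """Hypothetical stitched loss (illustration only, not the paper's definition).

    Linear core:  1 - margin      for margin <= 0
    Smooth tail:  exp(-margin)    for margin >  0
    Both pieces equal 1 at margin = 0, so the loss is continuous there.
    """
    if margin <= 0.0:
        return 1.0 - margin
    return math.exp(-margin)


def linear_core_grad(margin: float) -> float:
    """Derivative of the hypothetical loss with respect to the margin.

    Both pieces have slope -1 at margin = 0, so the derivative is continuous,
    and it is non-decreasing everywhere, which makes the loss convex.
    """
    if margin <= 0.0:
        return -1.0
    return -math.exp(-margin)


if __name__ == "__main__":
    for m in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(m, linear_core_loss(m), linear_core_grad(m))
```

Unlike the hinge loss, whose slope jumps from -1 to 0 at the margin threshold, the slope here decays continuously toward 0, while the misclassified region (negative margins) keeps the linear shape that margin-based losses have.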
