Weston-Watkins Hinge Loss and Ordered Partitions
Yutong Wang, Clayton D. Scott

TL;DR
This paper introduces the ordered partition loss for multiclass classification, proves that the Weston-Watkins (WW) hinge loss is calibrated with respect to it, and uses this result to explain why the WW-SVM can work well under heavy label noise.
Contribution
It presents a new discrete loss function, the ordered partition loss, and proves that the WW-hinge loss is calibrated with respect to it, even though the WW-hinge loss is not calibrated with respect to the 0-1 loss.
Findings
The WW-hinge loss is calibrated with respect to the ordered partition loss.
The ordered partition loss is argued to be maximally informative among discrete losses with respect to which the WW-hinge loss is calibrated.
The theory justifies the empirical observation of Doğan et al. that the WW-SVM can work well even under massive label noise.
Abstract
Multiclass extensions of the support vector machine (SVM) have been formulated in a variety of ways. A recent empirical comparison of nine such formulations [Doğan et al. 2016] recommends the variant proposed by Weston and Watkins (WW), despite the fact that the WW-hinge loss is not calibrated with respect to the 0-1 loss. In this work we introduce a novel discrete loss function for multiclass classification, the ordered partition loss, and prove that the WW-hinge loss is calibrated with respect to this loss. We also argue that the ordered partition loss is maximally informative among discrete losses satisfying this property. Finally, we apply our theory to justify the empirical observation made by Doğan et al. that the WW-SVM can work well even under massive label noise, a challenging setting for multiclass SVMs.
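For readers unfamiliar with the WW-hinge loss named in the abstract, the sketch below computes it for a single example using its standard textbook form: the sum, over every incorrect class j, of max(0, 1 - (v_y - v_j)), where v is the vector of class scores and y is the true label. The function name, argument names, and example scores are illustrative assumptions and are not taken from the paper; the ordered partition loss itself is not reproduced here because this summary does not define it.

```python
def ww_hinge_loss(scores, label):
    """Weston-Watkins hinge loss for one example (standard form, illustrative).

    scores: list of real-valued class scores, one entry per class.
    label:  index of the true class.
    """
    true_score = scores[label]
    # Each wrong class j contributes a hinge penalty on the margin
    # between the true class score and the score of class j.
    return sum(
        max(0.0, 1.0 - (true_score - s))
        for j, s in enumerate(scores)
        if j != label
    )


# Hypothetical scores for a 3-class problem.
example_scores = [2.0, 0.5, 1.5]
print(ww_hinge_loss(example_scores, 0))  # true class has the top score
print(ww_hinge_loss(example_scores, 1))  # true class has the lowest score
```

The loss is zero only when the true class score exceeds every other score by at least 1, and unlike the 0-1 loss it depends on all pairwise margins rather than only on the top-scoring class.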
