Stochastic smoothing of the top-K calibrated hinge loss for deep imbalanced classification
Camille Garcin, Maximilien Servajean, Alexis Joly, Joseph Salmon

TL;DR
This paper introduces a stochastic top-K hinge loss for deep learning that improves performance and efficiency in large, imbalanced classification tasks, especially for top-K metrics.
Contribution
The paper presents a novel stochastic top-K hinge loss based on smoothing the top-K operator, with a variant tailored for imbalanced datasets, advancing top-K loss design for deep learning.
Findings
Performs well on balanced datasets with lower computational cost
Significantly outperforms baseline losses on heavy-tailed datasets
Effective for top-K accuracy in large, imbalanced classification problems
Abstract
In modern classification tasks, the number of labels keeps growing, as does the size of the datasets encountered in practice. As the number of classes increases, class ambiguity and class imbalance make it increasingly difficult to achieve high top-1 accuracy. Meanwhile, top-K metrics (metrics allowing K guesses) have become popular, especially for performance reporting. Yet, proposing top-K losses tailored for deep learning remains a challenge, both theoretically and practically. In this paper we introduce a stochastic top-K hinge loss inspired by recent developments in top-K calibrated losses. Our proposal is based on smoothing the top-K operator, building on the flexible "perturbed optimizer" framework. We show that our loss function performs very well in the case of balanced datasets, while benefiting from a significantly lower computational time than the…
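The smoothing idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes Gaussian perturbations, a plain Monte Carlo average, and a hinge on the margin between the true-class score and the smoothed (K+1)-th largest score. The function names `smoothed_kth_largest` and `smoothed_topk_hinge`, and the parameters `eps`, `n_samples`, and `seed`, are hypothetical and chosen only for illustration.

```python
import numpy as np


def smoothed_kth_largest(scores, k, eps=1.0, n_samples=1000, seed=0):
    """Monte Carlo estimate of E[k-th largest entry of scores + eps * Z], Z ~ N(0, I).

    Assumption: Gaussian noise and simple averaging; with eps = 0 this
    reduces to the exact (non-smooth) k-th largest score.
    """
    rng = np.random.default_rng(seed)
    scores = np.asarray(scores, dtype=float)
    noise = rng.standard_normal((n_samples, scores.size))
    perturbed = scores + eps * noise  # shape: (n_samples, n_classes)
    # Sort each perturbed score vector in descending order and
    # pick the k-th largest entry (k is 1-indexed).
    kth = -np.sort(-perturbed, axis=1)[:, k - 1]
    return float(kth.mean())


def smoothed_topk_hinge(scores, label, k, eps=1.0, n_samples=1000, seed=0):
    """Hinge loss on the margin between the true-class score and the
    smoothed (k+1)-th largest score (illustrative form, assumed here)."""
    scores = np.asarray(scores, dtype=float)
    threshold = smoothed_kth_largest(scores, k + 1, eps, n_samples, seed)
    return max(0.0, 1.0 + threshold - float(scores[label]))


if __name__ == "__main__":
    s = [3.0, 1.0, 2.0, 0.0]
    # The true class is ranked first, so the loss is small or zero for K = 2.
    print(smoothed_topk_hinge(s, label=0, k=2, eps=0.5))
    # The true class is ranked last, so the loss is large for K = 2.
    print(smoothed_topk_hinge(s, label=3, k=2, eps=0.5))
```

Averaging over noisy copies of the score vector replaces the piecewise-linear k-th largest score with a smooth function of the scores, which is what makes gradient-based training better behaved; `eps` controls how much smoothing is applied.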
Taxonomy
Topics: Imbalanced Data Classification Techniques · Machine Learning and Data Classification · Infrastructure Maintenance and Monitoring
