Trade-offs in Top-k Classification Accuracies on Losses for Deep Learning
Azusa Sawada, Eiji Kaneko, Kazutoshi Sagi

TL;DR
This paper analyzes the limitations of cross entropy loss in optimizing top-k classification accuracy in deep learning and introduces a novel top-k transition loss that improves top-k predictions, especially for larger k values.
Contribution
The paper proposes a new top-k transition loss that enhances top-k accuracy in deep learning models, addressing the shortcomings of cross entropy in complex data distributions.
Findings
Cross entropy (CE) is not always optimal for top-k prediction when the data distribution is complex.
The proposed loss improves top-k accuracy over CE for larger k values.
ResNet18 trained with the new loss reaches 99% top-k accuracy with fewer candidate classes (a smaller k).
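To make the metric behind these findings concrete, here is a minimal sketch of top-k accuracy and of the "number of candidates needed to reach a target accuracy". The function names `topk_accuracy` and `min_candidates` and the toy scores are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def topk_accuracy(scores: Sequence[Sequence[float]],
                  labels: Sequence[int], k: int) -> float:
    """Fraction of samples whose true label is among the k highest-scoring classes."""
    hits = 0
    for row, y in zip(scores, labels):
        # Rank class indices by descending score; ties are broken by class index.
        ranked = sorted(range(len(row)), key=lambda c: (-row[c], c))
        if y in ranked[:k]:
            hits += 1
    return hits / len(labels)


def min_candidates(scores: Sequence[Sequence[float]],
                   labels: Sequence[int], target: float) -> int:
    """Smallest k whose top-k accuracy is at least `target` (hypothetical helper)."""
    num_classes = len(scores[0])
    for k in range(1, num_classes + 1):
        if topk_accuracy(scores, labels, k) >= target:
            return k
    return num_classes


# Toy data (made up for illustration): 4 samples, 3 classes.
scores = [
    [0.60, 0.30, 0.10],
    [0.20, 0.50, 0.30],
    [0.40, 0.35, 0.25],
    [0.10, 0.20, 0.70],
]
labels = [0, 2, 1, 0]

print(topk_accuracy(scores, labels, 1))     # 0.25
print(topk_accuracy(scores, labels, 2))     # 0.75
print(min_candidates(scores, labels, 0.99)) # 3
```

A model that needs a smaller k to hit the same target accuracy hands a shorter candidate list to whatever comes next, which is the sense in which "fewer candidates" is an improvement.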
Abstract
This paper presents an experimental analysis of trade-offs in top-k classification accuracies on losses for deep learning and proposes a novel top-k loss. The commonly used cross entropy (CE) is not guaranteed to optimize top-k prediction without infinite training data and model complexities. The objective is to clarify when CE sacrifices top-k accuracies to optimize top-1 prediction, and to design a loss that improves top-k accuracy under such conditions. Our novel loss is basically CE modified by grouping temporal top-k classes as a single class. To obtain a robust decision boundary, we introduce an adaptive transition from normal CE to our loss, and thus call it top-k transition loss. Our experiments demonstrate that CE is not always the best choice for learning top-k prediction. First, we explore trade-offs between top-1 and top-k (=2) accuracies on synthetic datasets, and find…
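The abstract gives the idea of the loss but not its formula, so the following is only one possible reading, sketched under explicit assumptions: the "group" is taken to be the true class together with the k-1 highest-scoring other classes at the current step, and the transition is modeled as a fixed blend weight `alpha` between plain CE and the grouped loss. The paper's actual grouping rule and its adaptive schedule may differ; `grouped_topk_loss`, `blended_loss`, and `alpha` are hypothetical names.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over one sample's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Standard CE for a single sample: -log p(label)."""
    return -math.log(softmax(logits)[label])


def grouped_topk_loss(logits: Sequence[float], label: int, k: int) -> float:
    """ASSUMED grouping: merge the true class with the k-1 strongest other
    classes into one class and take -log of the merged probability."""
    probs = softmax(logits)
    others = sorted((c for c in range(len(probs)) if c != label),
                    key=lambda c: (-probs[c], c))
    group = [label] + others[:k - 1]
    return -math.log(sum(probs[c] for c in group))


def blended_loss(logits: Sequence[float], label: int, k: int,
                 alpha: float) -> float:
    """ASSUMED transition: alpha=0 is plain CE, alpha=1 is the grouped loss.
    How alpha is adapted during training is not specified in the abstract."""
    return ((1.0 - alpha) * cross_entropy(logits, label)
            + alpha * grouped_topk_loss(logits, label, k))


# Toy example (made-up logits): the true class 2 is only the second-strongest.
logits = [2.0, 0.5, 1.5]
print(cross_entropy(logits, 2))
print(grouped_topk_loss(logits, 2, k=2))
print(blended_loss(logits, 2, k=2, alpha=0.5))
```

Under this reading the grouped loss reduces to CE at k = 1 and is never larger than CE, so it stops penalizing a sample once the true class shares enough probability mass with its top-k companions. That matches the abstract's motivation of not sacrificing top-k accuracy for top-1.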
Taxonomy
Topics: Advanced Neural Network Applications · Machine Learning and Data Classification · Generative Adversarial Networks and Image Synthesis
