Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels
Zhilu Zhang, Mert R. Sabuncu

TL;DR
This paper introduces a family of noise-robust loss functions that generalizes mean absolute error (MAE) and categorical cross entropy (CCE) for training deep neural networks on noisy labels, and reports better accuracy than either loss on CIFAR-10, CIFAR-100, and FASHION-MNIST.
Contribution
The paper proposes theoretically grounded generalized loss functions that improve robustness to label noise in deep neural network training.
Findings
The new loss functions outperform MAE and CCE in noisy label scenarios.
Experimental results show improved accuracy on CIFAR-10, CIFAR-100, and FASHION-MNIST.
The methods are compatible with existing DNN architectures and training algorithms.
Abstract
Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes with the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly-used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. Proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results…
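To make "a generalization of MAE and CCE" concrete, here is a minimal sketch of one loss with that property. The summary above does not spell out the formula; the form used here, L_q(p, y) = (1 - p_y^q) / q with 0 < q <= 1, is the one commonly associated with this paper and should be checked against the full text. The helper names (`softmax`, `lq_loss`, `cce_loss`, `mae_loss`) and the default `q=0.7` are illustrative choices, not taken from this page.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def lq_loss(probs, label, q=0.7):
    """Assumed generalized loss: (1 - p_y^q) / q, for 0 < q <= 1."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must be in (0, 1]")
    return (1.0 - probs[label] ** q) / q


def cce_loss(probs, label):
    """Categorical cross entropy for a one-hot target."""
    return -math.log(probs[label])


def mae_loss(probs, label):
    """Sum of absolute errors against a one-hot target; equals 2 * (1 - p_y)."""
    return sum(abs((1.0 if k == label else 0.0) - p) for k, p in enumerate(probs))


probs = softmax([2.0, 0.5, -1.0])
label = 0

# q = 1 recovers MAE up to a constant factor of 2.
print(lq_loss(probs, label, q=1.0), mae_loss(probs, label) / 2)

# q close to 0 approaches CCE.
print(lq_loss(probs, label, q=1e-6), cce_loss(probs, label))
```

Under this form, `q` acts as a dial between the two endpoints: values near 0 behave like CCE and `q = 1` behaves like MAE, so intermediate values trade off between the two losses.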
Taxonomy
Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning
