Generalized Cross Entropy Loss for Training Deep Neural Networks with Noisy Labels

Zhilu Zhang; Mert R. Sabuncu

arXiv:1805.07836 · cs.LG · December 3, 2018 · 1.5k cites

TL;DR

This paper introduces a family of noise-robust loss functions for training deep neural networks that generalizes mean absolute error (MAE) and categorical cross entropy (CCE), and reports that it outperforms both under label noise on CIFAR-10, CIFAR-100, and FASHION-MNIST.

Contribution

The paper proposes theoretically grounded generalized loss functions that improve robustness to label noise in deep neural network training.

Findings

01 The new loss functions outperform MAE and CCE in noisy label scenarios.

02 Experimental results show improved accuracy on CIFAR-10, CIFAR-100, and FASHION-MNIST.

03 The methods are compatible with existing DNN architectures and training algorithms.

Abstract

Deep neural networks (DNNs) have achieved tremendous success in a variety of applications across many disciplines. Yet, their superior performance comes with the expensive cost of requiring correctly annotated large-scale datasets. Moreover, due to DNNs' rich capacity, errors in training labels can hamper performance. To combat this problem, mean absolute error (MAE) has recently been proposed as a noise-robust alternative to the commonly-used categorical cross entropy (CCE) loss. However, as we show in this paper, MAE can perform poorly with DNNs and challenging datasets. Here, we present a theoretically grounded set of noise-robust loss functions that can be seen as a generalization of MAE and CCE. Proposed loss functions can be readily applied with any existing DNN architecture and algorithm, while yielding good performance in a wide range of noisy label scenarios. We report results…
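
The abstract describes the proposed losses as a generalization of MAE and CCE but does not give a formula. As a minimal sketch, assuming the L_q form from the full paper, (1 - p_y^q) / q with 0 < q <= 1, where p_y is the softmax probability assigned to the labeled class, the NumPy functions below compare the three losses. The function names, the example probabilities, and the value q = 0.7 are illustrative assumptions, not taken from this page.

```python
import numpy as np


def cce_loss(probs, labels):
    """Categorical cross entropy per sample: -log(p_y)."""
    p_y = probs[np.arange(len(labels)), labels]
    return -np.log(p_y)


def mae_loss(probs, labels):
    """MAE between the one-hot label and the predicted distribution.

    When each row of probs sums to 1 this equals 2 * (1 - p_y).
    """
    one_hot = np.eye(probs.shape[1])[labels]
    return np.abs(one_hot - probs).sum(axis=1)


def gce_loss(probs, labels, q=0.7):
    """L_q loss per sample: (1 - p_y**q) / q, assumed form, 0 < q <= 1."""
    if not 0.0 < q <= 1.0:
        raise ValueError("q must lie in (0, 1]")
    p_y = probs[np.arange(len(labels)), labels]
    return (1.0 - p_y ** q) / q


# Hypothetical softmax outputs for two samples over three classes.
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])
labels = np.array([0, 1])

print("CCE:        ", cce_loss(probs, labels))
print("L_q, q=0.7: ", gce_loss(probs, labels, q=0.7))
print("MAE / 2:    ", mae_loss(probs, labels) / 2)
```

Under this form, q = 1 gives 1 - p_y, which is MAE up to a constant factor of 2, and letting q approach 0 recovers -log(p_y), which is CCE. Intermediate values of q trade the noise robustness of MAE against the faster learning behavior of CCE.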

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning