Taming the Cross Entropy Loss
Manuel Martinez, Rainer Stiefelhagen

TL;DR
The paper introduces the Tamed Cross Entropy (TCE) loss, a robust alternative to the standard Cross Entropy (CE) loss that keeps the training properties of CE in noiseless conditions and improves performance under label noise.
Contribution
The TCE loss is a robust loss function that preserves the training properties of CE on clean data and improves robustness to label noise without requiring any change to the training regime.
Findings
TCE outperforms CE in every tested label-noise scenario, evaluated with the ResNet architecture on four image datasets.
TCE maintains similar training properties to CE in noiseless conditions.
The method is compatible with existing training regimes without modifications.
Abstract
We present the Tamed Cross Entropy (TCE) loss function, a robust derivative of the standard Cross Entropy (CE) loss used in deep learning for classification tasks. Unlike other robust losses, the TCE loss is designed to exhibit the same training properties as the CE loss in noiseless scenarios. Therefore, the TCE loss requires no modification to the training regime compared to the CE loss and, as a consequence, can be applied in all applications where the CE loss is currently used. We evaluate the TCE loss using the ResNet architecture on four image datasets that we artificially contaminated with various levels of label noise. The TCE loss outperforms the CE loss in every tested scenario.
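The abstract says the datasets were artificially contaminated with label noise but does not state the noise model. As a minimal sketch, assuming symmetric (uniform) noise in which each label is replaced by a different class with a fixed probability, the contamination step could look like the following. The function name add_symmetric_label_noise and its parameters are hypothetical and are not taken from the paper.

```python
import random


def add_symmetric_label_noise(labels, num_classes, noise_rate, seed=0):
    """Return a copy of `labels` in which each label is, with probability
    `noise_rate`, replaced by a different class chosen uniformly at random.

    Assumption: symmetric noise. The paper's actual protocol may differ.
    """
    if not 0.0 <= noise_rate <= 1.0:
        raise ValueError("noise_rate must be in [0, 1]")
    rng = random.Random(seed)  # local generator keeps the result reproducible
    noisy = []
    for y in labels:
        if rng.random() < noise_rate:
            # Draw from the other num_classes - 1 classes, skipping y itself.
            wrong = rng.randrange(num_classes - 1)
            noisy.append(wrong if wrong < y else wrong + 1)
        else:
            noisy.append(y)
    return noisy


# Toy example: 1000 labels over 10 classes with a 40% noise rate.
clean = [i % 10 for i in range(1000)]
noisy = add_symmetric_label_noise(clean, num_classes=10, noise_rate=0.4)
flipped = sum(c != n for c, n in zip(clean, noisy))
print(f"{flipped} of {len(clean)} labels changed")
```

Training the same network once with CE and once with a robust loss on the noisy labels, then evaluating both on clean test labels, gives the kind of comparison the abstract describes.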
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Advanced Neural Network Applications · Machine Learning and Data Classification
Methods: Average Pooling · 1x1 Convolution · Batch Normalization · Bottleneck Residual Block · Global Average Pooling · Residual Block · Kaiming Initialization · Max Pooling · Residual Connection
