Does label smoothing mitigate label noise?
Michal Lukasik, Srinadh Bhojanapalli, Aditya Krishna Menon, Sanjiv Kumar

TL;DR
This paper investigates whether label smoothing helps mitigate label noise in deep learning, revealing its relation to loss correction methods and its benefits in noisy data distillation.
Contribution
It establishes a connection between label smoothing and loss-correction techniques, demonstrating its effectiveness under label noise and in noisy data distillation scenarios.
Findings
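Label smoothing appears to amplify label noise, since it is equivalent to injecting symmetric noise into the labels, yet it relates to a general family of loss-correction methods.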
It is competitive with loss correction techniques in noisy settings.
Label smoothing of the teacher improves distillation from noisy data.
Abstract
Label smoothing is commonly used in training deep learning models, wherein one-hot training labels are mixed with uniform label vectors. Empirically, smoothing has been shown to improve both predictive performance and model calibration. In this paper, we study whether label smoothing is also effective as a means of coping with label noise. While label smoothing apparently amplifies this problem --- being equivalent to injecting symmetric noise to the labels --- we show how it relates to a general family of loss-correction techniques from the label noise literature. Building on this connection, we show that label smoothing is competitive with loss-correction under label noise. Further, we show that when distilling models from noisy data, label smoothing of the teacher is beneficial; this is in contrast to recent findings for noise-free problems, and sheds further light on settings where…
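The mixing described in the abstract, and its equivalence to symmetric label noise, can be illustrated with a minimal NumPy sketch. The function names `smooth_labels` and `symmetric_noise_matrix`, and the mixing weight `alpha`, are assumptions chosen for illustration and are not taken from the paper. The sketch assumes a noise model in which each label is, with probability `alpha`, redrawn uniformly over all classes; under that assumption the expected noisy one-hot label is the same vector that label smoothing produces.

```python
import numpy as np

def smooth_labels(labels, num_classes, alpha):
    """Mix one-hot labels with the uniform distribution (illustrative helper)."""
    one_hot = np.eye(num_classes)[labels]
    # Each row becomes (1 - alpha) * one_hot + alpha / num_classes.
    return (1.0 - alpha) * one_hot + alpha / num_classes

def symmetric_noise_matrix(num_classes, alpha):
    """Transition matrix for the assumed symmetric noise model.

    Row i is the distribution of the noisy label given true label i,
    when the label is redrawn uniformly over all classes with
    probability alpha.
    """
    identity = np.eye(num_classes)
    uniform = np.ones((num_classes, num_classes)) / num_classes
    return (1.0 - alpha) * identity + alpha * uniform

labels = np.array([0, 2, 1])
num_classes, alpha = 3, 0.1

smoothed = smooth_labels(labels, num_classes, alpha)
noise_matrix = symmetric_noise_matrix(num_classes, alpha)

# The smoothed target for true label i equals row i of the noise matrix.
print(np.allclose(smoothed, noise_matrix[labels]))
```

Each smoothed target remains a valid probability distribution, so it can be used directly as the target of a cross-entropy loss.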
Taxonomy
Topics: Machine Learning and Data Classification · Infrastructure Maintenance and Monitoring · Advanced Multi-Objective Optimization Algorithms
Methods: Label Smoothing
