Cross Entropy versus Label Smoothing: A Neural Collapse Perspective
Li Guo, George Andriopoulos, Zifan Zhao, Shuyang Ling, Zixuan Dong, Keith Ross

TL;DR
This paper investigates how label smoothing affects neural network training through the lens of Neural Collapse, revealing faster convergence, stronger collapse, and better calibration compared to standard cross-entropy loss.
Contribution
The paper provides empirical and theoretical insight into how label smoothing affects neural collapse and convergence speed, including closed-form global minimizers derived under the unconstrained feature model.
Findings
Models with label smoothing converge faster to neural collapse solutions.
Label smoothing yields a stronger level of neural collapse and, at the same level of NC1, intensified NC2.
Theoretical analysis under the unconstrained feature model shows that models trained with label smoothing have a lower conditioning number, implying faster convergence.
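The findings above refer to NC1 (within-class variability collapse) and NC2 (class means approaching a simplex equiangular tight frame). The sketch below shows one way such quantities could be measured on last-layer features. It is not the paper's measurement code: `nc1_proxy` and `nc2_deviation` are hypothetical helper names, the NC1 quantity is a simplified trace ratio rather than a pseudoinverse-based metric, and the NC2 check covers only the pairwise angles, not the equal-norm condition.

```python
import math

def class_means(features, labels):
    """Return ({label: mean vector}, global mean) for a list of feature vectors."""
    dim = len(features[0])
    sums, counts = {}, {}
    for x, y in zip(features, labels):
        s = sums.setdefault(y, [0.0] * dim)
        for j, v in enumerate(x):
            s[j] += v
        counts[y] = counts.get(y, 0) + 1
    means = {y: [v / counts[y] for v in s] for y, s in sums.items()}
    n = len(features)
    global_mean = [sum(x[j] for x in features) / n for j in range(dim)]
    return means, global_mean

def nc1_proxy(features, labels):
    """Within-class scatter trace over between-class scatter trace.

    Simplified proxy (an assumption of this sketch): 0 means every
    feature sits exactly on its class mean.
    """
    means, g = class_means(features, labels)
    within = sum(
        sum((a - b) ** 2 for a, b in zip(x, means[y]))
        for x, y in zip(features, labels)
    )
    between = sum(
        sum((a - b) ** 2 for a, b in zip(means[y], g)) for y in labels
    )
    return within / between

def nc2_deviation(features, labels):
    """Largest gap between a pairwise cosine of centered class means and -1/(K-1)."""
    means, g = class_means(features, labels)
    centered = [[a - b for a, b in zip(m, g)] for m in means.values()]
    k = len(centered)
    target = -1.0 / (k - 1)  # cosine between distinct simplex ETF vertices
    worst = 0.0
    for i in range(k):
        for j in range(i + 1, k):
            dot = sum(a * b for a, b in zip(centered[i], centered[j]))
            ni = math.sqrt(sum(a * a for a in centered[i]))
            nj = math.sqrt(sum(a * a for a in centered[j]))
            worst = max(worst, abs(dot / (ni * nj) - target))
    return worst
```

Both functions take `features` as a list of equal-length float lists and `labels` as a parallel list of class ids. Values near zero for both would indicate features that are close to a collapsed configuration.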
Abstract
Label smoothing loss is a widely adopted technique to mitigate overfitting in deep neural networks. This paper studies label smoothing from the perspective of Neural Collapse (NC), a powerful empirical and theoretical framework which characterizes model behavior during the terminal phase of training. We first show empirically that models trained with label smoothing converge faster to neural collapse solutions and attain a stronger level of neural collapse. Additionally, we show that at the same level of NC1, models under label smoothing loss exhibit intensified NC2. These findings provide valuable insights into the performance benefits and enhanced model calibration under label smoothing loss. We then leverage the unconstrained feature model to derive closed-form solutions for the global minimizers for both loss functions and further demonstrate that models under label smoothing have a…
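For readers unfamiliar with the two losses being compared, here is a minimal sketch of label smoothing on a single example. It assumes the common convention in which the one-hot target is mixed with a uniform distribution over the K classes. The smoothing parameter name `delta` and the function names are illustrative choices, not taken from the paper.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax of a list of logits."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]

def smoothed_targets(label, num_classes, delta):
    """Soft target: (1 - delta) * one_hot(label) + delta / K (assumed convention)."""
    return [
        (1.0 - delta) * (1.0 if k == label else 0.0) + delta / num_classes
        for k in range(num_classes)
    ]

def label_smoothing_loss(logits, label, delta=0.1):
    """Cross-entropy between the smoothed target and the softmax output.

    With delta = 0 this reduces to standard cross-entropy.
    """
    logp = log_softmax(logits)
    q = smoothed_targets(label, len(logits), delta)
    return -sum(qk * lk for qk, lk in zip(q, logp))
```

Calling `label_smoothing_loss(logits, label, delta=0.0)` gives the plain cross-entropy loss, so the same function can be used to compare the two objectives on identical logits.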
Taxonomy
Topics: Neural Networks and Applications
Methods: Label Smoothing
