How Does Label Noise Gradient Descent Improve Generalization in the Low SNR Regime?

Wei Huang; Andi Han; Yujin Song; Yilan Chen; Denny Wu; Difan Zou; Taiji Suzuki

arXiv:2510.17526 · cs.LG · October 21, 2025

Open Access

TL;DR

This paper shows, with supporting theoretical analysis, that adding label noise during gradient descent (GD) training of a neural network can suppress noise memorization and improve generalization in the low signal-to-noise ratio (SNR) regime.

Contribution

The paper provides a theoretical analysis of how label noise GD, applied to a two-layer neural network in an idealized signal-noise data setting, prevents overfitting to noise and improves generalization in the low SNR regime.

Findings

01. Label noise GD suppresses noise memorization.

02. Standard GD overfits to noise in low SNR.

03. Label noise GD achieves better generalization.

Abstract

The capacity of deep learning models is often large enough to both learn the underlying statistical signal and overfit to noise in the training set. This noise memorization can be especially harmful for data with a low signal-to-noise ratio (SNR), leading to poor generalization. Inspired by prior observations that label noise provides implicit regularization that improves generalization, in this work we investigate whether introducing label noise to the gradient updates can enhance the test performance of neural networks (NNs) in the low SNR regime. Specifically, we consider training a two-layer NN with a simple label noise gradient descent (GD) algorithm in an idealized signal-noise data setting. We prove that adding label noise during training suppresses noise memorization, preventing it from dominating the learning process; consequently, label noise GD enjoys rapid signal growth…
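
To make the idea of "introducing label noise to the gradient updates" concrete, here is a minimal sketch of one way such a training step could look. It is not the authors' implementation: the excerpt above does not specify the architecture details, loss, or noise model, so the squared-ReLU units, fixed +1/-1 output weights, logistic loss, independent per-step label flipping, and all function and parameter names (`forward`, `flip_labels`, `label_noise_gd_step`, `flip_prob`) are assumptions chosen for illustration.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def forward(w_pos, w_neg, x):
    """Toy two-layer net: squared-ReLU units with fixed +1/-1 output weights (assumed)."""
    pos = sum(max(dot(w, x), 0.0) ** 2 for w in w_pos) / len(w_pos)
    neg = sum(max(dot(w, x), 0.0) ** 2 for w in w_neg) / len(w_neg)
    return pos - neg


def flip_labels(labels, flip_prob, rng):
    """Negate each +1/-1 label independently with probability flip_prob."""
    return [-y if rng.random() < flip_prob else y for y in labels]


def label_noise_gd_step(w_pos, w_neg, xs, labels, lr, flip_prob, rng):
    """One full-batch GD step on the logistic loss, using freshly flipped labels."""
    noisy = flip_labels(labels, flip_prob, rng)  # resampled at every step
    n = len(xs)
    # d/df of log(1 + exp(-y f)) is -y / (1 + exp(y f)); clamp to avoid overflow.
    coeffs = [
        -y / (1.0 + math.exp(min(y * forward(w_pos, w_neg, x), 700.0)))
        for x, y in zip(xs, noisy)
    ]
    updated = []
    for sign, weights in ((1.0, w_pos), (-1.0, w_neg)):
        m = len(weights)
        new_weights = []
        for w in weights:
            grad = [0.0] * len(w)
            for x, c in zip(xs, coeffs):
                act = max(dot(w, x), 0.0)
                # Chain rule: d relu(z)^2 / dw = 2 relu(z) x, scaled by sign / m.
                scale = sign * c * 2.0 * act / (m * n)
                for j, xj in enumerate(x):
                    grad[j] += scale * xj
            new_weights.append([wj - lr * gj for wj, gj in zip(w, grad)])
        updated.append(new_weights)
    return updated[0], updated[1]
```

Setting `flip_prob=0.0` recovers standard GD in this sketch, so the same function can be used to compare the two training procedures on a toy dataset by passing a seeded `random.Random` instance as `rng`.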

Taxonomy

Topics: Machine Learning and Data Classification · Advanced Statistical Modeling Techniques · Topological and Geometric Data Analysis