Rethinking Benign Overfitting in Two-Layer Neural Networks

Ruichen Xu; Kexin Chen

arXiv:2502.11893·cs.LG·June 10, 2025

TL;DR

This paper refines the feature-noise data model to include class-dependent noise, revealing that neural networks can utilize data noise to improve classification in long-tailed distributions, supported by theoretical analysis and experiments.

Contribution

It introduces a refined heterogeneous noise model and analyzes neural network training dynamics, showing how data noise can enhance generalization in long-tailed data.

Findings

01

Neural networks leverage data noise to improve classification accuracy.

02

Test loss bounds are established for the refined noise model.

03

Experimental results validate the theoretical insights.

Abstract

Recent theoretical studies (Kou et al., 2023; Cao et al., 2022) have revealed a sharp phase transition from benign to harmful overfitting when the noise-to-feature ratio exceeds a threshold, a situation common in long-tailed data distributions where atypical data is prevalent. However, harmful overfitting rarely happens in overparameterized neural networks. Further experimental results suggested that memorization is necessary for achieving near-optimal generalization error in long-tailed data distributions (Feldman & Zhang, 2020). We argue that this discrepancy between theoretical predictions and empirical observations arises because previous feature-noise data models overlook the heterogeneous nature of noise across different data classes. In this paper, we refine the feature-noise data model by incorporating class-dependent heterogeneous noise and re-examine the overfitting phenomenon…
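
To make "class-dependent heterogeneous noise" concrete, the following is a minimal sketch and not the paper's actual data model, which the truncated abstract does not specify. It assumes a two-patch sample made of a signal patch y·mu and a Gaussian noise patch whose standard deviation depends on the label. The function names, the dictionary `noise_std_by_class`, the dimensions, and the way the noise-to-feature ratio is measured are all illustrative assumptions.

```python
import numpy as np


def sample_two_patch_data(n, mu, noise_std_by_class, rng):
    """Draw n samples with a signal patch y * mu and a Gaussian noise patch.

    Assumption (not from the paper): labels are in {-1, +1} and the noise
    standard deviation is looked up per label in noise_std_by_class.
    """
    y = rng.choice(np.array([-1, 1]), size=n)
    signal = y[:, None] * mu[None, :]
    # Per-sample noise scale chosen by the sample's class.
    std = np.where(y == 1, noise_std_by_class[1], noise_std_by_class[-1])
    noise = rng.standard_normal((n, mu.shape[0])) * std[:, None]
    return signal, noise, y


def noise_to_feature_ratio(noise, mu):
    """Illustrative ratio: mean noise-patch norm divided by the signal norm."""
    return np.linalg.norm(noise, axis=1).mean() / np.linalg.norm(mu)


rng = np.random.default_rng(0)
mu = np.zeros(50)
mu[0] = 5.0  # hypothetical signal direction and strength

signal, noise, y = sample_two_patch_data(4000, mu, {1: 1.0, -1: 3.0}, rng)

# Under a homogeneous model both classes would share one ratio;
# here each class gets its own.
for label in (1, -1):
    print(label, noise_to_feature_ratio(noise[y == label], mu))
```

In this toy setup the class with the larger noise scale has the larger noise-to-feature ratio, which is the kind of per-class difference a single homogeneous noise level cannot express.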

Taxonomy

Topics: Neural Networks and Applications