Stable Minima of ReLU Neural Networks Suffer from the Curse of Dimensionality: The Neural Shattering Phenomenon

Tongtong Liang; Dan Qiao; Yu-Xiang Wang; Rahul Parhi

arXiv:2506.20779 · stat.ML · January 13, 2026

TL;DR

This paper shows that, in overparameterized two-layer ReLU networks, flat (stable) minima do generalize, but their rates of convergence deteriorate exponentially as the input dimension grows. The authors attribute this to a phenomenon they call neural shattering.

Contribution

The paper proves upper and lower bounds on the generalization gap of flat solutions and on the MSE of stable minima for multivariate inputs, where existing work either requires interpolation or considers only univariate inputs.

Findings

01

Flat solutions generalize poorly as input dimension increases.

02

Convergence rates for flat minima deteriorate exponentially as the input dimension grows.

03

Neural shattering explains the failure of flat minima to generalize in high-dimensional spaces.

Abstract

We study the implicit bias of flatness / low (loss) curvature and its effects on generalization in two-layer overparameterized ReLU networks with multivariate inputs -- a problem well motivated by the minima stability and edge-of-stability phenomena in gradient-descent training. Existing work either requires interpolation or focuses only on univariate inputs. This paper presents new and somewhat surprising theoretical results for multivariate inputs. On two natural settings (1) generalization gap for flat solutions, and (2) mean-squared error (MSE) in nonparametric function estimation by stable minima, we prove upper and lower bounds, which establish that while flatness does imply generalization, the resulting rates of convergence necessarily deteriorate exponentially as the input dimension grows. This gives an exponential separation between the flat solutions compared to low-norm…
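The abstract connects flatness to minima stability under gradient descent. As a rough illustration only, and not the paper's code, the sketch below measures flatness as the largest eigenvalue of the loss Hessian of a small two-layer ReLU network. It then checks the standard linear-stability heuristic that gradient descent with step size `lr` can remain at a minimum only if that eigenvalue is at most `2 / lr`. The parameter layout, the function names (`init_params`, `sharpness`, `is_linearly_stable`), the finite-difference Hessian, and the random data are all assumptions made for this example.

```python
import numpy as np

def unpack(theta, d, width):
    # Hypothetical flat layout: W (width x d), then b (width,), then v (width,).
    W = theta[: width * d].reshape(width, d)
    b = theta[width * d : width * d + width]
    v = theta[width * d + width :]
    return W, b, v

def init_params(d, width, rng):
    return rng.normal(size=width * d + 2 * width) / np.sqrt(d)

def loss(theta, X, y, d, width):
    # Half mean-squared error of f(x) = sum_k v_k * relu(w_k . x + b_k).
    W, b, v = unpack(theta, d, width)
    r = np.maximum(X @ W.T + b, 0.0) @ v - y
    return 0.5 * np.mean(r ** 2)

def grad(theta, X, y, d, width):
    # Analytic gradient of the loss above.
    W, b, v = unpack(theta, d, width)
    n = len(y)
    pre = X @ W.T + b                      # (n, width) pre-activations
    act = np.maximum(pre, 0.0)
    r = act @ v - y                        # (n,) residuals
    back = r[:, None] * (pre > 0) * v[None, :]
    g_W = back.T @ X / n
    g_b = back.sum(axis=0) / n
    g_v = act.T @ r / n
    return np.concatenate([g_W.ravel(), g_b, g_v])

def hessian_fd(theta, X, y, d, width, eps=1e-5):
    # Central finite differences of the gradient, symmetrized.
    # Only meaningful away from ReLU kinks, where the loss is twice differentiable.
    p = theta.size
    H = np.zeros((p, p))
    for i in range(p):
        e = np.zeros(p)
        e[i] = eps
        H[:, i] = (grad(theta + e, X, y, d, width)
                   - grad(theta - e, X, y, d, width)) / (2 * eps)
    return 0.5 * (H + H.T)

def sharpness(theta, X, y, d, width):
    # Largest Hessian eigenvalue; smaller values mean a flatter point.
    return np.linalg.eigvalsh(hessian_fd(theta, X, y, d, width))[-1]

def is_linearly_stable(theta, X, y, d, width, lr):
    # Linear-stability heuristic for gradient descent: lambda_max <= 2 / lr.
    return sharpness(theta, X, y, d, width) <= 2.0 / lr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, width, n = 3, 4, 20
    X, y = rng.normal(size=(n, d)), rng.normal(size=n)
    theta = init_params(d, width, rng)
    print("sharpness:", sharpness(theta, X, y, d, width))
    print("stable at lr=0.1:", is_linearly_stable(theta, X, y, d, width, lr=0.1))
```

The demo evaluates the check at a random parameter vector just to exercise the functions. To probe a stable minimum in the abstract's sense, one would apply it to parameters reached by training.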

Taxonomy

Topics: Neural Networks and Applications