Generalization Bounds of Stochastic Gradient Descent in Homogeneous Neural Networks

Wenquan Ma; Yang Sui; Jiaye Teng; Bohan Wang; Jing Xu; Jingqin Yang

arXiv:2602.22936·cs.LG·February 27, 2026

Open Access

TL;DR

This paper establishes new generalization bounds for homogeneous neural networks trained with stochastic gradient descent, showing that a slower stepsize decay of order $1/\sqrt{t}$ suffices, which aligns better with practical training.

Contribution

It proves that homogeneous neural networks allow a slower stepsize decay of order $1/\sqrt{t}$, extending stability-based generalization bounds beyond the $O(1/t)$ stepsize usually required in non-convex training.

Findings

01

A slower stepsize decay of order $1/\sqrt{t}$ is sufficient for generalization.

02

Bounds are applicable to ReLU and LeakyReLU networks.

03

Theoretical extension to non-Lipschitz regimes.
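
As background for the ReLU and LeakyReLU finding above, the sketch below checks positive homogeneity numerically: for a bias-free fully-connected network with $L$ weight matrices, scaling every weight by $c > 0$ scales the output by $c^L$. This is a standard property of such networks, not code from the paper, and the paper's precise definition of homogeneity may differ. The layer widths, the seed, and the helper names (`forward`, `scale_params`, `homogeneity_gap`) are hypothetical choices made for illustration.

```python
import random

def leaky_relu(v, alpha):
    # alpha = 0.0 gives ReLU; 0 < alpha < 1 gives LeakyReLU.
    return [x if x > 0 else alpha * x for x in v]

def matvec(W, v):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def forward(weights, x, alpha=0.0):
    # Bias-free fully-connected network; the last layer is linear.
    h = x
    for W in weights[:-1]:
        h = leaky_relu(matvec(W, h), alpha)
    return matvec(weights[-1], h)[0]

def scale_params(weights, c):
    # Multiply every parameter by the same positive constant c.
    return [[[c * w for w in row] for row in W] for W in weights]

rng = random.Random(0)
dims = [4, 5, 3, 1]  # hypothetical widths: 3 weight matrices, scalar output
weights = [
    [[rng.gauss(0.0, 1.0) for _ in range(dims[i])] for _ in range(dims[i + 1])]
    for i in range(len(dims) - 1)
]
x = [rng.gauss(0.0, 1.0) for _ in range(dims[0])]
c, L = 2.0, len(weights)

def homogeneity_gap(alpha):
    # |f(c * theta; x) - c^L * f(theta; x)|, which should be (near) zero.
    base = forward(weights, x, alpha)
    scaled = forward(scale_params(weights, c), x, alpha)
    return abs(scaled - c**L * base)

relu_gap = homogeneity_gap(0.0)
leaky_gap = homogeneity_gap(0.01)
print(relu_gap, leaky_gap)
```

The property holds because both activations satisfy $\sigma(cz) = c\,\sigma(z)$ for $c > 0$, so the factor $c$ passes through each layer. Adding bias terms that are not scaled in the same way would break it.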

Abstract

Algorithmic stability is among the most potent techniques in generalization analysis. However, its derivation usually requires a stepsize $\eta_t = O(1/t)$ under non-convex training regimes, where $t$ denotes the iteration count. This rigid decay of the stepsize potentially impedes optimization and may not align with practical scenarios. In this paper, we derive generalization bounds under the homogeneous neural network regime, proving that this regime enables a slower stepsize decay of order $\Omega(1/\sqrt{t})$ under mild assumptions. We further extend the theoretical results in several directions, e.g., to non-Lipschitz regimes. This finding is broadly applicable, as homogeneous neural networks encompass fully-connected and convolutional neural networks with ReLU and LeakyReLU activations.
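
To make the difference between the $1/t$ decay and the slower $1/\sqrt{t}$ decay concrete, here is a minimal sketch that compares the total stepsize accumulated over $T$ iterations under each schedule. It is an illustration only, not the paper's analysis; the constant `c`, the horizon `T`, and the function names are assumptions chosen for the example.

```python
import math

def eta_inverse_t(t, c=1.0):
    # Stepsize of order 1/t, the decay usually required by stability analyses.
    return c / t

def eta_inverse_sqrt_t(t, c=1.0):
    # Slower decay of order 1/sqrt(t).
    return c / math.sqrt(t)

def cumulative_stepsize(schedule, T):
    # Sum of stepsizes over iterations t = 1, ..., T.
    return sum(schedule(t) for t in range(1, T + 1))

T = 10_000  # hypothetical training horizon
sum_fast = cumulative_stepsize(eta_inverse_t, T)
sum_slow = cumulative_stepsize(eta_inverse_sqrt_t, T)
print(f"sum of 1/t stepsizes:       {sum_fast:.2f}")
print(f"sum of 1/sqrt(t) stepsizes: {sum_slow:.2f}")
```

The $1/t$ schedule accumulates only about $\log T$ of total stepsize, while the $1/\sqrt{t}$ schedule accumulates about $2\sqrt{T}$, which is one way to see why the faster decay can impede optimization.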

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks · Neural Networks and Applications