Generalization error bounds for two-layer neural networks with Lipschitz loss function

Jiang Yu Nguwi; Nicolas Privault

arXiv:2604.06281·stat.ML·April 9, 2026

TL;DR

This paper establishes generalization error bounds for two-layer neural networks with Lipschitz loss functions: a dimension-free bound when the test data are independent, and a dimension-dependent bound without an independence assumption.

Contribution

It introduces novel Wasserstein-based generalization bounds that do not require bounded loss functions and can be computed before training.
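One standard way a Lipschitz loss connects to a Wasserstein distance is Kantorovich-Rubinstein duality, sketched below. The symbols are assumptions introduced here for illustration: $\mu$ is the data distribution, $\mu_n$ the empirical measure of samples $Z_1, \dots, Z_n$, $\ell$ a loss that is $L$-Lipschitz in the data point, and $W_1$ the Wasserstein-1 distance. Whether the paper's proofs follow exactly this route is not stated on this page.

$$
\left| \mathbb{E}_{Z \sim \mu}\big[\ell(Z)\big] - \frac{1}{n} \sum_{i=1}^{n} \ell(Z_i) \right| \le L \, W_1(\mu, \mu_n)
$$

The inequality needs $\ell$ to be Lipschitz but not bounded, and it holds simultaneously for every $L$-Lipschitz function, so the right-hand side does not depend on which such function is chosen.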

Findings

01. Dimension-free rate of $O(n^{-1/2})$ for independent test data.

02. Dimension-dependent rate of $O(n^{-1/(d_{in}+d_{out})})$ without independence.

03. Bounds are explicitly computable and validated by simulations.
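To make the difference between the two rates concrete, the following minimal sketch evaluates $n^{-1/2}$ and $n^{-1/(d_{in}+d_{out})}$ with all constants set to 1. The function names are hypothetical, and the paper's explicit coefficients are not reproduced here, so the numbers only indicate how the orders scale.

```python
import math


def rate_independent(n: int) -> float:
    """Order of the dimension-free rate, n^{-1/2}, with the constant set to 1."""
    return n ** -0.5


def rate_dependent(n: int, d_in: int, d_out: int) -> float:
    """Order of the dimension-dependent rate, n^{-1/(d_in + d_out)}, constant set to 1."""
    return n ** (-1.0 / (d_in + d_out))


def samples_for_tolerance(eps: float, exponent: float) -> int:
    """Smallest n with n^{-exponent} <= eps, up to floating-point rounding."""
    return math.ceil(eps ** (-1.0 / exponent))


if __name__ == "__main__":
    n = 10_000
    # Illustrative dimension pairs, chosen here and not taken from the paper.
    for d_in, d_out in [(1, 1), (4, 1), (10, 2)]:
        d = d_in + d_out
        print(
            f"d_in={d_in}, d_out={d_out}: "
            f"independent={rate_independent(n):.4f}, "
            f"dependent={rate_dependent(n, d_in, d_out):.4f}, "
            f"n for eps=0.1 (dependent)={samples_for_tolerance(0.1, 1.0 / d)}"
        )
```

With $d_{in} = d_{out} = 1$ the two exponents coincide; for larger dimensions the dimension-dependent order decays more slowly, so the sample size needed for a fixed tolerance grows quickly with $d_{in} + d_{out}$.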

Abstract

We derive generalization error bounds for the training of two-layer neural networks without assuming boundedness of the loss function, using Wasserstein distance estimates on the discrepancy between a probability distribution and its associated empirical measure, together with moment bounds for the associated stochastic gradient method. In the case of independent test data, we obtain a dimension-free rate of order $O(n^{-1/2})$ on the $n$-sample generalization error, whereas without independence assumption, we derive a bound of order $O(n^{-1/(d_{in}+d_{out})})$, where $d_{in}$, $d_{out}$ denote input and output dimensions. Our bounds and their coefficients can be explicitly computed prior to the training of the model, and are confirmed by numerical simulations.
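The Wasserstein discrepancy mentioned in the abstract can be illustrated in a deliberately simplified setting. The sketch below assumes one-dimensional data and two samples of equal size, where the Wasserstein-1 distance between the two empirical measures equals the mean absolute difference of the sorted samples. It is not the paper's estimator, and the helper names are hypothetical.

```python
import random


def w1_empirical(xs, ys):
    """Wasserstein-1 distance between two 1D empirical measures of equal size.

    In one dimension the optimal coupling matches sorted samples, so the
    distance is the mean absolute difference of the order statistics.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


def mean_gap(f, xs, ys):
    """Absolute difference between the sample means of f over xs and over ys."""
    return abs(sum(map(f, xs)) / len(xs) - sum(map(f, ys)) / len(ys))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for n in (100, 1_000, 10_000):
        xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
        ys = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # abs is 1-Lipschitz and unbounded, so its mean gap is at most W1.
        print(n, w1_empirical(xs, ys), mean_gap(abs, xs, ys))
```

For any 1-Lipschitz function, such as the absolute value used here, the gap between the two sample means cannot exceed the Wasserstein-1 distance, which is the mechanism that lets an unbounded Lipschitz loss be controlled by a Wasserstein estimate.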
