Lower Bounds on the Generalization Error of Nonlinear Learning Models
Inbar Seroussi, Ofer Zeitouni

TL;DR
This paper establishes fundamental lower bounds on the generalization error for nonlinear neural network models, revealing limitations of unbiased estimators and providing insights into the performance of biased estimators like SGD.
Contribution
It derives explicit lower bounds on the generalization error of general biased estimators, for linear regression and for two-layer networks, using random matrix theory. The bound is asymptotically tight in the linear case.
Findings
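Unbiased estimators have unacceptable performance for nonlinear networks when the layer sizes are commensurate with the number of training samples.
The lower bound is asymptotically tight in the linear case.
In the nonlinear case, the bounds are compared with an empirical study of stochastic gradient descent (SGD).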
Abstract
We study in this paper lower bounds for the generalization error of models derived from multi-layer neural networks, in the regime where the size of the layers is commensurate with the number of samples in the training data. We show that unbiased estimators have unacceptable performance for such nonlinear networks in this regime. We derive explicit generalization lower bounds for general biased estimators, in the cases of linear regression and of two-layered networks. In the linear case the bound is asymptotically tight. In the nonlinear case, we provide a comparison of our bounds with an empirical study of the stochastic gradient descent algorithm. The analysis uses elements from the theory of large random matrices.
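To make the regime in the abstract concrete, the sketch below looks at the simplest case, linear regression with the ordinary least squares (OLS) estimator, which is unbiased. It is not the paper's bound or method. It is a standard textbook illustration under assumptions not taken from the source: isotropic Gaussian inputs, Gaussian noise with standard deviation `sigma`, and more samples than parameters (`n > d + 1`). The function names are hypothetical. Under these assumptions the expected excess risk of OLS is `sigma**2 * d / (n - d - 1)`, which stays bounded away from zero when `d` grows in proportion to `n`.

```python
import numpy as np


def ols_excess_risk(n, d, sigma=1.0, trials=200, seed=0):
    """Monte Carlo estimate of E||beta_hat - beta||^2 for OLS.

    Assumes (for illustration only) an isotropic Gaussian design, so the
    squared parameter error equals the excess prediction risk on a fresh
    test point x ~ N(0, I_d).
    """
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(trials):
        X = rng.standard_normal((n, d))             # training inputs
        beta = rng.standard_normal(d) / np.sqrt(d)  # ground-truth weights
        y = X @ beta + sigma * rng.standard_normal(n)
        beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)  # unbiased OLS fit
        errors.append(np.sum((beta_hat - beta) ** 2))
    return float(np.mean(errors))


def ols_excess_risk_theory(n, d, sigma=1.0):
    """Exact expected excess risk of OLS for Gaussian design, n > d + 1."""
    return sigma**2 * d / (n - d - 1)


if __name__ == "__main__":
    n = 200
    for d in (20, 100, 180):
        print(d / n, ols_excess_risk(n, d), ols_excess_risk_theory(n, d))
```

Running the script prints the ratio `d / n` next to the simulated and theoretical risk. The risk grows sharply as `d` approaches `n`, which is the kind of behavior that motivates studying biased estimators in this regime.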
Taxonomy
Topics: Neural Networks and Applications · Machine Learning and ELM · Sparse and Compressive Sensing Techniques
Methods: Linear Regression
