Towards Initialization-dependent and Non-vacuous Generalization Bounds for Overparameterized Shallow Neural Networks
Yunwen Lei, Yufeng Xie

TL;DR
This paper develops new initialization-dependent generalization bounds for overparameterized shallow neural networks using the path-norm, addressing the vacuousness of prior bounds based on the Frobenius norm.
Contribution
It introduces a novel path-norm-based complexity measure and a peeling technique to derive non-vacuous bounds for shallow neural networks.
Findings
The bounds depend on the path-norm of the distance from initialization
A lower bound is developed that is tight up to a constant factor
Empirical results show non-vacuous bounds for overparameterized networks
Abstract
Overparameterized neural networks often show a benign overfitting property in the sense of achieving excellent generalization behavior despite the number of parameters exceeding the number of training examples. A promising direction to explain benign overfitting is to relate generalization to the norm of the distance from initialization, motivated by the empirical observation that this distance is often significantly smaller than the norm itself. However, the existing initialization-dependent complexity analyses measure the distance from initialization by the Frobenius norm, and often imply vacuous bounds in practice for overparameterized models. In this paper, we develop initialization-dependent complexity bounds for shallow neural networks with general Lipschitz activation functions. Our bounds depend on the path-norm of the distance from initialization, which are derived by introducing a…
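To make the contrast between the two distance measures concrete, here is a minimal sketch in plain Python. It assumes a shallow network of the form f(x) = sum_j a_j * sigma(w_j . x) and uses one common path-norm-style quantity, sum_j |a_j| * ||w_j - w0_j||_2, as a stand-in. The function names, the toy weights, and this particular definition are illustrative assumptions; the paper's exact complexity measure may differ.

```python
import math

def frobenius_distance(W, W0):
    """Frobenius norm of W - W0, where each row is one hidden unit's weights."""
    return math.sqrt(sum((w - w0) ** 2
                         for row, row0 in zip(W, W0)
                         for w, w0 in zip(row, row0)))

def path_norm_distance(a, W, W0):
    """Assumed path-norm-style distance: sum_j |a_j| * ||w_j - w0_j||_2.

    This is an illustrative definition, not necessarily the one in the paper.
    """
    return sum(abs(a_j) * math.sqrt(sum((w - w0) ** 2
                                        for w, w0 in zip(row, row0)))
               for a_j, row, row0 in zip(a, W, W0))

# Hypothetical toy network: 4 hidden units, 2 inputs, output weights 1/m.
m = 4
a = [1.0 / m] * m
W0 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]  # initialization
W = [[4.0, 4.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]   # only unit 0 moved, by (3, 4)

print(frobenius_distance(W, W0))     # 5.0
print(path_norm_distance(a, W, W0))  # 1.25
```

In this toy case the path-norm-style distance weights each hidden unit's movement by the magnitude of its output weight, so it scales differently from the Frobenius distance as the width m changes. Whether that yields a smaller complexity term in practice depends on the trained network and on the definition actually used.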
