TL;DR
This review paper shows how global convergence guarantees can be derived for gradient descent on two-layer neural networks with homogeneous activations as the number of hidden neurons tends to infinity, and what this implies for their optimization and generalization properties.
Contribution
It shows that, as the number of hidden neurons tends to infinity, gradient descent on two-layer neural networks with homogeneous activations admits qualitative convergence guarantees, even though the underlying optimization problem is non-convex.
Findings
Global convergence guarantees for infinitely wide neural networks
Insights into the optimization landscape of large neural networks
Theoretical understanding of generalization in wide neural networks
Abstract
Many supervised machine learning methods are naturally cast as optimization problems. For prediction models which are linear in their parameters, this often leads to convex problems for which many mathematical guarantees exist. Models which are non-linear in their parameters such as neural networks lead to non-convex optimization problems for which guarantees are harder to obtain. In this review paper, we consider two-layer neural networks with homogeneous activation functions where the number of hidden neurons tends to infinity, and show how qualitative convergence guarantees may be derived.
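To make the setting concrete, here is a minimal NumPy sketch of a two-layer ReLU network whose output is an average over m hidden neurons, trained by plain gradient descent on a squared loss. It is an illustration under stated assumptions, not the paper's implementation: the function names (`init_params`, `predict`, `loss`, `gradient_step`), the Gaussian initialization, the squared loss, and the step size scaled by m are all choices made for this example. ReLU is positively homogeneous, so each neuron's contribution scales quadratically when its input and output weights are both scaled by a positive constant.

```python
import numpy as np

def init_params(m, d, seed=0):
    """Gaussian initialization (an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((m, d))  # input weights, one row per hidden neuron
    a = rng.standard_normal(m)       # output weights
    return W, a

def predict(W, a, X):
    """Two-layer ReLU network, averaged over the m hidden neurons."""
    H = np.maximum(X @ W.T, 0.0)     # (n, m) hidden activations
    return H @ a / W.shape[0]

def loss(W, a, X, y):
    """Half mean squared error (illustrative choice of loss)."""
    return 0.5 * np.mean((predict(W, a, X) - y) ** 2)

def gradient_step(W, a, X, y, lr):
    """One full-batch gradient descent step on the loss above."""
    n, m = X.shape[0], W.shape[0]
    Z = X @ W.T                      # (n, m) pre-activations
    H = np.maximum(Z, 0.0)
    r = (H @ a / m - y) / n          # residuals, already divided by n
    grad_a = H.T @ r / m             # (m,)
    grad_W = (r[:, None] * (Z > 0) * a[None, :]).T @ X / m  # (m, d)
    # Assumption: the step is multiplied by m so that each neuron moves at a
    # rate that does not vanish as the width grows.
    return W - lr * m * grad_W, a - lr * m * grad_a
```

Increasing `m` in `init_params` approximates the wide regime the abstract refers to; the sketch makes no claim about how large `m` must be for the limiting behavior to appear.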
