Non-asymptotic approximations of neural networks by Gaussian processes
Ronen Eldan, Dan Mikulincer, Tselil Schramm

TL;DR
This paper quantifies how closely wide, randomly initialized neural networks are approximated by Gaussian processes, giving explicit convergence rates that depend on properties of the activation function.
Contribution
It establishes explicit, non-asymptotic rates for the convergence of wide, randomly initialized neural networks to Gaussian processes. The rate depends on the degree of the activation function when it is polynomial, and on its smoothness otherwise.
Findings
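For polynomial activations, the degree determines the convergence rate; for non-polynomial activations, the rate is governed by the smoothness of the function.
Explicit rates are established for the central limit theorem in an infinite-dimensional functional space, metrized with a natural transportation distance.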
Results apply to wide neural networks with random initialization.
Abstract
We study the extent to which wide neural networks may be approximated by Gaussian processes when initialized with random weights. It is a well-established fact that as the width of a network goes to infinity, its law converges to that of a Gaussian process. We make this quantitative by establishing explicit convergence rates for the central limit theorem in an infinite-dimensional functional space, metrized with a natural transportation distance. We identify two regimes of interest: when the activation function is polynomial, its degree determines the rate of convergence, while for non-polynomial activations, the rate is governed by the smoothness of the function.
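The qualitative statement in the abstract, that a randomly initialized network looks more Gaussian as its width grows, can be illustrated numerically. The sketch below is not the paper's method: it assumes a two-layer network f(x) = n^{-1/2} Σ a_i σ(⟨w_i, x⟩) with i.i.d. standard Gaussian weights (a common setup, chosen here for illustration), looks only at the output at a single fixed input rather than at the whole random function, and uses excess kurtosis as a crude stand-in for the transportation distance studied in the paper. All names (`sample_network_outputs`, `excess_kurtosis`, `results`) are hypothetical.

```python
import numpy as np

def sample_network_outputs(width, x, activation, n_samples, rng):
    """Draw n_samples independent random networks and evaluate each at x.

    Assumed model: f(x) = width^{-1/2} * sum_i a_i * activation(<w_i, x>),
    with all a_i and entries of w_i i.i.d. standard Gaussian.
    """
    d = x.shape[0]
    W = rng.standard_normal((n_samples, width, d))   # hidden-layer weights
    a = rng.standard_normal((n_samples, width))      # output-layer weights
    pre = W @ x                                      # shape (n_samples, width)
    return (a * activation(pre)).sum(axis=1) / np.sqrt(width)

def excess_kurtosis(v):
    """Empirical excess kurtosis; equals 0 for an exactly Gaussian sample."""
    v = v - v.mean()
    return (v ** 4).mean() / (v ** 2).mean() ** 2 - 3.0

rng = np.random.default_rng(0)
x = np.array([1.0, 0.0, 0.0])   # fixed unit-norm input (arbitrary choice)

results = {}
for width in [1, 4, 16, 64]:
    outputs = sample_network_outputs(width, x, np.tanh, 20_000, rng)
    results[width] = excess_kurtosis(outputs)
    print(f"width={width:3d}  excess kurtosis={results[width]:+.3f}")
```

Under these assumptions the excess kurtosis of a sum of n i.i.d. terms scales like 1/n, so the printed values should shrink toward zero as the width increases. This checks only a one-dimensional marginal; the paper's rates concern the law of the network as a random function.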
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Neural Networks and Applications · Stochastic Gradient Optimization Techniques
