A mean-field limit for certain deep neural networks
Dyego Araújo, Roberto I. Oliveira, Daniel Yukimura

TL;DR
This paper establishes a mean-field limit for deep neural networks trained by stochastic gradient descent, giving a rigorous mathematical description of their training dynamics as the number of neurons per layer grows large.
Contribution
It extends previous mean-field analyses from shallow networks to deep networks with several inner layers, rigorously deriving the limiting behavior and proving existence and uniqueness for the associated McKean-Vlasov problem.
Findings
Network weights are approximated by ideal particles described by a mean-field model.
In the limit of many neurons per layer, the mean-field model describes the evolution of the deep neural network during training.
The associated McKean-Vlasov problem is rigorously shown to have a solution, and that solution is unique.
Abstract
Understanding deep neural networks (DNNs) is a key challenge in the theory of machine learning, with potential applications to the many fields where DNNs have been successfully used. This article presents a scaling limit for a DNN being trained by stochastic gradient descent. Our networks have a fixed (but arbitrary) number $L$ of inner layers; $N$ neurons per layer; full connections between layers; and fixed weights (or "random features" that are not trained) near the input and output. Our results describe the evolution of the DNN during training in the limit when $N \to +\infty$, which we relate to a mean field model of McKean-Vlasov type. Specifically, we show that network weights are approximated by certain "ideal particles" whose distribution and dependencies are described by the mean-field model. A key part of the proof is to show existence and uniqueness for our…
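To make the scaling in the abstract concrete, here is a minimal sketch of a fully connected network whose layers average over their N incoming neurons. It is not the paper's construction: the 1/N normalization, the tanh activation, the Gaussian initialization with mean 1, and the names `init_weights`, `forward`, and `output_spread` are all assumptions chosen for illustration. It only looks at the network at initialization, not at SGD training, and checks numerically that the output fluctuates less from one random draw to the next as N grows.

```python
import numpy as np

def init_weights(rng, d_in, n, n_inner):
    """Draw i.i.d. Gaussian weights for n_inner hidden layers of width n (assumed init)."""
    w_in = rng.normal(1.0, 1.0, size=(n, d_in))   # random features near the input
    w_mid = [rng.normal(1.0, 1.0, size=(n, n))    # inner layer-to-layer weights
             for _ in range(n_inner - 1)]
    w_out = rng.normal(1.0, 1.0, size=n)          # random features near the output
    return w_in, w_mid, w_out

def forward(x, w_in, w_mid, w_out):
    """Forward pass with an assumed 1/N average over incoming neurons."""
    h = np.tanh(w_in @ x)
    for w in w_mid:
        h = np.tanh(w @ h / h.size)               # average, not sum, over N neurons
    return float(w_out @ h / h.size)

def output_spread(n, n_trials=30, n_inner=3, seed=0):
    """Standard deviation of the output over independent random initializations."""
    rng = np.random.default_rng(seed)
    x = np.array([1.0, -0.5, 2.0])                # arbitrary fixed input
    outs = [forward(x, *init_weights(rng, x.size, n, n_inner))
            for _ in range(n_trials)]
    return float(np.std(outs))

if __name__ == "__main__":
    for n in (10, 100, 400):
        print(n, output_spread(n))
```

Under these assumptions the printed spread should shrink as N increases, which is the kind of concentration that lets individual weights be compared with "ideal particles" in a large-N limit.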
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Markov Chains and Monte Carlo Methods · Mathematical Approximation and Integration
