Limiting fluctuation and trajectorial stability of multilayer neural networks with mean field training
Huy Tuan Pham, Phan-Minh Nguyen

TL;DR
This paper develops a theoretical framework to analyze the fluctuations and stability of multilayer neural networks during training, extending mean field theory to capture complex inter-layer stochastic dependencies and demonstrating that training biases solutions towards minimal fluctuation in large-width regimes.
Contribution
The paper introduces the second-order mean field (MF) limit for multilayer networks, which captures the distribution of fluctuations and their interactions across layers. Previous fluctuation analyses covered only shallow networks.
Findings
Fluctuations in multilayer networks can be systematically characterized by the second-order MF limit.
Gradient descent training biases solutions towards minimal fluctuation in the large-width limit.
The framework applies to general loss functions beyond convex cases.
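To give a feel for what "fluctuation around the infinite-width limit" means, the following is a minimal numerical sketch. It is not the paper's second-order MF limit: it assumes a shallow (two-layer) network at random initialization only, with a 1/N output scaling, a tanh activation, and standard Gaussian weights, all chosen here for illustration. Under those assumptions the infinite-width output is 0, and the deviation of a width-N network is expected to shrink like N^{-1/2}, so the rescaled quantity sqrt(N) * std should stay roughly constant as the width grows.

```python
import numpy as np


def scaled_fluctuation_std(width, trials=2000, x=1.0, seed=0):
    """Estimate sqrt(width) * std of a mean-field-scaled two-layer output.

    Hypothetical toy model (an assumption, not taken from the paper):
        f_N(x) = (1/N) * sum_i a_i * tanh(w_i * x),
    with a_i, w_i i.i.d. standard Gaussian, evaluated at initialization.
    Since E[a_i] = 0, the infinite-width output is 0, and f_N(x) itself
    is the fluctuation around that limit.
    """
    rng = np.random.default_rng(seed)
    a = rng.standard_normal((trials, width))  # output-layer weights
    w = rng.standard_normal((trials, width))  # hidden-layer weights
    outputs = (a * np.tanh(w * x)).mean(axis=1)  # one f_N(x) per trial
    return np.sqrt(width) * outputs.std()


# Rescaled fluctuation size at several widths.
results = {n: scaled_fluctuation_std(n) for n in (100, 400, 1600)}
for n, s in results.items():
    print(f"width={n:5d}  sqrt(N) * std = {s:.3f}")
```

The printed values should be close to each other across widths, which is the Central Limit Theorem scaling in this toy setting. The paper's contribution concerns the much harder case where this fluctuation evolves through training and is correlated across layers, which this sketch does not attempt to reproduce.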
Abstract
The mean field (MF) theory of multilayer neural networks centers around a particular infinite-width scaling, where the learning dynamics is closely tracked by the MF limit. A random fluctuation around this infinite-width limit is expected from a large-width expansion to the next order. This fluctuation has been studied only in shallow networks, where previous works employ heavily technical notions or additional formulation ideas amenable only to that case. Treatment of the multilayer case has been missing, with the chief difficulty in finding a formulation that captures the stochastic dependency across not only time but also depth. In this work, we initiate the study of the fluctuation in the case of multilayer networks, at any network depth. Leveraging on the neuronal embedding framework recently introduced by Nguyen and Pham, we systematically derive a system of dynamical equations,…
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Neural Networks and Applications · Machine Learning and ELM
