Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks
Cong Fang, Jason D. Lee, Pengkun Yang, Tong Zhang

TL;DR
This paper introduces a mean-field framework that represents an over-parameterized deep neural network by probability measures over its features rather than over its parameters, and uses that representation to analyze training dynamics and global convergence.
Contribution
The paper develops a mean-field representation over features that avoids the degenerate case in which each middle layer effectively has only one meaningful hidden unit, and it provides the first global convergence proof for multi-layer over-parameterized DNNs.
Findings
The framework applies to both standard DNN and Res-Net architectures.
Neural feature flow captures the training dynamics of an over-parameterized DNN.
Global convergence is proved for over-parameterized Res-Net.
Abstract
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs), which can be used to analyze neural network training. In this framework, a DNN is represented by probability measures and functions over its features (that is, the function values of the hidden units over the training data) in the continuous limit, instead of the neural network parameters as most existing studies have done. This new representation overcomes the degenerate situation where all the hidden units essentially have only one meaningful hidden unit in each middle layer, and further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization. Moreover, we construct a non-linear dynamics called neural feature flow, which captures the evolution of an over-parameterized DNN trained by…
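To make the notion of a "feature" concrete, here is a minimal sketch, not the paper's construction: for one hidden layer, the feature of a hidden unit is taken to be the vector of that unit's outputs over the N training inputs. The function names (`hidden_features`, `network_output`), the tanh activation, the Gaussian toy data, and the 1/m output scaling are all assumptions made for illustration. The sketch shows that relabeling the hidden units leaves both the collection of features and the network output unchanged, which is the property that makes a description by a distribution over features possible.

```python
import math
import random


def hidden_features(X, W, activation=math.tanh):
    """Return one feature vector per hidden unit.

    X: list of N training inputs, each a list of d floats.
    W: list of m weight vectors, each a list of d floats.
    The feature of unit j is the N-tuple (h(w_j . x_1), ..., h(w_j . x_N)).
    """
    return [
        tuple(activation(sum(wi * xi for wi, xi in zip(w, x))) for x in X)
        for w in W
    ]


def network_output(features, u):
    """Combine unit features with output weights u, using an assumed 1/m scaling."""
    m = len(features)
    n = len(features[0])
    return [sum(u[j] * features[j][i] for j in range(m)) / m for i in range(n)]


# Toy data (hypothetical): N samples of dimension d, m hidden units.
rng = random.Random(0)
N, d, m = 4, 3, 5
X = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(N)]
W = [[rng.gauss(0, 1) for _ in range(d)] for _ in range(m)]
u = [rng.gauss(0, 1) for _ in range(m)]

feats = hidden_features(X, W)
out = network_output(feats, u)

# Relabel the hidden units: permute the features together with their output weights.
perm = list(range(m))
rng.shuffle(perm)
feats_p = [feats[j] for j in perm]
u_p = [u[j] for j in perm]
out_p = network_output(feats_p, u_p)

# The multiset of features and the network output are both unchanged.
same_features = sorted(feats) == sorted(feats_p)
same_output = all(abs(a - b) < 1e-12 for a, b in zip(out, out_p))
```

In this toy setting each feature lives in an N-dimensional space fixed by the training set, regardless of how many units m the layer has, so increasing m only adds more samples to the empirical distribution of features.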
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Model Reduction and Neural Networks
