Modeling from Features: a Mean-field Framework for Over-parameterized Deep Neural Networks

Cong Fang; Jason D. Lee; Pengkun Yang; Tong Zhang

arXiv:2007.01452·stat.ML·July 6, 2020

TL;DR

This paper introduces a mean-field framework for analyzing over-parameterized deep neural networks by representing them through probability measures over features, leading to new insights into training dynamics and convergence.

Contribution

It develops a novel mean-field representation over features, overcoming degeneracy issues, and provides the first global convergence proof for multi-layer over-parameterized DNNs.

Findings

01. Framework applies to standard DNN and Res-Net architectures.

02. Neural feature flow captures training dynamics.

03. Proves global convergence for over-parameterized Res-Net.

Abstract

This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs), which can be used to analyze neural network training. In this framework, a DNN is represented by probability measures and functions over its features (that is, the function values of the hidden units over the training data) in the continuous limit, instead of the neural network parameters as most existing studies have done. This new representation overcomes the degenerate situation where all the hidden units essentially have only one meaningful hidden unit in each middle layer, and further leads to a simpler representation of DNNs, for which the training objective can be reformulated as a convex optimization problem via suitable re-parameterization. Moreover, we construct a non-linear dynamics called neural feature flow, which captures the evolution of an over-parameterized DNN trained by…
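
As a rough illustration of what the abstract means by "features" (the function values of the hidden units over the training data), the sketch below computes one feature vector per hidden unit of a single layer. It is not the paper's construction: the names `hidden_unit_features` and `make_example`, the tanh activation, and the Gaussian toy data are all assumptions made for this example.

```python
import math
import random


def hidden_unit_features(X, W, activation=math.tanh):
    """Return one feature vector per hidden unit.

    X: list of N training inputs, each a list of d floats.
    W: list of m weight vectors (one per hidden unit), each of length d.
    The feature of unit j is the vector of its outputs on all N inputs.
    """
    features = []
    for w in W:
        unit_outputs = [
            activation(sum(wi * xi for wi, xi in zip(w, x))) for x in X
        ]
        features.append(unit_outputs)
    return features


def make_example(n_samples=4, dim=3, width=5, seed=0):
    """Build hypothetical toy data: Gaussian inputs and Gaussian weights."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n_samples)]
    W = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(width)]
    return X, W


X, W = make_example()
features = hidden_unit_features(X, W)

# Permuting the hidden units reorders the feature vectors but leaves the
# collection of feature vectors (the empirical distribution) unchanged.
permuted_features = hidden_unit_features(X, W[::-1])
same_collection = sorted(features) == sorted(permuted_features)
```

Each hidden unit is described here by a point in R^N (N being the number of training samples), so a layer of width m corresponds to an empirical distribution of m such points. This is one way to read the abstract's "probability measures ... over its features", under the assumptions stated above.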

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Machine Learning and ELM · Model Reduction and Neural Networks