Mathematical Models of Overparameterized Neural Networks

Cong Fang; Hanze Dong; Tong Zhang

arXiv:2012.13982·cs.LG·December 29, 2020

TL;DR

This paper reviews recent theoretical advances in the analysis of overparameterized neural networks, focusing on the mathematical models used to study them and on what those models imply for two-layer networks.

Contribution

It summarizes key mathematical models and progress in analyzing overparameterized neural networks, highlighting their convex-like behavior and algorithmic implications.

Findings

01 Overparameterized networks behave like convex systems in certain restricted settings

02 The neural tangent kernel provides a framework for local analysis around specialized initializations

03 Recent progress improves the understanding of two-layer neural networks
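As an illustration of the neural tangent kernel finding, here is a minimal sketch of the empirical tangent kernel of a two-layer network, K(x, x') = <grad f(x), grad f(x')>, where the gradient is taken with respect to all parameters at initialization. Everything specific here is an assumption made for illustration and is not taken from the paper or its repository: the ReLU activation, the 1/sqrt(m) output scaling, the Gaussian and random-sign initialization, and the helper names `init_params`, `forward`, `grad`, and `empirical_ntk`.

```python
import math
import random


def init_params(m, d, seed=0):
    """Hypothetical initialization: Gaussian hidden weights, random-sign output weights."""
    rng = random.Random(seed)
    W = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]
    a = [rng.choice([-1.0, 1.0]) for _ in range(m)]
    return W, a


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def forward(W, a, x):
    """Two-layer ReLU network: f(x) = (1/sqrt(m)) * sum_j a_j * relu(w_j . x)."""
    m = len(W)
    return sum(aj * max(dot(wj, x), 0.0) for wj, aj in zip(W, a)) / math.sqrt(m)


def grad(W, a, x):
    """Flattened gradient of f(x) with respect to (a, W), length m + m*d."""
    m = len(W)
    scale = 1.0 / math.sqrt(m)
    # d f / d a_j = relu(w_j . x) / sqrt(m)
    g_a = [scale * max(dot(wj, x), 0.0) for wj in W]
    # d f / d w_j = a_j * 1[w_j . x > 0] * x / sqrt(m)
    g_W = []
    for wj, aj in zip(W, a):
        active = 1.0 if dot(wj, x) > 0.0 else 0.0
        g_W.extend(scale * aj * active * xi for xi in x)
    return g_a + g_W


def empirical_ntk(W, a, x1, x2):
    """Empirical tangent kernel: inner product of parameter gradients at x1 and x2."""
    return dot(grad(W, a, x1), grad(W, a, x2))


if __name__ == "__main__":
    W, a = init_params(m=64, d=3)
    x, y = [1.0, 0.0, 0.0], [0.6, 0.8, 0.0]
    print("K(x, x) =", empirical_ntk(W, a, x, x))
    print("K(x, y) =", empirical_ntk(W, a, x, y))
```

Because the kernel is a Gram matrix of gradient vectors, it is symmetric and positive semidefinite by construction. Local analysis in this setting studies the model obtained by linearizing the network in its parameters around the initialization, for which this kernel is the associated feature inner product.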

Abstract

Deep learning has achieved considerable empirical success in recent years. However, while practitioners have discovered many ad hoc tricks, until recently there has been a lack of theoretical understanding of the tricks invented in the deep learning literature. Practitioners have long known that overparameterized neural networks are easy to learn, and in the past few years there have been important theoretical developments in the analysis of such networks. In particular, it was shown that these systems behave like convex systems under various restricted settings, such as for two-layer NNs, and when learning is restricted locally to the so-called neural tangent kernel space around specialized initializations. This paper discusses some of this recent progress, which has led to a significantly better understanding of neural networks. We will focus on the analysis of two-layer…

Code & Models

Repositories

hendrydong/NTK-and-MF-examples (PyTorch, official)