A Priori Estimates of the Population Risk for Two-layer Neural Networks

Weinan E; Chao Ma; Lei Wu

arXiv:1810.06397·stat.ML·February 24, 2020

TL;DR

This paper derives nearly optimal a priori estimates of the population risk for two-layer neural networks. The bounds depend only on norms of the function being fitted, remain effective in the over-parametrized regime, and offer a perspective on why two-layer neural networks perform better than the related kernel methods.

Contribution

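The paper establishes new a priori population risk estimates for two-layer neural networks: the bounds depend only on norms of the function being fitted, not on the parameters of the model, whereas most existing results are a posteriori. The estimates are nearly optimal and apply equally in the over-parametrized regime.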

Findings

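01

The error rates in the estimates scale in the same way as Monte Carlo error rates.

02

The estimates remain effective in the over-parametrized regime, where the network size is much larger than the size of the dataset.

03

The estimates provide a perspective on why two-layer neural networks perform better than the related kernel methods.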

Abstract

New estimates for the population risk are established for two-layer neural networks. These estimates are nearly optimal in the sense that the error rates scale in the same way as the Monte Carlo error rates. They are equally effective in the over-parametrized regime when the network size is much larger than the size of the dataset. These new estimates are a priori in nature in the sense that the bounds depend only on some norms of the underlying functions to be fitted, not the parameters in the model, in contrast with most existing results which are a posteriori in nature. Using these a priori estimates, we provide a perspective for understanding why two-layer neural networks perform better than the related kernel methods.
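
As a rough illustration of what a Monte Carlo error rate looks like in practice, the sketch below estimates the mean of U^2 for U uniform on [0, 1] from n samples and measures how the root-mean-square error shrinks as n grows. This toy experiment is not taken from the paper: the integrand, the function names `mc_estimate` and `rms_error`, and the sample sizes are all assumptions chosen for illustration. It shows only the standard 1/sqrt(n) scaling of sampling error, not the paper's bounds themselves.

```python
import math
import random


def mc_estimate(n, rng):
    """Average of U**2 over n uniform samples on [0, 1]; the exact mean is 1/3."""
    return sum(rng.random() ** 2 for _ in range(n)) / n


def rms_error(n, trials, seed=0):
    """Root-mean-square error of the n-sample estimate over independent trials."""
    rng = random.Random(seed)  # fixed seed so the experiment is repeatable
    exact = 1.0 / 3.0
    sq_errors = [(mc_estimate(n, rng) - exact) ** 2 for _ in range(trials)]
    return math.sqrt(sum(sq_errors) / trials)


if __name__ == "__main__":
    # Hypothetical sample sizes: 25x more samples should give roughly 5x less error.
    for n in (100, 2500):
        print(n, rms_error(n, trials=200))
```

Under the 1/sqrt(n) rate, increasing the sample size by a factor of 25 should reduce the error by a factor of about 5, up to the noise of the finite number of trials.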
