A Priori Estimates of the Population Risk for Residual Networks

Weinan E; Chao Ma; Qingcan Wang

arXiv:1903.02154 · cs.LG · June 3, 2019 · 42 cites

TL;DR

This paper derives a priori estimates of the population risk (generalization error) for regularized residual networks, using a new weighted path norm as the regularizer. The bounds depend only on the target function, and in high dimensions both the approximation and estimation errors are comparable to Monte Carlo rates.

Contribution

It introduces a new weighted path norm for residual networks and establishes optimal a priori generalization bounds that depend solely on the target function.

Findings

01. Provides high-dimensional error bounds comparable to Monte Carlo rates

02. Establishes an optimal Rademacher complexity bound for residual networks

03. Demonstrates the effectiveness of the weighted path norm in regularization

Abstract

Optimal a priori estimates are derived for the population risk, also known as the generalization error, of a regularized residual network model. An important part of the regularized model is the usage of a new path norm, called the weighted path norm, as the regularization term. The weighted path norm treats the skip connections and the nonlinearities differently so that paths with more nonlinearities are regularized by larger weights. The error estimates are a priori in the sense that the estimates depend only on the target function, not on the parameters obtained in the training process. The estimates are optimal, in a high dimensional setting, in the sense that both the bound for the approximation and estimation errors are comparable to the Monte Carlo error rates. A crucial step in the proof is to establish an optimal bound for the Rademacher complexity of the residual networks.…
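
The abstract describes the weighted path norm only in words, so the following is a minimal sketch of how such a norm could be computed, not the paper's definition. It assumes a residual network of the form h_0 = V x, h_l = h_{l-1} + U_l σ(W_l h_{l-1}), f(x) = uᵀ h_L; the names `V`, `Ws`, `Us`, `u` and the function `weighted_path_norm` are hypothetical. The penalty factor `c` applied each time a path passes through a nonlinearity is also an assumption (any value above 1 makes paths with more nonlinearities cost more); the default of 3 is illustrative and is not stated on this page.

```python
import numpy as np


def weighted_path_norm(V, Ws, Us, u, c=3.0):
    """Sum of absolute path weights, with a factor c per nonlinearity crossed.

    Assumed shapes (hypothetical notation, not taken from the page):
      V  : (D, d)  input embedding
      Ws : list of (m, D) matrices, one per residual block
      Us : list of (D, m) matrices, one per residual block
      u  : (D,)    output weights
    Computes || |u|^T (I + c|U_L||W_L|) ... (I + c|U_1||W_1|) |V| ||_1.
    """
    row = np.abs(u)  # accumulated path weights, propagated from output to input
    for W, U in zip(reversed(Ws), reversed(Us)):
        # "row" alone is the skip path; the second term is the path through
        # one nonlinearity in this block, weighted by c.
        row = row + c * (row @ np.abs(U)) @ np.abs(W)
    return float(np.sum(row @ np.abs(V)))


# Toy check with scalar weights equal to 1 and two residual blocks:
# each block multiplies the accumulated weight by (1 + c).
one = np.ones((1, 1))
toy = weighted_path_norm(one, [one, one], [one, one], np.ones(1), c=3.0)
print(toy)  # 16.0, i.e. (1 + 3) ** 2
```

Setting `c=1.0` in this sketch recovers an unweighted path norm, which for the toy network gives 4.0; the gap between 4.0 and 16.0 is the extra cost placed on paths that cross nonlinearities.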

Taxonomy

Topics: Probabilistic and Robust Engineering Design · Sparse and Compressive Sensing Techniques · Statistical Methods and Inference