A priori generalization error for two-layer ReLU neural network through minimum norm solution

Zhi-Qin John Xu; Jiwei Zhang; Yaoyu Zhang; Chengchao Zhao

arXiv:1912.03011 · cs.LG · May 8, 2020 · 1 citation

Abstract

We estimate the a priori generalization error of two-layer ReLU neural networks (NNs) trained with mean squared error loss, that is, an error estimate that depends only on the initial parameters and the target function. The argument proceeds in steps. First, we estimate the a priori generalization error of a finite-width two-layer ReLU NN constrained to the minimum-norm solution, which Zhang et al. (2019) proved to be equivalent to the solution of a finite-width two-layer NN linearized with respect to its parameters. As the width goes to infinity, the linearized NN converges to the NN in the Neural Tangent Kernel (NTK) regime (Jacot et al., 2018), so we can derive the a priori generalization error of a two-layer ReLU NN in the NTK regime. The distance between an NN in the NTK regime and a finite-width NN trained by gradient-based training is estimated by Arora et al. (2019). Based on the results in Arora et al. (2019), our…
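
The minimum-norm step described in the abstract can be illustrated with a small numerical sketch. It is not the authors' code. It assumes a network of the form f(x) = (1/sqrt(m)) Σ_k a_k ReLU(w_k · x) with Gaussian initialization (the 1/sqrt(m) scaling and all function names are assumptions made for illustration), linearizes it around the initial parameters, and picks the interpolating parameter update of smallest Euclidean norm via the pseudoinverse of the Jacobian.

```python
import numpy as np


def init_params(width, dim, rng):
    # Hypothetical Gaussian initialization; the paper's scheme may differ.
    W = rng.normal(size=(width, dim))
    a = rng.normal(size=width)
    return W, a


def forward(X, W, a):
    # Assumed model: f(x) = (1/sqrt(m)) * sum_k a_k * relu(w_k . x)
    m = W.shape[0]
    return np.maximum(X @ W.T, 0.0) @ a / np.sqrt(m)


def jacobian(X, W, a):
    # Derivative of the outputs with respect to all parameters (a, then W),
    # evaluated at (W, a). Shape: (n_samples, m + m * dim).
    n, m = X.shape[0], W.shape[0]
    pre = X @ W.T
    d_a = np.maximum(pre, 0.0) / np.sqrt(m)
    gate = (pre > 0).astype(float)
    d_W = (gate * a)[:, :, None] * X[:, None, :] / np.sqrt(m)
    return np.concatenate([d_a, d_W.reshape(n, -1)], axis=1)


def min_norm_update(X, y, W, a):
    # Smallest-norm parameter change delta such that the linearized model
    # f(X; theta0) + J @ delta fits the labels (exactly, if J has full row rank).
    J = jacobian(X, W, a)
    residual = y - forward(X, W, a)
    delta = np.linalg.pinv(J) @ residual
    return delta, J


def linearized_predict(X_new, W, a, delta):
    # First-order Taylor expansion of the network around the initial parameters.
    return forward(X_new, W, a) + jacobian(X_new, W, a) @ delta
```

With more parameters than samples, the linearized model fits the training labels exactly, and the prediction at a new point depends only on the initial parameters and the training data, which is the sense in which the resulting error estimate is a priori.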

Taxonomy

Topics: Machine Learning and ELM · Neural Networks and Applications · Stochastic Gradient Optimization Techniques

Methods: Neural Tangent Kernel