# A Comparative Analysis of the Optimization and Generalization Property of Two-layer Neural Network and Random Feature Models Under Gradient Descent Dynamics

**Authors:** Weinan E, Chao Ma, Lei Wu

arXiv: 1904.04326 · 2020-02-27

## TL;DR

This paper analyzes gradient descent training of two-layer neural networks with both layers updated, alongside random feature models. In the over-parametrized regime the training loss decays to zero exponentially fast, and the trained network stays uniformly close to a kernel method throughout training.

## Contribution

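The paper provides a fairly comprehensive theoretical analysis of gradient descent dynamics for two-layer neural networks when the parameters in both layers are updated. It covers general initialization schemes and general regimes of network width and training data size, and it gives both convergence rates and generalization error estimates.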

## Key findings

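- In the over-parametrized regime, gradient descent drives the training loss to zero exponentially fast, regardless of the quality of the labels.
- Throughout training, the functions represented by the neural network remain uniformly close to those of a kernel method.
- For general network widths and training data sizes, sharp generalization error estimates are established for target functions in the appropriate reproducing kernel Hilbert space.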
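
The first finding above can be illustrated with a small numerical sketch. The code below is not taken from the paper. It assumes a ReLU network `f(x) = sum_k a[k] * relu(W[k] . x)`, a Gaussian initialization, the loss `1/(2n) * sum_i (f(x_i) - y_i)^2`, and arbitrary values for the width `m`, learning rate `lr`, and step count. The names `make_data` and `train_two_layer` are hypothetical. It runs full-batch gradient descent on both layers at once, fitting random ±1 labels, and records the training loss so that its decay can be inspected.

```python
import math
import random


def make_data(n=8, d=3, seed=1):
    """Random unit-norm inputs with random +/-1 labels (illustrative only)."""
    rng = random.Random(seed)
    X = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in v))
        X.append([c / norm for c in v])
    y = [rng.choice([-1.0, 1.0]) for _ in range(n)]
    return X, y


def train_two_layer(X, y, m=100, lr=0.02, steps=300, seed=0):
    """Full-batch gradient descent on f(x) = sum_k a[k] * relu(W[k] . x).

    Returns the training loss before each update, so the list has
    steps + 1 entries and losses[0] is the loss at initialization.
    """
    rng = random.Random(seed)
    n, d = len(X), len(X[0])
    # Assumed initialization (not the paper's scheme): Gaussian weights
    # scaled so that the initial network output is of order one.
    W = [[rng.gauss(0.0, 1.0) / math.sqrt(d) for _ in range(d)] for _ in range(m)]
    a = [rng.gauss(0.0, 1.0) / math.sqrt(m) for _ in range(m)]
    losses = []
    for _ in range(steps + 1):
        # Forward pass: pre-activations and residuals on the training set.
        pre = [[sum(w * xj for w, xj in zip(W[k], x)) for k in range(m)] for x in X]
        res = [sum(a[k] * max(pre[i][k], 0.0) for k in range(m)) - y[i]
               for i in range(n)]
        losses.append(sum(r * r for r in res) / (2 * n))
        # Gradients of the loss with respect to the outer and inner layers.
        grad_a = [sum(res[i] * max(pre[i][k], 0.0) for i in range(n)) / n
                  for k in range(m)]
        grad_W = [[sum(res[i] * a[k] * X[i][j] for i in range(n) if pre[i][k] > 0.0) / n
                   for j in range(d)] for k in range(m)]
        # Update both layers simultaneously from the same gradients.
        a = [a[k] - lr * grad_a[k] for k in range(m)]
        W = [[W[k][j] - lr * grad_W[k][j] for j in range(d)] for k in range(m)]
    return losses


X, y = make_data()
losses = train_two_layer(X, y)
print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
```

Plotting `losses` on a logarithmic axis is one way to check whether the decay looks roughly linear, which is what an exponential rate would suggest. The toy sizes here are far too small to confirm or refute the paper's bounds.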

## Abstract

A fairly comprehensive analysis is presented for the gradient descent dynamics for training two-layer neural network models in the situation when the parameters in both layers are updated. General initialization schemes as well as general regimes for the network width and training data size are considered. In the over-parametrized regime, it is shown that gradient descent dynamics can achieve zero training loss exponentially fast regardless of the quality of the labels. In addition, it is proved that throughout the training process the functions represented by the neural network model are uniformly close to those of a kernel method. For general values of the network width and training data size, sharp estimates of the generalization error are established for target functions in the appropriate reproducing kernel Hilbert space.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.04326/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1904.04326/full.md

## References

31 references — full list in the complete paper: https://tomesphere.com/paper/1904.04326/full.md

---
Source: https://tomesphere.com/paper/1904.04326