How Implicit Regularization of ReLU Neural Networks Characterizes the Learned Function -- Part I: the 1-D Case of Two Layers with Random First Layer

Jakob Heiss; Josef Teichmann; Hanna Wutte

arXiv:1911.02903 · cs.LG · October 5, 2023 · 5 cites

TL;DR

This paper analyzes how regularization, both explicit and implicit, shapes the function learned by shallow one-dimensional ReLU networks whose first layer is random and untrained, and relates the result to second-derivative regularization and smoothing splines.

Contribution

It establishes a mathematical correspondence between L2 regularization of the trained terminal-layer weights and regularization of the learned function's second derivative, and relates early-stopped gradient descent to smoothing spline regression.

Findings

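01 For least squares regression, the trained network converges to the smooth spline interpolation of the training data as the number of hidden nodes tends to infinity

02 L2 regularization of the weights corresponds to second derivative regularization in function space

03 Early-stopped gradient descent, without explicit weight regularization, corresponds to smoothing spline regression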
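
For orientation, the classical cubic smoothing spline can be written as the minimizer below. This is the standard textbook definition, not a formula quoted from the paper; the symbols $(x_i, y_i)$, $N$, and $\lambda$ are notation assumed here, and the paper's own functional may weight the second derivative differently.

```latex
\hat{f}_{\lambda} \in \operatorname*{arg\,min}_{f \in C^{2}(\mathbb{R})}
  \; \sum_{i=1}^{N} \bigl( f(x_i) - y_i \bigr)^{2}
  \; + \; \lambda \int_{\mathbb{R}} \bigl( f''(x) \bigr)^{2} \, dx ,
  \qquad \lambda > 0 .
```

The first term measures the fit to the training data and the second penalizes curvature. As $\lambda \to 0$ the minimizer approaches the natural cubic spline that interpolates the data, which is the regime findings 01 and 02 refer to.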

Abstract

In this paper, we consider one-dimensional (shallow) ReLU neural networks in which weights are chosen randomly and only the terminal layer is trained. First, we mathematically show that for such networks L2-regularized regression corresponds in function space to regularizing the estimate's second derivative for fairly general loss functionals. For least squares regression, we show that the trained network converges to the smooth spline interpolation of the training data as the number of hidden nodes tends to infinity. Moreover, we derive a novel correspondence between early-stopped gradient descent (without any explicit regularization of the weights) and smoothing spline regression.
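
The setting in the abstract can be illustrated with a minimal NumPy sketch: a 1-D ReLU network whose first layer is sampled once and frozen, with only the terminal layer fitted, either by L2-regularized least squares or by plain gradient descent stopped after a fixed number of steps. All function names, the Gaussian sampling of the first layer, and the absence of any scaling of the penalty with network width are assumptions made for illustration; they are not taken from the paper or its repository.

```python
import numpy as np

def sample_first_layer(n_hidden, seed=0):
    # Random first-layer weights and biases; they are never trained.
    # Gaussian sampling is an assumption of this sketch.
    rng = np.random.default_rng(seed)
    v = rng.normal(size=n_hidden)
    b = rng.normal(size=n_hidden)
    return v, b

def hidden_features(x, v, b):
    # ReLU activations of the hidden layer, shape (len(x), n_hidden).
    return np.maximum(0.0, np.outer(x, v) + b)

def fit_terminal_layer_ridge(x, y, v, b, lam):
    # Closed-form minimizer of ||H w - y||^2 + lam * ||w||^2.
    H = hidden_features(x, v, b)
    n_hidden = H.shape[1]
    return np.linalg.solve(H.T @ H + lam * np.eye(n_hidden), H.T @ y)

def fit_terminal_layer_early_stopping(x, y, v, b, lr, n_steps):
    # Gradient descent on ||H w - y||^2 from w = 0 with no weight penalty;
    # the number of steps acts as the only regularizer.
    H = hidden_features(x, v, b)
    w = np.zeros(H.shape[1])
    for _ in range(n_steps):
        w -= lr * 2.0 * H.T @ (H @ w - y)
    return w

def predict(x, v, b, w):
    # Network output: linear terminal layer on top of the frozen features.
    return hidden_features(x, v, b) @ w

if __name__ == "__main__":
    x_train = np.array([-1.0, -0.3, 0.2, 0.9])
    y_train = np.sin(3.0 * x_train)
    v, b = sample_first_layer(n_hidden=200)
    w_ridge = fit_terminal_layer_ridge(x_train, y_train, v, b, lam=1e-3)
    x_grid = np.linspace(-1.0, 1.0, 5)
    print(predict(x_grid, v, b, w_ridge))
```

Comparing `predict` on a fine grid for increasing `n_hidden`, or for different `lam` and `n_steps`, gives a rough visual impression of the behaviour the abstract describes. The step size `lr` must be small enough for gradient descent to be stable, for example below the reciprocal of the largest eigenvalue of `H.T @ H`.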

Code & Models

Repositories

JakobHeiss/NN_regularization1 (Official)

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Numerical methods in inverse problems · Model Reduction and Neural Networks