Ridgeless Interpolation with Shallow ReLU Networks in $1D$ is Nearest Neighbor Curvature Extrapolation and Provably Generalizes on Lipschitz Functions

Boris Hanin

arXiv:2109.12960 · stat.ML · September 28, 2021

TL;DR

This paper characterizes one-layer ReLU networks in 1D that interpolate a dataset while minimizing the $\ell_{2}$-norm of the neuron weights. It shows that they extrapolate between datapoints using curvature estimates from the data and that they provably generalize well on Lipschitz functions.

Contribution

It provides a geometric description of ridgeless ReLU interpolants in 1D, linking their extrapolation behavior to curvature estimates from data.

Findings

01. Between two consecutive datapoints, the interpolant compares the signs of the discrete curvature estimates at the two endpoints to determine whether it behaves linearly or in a convex/concave way there.

02. Ridgeless ReLU interpolants achieve near-optimal generalization on 1D Lipschitz functions.

03. The description gives a geometric understanding of how shallow ReLU networks extrapolate.

Abstract

We prove a precise geometric description of all one-layer ReLU networks $z(x; \theta)$ with a single linear unit and input/output dimensions equal to one that interpolate a given dataset $D = \{(x_{i}, f(x_{i}))\}$ and, among all such interpolants, minimize the $\ell_{2}$-norm of the neuron weights. Such networks can intuitively be thought of as those that minimize the mean-squared error over $D$ plus an infinitesimal weight decay penalty. We therefore refer to them as ridgeless ReLU interpolants. Our description proves that, to extrapolate values $z(x; \theta)$ for inputs $x \in (x_{i}, x_{i+1})$ lying between two consecutive datapoints, a ridgeless ReLU interpolant simply compares the signs of the discrete estimates for the curvature of $f$ at $x_{i}$ and $x_{i+1}$ derived from the dataset $D$. If the curvature estimates at $x_{i}$ and $x_{i+1}$ have different signs, then…
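
The sign comparison described in the abstract can be sketched in a few lines of Python. This is an illustrative sketch, not the paper's code. It assumes the discrete curvature estimate at an interior datapoint is the difference between the slopes of the two adjacent secant lines, and that the inputs are sorted and distinct. The function names `curvature_signs` and `compare_endpoint_signs` are hypothetical.

```python
from typing import List, Sequence, Tuple


def sign(v: float, tol: float = 1e-12) -> int:
    """Return +1, -1, or 0 (values within tol of zero count as 0)."""
    if v > tol:
        return 1
    if v < -tol:
        return -1
    return 0


def curvature_signs(xs: Sequence[float], ys: Sequence[float]) -> List[int]:
    """Sign of an assumed discrete curvature estimate at each interior datapoint.

    Assumption: the estimate at xs[i] is (slope to the right) - (slope to the left).
    xs must be sorted in increasing order with no repeated values.
    """
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(len(xs) - 1)]
    return [sign(slopes[i] - slopes[i - 1]) for i in range(1, len(slopes))]


def compare_endpoint_signs(
    xs: Sequence[float], ys: Sequence[float]
) -> List[Tuple[Tuple[float, float], int, int, bool]]:
    """For each interval whose endpoints are both interior datapoints,
    return (interval, left sign, right sign, whether the signs agree)."""
    eps = curvature_signs(xs, ys)  # eps[k] belongs to the datapoint xs[k + 1]
    return [
        ((xs[k + 1], xs[k + 2]), eps[k], eps[k + 1], eps[k] == eps[k + 1])
        for k in range(len(eps) - 1)
    ]


if __name__ == "__main__":
    xs = [0, 1, 2, 3, 4]
    ys = [0, 1, 4, 5, 4]
    for interval, left, right, agree in compare_endpoint_signs(xs, ys):
        print(interval, left, right, "agree" if agree else "differ")
```

The sketch only reports whether the endpoint signs agree or differ on each interval. How the interpolant behaves in each case is what the paper's description specifies.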

Taxonomy

Topics: Advanced Numerical Analysis Techniques · Model Reduction and Neural Networks · Medical Image Segmentation Techniques

Methods: Weight Decay