On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias

Itay Safran; Gal Vardi; Jason D. Lee

arXiv:2205.09072·cs.LG·February 3, 2023·1 cites

TL;DR

This paper analyzes the convergence and implicit bias of gradient flow on shallow univariate ReLU networks for binary classification. It shows that gradient flow tends toward networks whose number of linear regions is at most proportional to the number of neurons in the target network, which in turn implies a generalization bound.

Contribution

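It provides convergence guarantees for gradient flow on shallow univariate ReLU networks, together with an implicit bias towards networks with few linear regions. Under an additional assumption on the data distribution, the result holds even under mild over-parameterization, where the width is independent of the sample size.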

Findings

01

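Gradient flow converges in direction to a network that achieves perfect training accuracy and has at most O(r) linear regions, where r is the number of neurons in the target network.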

02

The result holds with high probability over initialization and data sampling.

03

The bound on the number of linear regions implies a generalization bound for the learned network.

Abstract

We study the dynamics and implicit bias of gradient flow (GF) on univariate ReLU neural networks with a single hidden layer in a binary classification setting. We show that when the labels are determined by the sign of a target network with $r$ neurons, with high probability over the initialization of the network and the sampling of the dataset, GF converges in direction (suitably defined) to a network achieving perfect training accuracy and having at most $O(r)$ linear regions, implying a generalization bound. Unlike many other results in the literature, under an additional assumption on the distribution of the data, our result holds even for mild over-parameterization, where the width is $\tilde{O}(r)$ and independent of the sample size.
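
To make the notion of "linear regions" concrete, here is a minimal sketch in Python. It assumes the common parametrization $N(x) = \sum_j v_j \,\mathrm{ReLU}(w_j x + b_j)$ for a univariate network with one hidden layer; the paper's exact parametrization and its definition of the *effective* number of linear regions may differ. The function name `count_linear_regions` and the tolerance `tol` are hypothetical choices made for this illustration. Each neuron with $w_j \neq 0$ has a kink at $x = -b_j / w_j$, where the slope of $N$ changes by $v_j |w_j|$; kinks at the same location can cancel, so a wide network may still have few linear regions.

```python
import numpy as np


def count_linear_regions(w, b, v, tol=1e-9):
    """Count the linear pieces of N(x) = sum_j v_j * relu(w_j * x + b_j).

    Illustrative only: the parametrization and `tol` are assumptions,
    not taken from the paper.
    """
    w, b, v = (np.asarray(a, dtype=float) for a in (w, b, v))

    # Neurons with w_j == 0 are constant in x and create no kink.
    active = np.abs(w) > tol
    kinks = -b[active] / w[active]          # breakpoint of each neuron
    jumps = v[active] * np.abs(w[active])   # slope change at that breakpoint

    order = np.argsort(kinks)
    kinks, jumps = kinks[order], jumps[order]

    regions = 1  # with no kinks the function is a single linear piece
    i = 0
    while i < len(kinks):
        # Sum the slope changes of all neurons sharing this breakpoint.
        j = i
        total = 0.0
        while j < len(kinks) and kinks[j] - kinks[i] <= tol:
            total += jumps[j]
            j += 1
        # A breakpoint starts a new region only if the slope really changes.
        if abs(total) > tol:
            regions += 1
        i = j
    return regions


if __name__ == "__main__":
    # Two neurons with kinks at x = 0 and x = 1.
    print(count_linear_regions([1, 1], [0, -1], [1, 1]))
    # 2*relu(x) - relu(2x) is identically zero: two neurons, one region.
    print(count_linear_regions([1, 2], [0, 0], [2, -1]))
```

The second call illustrates why width alone does not determine the number of linear regions: the two neurons share a breakpoint and their slope changes cancel.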

Videos

On the Effective Number of Linear Regions in Shallow Univariate ReLU Networks: Convergence Guarantees and Implicit Bias (SlidesLive)

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Stochastic Gradient Optimization Techniques