The Restricted Isometry of ReLU Networks: Generalization through Norm Concentration

Alex Goeßmann; Gitta Kutyniok

arXiv:2007.00479·stat.ML·July 2, 2020

TL;DR

This paper introduces the Neural Restricted Isometry Property (NeuRIP), a uniform concentration condition for shallow ReLU networks, providing theoretical guarantees for their generalization based on norm concentration and sample complexity bounds.

Contribution

It defines NeuRIP for shallow ReLU networks, derives sample complexity bounds using covering numbers and chaining, and links NeuRIP to uniform generalization guarantees.

Findings

1. NeuRIP holds with high probability for shallow ReLU networks given sufficient data.

2. Networks with small empirical risk are shown to generalize uniformly under NeuRIP.

3. Under the NeuRIP event, bounds on the expected risk hold for networks in any sublevel set of the empirical risk.

Abstract

While regression tasks aim at interpolating a relation on the entire input space, they often have to be solved with a limited amount of training data. Still, if the hypothesis functions can be sketched well with the data, one can hope to identify a generalizing model. In this work, we introduce the Neural Restricted Isometry Property (NeuRIP), a uniform concentration event in which all shallow ReLU networks are sketched with the same quality. To derive the sample complexity for achieving NeuRIP, we bound the covering numbers of the networks in the sub-Gaussian metric and apply chaining techniques. Under the NeuRIP event, we then provide bounds on the expected risk, which hold for networks in any sublevel set of the empirical risk. We conclude that all networks with sufficiently small empirical risk generalize uniformly.
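The idea of "sketching" a network with data can be illustrated numerically. The following is a minimal sketch, not the paper's method or its formal definition of NeuRIP: it assumes standard Gaussian inputs, draws a few random shallow ReLU networks, and compares each network's empirical norm on a training sample with a Monte Carlo estimate of its norm under the input distribution. All function names, the weight distribution, and the sample sizes are hypothetical choices made for illustration. A ratio close to 1 for every network at once is the kind of uniform norm concentration the abstract describes.

```python
import math
import random


def make_network(rng, width, dim):
    """Draw hypothetical weights for f(x) = sum_i a_i * relu(<w_i, x> + b_i)."""
    return [
        (rng.gauss(0, 1), [rng.gauss(0, 1) for _ in range(dim)], rng.gauss(0, 1))
        for _ in range(width)
    ]


def evaluate(network, x):
    """Evaluate the shallow ReLU network at a single input x."""
    total = 0.0
    for a, w, b in network:
        pre = sum(wi * xi for wi, xi in zip(w, x)) + b
        total += a * max(pre, 0.0)  # ReLU activation
    return total


def sample_inputs(rng, n, dim):
    """Assumption: inputs are i.i.d. standard Gaussian vectors."""
    return [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(n)]


def empirical_norm(network, xs):
    """Root mean square of the network output over the sample xs."""
    return math.sqrt(sum(evaluate(network, x) ** 2 for x in xs) / len(xs))


def norm_ratios(seed=0, n_networks=5, width=4, dim=3, n_train=2000, n_ref=20000):
    """Ratio of training-sample norm to a large-sample reference norm, per network."""
    rng = random.Random(seed)
    train = sample_inputs(rng, n_train, dim)
    # The large reference sample stands in for the true norm (Monte Carlo estimate).
    reference = sample_inputs(rng, n_ref, dim)
    ratios = []
    for _ in range(n_networks):
        net = make_network(rng, width, dim)
        ratios.append(empirical_norm(net, train) / empirical_norm(net, reference))
    return ratios


if __name__ == "__main__":
    ratios = norm_ratios()
    # Worst-case relative deviation over the sampled networks.
    print(max(abs(r - 1.0) for r in ratios))
```

Note that this only checks finitely many randomly drawn networks against the same training sample, whereas the paper's guarantee is uniform over the whole network class; the experiment gives intuition for the concentration event but does not verify it.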

Taxonomy

Topics: Gaussian Processes and Bayesian Inference · Domain Adaptation and Few-Shot Learning · Machine Learning and ELM