Nonparametric regression using deep neural networks with ReLU activation function

Johannes Schmidt-Hieber

arXiv:1708.06633·math.ST·September 15, 2020

TL;DR

This paper demonstrates that sparsely connected deep neural networks with ReLU activation can achieve near-optimal rates in multivariate nonparametric regression, highlighting the importance of network depth and sparsity.

Contribution

It provides theoretical guarantees for deep ReLU networks achieving minimax convergence rates in nonparametric regression under a general compositional framework.

Findings

01

Deep networks with ReLU can attain minimax rates (up to log factors)

02

Network depth should scale with sample size for optimal performance

03

Wavelet estimators are suboptimal under the same assumptions

Abstract

Consider the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with ReLU activation function and properly chosen network architecture achieve the minimax rates of convergence (up to $\log n$-factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network architecture, the tuning parameter is the sparsity of the network. Specifically, we consider large networks with number of potential network parameters exceeding the sample size. The analysis gives some insights into why multilayer feedforward neural networks perform well in practice. Interestingly, for ReLU activation function the depth (number of layers) of the neural network architectures plays an…
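
To make the setting in the abstract concrete, here is a minimal sketch in plain Python of a sparsely connected feedforward ReLU network whose number of potential parameters exceeds both the sparsity budget and the sample size. The function names (`make_sparse_network`, `forward`, `count_params`), the layer widths, the sparsity level `s`, the sample size `n`, and the random placement of active parameters are all illustrative assumptions. This is not the paper's construction or estimator, and no fitting is performed.

```python
import random


def relu(v):
    """Apply the ReLU activation max(0, x) componentwise."""
    return [max(0.0, x) for x in v]


def make_sparse_network(widths, s, seed=0):
    """Build a network with layer widths `widths` and exactly `s` nonzero parameters.

    Hypothetical helper: the active parameters are placed at random,
    which is an assumption made purely for illustration.
    """
    rng = random.Random(seed)
    layers, slots = [], []
    for l in range(len(widths) - 1):
        n_in, n_out = widths[l], widths[l + 1]
        W = [[0.0] * n_in for _ in range(n_out)]  # weight matrix, all zero
        b = [0.0] * n_out                         # shift (bias) vector, all zero
        layers.append((W, b))
        for i in range(n_out):
            slots.append((l, i, None))            # slot for a bias entry
            for j in range(n_in):
                slots.append((l, i, j))           # slot for a weight entry
    # Activate s of the potential parameters; values are nonzero and bounded by 1.
    for l, i, j in rng.sample(slots, s):
        value = rng.choice([-1.0, 1.0]) * rng.uniform(0.1, 1.0)
        if j is None:
            layers[l][1][i] = value
        else:
            layers[l][0][i][j] = value
    return layers


def forward(layers, x):
    """Evaluate the network at input x; ReLU on hidden layers, linear output."""
    h = list(x)
    for k, (W, b) in enumerate(layers):
        h = [sum(w * hj for w, hj in zip(row, h)) + bi for row, bi in zip(W, b)]
        if k < len(layers) - 1:
            h = relu(h)
    return h


def count_params(layers):
    """Return (number of potential parameters, number of nonzero parameters)."""
    total = sum(len(W) * len(W[0]) + len(b) for W, b in layers)
    nonzero = sum(1 for W, b in layers for row in W for w in row if w != 0.0)
    nonzero += sum(1 for W, b in layers for v in b if v != 0.0)
    return total, nonzero


n = 100                                  # assumed sample size
widths = [2, 16, 16, 16, 1]              # input dim 2, three hidden layers, scalar output
layers = make_sparse_network(widths, s=40, seed=1)
total, nonzero = count_params(layers)
print(f"potential parameters: {total}, nonzero: {nonzero}, sample size: {n}")
print("network output at (0.3, 0.7):", forward(layers, [0.3, 0.7]))
```

In this sketch the architecture (widths and depth) is fixed in advance and only `s` controls how many parameters are active, mirroring the abstract's statement that sparsity is the tuning parameter while the network itself may be larger than the sample size.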
