Nonparametric regression using deep neural networks with ReLU activation function
Johannes Schmidt-Hieber

TL;DR
This paper demonstrates that sparsely connected deep neural networks with ReLU activation can achieve near-optimal rates in multivariate nonparametric regression, highlighting the importance of network depth and sparsity.
Contribution
It provides theoretical guarantees for deep ReLU networks achieving minimax convergence rates in nonparametric regression under a general compositional framework.
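The compositional framework can be sketched in formulas. The notation below (f_0, g_i, beta_i, t_i, phi_n) is not given in this summary; it is reproduced from the published paper as an assumption and should be checked against the original before being relied on. Each component of g_i is taken to be beta_i-Hölder smooth and to depend on at most t_i of its input variables.

```latex
\begin{align*}
  Y_i &= f_0(\mathbf{X}_i) + \varepsilon_i, \qquad i = 1, \dots, n, \\
  f_0 &= g_q \circ g_{q-1} \circ \cdots \circ g_1 \circ g_0, \\
  \beta_i^* &= \beta_i \prod_{\ell = i+1}^{q} (\beta_\ell \wedge 1), \\
  \phi_n &= \max_{i = 0, \dots, q} \; n^{-\frac{2\beta_i^*}{2\beta_i^* + t_i}} .
\end{align*}
```

Under these assumptions the rate phi_n depends on the effective dimensions t_i rather than on the ambient input dimension. For an additive model f_0(x) = f_1(x_1) + ... + f_d(x_d) with beta-smooth components, one can take g_0(x) = (f_1(x_1), ..., f_d(x_d)) with t_0 = 1 and let g_1 be the (infinitely smooth) sum, which gives the one-dimensional rate n^{-2 beta / (2 beta + 1)}.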
Findings
Sparsely connected deep ReLU networks can attain the minimax rates of convergence (up to log n factors).
Network depth should scale with the sample size for optimal performance.
Under the same composition assumption, wavelet estimators can only achieve suboptimal rates.
Abstract
Consider the multivariate nonparametric regression model. It is shown that estimators based on sparsely connected deep neural networks with ReLU activation function and properly chosen network architecture achieve the minimax rates of convergence (up to log n factors) under a general composition assumption on the regression function. The framework includes many well-studied structural constraints such as (generalized) additive models. While there is a lot of flexibility in the network architecture, the tuning parameter is the sparsity of the network. Specifically, we consider large networks with number of potential network parameters exceeding the sample size. The analysis gives some insights into why multilayer feedforward neural networks perform well in practice. Interestingly, for ReLU activation function the depth (number of layers) of the neural network architectures plays an…
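The network class described in the abstract can be illustrated with a minimal NumPy sketch: a feedforward ReLU network whose number of potential parameters exceeds the sample size, while only a small number of them (the sparsity) are nonzero. All names here (`make_sparse_relu_net`, `forward`, `count_params`) and the concrete numbers are hypothetical choices for illustration. Taking the depth proportional to log2(n), bounding parameters by 1 in absolute value, and drawing the active positions at random are assumptions of this sketch, not the paper's estimator, which is not specified in this summary.

```python
import math
import numpy as np


def make_sparse_relu_net(d_in, width, depth, sparsity, rng):
    """Build a random ReLU network with `depth` hidden layers of size `width`
    and at most `sparsity` nonzero parameters (weights and biases together)."""
    dims = [d_in] + [width] * depth + [1]
    shapes = [(dims[i + 1], dims[i]) for i in range(len(dims) - 1)]
    total = sum(rows * cols + rows for rows, cols in shapes)
    if sparsity > total:
        raise ValueError("sparsity cannot exceed the number of parameters")

    # Flat parameter vector: zero everywhere except on `sparsity` random positions.
    theta = np.zeros(total)
    active = rng.choice(total, size=sparsity, replace=False)
    theta[active] = rng.uniform(-1.0, 1.0, size=sparsity)  # bounded by 1 (assumption)

    # Cut the flat vector into per-layer weight matrices and bias vectors.
    layers, offset = [], 0
    for rows, cols in shapes:
        W = theta[offset:offset + rows * cols].reshape(rows, cols)
        offset += rows * cols
        b = theta[offset:offset + rows]
        offset += rows
        layers.append((W, b))
    return layers


def forward(layers, x):
    """Evaluate the network on inputs x of shape (n, d_in)."""
    h = np.asarray(x, dtype=float)
    for W, b in layers[:-1]:
        h = np.maximum(h @ W.T + b, 0.0)  # ReLU hidden layers
    W, b = layers[-1]
    return (h @ W.T + b).ravel()  # linear output layer


def count_params(layers):
    """Return (number of potential parameters, number of nonzero parameters)."""
    total = sum(W.size + b.size for W, b in layers)
    nonzero = sum(np.count_nonzero(W) + np.count_nonzero(b) for W, b in layers)
    return total, nonzero


rng = np.random.default_rng(0)
n, d_in = 1000, 5                      # sample size and input dimension (illustrative)
depth = math.ceil(math.log2(n))        # depth growing with log n (assumption)
sparsity = 200                         # the tuning parameter: number of active parameters
layers = make_sparse_relu_net(d_in, width=64, depth=depth, sparsity=sparsity, rng=rng)

X = rng.uniform(0.0, 1.0, size=(n, d_in))
y_hat = forward(layers, X)
total, nonzero = count_params(layers)
print(f"depth={depth}, potential parameters={total}, nonzero={nonzero}, n={n}")
```

In this sketch the architecture (width and depth) is fixed generously and only `sparsity` would be tuned, mirroring the abstract's remark that the sparsity is the tuning parameter. Fitting such a network to data would additionally require a sparsity-constrained training procedure, which is outside the scope of this example.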
