Analysis of the rate of convergence of fully connected deep neural network regression estimates with smooth activation function

Sophie Langer

arXiv:2010.06168·math.ST·October 14, 2020·J. Multivar. Anal.

TL;DR

This paper extends the statistical theory of deep neural network regression estimates by showing that fully connected DNNs with the smooth sigmoid activation function achieve the same optimal rate of convergence previously established for fully connected ReLU networks.

Contribution

The paper proves that fully connected DNNs with the sigmoid activation function attain minimax rates of convergence, extending earlier results that were limited to the ReLU activation.

Findings

01. Fully connected DNNs with sigmoid activation achieve minimax convergence rates.

02. The number of hidden layers is fixed, with neurons per layer increasing as sample size grows.

03. A bound on network weights is established for convergence analysis.
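
The network class described in the findings above can be illustrated with a minimal sketch in plain Python: a fully connected network with a fixed number of hidden layers, sigmoid activations, a width that grows with the sample size n, and all weights confined to a bounded interval. Everything specific here is an assumption made for illustration and is not taken from the paper: the function names, the width rule `ceil(n ** exponent)` with `exponent=0.25`, the value of `weight_bound`, and the random initialisation. The paper's actual width and weight-bound choices are not given in this summary.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function 1 / (1 + exp(-x))."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def width_for_sample_size(n, exponent=0.25):
    """Hypothetical rule: neurons per layer grow polynomially in n.

    The exponent is a placeholder, not the rate used in the paper.
    """
    return max(1, math.ceil(n ** exponent))


def init_network(d, num_hidden_layers, width, weight_bound, seed=0):
    """Build a fully connected net with every weight in [-weight_bound, weight_bound]."""
    rng = random.Random(seed)
    sizes = [d] + [width] * num_hidden_layers + [1]
    layers = []
    for fan_in, fan_out in zip(sizes[:-1], sizes[1:]):
        weights = [
            [rng.uniform(-weight_bound, weight_bound) for _ in range(fan_in)]
            for _ in range(fan_out)
        ]
        biases = [rng.uniform(-weight_bound, weight_bound) for _ in range(fan_out)]
        layers.append((weights, biases))
    return layers


def forward(layers, x):
    """Evaluate the network at one input vector x (sigmoid hidden layers, linear output)."""
    h = list(x)
    for weights, biases in layers[:-1]:
        h = [
            sigmoid(sum(w * v for w, v in zip(row, h)) + b)
            for row, b in zip(weights, biases)
        ]
    out_weights, out_biases = layers[-1]
    return sum(w * v for w, v in zip(out_weights[0], h)) + out_biases[0]


# Usage: the depth stays fixed while the width depends on the sample size.
n = 500
net = init_network(d=3, num_hidden_layers=2,
                   width=width_for_sample_size(n), weight_bound=5.0)
prediction = forward(net, [0.1, -0.4, 0.7])
```

Because every hidden activation lies in (0, 1) and every weight is bounded, the output of such a network is bounded by `weight_bound * (width + 1)` in absolute value. This is one simple way to see why a weight bound makes the function class easier to control in a convergence analysis.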

Abstract

This article contributes to the current statistical theory of deep neural networks (DNNs). It has been shown that DNNs are able to circumvent the so-called curse of dimensionality provided that suitable restrictions on the structure of the regression function hold. In most of those results the tuning parameter is the sparsity of the network, which describes the number of non-zero weights in the network. This constraint seemed to be the key factor behind the good rate of convergence results. Recently, this assumption was disproved. In particular, it was shown that simple fully connected DNNs can achieve the same rate of convergence. Those fully connected DNNs are based on the unbounded ReLU activation function. In this article we extend the results to smooth activation functions, i.e., to the sigmoid activation function. It is shown that estimators based on fully connected DNNs with sigmoid…
