The power of deeper networks for expressing natural functions

David Rolnick (MIT); Max Tegmark (MIT)

arXiv:1705.05502 · cs.LG · April 30, 2018 · 91 cites

Open Access

TL;DR

This paper shows that deep neural networks need far fewer neurons than shallow ones to approximate natural classes of multivariate polynomials: the neuron count grows linearly with the number of input variables for deep networks, but exponentially for networks with a single hidden layer.

Contribution

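The paper proves that the number of neurons needed to approximate these polynomials grows linearly with the number of variables $n$ for deep networks and exponentially for networks with a single hidden layer. It also gives evidence for how the neuron requirement changes as the number of hidden layers increases.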

Findings

01

For deep networks, the number of neurons required grows only linearly with the number of input variables $n$.

02

Networks with a single hidden layer require a number of neurons that grows exponentially with $n$.

03

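With $k$ hidden layers, the evidence suggests the neuron requirement grows exponentially in $n^{1/k}$ rather than in $n$, so the number of layers needed for practical expressibility grows only logarithmically with $n$.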
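
To see why exponential growth in $n^{1/k}$ points to a logarithmic number of layers, the sketch below evaluates an illustrative growth law. The function `illustrative_neurons` and its form $2^{n^{1/k}}$ (base 2, no constant or polynomial factors) are assumptions chosen for illustration, not the bound stated in the paper.

```python
import math


def illustrative_neurons(n: int, k: int) -> float:
    """Hypothetical growth law 2 ** (n ** (1/k)).

    Assumption: base 2 and the absence of constant or polynomial
    factors are illustrative choices, not taken from the paper.
    """
    return 2.0 ** (n ** (1.0 / k))


if __name__ == "__main__":
    for n in (16, 64, 256):
        k_log = int(math.log2(n))  # number of layers logarithmic in n
        for k in (1, 2, k_log):
            print(f"n={n:4d} k={k:2d} illustrative count={illustrative_neurons(n, k):.3g}")
```

Under this assumed form, choosing $k = \log_2 n$ gives $n^{1/k} = 2$, so the exponential factor stays constant as $n$ grows. For $n = 16$, the illustrative count falls from 65536 at $k = 1$ to 16 at $k = 2$ and to about 4 at $k = 4$.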

Abstract

It is well-known that neural networks are universal approximators, but that deeper networks tend in practice to be more powerful than shallower ones. We shed light on this by proving that the total number of neurons $m$ required to approximate natural classes of multivariate polynomials of $n$ variables grows only linearly with $n$ for deep neural networks, but grows exponentially when merely a single hidden layer is allowed. We also provide evidence that when the number of hidden layers is increased from $1$ to $k$, the neuron requirement grows exponentially not with $n$ but with $n^{1/k}$, suggesting that the minimum number of layers required for practical expressibility grows only logarithmically with $n$.
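
The linear scaling for deep networks can be made concrete with a small sketch that approximates the product of $n$ inputs by a binary tree of pairwise multiplications. Each pairwise product uses four hidden neurons, relying on the Taylor expansion of a smooth activation around zero. The softplus activation, the scale `lam`, and the helper names `approx_multiply` and `tree_product` are assumptions made for illustration; this is one possible construction, not necessarily the one analyzed in the paper.

```python
import math


def softplus(u: float) -> float:
    """Smooth activation with nonzero second derivative at 0."""
    return math.log1p(math.exp(u))


SIGMA2 = 0.25  # second derivative of softplus at 0


def approx_multiply(x: float, y: float, lam: float = 1e-2) -> float:
    """Approximate x*y with four softplus neurons and a linear readout.

    Uses s(u) + s(-u) = 2*s(0) + s''(0)*u**2 + O(u**4), so the signed sum
    below equals 4*s''(0)*lam**2*x*y up to higher-order terms in lam.
    """
    hidden = (
        softplus(lam * (x + y)),
        softplus(-lam * (x + y)),
        softplus(lam * (x - y)),
        softplus(-lam * (x - y)),
    )
    return (hidden[0] + hidden[1] - hidden[2] - hidden[3]) / (4 * SIGMA2 * lam**2)


def tree_product(values, lam: float = 1e-2):
    """Multiply inputs pairwise in a binary tree.

    Returns (approximate product, hidden neurons used, multiplication layers).
    Simplification: an unpaired value is passed to the next level for free,
    so only multiplication neurons are counted.
    """
    level = list(values)
    neurons = 0
    layers = 0
    while len(level) > 1:
        nxt = []
        for i in range(0, len(level) - 1, 2):
            nxt.append(approx_multiply(level[i], level[i + 1], lam))
            neurons += 4
        if len(level) % 2 == 1:
            nxt.append(level[-1])
        level = nxt
        layers += 1
    return level[0], neurons, layers


if __name__ == "__main__":
    xs = [0.5, -1.2, 0.8, 1.5, 0.9, -0.7, 1.1, 1.3]
    approx, neurons, layers = tree_product(xs)
    print(approx, math.prod(xs), neurons, layers)
```

In this sketch, $n$ inputs need $n - 1$ pairwise products, so the count is $4(n - 1)$ hidden neurons spread over about $\log_2 n$ layers: 28 neurons in 3 layers for the eight inputs above. A single hidden layer, by contrast, needs a number of neurons that grows exponentially with $n$, as the abstract states.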

Taxonomy

Topics: Neural Networks and Applications · Algorithms and Data Compression · Parallel Computing and Optimization Techniques