Benefits of depth in neural networks

Matus Telgarsky

arXiv:1602.04485·cs.LG·May 31, 2016·99 cites

TL;DR

This paper shows that deep neural networks built from semi-algebraic gates (a class that includes ReLU, maximum, indicator, and piecewise polynomial nodes) can represent certain functions that shallower networks can approximate only at exponential size.

Contribution

The paper proves that, for any positive integer $k$, there exist networks with $\Theta(k^{3})$ layers and $\Theta(1)$ nodes per layer that networks with $O(k)$ layers and semi-algebraic gates cannot approximate unless they have $\Omega(2^{k})$ nodes, establishing a benefit of depth across several architectures.

Findings

01

There exist deep networks with semi-algebraic gates, using $\Theta(k^{3})$ layers, that networks with $O(k)$ layers cannot approximate unless they have $\Omega(2^{k})$ nodes.

02

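The hard-to-approximate deep networks are themselves small, with $\Theta(1)$ nodes per layer and $\Theta(1)$ distinct parameters, so the advantage comes from depth rather than from width or parameter count.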

03

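The results cover ReLU, maximum, indicator, and piecewise polynomial gates, and extend to convolutional networks, sum-product networks, and boosted decision trees; for boosted decision trees the separation is stronger, requiring $\Omega(2^{k^{3}})$ total tree nodes.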

Abstract

For any positive integer $k$, there exist neural networks with $\Theta(k^{3})$ layers, $\Theta(1)$ nodes per layer, and $\Theta(1)$ distinct parameters which cannot be approximated by networks with $O(k)$ layers unless they are exponentially large: they must possess $\Omega(2^{k})$ nodes. This result is proved here for a class of nodes termed "semi-algebraic gates" which includes the common choices of ReLU, maximum, indicator, and piecewise polynomial functions, therefore establishing benefits of depth against not just standard networks with ReLU gates, but also convolutional networks with ReLU and maximization gates, sum-product networks, and boosted decision trees (in this last case with a stronger separation: $\Omega(2^{k^{3}})$ total tree nodes are required).
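The intuition behind this kind of separation can be illustrated with a small sketch. The abstract does not describe the construction, so the example below is an assumption: it uses a textbook tent map built from two ReLU gates and composes it once per layer. It only counts how often the resulting one-dimensional function crosses a fixed level; it does not reproduce the paper's approximation bound or its $\Theta(k^{3})$ versus $O(k)$ layer counts. The names `relu`, `tent`, `deep_sawtooth`, and `count_crossings` are hypothetical.

```python
def relu(x):
    """ReLU gate: max(0, x)."""
    return max(0.0, x)


def tent(x):
    """Tent map on [0, 1] from two ReLU gates: 2x up to 1/2, then 2 - 2x."""
    return 2.0 * relu(x) - 4.0 * relu(x - 0.5)


def deep_sawtooth(x, depth):
    """Compose the tent map `depth` times, one tent (two ReLUs) per layer."""
    for _ in range(depth):
        x = tent(x)
    return x


def count_crossings(depth, level=0.5, steps=9999):
    """Count sign changes of deep_sawtooth(x, depth) - level on a grid over [0, 1].

    `steps` is odd so that no grid point lands exactly on a crossing, and it
    must be large compared with 2**depth for the count to be exact.
    """
    count = 0
    prev = deep_sawtooth(0.0, depth) - level
    for i in range(1, steps + 1):
        cur = deep_sawtooth(i / steps, depth) - level
        if prev * cur < 0:
            count += 1
        prev = cur
    return count


if __name__ == "__main__":
    # Columns: depth, number of ReLU gates used, crossings of level 0.5.
    for depth in range(1, 7):
        print(depth, 2 * depth, count_crossings(depth))
```

In this sketch the gate count grows linearly with depth while the number of crossings doubles with every added layer. A one-hidden-layer ReLU network with $m$ units on a scalar input is piecewise linear with at most $m + 1$ pieces, so it can cross a level at most $m + 1$ times and would need on the order of $2^{k}$ units to oscillate as often as the depth-$k$ composition.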

Taxonomy

Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Neural Networks and Applications
