Benefits of depth in neural networks
Matus Telgarsky

TL;DR
This paper shows that some functions computed by deep networks with only a few nodes per layer cannot be approximated by shallower networks unless those networks are exponentially large, and that this size separation holds for a broad class of node types.
Contribution
The paper proves that shallower networks built from semi-algebraic gates need exponentially many nodes to approximate certain deep networks, establishing a benefit of depth for standard ReLU networks, convolutional networks, sum-product networks, and boosted decision trees.
Findings
Certain deep networks with semi-algebraic gates cannot be approximated by shallower networks unless those networks have exponentially many nodes.
For these functions, depth yields an exponential saving in network size rather than a constant-factor one.
The results apply to ReLU, maximum, indicator, and piecewise polynomial gates, and extend to convolutional networks, sum-product networks, and boosted decision trees.
Abstract
For any positive integer $k$, there exist neural networks with $\Theta(k^3)$ layers, $\Theta(1)$ nodes per layer, and $\Theta(1)$ distinct parameters which cannot be approximated by networks with $\mathcal{O}(k)$ layers unless they are exponentially large: they must possess $\Omega(2^k)$ nodes. This result is proved here for a class of nodes termed "semi-algebraic gates" which includes the common choices of ReLU, maximum, indicator, and piecewise polynomial functions, therefore establishing benefits of depth against not just standard networks with ReLU gates, but also convolutional networks with ReLU and maximization gates, sum-product networks, and boosted decision trees (in this last case with a stronger separation: $\Omega(2^{k^3})$ total tree nodes are required).
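As an illustration of why depth can buy exponentially many oscillations, the sketch below composes a small ReLU "tent" block with itself. This is a minimal sketch under stated assumptions rather than the paper's exact construction: the function names (`tent`, `deep_tent`, `count_crossings`) and the crossing level 1/3 are choices made here for illustration. Stacking the block k times gives a function on [0, 1] with 2^k linear pieces, whereas a single hidden layer of m ReLU nodes on a scalar input yields at most m + 1 linear pieces.

```python
def relu(z):
    return max(0.0, z)


def tent(x):
    # Small ReLU block: equals 2x on [0, 1/2] and 2(1 - x) on [1/2, 1].
    return relu(2.0 * relu(x) - 4.0 * relu(x - 0.5))


def deep_tent(x, k):
    # Compose the same block k times (k layers, constant width,
    # constant number of distinct parameters).
    for _ in range(k):
        x = tent(x)
    return x


def count_crossings(k, level=1.0 / 3.0):
    """Count sign changes of deep_tent(., k) - level on a dyadic grid of [0, 1].

    The grid has 8 points per linear piece, and grid values are multiples
    of 1/8, so they never equal the default level 1/3 exactly.
    """
    n = 2 ** (k + 3)
    values = [deep_tent(i / n, k) for i in range(n + 1)]
    return sum(
        1
        for a, b in zip(values, values[1:])
        if (a - level) * (b - level) < 0
    )


if __name__ == "__main__":
    for k in range(1, 7):
        print(k, count_crossings(k))
```

Each added layer doubles the number of crossings, so the count grows as 2^k while the number of nodes grows only linearly in k. A shallower network has to pay for the same number of oscillations with many more nodes.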
Taxonomy
Topics: Machine Learning and Algorithms · Adversarial Robustness in Machine Learning · Neural Networks and Applications
