Representation Benefits of Deep Feedforward Networks
Matus Telgarsky

TL;DR
This paper shows that deep feedforward networks with ReLU nonlinearities can solve certain classification problems with only a few nodes per layer, whereas shallow networks need exponentially many nodes to solve them, highlighting the representational advantages of depth.
Contribution
It introduces a family of classification problems, indexed by a positive integer k, on which small deep networks achieve zero error while every shallow network with fewer than exponentially many (in k) nodes has error at least 1/6.
Findings
A deep network with 2 nodes in each of 2k layers achieves zero error on the problem indexed by k.
Every shallow network with fewer than exponentially many (in k) nodes has error at least 1/6.
A recurrent network with 3 distinct nodes iterated k times also achieves zero error.
Abstract
This note provides a family of classification problems, indexed by a positive integer k, where all shallow networks with fewer than exponentially (in k) many nodes exhibit error at least 1/6, whereas a deep network with 2 nodes in each of 2k layers achieves zero error, as does a recurrent network with 3 distinct nodes iterated k times. The proof is elementary, and the networks are standard feedforward networks with ReLU (Rectified Linear Unit) nonlinearities.
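To make the zero-error claim concrete, here is a minimal sketch of one way a narrow, deep ReLU network can fit a rapidly alternating labeling. It rests on assumptions that this summary does not spell out: the problem is taken to be 2^k evenly spaced points on [0, 1) with alternating 0/1 labels, and the network is built by composing a "tent" map k times, each copy using two ReLU layers with at most two nodes. The names `tent`, `deep_net`, `alternating_points`, and `error_rate` are hypothetical and chosen only for illustration.

```python
def relu(z):
    """Rectified Linear Unit."""
    return max(0.0, z)


def tent(x):
    """Two ReLU layers computing a tent map on [0, 1].

    Assumed building block: equals 2x for x <= 1/2 and 2(1 - x) otherwise.
    """
    # First layer: two ReLU nodes.
    a, b = relu(x), relu(x - 0.5)
    # Second layer: one ReLU node combining the first layer's outputs.
    return relu(2.0 * a - 4.0 * b)


def deep_net(x, k):
    """Compose the tent map k times, giving 2k ReLU layers in total."""
    for _ in range(k):
        x = tent(x)
    return x


def classify(x, k):
    """Threshold the network output at 1/2 to get a 0/1 label."""
    return 1 if deep_net(x, k) >= 0.5 else 0


def alternating_points(k):
    """Assumed dataset: points i / 2^k with label i mod 2, for i in [0, 2^k)."""
    n = 2 ** k
    return [(i / n, i % 2) for i in range(n)]


def error_rate(k):
    """Fraction of the 2^k alternating points that the network mislabels."""
    data = alternating_points(k)
    wrong = sum(classify(x, k) != y for x, y in data)
    return wrong / len(data)


if __name__ == "__main__":
    for k in range(1, 9):
        print(k, error_rate(k))
```

Each composition doubles the number of linear pieces of the resulting function, so k compositions produce a sawtooth with 2^k pieces using only about 2k layers. A shallow network has to pay for each oscillation with its own nodes, which is the intuition behind the exponential lower bound stated in the abstract.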
Taxonomy
Topics: Neural Networks and Applications · Anomaly Detection Techniques and Applications · Generative Adversarial Networks and Image Synthesis
