Generalization and Expressivity for Deep Nets
Shao-Bo Lin

TL;DR
This paper investigates the theoretical advantages of deep neural networks by analyzing expressivity and generalization together. It shows that deep nets can achieve high expressivity without increased capacity, which leads to near-optimal learning rates.
Contribution
The paper constructs a deep net with two hidden layers that shows superior expressivity, and derives near-optimal learning rates, highlighting the advantages of deep nets from a theoretical perspective.
Findings
Deep nets have excellent expressive power without a larger capacity than shallow nets.
The constructed deep nets achieve near-optimal learning rates.
Deep nets' expressivity is demonstrated through localized and sparse approximation.
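To make "localized approximation" concrete, here is a minimal sketch of a net with two hidden layers whose output is 1 on a cube and 0 away from it. It is an illustration only, not the construction from the paper: the ReLU activation, the function name localized_bump, and the parameters a, b, and eps are all assumptions chosen for this example.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def localized_bump(x, a, b, eps=0.05):
    """Two-hidden-layer ReLU net that is 1 on the cube [a, b]^d and 0
    whenever some coordinate lies outside [a - eps, b + eps].

    Illustrative assumption only; not the paper's construction.
    """
    x = np.atleast_2d(np.asarray(x, dtype=float))  # shape (n, d)
    d = x.shape[1]
    # Hidden layer 1: four ReLU units per coordinate, combined linearly
    # into a trapezoid that is 1 on [a, b] and 0 outside [a - eps, b + eps].
    up = relu((x - a) / eps + 1.0) - relu((x - a) / eps)
    down = relu((x - b) / eps) - relu((x - b) / eps - 1.0)
    trapezoid = up - down  # shape (n, d), values in [0, 1]
    # Hidden layer 2: one ReLU unit that fires only if every coordinate
    # is (nearly) inside the interval, i.e. the sum exceeds d - 1.
    return relu(trapezoid.sum(axis=1) - (d - 1))

inside = localized_bump([[0.5, 0.5]], a=0.4, b=0.6)
outside = localized_bump([[0.5, 0.9]], a=0.4, b=0.6)
```

In this sketch, inside evaluates to 1 and outside to 0, up to floating-point rounding. The second hidden layer is what turns d one-dimensional bumps into a single bump supported near the cube.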
Abstract
Along with the rapid development of deep learning in practice, theoretical explanations for its success have become urgent. Generalization and expressivity are two widely used measures for quantifying the theoretical behavior of deep learning. Expressivity focuses on finding functions that are expressible by deep nets but cannot be approximated by shallow nets with a similar number of neurons; it usually implies a large capacity. Generalization aims at deriving fast learning rates for deep nets; it usually requires a small capacity to reduce the variance. Different from previous studies on deep learning, which pursue either expressivity or generalization, we take both factors into account to explore the theoretical advantages of deep nets. For this purpose, we construct a deep net with two hidden layers possessing excellent expressivity in terms of localized and sparse approximation. Then,…
Taxonomy
Topics: Neural Networks and Applications · Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks
