Do Deep Nets Really Need to be Deep?

Lei Jimmy Ba; Rich Caruana

arXiv:1312.6184 · cs.LG · October 14, 2014 · 1.5k cites

TL;DR

This paper shows that shallow feed-forward networks, trained to mimic deep models, can learn the complex functions those deep models learned. They reach similar accuracy and, in some cases, use a total number of parameters comparable to the original deep model, which calls into question whether depth itself is necessary.
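To make "comparable parameters" concrete, the sketch below counts weights and biases in fully-connected nets. The layer widths are hypothetical values chosen for illustration and are not taken from the paper; `dense_param_count` is a helper name introduced here.

```python
def dense_param_count(layer_sizes):
    """Count weights plus biases of a fully-connected net.

    layer_sizes lists the width of every layer, input and output included.
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus bias vector per layer
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Hypothetical widths, not the architectures used in the paper.
deep_sizes = [100, 500, 500, 500, 10]  # three hidden layers
shallow_sizes = [100, 5000, 10]        # one wide hidden layer

deep_params = dense_param_count(deep_sizes)
shallow_params = dense_param_count(shallow_sizes)
print(deep_params, shallow_params)
```

With these widths the single wide hidden layer lands within about one percent of the deep net's parameter budget, which is the kind of comparison the summary refers to.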

Contribution

It shows that shallow feed-forward networks, when trained to mimic deep models, can match the accuracy of those models on the TIMIT phoneme recognition task, which suggests that better algorithms for training shallow nets probably exist.

Findings

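01 Shallow nets can learn the complex functions previously learned by deep nets.

02 Shallow fully-connected nets reached accuracy similar to well-engineered deep convolutional architectures on TIMIT phoneme recognition.

03 The success of mimic training suggests that better algorithms for training shallow nets probably exist than those currently available.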

Abstract

Currently, deep neural networks are the state of the art on problems such as speech recognition and computer vision. In this extended abstract, we show that shallow feed-forward networks can learn the complex functions previously learned by deep nets and achieve accuracies previously only achievable with deep models. Moreover, in some cases the shallow neural nets can learn these deep functions using a total number of parameters similar to the original deep model. We evaluate our method on the TIMIT phoneme recognition task and are able to train shallow fully-connected nets that perform similarly to complex, well-engineered, deep convolutional architectures. Our success in training shallow neural nets to mimic deeper models suggests that there probably exist better algorithms for training shallow feed-forward nets than those currently available.
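The mimic idea described above can be sketched as follows. The abstract does not state the training objective, so this example assumes one common choice: the shallow student regresses the teacher's pre-softmax outputs (logits) with a squared-error loss. The teacher is a stand-in with fixed random weights rather than a trained model, and all sizes, names, and the learning rate are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only so the example runs quickly.
n_samples, n_inputs, n_hidden, n_classes = 512, 20, 128, 5


def init_layer(n_in, n_out):
    """Small random weights and zero biases for one fully-connected layer."""
    return rng.normal(0.0, 1.0 / np.sqrt(n_in), (n_in, n_out)), np.zeros(n_out)


# Stand-in "deep" teacher: three hidden ReLU layers with fixed random weights.
# Assumption: in practice this would be a trained deep model.
teacher = [init_layer(n_inputs, 32), init_layer(32, 32),
           init_layer(32, 32), init_layer(32, n_classes)]


def teacher_logits(x):
    h = x
    for w, b in teacher[:-1]:
        h = np.maximum(h @ w + b, 0.0)
    w, b = teacher[-1]
    return h @ w + b


x = rng.normal(size=(n_samples, n_inputs))
targets = teacher_logits(x)  # the student is trained on logits, not labels

# Shallow student: a single hidden ReLU layer.
w1, b1 = init_layer(n_inputs, n_hidden)
w2, b2 = init_layer(n_hidden, n_classes)


def student_logits(x):
    h = np.maximum(x @ w1 + b1, 0.0)
    return h, h @ w2 + b2


def mimic_loss(x, targets):
    """Half the mean squared distance between student and teacher logits."""
    _, out = student_logits(x)
    return 0.5 * np.mean(np.sum((out - targets) ** 2, axis=1))


initial_loss = mimic_loss(x, targets)
learning_rate = 0.01
for step in range(500):
    # Forward pass, then manual backpropagation of the squared-error loss.
    h, out = student_logits(x)
    grad_out = (out - targets) / n_samples
    grad_w2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_pre = (grad_out @ w2.T) * (h > 0)
    grad_w1 = x.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)
    # Full-batch gradient descent update.
    w1 -= learning_rate * grad_w1
    b1 -= learning_rate * grad_b1
    w2 -= learning_rate * grad_w2
    b2 -= learning_rate * grad_b2

final_loss = mimic_loss(x, targets)
print(initial_loss, final_loss)
```

The student never sees class labels here; its only supervision is the teacher's output, which is what separates mimic training from training the shallow net directly on the original data.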

Code & Models

Videos

Do Deep Nets Really Need To Be Deep? · YouTube

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Speech Recognition and Synthesis