Unsupervised Pretraining Encourages Moderate-Sparseness

Jun Li; Wei Luo; Jian Yang; Xiaotong Yuan

arXiv:1312.5813·cs.LG·June 10, 2014·2 cites

TL;DR

This paper explains that unsupervised pretraining improves neural network performance by inducing moderate sparsity in hidden unit activations, acting as an adaptive sparse coding mechanism, as supported by experiments on MNIST and Birdsong.

Contribution

The paper shows that pretraining encourages sparse hidden-unit activations in neural networks, offering an explanation for its effectiveness that goes beyond regularization and optimization.

Findings

01 Pretraining leads to moderate sparsity in hidden units.

02 Pretrained models can be viewed as adaptive sparse coders.

03 Experimental results support the sparseness hypothesis on MNIST and Birdsong.

Abstract

It is well known that directly training deep neural networks generally leads to poor results. A major advance in recent years is the invention of various pretraining methods for initializing network parameters, and such methods have been shown to lead to good prediction performance. However, the reason for the success of pretraining is not fully understood, although it has been argued that regularization and better optimization play certain roles. This paper provides another explanation for the effectiveness of pretraining: we show that pretraining leads to sparseness of hidden unit activations in the resulting neural networks. The main reason is that pretraining models can be interpreted as adaptive sparse coding. Experimental results on MNIST and Birdsong, in comparison with deep neural networks using the sigmoid function, further support this sparseness observation.
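To make "sparseness of hidden unit activations" concrete, here is a minimal sketch of one way such sparseness could be quantified. It uses Hoyer's sparseness measure, which is an assumption chosen for illustration; the abstract does not say which measure the paper uses. The function names and the example pre-activation values are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic sigmoid, the hidden-unit nonlinearity named in the abstract."""
    return 1.0 / (1.0 + math.exp(-z))


def hoyer_sparseness(activations):
    """Hoyer's sparseness of a vector (assumed measure, not confirmed by the source).

    Returns a value in [0, 1]: 0 when all entries have equal magnitude,
    1 when exactly one entry is non-zero.
    """
    n = len(activations)
    if n < 2:
        raise ValueError("need at least two hidden units")
    l1 = sum(abs(a) for a in activations)
    l2 = math.sqrt(sum(a * a for a in activations))
    if l2 == 0.0:
        raise ValueError("sparseness is undefined for an all-zero vector")
    return (math.sqrt(n) - l1 / l2) / (math.sqrt(n) - 1.0)


# Hypothetical pre-activations of four hidden units for one input.
pre_activations = [-4.0, -3.0, 5.0, -2.0]
hidden = [sigmoid(z) for z in pre_activations]

one_hot_score = hoyer_sparseness([0.0, 0.0, 1.0, 0.0])  # fully sparse case
uniform_score = hoyer_sparseness([1.0, 1.0, 1.0, 1.0])  # fully dense case
hidden_score = hoyer_sparseness(hidden)                 # somewhere in between

print(one_hot_score, uniform_score, hidden_score)
```

Averaging such a score over many inputs, once for a pretrained network and once for a randomly initialized one, is one way the comparison described in the abstract could be set up.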

Taxonomy

Topics: Neural Networks and Applications · Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference