Learning Feature Hierarchies with Centered Deep Boltzmann Machines

Grégoire Montavon; Klaus-Robert Müller

arXiv:1203.3783·stat.ML·December 19, 2012·20 cites

Open Access

TL;DR

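This paper introduces the centered deep Boltzmann machine, trained with a modified learning algorithm that recenters the output of the activation functions to zero. The change yields a better-conditioned Hessian, makes learning easier, and lets the model learn hierarchical data representations more effectively.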

Contribution

The paper proposes a new centering technique for Deep Boltzmann Machines that enhances training and allows for joint layer learning without greedy pretraining.

Findings

01

Centered Deep Boltzmann Machines learn hierarchical data representations.

02

The modified learning algorithm leads to a better-conditioned Hessian during training.

03

The model demonstrates improved generative capabilities on real data.

Abstract

Deep Boltzmann machines are in principle powerful models for extracting the hierarchical structure of data. Unfortunately, attempts to train layers jointly (without greedy layer-wise pretraining) have been largely unsuccessful. We propose a modification of the learning algorithm that initially recenters the output of the activation functions to zero. This modification leads to a better conditioned Hessian and thus makes learning easier. We test the algorithm on real data and demonstrate that our suggestion, the centered deep Boltzmann machine, learns a hierarchy of increasingly abstract representations and a better generative model of data.
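
The recentering idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it covers a single layer only, and the function names (`hidden_prob`, `centered_hidden`), the offsets `alpha` and `beta`, and the toy weights are assumptions introduced here for illustration. It assumes sigmoid units whose inputs and outputs are shifted by fixed offsets, with the output offset set to the sigmoid of the initial bias, so that the centered output is zero at the start of training.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def hidden_prob(x, W, b, alpha):
    """Activation probability of each hidden unit given the centered input x - alpha.

    W[i][j] connects input i to hidden unit j; b[j] is the hidden bias.
    alpha is a hypothetical input offset (e.g. the mean of the data).
    """
    return [
        sigmoid(sum((x[i] - alpha[i]) * W[i][j] for i in range(len(x))) + b[j])
        for j in range(len(b))
    ]

def centered_hidden(x, W, b, alpha, beta):
    """Hidden activations after subtracting the output offset beta."""
    return [p - bj for p, bj in zip(hidden_prob(x, W, b, alpha), beta)]

# Toy parameters (assumed values, not taken from the paper).
W = [[0.3, -0.2], [0.1, 0.4], [-0.5, 0.2]]
b = [0.0, 0.0]                      # initial hidden biases
alpha = [0.5, 0.5, 0.5]             # input offsets
beta = [sigmoid(bj) for bj in b]    # output offsets from the initial biases

# At the "average" input x = alpha, the centered outputs are zero.
at_mean = centered_hidden(alpha, W, b, alpha, beta)

# Centering the input is a reparameterization: the same probabilities come
# from an uncentered unit whose bias is shifted to b_j - sum_i alpha_i W[i][j].
x = [1.0, 0.0, 1.0]
b_shifted = [b[j] - sum(alpha[i] * W[i][j] for i in range(3)) for j in range(2)]
p_centered = hidden_prob(x, W, b, alpha)
p_uncentered = hidden_prob(x, W, b_shifted, [0.0, 0.0, 0.0])
```

In this sketch the centered and uncentered parameterizations give the same activation probabilities, so centering does not change what a unit can represent; according to the abstract, its benefit is in the conditioning of the Hessian during learning, which this example does not measure.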

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Music and Audio Processing · Model Reduction and Neural Networks