Sparse Group Restricted Boltzmann Machines

Heng Luo; Ruimin Shen; Changyong Niu

arXiv:1008.4988 · stat.ML · August 31, 2010 · AAAI · 32 cites

TL;DR

This paper introduces sparse group regularization for Restricted Boltzmann Machines to promote group and individual sparsity, improving modeling efficiency and achieving state-of-the-art results on MNIST.

Contribution

It proposes a novel $l_1/l_2$ regularization method for RBMs, enabling group sparsity and competition among hidden units, and extends this approach to deep Boltzmann machines.
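The mixed-norm penalty named above can be written out as follows. This is a sketch under assumed notation that does not appear in this summary: the hidden units are partitioned into $K$ non-overlapping groups $G_1, \dots, G_K$, and $P(h_j = 1 \mid \mathbf{v})$ is the activation probability of hidden unit $j$ given a visible vector $\mathbf{v}$.

```latex
\Omega(\mathbf{v}) \;=\; \sum_{k=1}^{K} \Bigl( \sum_{j \in G_k} P(h_j = 1 \mid \mathbf{v})^{2} \Bigr)^{1/2}
```

The inner term is the $l_2$ norm of one group's activation probabilities, and the outer sum acts as an $l_1$ norm over those group norms, which is what tends to drive whole groups toward inactivity.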

Findings

1. Achieved 0.84% error rate on MNIST, the best published result.

2. Demonstrated effectiveness in modeling natural image patches and handwritten digits.

3. Extended the regularizer to deep Boltzmann machines for enhanced sparsity.

Abstract

Since learning is typically very slow in Boltzmann machines, there is a need to restrict connections within hidden layers. However, the resulting states of hidden units exhibit statistical dependencies. Based on this observation, we propose using $l_{1} / l_{2}$ regularization on the activation probabilities of hidden units in restricted Boltzmann machines to capture the local dependencies among hidden units. This regularization not only encourages hidden units of many groups to be inactive given observed data but also makes hidden units within a group compete with each other for modeling observed data. Thus, the $l_{1} / l_{2}$ regularization on RBMs yields sparsity at both the group and the hidden unit levels. We call RBMs trained with the regularizer \emph{sparse group} RBMs. The proposed sparse group RBMs are applied to three tasks: modeling patches of natural images, modeling handwritten…
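To make the group-level effect of the $l_{1} / l_{2}$ penalty concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the function name `group_l1_l2_penalty`, the `group_size` parameter, and the assumption of equal-sized, contiguous, non-overlapping groups are all illustrative choices. It only evaluates the penalty on given activation probabilities and does not train an RBM.

```python
import numpy as np


def group_l1_l2_penalty(probs, group_size):
    """Mixed-norm penalty on hidden activation probabilities.

    probs: array of shape (batch, n_hidden) holding P(h_j = 1 | v).
    group_size: number of hidden units per group (hypothetical layout:
        contiguous, equal-sized, non-overlapping groups).
    Returns an array of shape (batch,) with one penalty per example.
    """
    probs = np.asarray(probs, dtype=float)
    batch, n_hidden = probs.shape
    if n_hidden % group_size != 0:
        raise ValueError("n_hidden must be divisible by group_size")
    # Reshape to (batch, n_groups, group_size) so each group is one row.
    grouped = probs.reshape(batch, n_hidden // group_size, group_size)
    # l2 norm inside each group, then a plain sum (l1) across groups.
    group_norms = np.sqrt((grouped ** 2).sum(axis=2))
    return group_norms.sum(axis=1)


# Same two active units, placed in one group versus spread over two groups.
one_group = group_l1_l2_penalty([[0.6, 0.6, 0.0, 0.0]], group_size=2)
two_groups = group_l1_l2_penalty([[0.6, 0.0, 0.6, 0.0]], group_size=2)
print(one_group, two_groups)
```

In this toy comparison the penalty is lower when the active units share a group than when they are spread across groups, which illustrates why the regularizer favors leaving many groups entirely inactive.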

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Domain Adaptation and Few-Shot Learning · Music and Audio Processing