Self-regularizing restricted Boltzmann machines

Orestis Loukas

arXiv:1912.05634·cond-mat.dis-nn·December 13, 2019·1 cites

Open Access

TL;DR

This paper introduces a grand-canonical extension of restricted Boltzmann machines that adaptively determines the optimal number of hidden units for efficient feature learning, demonstrated on Ising and MNIST data.

Contribution

It proposes a novel grand-canonical framework for RBMs allowing automatic hidden layer size adjustment, improving learning efficiency and reducing generalization error.

Findings

01. Model effectively deduces optimal hidden units

02. Achieves low generalization error

03. Demonstrates on Ising and MNIST datasets

Abstract

Focusing on the grand-canonical extension of the ordinary restricted Boltzmann machine, we suggest an energy-based model for feature extraction that uses a layer of hidden units with varying size. By an appropriate choice of the chemical potential and given a sufficiently large number of hidden resources, the generative model is able to efficiently deduce the optimal number of hidden units required to learn the target data with exceedingly small generalization error. The formal simplicity of the grand-canonical ensemble combined with a rapidly converging ansatz in mean-field theory enables us to recycle well-established numerical algorithms during training, like contrastive divergence, with only minor changes. As a proof of principle and to demonstrate the novel features of grand-canonical Boltzmann machines, we train our generative models on data from the Ising theory and MNIST.
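
For reference, the contrastive divergence step mentioned in the abstract can be sketched for an ordinary binary RBM with a fixed number of hidden units. This is only the baseline algorithm, not the paper's grand-canonical model: the chemical potential and the variable hidden-layer size are not implemented here. The function name `cd1_step`, the learning rate, and the array shapes are illustrative assumptions.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def cd1_step(v0, W, a, b, rng, lr=0.1):
    """One CD-1 update for an ordinary binary RBM (fixed hidden-layer size).

    v0: (batch, n_visible) binary data
    W:  (n_visible, n_hidden) weights
    a:  (n_visible,) visible biases
    b:  (n_hidden,) hidden biases
    Returns the updated (W, a, b). Hypothetical helper, not from the paper.
    """
    # Positive phase: hidden activation probabilities given the data.
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Negative phase: one Gibbs step down to the visible layer and up again.
    pv1 = sigmoid(h0 @ W.T + a)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + b)

    # Gradient estimate: data statistics minus one-step reconstruction statistics.
    n = v0.shape[0]
    W = W + lr * (v0.T @ ph0 - v1.T @ ph1) / n
    a = a + lr * (v0 - v1).mean(axis=0)
    b = b + lr * (ph0 - ph1).mean(axis=0)
    return W, a, b


# Usage on random binary data (placeholder input, not Ising or MNIST samples).
rng = np.random.default_rng(0)
data = (rng.random((64, 6)) < 0.5).astype(float)
W = 0.01 * rng.standard_normal((6, 3))
a = np.zeros(6)
b = np.zeros(3)
for _ in range(10):
    W, a, b = cd1_step(data, W, a, b, rng)
```

In this baseline the number of hidden units (3 above) is fixed by hand; according to the abstract, the grand-canonical extension instead lets the model deduce that number, with only minor changes to a training loop of this kind.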

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Model Reduction and Neural Networks · Neural Networks and Applications