Using Inherent Structures to design Lean 2-layer RBMs
Abhishek Bansal, Abhinav Anand, Chiranjib Bhattacharyya

TL;DR
This paper introduces the Inherent Structure Capacity (ISC) to measure RBM representation power, showing that two-layer 'Lean' RBMs can match single-layer RBMs' capacity with fewer parameters, highlighting the importance of layering.
Contribution
It proposes ISC as a novel measure for RBM capacity and demonstrates that two-layer Lean RBMs can achieve similar capacity to larger single-layer RBMs with fewer parameters.
Findings
In single-layer RBMs, ISC approaches a finite constant as the number of hidden units increases.
Two-layer Lean RBMs can match the capacity of large single-layer RBMs with significantly fewer parameters.
These results provide the first quantitative evidence for the necessity of layering in RBMs.
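To make the parameter comparison in the findings concrete, the following sketch counts weights and biases for a stack of fully connected RBM layers. The function name `rbm_param_count` and all layer sizes are hypothetical and are not taken from the paper; the sketch only illustrates how parameter counts scale with architecture and says nothing about whether the two capacities actually match.

```python
def rbm_param_count(layer_sizes):
    """Count weights and biases for a layered RBM.

    layer_sizes[0] is the number of visible units; the remaining
    entries are hidden-layer widths. Adjacent layers are assumed to
    be fully connected, with one bias per unit.
    """
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    biases = sum(layer_sizes)
    return weights + biases


# Hypothetical sizes, chosen only for illustration.
wide_single_layer = rbm_param_count([100, 10_000])   # one very wide hidden layer
narrow_two_layer = rbm_param_count([100, 100, 100])  # two narrow hidden layers
print(wide_single_layer, narrow_two_layer)
```

With one hidden layer the count grows as the product of the visible and hidden widths, so a second narrow layer costs far fewer parameters than widening the first.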
Abstract
The representational power of Restricted Boltzmann Machines (RBMs) with multiple layers is poorly understood and remains an area of active research. Motivated by the Inherent Structure formalism (Stillinger & Weber, 1982), which is used extensively in analysing spin glasses, we propose a novel measure called Inherent Structure Capacity (ISC), which characterizes the representation capacity of a fixed-architecture RBM by the expected number of modes of the distributions it defines when its parameters are drawn from a prior distribution. Though ISC is intractable, we show that for a single-layer RBM architecture ISC approaches a finite constant as the number of hidden units is increased, and that to improve the ISC further one needs to add a second layer. Furthermore, we introduce Lean RBMs, which are multi-layer RBMs where each layer can have at-most…
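The definition of ISC as an expected mode count can be illustrated by brute force on a very small single-layer RBM. The sketch below is not the paper's method. It assumes binary {0,1} units, zero biases, a standard Gaussian prior on the weights, and takes a "mode" to be a visible configuration whose marginal probability is strictly greater than that of every configuration at Hamming distance one. The function names are hypothetical, and the enumeration is exponential in the number of visible units, so it is only feasible for toy sizes.

```python
import itertools
import numpy as np


def log_unnormalized_marginal(v, W, b, c):
    """Log of the unnormalized RBM marginal p(v) for v in {0,1}^n.

    Summing out binary hidden units gives
    b.v + sum_j softplus(c_j + (v^T W)_j).
    """
    return v @ b + np.sum(np.logaddexp(0.0, c + v @ W))


def count_modes(W, b, c):
    """Count visible states that strictly beat all one-bit-flip neighbours."""
    n = W.shape[0]
    states = np.array(list(itertools.product([0, 1], repeat=n)), dtype=float)
    scores = np.array([log_unnormalized_marginal(v, W, b, c) for v in states])
    modes = 0
    for idx in range(2 ** n):
        # XOR with a single set bit yields the index of a Hamming neighbour.
        neighbours = [idx ^ (1 << bit) for bit in range(n)]
        if all(scores[idx] > scores[k] for k in neighbours):
            modes += 1
    return modes


def estimate_isc(n_visible, n_hidden, n_samples=100, seed=0):
    """Monte Carlo average of the mode count under an assumed N(0, 1) weight prior."""
    rng = np.random.default_rng(seed)
    b = np.zeros(n_visible)  # assumption: no visible biases
    c = np.zeros(n_hidden)   # assumption: no hidden biases
    counts = [
        count_modes(rng.standard_normal((n_visible, n_hidden)), b, c)
        for _ in range(n_samples)
    ]
    return float(np.mean(counts))


print(estimate_isc(n_visible=6, n_hidden=4, n_samples=50))
```

Because two modes can never be Hamming neighbours, the count for n visible units is at most 2^(n-1), which gives a quick sanity check on any estimate.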
Taxonomy
Topics: Theoretical and Computational Physics · Generative Adversarial Networks and Image Synthesis · Stochastic Gradient Optimization Techniques
