Layerwise Systematic Scan: Deep Boltzmann Machines and Beyond
Heng Guo, Kaan Kara, Ce Zhang

TL;DR
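This paper analyzes Gibbs samplers that use a layerwise scan order on bipartite models such as Restricted and Deep Boltzmann Machines. It shows that the relaxation time of the layerwise alternating scan, measured in epochs, is no larger than that of the random-update sampler, measured in variable updates, and that this bound is asymptotically tight.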
Contribution
It provides a theoretical comparison between the layerwise systematic scan and random updates for bipartite models, with a relaxation-time bound that constructed examples show to be asymptotically tight.
Findings
The relaxation time of the layerwise alternating scan (in epochs) is no larger than that of the random-update Gibbs sampler (in variable updates).
Constructed examples show the bound is asymptotically tight.
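Through standard inequalities, the relaxation-time result also implies a comparison of mixing times, which applies to the layerwise scans used in practical Restricted/Deep Boltzmann Machine implementations.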
Abstract
For Markov chain Monte Carlo methods, one of the greatest discrepancies between theory and system is the scan order: while most theoretical development on mixing time analysis deals with random updates, real-world systems are implemented with systematic scans. We bridge this gap for models that exhibit a bipartite structure, including, most notably, the Restricted/Deep Boltzmann Machine. The de facto implementation for these models scans variables in a layerwise fashion. We show that the Gibbs sampler with a layerwise alternating scan order has its relaxation time (in terms of epochs) no larger than that of a random-update Gibbs sampler (in terms of variable updates). We also construct examples to show that this bound is asymptotically tight. Through standard inequalities, our result also implies a comparison of the mixing times.
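To make the two scan orders concrete, here is a minimal sketch of both samplers on a small Restricted Boltzmann Machine with binary units. It is an illustration only, not the authors' code. The energy parameterization (weights `W`, visible biases `a`, hidden biases `b`) and all function names are assumptions chosen for this example. One call to `layerwise_epoch` resamples every variable once (an epoch), while one call to `random_update_step` resamples a single uniformly chosen variable.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def cond_prob_hidden(j, v, W, b):
    # P(h_j = 1 | v) for an RBM with p(v, h) proportional to
    # exp(a.v + b.h + v^T W h). Hypothetical parameterization.
    return sigmoid(b[j] + sum(v[i] * W[i][j] for i in range(len(v))))


def cond_prob_visible(i, h, W, a):
    # P(v_i = 1 | h) under the same assumed parameterization.
    return sigmoid(a[i] + sum(W[i][j] * h[j] for j in range(len(h))))


def layerwise_epoch(v, h, W, a, b, rng):
    # Layerwise alternating scan: resample the whole hidden layer given v,
    # then the whole visible layer given the new h. Because the model is
    # bipartite, units within a layer are conditionally independent.
    h = [1 if rng.random() < cond_prob_hidden(j, v, W, b) else 0
         for j in range(len(h))]
    v = [1 if rng.random() < cond_prob_visible(i, h, W, a) else 0
         for i in range(len(v))]
    return v, h


def random_update_step(v, h, W, a, b, rng):
    # Random-update Gibbs: pick one of the n + m variables uniformly at
    # random and resample it from its conditional distribution.
    v, h = list(v), list(h)
    k = rng.randrange(len(v) + len(h))
    if k < len(v):
        v[k] = 1 if rng.random() < cond_prob_visible(k, h, W, a) else 0
    else:
        j = k - len(v)
        h[j] = 1 if rng.random() < cond_prob_hidden(j, v, W, b) else 0
    return v, h


if __name__ == "__main__":
    rng = random.Random(0)
    n, m = 4, 3  # visible and hidden layer sizes (arbitrary)
    W = [[rng.gauss(0.0, 0.5) for _ in range(m)] for _ in range(n)]
    a, b = [0.0] * n, [0.0] * m
    v, h = [0] * n, [0] * m
    for _ in range(10):  # ten epochs of the layerwise scan
        v, h = layerwise_epoch(v, h, W, a, b, rng)
    print("layerwise state:", v, h)
```

The units of comparison in the abstract correspond to these two functions: the relaxation time of the layerwise chain counts calls to `layerwise_epoch`, and that of the random-update chain counts calls to `random_update_step`.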
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Generative Adversarial Networks and Image Synthesis · Machine Learning and Algorithms
