A random energy approach to deep learning
Rongrong Xie, Matteo Marsili

TL;DR
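This paper introduces a random energy framework for analyzing deep belief networks. It shows that statistical dependence propagates from the visible to the deep layers only when each layer is tuned close to its critical point, which implies that efficiently trained networks have a broad distribution of energy levels.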
Contribution
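It presents a random energy approach to deep belief networks, parametrized by the distribution of energy levels of each layer's hidden states, and shows that each layer must be tuned close to the critical point during learning for statistical dependence to reach the deep layers.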
Findings
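Efficiently trained learning machines have a broad distribution of energy levels.
Statistical dependence can propagate from the visible to the deep layers only if each layer is tuned close to the critical point during learning.
Analysis of Deep Belief Networks and Restricted Boltzmann Machines on different datasets confirms these conclusions.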
Abstract
We study a generic ensemble of deep belief networks which is parametrized by the distribution of energy levels of the hidden states of each layer. We show that, within a random energy approach, statistical dependence can propagate from the visible to deep layers only if each layer is tuned close to the critical point during learning. As a consequence, efficiently trained learning machines are characterised by a broad distribution of energy levels. The analysis of Deep Belief Networks and Restricted Boltzmann Machines on different datasets confirms these conclusions.
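To make "distribution of energy levels of the hidden states" concrete, here is a minimal sketch in Python. It is not the authors' code. It assumes a tiny binary Restricted Boltzmann Machine with the standard energy E(v, h) = -b·v - c·h - vᵀWh, and it assumes the energy level of a hidden state h is defined as -log p(h), where p(h) is the marginal over visible units. The function name `hidden_energy_levels` and all parameter values are hypothetical and chosen for illustration only.

```python
import itertools
import numpy as np

def hidden_energy_levels(W, b, c):
    """Return E(h) = -log p(h) for every hidden state of a tiny binary RBM.

    W: (n_visible, n_hidden) weights, b: visible biases, c: hidden biases.
    Assumption: energy levels are defined as minus the log marginal probability.
    """
    n_hidden = W.shape[1]
    # Enumerate all 2**n_hidden binary hidden configurations (small n_hidden only).
    H = np.array(list(itertools.product([0, 1], repeat=n_hidden)), dtype=float)
    # Summing out the visible units gives
    # log p(h) + log Z = c.h + sum_i log(1 + exp(b_i + (W h)_i)).
    pre = H @ W.T + b  # shape (2**n_hidden, n_visible)
    log_unnorm = H @ c + np.logaddexp(0.0, pre).sum(axis=1)
    # Normalise with a numerically stable log-sum-exp.
    m = log_unnorm.max()
    log_Z = m + np.log(np.exp(log_unnorm - m).sum())
    return -(log_unnorm - log_Z)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Illustrative weight scales; larger weights spread the energy levels out.
    for scale in (0.1, 1.0, 3.0):
        W = scale * rng.standard_normal((6, 4))
        E = hidden_energy_levels(W, np.zeros(6), np.zeros(4))
        print(f"scale={scale}: std of energy levels = {E.std():.3f}")
```

With all weights and biases set to zero, every hidden state is equally likely and the energy levels collapse to a single value. The spread of the levels (here their standard deviation) is one simple way to quantify how broad the distribution is. Exhaustive enumeration is only feasible for a handful of hidden units; larger layers would need sampling instead.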
