Fundamental operating regimes, hyper-parameter fine-tuning and glassiness: towards an interpretable replica-theory for trained restricted Boltzmann machines
Alberto Fachechi, Elena Agliari, Miriam Aquaro, Anthony Coolen, Menno Mulder

TL;DR
This paper develops a statistical mechanics framework for analyzing trained restricted Boltzmann machines (RBMs). It identifies qualitatively different operating regimes and the region of hyper-parameter space where replica symmetry breaking occurs, making the behaviour of these models more interpretable.
Contribution
It introduces a replica-theory-based approach to characterize the regimes of trained RBMs, including hyper-parameter effects and glassiness phenomena, with analytical and numerical validation.
Findings
Identification of different operational regimes based on hyper-parameters
Existence of a replica-symmetry breaking region in hyper-parameter space
Analytical and numerical evidence supporting the theoretical framework
Abstract
We consider restricted Boltzmann machines with a binary visible layer and a Gaussian hidden layer trained on an unlabelled dataset composed of noisy realizations of a single ground pattern. We develop a statistical mechanics framework to describe the network's generative capabilities, by exploiting the replica trick and assuming self-averaging of the underlying order parameters (i.e., replica symmetry). In particular, we outline the effective control parameters (e.g., the relative number of weights to be trained, the regularization parameter), whose tuning can yield qualitatively different operative regimes. Further, we provide analytical and numerical evidence for the existence of a sub-region in the space of the hyperparameters where replica-symmetry breaking occurs.
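The training set described in the abstract, noisy realizations of a single ground pattern, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the ±1 encoding of the binary units, the independent-flip noise model, and the names make_noisy_dataset, overlap and quality are all hypothetical.

```python
import random


def make_noisy_dataset(n_visible, n_examples, quality, seed=0):
    """Build a random +/-1 ground pattern and noisy copies of it.

    Assumption (not from the paper): each entry of an example keeps the
    ground-pattern value with probability (1 + quality) / 2 and is
    flipped otherwise, independently of all other entries.
    """
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must lie in [0, 1]")
    rng = random.Random(seed)
    ground = [rng.choice((-1, 1)) for _ in range(n_visible)]
    p_keep = (1.0 + quality) / 2.0
    examples = [
        [g if rng.random() < p_keep else -g for g in ground]
        for _ in range(n_examples)
    ]
    return ground, examples


def overlap(a, b):
    """Normalized scalar product of two +/-1 vectors, in [-1, 1]."""
    return sum(x * y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    ground, examples = make_noisy_dataset(n_visible=200, n_examples=20, quality=0.6)
    mean_overlap = sum(overlap(ground, e) for e in examples) / len(examples)
    print(f"mean overlap with the ground pattern: {mean_overlap:.3f}")
```

Under this noise model the expected overlap between an example and the ground pattern equals quality, so quality = 1 gives exact copies and quality = 0 gives examples uncorrelated with the pattern.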
Taxonomy
Topics: Advanced Thermodynamics and Statistical Mechanics · Theoretical and Computational Physics · Quantum many-body systems
