Replica Theory of Spherical Boltzmann Machine Ensembles
Thomas Tulinski (LPENS), Jorge Fernandez-De-Cossio-Diaz (IPHT, LPENS), Simona Cocco (LPENS), Rémi Monasson

TL;DR
This paper develops an analytical framework based on replica theory to explain when ensemble learning improves the performance of spherical Boltzmann machines, in particular for nearly finite-dimensional data.
Contribution
It exploits a duality between ensemble learning and large deviations of the free energy in spin-glass models, yielding an exact solution for spherical Boltzmann machine ensembles and clarifying when ensembles outperform standard loss minimization.
Findings
Ensemble learning improves performance for nearly finite-dimensional data.
Replica calculations fully solve spherical Boltzmann machine ensembles.
The framework also applies to complex data distributions, in agreement with numerical simulations on deep networks.
Abstract
Training in machine learning generally consists in finding one model, whose parameters minimize a data-dependent loss. Yet, empirical work shows that ensemble learning, an approach in which multiple models are sampled, can improve performance. Here, we provide an analytical framework to understand these observations in the case of Boltzmann machines, exploiting a duality between ensemble learning and large deviations of the free energy in spin-glass models. Replica calculations allow us to fully solve the case of spherical Boltzmann machine ensembles, and clarify when ensemble learning improves over standard loss minimization, in particular for nearly finite-dimensional data. Our framework can also be applied to complex data distributions, in agreement with numerical simulations on deep networks.
