Overlap synchronisation in multipartite random energy models
Giuseppe Genovese, Daniele Tantari

TL;DR
This paper investigates a multipartite random energy model composed of coupled GREMs, establishing how the overlaps between different parts synchronize based on the overlaps of individual GREMs, illustrating a fundamental phenomenon called overlap synchronisation.
Contribution
It provides the first explicit characterization of overlap synchronisation in multipartite random energy models with coupled GREMs.
Findings
Derived the joint law of overlaps in multipartite models
Established the phenomenon of overlap synchronisation
Simplified understanding of coupled GREM interactions
Abstract
In a multipartite random energy model, made of a number of coupled GREMs, we determine the joint law of the overlaps in terms of the ones of the single GREMs. This provides the simplest example of the so-called overlap synchronisation.
Giuseppe Genovese: Institut für Mathematik, Universität Zürich, CH-8057 Zürich, Switzerland.
Daniele Tantari: Centro Ennio de Giorgi, Scuola Normale Superiore, Piazza dei Cavalieri 3, I-56100 Pisa, Italy.
MSC: 82B44, 60G55, 60K35.
1. Introduction
The overlap synchronisation phenomenon was recently introduced by Panchenko in [1] for multipartite spin glasses [2]. The study of such systems is of primary interest because of their applications in neural network theory and statistical inference [3, 4]: for example, the Hopfield model, the restricted Boltzmann machine and the perceptron are bipartite spin glasses. The lack of convexity prevents the direct application to multipartite spin glasses of some useful techniques developed for the Sherrington-Kirkpatrick model (i.e. interpolation bounds [5]), which calls for new ideas.
In this note we investigate a multipartite random energy model, originally studied for the bipartite case in [6], obtained by coupling each level of distinct generalised random energy models (GREMs). We show that the joint law of the overlaps has a direct expression in terms of those of the single GREMs. This provides a simple example of overlap synchronisation.
The model is defined as follows. Let , and with , . For each configuration we can identify , . We divide each part respectively into hierarchical levels. For each level of the hierarchy, each group of configurations is divided in further subgroups indexed by , with of course and , . Each configuration can be thought of as a -ple or as a -ple . This multipartite setting brings a somewhat heavy notation. To lighten it a little we let
[TABLE]
label the configurations in the -th level of the -th tree. With a slight abuse of notation we also denote by the same symbol the set of such configurations (the correct meaning will always be clear from the context). We attach to each couple of levels centred Gaussian random variables , and with
[TABLE]
The levels interact via the following Hamiltonian
[TABLE]
with
[TABLE]
We can introduce different partial overlaps between two different configurations as
[TABLE]
and if . Then a direct computation gives
[TABLE]
It is convenient to set the overlaps in : we introduce sequences of numbers in
[TABLE]
and put . We also define the total overlap to be . As customary for () the free energy is given by
[TABLE]
Of course, as a consequence of Talagrand's inequality, is self-averaging as , so we can always take the expectation w.r.t. the disorder when needed. Here and in what follows we denote by the Gibbs distribution associated to the model and by the quenched average of observables (we drop the subscript in the thermodynamic limit) and , .
Our main result is
Theorem.
Let be a random variable uniformly distributed in . Then
[TABLE]
This result can be given also in terms of the total overlap (as in [1]):
[TABLE]
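The theorem expresses the joint law of the overlaps through a single uniform random variable. One way to picture this, in the spirit of the synchronisation of [1], is a coupling in which every partial overlap is a nondecreasing function of the same uniform variable, so that the joint law is fixed by the marginals. The following is only a minimal sketch of that idea: the marginal laws, the function names and the use of plain quantile functions are assumptions made for illustration, and the exact functions appearing in the theorem are not reproduced here.

```python
import random
from bisect import bisect_left
from itertools import accumulate

def quantile_function(values, probs):
    """Return u -> smallest value whose cumulative probability is >= u.

    `values` must be listed in increasing order, with matching `probs`.
    """
    cumulative = list(accumulate(probs))

    def quantile(u):
        # Clamp the index to guard against rounding in the cumulative sums.
        index = min(bisect_left(cumulative, u), len(values) - 1)
        return values[index]

    return quantile

def sample_synchronised(quantiles, n_samples, rng):
    """Draw overlap vectors by feeding ONE uniform variable to every quantile function."""
    samples = []
    for _ in range(n_samples):
        u = rng.random()
        samples.append(tuple(q(u) for q in quantiles))
    return samples

# Hypothetical marginal laws of two partial overlaps (not taken from the paper).
q1 = quantile_function([0.0, 0.5, 1.0], [0.5, 0.3, 0.2])
q2 = quantile_function([0.0, 1.0], [0.7, 0.3])
pairs = sample_synchronised([q1, q2], 1000, random.Random(0))
```

In such a coupling the sampled pairs lie on a monotone path: whenever one overlap increases between two samples, the other cannot decrease.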
A larger class of non-hierarchical random energy models including the one under consideration was studied by Bolthausen and Kistler in [7, 8]. We shall make use of some crucial ideas from those two papers, in which the so-called Parisi picture is proved. A more precise formulation of the results in [7, 8] will be given below.
2. More on the Model
Before giving the proof, it is convenient to discuss the model a little further. What follows is in good part heuristic; rigorous proofs can be found in [7, 8].
For and , we simply recover the usual REM. We briefly summarise some basic features of this well-known model. The model has a phase transition at , so that for and otherwise. The free energy reads
[TABLE]
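As a numerical illustration of the REM phase transition, the sketch below compares a simulated finite-size pressure with the well-known limiting formula. It assumes the standard normalisation, which may differ from the one used here by constants: 2^N configurations with independent energies sqrt(N) g, g standard Gaussian, so that the critical inverse temperature is sqrt(2 log 2). All function names are hypothetical.

```python
import math
import random

# Critical inverse temperature of the REM in the assumed normalisation.
BETA_C = math.sqrt(2.0 * math.log(2.0))

def rem_limit_pressure(beta):
    """Limiting value of (1/N) log Z_N for the standard REM (assumed normalisation)."""
    if beta <= BETA_C:
        return math.log(2.0) + 0.5 * beta * beta  # annealed, high-temperature branch
    return beta * BETA_C                           # frozen, low-temperature branch

def rem_sample_pressure(n_spins, beta, rng):
    """(1/N) log Z_N for one disorder sample with energies sqrt(N) * g, g ~ N(0, 1)."""
    scale = beta * math.sqrt(n_spins)
    exponents = [scale * rng.gauss(0.0, 1.0) for _ in range(2 ** n_spins)]
    top = max(exponents)  # log-sum-exp trick to avoid overflow
    log_z = top + math.log(sum(math.exp(e - top) for e in exponents))
    return log_z / n_spins

if __name__ == "__main__":
    rng = random.Random(0)
    for beta in (0.5, 1.0, 2.0):
        print(beta, rem_sample_pressure(14, beta, rng), rem_limit_pressure(beta))
```

At small system sizes the agreement is good at high temperature and only approximate in the frozen phase, where finite-size corrections are larger.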
Next consider for simplicity the bipartite model () with , defined by the Hamiltonian (we set , and )
[TABLE]
(this was also analysed in [9] from a slightly different perspective). If we assume for definiteness , there are two possibilities: either or . The first case is less interesting and we focus on the second one. At very high temperature everything is ergodic and the free energy coincides with the annealed one. As , the -subset freezes, i.e. its relative entropy goes to zero (as in the first transition in a GREM [10]), and one can show that
[TABLE]
where is a normalisation factor and denotes the law of a normalised Poisson point process with intensity or Poisson-Dirichlet distribution. In this regime for any , while . The free energy is a convex combination (with ) of two REMs, one on the subset at low temperature and the other on the rest of the system at high temperature. As increases further, the total entropy vanishes for and the whole Gibbs measure converges toward a Poisson-Dirichlet process
[TABLE]
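The Poisson-Dirichlet weights that describe the frozen Gibbs measure can be sampled directly. The sketch below assumes a parameter 0 < m < 1 and an intensity proportional to x^(-m-1), and uses the standard representation of the points as Gamma_i^(-1/m), where Gamma_i are the arrival times of a rate-one Poisson process. The truncation to finitely many points and the function names are choices made for illustration only.

```python
import random

def poisson_dirichlet_weights(m, n_points, rng):
    """Truncated sample of normalised Poisson point process weights, 0 < m < 1.

    Only the n_points largest points of the process are generated.
    """
    arrival = 0.0
    points = []
    for _ in range(n_points):
        arrival += rng.expovariate(1.0)       # next arrival time Gamma_i
        points.append(arrival ** (-1.0 / m))  # decreasing in i
    total = sum(points)
    return [p / total for p in points]

def mean_replica_coincidence(m, n_points, n_samples, rng):
    """Monte Carlo estimate of E[sum_i w_i^2]: the chance that two replicas pick the same atom."""
    acc = 0.0
    for _ in range(n_samples):
        weights = poisson_dirichlet_weights(m, n_points, rng)
        acc += sum(w * w for w in weights)
    return acc / n_samples
```

For the untruncated process the expected value of the sum of squared weights is known to be 1 - m, so the estimate gives a quick sanity check on the sampler.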
The free energy is the convex combination of two REMs at low temperature. is unchanged, but . Note
[TABLE]
for any . We remark that, since the overlaps take value in in this simple case, and (as ). Therefore the first system starts freezing at higher temperature: if the second system is frozen, then also the first one is so (as in a two-level GREM [10]). The whole picture is summarised as follows
[TABLE]
Therefore, although it is not built into the model, a GREM-like hierarchical structure naturally emerges. One way to visualise this in the general model defined by the Hamiltonian (1.1) is as follows. Recall that the , , , denote the configurations up to the -th level of the -th GREM. Then the phase space is naturally coarse-grained by the class of sets . We now think of each level as an atom and consider the power set
[TABLE]
According to [7, 8] a chain is defined to be an increasing (finite) sequence of sets in so that
[TABLE]
To each we associate two sequences and . The represent the relative sizes of the
[TABLE]
easily computed from the numbers and ; the are variances defined by
[TABLE]
From and we can build another sequence of critical inverse temperatures , . In general is not monotone, but we can conveniently confine our attention to those chains for which . We denote by the set of such chains.
To fix the ideas, let us consider again a bipartite REM with levels. The Hamiltonian reads as
[TABLE]
For a given of length , we set for
[TABLE]
so that we can decompose the Hamiltonian (2.3) according to
[TABLE]
and the partition function can be written as
[TABLE]
Now we see the following scenario. At small enough the annealed approximation holds and the overlaps are set to zero. Then increases, , and the configurations in freeze. In fact depends on configurations in , i.e. and all the other addenda in the r.h.s. of (2.4) become independent as . Thus the partition function asymptotically factorises
[TABLE]
as two independent REMs: the first one on the space of configurations is at low temperature, the second one on the remaining configuration space is at high temperature (with the right variance ). The free energy is a convex combination w.r.t. (i.e. the relative size of ) of these two REMs. As in the previous example, we have convergence of the marginalised Gibbs measure to a Poisson-Dirichlet distribution
[TABLE]
with a suitable normalisation. Since and remain independent for all we can iterate this procedure: for instance as also freezes and becomes asymptotically independent of ; thus the partition function factorises as
[TABLE]
These are three independent REMs on configurations , and , the associated free energy is given by a convex combination of the low-temperature free energies of the first two REMs and the high-temperature free energy of the third one and
[TABLE]
is a normalisation factor. Continuing in this way we recover the free energy and the Gibbs measure as a GREM-like structure along the chain. At zero temperature the free energy of the model is just the convex combination of those of REMs at low temperature, each defined on an element of the chain. This construction can be carried out for every chain in . Of course, for fixed , the more REMs are at low temperature, the higher the free energy. According to this criterion one can select the chain along which the free energy is maximal. By the above construction it should be clear that such a chain, here denoted by , is unique.
The results of [7, 8] (for the case of our interest) can be precisely formulated as follows. Here and denotes the GREM pressure computed along :
[TABLE]
Theorem (Bolthausen and Kistler).
The following holds:
- i)
- ii) Ultrametricity: there is a such that for each triad of configurations
[TABLE]
For a thorough discussion of point we refer to the original work, but it is worth mentioning that it follows from Theorem 3 in [8], which we report roughly: for and the overlap converges in distribution to the Bolthausen-Sznitman coalescent along the optimal chain (the ’s being associated to ); moreover the overlap and the Gibbs measure are asymptotically independent.
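The ultrametric inequality can be checked mechanically on a toy example. The sketch assumes the usual GREM convention that the overlap of two leaves of a tree is the fraction of levels they share starting from the root. The tree sizes and the equal-weight averaged overlap on a two-part product space are hypothetical choices, not the definitions used in this note.

```python
from itertools import product

def hierarchical_overlap(sigma, tau):
    """Fraction of levels shared from the root: common-prefix length / depth."""
    shared = 0
    for a, b in zip(sigma, tau):
        if a != b:
            break
        shared += 1
    return shared / len(sigma)

def averaged_overlap(pair_a, pair_b):
    """Equal-weight average of the tree overlaps of a two-part configuration (illustrative)."""
    return 0.5 * (hierarchical_overlap(pair_a[0], pair_b[0])
                  + hierarchical_overlap(pair_a[1], pair_b[1]))

def is_ultrametric(configs, overlap):
    """Check overlap(s1, s3) >= min(overlap(s1, s2), overlap(s2, s3)) for all triples."""
    for s1, s2, s3 in product(configs, repeat=3):
        if overlap(s1, s3) < min(overlap(s1, s2), overlap(s2, s3)):
            return False
    return True

# Hypothetical single tree: 3 levels with branching numbers 2, 3, 2.
tree_configs = list(product(range(2), range(3), range(2)))

# Hypothetical product of two one-level trees with two leaves each.
product_configs = [((a,), (b,)) for a in range(2) for b in range(2)]
```

On a single tree the check passes, while the averaged overlap on the full product space fails it. This is in line with the remark that the hierarchical structure is not built into the model, so ultrametricity is presumably a property of the configurations carrying Gibbs weight rather than of all of them.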
3. Proof
We are now ready to prove our statement. For simplicity we keep working mostly in the bipartite case. We fix the optimal chain once and for all. The sequences , and will always refer to .
A direct computation from (1.1) and (1.3) yields
[TABLE]
On the other hand the limiting free energy is a convex combination of REM ones along , thus its derivatives can be explicitly computed. We set
[TABLE]
Then
[TABLE]
[TABLE]
[TABLE]
As the two expressions for the derivatives must be equal (the exchange of the limit and the derivative can be justified for instance using Theorem 3 in [8] and the concavity of the free energy). Hence if
[TABLE]
Otherwise we have
[TABLE]
[TABLE]
and
[TABLE]
As is decreasing in , formulas (3.4) and (3.5), (3.6), (3.7) establish directly
[TABLE]
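The derivative-matching argument used above can be illustrated in the simplest setting of a single REM. The computation below is a sketch under an assumed normalisation, which may differ from the one of this note by constants and signs: $2^N$ configurations with energies $\sqrt{N}X_\sigma$, where the $X_\sigma$ are i.i.d. standard Gaussians, $G_N$ is the Gibbs measure, and $p_N$ is the pressure (convex in $\beta$).

```latex
\begin{align*}
p_N(\beta) &= \frac{1}{N}\,\mathbb{E}\log\sum_{\sigma} e^{\beta\sqrt{N}X_\sigma},\\
\partial_\beta p_N(\beta)
  &= \frac{1}{\sqrt{N}}\sum_{\sigma}\mathbb{E}\big[X_\sigma\,G_N(\sigma)\big]
   = \beta\Big(1-\mathbb{E}\sum_{\sigma}G_N(\sigma)^2\Big)
   = \beta\big(1-\mathbb{E}\langle\delta_{\sigma,\sigma'}\rangle\big)
   && \text{(Gaussian integration by parts)},\\
\lim_{N\to\infty}p_N(\beta)
  &= \begin{cases}
       \log 2+\beta^2/2, & \beta\le\beta_c,\\
       \beta\,\beta_c, & \beta\ge\beta_c,
     \end{cases}
  \qquad \beta_c=\sqrt{2\log 2},\\
\lim_{N\to\infty}\mathbb{E}\langle\delta_{\sigma,\sigma'}\rangle
  &= \begin{cases}
       0, & \beta<\beta_c,\\
       1-\beta_c/\beta, & \beta>\beta_c.
     \end{cases}
\end{align*}
```

The last line follows by equating the derivative of the limiting pressure with the limit of the second line, the exchange of limit and derivative being allowed by convexity away from $\beta_c$. The proof applies the same comparison level by level along the optimal chain.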
Let now . We have
[TABLE]
from which (1.4) is readily deduced for .
The generalisation to the multipartite case is immediate. We have for every
[TABLE]
whence, proceeding as above, , . Therefore we have mutual synchronisation for every couple , which is enough to obtain (1.4) for any . Moreover a simple computation from (3.9) also gives
[TABLE]
for any , thus for
[TABLE]
where we note . Then (1.5) easily follows.
Acknowledgements. This research was supported through the programme Research in Pairs by the Mathematisches Forschungsinstitut Oberwolfach in November 2016. G.G. is supported by the NCCR SwissMAP, D.T. is supported by GNFM-Indam. We thank E. Bolthausen for some valuable discussions and S. Franz for a useful correspondence on the paper [6].
References
- [1] D. Panchenko, The Free Energy in a Multi-Species Sherrington-Kirkpatrick Model, Ann. Prob. 43, 3494-3513 (2015).
- [2] A. Barra, P. Contucci, E. Mingione, D. Tantari, Multi-Species Mean Field Spin Glasses. Rigorous Results, Ann. H. Poincaré 16, 691-708 (2015).
- [3] M. Mezard, Mean-field message-passing equations in the Hopfield model and its generalizations, Phys. Rev. E 95, 022117 (2017).
- [4] A. Barra, G. Genovese, P. Sollich, D. Tantari, Phase transitions in Restricted Boltzmann Machines with generic priors, preprint arXiv:1612.03132 (2016).
- [5] F. Guerra, An Introduction to Mean Field Spin Glass Theory: Methods and Results, A. Bovier et al. eds, Les Houches, Session LXXXIII (2005).
- [6] S. Franz, G. Parisi, M. A. Virasoro, Ultrametricity in an Inhomogeneous Simplest Spin Glass Model, Europhys. Lett. 17, 5-9 (1992).
- [7] E. Bolthausen, N. Kistler, On a nonhierarchical version of the Generalized Random Energy Model, Ann. Appl. Prob. 16, 1-14 (2006).
- [8] E. Bolthausen, N. Kistler, On a nonhierarchical version of the Generalized Random Energy Model. II. Ultrametricity, Stoch. Proc. Appl. 119, 2357-2386 (2009).
