Convergence Time to Equilibrium of the Metropolis dynamics for the GREM

A. M. B. Nascimento; L. R. Fontes

arXiv:1907.00436·math.PR·January 8, 2020


TL;DR

This paper analyzes how quickly the Metropolis dynamics for the GREM reaches equilibrium by deriving bounds on the inverse of the spectral gap, using a Poincaré inequality, a canonical path technique, and convex analysis. The bounds hold for any number of hierarchical levels.

Contribution

It provides new bounds on the convergence time of Metropolis dynamics for GREM with arbitrary hierarchy levels using Poincaré inequalities and convex analysis.

Findings

1. Bounds on the inverse spectral gap are established.

2. The convergence time depends on the hierarchical structure of the GREM.

3. The methodology applies to general cases of the model.

Abstract

We study the convergence time to equilibrium of the Metropolis dynamics for the Generalized Random Energy Model with an arbitrary number of hierarchical levels, a finite and reversible continuous-time Markov process, in terms of the spectral gap of its transition probability matrix. This is done by deducing bounds to the inverse of the gap using a Poincaré inequality and a path technique. We also apply convex analysis tools to give the bounds in the most general case of the model.

Equations

$$N_{j}=\lfloor p_{j}N\rfloor,\quad 1\leq j\leq k-1,\quad\text{and}\quad N_{k}=N-\sum_{j=1}^{k-1}N_{j}.$$

$$\Sigma_{N}=\Sigma_{N_{1}}\times\cdots\times\Sigma_{N_{k}}$$

$$\mathscr{H}=\mathscr{H}_{N}=\left\{E^{(j)}_{\sigma_{1}\cdots\sigma_{j}}:\ \sigma_{j}\in\Sigma_{N_{j}},\ 1\leq j\leq k\right\}$$

$$\mathcal{H}(\sigma)=-\left\langle\mathfrak{a},E_{\sigma}\right\rangle=-\sum_{j=1}^{k}\sqrt{a_{j}}\,E^{(j)}_{\sigma_{1}\cdots\sigma_{j}},\quad\sigma\in\Sigma_{N},$$

$$\pi_{N}(\sigma)=\pi_{k,N,\beta}(\sigma)=\frac{1}{Z_{N}}\exp\left(-\beta\mathcal{H}(\sigma)\right),$$

$$F_{N}(\beta)=F_{k,N}(\beta)=-\frac{1}{N}\log Z_{k,N}(\beta)$$

$$F(\beta)\equiv\lim_{N\uparrow\infty}F_{N}(\beta)$$

$$\Psi_{k}=\left\{x\in{\mathbbm{R}}^{k}:\ \sum_{i=1}^{j}x_{i}^{2}\leq\beta_{*}^{2}P_{j},\ 1\leq j\leq k\right\},$$

$$P_{j}=\sum_{i=1}^{j}p_{i}\quad\text{and}\quad\beta_{*}=\sqrt{2\log 2}.$$

$$J_{l}^{*}=\min\left\{J>J_{l-1}^{*}:\ B(J_{l-1}^{*}+1,J)\leq B(J_{l-1}^{*}+1,j),\ \forall j\geq J_{l-1}^{*}+1\right\},$$

$$\beta_{l}=B(J_{l-1}^{*}+1,J_{l}^{*}),\quad 1\leq l\leq l_{k},$$

$$\mathfrak{w}_{j}=\begin{cases}\beta_{i}\sqrt{a_{j}},&\text{if }j\in\{J_{i-1}^{*}+1,\ldots,J_{i}^{*}\}\text{ for some }i=1,\ldots,l;\\ \beta\sqrt{a_{j}},&\text{if }j\in\{J_{l}^{*}+1,\ldots,k\}.\end{cases}$$

$$\mathfrak{m}^{*}\equiv\mathfrak{m}^{*}(\beta)=\beta\,\mathfrak{a}.$$

$$F(\beta)=\frac{1}{2}\left(\beta_{*}^{2}+\|\mathfrak{m}^{*}\|^{2}-\|\mathfrak{m}^{*}-\mathfrak{w}\|^{2}\right)=\beta\sum_{i=1}^{l}\beta_{i}\sum_{j=J_{i-1}^{*}+1}^{J_{i}^{*}}a_{j}+\frac{1}{2}\sum_{j=J_{l}^{*}+1}^{k}\left(\beta_{*}^{2}p_{j}+\beta^{2}a_{j}\right),$$

$$F(\beta)=\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle=\max_{x\in\Psi_{k}}\left\langle\mathfrak{m}^{*},x\right\rangle.$$

$$\text{P}(\sigma,\tau)=\begin{cases}\frac{1}{N}\exp\left(-\beta[\mathcal{H}(\tau)-\mathcal{H}(\sigma)]^{+}\right),&\text{if }\text{d}(\sigma,\tau)=1;\\ 1-\sum_{\eta\neq\sigma}\text{P}(\sigma,\eta),&\text{if }\sigma=\tau;\\ 0,&\text{otherwise.}\end{cases}$$

$$\lim_{N\uparrow\infty}-\frac{1}{N}\log\lambda_{N}^{\text{\tiny REM}}=\beta_{*}\beta\quad{\mathbbm{P}}\mbox{-a.s.}$$

$$\lambda_{N}\equiv\lambda_{N}(\beta)=1-\mu_{N,1}$$

$$\limsup_{N\uparrow\infty}-\frac{1}{N}\log\lambda_{N}\leq\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle\quad{\mathbbm{P}}\mbox{-a.s.}$$

$$4\,\|\text{P}_{t}(\sigma,\cdot)-\pi_{N}(\cdot)\|_{\text{var}}^{2}\leq\frac{1-\pi_{N}(\sigma)}{\pi_{N}(\sigma)}\exp\left(-2\lambda_{N}t\right),$$

$$\lim_{N\uparrow\infty}\max_{\sigma}\|\text{P}_{e^{Nt}}(\sigma,\cdot)-\pi_{N}(\cdot)\|_{\text{var}}=0,\quad{\mathbbm{P}}\mbox{-a.s.}$$

$$\frac{1}{\lambda_{N}}\leq\varrho(\Gamma_{N})=\max_{e=(\sigma,\tau)}\left\{\frac{\bar{\ell}}{\pi_{N}(\sigma)\text{P}(\sigma,\tau)}\sum_{\gamma_{\eta\upsilon}\ni e}\pi_{N}(\eta)\pi_{N}(\upsilon)\right\}$$

$$\varrho(\Gamma_{N})=\frac{\bar{\ell}N}{Z_{N}}\max_{e=(\sigma,\tau)}\left\{\exp\left(\beta[\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)]\right)\sum_{\gamma_{\eta\upsilon}\ni e}\exp\left(-\beta[\mathcal{H}(\eta)+\mathcal{H}(\upsilon)]\right)\right\}.$$

$$\exp\left(\beta\mathcal{H}(\sigma)\right)\sum_{\gamma_{\sigma\upsilon}\ni e}\exp\left(-\beta[\mathcal{H}(\sigma)+\mathcal{H}(\upsilon)]\right)=\sum_{\upsilon\neq\sigma}\exp\left(-\beta\mathcal{H}(\upsilon)\right)\leq Z_{N}.$$

$$\mathcal{H}(\sigma)\leq\kappa N;$$

$$\Gamma(\eta,\upsilon)=\left\{\gamma_{\eta\upsilon}^{i}:\ i=1,2,\ldots,N\right\},$$

$$\Gamma^{i}=\left\{\gamma_{\eta\upsilon}^{i}:\ \eta,\upsilon\in\Sigma_{N}\right\},\quad i=1,2,\ldots,N.$$

$$\varrho(\Gamma_{N})\leq\frac{N^{2}}{Z_{N}}\max_{e=(\sigma,\tau)}\left\{\exp\left(\beta[\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)]\right)\sum_{\gamma_{\eta\upsilon}\ni e}\exp\left(-\beta[\mathcal{H}(\eta)+\mathcal{H}(\upsilon)]\right)\right\}.$$

$$\Sigma_{N}^{\eta,\upsilon}=\left\{\sigma\in\Sigma_{N}:\ \sigma_{1}|_{\mathcal{D}_{1}^{\eta,\upsilon}}=\eta_{1}|_{\mathcal{D}_{1}^{\eta,\upsilon}},\ \text{d}_{1}(\sigma_{1},\eta_{1})=\lceil\epsilon N_{1}\rceil\ \text{and}\ \sigma_{j}=\upsilon_{j},\ j=2,\ldots,k\right\}$$

$${\mathbbm{P}}\left(\bigcup_{(\eta,\upsilon)}\ \bigcap_{\sigma\in\Sigma_{N}^{\eta,\upsilon}}\left\{\gamma_{\sigma\sigma^{\prime}}\text{ is bad}\right\}\right)\leq\sum_{N\geq N^{\prime}}4^{N}e^{-c_{\kappa}N(2\epsilon)^{-\epsilon N_{1}}}<\infty.$$


Full text


A. M. B. Nascimento (partially supported by CNPq grant 140762/2016-7)

L. R. Fontes (partially supported by CNPq grant 311257/2014-3 and FAPESP grant 2017/10555-0)

Instituto de Matemática e Estatística, Universidade de São Paulo, Rua do Matão 1010, Cidade Universitária, 05508-090 São Paulo SP, Brasil. Emails: amarcos, [email protected]


AMS 2010 Mathematics Subject Classification. 60K35, 82B44, 82C44, 82D30

Key words and phrases. spin glasses, GREM, Metropolis dynamics, convergence to equilibrium, spectral gap, Poincaré inequality

1 Introduction and Main Result

The Generalized Random Energy Model (GREM) is a mean field model for a spin glass in equilibrium, introduced in [6]. Let us describe it. Consider a system with configuration space being $\Sigma_{N}=\{-1,+1\}^{N}$ , the discrete hypercube in $N$ dimensions, equipped with the following hierarchical structure in levels. Fix a number $k\in{\mathbbm{N}}$ , such that $k\leq N$ , to indicate the number of levels. Let $\{p_{j}\}_{j=1}^{k}$ be a sequence of positive real numbers such that $\sum_{j=1}^{k}p_{j}=1$ and consider the following partition of the number $N$ into $k$ integers:

$$N_{j}=\lfloor p_{j}N\rfloor,\quad 1\leq j\leq k-1,\quad\text{and}\quad N_{k}=N-\sum_{j=1}^{k-1}N_{j}.$$

With this notation, we represent $\Sigma_{N}$ as the product

$$\Sigma_{N}=\Sigma_{N_{1}}\times\cdots\times\Sigma_{N_{k}}$$

so that a spin configuration $\sigma\in\Sigma_{N}$ is labeled as $\sigma=(\sigma_{1},\ldots,\sigma_{k})$ where $\sigma_{j}\in\Sigma_{N_{j}}=\{-1,+1\}^{N_{j}}$ stands for the $j$ -th level of $\sigma$ . We denote by $\sigma^{i}$ and $\sigma_{j}^{i}$ generic spin coordinates of $\sigma$ and $\sigma_{j}$ respectively.
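As an illustration of the level structure just described, the following minimal Python sketch computes the level sizes $N_{1},\ldots,N_{k}$ from the weights $p_{j}$ and splits a flat spin configuration into its levels. The function names `level_sizes` and `split_levels`, and the values of $p$ and $N$, are hypothetical and chosen only for this example.

```python
import math

def level_sizes(p, N):
    """Return [N_1, ..., N_k]: N_j = floor(p_j * N) for j < k, remainder for j = k."""
    sizes = [math.floor(pj * N) for pj in p[:-1]]
    sizes.append(N - sum(sizes))
    return sizes

def split_levels(sigma, sizes):
    """Split a flat spin tuple sigma into its level blocks (sigma_1, ..., sigma_k)."""
    blocks, start = [], 0
    for n in sizes:
        blocks.append(tuple(sigma[start:start + n]))
        start += n
    return blocks

# Hypothetical example: k = 3 levels on N = 10 spins.
sizes = level_sizes([0.25, 0.25, 0.5], 10)
blocks = split_levels((1, -1, 1, 1, -1, -1, 1, 1, 1, -1), sizes)
```

Because the last level absorbs the rounding remainder, the sizes always add up to $N$.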

Now, we will define GREM’s Hamiltonian on $\Sigma_{N}$ . Let

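$$\mathscr{H}=\mathscr{H}_{N}=\left\{E^{(j)}_{\sigma_{1}\cdots\sigma_{j}}:\ \sigma_{j}\in\Sigma_{N_{j}},\ 1\leq j\leq k\right\}$$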

be a family of independent (vectors of independent) Gaussian random variables of mean 0 and variance $N$ . We may view $\mathscr{H}$ as a random environment for the spin model to be defined next. Let $\{a_{j}\}_{j=1}^{k}$ be a collection of strictly positive real numbers such that $\sum_{j=1}^{k}a_{j}=1$ , and denote by $\mathfrak{a}$ the vector $\mathfrak{a}=(\sqrt{a_{j}}:1\leq j\leq k)$ . The GREM Hamiltonian on $\Sigma_{N}$ is then defined by

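$$\mathcal{H}(\sigma)=-\left\langle\mathfrak{a},E_{\sigma}\right\rangle=-\sum_{j=1}^{k}\sqrt{a_{j}}\,E^{(j)}_{\sigma_{1}\cdots\sigma_{j}},\quad\sigma\in\Sigma_{N},\tag{1.3}$$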

where for each $\sigma\in\Sigma_{N}$ , we denote by $E_{\sigma}$ the vector $E_{\sigma}=(E^{(j)}_{\sigma_{1}\cdots\sigma_{j}}:\,1\leq j\leq k)$ , and $\left\langle\cdot,\cdot\right\rangle$ is the usual inner product on ${\mathbbm{R}}^{k}$ . Then $\mathcal{H}=\{\mathcal{H}(\sigma),\,\sigma\in\Sigma_{N}\}$ is a family of Gaussian random variables with marginal mean zero and variance $N$ , and we remark that $\mathcal{H}(\sigma)$ and $\mathcal{H}(\tau)$ are independent if and only if $\sigma,\tau\in\Sigma_{N}$ differ on the first level, i.e., if and only if $\sigma_{1}\neq\tau_{1}$ .
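The construction of the Hamiltonian can be made concrete for a very small system. The sketch below, which uses only the Python standard library, samples the random environment and evaluates $\mathcal{H}(\sigma)$ for every configuration. The function name `sample_grem`, the dictionary layout and the seed are assumptions made for this illustration; enumerating all $2^{N}$ configurations is only feasible for small $N$.

```python
import itertools
import math
import random

def sample_grem(sizes, a, seed=0):
    """Sample H(sigma) = -sum_j sqrt(a_j) * E^{(j)}_{sigma_1...sigma_j} for every sigma.

    sizes = [N_1, ..., N_k]; a = [a_1, ..., a_k] with sum 1.
    Each E^{(j)} is an independent centred Gaussian with variance N.
    """
    rng = random.Random(seed)
    N = sum(sizes)
    env = {}   # random environment: flattened prefix (sigma_1..sigma_j) -> E^{(j)}
    H = {}
    for sigma in itertools.product((-1, 1), repeat=N):
        total, end = 0.0, 0
        for n, aj in zip(sizes, a):
            end += n
            prefix = sigma[:end]          # the first j levels of sigma
            if prefix not in env:
                env[prefix] = rng.gauss(0.0, math.sqrt(N))
            total += math.sqrt(aj) * env[prefix]
        H[sigma] = -total
    return H, env
```

Two configurations that share their first level reuse the same first-level variable, which is the source of the correlation structure noted above; configurations with different first levels share no variable.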

We denote by $\pi_{N}$ the Gibbs measure at inverse temperature $\beta>0$ associated to the GREM Hamiltonian $\mathcal{H}$ that assigns to each $\sigma\in\Sigma_{N}$ the mass

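$$\pi_{N}(\sigma)=\pi_{k,N,\beta}(\sigma)=\frac{1}{Z_{N}}\exp\left(-\beta\mathcal{H}(\sigma)\right),\tag{1.4}$$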

where $Z_{N}\equiv Z_{k,N}(\beta)$ denotes the usual normalizing factor. As usual, the function

$$F_{N}(\beta)=F_{k,N}(\beta)=-\frac{1}{N}\log Z_{k,N}(\beta)\tag{1.5}$$

indicates the finite volume free energy. Notice that all those quantities are random variables on $(\Omega,\mathscr{F},{\mathbbm{P}})$ .
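For a finite table of energies, the Gibbs weights and the finite volume free energy can be evaluated directly. The following sketch assumes the energies are supplied as a dictionary mapping spin tuples to values of $\mathcal{H}$; the function name `gibbs` and the log-sum-exp shift are choices made here for numerical stability, not part of the paper.

```python
import math

def gibbs(H, beta):
    """Return (pi, log_Z, F_N) for energies H: dict sigma -> H(sigma), sigma of length N."""
    N = len(next(iter(H)))
    # Shift exponents by their maximum so that exp() cannot overflow.
    shift = max(-beta * h for h in H.values())
    weights = {s: math.exp(-beta * h - shift) for s, h in H.items()}
    total = sum(weights.values())
    pi = {s: w / total for s, w in weights.items()}
    log_Z = shift + math.log(total)
    F_N = -log_Z / N   # sign convention as in the displayed definition of F_N
    return pi, log_Z, F_N
```

For example, with two configurations of equal energy the measure is uniform and $Z_{N}=2$ at $\beta=1$.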

Existence of the Free Energy.

An important equilibrium feature of the GREM that will be needed here is the existence of the free energy: for all $\beta>0$ the limit

$$F(\beta)\equiv\lim_{N\uparrow\infty}F_{N}(\beta)\tag{1.6}$$

exists ${\mathbbm{P}}$ -almost surely and coincides with $\lim_{N\uparrow\infty}{\mathbbm{E}}(F_{N}(\beta))$ — see [4], Theorem 2.1. Notice that $F(\beta)$ is a nonrandom function.

For the sake of completeness, we recall here the explicit formula of $F(\beta)$ . To get to that, we start by considering the $k$ -dimensional Euclidean space equipped with the norm $\|\cdot\|^{2}=\left\langle\cdot,\cdot\right\rangle$ . Let us denote by $\Psi_{k}$ the following subset of ${\mathbbm{R}}^{k}$ ,

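$$\Psi_{k}=\left\{x\in{\mathbbm{R}}^{k}:\ \sum_{i=1}^{j}x_{i}^{2}\leq\beta_{*}^{2}P_{j},\ 1\leq j\leq k\right\},$$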

where

$$P_{j}=\sum_{i=1}^{j}p_{i}\quad\text{and}\quad\beta_{*}=\sqrt{2\log 2}.$$

Now, set $J_{0}^{*}=0$ and recursively, define

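$$J_{l}^{*}=\min\left\{J>J_{l-1}^{*}:\ B(J_{l-1}^{*}+1,J)\leq B(J_{l-1}^{*}+1,j),\ \forall j\geq J_{l-1}^{*}+1\right\},$$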

where $B(i,j)=\beta_{\ast}\sqrt{\frac{p_{i}+\cdots+p_{j}}{a_{i}+\cdots+a_{j}}}$ for $1\leq i\leq j\leq k$ . Let $l_{k}\in\{1,\ldots,k\}$ be such that $J_{l_{k}}^{*}=k$ . Consider now the collection $(\beta_{l})_{l=0}^{l_{k}+1}$ , where

$$\beta_{l}=B(J_{l-1}^{*}+1,J_{l}^{*}),\quad 1\leq l\leq l_{k},$$

and $\beta_{0}=0$ and $\beta_{l_{k}+1}=\infty$ . From the definition of $(J_{l}^{*})_{l=1}^{l_{k}}$ , it is clear that $(\beta_{l})_{l=1}^{l_{k}}$ is strictly increasing in $l$ . Suppose $\beta\in[\beta_{l},\beta_{l+1})$ for some $0\leq l\leq l_{k}$ , and let $\mathfrak{w}\equiv\mathfrak{w}(\beta)\in\Psi_{k}$ be such that

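$$\mathfrak{w}_{j}=\begin{cases}\beta_{i}\sqrt{a_{j}},&\text{if }j\in\{J_{i-1}^{*}+1,\ldots,J_{i}^{*}\}\text{ for some }i=1,\ldots,l;\\ \beta\sqrt{a_{j}},&\text{if }j\in\{J_{l}^{*}+1,\ldots,k\}.\end{cases}$$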

With this terminology, $\mathfrak{w}$ is the point of $\Psi_{k}$ at minimal distance from

$$\mathfrak{m}^{*}\equiv\mathfrak{m}^{*}(\beta)=\beta\,\mathfrak{a}.$$

We finally have, for all $\beta>0$ , that

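$$F(\beta)=\frac{1}{2}\left(\beta_{*}^{2}+\|\mathfrak{m}^{*}\|^{2}-\|\mathfrak{m}^{*}-\mathfrak{w}\|^{2}\right)=\beta\sum_{i=1}^{l}\beta_{i}\sum_{j=J_{i-1}^{*}+1}^{J_{i}^{*}}a_{j}+\frac{1}{2}\sum_{j=J_{l}^{*}+1}^{k}\left(\beta_{*}^{2}p_{j}+\beta^{2}a_{j}\right),$$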

if $\beta_{l}\leq\beta<\beta_{l+1}$ for some $l=0,\ldots,l_{k}$ (see [4]). We remark that this function is once, but not twice, continuously differentiable with respect to $\beta$ . From a physical point of view, this means that there exist (possibly multiple) third-order phase transitions for the GREM. Let us also point out that for $\beta\geq\beta_{l_{k}}$ there exists a unique point $\mathfrak{w}^{*}\in\Psi_{k}$ , independent of $\beta$ , such that $\mathfrak{w}=\mathfrak{w}^{*}$ and

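$$F(\beta)=\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle=\max_{x\in\Psi_{k}}\left\langle\mathfrak{m}^{*},x\right\rangle.$$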

The latter identity is shown in the Appendix, Lemma A.1.
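The recursive definition of the critical levels and the resulting formula for $F(\beta)$ can be traced numerically. The sketch below is a minimal illustration under two assumptions: that $\beta_{*}=\sqrt{2\log 2}$ and $\mathfrak{w}_{j}$ carries the factor $\sqrt{a_{j}}$ (so that $\mathfrak{w}\in\Psi_{k}$), and that the sequence $(\beta_{l})$ is increasing, as stated above. The names `B`, `critical_levels`, `free_energy` and the tolerance `tol` are hypothetical.

```python
import math

BETA_STAR = math.sqrt(2 * math.log(2))

def B(i, j, p, a):
    """B(i, j) = beta_* sqrt((p_i+...+p_j)/(a_i+...+a_j)), 1-based inclusive indices."""
    return BETA_STAR * math.sqrt(sum(p[i - 1:j]) / sum(a[i - 1:j]))

def critical_levels(p, a, tol=1e-12):
    """Return ([J_0*, ..., J_{l_k}*], [beta_1, ..., beta_{l_k}])."""
    k, J, betas = len(p), [0], []
    while J[-1] < k:
        start = J[-1] + 1
        vals = {j: B(start, j, p, a) for j in range(start, k + 1)}
        smallest = min(vals.values())
        # Smallest index attaining the minimum (ties resolved up to tol).
        J_next = min(j for j, v in vals.items() if v <= smallest + tol)
        J.append(J_next)
        betas.append(vals[J_next])
    return J, betas

def free_energy(beta, p, a):
    """Return (F, m, w) with F = (beta_*^2 + |m|^2 - |m - w|^2) / 2."""
    J, betas = critical_levels(p, a)
    l = sum(1 for b in betas if b <= beta)      # beta lies in [beta_l, beta_{l+1})
    m, w = [], []
    for j in range(1, len(p) + 1):
        i = next(n for n in range(1, len(J)) if j <= J[n])   # block containing j
        root = math.sqrt(a[j - 1])
        m.append(beta * root)
        w.append((betas[i - 1] if i <= l else beta) * root)
    F = 0.5 * (BETA_STAR ** 2 + sum(x * x for x in m)
               - sum((x - y) ** 2 for x, y in zip(m, w)))
    return F, m, w
```

For $k=1$ the sketch returns $\log 2+\beta^{2}/2$ when $\beta<\beta_{*}$ and $\beta\beta_{*}$ when $\beta\geq\beta_{*}$, and for $\beta\geq\beta_{l_{k}}$ the value coincides with $\left\langle\mathfrak{m}^{*},\mathfrak{w}\right\rangle$.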

Dynamics.

Here, we consider a dynamics for the GREM, that is, we construct a continuous time Markov chain with state space $\Sigma_{N}$ , for which the Gibbs measure $\pi_{N}$ is invariant; indeed, the chain and the GREM are in detailed balance. Specifically, we consider the Metropolis dynamics, defined as the continuous-time Markov process $\{\omega_{N}(t):t\geq 0\}$ , taking values in $\Sigma_{N}$ and having transition probability matrix P with entries given by

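$$\text{P}(\sigma,\tau)=\begin{cases}\frac{1}{N}\exp\left(-\beta[\mathcal{H}(\tau)-\mathcal{H}(\sigma)]^{+}\right),&\text{if }\text{d}(\sigma,\tau)=1;\\ 1-\sum_{\eta\neq\sigma}\text{P}(\sigma,\eta),&\text{if }\sigma=\tau;\\ 0,&\text{otherwise,}\end{cases}\tag{1.14}$$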

where $\mathcal{H}$ is the GREM Hamiltonian defined in (1.3); $\beta>0$ is the inverse temperature parameter; $\text{d}(\cdot,\cdot)$ denotes the usual Hamming distance on $\Sigma_{N}$ and $x^{+}=x\vee 0$ , $x\in{\mathbbm{R}}$ . This process is reversible, and therefore both stationary and ergodic, with respect to the Gibbs measure $\pi_{N}$ .
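The Metropolis rule and its reversibility can be checked on a toy example. The sketch below builds the one-step matrix for an arbitrary energy table and prepares the Gibbs weights needed to test detailed balance, $\pi_{N}(\sigma)\text{P}(\sigma,\tau)=\pi_{N}(\tau)\text{P}(\tau,\sigma)$. The function name `metropolis_matrix` and the toy energies are hypothetical; the energies are deterministic placeholders, not samples from the GREM.

```python
import itertools
import math

def metropolis_matrix(H, beta):
    """Build P(sigma, tau) as a dict of dicts for energies H: dict sigma -> H(sigma)."""
    N = len(next(iter(H)))
    P = {}
    for sigma in H:
        row = {}
        for i in range(N):
            tau = sigma[:i] + (-sigma[i],) + sigma[i + 1:]   # Hamming distance 1
            row[tau] = math.exp(-beta * max(H[tau] - H[sigma], 0.0)) / N
        row[sigma] = 1.0 - sum(row.values())                 # holding probability
        P[sigma] = row
    return P

# Hypothetical toy energies on {-1,+1}^3 (not sampled from the GREM).
configs = list(itertools.product((-1, 1), repeat=3))
H = {s: 0.7 * idx - 1.0 for idx, s in enumerate(configs)}
beta = 1.3
P = metropolis_matrix(H, beta)
Z = sum(math.exp(-beta * h) for h in H.values())
pi = {s: math.exp(-beta * H[s]) / Z for s in H}
```

Detailed balance holds because $\pi_{N}(\sigma)\text{P}(\sigma,\tau)$ is proportional to $\exp(-\beta[\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)])$, which is symmetric in $\sigma$ and $\tau$.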

Before discussing our results, let us recall the related results derived for the REM under Metropolis (which corresponds to the GREM with $k=1$ ).

The following result is implied by Theorem 1 in [11]. Let $\lambda_{N}^{\text{\tiny REM}}$ be the spectral gap of the generator of the dynamics (or, equivalently, of the one-step transition probability matrix). Then for all $\beta>0$ we have that

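$$\lim_{N\uparrow\infty}-\frac{1}{N}\log\lambda_{N}^{\text{\tiny REM}}=\beta_{*}\beta\quad{\mathbbm{P}}\mbox{-a.s.}\tag{1.15}$$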

Indeed Theorem 1 in [11] provides estimates for the errors of approximation that hold a.s. for all large enough $N$ , but we will not be concerned with those here.

In this paper we will derive upper bounds for the analogue in our dynamics of the quantity whose limit is taken in (1.15). These, as is well known, provide upper bounds for the time to reach equilibrium under the dynamics. Let us describe the relevant quantities more precisely.

Let $1=\mu_{N,0}>\mu_{N,1}\geq\cdots\geq\mu_{N,2^{N}-1}>-1$ denote the $2^{N}$ eigenvalues of the one-step transition probability matrix P whose entries are defined in (1.14); since P is reversible with respect to $\pi_{N}$ , these eigenvalues are real, and we have that

$$\lambda_{N}\equiv\lambda_{N}(\beta)=1-\mu_{N,1}\tag{1.16}$$

is its spectral gap. Notice that, in the case of the REM, $\lambda_{N}=\lambda_{N}^{\text{\tiny REM}}$ . The main result of this paper is the following.

Theorem 1.

For all $\beta>0$ ,

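$$\limsup_{N\uparrow\infty}-\frac{1}{N}\log\lambda_{N}\leq\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle\quad{\mathbbm{P}}\mbox{-a.s.}\tag{1.17}$$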

Some remarks follow:

1. First of all, notice that the bound in the right-hand side of (1.17), viewed as a function of $\beta$ , is the function that describes the free energy of the GREM for $\beta\geq\beta_{l_{k}}$ . As expected, we get Proposition 4.2 in [11] as a corollary of Theorem 1 by taking $k=1$ . We also remark that Theorem 1 holds for *all* $\beta>0$ , for *all* $k\in{\mathbbm{N}}$ and for any choice of parameters $\{a_{j}\}_{j=1}^{k}$ and $\{p_{j}\}_{j=1}^{k}$ satisfying $0<a_{j},p_{j}<1$ and $\sum_{j=1}^{k}a_{j}=\sum_{j=1}^{k}p_{j}=1$ .

2. In view of Theorem 1, using the following well known bound (see [7] for a derivation): for all $\sigma\in\Sigma_{N}$ and $t>0$ ,

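$$4\,\|\text{P}_{t}(\sigma,\cdot)-\pi_{N}(\cdot)\|_{\text{var}}^{2}\leq\frac{1-\pi_{N}(\sigma)}{\pi_{N}(\sigma)}\exp\left(-2\lambda_{N}t\right),$$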

together with (1.6) and Theorem 1.5(iii) of [2], one deduces that for any $t>\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle$ ,

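$$\lim_{N\uparrow\infty}\max_{\sigma}\|\text{P}_{e^{Nt}}(\sigma,\cdot)-\pi_{N}(\cdot)\|_{\text{var}}=0,\quad{\mathbbm{P}}\mbox{-a.s.}$$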

Here $\text{P}_{t}(\sigma,\tau)=e^{-t}\sum_{n=0}^{\infty}(t^{n}/n!)\text{P}^{n}(\sigma,\tau)$ is the transition kernel of the dynamics.

3. There is reason to believe that the bound (1.17) is not sharp, based on the results of [10], where large volume limits for a hierarchical, simplified version of the present dynamics are derived for the 2 level GREM at low temperature (in the cascading phase), with time properly scaled. The limit dynamics are ergodic processes, and have the (infinite volume) Gibbs measure as equilibrium measure. The time scalings for those results are always below what is implied by (1.17), and this would indicate that the latter bound is not sharp (at least at low temperatures).

On the other hand, under the dynamics of [10], it may be proved that (1.17) is the best bound one gets (to leading order) by using the Poincaré inequality employed in the present work (at all temperatures).

4. A direct analysis of the Metropolis dynamics for the GREM at time scales where one would expect to see an ergodic large volume limiting dynamics, as has been done in [10] for a simpler dynamics, has not been undertaken yet; even for the $k=1$ case of the REM, this has been done only at smaller time scales, where aging takes place instead (see [5, 9]) and, indeed, spectral gap estimations are important elements in the derivations.

See also [1] for applications of spectral gap estimation on the study of a class of dynamics for a large family of mean field spin glasses.
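The total variation bound quoted in the second remark can be turned into a simple estimate of the time needed to get within a given distance of equilibrium. The sketch below only rearranges that inequality: taking square roots gives $\|\text{P}_{t}(\sigma,\cdot)-\pi_{N}\|_{\text{var}}\leq\frac{1}{2}\sqrt{(1-\pi_{N}(\sigma))/\pi_{N}(\sigma)}\,e^{-\lambda_{N}t}$. The function names and the numerical inputs are hypothetical.

```python
import math

def tv_bound(pi_sigma, gap, t):
    """Upper bound on ||P_t(sigma, .) - pi_N||_var implied by the quoted inequality."""
    return 0.5 * math.sqrt((1.0 - pi_sigma) / pi_sigma) * math.exp(-gap * t)

def time_to_reach(eps, pi_sigma, gap):
    """Smallest t >= 0 at which tv_bound(pi_sigma, gap, t) <= eps."""
    t = (math.log(0.5 / eps) + 0.5 * math.log((1.0 - pi_sigma) / pi_sigma)) / gap
    return max(t, 0.0)

# Hypothetical values: starting mass 1e-6, spectral gap 0.1, target distance 0.01.
t_needed = time_to_reach(0.01, 1e-6, 0.1)
```

The required time scales like $1/\lambda_{N}$ up to a logarithmic factor in $1/\pi_{N}(\sigma)$, which is why an exponential bound on the inverse gap controls the time scale $e^{Nt}$ appearing in the second remark.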

The rest of the paper is devoted to proving Theorem 1. In Section 2, we develop our bound to the inverse of the spectral gap, in terms of the canonical path approach by Jerrum and Sinclair. This leads to the statement of two propositions which immediately lead to the proof of Theorem 1. The proof of the first of the propositions is done in Section 3, in several steps which take most of the remainder of the paper. Section 4 contains the similar, briefly presented proof of the second proposition, and an appendix is devoted to supporting results.

2 Proof of Theorem 1 – Canonical set of paths

As mentioned above, the proof of Theorem 1 relies on a Poincaré inequality derived in [13]. To write this inequality in our context, the first step is to identify the Markovian process $\omega_{N}(t)$ with an undirected graph with vertex set $\Sigma_{N}$ . Naturally, we identify it with the $N$ -dimensional hierarchical hypercube graph which we will also denote, with a slight abuse of notation, by $\Sigma_{N}$ . Let us denote by $\mathcal{E}_{N}=\{(\sigma,\tau)\in\Sigma_{N}^{2}:\text{d}(\sigma,\tau)=1\}$ the edge set of $\Sigma_{N}$ . Now, let $\Gamma_{N}=\{\gamma_{\eta\upsilon}:\eta,\upsilon\in\Sigma_{N}\}$ be a complete set of self-avoiding canonical paths on $\Sigma_{N}$ , that is, for each $\eta,\upsilon\in\Sigma_{N}$ , there exists exactly one path $\gamma_{\eta\upsilon}$ in $\Gamma_{N}$ connecting $\eta$ and $\upsilon$ using only valid transitions of the Markov chain $\omega_{N}(t)$ , that is, only through edges of $\mathcal{E}_{N}$ . Denote by $\bar{\ell}=\bar{\ell}(\Gamma_{N})$ the maximum length of paths (i.e. number of edges) in $\Gamma_{N}$ . Then, from Theorem 5 in [13] we have

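$$\frac{1}{\lambda_{N}}\leq\varrho(\Gamma_{N})=\max_{e=(\sigma,\tau)}\left\{\frac{\bar{\ell}}{\pi_{N}(\sigma)\text{P}(\sigma,\tau)}\sum_{\gamma_{\eta\upsilon}\ni e}\pi_{N}(\eta)\pi_{N}(\upsilon)\right\}\tag{2.1}$$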

where the maximum is over all edges $e=(\sigma,\tau)\in\mathcal{E}_{N}$ and the summation is over all pairs $(\eta,\upsilon)$ such that there exists a path $\gamma_{\eta\upsilon}$ in $\Gamma_{N}$ that contains edge $e$ . The expression $\varrho(\Gamma_{N})$ is called the congestion associated with the set of paths $\Gamma_{N}$ . Recall (1.4) and (1.14). Using them, it is easy to check that

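$$\varrho(\Gamma_{N})=\frac{\bar{\ell}N}{Z_{N}}\max_{e=(\sigma,\tau)}\left\{\exp\left(\beta[\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)]\right)\sum_{\gamma_{\eta\upsilon}\ni e}\exp\left(-\beta[\mathcal{H}(\eta)+\mathcal{H}(\upsilon)]\right)\right\}.\tag{2.2}$$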

Notice that to apply inequality (2.1) efficiently we now need to construct a suitable set of paths $\Gamma_{N}$ that allows us to get a good upper bound to $\varrho(\Gamma_{N})$ . By “good”, we mean that in the limit, in the very spirit of (1.17), such a bound coincides ${\mathbbm{P}}$ -almost surely with $\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle$ .
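The passage from the congestion written with $\pi_{N}$ and P to its form written with energies can be verified numerically on a toy example. The sketch below is an illustration under stated assumptions, not the random path set constructed in this section: it uses the simple paths that flip disagreeing coordinates from left to right, treats edges as directed, sums over ordered pairs $(\eta,\upsilon)$, and uses deterministic placeholder energies. All function names are hypothetical.

```python
import itertools
import math

def flip(s, i):
    """Flip coordinate i (0-based) of the spin tuple s."""
    return s[:i] + (-s[i],) + s[i + 1:]

def path_edges(eta, ups):
    """Directed edges of the path from eta to ups flipping disagreements left to right."""
    edges, cur = [], eta
    for i in range(len(eta)):
        if cur[i] != ups[i]:
            nxt = flip(cur, i)
            edges.append((cur, nxt))
            cur = nxt
    return edges

def congestion(H, beta):
    """Return the congestion computed via pi and P, and via the energy form."""
    N = len(next(iter(H)))
    Z = sum(math.exp(-beta * h) for h in H.values())
    pi = {s: math.exp(-beta * h) / Z for s, h in H.items()}
    load_pi, load_en, max_len = {}, {}, 0
    for eta, ups in itertools.permutations(H, 2):     # ordered pairs, eta != ups
        edges = path_edges(eta, ups)
        max_len = max(max_len, len(edges))
        for e in edges:
            load_pi[e] = load_pi.get(e, 0.0) + pi[eta] * pi[ups]
            load_en[e] = load_en.get(e, 0.0) + math.exp(-beta * (H[eta] + H[ups]))

    def P(s, t):
        return math.exp(-beta * max(H[t] - H[s], 0.0)) / N

    rho_pi = max(max_len * load / (pi[s] * P(s, t))
                 for (s, t), load in load_pi.items())
    rho_en = (max_len * N / Z) * max(
        math.exp(beta * max(H[s], H[t])) * load for (s, t), load in load_en.items())
    return rho_pi, rho_en

# Hypothetical toy energies on {-1,+1}^3.
configs = list(itertools.product((-1, 1), repeat=3))
H = {s: 0.7 * idx - 1.0 for idx, s in enumerate(configs)}
rho_pi, rho_en = congestion(H, 1.3)
```

The two values agree because $1/(\pi_{N}(\sigma)\text{P}(\sigma,\tau))=Z_{N}N\exp(\beta[\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)])$ and $\pi_{N}(\eta)\pi_{N}(\upsilon)=Z_{N}^{-2}\exp(-\beta[\mathcal{H}(\eta)+\mathcal{H}(\upsilon)])$.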

When one tries to obtain a spectral gap estimate for the Metropolis dynamics of spin glass models using the canonical path technique, one of the first concerns is with edges $e=(\sigma,\tau)\in\mathcal{E}_{N}$ where $\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)$ is large. A natural attempt to control these bad edges is to avoid them as much as possible in the trajectories. The completeness of $\Gamma_{N}$ implies that they cannot be avoided as extreme edges of paths, but we may try to avoid them in the interior of paths; as we will see below, we succeed in doing that with high probability, with a set of paths that is amenable enough to subsequent analysis. This approach was already used in [11]. Observe that with such a set of paths, if $e=(\sigma,\tau)\in\gamma_{\eta\upsilon}$ is a bad edge, then we have that either $\sigma=\eta$ and $\tau$ has the lowest energy, or $\sigma$ has the lowest energy and $\tau=\upsilon$ . Considering the first case (the other one follows by symmetry), the term inside of the $\max$ sign in (2.2) can be estimated by

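$$\exp\left(\beta\mathcal{H}(\sigma)\right)\sum_{\gamma_{\sigma\upsilon}\ni e}\exp\left(-\beta[\mathcal{H}(\sigma)+\mathcal{H}(\upsilon)]\right)=\sum_{\upsilon\neq\sigma}\exp\left(-\beta\mathcal{H}(\upsilon)\right)\leq Z_{N}.\tag{2.3}$$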

To construct our suitable set of paths $\Gamma_{N}$ , we need to introduce some notation. Let $\kappa>0$ be arbitrary. We say that a configuration $\sigma\in\Sigma_{N}$ is good if

$$\mathcal{H}(\sigma)\leq\kappa N;$$

otherwise, we will call it bad. We will call any set of configurations, in particular an edge of $\mathcal{E}_{N}$ , good if all the configurations in it are good; otherwise, we will call the set bad. Then the set $\mathcal{E}_{N}$ can be written as the following disjoint union: $\mathcal{E}_{N}=\mathcal{G}\cup\mathcal{B}$ , where $\mathcal{G}$ and $\mathcal{B}$ denote the sets of good and bad edges, respectively.

For any path $\gamma=\{e_{1},e_{2},\ldots,e_{n}\}$ with $e_{j}\in\mathcal{E}_{N}$ , $j=1,\ldots,n$ , let $\mathring{\gamma}=\{e_{2},e_{3},\ldots,e_{n-1}\}$ denote the set of interior edges of $\gamma$ . A path $\gamma$ with all interior edges good is called good; a set of paths with all elements good is also called good. At this point, it is clear that the set of paths that we aim to construct, a good one, will depend on the realization of the random environment $\mathscr{H}$ which implies that $\Gamma_{N}$ will be a random set of paths.

One of the fundamental concepts we will need here is the notion of independent paths. Two paths $\gamma_{1}$ and $\gamma_{2}$ will be called independent if for all $\sigma\in\mathring{\gamma}_{1}$ and $\tau\in\mathring{\gamma}_{2}$ , the random variables $\mathcal{H}(\sigma)$ and $\mathcal{H}(\tau)$ are independent; equivalently, if $\sigma_{1}\neq\tau_{1}$ . An extension of this concept for a finite family of paths in $\Sigma_{N}$ can be done in an obvious way. At last, let us denote by $\text{d}_{1}(\cdot,\cdot)$ , resp. $\text{d}(\cdot,\cdot)$ , the usual Hamming distance on $\Sigma_{N_{1}}$ , resp. $\Sigma_{N}$ .

With these concepts in hand, we have the following lemma where we specify one condition under which there exist independent paths connecting configurations in $\Sigma_{N}$ . This will also motivate our subsequent definition of $\Gamma_{N}$ .

Lemma 2.1.

Let $\eta$ and $\upsilon$ be two configurations in $\Sigma_{N}$ . If $\text{d}_{1}(\eta_{1},\upsilon_{1})=n\geq 2$ , then there exists a family containing $n$ independent paths connecting $\eta$ to $\upsilon$ .

Proof.

Consider, for each pair of distinct vertices $\eta,\upsilon\in\Sigma_{N}$ , the set of paths

$$\Gamma(\eta,\upsilon)=\left\{\gamma_{\eta\upsilon}^{i}:\ i=1,2,\ldots,N\right\},$$

where $\gamma_{\eta\upsilon}^{i}$ denotes the path from $\eta$ to $\upsilon$ defined as follows. Suppose $\text{d}(\eta,\upsilon)=r\geq n$ ; then let $1\leq\ell_{m+1}<\cdots<\ell_{r}<i\leq\ell_{1}<\cdots<\ell_{m}\leq N$ be the positions where $\eta$ and $\upsilon$ disagree, $m\in\{0,\ldots,r\}$ . Let $\gamma_{\eta\upsilon}^{i}$ be the path starting at $\eta$ and ending at $\upsilon$ whose $j$ -th edge, $1\leq j\leq r$ , corresponds to flipping $\eta_{\ell_{j}}$ to $\upsilon_{\ell_{j}}$ .

For future reference, we set

$$\Gamma^{i}=\left\{\gamma_{\eta\upsilon}^{i}:\ \eta,\upsilon\in\Sigma_{N}\right\},\quad i=1,2,\ldots,N.$$

We will now argue that $\Gamma(\eta,\upsilon)$ is a family of paths that satisfies the required property. Let $1\leq i_{1}<\cdots<i_{n}\leq N_{1}$ be the positions where $\eta$ and $\upsilon$ disagree on the first level, and consider the set of paths $\{\gamma_{\eta\upsilon}^{i_{1}},\ldots,\gamma_{\eta\upsilon}^{i_{n}}\}$ . We claim that this set of paths is independent. Indeed, this is quite clear if the discrepancies between $\eta$ and $\upsilon$ are only in the first level. Otherwise, let us first notice that it is enough to consider the case where $\eta_{1}$ and $\upsilon_{1}\equiv+1$ differ in the $n$ first coordinates (where thus $\eta_{1}\equiv-1$ ); now it is just a matter of noticing that any interior configuration $\sigma$ of $\gamma_{\eta\upsilon}^{i_{j}}$ is characterized by the condition that $\sigma_{1}^{i_{j-1}}=-1$ and $\sigma_{1}^{i_{j}}=+1$ (in this paragraph, $i_{0}$ should be understood as $i_{n}$ ).

∎
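The paths $\gamma_{\eta\upsilon}^{i}$ used in the proof of Lemma 2.1 flip the disagreeing coordinates in cyclic order starting from position $i$. The following sketch builds such a path and prepares the sets of interior configurations for a small example in which all coordinates are treated as first-level coordinates (an assumption made so that independence reduces to the interior configurations being distinct). The function name `gamma_path` and the example configurations are hypothetical.

```python
def gamma_path(eta, ups, i):
    """Vertices of gamma^i from eta to ups.

    Disagreeing positions >= i are flipped first (in increasing order),
    followed by those < i (in increasing order). Positions are 1-based.
    """
    diff = [l for l in range(1, len(eta) + 1) if eta[l - 1] != ups[l - 1]]
    order = [l for l in diff if l >= i] + [l for l in diff if l < i]
    path, cur = [eta], list(eta)
    for l in order:
        cur[l - 1] = ups[l - 1]
        path.append(tuple(cur))
    return path

# eta and ups disagree at positions 1, 2, 3.
eta = (-1, -1, -1, 1)
ups = (1, 1, 1, 1)
interiors = [set(gamma_path(eta, ups, i)[1:-1]) for i in (1, 2, 3)]
```

In this example the three paths share only their endpoints, which matches the characterization in the proof: an interior configuration of the path started at $i_{j}$ still agrees with $\eta$ at position $i_{j-1}$ and already agrees with $\upsilon$ at position $i_{j}$.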

With the help of this lemma, we can now construct the random set of paths that we will consider in (2.2). Let $0<\epsilon<\nicefrac{{1}}{{2}}$ be arbitrary:

1. For a given pair of distinct configurations $\eta$ and $\upsilon$ such that $\text{d}_{1}(\eta_{1},\upsilon_{1})\geq\epsilon N_{1}$ , if there exists a good path in $\Gamma(\eta,\upsilon)$ , then we choose one such path, say the one with the smallest superscript, for $\Gamma_{N}$ ; otherwise, we choose $\gamma_{\eta\upsilon}^{1}$ ;

2. If $\text{d}_{1}(\eta_{1},\upsilon_{1})<\epsilon N_{1}$ , and there exists a good vertex $\sigma^{\prime}\in\Sigma_{N}$ such that $\text{d}_{1}(\eta_{1},\sigma_{1}^{\prime})\geq\epsilon N_{1}$ , $\epsilon N_{1}\leq\text{d}_{1}(\sigma_{1}^{\prime},\upsilon_{1})=\text{d}(\sigma^{\prime},\upsilon)\leq 2\epsilon N_{1}$ and there exist good paths, one in $\Gamma(\eta,\sigma^{\prime})$ and another in $\Gamma(\sigma^{\prime},\upsilon)$ , such that the concatenation of these two paths is a self-avoiding path with length less than $N$ , then we choose this concatenation as the path from $\eta$ to $\upsilon$ in $\Gamma_{N}$ (notice that this is a good path since $\sigma^{\prime}$ is good); otherwise, we choose $\gamma_{\eta\upsilon}^{1}$ .

It is immediate that $\Gamma_{N}$ thus chosen is a complete set of self-avoiding paths, that is each pair $\eta,\upsilon\in\Sigma_{N}$ is uniquely connected by a self-avoiding path $\gamma_{\eta\upsilon}\in\Gamma_{N}$ . Moreover, we may readily check that $\bar{\ell}(\Gamma_{N})\leq N$ , so we get the bound

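$$\varrho(\Gamma_{N})\leq\frac{N^{2}}{Z_{N}}\max_{e=(\sigma,\tau)}\left\{\exp\left(\beta[\mathcal{H}(\sigma)\vee\mathcal{H}(\tau)]\right)\sum_{\gamma_{\eta\upsilon}\ni e}\exp\left(-\beta[\mathcal{H}(\eta)+\mathcal{H}(\upsilon)]\right)\right\}.\tag{2.6}$$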

The following is a key fact about $\Gamma_{N}$ .

Proposition 2.1.

For any $\kappa>0$ and any $0<\epsilon<\nicefrac{{1}}{{2}}$ the following holds: with ${\mathbbm{P}}$ -probability $1$ there exists an $N_{0}=N_{0}(\kappa,\epsilon)\in{\mathbbm{N}}$ such that for all $N\geq N_{0}$ the set of paths $\Gamma_{N}$ is good.

Proof.

For pairs of vertices $\eta,\upsilon\in\Sigma_{N}$ such that $\text{d}_{1}(\eta_{1},\upsilon_{1})\geq\epsilon N_{1}$ , the ${\mathbbm{P}}$ -almost sure existence of good paths connecting them in $\Gamma_{N}$ is proved by arguing as in Proposition 4.1 of [11], with the help of Lemma 2.1.

For pairs of vertices $\eta,\upsilon\in\Sigma_{N}$ such that $\text{d}_{1}(\eta_{1},\upsilon_{1})<\epsilon N_{1}$ , let us first denote by $\mathcal{D}_{1}^{\eta,\upsilon}=\{i:\eta_{1}^{i}\neq\upsilon_{1}^{i}\}$ the set of positions where $\eta$ and $\upsilon$ differ on the first level and also introduce the set

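$$\Sigma_{N}^{\eta,\upsilon}=\left\{\sigma\in\Sigma_{N}:\ \sigma_{1}|_{\mathcal{D}_{1}^{\eta,\upsilon}}=\eta_{1}|_{\mathcal{D}_{1}^{\eta,\upsilon}},\ \text{d}_{1}(\sigma_{1},\eta_{1})=\lceil\epsilon N_{1}\rceil\ \text{and}\ \sigma_{j}=\upsilon_{j},\ j=2,\ldots,k\right\}\tag{2.7}$$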

where the condition “ $\sigma_{1}|_{\mathcal{D}_{1}^{\eta,\upsilon}}=\eta_{1}|_{\mathcal{D}_{1}^{\eta,\upsilon}}$ ” is not present if $\mathcal{D}_{1}^{\eta,\upsilon}=\varnothing$ . Here, $\sigma_{1}|_{D}=(\sigma^{i})_{i\in D}$ is just the restriction of $\sigma_{1}$ to set $D\subseteq\{1,\ldots,N_{1}\}$ . We may readily check that $\text{d}_{1}(\eta_{1},\omega_{1})\geq\epsilon N_{1}$ and $\epsilon N_{1}\leq\text{d}_{1}(\omega_{1},\upsilon_{1})=\text{d}(\omega,\upsilon)\leq 2\epsilon N_{1}$ for all $\omega\in\Sigma_{N}^{\eta,\upsilon}$ .

For $\sigma\in\Sigma_{N}^{\eta,\upsilon}$ , let $\gamma_{\sigma\sigma^{\prime}}$ stand for the path starting at the vertex $\sigma$ , constructed by flipping the sites whose positions belong to $\mathcal{D}_{1}^{\eta,\upsilon}$ , in increasing order of coordinate. In case $\mathcal{D}_{1}^{\eta,\upsilon}=\varnothing$ , we assume that $\sigma=\sigma^{\prime}$ and $\gamma_{\sigma\sigma^{\prime}}=\{\sigma\}$ . By Lemma 2.2 below, there are at least $(2\epsilon)^{-\epsilon N_{1}}$ such paths, which are independent by construction. Thus, since there exists a constant $c_{\kappa}>0$ such that the probability of all visited vertices for a given such path $\gamma_{\sigma\sigma^{\prime}}$ to be bad can be bounded by $e^{-c_{\kappa}N}$ when $N$ is large enough, for any $\kappa>0$ and $0<\epsilon<\nicefrac{{1}}{{2}}$ , we can find $N^{\prime}=N^{\prime}(\kappa,\epsilon)\in{\mathbbm{N}}$ such that for all $N\geq N^{\prime}$ ,

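$${\mathbbm{P}}\left(\bigcup_{(\eta,\upsilon)}\ \bigcap_{\sigma\in\Sigma_{N}^{\eta,\upsilon}}\left\{\gamma_{\sigma\sigma^{\prime}}\text{ is bad}\right\}\right)\leq\sum_{N\geq N^{\prime}}4^{N}e^{-c_{\kappa}N(2\epsilon)^{-\epsilon N_{1}}}<\infty.$$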

It then follows from the Borel-Cantelli Lemma that, for any $\kappa>0$ and $0<\epsilon<\nicefrac{{1}}{{2}}$ , with ${\mathbbm{P}}$ -probability $1$ , for all $N$ sufficiently large there exists at least one vertex, say $\omega\in\Sigma_{N}^{\eta,\upsilon}$ , and its corresponding path, say $\gamma_{\omega\omega^{\prime}}$ , which is good. By construction we have that $\eta,\omega$ are more than distance $\epsilon N_{1}$ apart, and so are $\omega^{\prime},\upsilon$ ; as before, for any $\kappa>0$ and any $0<\epsilon<\nicefrac{{1}}{{2}}$ , we can ${\mathbbm{P}}\mbox{-a.s.}$ find good paths $\gamma_{\eta\omega}$ and $\gamma_{\omega^{\prime}\upsilon}$ for all $N$ large enough. The conclusion of this case now follows by concatenating the (good) paths $\gamma_{\eta\omega},\gamma_{\omega\omega^{\prime}}$ and $\gamma_{\omega^{\prime}\upsilon}$ , to get the path from $\eta$ to $\upsilon$ in $\Gamma_{N}$ . ∎

Lemma 2.2.

For any $0<\epsilon<\nicefrac{{1}}{{2}}$ and any $\eta,\upsilon\in\Sigma_{N}$ such that $\text{d}_{1}(\eta_{1},\upsilon_{1})<\epsilon N_{1}$ , let $\Sigma_{N}^{\eta,\upsilon}$ be as in (2.7). Then

[TABLE]

Proof.

We have that

[TABLE]

where the last inequality follows from the fact that $\binom{n}{m}\geq(\nicefrac{{n}}{{m}})^{m}$ , $n\geq m\geq 1$ , and standard bounds for $\left\lfloor\cdot\right\rfloor$ and $\left\lceil\cdot\right\rceil$ . Now, since $N_{1}\uparrow\infty$ as $N\uparrow\infty$ , for any $0<\epsilon<\nicefrac{{1}}{{2}}$ , we have that $N_{1}^{-1}\leq\epsilon-2\epsilon^{2}$ for any $N$ sufficiently large. This is enough to get the statement of the lemma. ∎
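The counting behind Lemma 2.2 can be checked on concrete numbers. The displayed statement of the lemma is not reproduced above, so the sketch below assumes that it is the lower bound $(2\epsilon)^{-\epsilon N_{1}}$ quoted in the proof of Proposition 2.1, and that the cardinality of $\Sigma_{N}^{\eta,\upsilon}$ is the number of ways to choose $\lceil\epsilon N_{1}\rceil$ flipped first-level sites outside $\mathcal{D}_{1}^{\eta,\upsilon}$. The function name and the numerical values are hypothetical.

```python
import math

def count_sigma_set(N1, d, eps):
    """Number of sigma_1 that agree with eta_1 on the d disagreement sites and
    differ from eta_1 in exactly ceil(eps * N1) of the remaining N1 - d sites."""
    return math.comb(N1 - d, math.ceil(eps * N1))

# Hypothetical values with d < eps * N1 and 1 / N1 <= eps - 2 * eps**2.
N1, d, eps = 40, 5, 0.25
count = count_sigma_set(N1, d, eps)
lower = (2 * eps) ** (-eps * N1)
```

With these values the count is $\binom{35}{10}$, which exceeds both $(35/10)^{10}$ and the assumed lower bound $2^{10}$.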

Having constructed the set of paths $\Gamma_{N}$ , we can now proceed with the spectral gap estimate. From now on we assume that, for all $\kappa>0$ and all $0<\epsilon<\nicefrac{{1}}{{2}}$ , ${\mathbbm{P}}\mbox{-a.s.}$ for all large enough $N$ , $\Gamma_{N}$ is good. Recalling that $\mathcal{E}_{N}=\mathcal{G}\cup\mathcal{B}$ , where $\mathcal{G}$ and $\mathcal{B}$ denote the sets of good and bad edges respectively, we can write

[TABLE]

where $X_{N}^{\mathcal{G}}$ , respectively $X_{N}^{\mathcal{B}}$ , is as the maximum term in (2.6) but with the $\max$ sign restricted to edges in $\mathcal{G}$ , respectively $\mathcal{B}$ . From (2.3), it follows immediately that $X_{N}^{\mathcal{B}}\leq Z_{N}$ and, by Proposition 2.1, one readily concludes that $X_{N}^{\mathcal{G}}\leq\exp\left(\kappa\beta N\right)X_{N}$ , where

[TABLE]

Using these last bounds in (2.11), $\varrho(\Gamma_{N})$ can be estimated by

[TABLE]

for all large enough $N$ .

Let now

[TABLE]

so we have $X_{N}\leq X_{N}^{(1)}+X_{N}^{(2)}$ .

In Sections 3 and 4, we prove the following two results, respectively.

Proposition 2.2.

For all $\beta>0$ ,

[TABLE]

Proposition 2.3.

For all $\beta>0$ ,

[TABLE]

These propositions, combined with (1.6), immediately yield Theorem 1.

3 Proof of Proposition 2.2

We follow the strategy in [11] (see Subsection 4.2 therein), with steps that are increasingly more involved than in the $k=1$ case of that reference; in particular, our last two steps depart considerably from the direct approach there.

Step 1 – Bound in terms of $\Gamma^{1},\ldots,\Gamma^{N}$ .

Since the set $\Gamma_{N}$ is constructed using paths in $\bigcup_{i=1}^{N}\Gamma^{i}$ , if we denote

[TABLE]

for $i=1,\ldots,N$ and $M_{(N)}=M_{1}\vee\cdots\vee M_{N}$ , we get the estimate

[TABLE]

Since $M_{1},\ldots,M_{N}$ are identically distributed, it is sufficient to give an estimate for one of them with a relatively good probability estimate. Consider thus

[TABLE]

For a given edge $e=(\sigma,\tau)$ , there exists a unique coordinate $i\in\{1,\ldots,N\}$ such that $\sigma^{i}\neq\tau^{i}$ . Hence, by construction, the set of all pairs $(\eta,\upsilon)$ such that $\gamma_{\eta\upsilon}^{1}\ni e$ is exactly

[TABLE]

Then, if we denote $\sigma^{>i}=(\sigma^{i+1},\ldots,\sigma^{N})$ , $\sigma^{<i}=(\sigma^{1},\ldots,\sigma^{i-1})$ ,

[TABLE]

and

[TABLE]

we obtain the bound

[TABLE]
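The counting behind Step 1 can be illustrated on a small hypercube. The sketch below is a minimal Python example under an assumption: the paths in $\Gamma^{1}$ are taken to correct the coordinates $1,2,\ldots,N$ in increasing order, which is the standard canonical-path construction (the actual definition is given in Section 2 and is not reproduced here). Under that assumption, a path $\gamma_{\eta\upsilon}$ uses the directed edge $e=(\sigma,\tau)$ flipping coordinate $i$ exactly when $\eta$ agrees with $\sigma$ on coordinates $\geq i$ and $\upsilon$ agrees with $\tau$ on coordinates $\leq i$, so $2^{i-1}\cdot 2^{N-i}=2^{N-1}$ paths cross each edge.

```python
from collections import Counter
from itertools import product


def canonical_path(eta, upsilon):
    """Directed edges of the path from eta to upsilon fixing coordinates in order."""
    current = list(eta)
    edges = []
    for i, target in enumerate(upsilon):
        if current[i] != target:
            before = tuple(current)
            current[i] = target  # flip coordinate i to match upsilon
            edges.append((before, tuple(current)))
    return edges


def edge_loads(n):
    """Number of canonical paths through each directed edge of {-1, +1}^n."""
    vertices = list(product((-1, 1), repeat=n))
    loads = Counter()
    for eta in vertices:
        for upsilon in vertices:
            loads.update(canonical_path(eta, upsilon))
    return loads
```

Calling `edge_loads(4)` enumerates all $256$ ordered pairs and shows that every directed edge carries the same load, which is why the estimate reduces to sums over the free coordinates $\eta^{<i}$ and $\upsilon^{>i}$.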

Step 2 – Coarse graining.

Now we will focus on estimating the right-hand side of (3.7). Before turning to this, let us briefly describe our strategy. We partition the $k$ -dimensional Euclidean space into subsets $\Delta_{\ell_{1},\ldots,\ell_{k}}$ , and analyse separately the contribution to $S_{i-1}^{(1)}(\sigma^{i},\sigma^{>i})$ and $S_{N-i}^{(1)}(\sigma^{<i},-\sigma^{i})$ coming from each $\Delta_{\ell_{1},\ldots,\ell_{k}}$ , by means of large deviation-type estimates, thus securing control over the exponentially many terms involved in the above maximization. It is enough to study $S_{i-1}^{(1)}(\sigma^{i},\sigma^{>i})$ in detail; the case of $S_{N-i}^{(1)}(\sigma^{<i},-\sigma^{i})$ is entirely similar.

Let $1\leq i\leq N$ , $\sigma^{i}=\pm 1$ and $\sigma^{>i}\in\{-1,+1\}^{N-i}$ be fixed, and let $j$ be such that $i\in\{1+\sum_{n=1}^{j-1}N_{n},\ldots,\sum_{n=1}^{j}N_{n}\}$ . Let $\alpha$ be such that $\alpha N_{j}=i-(1+\sum_{n=1}^{j-1}N_{n})$ , and set $\bm{\alpha}^{j}=(\alpha_{1},\ldots,\alpha_{k})$ such that

[TABLE]

Let $\Sigma_{r,s}^{\bm{\alpha}^{j}}=\Sigma_{\alpha_{r}N_{r}}\times\cdots\times\Sigma_{\alpha_{s}N_{s}}$ , $1\leq r\leq s\leq k$ . We can thus write

[TABLE]

and

[TABLE]

where $\bar{\alpha}_{j}=1-\alpha_{j}$ , and $\bm{1}=(1,\ldots,1)$ . We stress the relationship between $i$ , $j$ and $\alpha$ established in this paragraph.
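The relationship between $i$, $j$ and $\alpha$ can be made concrete with a short Python sketch. It uses the level sizes $N_{j}=\lfloor p_{j}N\rfloor$ for $j<k$ and $N_{k}=N-\sum_{j<k}N_{j}$ from the model definition, and returns the level $j$ containing coordinate $i$ together with $\alpha$ such that $\alpha N_{j}=i-(1+\sum_{n=1}^{j-1}N_{n})$. The function names `level_sizes` and `locate` are illustrative only.

```python
from fractions import Fraction
from math import floor


def level_sizes(n, p):
    """N_j = floor(p_j * N) for j < k, and N_k = N minus the sum of the others."""
    head = [floor(Fraction(pj) * n) for pj in p[:-1]]
    return head + [n - sum(head)]


def locate(i, sizes):
    """Return (j, alpha): i lies in level j and alpha * N_j = i - (1 + N_1 + ... + N_{j-1})."""
    start = 1  # first coordinate index of the current level
    for j, size in enumerate(sizes, start=1):
        if start <= i < start + size:
            return j, Fraction(i - start, size)
        start += size
    raise ValueError("i must lie in {1, ..., N}")
```

With $N=10$ and $p=(1/2,3/10,1/5)$, the sizes are $(5,3,2)$, and coordinate $i=7$ lies in level $j=2$ with $\alpha=1/3$. Exact rationals are used so that the floor is not affected by floating-point rounding.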

Remark 3.1.

Notice that if $i\in\{1,N_{1},N_{1}+N_{2},\ldots,N\}$ (cases equivalent to $\alpha\in\{0,1\}$ ), then we readily get that

[TABLE]

By Theorem 1.5(iii) in [2] and (1.6), we thus have that for all $\beta>0$ ,

[TABLE]

For convenience, we enumerate/represent

[TABLE]

as

[TABLE]

Set $E_{u}=(E_{u_{1}}^{(1)},\ldots,E_{u}^{(k)})$ , $u=(u_{1},\ldots,u_{k})$ . With this notation, $S_{i-1}^{(1)}(\sigma^{i},\sigma^{>i})$ can be written as

[TABLE]

Let $L\in{\mathbbm{N}}$ and consider the following partition of ${\mathbbm{R}}^{k}$ :

[TABLE]

where for $n=1,\ldots,k$ , we set

[TABLE]

Now we decompose $S_{i-1}^{(1)}(\sigma^{i},\sigma^{>i})$ in the following way:

[TABLE]

where $\mathcal{L}^{*}:=\{0\leq\ell_{1},\ldots,\ell_{k}\leq L+1:\exists\,n\in\{1,\ldots,k\}\mbox{ such that }\ell_{n}=L+1\}$ .

First we consider the last sum in the right-hand side of (3.16); denote it by $S_{N}^{*}$ . We will show that this quantity is zero for all $N$ large enough ${\mathbbm{P}}\mbox{-a.s.}$ Indeed, we note first that

[TABLE]

Now consider the event

[TABLE]

One readily checks that $\left\{\sum_{u_{1}}\cdots\sum_{u_{k}}{\mathbbm{1}}\{E_{u}\in\cup_{\mathcal{L}^{*}}\Delta_{\ell_{1},\ldots,\ell_{k}}\}\geq 1\right\}\subset\mathcal{A}_{L,N}^{c}$ , so that, from Proposition 3.1 in [4], we have that the sum in (3.17), and thus $S_{N}^{*}$ , vanishes for all large $N$ ${\mathbbm{P}}\mbox{-a.s.}$

Step 3 – Large deviation estimate.

It remains to bound the first term in the right-hand side of (3.16). In order to do this, we need to introduce some notation. Given $0\leq r\leq s\leq k$ , define the canonical projection $\Pi_{r}^{s}\colon{\mathbbm{R}}^{k}\to{\mathbbm{R}}^{s-r}$ such that $\Pi_{r}^{s}x=(x_{r+1},\ldots,x_{s})$ , where by convention $\Pi_{s}^{s}\equiv 0$ . Set $\Psi_{r}^{s}\equiv\Pi_{r}^{s}\Psi_{k}=\{\Pi_{r}^{s}x:x\in\Psi_{k}\}$ . Now, let $\Phi_{r}^{s}:\Psi_{r}^{s}\to[0,\infty)$ be the functional defined by

[TABLE]

We remark that by compactness and convexity, $\Phi_{r}^{s}$ admits a unique maximum on $\Pi_{r}^{s}\Psi_{k}$ , at say $z_{r}^{s}\in\partial(\Pi_{r}^{s}\Psi_{k})$ ; set $\hat{\Phi}_{r}^{s}=\Phi_{r}^{s}(z_{r}^{s})=\left\langle\Pi_{r}^{s}\mathfrak{m}^{*},z_{r}^{s}\right\rangle$ . We note that $\|z_{r}^{s}\|^{2}=\beta_{\ast}^{2}\sum_{m=r+1}^{s}p_{m}$ .

For each $0\leq r\leq s\leq k$ , let us set

[TABLE]

with the convention that $Q_{r}^{r}\equiv 0$ .

Let now $\underline{x}=(\underline{x}_{1},\ldots,\underline{x}_{k})\in{\mathbbm{R}}^{k}$ be such that $\underline{x}_{n}=\frac{\ell_{n}}{L}\beta_{\ast}\sqrt{P_{n}}$ , $n=1,\ldots,k$ . With the above terminology, we have that ${\mathbbm{P}}\mbox{-a.s.}$ for all $N$ large enough,

[TABLE]

where

[TABLE]

with the middle sum above being over all sequences of integers $1\leq i_{1}<\cdots<i_{n}\leq j$ , and

[TABLE]

Remark 3.2.

Note that in (3.22) the point $\Pi_{0}^{j}\underline{x}=(\underline{x}_{1},\ldots,\underline{x}_{j})$ is such that $\underline{x}_{r}=0$ for all $r\neq i_{1},\ldots,i_{n}$ .

Let $n\in\{1,\ldots,j\}$ , $[i_{1},\ldots,i_{n}]$ and $1\leq\ell_{i_{1}},\ldots,\ell_{i_{n}}\leq L$ be fixed. For $1\leq r\leq n$ , set

[TABLE]

where $i_{0}=0$ . With this notation, we write

[TABLE]

Now let us estimate $K^{\star}_{\ell_{i_{1}},\ldots,\ell_{i_{n}}}$ . Let $q^{\star}_{\ell_{i_{r}}}={\mathbbm{P}}(E_{u_{1},\ldots,u_{i_{r}}}^{(i_{r})}\in\Delta_{\ell_{i_{r}}}^{i_{r}})$ . We then have for all $r=1,\ldots,n$ and $N$ large enough that

[TABLE]

Let now $c_{\star}>0$ be a positive constant to be specified later, and define the following family of integers. For all $1\leq r\leq s\leq n$ , set $U_{r,s}=\prod_{m=r}^{s}q^{\star}_{\ell_{i_{m}}}2^{N^{\star}_{m}}$ . Let $J_{0}=0$ and recursively define

[TABLE]

until $\nu=\nu_{n}\in\{0,\ldots,n\}$ such that $J_{\nu_{n}}=n$ or $U_{J_{\nu_{n}}+1,s}\geq c_{\star}N$ for all $J_{\nu_{n}}+1\leq s\leq n$ . Put $J_{\nu_{n}+1}=n+1$ . We then have that $0=J_{0}<J_{1}<\cdots<J_{\nu_{n}}<J_{\nu_{n}+1}=n+1$ . Moreover, for every $\nu=0,\ldots,\nu_{n}$ ,

[TABLE]

At last, if $\nu_{n}=0$ , then put

[TABLE]

otherwise, that is, if $\nu_{n}\in\{1,\ldots,n\}$ , then put

[TABLE]

Lemma 3.1.

With the notation introduced above, for any $c_{\star}>0$ and $1\leq\ell_{i_{1}},\ldots,\ell_{i_{n}}\leq L$ , the following holds for all large enough $N$ .

[TABLE]

In the proof of Lemma 3.1, we will use the following result.

Lemma 3.2.

For each $1\leq n\leq j$ , let $b_{n}=\prod_{r=1}^{n}\rho^{\star}_{\ell_{i_{r}}}q^{\star}_{\ell_{i_{r}}}2^{N^{\star}_{r}}$ . Set

[TABLE]

Then

[TABLE]

Proof.

From the definition of the $\rho^{\star}_{\ell_{i_{r}}}$ and (3.28), it follows that

[TABLE]

and thus $\vartheta_{n}\geq 3b_{n}/4\geq 3c_{\star}N$ . ∎

Proof of Lemma 3.1.

We will argue by induction on $n$ . In the case $n=1$ , since $\rho^{\star}_{\ell_{i_{1}}}\geq 4$ , by Chernoff’s inequality and Lemma 3.2 above, we have

[TABLE]

Assume that (3.31) is proved for $n-1$ . Introducing the random set

[TABLE]

and taking into account the independence of the Gaussian random variables, we may write

[TABLE]

where in the second inequality we also use the induction hypothesis (3.31) for $n-1$ ; here, $\{E_{u_{0}}\}$ is a relabeling of the random variables $\{E_{u_{1},\ldots,u_{n}}^{(i_{n})}\}$ . It remains to bound the last term on the right-hand side of (3.35). Notice that $b_{n}=\rho^{\star}_{\ell_{i_{n}}}b_{n-1}q^{\star}_{\ell_{i_{n}}}2^{N^{\star}_{n}}$ . Since $\rho^{\star}_{\ell_{i_{n}}}\geq 4$ , it follows from Chernoff’s inequality and Lemma 3.2 above that

[TABLE]

This concludes the proof. ∎
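The role of Chernoff's inequality in this proof can be illustrated numerically. The number of independent energies falling in a fixed cell is binomial, and the proof controls the probability that this count exceeds $\rho\geq 4$ times its mean. The sketch below compares the exact binomial upper tail with the standard multiplicative Chernoff bound $\exp(-\mu(\rho\log\rho-\rho+1))$. This generic form is an assumption made for illustration; the precise variant and constants used in the displayed estimates may differ.

```python
from math import ceil, comb, exp, log


def binomial_upper_tail(m: int, q: float, threshold: float) -> float:
    """Exact P(X >= threshold) for X ~ Binomial(m, q)."""
    start = max(0, ceil(threshold))
    return sum(comb(m, x) * q**x * (1.0 - q) ** (m - x) for x in range(start, m + 1))


def chernoff_upper_tail(mean: float, rho: float) -> float:
    """Multiplicative Chernoff bound on P(X >= rho * mean), valid for rho > 1."""
    return exp(-mean * (rho * log(rho) - rho + 1.0))
```

For example, with $m=200$ trials and success probability $q=0.01$ the mean is $\mu=2$, and `binomial_upper_tail(200, 0.01, 8.0)` stays below `chernoff_upper_tail(2.0, 4.0)`. The bound decays exponentially in the mean, which is what makes a mean of order $c_{\star}N$ sufficient for the union bound that follows.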

Coming back to (3.7), we need to make a probability estimate which holds for all possible random variables $K^{\star}_{\ell_{i_{1}},\ldots,\ell_{i_{n}}}$ involved in the max signs. Recall that there is an index for the chosen path family, the index $i$ , the configurations $\sigma^{i}$ , $\sigma^{>i}$ , and the indices $n$ , $[i_{1},\ldots,i_{n}]$ and $\ell_{i_{1}},\ldots,\ell_{i_{n}}$ . Since there are not more than $2L^{k}N^{2}2^{N+k}$ distinct such objects, it suffices to have a probability estimate in (3.31) to compensate for this factor. This suggests the choice of $c_{\star}$ for the following result, which is immediate from Lemma 3.1 and the union bound.

Proposition 3.1.

Given $\delta>0$ , assume that $c_{\star}>\log 2+2\delta$ . Then for all $N$ sufficiently large,

[TABLE]
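The threshold $\log 2$ in the assumption on $c_{\star}$ reflects the exponential growth rate of the number of objects in the union bound, namely $2L^{k}N^{2}2^{N+k}$. The Python sketch below computes the logarithm of this count and the exponent of the resulting union bound. The per-object bound $e^{-c_{\star}N}$ used in `union_bound_exponent` is a hypothetical placeholder, since the exact form of (3.31) is not reproduced here; the function names are likewise illustrative.

```python
from math import log


def log_object_count(n: int, k: int, grid: int) -> float:
    """Natural log of 2 * L^k * N^2 * 2^(N + k), with L = grid."""
    return log(2.0) + k * log(grid) + 2.0 * log(n) + (n + k) * log(2.0)


def union_bound_exponent(n: int, k: int, grid: int, c_star: float) -> float:
    """Log of count * exp(-c_star * N), assuming a per-object bound exp(-c_star * N)."""
    return log_object_count(n, k, grid) - c_star * n
```

Dividing `log_object_count(n, k, grid)` by `n` gives a rate that tends to $\log 2$ as $N$ grows, so any $c_{\star}$ exceeding $\log 2$ by a fixed margin leaves an exponentially small total for large $N$, with the polynomial factors $L^{k}N^{2}$ absorbed by the margin.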

In view of (3.22) and (3.25), one readily deduces from (3.37) that for any given $\delta>0$ , with ${\mathbbm{P}}$ -probability $\geq 1-e^{-\delta N}$ , for all $N$ large enough,

[TABLE]

where

[TABLE]

The next step is to estimate (the non-random term on) the right-hand side of (3.39).

Step 4 – Deterministic estimation.

It is worth noticing at this point that we have to make our estimation uniform with respect to all the indices involved.

Let $n\in\{1,\ldots,j\}$ and $[i_{1},\ldots,i_{n}]$ be fixed. We partition the support of the sum in (3.39) into the subsets

[TABLE]

If $\ell_{i_{1}},\ldots,\ell_{i_{n}}\in\mathcal{I}^{\star}_{n,j}(s)$ , then we have from (3.26) that

[TABLE]

for all $N$ large enough, where the exponential factor is not present if $s=n$ . (3.39) can thus be bounded above by

[TABLE]

For $0\leq s\leq n$ , let

[TABLE]

We will estimate $S^{\star}_{n,j}(s)$ by distinguishing the case $s\in\{0,\ldots,n-1\}$ from the case $s=n$ .

Case I: $s\in\{0,\ldots,n-1\}$ . From (3.20), (3.24) and Remark 3.2, we get that

[TABLE]

Now, from basic properties of inner product, we can also write

[TABLE]

Moreover, by construction, if $\ell_{i_{1}},\ldots,\ell_{i_{n}}\in\mathcal{I}^{\star}_{n,j}(s)$ , then $\Pi_{0}^{i_{s}}\underline{x}\in\Psi_{0}^{i_{s}}$ , and thus

[TABLE]

Thus, by suitably using (3.44) and (3.46), we get that

[TABLE]

Now, for $0\leq r\leq s\leq k$ , set

[TABLE]

and note that $\Psi_{r,s}^{\bm{\alpha}^{j}}$ is a nonempty closed convex subset of ${\mathbbm{R}}^{s-r}$ so, from Theorem 2 (see Appendix A), there exists a unique element of $\Psi_{r,s}^{\bm{\alpha}^{j}}$ , say $\mathfrak{w}_{r,s}^{\bm{\alpha}^{j}}$ , such that

[TABLE]

Since we have $s=J_{\nu_{n}}$ , it is not difficult to see with the help of Remark 3.2 and (3.28) that $\Pi_{i_{s}}^{j}\underline{x}\in\Psi_{i_{s},j}^{\bm{\alpha}^{j}}$ ; it then follows from (3.49) that

[TABLE]

For each $0\leq l\leq r\leq k$ , let

[TABLE]

With this definition, (3.47) and (3.50) imply that for all $N$ large enough and for any $s=0,\ldots,n-1$ ,

[TABLE]

Before going to the next case, let us point out that for any $0\leq l\leq r\leq k$ , we have

[TABLE]

Indeed, let $\mathcal{L}_{l,r}^{\bm{\alpha}^{j}}\colon\Psi_{l,r}^{\bm{\alpha}^{j}}\to{\mathbbm{R}}$ be given by $\mathcal{L}_{l,r}^{\bm{\alpha}^{j}}(x)=\left\langle\Pi_{l}^{r}\mathfrak{m}^{*}-\mathfrak{w}_{l,r}^{\bm{\alpha}^{j}},x-\mathfrak{w}_{l,r}^{\bm{\alpha}^{j}}\right\rangle$ , and notice, from (3.51b), that

[TABLE]

According to Theorem 2 (see Appendix A), we have that $\mathcal{L}_{l,r}^{\bm{\alpha}^{j}}(x)\leq 0$ for all $x\in\Psi_{l,r}^{\bm{\alpha}^{j}}$ . The claim follows.

Case II: $s=n$ . In this case, since

[TABLE]

it is immediate from (3.53) that $S^{\star}_{n,j}(n)$ can be estimated by

[TABLE]

for all large enough $N$ . Summarizing and coming back to (3.42), we get that

[TABLE]

for all $N$ sufficiently large, for some $c>0$ not depending on $N$ or $L$ , where $\hat{\Phi}_{0}^{0}\equiv G_{s,s}\equiv 0$ .

Recall (3.38). In view of (3.57) and standard combinatorial estimates, we obtain that for any $\delta>0$ , with ${\mathbbm{P}}$ -probability $\geq 1-e^{-\delta N}$ for all $N$ large enough,

[TABLE]

for some constant $c>0$ . This concludes the estimation of $S_{j}^{N}$ .

Let us now recall (3.21). Since we have already estimated $S_{j}^{N}$ , it remains to estimate the term $e^{\frac{k}{L}\beta_{\ast}\beta N}e^{\tfrac{\beta_{\ast}^{2}}{2}\,Q_{0}^{j}N}$ . From (3.53), we readily find that

[TABLE]

It follows from Proposition 3.1, (3.58) and (3.59) that for any $\delta>0$ , with ${\mathbbm{P}}$ -probability $\geq 1-e^{-\delta N}$ for all $N$ large enough,

[TABLE]

Symmetrically, we also have that

[TABLE]

Thus, letting

[TABLE]

we get that with ${\mathbbm{P}}$ -probability $\geq 1-e^{-\delta N}$ for all $N$ large enough,

[TABLE]

Step 5 – Maximization.

As a final step, it remains to maximize $\psi_{j}(\beta,\bm{\alpha}^{j})$ over $j\in\{1,\ldots,k\}$ and $\alpha\in[0,1]$ . We do this in the following lemma.

Lemma 3.3.

For every $1\leq j\leq k$ and all $0\leq\alpha\leq 1$ ,

[TABLE]

Proof.

Recall the definitions (3.8) of $\bm{\alpha}^{j}$ , (3.20) of $Q_{r}^{s,\bm{\alpha}^{j}}$ and (3.51a-b) of $G_{r,s}(\beta,\bm{\alpha}^{j})$ . Also recall, from the discussion at the beginning of Step 3 above, that $\hat{\Phi}_{r}^{s}=\left\langle\Pi_{r}^{s}\mathfrak{m}^{*},z_{r}^{s}\right\rangle$ denotes the maximum of $\Phi_{r}^{s}$ over $\Psi_{r}^{s}$ , attained at the point $z_{r}^{s}$ . We claim that

[TABLE]

for any $0\leq s\leq j$ and any $j-1\leq r\leq k$ .

We check this for $0\leq s<j$ and $r=j-1$ . The other cases follow from similar arguments. Noting that $\hat{\Phi}_{j-1}^{r}$ is not present on the left-hand side of (3.65) in this case, we see from the definition of $\bm{\alpha}^{j}$ that this left-hand side is equal to

[TABLE]

Now, using “ $\circ$ ” to indicate vector concatenation, it is immediate to observe that $\mathfrak{w}_{j-1,k}^{\bm{1}-\bm{\alpha}^{j}}=\Pi_{0}^{1}\mathfrak{w}_{j-1,k}^{\bm{1}-\bm{\alpha}^{j}}\circ\Pi_{1}^{k-j+1}\mathfrak{w}_{j-1,k}^{\bm{1}-\bm{\alpha}^{j}}$ , and so we find that the expression in (3.66) equals

[TABLE]

where the inequality follows from the facts that $\left\langle z_{0}^{s},z_{0}^{s}\right\rangle=\beta_{\ast}^{2}\sum_{n=1}^{s}p_{n}$ (as noted right below (3.19)) and $\mathfrak{m}_{j}^{*}\cdot\Pi_{0}^{1}\mathfrak{w}_{j-1,k}^{\bm{1}-\bm{\alpha}^{j}}\leq\hat{\Phi}_{j-1}^{j}$ (by the maximality of the latter quantity). Convexity now implies that $z_{0}^{s}\circ\mathfrak{w}_{s,j}^{\bm{\alpha}^{j}}\circ\Pi_{1}^{k-j+1}\mathfrak{w}_{j-1,k}^{\bm{1}-\bm{\alpha}^{j}}\in\Psi_{k}$ , and thus from Theorem 2 we may conclude that

[TABLE]

(3.65) is now just a matter of recalling (1.12).

From (3.65), we find that

[TABLE]

The lemma now follows readily from the fact that $\hat{\Phi}_{0}^{j-1}+\hat{\Phi}_{j-1}^{j}+\hat{\Phi}_{j}^{k}\leq\left\langle\mathfrak{m}^{*},\mathfrak{w}^{*}\right\rangle$ . ∎

Now, from (3.7), (3.63), Lemma 3.3, and the Borel-Cantelli Lemma, we get that

[TABLE]

Replacing this in (3.2), and since $L$ is arbitrary, Proposition 2.2 follows.

4 Proof of Proposition 2.3

Recall (2.15). Let us start by briefly describing the strategy we use to prove Proposition 2.3. In Proposition 2.1, we have shown that, for each pair of vertices $(\eta,\upsilon)$ such that $\text{d}_{1}(\eta_{1},\upsilon_{1})<\epsilon N_{1}$ , the path connecting them in $\Gamma_{N}$ has, with ${\mathbbm{P}}$ -probability $1$ , the form $\gamma_{\eta\upsilon}=\gamma_{\eta\omega}\cup\gamma_{\omega\upsilon}$ for all large enough $N$ , where the vertex $\omega$ , which we will refer to here as the intermediate point of the path $\gamma_{\eta\upsilon}$ , is such that $\text{d}_{1}(\eta_{1},\omega_{1})\geq\epsilon N_{1}$ and $\epsilon N_{1}\leq\text{d}_{1}(\omega_{1},\upsilon_{1})=\text{d}(\omega,\upsilon)\leq 2\epsilon N_{1}$ . Keeping this in mind, since the summation in the right-hand side of (2.15) is over a set of self-avoiding paths $\gamma_{\eta\upsilon}$ that go through the edge $e$ , we have that either $e\in\gamma_{\eta\omega}$ , or $e\in\gamma_{\omega\upsilon}$ . So, our plan is to proceed with the estimation of $X_{N}^{(2)}$ by considering these two cases separately.

Recall (2.7). Using the above arguments, we get that with ${\mathbbm{P}}$ -probability $1$ ,

[TABLE]

for all large enough $N$ , where

[TABLE]

and

[TABLE]

Notice that, by our construction, $\gamma_{\eta\omega}$ and $\gamma_{\omega\upsilon}$ have no edge in common.

Let us first estimate the term $Y_{N}^{\prime\prime}$ . To do this, it is enough to notice that for a given edge $e=(\sigma,\tau)$ the sum in the right-hand side of (4.3) is over a set of paths connecting pairs of vertices $(\eta,\upsilon)$ such that $\eta$ is in a hypercube of dimension at most $N$ around $\sigma$ and $\upsilon$ is in a hypercube of dimension at most $2\epsilon N_{1}$ around $\tau$ . Using this, it follows that

[TABLE]

hence, by Theorem 1.5(iii) in [2] and (1.6),

[TABLE]

since $0<\epsilon<\nicefrac{{1}}{{2}}$ is arbitrary.

To estimate the term $Y_{N}^{\prime}$ , we use essentially the same argument as in the proof of Proposition 2.2. Arguing as we did to get (3.2), we can write

[TABLE]

where $Y_{N}^{\prime}(i)$ is as in (4.2) but with the paths $\gamma_{\eta\omega}$ in $\Gamma^{i}$ . Again, it is sufficient to consider the variable $Y_{N}^{\prime}(1)$ . Now, using the fact that the set $\{(\eta,\omega)\in\Sigma_{N}\times\Sigma_{N}^{\eta,\upsilon}:\gamma_{\eta\upsilon}^{1}\ni e\}$ is equal to

[TABLE]

for a given edge $e=(\sigma,\tau)$ , with respective $i\in\{1,\ldots,N\}$ ; using the same notation used in (3.7), we readily get

[TABLE]

where the power of 4 error factor arises due to the condition $\text{d}_{1}(\omega_{1},\upsilon_{1})=\text{d}(\omega,\upsilon)\leq 2\epsilon N_{1}$ . Arguing now as at the end of Section 3, since $0<\epsilon<\nicefrac{{1}}{{2}}$ is arbitrary, we conclude that

[TABLE]

Hence, the claim of Proposition 2.3 holds.

Appendix A Appendix

Lemma A.1.

Let $\mathfrak{w}^{*}=(\mathfrak{w}_{1}^{*},\ldots,\mathfrak{w}_{k}^{*})$ be the point of ${\mathbbm{R}}^{k}$ such that

[TABLE]

Then, $\mathfrak{w}^{*}\in\Psi_{k}$ and

[TABLE]

Proof.

The proof of Lemma A.1 is inspired by one in [8], and its key tool is the Cauchy-Schwarz inequality. The fact that $\mathfrak{w}^{*}\in\Psi_{k}$ is an immediate consequence of definition (1.9) and the assumptions $\sum_{j=1}^{k}a_{j}=\sum_{j=1}^{k}p_{j}=1$ . Now, let $x\in\Psi_{k}$ . By the Cauchy-Schwarz inequality, for all $l\in\{1,\ldots,l_{k}\}$ , we have

[TABLE]

Since $\left\|\Pi_{0}^{J_{l}^{*}}\mathfrak{w}^{*}\right\|^{2}=P_{J_{l}^{*}}$ , it follows that

[TABLE]

Hence,

[TABLE]

Set $y_{l}=\sum_{j=J_{l-1}^{*}+1}^{J_{l}^{*}}\beta_{l}\sqrt{a_{j}}(\beta_{l}\sqrt{a_{j}}-x_{j})$ , $l=1,\ldots,l_{k}$ , and consider the numbers $\beta\beta_{1}^{-1}>\cdots>\beta\beta_{l_{k}}^{-1}>0$ . From what we have just seen, the sequences $(y_{l})_{l=1}^{l_{k}}$ and $(\beta\beta_{l}^{-1})_{l=1}^{l_{k}}$ satisfy the conditions of Lemma A in [8] so that we readily get

[TABLE]

This concludes the proof of Lemma A.1. ∎

Theorem 2 (Projection onto a closed convex set).

Let $K\subset H$ be a nonempty closed convex set. Then for every $f\in H$ there exists a unique element $u\in K$ such that

[TABLE]

Moreover, $u$ is characterized by the property

[TABLE]

See [3], Theorem V.2, p. 79.
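The characterization in Theorem 2 is the property invoked in Section 3 to obtain $\mathcal{L}_{l,r}^{\bm{\alpha}^{j}}(x)\leq 0$. As a minimal illustration, the Python sketch below takes $K$ to be a closed Euclidean ball in ${\mathbbm{R}}^{k}$ (a choice made here only because its projection has a closed form; it is not the set $\Psi_{k}$ of the paper) and checks numerically that the projection $u$ of $f$ satisfies $\langle f-u,v-u\rangle\leq 0$ for sampled $v\in K$. All function names are illustrative.

```python
import random
from math import sqrt


def dot(x, y):
    """Euclidean inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def project_onto_ball(f, radius):
    """Projection of f onto the closed ball of the given radius centred at 0."""
    norm = sqrt(dot(f, f))
    if norm <= radius:
        return list(f)
    return [radius * a / norm for a in f]


def max_variational_gap(f, radius, trials=2000, seed=0):
    """Largest sampled value of <f - u, v - u> over points v in the ball."""
    rng = random.Random(seed)
    u = project_onto_ball(f, radius)
    residual = [a - b for a, b in zip(f, u)]
    worst = float("-inf")
    for _ in range(trials):
        direction = [rng.gauss(0.0, 1.0) for _ in f]
        length = sqrt(dot(direction, direction))
        scale = radius * rng.random() / length  # norm of v is at most radius
        v = [scale * a for a in direction]
        worst = max(worst, dot(residual, [a - b for a, b in zip(v, u)]))
    return worst
```

For a point outside the ball, such as `f = [3.0, -4.0, 12.0]` with `radius = 2.0`, the sampled gap is never positive up to rounding, in agreement with the variational inequality; for a point inside the ball the projection is the point itself and the gap is zero.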

Acknowledgements

This work is part of the Ph.D. thesis of the second author at IME-USP and was supported in part by CNPq 140762/2016-7. We warmly thank Pierre Picco for suggesting this problem and for innumerable discussions concerning it on many occasions.

Bibliography

[1] Ben Arous, G. and Jagannath, A. Spectral gap estimates in mean field spin glasses. Communications in Mathematical Physics 361(1), 1–52 (2018).
[2] Bovier, A. and Kurkova, I. Derrida’s generalised random energy models 1: models with finitely many hierarchies. Ann. Inst. H. Poincaré Probab. Statist. 40(4), 439–480 (2004).
[3] Brezis, H. Analyse fonctionnelle. Théorie et applications. Masson (1983).
[4] Capocaccia, D., Cassandro, M. and Picco, P. On the existence of thermodynamics for the generalized random energy model. Journal of Statistical Physics 46(3-4), 493–505 (1987).
[5] Černý, J. and Wassmer, T. Aging of the Metropolis dynamics on the random energy model. Probability Theory and Related Fields 167(1-2), 253–303 (2017).
[6] Derrida, B. A generalization of the random energy model which includes correlations between energies. J. Phys. Lett. 46(9), 401–407 (1985).
[7] Diaconis, P. and Stroock, D. Geometric bounds for eigenvalues of Markov chains. The Annals of Applied Probability 1(1), 36–61 (1991).
[8] Dorlas, T. C. and Dukes, W. M. B. Large deviation approach to the generalized random energy model. Journal of Physics A: Mathematical and General 35(20), 4385 (2002).
