Localization in Gaussian disordered systems at low temperature

Erik Bates; Sourav Chatterjee

arXiv:1906.05502·math.PR·August 30, 2021


TL;DR

This paper demonstrates that in Gaussian disordered systems at low temperatures, the Gibbs measure concentrates around a few states, providing new insights into localization phenomena in spin glasses and directed polymers.

Contribution

It introduces a unified argument showing localization in Gaussian disordered systems, enabling results on path localization and Gibbs state exhaustiveness without relying on traditional identities.

Findings

01

Gibbs measure localizes in small neighborhoods of few states

02

Path localization for directed polymers achieved without exact solvability

03

Gibbs states are exhaustive in spin glasses without Ghirlanda-Guerra identities

Abstract

For a broad class of Gaussian disordered systems at low temperature, we show that the Gibbs measure is asymptotically localized in small neighborhoods of a small number of states. From a single argument, we obtain (i) a version of "complete" path localization for directed polymers that is not available even for exactly solvable models; and (ii) a result about the exhaustiveness of Gibbs states in spin glasses not requiring the Ghirlanda-Guerra identities.

Equations

$$\mu_{n}^{\beta}(\mathrm{d}\sigma) \coloneqq \frac{e^{\beta H_{n}(\sigma)}}{Z_{n}(\beta)}\,P_{n}(\mathrm{d}\sigma), \quad \text{where} \quad Z_{n}(\beta) \coloneqq \int e^{\beta H_{n}(\sigma)}\,P_{n}(\mathrm{d}\sigma).$$

$$F_{n}(\beta) \coloneqq \frac{1}{n}\log Z_{n}(\beta),$$

$$\lim_{n\to\infty} F_{n}(\beta) = p(\beta) \quad \mathbb{P}\text{-a.s. and in } L^{1}(\mathbb{P}), \text{ for every } \beta\in\mathbb{R}.$$

$$\mathrm{Var}\,H_{n}(\sigma) = n.$$

$$\mathrm{Cov}\big(H_{n}(\sigma^{1}),H_{n}(\sigma^{2})\big) \geq -n\mathscr{E}_{n},$$

$$H_{n}(\sigma) = \sum_{i=1}^{\infty} g_{i,n}\,\varphi_{i,n}(\sigma),$$

$$\langle f(\sigma)\rangle = \frac{E_{n}\big(f(\sigma)\,e^{\beta H_{n}(\sigma)}\big)}{E_{n}\big(e^{\beta H_{n}(\sigma)}\big)}.$$

$$\mathcal{R}(\sigma^{1},\sigma^{2})$$

$$-\mathscr{E}_{n} \leq \mathcal{R}(\sigma^{1},\sigma^{2}) \leq 1.$$

$$\rho(\sigma^{1},\sigma^{2}) \coloneqq 1 - \mathcal{R}_{1,2}.$$

$$\mathbb{E}F_{n}(\beta) \leq \frac{1}{n}\log\mathbb{E}Z_{n}(\beta) = \frac{\beta^{2}}{2},$$

$$\lim_{n\to\infty}\mathbb{E}\langle\mathcal{R}_{1,2}\rangle = 1 - \frac{p^{\prime}(\beta)}{\beta}.$$

$$\mathcal{B}(\sigma,\delta) \coloneqq \{\sigma^{\prime}\in\Sigma_{n} : \mathcal{R}(\sigma,\sigma^{\prime}) \geq \delta\}, \quad \sigma\in\Sigma_{n},\ \delta>0.$$

$$\mu_{n}^{\beta}\Big(\bigcup_{j=1}^{k}\mathcal{B}(\sigma^{j},\delta)\Big) \geq 1-\varepsilon.$$

$$\mathcal{R}(\sigma^{1}) \coloneqq \langle\mathcal{R}_{1,2} \mid \sigma^{1}\rangle = \frac{1}{n}\sum_{i=1}^{\infty}\varphi_{i,n}(\sigma^{1})\,\langle\varphi_{i,n}(\sigma^{2})\rangle.$$

$$\mathcal{A}_{n,\delta} \coloneqq \{\sigma\in\Sigma_{n} : \mathcal{R}(\sigma) \leq \delta\}.$$

$$\limsup_{n\to\infty}\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{n,\delta}}\rangle \leq \varepsilon.$$

$$B_{n,\delta} \coloneqq \{\langle\mathcal{R}_{1,2}\rangle \leq \delta\},$$

$$\limsup_{n\to\infty}\mathbb{P}(B_{n,\delta}) \leq \varepsilon.$$

$$H_{n}(\sigma) = \sum_{p\geq 2}\frac{\beta_{p}}{n^{(p-1)/2}}\sum_{i_{1},\dots,i_{p}=1}^{n} g_{i_{1},\dots,i_{p}}\,\sigma_{i_{1}}\cdots\sigma_{i_{p}}.$$

$$\sum_{p\geq 2}\beta_{p}^{2}(1+\varepsilon)^{p} < \infty \quad \text{for some } \varepsilon>0,$$

$$\xi(1) = 1 \quad \text{and} \quad \xi(q) \geq 0 \ \text{ for all } q\in[-1,1].$$

$$\mathcal{R}_{j,k} = \xi(R_{j,k}), \quad \text{where} \quad R_{j,k} \coloneqq \frac{1}{n}\sum_{i=1}^{n}\sigma_{i}^{j}\sigma_{i}^{k} \in [-1,1].$$

$$\mu_{n}^{\beta}\Big(\bigcup_{j=1}^{k}\{\sigma^{k+1}\in\Sigma_{n} : |R_{j,k+1}| \geq \delta\}\Big) \geq 1-\varepsilon.$$

$$P_{n}(\sigma(0)=0)$$

$$P_{n}(\sigma(i)=y \mid \sigma(i-1)=x)$$

$$H_{n}(\sigma) = \sum_{i=1}^{n} g(i,\sigma(i)) = \sum_{i=1}^{n}\sum_{x\in\mathbb{Z}^{d}} g(i,x)\,\mathds{1}_{\{\sigma(i)=x\}}.$$

$$\mathcal{R}_{1,2} = \frac{1}{n}\sum_{i=1}^{n}\mathds{1}_{\{\sigma^{1}(i)=\sigma^{2}(i)\}}.$$

$$\mu^{\beta}_{n}\big(\big\{\sigma : \sigma(n)\in\{x_{1},\dots,x_{k}\}\big\}\big) \geq 1-\varepsilon.$$

$$\mu_{n}^{\beta}\bigg(\bigcup_{j=1}^{k}\Big\{\sigma^{k+1}\in\Sigma_{n} : \frac{1}{n}\sum_{i=1}^{n}\mathds{1}_{\{\sigma^{k+1}(i)=\sigma^{j}(i)\}} \geq \delta\Big\}\bigg) \geq 1-\varepsilon.$$

$$p(\beta) = \begin{cases}\beta^{2}/2 & \beta\leq\beta_{\mathrm{c}},\\ \beta_{\mathrm{c}}^{2}/2 + (\beta-\beta_{\mathrm{c}})\beta_{\mathrm{c}} & \beta>\beta_{\mathrm{c}}.\end{cases}$$


Full text

Localization in Gaussian disordered systems at low temperature

Erik Bates

Department of Mathematics

University of California, Berkeley

1067 Evans Hall

Berkeley, CA 94720-3840

and

Sourav Chatterjee

Department of Statistics

Stanford University

Sequoia Hall, 390 Jane Stanford Way

Stanford, CA 94305

Key words and phrases:

Replica overlap, Gaussian disorder, spin glasses, directed polymers, path localization

2010 Mathematics Subject Classification:

60G15, 60G17, 60K37, 82B44, 82D30, 82D60

E.B. was partially supported by NSF grants DGE-114747 and DMS-1902734.

S.C. was partially supported by NSF grant DMS-1608249.

1. Introduction

A ubiquitous theme in statistical mechanics is to understand how a system behaves differently at high and low temperatures. In a disordered system, where the interactions between its elements are governed by random quantities, the strength of the disorder is determined by temperature. Namely, high temperatures mean the disorder is weak, and the system is likely to resemble a generic one based on entropy. On the other hand, low temperatures indicate strong disorder, which creates dramatically different behavior in which the system is constrained to a small set of states that are energetically favorable. In the latter case, this concentration phenomenon is often called “localization”.

A useful statistic in distinguishing different temperature regimes is the so-called “replica overlap”. That is, given the disorder, one can study the similarity of two independently observed states. If the disorder is strong, then these two states should closely resemble one another with good probability, since we believe the system is bound to a relatively small number of possible realizations. Some version of this statement has been rigorously established in a number of contexts, most famously in spin glass theory but also in the settings of disordered random walks and disordered Brownian motion. Unfortunately, it does not follow that the number of realizable states is small, but only that there is a small number of states that are observed with positive probability.

In the present study, our entry point to this problem is to consider conditional overlap. Whereas previous results in the literature show the overlap distribution between two independent states has a nonzero component, we ask whether the same is true even if one conditions on the first state. That is, does a typical state always have positive expected overlap with an independent one? We show that for a broad class of Gaussian disordered systems, the answer is yes, the key implication being that the entire realizable state space is small. Specifically, there is an $O(1)$ number of states such that all but a negligible fraction of samples from the system will have positive overlap with one of these states.

The general setting, notation, motivation, and results are given in Sections 1.1–1.4, respectively. The consequences for spin glasses, directed polymers, and other Gaussian fields are discussed in Sections 1.5 and 1.6.

1.1. Model and assumptions

Let $(\Omega,\mathcal{F},\mathbb{P})$ be an abstract probability space, and $(\Sigma_{n})_{n\geq 1}$ a sequence of Polish spaces equipped respectively with probability measures $(P_{n})_{n\geq 1}$ . For each $n$ , we consider a centered Gaussian field $H_{n}$ indexed by $\Sigma_{n}$ and defined on $\Omega$ . Viewing this field as a Hamiltonian, we have the associated Gibbs measure at inverse temperature $\beta$ :

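$$\mu_{n}^{\beta}(\mathrm{d}\sigma) \coloneqq \frac{e^{\beta H_{n}(\sigma)}}{Z_{n}(\beta)}\,P_{n}(\mathrm{d}\sigma), \quad \text{where} \quad Z_{n}(\beta) \coloneqq \int e^{\beta H_{n}(\sigma)}\,P_{n}(\mathrm{d}\sigma).$$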

Our results concern the relationship between the free energy,

$$F_{n}(\beta) \coloneqq \frac{1}{n}\log Z_{n}(\beta),$$

and the covariance structure of $H_{n}$ . We make the following assumptions:

•

There is a deterministic function $p:\mathbb{R}\to\mathbb{R}$ such that

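$$\lim_{n\to\infty} F_{n}(\beta) = p(\beta) \quad \mathbb{P}\text{-a.s. and in } L^{1}(\mathbb{P}), \text{ for every } \beta\in\mathbb{R}.$$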

•

For every $\sigma\in\Sigma_{n}$ ,

$$\mathrm{Var}\,H_{n}(\sigma) = n.$$

•

For every $\sigma^{1},\sigma^{2}\in\Sigma_{n}$ ,

$$\mathrm{Cov}\big(H_{n}(\sigma^{1}),H_{n}(\sigma^{2})\big) \geq -n\mathscr{E}_{n},$$

where $\mathscr{E}_{n}$ is a nonnegative constant tending to $0$ as $n\to\infty$.

•

For each $n$ , there exist measurable real-valued functions $(\varphi_{i,n})_{i=1}^{\infty}$ on $\Sigma_{n}$ and i.i.d. standard normal random variables $(g_{i,n})_{i=1}^{\infty}$ defined on $\Omega$ such that for each $\sigma\in\Sigma_{n}$ , with $\mathbb{P}$ -probability $1$ ,

$$H_{n}(\sigma) = \sum_{i=1}^{\infty} g_{i,n}\,\varphi_{i,n}(\sigma),$$

where the series on the right converges in $L^{2}(\mathbb{P})$ .

Remark 1.1.

In all applications of interest (see Section 1.5), the third assumption above (the lower bound on the covariance) is trivially satisfied with $\mathscr{E}_{n}=0$. Nevertheless, we assume throughout only that $\mathscr{E}_{n}\to 0$ (at any rate). This modest relaxation is made so our results can apply to slightly more general models, for instance perturbations of the standard models we will soon describe.

Remark 1.2.

The fourth assumption above (the series representation of $H_{n}$) is very mild: for example, it always holds when $\Sigma_{n}$ is finite. More generally, a sufficient condition for the existence of such a representation is that $\Sigma_{n}$ is compact in the metric defined by $H_{n}$ (namely, the metric that defines the distance between $\sigma$ and $\sigma^{\prime}$ as the $L^{2}$ distance between the random variables $H_{n}(\sigma)$ and $H_{n}(\sigma^{\prime})$). For a proof of this standard result, see [1, Theorem 3.1.1]. Furthermore, in all applications of interest, $H_{n}$ will actually be explicitly defined using a sum of this form.
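When $\Sigma_{n}$ is finite, the Gibbs measure and free energy of this subsection reduce to a reweighted softmax and a log-sum-exp. The following is a minimal sketch under stated assumptions: the function name `gibbs_and_free_energy`, the toy choice of independent $N(0,n)$ energies, and the uniform reference measure are illustrative and do not come from the paper.

```python
import numpy as np

def gibbs_and_free_energy(h, ref, beta, n):
    """Return (mu, F) for energies h and reference probabilities ref.

    mu[s] = ref[s] * exp(beta * h[s]) / Z,  F = (1/n) * log Z,
    where Z = sum_s ref[s] * exp(beta * h[s]).
    """
    log_w = beta * h + np.log(ref)
    shift = log_w.max()                      # log-sum-exp shift for stability
    log_z = shift + np.log(np.exp(log_w - shift).sum())
    mu = np.exp(log_w - log_z)
    return mu, log_z / n

# Toy example (assumption): 2**n states with independent N(0, n) energies.
n = 10
rng = np.random.default_rng(0)
h = np.sqrt(n) * rng.standard_normal(2**n)
ref = np.full(2**n, 2.0**-n)                 # uniform reference measure P_n
mu, free_energy = gibbs_and_free_energy(h, ref, beta=1.0, n=n)
```

At $\beta=0$ the sketch returns the reference measure itself and a free energy of zero, matching $Z_{n}(0)=1$.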

1.2. Notation

Unless stated otherwise, “almost sure” and “in $L^{\alpha}$ ” statements are with respect to $\mathbb{P}$ . We will use $E_{n}$ and $\mathbb{E}$ to denote expectation with respect to $P_{n}$ and $\mathbb{P}$ , respectively. Absent any decoration, $\langle\cdot\rangle$ will always denote expectation with respect to $\mu_{n}^{\beta}$ , meaning

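$$\langle f(\sigma)\rangle = \frac{E_{n}\big(f(\sigma)\,e^{\beta H_{n}(\sigma)}\big)}{E_{n}\big(e^{\beta H_{n}(\sigma)}\big)}.$$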

At various points in the paper, we will decorate $\langle\cdot\rangle$ to denote expectation with respect to some perturbation of $\mu_{n}^{\beta}$ . The type of perturbation will change between sections. The symbols $\sigma^{j}$ , $j=1,2,\dots$ , shall denote independent samples from $\mu_{n}^{\beta}$ if appearing within $\langle\cdot\rangle$ , or from $P_{n}$ if appearing within $E_{n}(\cdot)$ . We will refer to the vector ${\boldsymbol{g}}_{n}=(g_{i,n})_{i=1}^{\infty}$ as the disorder or random environment. Sometimes we will consider multiple environments at the same time, which will necessitate that we write $\mu_{n,{\boldsymbol{g}}_{n}}^{\beta}$ instead of $\mu_{n}^{\beta}$ to emphasize the dependence on the environment ${\boldsymbol{g}}_{n}$ .

In the sequel, $\sum_{i}$ will always mean $\sum_{i=1}^{\infty}$ , and we will condense our notation to $\varphi_{i}=\varphi_{i,n}(\sigma)$ when we are dealing with some fixed $n$ . Similarly, $g_{i,n}$ will be shortened to $g_{i}$ and ${\boldsymbol{g}}_{n}$ will be shortened to ${\boldsymbol{g}}$ . Also, $C(\cdot)$ will indicate a positive constant that depends only on the argument(s). In particular, no such constant depends on ${\boldsymbol{g}}$ or $n$ . We will not concern ourselves with the precise value, which may change from line to line.

1.3. Motivation

Our results will be stated in terms of the correlation or overlap function,

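$$\mathcal{R}(\sigma^{1},\sigma^{2}) \coloneqq \frac{1}{n}\,\mathrm{Cov}\big(H_{n}(\sigma^{1}),H_{n}(\sigma^{2})\big) = \frac{1}{n}\sum_{i}\varphi_{i,n}(\sigma^{1})\,\varphi_{i,n}(\sigma^{2}).$$

Note that the second and third assumptions of Section 1.1 (the variance normalization and the covariance lower bound) imply

$$-\mathscr{E}_{n} \leq \mathcal{R}(\sigma^{1},\sigma^{2}) \leq 1.$$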

We will often abbreviate $\mathcal{R}(\sigma^{j},\sigma^{k})$ to $\mathcal{R}_{j,k}$ .

The Gaussian process $(H_{n}(\sigma))_{\sigma\in\Sigma_{n}}$ naturally defines a (pseudo)metric $\rho$ on $\Sigma_{n}$ , given by

$$\rho(\sigma^{1},\sigma^{2}) \coloneqq 1 - \mathcal{R}_{1,2}. \tag{1.1}$$

Given the metric topology, we can study the so-called “energy landscape” of $\beta H_{n}$ on $\Sigma_{n}$ . The geometry of this landscape is intimately related to the free energy. By Jensen’s inequality,

$$\mathbb{E}F_{n}(\beta) \leq \frac{1}{n}\log\mathbb{E}Z_{n}(\beta) = \frac{\beta^{2}}{2}, \tag{1.2}$$

which in particular implies $p(\beta)\leq\beta^{2}/2$ . In general, whether or not this inequality is strict determines the nature of the energy landscape: In order for $p(\beta)=\beta^{2}/2$ , the fluctuations of $\log Z_{n}(\beta)$ must be relatively small so that the Jensen gap in (1.2) is $o(1)$ . This behavior arises when the Gaussian deviations of $\beta H_{n}(\sigma)$ are washed out by the entropy of $P_{n}$ , creating a more or less flat landscape. On the other hand, if $p(\beta)<\beta^{2}/2$ , then these deviations will have overcome the entropy of $P_{n}$ , producing large peaks and valleys where $\beta H_{n}(\sigma)$ is exceptionally positive or negative. From a physical perspective, this latter scenario is more interesting, as these peaks can account for an exponentially vanishing fraction of the state space even as their union accounts for a non-vanishing fraction of the mass of $\mu_{n}^{\beta}$ . The primary goal of this paper is to give a sufficient condition for when (in a sense Theorem 1.3 makes precise) $\mu_{n}^{\beta}$ places all of its mass on this union of peaks.
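The Jensen bound above rests on a one-line Gaussian computation. As a sketch, it uses only that each $H_{n}(\sigma)$ is a centered Gaussian with variance $n$, together with Fubini's theorem to exchange $\mathbb{E}$ with the integral against $P_{n}$:

```latex
\begin{align*}
\mathbb{E} Z_{n}(\beta)
  &= \int \mathbb{E}\, e^{\beta H_{n}(\sigma)}\, P_{n}(\mathrm{d}\sigma)
   = \int e^{\beta^{2} n/2}\, P_{n}(\mathrm{d}\sigma)
   = e^{\beta^{2} n/2}, \\
\mathbb{E} F_{n}(\beta)
  &= \frac{1}{n}\,\mathbb{E}\log Z_{n}(\beta)
   \leq \frac{1}{n}\log \mathbb{E} Z_{n}(\beta)
   = \frac{\beta^{2}}{2}.
\end{align*}
```

Since $F_{n}(\beta)\to p(\beta)$ in $L^{1}$, the bound passes to the limit and gives $p(\beta)\leq\beta^{2}/2$.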

Suppose that $p(\cdot)$ is differentiable at $\beta\geq 0$ . Using Gaussian integration by parts, it is not difficult to show (as we do in Corollary 3.10) that

$$\lim_{n\to\infty}\mathbb{E}\langle\mathcal{R}_{1,2}\rangle = 1 - \frac{p^{\prime}(\beta)}{\beta}. \tag{1.3}$$

This identity has been observed before (e.g. see [3, 27, 63, 47], [19, Lemma 7.1], and [24, Theorem 6.1]). For this reason, the condition in which we are interested is $p^{\prime}(\beta)<\beta$. To improve upon (1.3), a first step is to show that if $\mathbb{E}\langle\mathcal{R}_{1,2}\rangle$ is bounded away from $0$, then the random variable $\langle\mathcal{R}_{1,2}\rangle$ is itself stochastically bounded away from $0$. This is the content of Theorem 1.5. The more substantial contribution of this paper, however, is to bootstrap this result to a proof of Theorem 1.4, which roughly says that $\langle\mathcal{R}_{1,2}\rangle$ is stochastically bounded away from $0$ even conditional on $\sigma^{1}$.
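The Gaussian integration by parts behind the identity relating $\mathbb{E}\langle\mathcal{R}_{1,2}\rangle$ to $p^{\prime}(\beta)$ can be sketched at fixed $n$ as follows. This is a heuristic outline and not the paper's proof (which is Corollary 3.10): the exchange of derivative, sum, and expectation is taken for granted. It uses the shorthand $\varphi_{i}=\varphi_{i,n}(\sigma)$ and $g_{i}=g_{i,n}$, the relation $\sum_{i}\varphi_{i}(\sigma)^{2}=\mathrm{Var}\,H_{n}(\sigma)=n$, and $\sum_{i}\varphi_{i}(\sigma^{1})\varphi_{i}(\sigma^{2})=n\mathcal{R}_{1,2}$.

```latex
\begin{align*}
F_{n}'(\beta)
  &= \frac{1}{n}\,\langle H_{n}(\sigma)\rangle
   = \frac{1}{n}\sum_{i} g_{i}\,\langle \varphi_{i}\rangle, \\
\mathbb{E}\big[g_{i}\,\langle \varphi_{i}\rangle\big]
  &= \mathbb{E}\Big[\frac{\partial \langle \varphi_{i}\rangle}{\partial g_{i}}\Big]
   = \beta\,\mathbb{E}\big[\langle \varphi_{i}^{2}\rangle - \langle \varphi_{i}\rangle^{2}\big], \\
\mathbb{E} F_{n}'(\beta)
  &= \frac{\beta}{n}\,\mathbb{E}\Big[\Big\langle \sum_{i}\varphi_{i}(\sigma)^{2}\Big\rangle
     - \Big\langle \sum_{i}\varphi_{i}(\sigma^{1})\varphi_{i}(\sigma^{2})\Big\rangle\Big]
   = \beta\big(1 - \mathbb{E}\langle \mathcal{R}_{1,2}\rangle\big).
\end{align*}
```

Rearranging gives $\mathbb{E}\langle\mathcal{R}_{1,2}\rangle = 1-\mathbb{E}F_{n}'(\beta)/\beta$ for $\beta>0$. Convexity of $F_{n}$ in $\beta$ is what allows $\mathbb{E}F_{n}'(\beta)$ to be replaced by $p^{\prime}(\beta)$ in the limit at points where $p$ is differentiable.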

It follows from Corollary 3.10 that $p^{\prime}(\beta)<\beta$ implies $p(\beta)<\beta^{2}/2$ , but it is natural to ask whether the two conditions are equivalent. This equivalence is true for spin glasses [63, 47] and is believed to be true for directed polymers [24, Conjecture 6.1]. But at the level of generality considered in this paper, we are not aware of any conjecture. In any case, for the examples we consider in Section 1.5, both conditions will be true for sufficiently large $\beta$ .
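One possible route to the implication $p^{\prime}(\beta)<\beta \Rightarrow p(\beta)<\beta^{2}/2$ for $\beta>0$ is sketched below; it is not necessarily the argument the paper has in mind. It uses three facts: $p(0)=0$ because $Z_{n}(0)=1$; $p$ is convex as a limit of convex functions, so it is absolutely continuous on $[0,\beta]$ with a nondecreasing derivative defined almost everywhere; and $p^{\prime}(s)\leq s$ wherever the derivative exists at $s>0$, by the overlap identity combined with the lower bound $\mathcal{R}_{1,2}\geq-\mathscr{E}_{n}\to 0$.

```latex
\begin{align*}
p(\beta) = \int_{0}^{\beta} p'(s)\,\mathrm{d}s
  \leq \int_{0}^{\beta} \min\{s,\, p'(\beta)\}\,\mathrm{d}s
  < \int_{0}^{\beta} s\,\mathrm{d}s = \frac{\beta^{2}}{2}.
\end{align*}
```

The strict inequality holds because $\min\{s,p^{\prime}(\beta)\}<s$ on the interval $(\max\{p^{\prime}(\beta),0\},\beta)$, which has positive length when $p^{\prime}(\beta)<\beta$.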

1.4. Results

Our main result is Theorem 1.3, stated below. It says that at low temperatures, one can find a finite number of (random) states such that almost any sample from the Gibbs measure will have positive overlap with at least one of them. To state this precisely, let us define the sets

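$$\mathcal{B}(\sigma,\delta) \coloneqq \{\sigma^{\prime}\in\Sigma_{n} : \mathcal{R}(\sigma,\sigma^{\prime}) \geq \delta\}, \quad \sigma\in\Sigma_{n},\ \delta>0.$$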

In terms of the metric $\rho$ defined in (1.1), this is just the ball of radius $1-\delta$ centered at $\sigma$ . Typically, such balls have vanishingly small size under $P_{n}$ as $n\to\infty$ , which should be contrasted with the following behavior of the Gibbs measure.

Theorem 1.3.

Assume all four assumptions of Section 1.1. If $\beta\geq 0$ is a point of differentiability for $p(\cdot)$, and $p^{\prime}(\beta)<\beta$, then for every $\varepsilon>0$, there exist integers $k=k(\beta,\varepsilon)$ and $n_{0}=n_{0}(\beta,\varepsilon)$ and a number $\delta=\delta(\beta,\varepsilon)>0$ such that the following is true for all $n\geq n_{0}$. With $\mathbb{P}$-probability at least $1-\varepsilon$, there exist $\sigma^{1},\dots,\sigma^{k}\in\Sigma_{n}$ such that

$$\mu_{n}^{\beta}\Big(\bigcup_{j=1}^{k}\mathcal{B}(\sigma^{j},\delta)\Big) \geq 1-\varepsilon.$$

It is worth noting that in some cases, such as the directed polymer model defined in Section 1.5.2, it is possible (although unproven) that $k$ can be taken equal to $1$ if $\delta$ is chosen sufficiently small. For other models, however, such as polymers on trees or the Random Energy Model discussed in Section 1.6, $k$ will necessarily diverge as $\varepsilon\to 0$ .
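A numerical illustration of how $k$ behaves can be given for the Random Energy Model, where $\mathcal{R}_{j,k}=\delta_{j,k}$ and each set $\mathcal{B}(\sigma,\delta)$ is the single point $\{\sigma\}$, so covering Gibbs mass by $k$ such sets amounts to picking the $k$ heaviest states. The sketch below is illustrative only: the helper `states_needed`, the size $n=14$, and the two values of $\beta$ are assumptions, and a single small simulation proves nothing about the limit.

```python
import numpy as np

def states_needed(beta, energies, eps):
    """Smallest k such that the k heaviest states carry Gibbs mass >= 1 - eps."""
    log_w = beta * energies
    w = np.exp(log_w - log_w.max())          # shift to avoid overflow
    w /= w.sum()
    cum = np.cumsum(np.sort(w)[::-1])        # mass of the heaviest 1, 2, ... states
    k = int(np.searchsorted(cum, 1.0 - eps)) + 1
    return min(k, len(w))

n = 14
rng = np.random.default_rng(0)
energies = np.sqrt(n) * rng.standard_normal(2**n)   # i.i.d. N(0, n) energies

k_cold = states_needed(3.0, energies, eps=0.1)      # beta above sqrt(2 log 2)
k_hot = states_needed(0.5, energies, eps=0.1)       # beta below sqrt(2 log 2)
```

In this toy setting `k_cold` is far smaller than `k_hot`, in line with the contrast between the localized and delocalized regimes.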

We will derive Theorem 1.3 as a corollary of Theorem 1.4, stated below. In fact, Theorem 1.3 is actually equivalent to Theorem 1.4, although the latter has a less transparent statement, which is why we have stated Theorem 1.3 as our main result.

Theorem 1.4 concerns the following function on $\Sigma_{n}$ . For given $\sigma^{1}\in\Sigma_{n}$ , we will write the conditional expectation of $\mathcal{R}_{1,2}$ as

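$$\mathcal{R}(\sigma^{1}) \coloneqq \langle\mathcal{R}_{1,2} \mid \sigma^{1}\rangle = \frac{1}{n}\sum_{i=1}^{\infty}\varphi_{i,n}(\sigma^{1})\,\langle\varphi_{i,n}(\sigma^{2})\rangle.$$

(Note that the expectation $\langle\cdot \mid \sigma^{1}\rangle$ can be exchanged with the sum because of Fubini’s theorem, in light of the assumptions of Section 1.1.) Given $\delta>0$, we consider the set

$$\mathcal{A}_{n,\delta} \coloneqq \{\sigma\in\Sigma_{n} : \mathcal{R}(\sigma) \leq \delta\}.$$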

With this notation, the quantity $\langle\mathds{1}_{\mathcal{A}_{n,\delta}}\rangle$ is the probability that a state sampled from $\mu_{n}^{\beta}$ has expected overlap at most $\delta$ with an independent sample from $\mu_{n}^{\beta}$ . Theorem 1.4 says that at low temperatures and for small $\delta$ , this probability is typically small.
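For a finite state space, the conditional overlap $\mathcal{R}(\sigma)$ and the Gibbs mass of $\mathcal{A}_{n,\delta}$ are plain matrix-vector computations. The sketch below enumerates a small Ising system with a pure 2-spin Hamiltonian, for which $\mathcal{R}(\sigma,\sigma^{\prime})$ is the square of the normalized inner product; the sizes, the seed, and all variable names are assumptions chosen for illustration.

```python
import itertools
import numpy as np

n, beta, delta = 8, 2.0, 0.05
rng = np.random.default_rng(1)

# All 2**n Ising configurations, one per row.
sigma = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))

# Pure 2-spin Hamiltonian H(s) = n**-0.5 * sum_{i,j} g[i, j] * s_i * s_j.
g = rng.standard_normal((n, n))
h = np.einsum("si,ij,sj->s", sigma, g, sigma) / np.sqrt(n)

# Gibbs weights under the uniform reference measure.
w = np.exp(beta * (h - h.max()))
mu = w / w.sum()

# Covariance overlap: Cov(H(s), H(t)) / n = (s.t / n)**2.
overlap = (sigma @ sigma.T / n) ** 2

cond_overlap = overlap @ mu                  # R(s) = <R_{1,2} | sigma^1 = s>
mean_overlap = mu @ cond_overlap             # <R_{1,2}>
mass_A = mu[cond_overlap <= delta].sum()     # <1_{A_{n,delta}}>
```

Here `mass_A` is the quantity that Theorem 1.4 controls in expectation as $n\to\infty$; at $n=8$ it is only a toy value.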

Theorem 1.4.

Assume all four assumptions of Section 1.1. If $\beta\geq 0$ is a point of differentiability for $p(\cdot)$, and $p^{\prime}(\beta)<\beta$, then for every $\varepsilon>0$, there exists $\delta=\delta(\beta,\varepsilon)>0$ sufficiently small that

$$\limsup_{n\to\infty}\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{n,\delta}}\rangle \leq \varepsilon.$$

To prove Theorem 1.4, we first have to prove a weaker theorem stated below. This result considers the following event in the $\sigma$ -algebra $\mathcal{F}$ ,

$$B_{n,\delta} \coloneqq \{\langle\mathcal{R}_{1,2}\rangle \leq \delta\},$$

and shows that its probability is small at low temperature.

Theorem 1.5.

Assume all four assumptions of Section 1.1. If $\beta\geq 0$ is a point of differentiability for $p(\cdot)$, and $p^{\prime}(\beta)<\beta$, then for every $\varepsilon>0$, there exists $\delta=\delta(\beta,\varepsilon)>0$ sufficiently small such that

$$\limsup_{n\to\infty}\mathbb{P}(B_{n,\delta}) \leq \varepsilon.$$

Theorem 1.5 is proved in Section 4, Theorem 1.4 in Section 5, and the equivalence of Theorems 1.3 and 1.4 in Section 6. In Section 3, we provide some general facts that are needed in the main arguments. A detailed sketch of the proof technique is given in Section 2. We will often simplify notation by writing $\mathcal{A}_{\delta}$ and $B_{\delta}$ , where the dependence on $n$ is understood and will not be a source of confusion.

1.5. Applications

For many applications, it would suffice to consider $\Sigma_{n}$ which is finite for every $n$ . Other applications, however, such as spherical spin glasses or directed polymers with a reference walk of unbounded support, require $\Sigma_{n}$ to be infinite. It is for this reason that we have stated the setting and results in the generality seen above. Now we discuss specific models of interest.

1.5.1. Spin glasses

Let $\Sigma_{n}=\{\pm 1\}^{n}$ (Ising case) or $\Sigma_{n}=\{\sigma\in\mathbb{R}^{n}:\|\sigma\|_{2}=\sqrt{n}\}$ (spherical case), and take $P_{n}$ to be uniform measure on $\Sigma_{n}$ . In the mean-field models, the Hamiltonian is of the form

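$$H_{n}(\sigma) = \sum_{p\geq 2}\frac{\beta_{p}}{n^{(p-1)/2}}\sum_{i_{1},\dots,i_{p}=1}^{n} g_{i_{1},\dots,i_{p}}\,\sigma_{i_{1}}\cdots\sigma_{i_{p}}. \tag{1.9}$$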

We will assume

$$\sum_{p\geq 2}\beta_{p}^{2}(1+\varepsilon)^{p} < \infty \quad \text{for some } \varepsilon>0, \tag{1.10}$$

which is more restrictive than what we require but standard in the literature. Standard applications of Gaussian concentration show that $|F_{n}(\beta)-\mathbb{E}F_{n}(\beta)|\to 0$ almost surely and in $L^{1}$. The first assumption of Section 1.1 (convergence of the free energy) then follows from the convergence of $\mathbb{E}F_{n}(\beta)\to p(\beta)$, where $p(\beta)$ is given by a formula depending on the model. In the Ising case, there is the celebrated Parisi formula [53, 54], proved by Talagrand [62] for even-spin models, building on the seminal work of Guerra [40]. It was later extended by Panchenko [51] to general mixed $p$-spins. For the spherical model, there is a simpler and elegant formula predicted by Crisanti and Sommers [32], and proved by Talagrand [61] and Chen [22].

To accommodate the second and third assumptions of Section 1.1 (the variance normalization and the covariance lower bound), one should assume the function $\xi(q)\coloneqq\sum_{p\geq 2}\beta_{p}^{2}q^{p}$ satisfies

$$\xi(1) = 1 \quad \text{and} \quad \xi(q) \geq 0 \ \text{ for all } q\in[-1,1]. \tag{1.11}$$

This is because

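$$\mathcal{R}_{j,k} = \xi(R_{j,k}), \quad \text{where} \quad R_{j,k} \coloneqq \frac{1}{n}\sum_{i=1}^{n}\sigma_{i}^{j}\sigma_{i}^{k} \in [-1,1].$$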

Note that the second assumption in (1.11) is automatic if $\beta_{p}=0$ for all odd $p$ . When $\xi(q)=q^{2}$ , (1.9) is the classical Sherrington–Kirkpatrick (SK) model [57] if $\Sigma_{n}=\{\pm 1\}^{n}$ , or the spherical SK model [44] if $\Sigma_{n}=\{\sigma\in\mathbb{R}^{n}:\|\sigma\|_{2}=\sqrt{n}\}$ .
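The relation between the covariance overlap and the spin overlap can be checked exactly in the SK case $\xi(q)=q^{2}$, because the Hamiltonian is linear in the disorder. In the sketch below, the vector returned by the illustrative helper `coefficients` plays the role of $(\varphi_{i,n}(\sigma))_{i}$ in the series representation of $H_{n}$; the helper name, the size $n=12$, and the seed are assumptions.

```python
import numpy as np

def coefficients(s):
    """Coefficients of H(s) = n**-0.5 * sum_{i,j} g[i, j] * s_i * s_j in the g[i, j]."""
    n = len(s)
    return np.outer(s, s).ravel() / np.sqrt(n)

n = 12
rng = np.random.default_rng(2)
s1 = rng.choice([-1.0, 1.0], size=n)
s2 = rng.choice([-1.0, 1.0], size=n)

phi1, phi2 = coefficients(s1), coefficients(s2)
variance = phi1 @ phi1          # Var H_n(s1), since the g[i, j] are i.i.d. N(0, 1)
covariance = phi1 @ phi2        # Cov(H_n(s1), H_n(s2))
spin_overlap = s1 @ s2 / n      # R_{1,2}
```

The two inner products equal $n$ and $n\,\xi(R_{1,2})=nR_{1,2}^{2}$ respectively, which is the variance normalization and the covariance identity in this special case.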

In the spin glass literature, $R_{1,2}$ is the usual replica overlap that is studied as an order parameter for the system [59]. Roughly speaking, $R_{1,2}$ converges to $0$ when $p(\beta)=\beta^{2}/2$, but converges in law to a non-trivial distribution when $p(\beta)<\beta^{2}/2$. In the latter case, the model exhibits what is known as replica symmetry breaking (RSB). If the limiting distribution of $R_{1,2}$, called the Parisi measure, contains $k+1$ distinct atoms (one of which must be $0$ [5]), then $\xi$ is said to be $k$RSB. For instance, spherical pure $p$-spin models are $1$RSB for large $\beta$ [52], and it was recently shown that some spherical mixed spin models are $2$RSB at zero temperature [9]. In the Ising case, however, the Parisi measure is expected to have an infinite support throughout the low-temperature phase (with $0$ in the support but not as an atom; see [17, Page 15]), a behavior referred to as full-RSB (FRSB). Proving such a statement is a problem of great interest and has been solved at zero temperature [7]. For spherical models, the situation is somewhat clearer; in [23], sufficient conditions were given for both $1$RSB and FRSB, again at zero temperature.

The simplest type of symmetry breaking, $1$RSB, admits the following heuristic picture. The state space $\Sigma_{n}$ is (from the perspective of $\mu_{n}^{\beta}$) separated into many orthogonal parts called “pure states”, within which the intra-cluster overlap concentrates on some positive value $q>0$. In the $2$RSB picture, the pure states are not necessarily orthogonal, but rather grouped together into larger clusters which are themselves orthogonal. In this case, the overlap could be $q$ (same pure state), $q^{\prime}\in(0,q)$ (same cluster but different pure state), or $0$ (different clusters). The complexity increases in the same fashion for general $k$RSB. In FRSB, the clusters become infinitely nested, yielding a continuous spectrum of possible overlaps while maintaining “ultrametric” structure [49]. In any case, though, there should be asymptotically no part of the state space which is orthogonal to everything; that is, the pure states exhaust $\mu_{n}^{\beta}$.

Absent the intricate hierarchical picture described above, the following rephrasing of Theorem 1.3 confirms this idea.

Theorem 1.6.

Assume (1.10) and (1.11), and that $\beta\geq 0$ is a point of differentiability for $p(\cdot)$ such that $p^{\prime}(\beta)<\beta$ . Then for every $\varepsilon>0$ , there exist integers $k=k(\beta,\varepsilon)$ and $n_{0}=n_{0}(\beta,\varepsilon)$ and a number $\delta=\delta(\beta,\varepsilon)>0$ such that the following is true for all $n\geq n_{0}$ . With $\mathbb{P}$ -probability at least $1-\varepsilon$ , there exist $\sigma^{1},\dots,\sigma^{k}\in\Sigma_{n}$ such that

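$$\mu_{n}^{\beta}\Big(\bigcup_{j=1}^{k}\{\sigma^{k+1}\in\Sigma_{n} : |R_{j,k+1}| \geq \delta\}\Big) \geq 1-\varepsilon.$$

The proof of the above theorem follows simply from Theorem 1.3 and the observation that by (1.10), $\xi$ is continuous at $0$.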
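The continuity step can be spelled out as follows; this is a sketch, and the notation $R(\sigma^{j},\sigma)\coloneqq\frac{1}{n}\sum_{i}\sigma^{j}_{i}\sigma_{i}$ for the spin overlap and the symbol $\delta^{\prime}$ are introduced here only for illustration. Since $\xi(0)=0$ and $\xi$ is continuous at $0$, for the $\delta>0$ supplied by Theorem 1.3 there is a $\delta^{\prime}>0$ such that $|q|<\delta^{\prime}$ implies $\xi(q)<\delta$. Consequently,

```latex
\begin{align*}
\mathcal{B}(\sigma^{j},\delta)
  = \{\sigma\in\Sigma_{n} : \xi(R(\sigma^{j},\sigma)) \geq \delta\}
  \subseteq \{\sigma\in\Sigma_{n} : |R(\sigma^{j},\sigma)| \geq \delta'\},
\end{align*}
```

so the union of the larger sets carries at least as much Gibbs mass as the union of the sets $\mathcal{B}(\sigma^{j},\delta)$, and the theorem holds with $\delta^{\prime}$ in place of $\delta$.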

Under strong assumptions on $\xi$ and the overlap distribution, namely the (extended) Ghirlanda–Guerra identities, much more precise results were proved by Talagrand [64, Theorem 2.4] and later Jagannath [42, Corollary 2.8]. For spherical pure spin models, similar results were proved by Subag [58, Theorem 1]. An advantage of our approach, beyond its generality, is that our assumptions on $\xi$ are elementary to check and fairly loose (they include all even spin models), and the temperature condition $p^{\prime}(\beta)<\beta$ is explicit and sharp.

While the literature on replica overlaps in spin glasses is vast, the reader will find much information in [45, 65, 66, 50]; see also [43] and references therein.

1.5.2. Directed polymers

Given a positive integer $d$ , let $\Sigma_{n}$ be the set of all maps from $\{0,1,\ldots,n\}$ into $\mathbb{Z}^{d}$ , and let $P_{n}$ be the law, projected onto $\Sigma_{n}$ , of a homogeneous random walk on $\mathbb{Z}^{d}$ starting at the origin. That is, there is some probability mass function $K$ on $\mathbb{Z}^{d}$ such that

$$P_{n}(\sigma(0)=0) = 1, \qquad P_{n}(\sigma(i)=y \mid \sigma(i-1)=x) = K(y-x). \tag{1.12}$$

Let $(g(i,x):i\geq 1,x\in\mathbb{Z}^{d})$ be i.i.d. standard normal random variables. The Hamiltonian for the model of directed polymers in Gaussian environment is then given by

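$$H_{n}(\sigma) = \sum_{i=1}^{n} g(i,\sigma(i)) = \sum_{i=1}^{n}\sum_{x\in\mathbb{Z}^{d}} g(i,x)\,\mathds{1}_{\{\sigma(i)=x\}}.$$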

In this case, the overlap between two paths is the fraction of time they intersect:

$$\mathcal{R}_{1,2} = \frac{1}{n}\sum_{i=1}^{n}\mathds{1}_{\{\sigma^{1}(i)=\sigma^{2}(i)\}}.$$

The first assumption of Section 1.1 (convergence of the free energy) holds for any $K$ [14, Section 2], although typically $P_{n}$ is taken to be standard simple random walk; all the references below refer to this case. Alternatively, one can consider point-to-point polymer measures, meaning the endpoint of the polymer is fixed. This case is studied in [55, 39] and accommodates the same structure as above, up to changing the reference measure $P_{n}$.
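For simple random walk in $d=1$, the polymer partition function can be computed by a standard forward recursion over time, and the path overlap is a direct count. The sketch below is an illustration under assumptions: the function names, the array layout `g[i-1, x+n]` for the environment at time $i$ and site $x$, and the restriction to $d=1$ are choices made here, not taken from the paper.

```python
import numpy as np

def path_overlap(path1, path2):
    """Fraction of times i = 1..n at which two paths (sigma(0), ..., sigma(n)) coincide."""
    path1, path2 = np.asarray(path1), np.asarray(path2)
    return float(np.mean(path1[1:] == path2[1:]))

def free_energy_and_endpoint(beta, g):
    """Return (F_n(beta), law of sigma(n) under the Gibbs measure) for SRW in d = 1.

    g has shape (n, 2n + 1); g[i - 1, x + n] is the environment at time i, site x.
    """
    n = g.shape[0]
    z = np.zeros(2 * n + 1)
    z[n] = 1.0                               # the walk starts at the origin
    log_z = 0.0
    for i in range(n):
        stepped = np.zeros_like(z)
        stepped[1:] += 0.5 * z[:-1]          # step to the right
        stepped[:-1] += 0.5 * z[1:]          # step to the left
        z = stepped * np.exp(beta * g[i])    # collect the weight at time i + 1
        total = z.sum()
        log_z += np.log(total)               # renormalize to avoid overflow
        z /= total
    return log_z / n, z

n = 20
rng = np.random.default_rng(3)
g = rng.standard_normal((n, 2 * n + 1))
free_energy, endpoint_law = free_energy_and_endpoint(1.0, g)
```

The second return value is the endpoint distribution of the polymer, which is the statistic used in the endpoint localization results discussed later in this section.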

Notice that the identity (1.3) immediately implies $\lim_{n\to\infty}\mathbb{E}\langle\mathcal{R}_{1,2}\rangle>0$ when $p^{\prime}(\beta)<\beta$ . Theorem 1.5 goes a step further, showing that the random variable $\langle\mathcal{R}_{1,2}\rangle$ is itself stochastically bounded away from 0. For a certain class of bounded random environments, a quantitative version of Theorem 1.5 was proved by Chatterjee [21], but Theorem 1.4 is the first of its kind. Unlike some other conjectured polymer properties, the statement (1.7) has not been verified for the so-called exactly solvable models in $d=1$ [56, 31, 46, 13, 67]. For heavy-tailed environments, a stronger notion of localization is considered in [8, 68] and also discussed in [37, 16]. Historically, studying pathwise localization has found somewhat greater success in the context of continuous space-time polymer models [29, 30, 26, 25].

For polymers in Gaussian environment, it is known (see [24, Proposition 2.1(iii)]) that $p^{\prime}$ is bounded from above by a constant, and so $\mathbb{E}\langle\mathcal{R}_{1,2}\rangle\to 1$ as $\beta\to\infty$ by (1.3). (While convexity guarantees $p(\cdot)$ is differentiable almost everywhere, it is an open problem to show that $p(\cdot)$ is everywhere differentiable, let alone analytic away from the critical value separating the high and low temperature phases.) In this sense, the polymer measure becomes completely localized near the maximizer of $H_{n}(\cdot)$ as $\beta\to\infty$ . A main motivation for the present study was to formulate a version of “complete localization” for fixed $\beta$ in the low-temperature regime.

In [69, 15], complete localization was phrased in terms of the endpoint distribution: the law of $\sigma(n)$ under $\mu^{\beta}_{n}$ . Loosely speaking, what was shown is that if $p(\beta)<\beta^{2}/2$ , then with probability at least $1-\varepsilon$ , one can find sufficiently many (independent of $n$ ) random vertices $x_{1},\dots,x_{k}$ in $\mathbb{Z}^{d}$ so that

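$$\mu^{\beta}_{n}\big(\big\{\sigma : \sigma(n)\in\{x_{1},\dots,x_{k}\}\big\}\big) \geq 1-\varepsilon. \tag{1.14}$$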

This behavior is called “asymptotic pure atomicity”, referring to the fact that even as $n$ grows large, the endpoint distribution remains concentrated on an $O(1)$ number of sites (rather than diffuse polynomially as in simple random walk). This is analogous to the results of this paper, except that the endpoint statistic has been used to reduce the state space to $\mathbb{Z}^{d}$ . The pathwise localization in Theorem 1.3 describes a more global phenomenon occurring in the original state space $\Sigma_{n}$ . Rephrased below, it says that up to arbitrarily small probabilities, the Gibbs measure is concentrated on paths intersecting one of a few distinguished paths a positive fraction of the time.

Theorem 1.7.

Assume (1.12) and that $\beta\geq 0$ is a point of differentiability for $p(\cdot)$ such that $p^{\prime}(\beta)<\beta$ . Then for every $\varepsilon>0$ , there exist integers $k=k(\beta,\varepsilon)$ and $n_{0}=n_{0}(\beta,\varepsilon)$ and a number $\delta=\delta(\beta,\varepsilon)>0$ such that the following is true for all $n\geq n_{0}$ . With $\mathbb{P}$ -probability at least $1-\varepsilon$ , there exist paths $\sigma^{1},\dots,\sigma^{k}\in\Sigma_{n}$ such that

$$\mu_{n}^{\beta}\bigg(\bigcup_{j=1}^{k}\Big\{\sigma^{k+1}\in\Sigma_{n} : \frac{1}{n}\sum_{i=1}^{n}\mathds{1}_{\{\sigma^{k+1}(i)=\sigma^{j}(i)\}} \geq \delta\Big\}\bigg) \geq 1-\varepsilon.$$

In Section 7, we demonstrate that path localization does not occur in the atomic sense (1.14). That is, any bounded number of paths will have a total mass under $\mu^{\beta}_{n}$ that decays to $0$ as $n\to\infty$. For this reason, the definitions from [69, 15] of complete localization for the endpoint are inadequate for path localization, necessitating a statement in terms of overlap. This distinguishes the lattice polymer model from its mean-field counterpart on regular trees, which is simply the statistical mechanical version of branching random walk [36, 24]. For those models, the endpoint distribution on the leaves of the tree is obviously equivalent to the Gibbs measure because each leaf is the termination point of a unique path. Moreover, the results of [15] can be interpreted equally well (and improved upon) in that setting (see [12, 41]), and so we will not elaborate on the fact that polymers on trees also fit into the framework of this paper.

1.6. Other Gaussian fields

Here we mention several other models to which our results apply but for which they are not new. Indeed, each model below is known to exhibit Poisson–Dirichlet statistics for the masses assigned by $\mu^{\beta}_{n}$ to the “peaks” discussed in the motivating Section 1.3. In particular, asymptotically no mass is given to states having vanishing expected overlap with an independent sample.

•

Derrida’s Random Energy Model (REM) [33, 34] is set on the hypercube $\Sigma_{n}=\{\pm 1\}^{n}$ with uniform measure, and has the simplest possible covariance structure: $\mathcal{R}_{j,k}=\delta_{j,k}$ . With $\beta_{\mathrm{c}}=\sqrt{2\log 2}$ , the following formula holds [18, Theorem 9.1.2]:

[TABLE]

See also [60, Chapter 1], in particular Theorem 1.2.1.

•

The generalized random energy models have non-trivial covariance structure [35], and can be tuned to have an arbitrary number of phase transitions. The condition $p^{\prime}(\beta)<\beta$ is satisfied as soon as the first phase transition occurs. See also [18, Chapter 10].

•

Finally, in [4] Arguin and Zindy studied a discretization of a log-correlated Gaussian field from [11, 10] which has the same free energy as the REM. Their particular model had the technical complication of correlations not following a tree structure, unlike for instance the discrete Gaussian free field.
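To make the REM item above concrete, the following is a minimal Python sketch, not taken from the paper. It assumes the standard closed form of the REM free energy under the normalization used here (uniform reference measure, $\operatorname{Var}H_{n}(\sigma)=n$): $p(\beta)=\beta^{2}/2$ for $\beta\leq\beta_{\mathrm{c}}$ and $p(\beta)=\beta\beta_{\mathrm{c}}-\log 2$ for $\beta>\beta_{\mathrm{c}}$. The function names are hypothetical, and the finite-$n$ sampler is only meant as a sanity check at high temperature, where finite-size corrections are small.

```python
import math
import random

# Critical inverse temperature of the REM, beta_c = sqrt(2 log 2).
BETA_C = math.sqrt(2.0 * math.log(2.0))


def rem_limit_free_energy(beta: float) -> float:
    """Assumed closed form of p(beta) for beta >= 0 (uniform reference measure)."""
    if beta <= BETA_C:
        return beta * beta / 2.0
    return beta * BETA_C - math.log(2.0)


def rem_limit_derivative(beta: float) -> float:
    """Derivative p'(beta): equal to beta up to beta_c, frozen at beta_c afterwards."""
    return beta if beta <= BETA_C else BETA_C


def rem_free_energy_sample(n: int, beta: float, rng: random.Random) -> float:
    """F_n(beta) = (1/n) log Z_n(beta) for one sampled environment with 2^n states."""
    scale = math.sqrt(n)  # each energy has variance n
    energies = [scale * rng.gauss(0.0, 1.0) for _ in range(2 ** n)]
    top = max(energies)
    # Log-sum-exp for stability; subtracting n*log(2) accounts for the
    # uniform reference measure P_n on the 2^n states.
    log_sum = math.log(sum(math.exp(beta * (e - top)) for e in energies))
    return (beta * top + log_sum - n * math.log(2.0)) / n
```

In this sketch, `rem_limit_derivative(beta) < beta` holds exactly when `beta > BETA_C`, which is the low-temperature condition $p^{\prime}(\beta)<\beta$ used throughout.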

1.7. Open problems

There are a number of open questions which, if solved, would enhance the theory presented in this paper. A partial list is the following.

(1)

Understand conditions under which the number of localizing regions is exactly one. As mentioned before, this requires more conditions than (• ‣ 1.1)–(• ‣ 1.1), because it does not hold for some models (such as REM), whereas it is supposed to hold for many others.

(2)

A close cousin of the above problem is to understand conditions under which $\mathcal{R}_{1,2}$ is itself guaranteed to be away from zero with high probability. This would have important implications about the FRSB picture in mean-field spin glasses and path localization in directed polymers.

(3)

Obtain a good quantitative bound on $\delta$ in terms of $\varepsilon$ in Theorem 1.4. Our proof gives a very poor bound, since it is based on an iterative argument similar to those used in extremal combinatorics (see the proof sketch in Section 2.2).

(4)

For directed polymers, prove a stronger theorem about path localization that says a typical path localizes within a narrow neighborhood of one or more fixed paths, rather than saying that a typical path has nonzero intersection with one or more fixed paths.

(5)

Prove more general versions of Theorems 1.3, 1.4 and 1.5 that do not require the condition (• ‣ 1.1) guaranteeing asymptotically nonnegative correlations. This would allow the theory to include other models of interest, such as the Edwards–Anderson model [38] of lattice spin glasses. It is important to note, however, that the hypotheses and conclusions of these more general theorems may require adjustment in order to be physically meaningful.

(6)

For any finite $\beta$ , prove estimates that stochastically bound $\langle\mathcal{R}_{1,2}\rangle$ away from $1$ . More ambitiously, determine conditions which guarantee that $\langle\mathcal{R}_{1,2}\rangle$ concentrates around its expectation as $n\to\infty$ .

(7)

Even when the spin glass correlation function $\xi$ takes negative values (recall that $\xi(R_{1,2})=\mathcal{R}_{1,2}$ ), it is possible for the Gibbs measure to concentrate on a set such that $R_{1,2}\geq 0$ . This is Talagrand’s positivity principle and is known to hold when the extended Ghirlanda–Guerra identities are satisfied; see [66, Section 12.3] or [50, Section 3.3]. Perhaps the methods of this paper can be adapted to use this input rather than the condition $\xi\geq 0$ .

2. Proof sketches

The proofs of Theorems 1.4 and 1.5 are long, but they contain ideas that may be useful for other problems. Therefore, we have included this proof-sketch section which, while still rather lengthy, distills the arguments to their central ideas. It introduces some of the notations that will be used later in the manuscript; however, these notations will be reintroduced in the later sections, so it is safe to skip directly to Section 3 should the reader decide to do so.

2.1. Proof sketch of Theorem 1.5

For simplicity, let us assume that the representation (• ‣ 1.1) consists of only finitely many terms:

[TABLE]

Following the argument described below, the general case is handled by some routine calculations (made in Section 3.1) to check that sending $N\to\infty$ poses no issues.

Given (1.3), it is clear that $p^{\prime}(\beta)<\beta$ would imply (1.8) if we knew that $\langle\mathcal{R}_{1,2}\rangle$ concentrates around its mean as $n\to\infty$ . Unfortunately, this may not be true in general. Therefore, as a way of artificially imposing concentration, we let the environment evolve as an Ornstein–Uhlenbeck (OU) flow, and then eventually take an average over a short time interval. Formally, this means we consider

[TABLE]

where ${\boldsymbol{W}}(\cdot)=(W_{i}(\cdot))_{i=1}^{N}$ are independent Brownian motions that are also independent of ${\boldsymbol{g}}={\boldsymbol{g}}_{0}$ . Recall the OU generator $\mathcal{L}\coloneqq\Delta-{\boldsymbol{x}}\cdot\nabla$ , and the fact that $\mathbb{E}\mathcal{L}f({\boldsymbol{g}})=0$ for any $f$ with suitable regularity. By expanding $f$ in an orthonormal basis of eigenfunctions of $\mathcal{L}$ , and expressing both $\mathcal{L}f({\boldsymbol{g}}_{t})$ and $\mathbb{E}\|\nabla f({\boldsymbol{g}})\|^{2}$ using the coefficients from this expansion, one can show that

[TABLE]

This inequality, established in Lemma 4.3, provides the proof’s essential estimate when applied to $f({\boldsymbol{g}})=F_{n}(\beta)$ . For this $f$ , it is easy to verify that $\mathbb{E}\|\nabla f({\boldsymbol{g}})\|^{2}=O(1/n)$ , and

[TABLE]

where $\langle\mathcal{R}_{1,2}\rangle_{t}$ and $F_{n,t}(\beta)$ are the expected overlap and free energy, respectively, in the environment ${\boldsymbol{g}}_{t}$ . Moreover, from standard methods (worked out in Section 3.2), it follows that $\frac{\partial}{\partial\beta}F_{n,t}(\beta)\approx p^{\prime}(\beta)$ with high probability. Combining these observations about $f$ with the general variance estimate (2.2), we arrive at

[TABLE]

In other words, averaging $\langle\mathcal{R}_{1,2}\rangle_{t}$ over a long enough interval, but whose size is still $O(1/n)$ , results in a value close to the expectation suggested by (1.3). We choose $T=T(\varepsilon)$ large enough depending on $\varepsilon$ , which determines the level of precision required in (2.3).

Next comes the most crucial step in the proof, where we show that if $\langle\mathcal{R}_{1,2}\rangle=\langle\mathcal{R}_{1,2}\rangle_{0}\leq\delta$ for some small $\delta$ , then for each $t\in[0,T(\varepsilon)/n]$ , the quantity $\langle\mathcal{R}_{1,2}\rangle_{t}$ is also small with high probability. If $p^{\prime}(\beta)<\beta$ , this leads to a contradiction to (2.3) if $\delta$ is small enough. To avoid this contradiction, the probability of $\langle\mathcal{R}_{1,2}\rangle\leq\delta$ happening in the first place must be small, which is what we want to show.

To demonstrate our crucial claim, we consider any $t=T/n$ , where $T\leq T(\varepsilon)$ and $n$ is large. First, note that

[TABLE]

where $B_{t}$ comes from the Brownian part of (2.1), and $A_{t}$ comes from the initial environment:

[TABLE]

Since $t=T/n\ll 1$ , we have

[TABLE]

By standard arguments (again presented in Section 3.2), $H_{n}(\sigma^{1})/n$ and $H_{n}(\sigma^{2})/n$ are both close to $p^{\prime}(\beta)$ with high probability under the Gibbs measure. Thus, for fixed $t$ , the random variable $A_{t}$ behaves like a constant inside $\langle\cdot\rangle$ . Consequently, we can reduce (2.4) to

[TABLE]

Now let $h_{i}:=W_{i}(\operatorname{e}^{2t}-1)/\sqrt{\operatorname{e}^{2t}-1}$ , so that $h_{i}\sim\mathcal{N}(0,1)$ . Again since $t=T/n\ll 1$ , we have

[TABLE]

Thus, if $\mathbb{E}_{{\boldsymbol{h}}}$ denotes expectation in ${\boldsymbol{h}}=(h_{1},\ldots,h_{N})$ only, then

[TABLE]

In the event that $\langle\mathcal{R}_{1,2}\rangle$ is small, the assumption (• ‣ 1.1) implies that $\mathcal{R}_{1,2}\approx 0$ with high probability under the Gibbs measure. Therefore, conditional on this event (which depends only on ${\boldsymbol{g}}$ , not ${\boldsymbol{h}}$ ), we have

[TABLE]

By a similar argument, we also have

[TABLE]

In summary, if $\langle\mathcal{R}_{1,2}\rangle\approx 0$ , then

[TABLE]

and thus, with high probability,

[TABLE]

By following exactly the same steps with $\langle\mathcal{R}_{1,2}\operatorname{e}^{\beta B_{t}}\rangle$ instead of $\langle\operatorname{e}^{\beta B_{t}}\rangle$ , we show that

[TABLE]

Combining (2.5)–(2.7), we conclude that if $\langle\mathcal{R}_{1,2}\rangle\approx 0$ , then $\langle\mathcal{R}_{1,2}\rangle_{t}\approx\langle\mathcal{R}_{1,2}\rangle\approx 0$ .

2.2. Proof sketch of Theorem 1.4

We begin this proof sketch where the previous section left off, namely with the observation that if the average overlap $\langle\mathcal{R}_{1,2}\rangle$ in environment ${\boldsymbol{g}}$ is small, then Gibbs averages of the type in (2.6) and (2.7) are well concentrated. By the same type of argument (see Lemma 4.5(b) and (5.11)), we can say something more general: no matter the size of $\langle\mathcal{R}_{1,2}\rangle$ , these averages remain concentrated so long as they are restricted to the set $\mathcal{A}_{n,\delta}$ defined in (1.6), where the conditional average overlap $\langle\mathcal{R}_{1,2}\>|\>\sigma^{1}\rangle$ is small. That is, if $\widetilde{H}_{n}$ is an independent Hamiltonian (i.e. defined with ${\boldsymbol{h}}$ , an independent copy of ${\boldsymbol{g}}$ ), then with high probability,

[TABLE]

In fact, the opposite is true off of the set $\mathcal{A}_{n,\delta}$ . If $\langle\mathcal{R}_{1,2}\rangle$ is not too small relative to $\delta$ , then the fluctuations of $\langle\mathds{1}_{\mathcal{A}_{n,\delta}^{\mathrm{c}}}\operatorname{e}^{\frac{\beta}{\sqrt{n}}\widetilde{H}_{n}(\sigma)}\rangle$ due to ${\boldsymbol{h}}$ are $\Omega(1)$ as $n\to\infty$ . This is again an elementary calculation; see (5.8)–(5.12).

On the other hand, a convenient consequence of Gaussianity is that $H_{n}+\frac{1}{\sqrt{n}}\widetilde{H}_{n}\stackrel{{\scriptstyle\mathrm{d}}}{{=}}\sqrt{1+\frac{1}{n}}H_{n}$ . That is, an environment perturbation is equivalent in distribution to a temperature perturbation. (In fact, this simple observation underlies the Aizenman–Contucci identities [2], the predecessor of the Ghirlanda–Guerra identities.) Therefore, if we keep track of the dependence on $\beta$ by writing $\langle\cdot\rangle_{\beta}$ , and abbreviate $\mathcal{A}_{n,\delta}$ to $\mathcal{A}_{\delta}$ , we have

[TABLE]

By rewriting the denominator in a trivial way and using our observation (2.8), we see that with high probability,

[TABLE]

In the last expression above, the only term depending on ${\boldsymbol{h}}$ is the second summand in the denominator. Therefore, Jensen’s inequality gives

[TABLE]

A more careful analysis shows that the Jensen gap is large enough that we can replace the lower bound by $(1+\gamma)\langle\mathds{1}_{\mathcal{A}_{\delta}}\rangle_{\beta}-C\sqrt{\delta}$ , where $\gamma$ and $C$ are positive constants. One important caveat is that this stronger lower bound is valid only when $\langle\mathcal{R}_{1,2}\rangle$ is not too small (so that the fluctuations of $\langle\mathds{1}_{\mathcal{A}^{\mathrm{c}}_{\delta}}\operatorname{e}^{\frac{\beta}{\sqrt{n}}\widetilde{H}_{n}(\sigma)}\rangle_{\beta}$ are order $1$ ), which is why Theorem 1.5 is needed beforehand. Reading (2.9)–(2.11) from start to end, we obtain

[TABLE]

While the above inequality is the most important step of the proof, a key shortcoming is that the set $\mathcal{A}_{\delta}$ is defined using $\langle\cdot\rangle_{\beta}$ rather than $\langle\cdot\rangle_{\beta\sqrt{1+\frac{1}{n}}}$ . Since we will want to apply the inequality iteratively, we need to replace $\mathcal{A}_{\delta}$ on the left-hand side by $\mathcal{A}_{\delta,1}$ , where

[TABLE]

To make this replacement, we produce a complementary inequality, again using the equivalence of environment/temperature perturbations. For simplicity, let us assume $\mathcal{R}_{1,2}\geq 0$ , which is essentially realized by (• ‣ 1.1) for large $n$ . Observe that

[TABLE]

where we have applied Cauchy–Schwarz (and then $\mathcal{R}_{1,2}^{2}\leq\mathcal{R}_{1,2}\leq 1$ ) and Jensen’s inequality (using the convexity of $x\mapsto x^{-1}$ ). When $\sigma^{1}\in\mathcal{A}_{\delta}=\mathcal{A}_{\delta,0}$ , the final expression is at most $X\sqrt{\delta}$ , and so the inequality implies $\mathcal{A}_{\delta,0}\subset\mathcal{A}_{X\sqrt{\delta},1}$ . Now, the random variable $X$ has moments of all orders (admitting simple upper bounds), and so it can be essentially regarded as a large constant. In particular, when $\delta$ is small, we will have $X\leq\delta^{-1/4}$ with high probability, in which case $\mathcal{A}_{\delta,0}\subset\mathcal{A}_{\delta^{1/4},1}$ . Combining these ideas with (2.12), we show

[TABLE]

More generally, for any integer $k\geq 1$ ,

[TABLE]

This inequality can now be iterated, with $\delta$ being replaced by $\delta^{1/4}$ , then $\delta^{1/16}$ , and so on, as the expectation on the left is inserted on the right in the next iteration.

Since the left-hand side of (2.13) is always at most $1$ , we clearly obtain a contradiction if $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{\delta,0}}\rangle_{\beta}$ is larger than $x$ , where $x$ is the solution to $x=(1+\gamma)x-C\sqrt{\delta}$ . This would complete the proof of Theorem 1.4 if not for the subtlety that $\gamma$ actually depends on $k$ in a non-trivial way. Nevertheless, (2.13) can still be used to derive a contradiction of the same spirit unless $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{\delta^{1/4^{k}},k}}\rangle$ is small for some $k\leq K$ , where $K$ is large and tends to infinity as $\varepsilon\to 0$ , but crucially does not depend on $n$ . This approach is reminiscent of tower-type arguments in extremal combinatorics.
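The contradiction described in the preceding paragraph can be illustrated with a short Python sketch. It is a deliberately simplified caricature: it holds $\gamma$, $C$ and $\delta$ fixed across iterations, whereas in the actual argument $\gamma$ depends on $k$ and $\delta$ changes at each step. The function names and the numerical values are hypothetical.

```python
import math


def fixed_point(gamma: float, c: float, delta: float) -> float:
    """Solution x of x = (1 + gamma) * x - c * sqrt(delta)."""
    return c * math.sqrt(delta) / gamma


def iterate_bound(a0: float, gamma: float, c: float, delta: float,
                  max_steps: int = 1000) -> list:
    """Iterate a -> (1 + gamma) * a - c * sqrt(delta), stopping once a exceeds 1.

    A quantity that is bounded by 1 (such as an expected Gibbs mass) cannot
    satisfy this recursion for long if it starts above the fixed point,
    because the excess over the fixed point grows by a factor (1 + gamma)
    at every step.
    """
    values = [a0]
    while values[-1] <= 1.0 and len(values) <= max_steps:
        values.append((1.0 + gamma) * values[-1] - c * math.sqrt(delta))
    return values
```

For instance, with `gamma=0.5`, `c=1.0`, `delta=1e-4` the fixed point is about `0.02`, and a starting value of `0.2` is pushed above `1` after a handful of iterations, which is the contradiction in this simplified setting.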

Replacing $\delta$ by $\delta^{4^{k}}$ , we can then say $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{\delta,k}}\rangle$ is small. Finally, to deduce the smallness of $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{\delta,0}}\rangle$ from the smallness of $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{\delta,k}}\rangle$ , we make use of standard arguments showing that if an event is rare at inverse temperature $\beta$ , then it remains rare at inverse temperature $\beta+O(1/n)$ .
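The equivalence between environment and temperature perturbations used in this sketch, $H_{n}+\frac{1}{\sqrt{n}}\widetilde{H}_{n}\stackrel{\mathrm{d}}{=}\sqrt{1+\frac{1}{n}}H_{n}$, can be checked numerically on a toy model. The Python sketch below assumes a finite family of states with feature vectors $\varphi(\sigma)$ satisfying $\sum_{i}\varphi_{i}(\sigma)^{2}=n$; the specific vectors and function names are hypothetical. Since both sides are centered Gaussian fields, it suffices to compare second moments.

```python
import math
import random


def perturbed_energies(phi, n, rng):
    """Sample (H_n + H~_n / sqrt(n))(sigma) for every state sigma.

    phi[s][i] plays the role of phi_i(sigma_s); g and h are independent
    standard Gaussian vectors (environment and its independent copy).
    """
    dim = len(phi[0])
    g = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    h = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    coeff = [g[i] + h[i] / math.sqrt(n) for i in range(dim)]
    return [sum(coeff[i] * row[i] for i in range(dim)) for row in phi]


def empirical_second_moments(phi, n, samples, rng):
    """Monte Carlo estimates of Var at state 0 and Cov between states 0 and 1."""
    s00 = 0.0
    s01 = 0.0
    for _ in range(samples):
        e = perturbed_energies(phi, n, rng)
        s00 += e[0] * e[0]  # the field is centered, so this estimates the variance
        s01 += e[0] * e[1]
    return s00 / samples, s01 / samples


def target_second_moments(phi, n):
    """Second moments of sqrt(1 + 1/n) * H_n at the same pair of states."""
    factor = 1.0 + 1.0 / n
    var0 = factor * sum(x * x for x in phi[0])
    cov01 = factor * sum(x * y for x, y in zip(phi[0], phi[1]))
    return var0, cov01
```

With `n = 4` and the two states `[1, 1, 1, 1]` and `[1, 1, -1, 1]`, the target values are `5.0` and `2.5`, and the Monte Carlo estimates should agree up to sampling error.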

2.3. Proof sketch of Theorem 1.3

To deduce Theorem 1.3 from Theorem 1.4, simply let $\sigma^{1},\ldots,\sigma^{k},\sigma^{k+1}$ be i.i.d. draws from the Gibbs measure. Then by the law of large numbers, when $k$ is large,

[TABLE]

with high probability. But by Theorem 1.4, we know that with high probability, $\mathcal{R}(\sigma^{k+1})$ is not close to zero. Therefore, with high probability, there must exist $1\leq j\leq k$ such that $\mathcal{R}_{j,k+1}$ is not close to zero.
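The law-of-large-numbers step above can be sketched in Python on a toy finite Gibbs measure. The weights and the overlap matrix below are hypothetical placeholders for $\mu_{n}^{\beta}$ and $\mathcal{R}$; the sketch only illustrates that the empirical average of overlaps with $k$ i.i.d. draws approximates the conditional overlap, and that the maximum over the draws is at least that average.

```python
import random


def conditional_overlap(weights, overlap, target):
    """Exact conditional average overlap: sum over s of mu(s) * R(s, target)."""
    return sum(w * overlap[s][target] for s, w in enumerate(weights))


def empirical_overlap(weights, overlap, target, k, rng):
    """Average and maximum overlap between `target` and k i.i.d. Gibbs draws."""
    draws = rng.choices(range(len(weights)), weights=weights, k=k)
    values = [overlap[s][target] for s in draws]
    return sum(values) / k, max(values)
```

If the conditional overlap of the target is bounded away from zero, then for large `k` the empirical average is too, so at least one of the `k` draws must have non-negligible overlap with the target.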

3. General preliminaries

In this preliminary section, we record several facts needed in the proofs of Theorems 1.4 and 1.5. These preparatory results are mostly elementary.

3.1. The Gibbs measure and partition function

In order for our results to apply to a broad collection of models, we have allowed the state space $\Sigma_{n}$ to be completely general, and the Hamiltonian $H_{n}$ to consist of countably infinite summands. We begin by checking that these assumptions pose no issues to computation. So for the remainder of Section 3.1, we fix the value of $n$ .

Let $\langle\cdot\rangle_{N}$ denote expectation with respect to the Gibbs measure when the Hamiltonian is replaced by the finite sum $H_{n,N}\coloneqq\sum_{i=1}^{N}g_{i}\varphi_{i}$ . That is,

[TABLE]

So that we can pass from $\langle\cdot\rangle_{N}$ to $\langle\cdot\rangle$ , we begin with the following lemma.

Lemma 3.1.

For all $\beta\in\mathbb{R}$ and any $f\in L^{2}(\Sigma_{n})$ , the following limits hold almost surely and in $L^{\alpha}$ for any $\alpha\in[1,\infty)$ :

[TABLE]

Proof.

We organize the proof into a sequence of claims.

Claim 3.2.

With $\mathbb{P}$ -probability equal to $1$ ,

[TABLE]

Proof.

Observe that for fixed $\sigma\in\Sigma_{n}$ , the sequence $(H_{n,N}(\sigma))_{N\geq 0}$ is a martingale with respect to $\mathbb{P}$ . Since

[TABLE]

the martingale convergence theorem guarantees that $H_{n,N}(\sigma)$ converges $\mathbb{P}$ -almost surely as $N\to\infty$ to a limit we call $H_{n}(\sigma)$ . Now Fubini’s theorem proves the claim:

[TABLE]

∎

Claim 3.3.

There exist nonnegative random variables $(M^{+}(\sigma))_{\sigma\in\Sigma_{n}}$ and $(M^{-}(\sigma))_{\sigma\in\Sigma_{n}}$ such that

[TABLE]

and

[TABLE]

Proof.

We simply take

[TABLE]

so that (3.3) is satisfied by definition. Since $M^{+}\stackrel{{\scriptstyle\text{d}}}{{=}}M^{-}$ , we need only check (3.4) for $M^{+}$ . Observe that for any $\beta\geq 0$ , $(\operatorname{e}^{\beta H_{n,N}(\sigma)})_{N\geq 0}$ is a submartingale. By Doob’s inequality, for any $\lambda>0$ and any integer $m\geq 0$ ,

[TABLE]

Therefore, for any $0<\varepsilon<\lambda$ ,

[TABLE]

which implies

[TABLE]

Since Tonelli’s theorem gives $\mathbb{E}E_{n}(\operatorname{e}^{\beta M^{+}(\sigma)})=E_{n}(\mathbb{E}\operatorname{e}^{\beta M^{+}(\sigma)})$ , (3.4) follows from the above display. ∎

Claim 3.4.

For any $f\in L^{2}(\Sigma_{n})$ and any continuous function $\phi:\mathbb{R}\to\mathbb{R}$ such that $|\phi(x)|\leq a\operatorname{e}^{b|x|}$ for all $x\in\mathbb{R}$ , for some $a,b\geq 0$ , we have

[TABLE]

Proof.

By Claim 3.2 and the continuity of $\phi$ , we almost surely have that $\phi(H_{n,N}(\sigma))\to\phi(H_{n}(\sigma))$ for $P_{n}$ -a.e. $\sigma\in\Sigma_{n}$ , as $N\to\infty$ . And by hypothesis,

[TABLE]

Since

[TABLE]

and Claim 3.3 implies that almost surely $E_{n}(\operatorname{e}^{2bM^{\pm}(\sigma)})<\infty$ , (3.5) now follows from dominated convergence (with respect to $P_{n}$ ). ∎

Claim 3.5.

For any $f\in L^{2}(\Sigma_{n})$ and any continuous function $\phi:\mathbb{R}\to\mathbb{R}$ such that $|\phi(x)|\leq a\operatorname{e}^{b|x|}$ for all $x\in\mathbb{R}$ , for some $a,b\geq 0$ , we have

[TABLE]

Proof.

Recall that

[TABLE]

Since $|\phi(x)|\operatorname{e}^{\beta x}\leq a\operatorname{e}^{(b+\beta)|x|}$ , the almost sure part of (3.7) is immediate from Claim 3.4. The convergence in $L^{\alpha}$ is then a consequence of dominated convergence (with respect to $\mathbb{P}$ ). Indeed, by Cauchy–Schwarz and Jensen’s inequality, we have the majorization

[TABLE]

where the final expression has moments of all orders by (3.4). ∎

We now complete the proof of Lemma 3.1 by taking $\phi\equiv 1$ for (3.2a), and $f\equiv 1$ , $\phi(x)=x$ for (3.2b).

∎

Remark 3.6.

The essential feature of the above proof was checking in Claim 3.3 that (• ‣ 1.1) is enough to guarantee the first equality below:

[TABLE]

We will frequently use the above identity, an easy consequence of which is the following.

Lemma 3.7.

For any $\beta\in\mathbb{R}$ , we have

[TABLE]

as well as

[TABLE]

Proof.

By exchanging the order of expectation in the identity $\mathbb{E}Z_{n}(\beta)=\mathbb{E}[E_{n}(\operatorname{e}^{\beta H_{n}(\sigma)})]$ (which we are permitted to do by Tonelli’s theorem) and applying (3.8), we obtain (3.9). For (3.10), we apply Jensen’s inequality to obtain

[TABLE]

then take expectation $\mathbb{E}(\cdot)$ of both sides, and again exchange the order of expectation. ∎

Let us also record two consequences of Lemma 3.1 that will be needed later in the paper.

Corollary 3.8.

For any $\beta\in\mathbb{R}$ , the following limits hold almost surely and in $L^{\alpha}$ for any $\alpha\in[1,\infty)$ :

[TABLE]

Proof.

First we argue the almost sure statements. The $L^{\alpha}$ statements will then follow from bounded convergence, since (• ‣ 1.1) gives the uniform bound

[TABLE]

So we fix the disorder ${\boldsymbol{g}}$ . By Lemma 3.1, it is almost surely the case that for every $i\geq 1$ , $\langle\varphi_{i}\rangle_{N}\to\langle\varphi_{i}\rangle$ and $\langle\varphi_{i}^{2}\rangle_{N}\to\langle\varphi_{i}^{2}\rangle$ as $N\to\infty$ . We also know $\sum_{i=1}^{\infty}\varphi_{i}^{2}=n$ . In particular, given $\varepsilon>0$ , we can choose $M$ so large that

[TABLE]

Given $M$ , there is $N_{0}$ such that for all $N\geq N_{0}$ ,

[TABLE]

In particular, for all $N\geq N_{0}\vee M$ ,

[TABLE]

and also

[TABLE]

∎

3.2. Derivative of free energy

This section records some important facts regarding convergence of the free energy’s derivative. By Lemma 3.1, it is almost surely the case that the random variable $H_{n}(\sigma)$ has exponential moments of all orders with respect to $P_{n}$ . Standard calculations then show that the free energy $F_{n}(\beta)=\frac{1}{n}\log Z_{n}(\beta)$ satisfies

[TABLE]

Recall from (• ‣ 1.1) that $F_{n}(\beta)\to p(\beta)$ . Since $F_{n}(\cdot)$ is convex for every $n$ , $p(\cdot)$ is necessarily convex. This convexity implies the following lemma, which is a general fact about the convergence of convex functions.

Lemma 3.9.

If $p(\cdot)$ is differentiable at $\beta$ , and $\beta_{n}=\beta+\delta(n)$ with $\delta(n)\to 0$ as $n\to\infty$ , then

[TABLE]

Proof.

Let $\varepsilon>0$ . By differentiability, we can choose $h>0$ sufficiently small that

[TABLE]

where the middle inequality is due to convexity. Given $h$ , we next choose $\delta>0$ such that

[TABLE]

which is possible by the continuity of $p(\cdot)$ . Now, convexity of $F_{n}$ implies the following for all $n$ such that $\delta(n)\leq\delta$ :

[TABLE]

Upon defining

[TABLE]

it follows that for all sufficiently large $n$ ,

[TABLE]

Analogously, (3.13), (3.14b), and (3.15b) together yield the lower bound

[TABLE]

By (• ‣ 1.1), both $\Delta_{n}^{-}(\beta-\delta,h)$ and $\Delta_{n}^{+}(\beta+\delta,h)$ tend to $0$ almost surely and in $L^{1}$ as $n\to\infty$ . As $\varepsilon$ is arbitrary, the desired result follows. ∎
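The mechanism behind Lemma 3.9 is that the derivative of a convex function is sandwiched between its left and right difference quotients. The Python sketch below illustrates this with a toy convex family $f_{n}(x)=\sqrt{x^{2}+1/n}$ converging to $|x|$; this family is a hypothetical stand-in chosen for simplicity and is unrelated to any free energy in the paper.

```python
import math


def f_n(x: float, n: int) -> float:
    """Toy convex function converging pointwise to |x| as n grows."""
    return math.sqrt(x * x + 1.0 / n)


def f_n_prime(x: float, n: int) -> float:
    """Exact derivative of f_n."""
    return x / math.sqrt(x * x + 1.0 / n)


def difference_quotients(f, x: float, h: float):
    """Left and right difference quotients of f at x with step h > 0.

    For convex f, the derivative at x lies between these two values.
    """
    left = (f(x) - f(x - h)) / h
    right = (f(x + h) - f(x)) / h
    return left, right
```

Evaluating `f_n_prime` at the shifted points `1 + 1/n` gives values approaching `1`, the derivative of the limit $|x|$ at $x=1$, mirroring the statement with $\beta_{n}=\beta+\delta(n)$.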

Corollary 3.10.

For every $\beta\geq 0$ at which $p(\cdot)$ is differentiable,

[TABLE]

In particular, $0\leq p^{\prime}(\beta)\leq\beta$ , and there is thus some $\beta_{\mathrm{c}}\in[0,\infty]$ such that

[TABLE]

Proof.

Using the notation of Lemma 3.1, we have

[TABLE]

By Gaussian integration by parts,

[TABLE]

and then Lemma 3.9 allows us to write

[TABLE]

which completes the proof of (3.17). The inequalities $0\leq p^{\prime}(\beta)\leq\beta$ now follow from

[TABLE]

For the second part of the claim, we recall that $p(\cdot)$ is convex and thus absolutely continuous. Since $p(0)=0$ , we then have

[TABLE]

Since the integrand is nonnegative, it follows that $\beta\mapsto\beta^{2}/2-p(\beta)$ is non-decreasing for $\beta\geq 0$ . ∎
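The Gaussian integration by parts step rests on the standard identity $\frac{1}{n}\mathbb{E}\langle H_{n}(\sigma)\rangle=\beta\bigl(1-\mathbb{E}\langle\mathcal{R}_{1,2}\rangle\bigr)$. The Python sketch below checks it by Monte Carlo in a REM-like toy model, under the assumption of finitely many states with independent energies of variance $n$, so that $\mathcal{R}_{1,2}=\mathds{1}\{\sigma^{1}=\sigma^{2}\}$. The function names and parameter values are hypothetical.

```python
import math
import random


def gibbs_weights(energies, beta):
    """Gibbs probabilities proportional to exp(beta * H), computed stably."""
    top = max(energies)
    w = [math.exp(beta * (e - top)) for e in energies]
    total = sum(w)
    return [x / total for x in w]


def integration_by_parts_check(num_states, n, beta, samples, rng):
    """Monte Carlo estimates of E<H>/n and of beta * (1 - E<R_12>)."""
    lhs = 0.0
    rhs = 0.0
    for _ in range(samples):
        energies = [math.sqrt(n) * rng.gauss(0.0, 1.0) for _ in range(num_states)]
        mu = gibbs_weights(energies, beta)
        mean_energy = sum(m * e for m, e in zip(mu, energies))
        # With independent energies, two replicas overlap only if they coincide.
        overlap = sum(m * m for m in mu)
        lhs += mean_energy / n
        rhs += beta * (1.0 - overlap)
    return lhs / samples, rhs / samples
```

The two returned averages should agree up to sampling error, and since the overlap lies in $[0,1]$ here, both lie in $[0,\beta]$, matching the bounds $0\leq p^{\prime}(\beta)\leq\beta$.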

So that we can be explicit in the inverse temperature parameter $\beta$ , for the remainder of the section we will write $\langle\cdot\rangle_{\beta}$ for expectation with respect to $\mu_{n}^{\beta}$ . In light of (3.12), Lemma 3.9 implies

[TABLE]

We will require the following stronger form of this result, which also appears in [6, Theorem 3]. Our proof is adapted from the elegant approach of [48], and included for completeness.

Lemma 3.11.

If $\beta$ is a point of differentiability for $p(\cdot)$ , then

[TABLE]

Proof.

By Lemma 3.9, it suffices to show that if $\beta_{0}$ is a point of differentiability for $p(\cdot)$ , then

[TABLE]

Fix $\varepsilon>0$ and choose $h>0$ small enough that

[TABLE]

Given $h$ , differentiability allows us to take $\beta_{1}>\beta_{0}$ sufficiently close to $\beta_{0}$ to satisfy

[TABLE]

By adding and subtracting $\langle|H_{n}(\sigma^{1})-H_{n}(\sigma^{2})|\rangle_{\beta_{0}}$ , we have

[TABLE]

A simple calculation, followed by Cauchy–Schwarz, shows

[TABLE]

By another application of Cauchy–Schwarz, we have

[TABLE]

From the previous two displays, we find

[TABLE]

In light of this inequality, (LABEL:trivial_integrals) now shows

[TABLE]

where

[TABLE]

In summary,

[TABLE]

where

[TABLE]

Therefore, convexity of $F_{n}(\cdot)$ implies

[TABLE]

As $n\to\infty$ , (• ‣ 1.1) shows that $\Delta_{n}^{+}(\beta_{1},h)$ and $\Delta_{n}^{-}({\beta_{0}},h)$ each converge to $0$ almost surely and in $L^{1}$ . Thus (3.21) and the above display together yield the desired result, as $\varepsilon$ is arbitrary. ∎

3.3. Temperature perturbations

Here we derive upper bounds for the effects of temperature perturbations on certain expectations with respect to $\mu_{n}^{\beta}$ .

Lemma 3.12.

The following statements hold for any $\beta_{1}\geq\beta_{0}\geq 0$ .

(a)

For any measurable $f:\Sigma_{n}\to[-1,1]$ ,

[TABLE]

(b)

For any $\sigma\in\Sigma_{n}$ ,

[TABLE]

(c)

Finally,

[TABLE]

Proof.

All three claims follow from two crucial observations. First, for any $f\in L^{2}(\Sigma_{n})$ ,

[TABLE]

And second,

[TABLE]

Then part (a) immediately follows, since

[TABLE]

For part (b), we first observe that if $0\leq\beta\leq\beta_{1}$ , then

[TABLE]

where now the right-hand side is independent of $\beta$ and (almost surely) finite. Moreover, we have the following finiteness condition when summing over $i$ :

[TABLE]

It thus follows that

[TABLE]

In particular,

[TABLE]

As in part (a), (3.25) now proves (3.22). For part (c), we can argue similarly in order to obtain

[TABLE]

from which (3.25) proves (3.23). ∎

4. Proof of Theorem 1.5

Recall the event under consideration:

[TABLE]

The proof of Theorem 1.5 is a perturbative argument using an Ornstein–Uhlenbeck (OU) flow on the environment,

[TABLE]

where ${\boldsymbol{W}}(\cdot)=(W_{i}(\cdot))_{i=1}^{\infty}$ is a collection of independent Brownian motions that are also independent of ${\boldsymbol{g}}={\boldsymbol{g}}_{0}$ , and the above definition is understood coordinate-wise. Within Section 4, we denote expectation with respect to $\mu_{n,{\boldsymbol{g}}_{t}}^{\beta}$ by $\langle\cdot\rangle_{t}$ , not to be confused with $\langle\cdot\rangle_{\beta}$ used in Section 3. We now prove Theorem 1.5 by juxtaposing the following two propositions. Notice that if $\mathbb{P}(B_{\delta})=0$ , then there is nothing to be done; therefore, we may henceforth assume $\mathbb{P}(B_{\delta})>0$ so that conditioning on $B_{\delta}$ is well-defined.

Proposition 4.1.

If $\beta$ is a point of differentiability for $p(\cdot)$ , and $p^{\prime}(\beta)<\beta$ , then there exists $\kappa=\kappa(\beta)>0$ such that the following holds: For any $\varepsilon>0$ , there is $T=T(\beta,\varepsilon)$ sufficiently large that

[TABLE]

More specifically,

[TABLE]

For the statement of the second result, let $\mathscr{F}_{t}$ denote the $\sigma$ -algebra generated by ${\boldsymbol{g}}_{0}$ and $({\boldsymbol{W}}(s))_{0\leq s\leq\operatorname{e}^{2t}-1}$ .

Proposition 4.2.

Assume $\beta$ is a point of differentiability for $p(\cdot)$ . Then there is a process $(I_{t})_{t>0}$ adapted to the filtration $(\mathscr{F}_{t})_{t>0}$ , such that the following statements hold:

(a)

For any $T,\varepsilon>0$ ,

[TABLE]

(b)

For any $T,\varepsilon_{1},\varepsilon_{2}>0$ , there exist $\delta_{1}=\delta_{1}(\beta,T,\varepsilon_{1},\varepsilon_{2})>0$ sufficiently small and $n_{0}=n_{0}(\beta,T,\varepsilon_{1},\varepsilon_{2})$ sufficiently large, that

[TABLE]

Proof of Theorem 1.5.

Let $\varepsilon>0$ be given, and assume the hypotheses of Proposition 4.1. By that result, there is $\kappa>0$ and $T$ large enough that

[TABLE]

Let $(I_{t})_{t\geq 0}$ be the process guaranteed by Proposition 4.2, and define the events

[TABLE]

By Proposition 4.2(a),

[TABLE]

And by Proposition 4.2(b), we can choose $0<\delta\leq\kappa/5$ sufficiently small and $n_{0}$ sufficiently large that

[TABLE]

Observe that $B_{\delta}\cap H_{1}\cap H_{2}\subset H$ , and clearly the events $G$ and $H$ are disjoint. We thus have

[TABLE]

On the other hand,

[TABLE]

Putting the two previous displays together, we find

[TABLE]

and so

[TABLE]

∎

4.1. Proof of Proposition 4.1

We will need to recall some facts about Ornstein–Uhlenbeck processes. To avoid technical complications, we restrict ourselves to finite-dimensional OU processes, and then take an appropriate limit at a later stage.

4.1.1. General OU theory

Fix a positive integer $N$ , and consider a vector ${\boldsymbol{g}}=(g_{1},\dots,g_{N})$ of i.i.d. standard normal random variables. Let ${\boldsymbol{W}}=({\boldsymbol{W}}(t))_{t\geq 0}$ be an independent $N$ -dimensional Brownian motion. The OU flow starting at ${\boldsymbol{g}}$ is given by

[TABLE]

This is a continuous-time, stationary Markov chain. Let $(\mathcal{P}_{t})_{t\geq 0}$ denote the OU semigroup; that is, for $f:\mathbb{R}^{N}\to\mathbb{R}$ ,

[TABLE]

Denote the OU generator by $\mathcal{L}\coloneqq\Delta-{\boldsymbol{x}}\cdot\nabla$ . It is especially useful to consider the spectral decomposition of $\mathcal{L}$ , whose eigenfunctions are the multivariate Hermite polynomials. For our purposes, it suffices to recall the following well-known facts (see, for instance, [20, Chapter 6]):

•

Let $\gamma_{N}$ denote the $N$ -dimensional standard Gaussian measure. There is an orthonormal basis $\{\phi_{j}\}_{j=0}^{\infty}$ of $L^{2}(\gamma_{N})$ consisting of eigenfunctions of $\mathcal{L}$ , where $\phi_{0}\equiv 1$ , $\mathcal{L}\phi_{0}=\lambda_{0}\phi_{0}=0$ , and $\mathcal{L}\phi_{j}=-\lambda_{j}\phi_{j}$ with $\lambda_{j}>0$ for $j\geq 1$ . Therefore, if $f=\sum_{j=0}^{\infty}a_{j}\phi_{j}\in L^{2}(\gamma_{N})$ , then

[TABLE]

Furthermore, if $f_{1}=\sum_{j=0}^{\infty}a_{j}\phi_{j},f_{2}=\sum_{j=0}^{\infty}b_{j}\phi_{j}\in L^{2}(\gamma_{N})$ , then

[TABLE]

•

The OU semigroup acts on $L^{2}(\gamma_{N})$ by

[TABLE]

Therefore, if $f=\sum_{j=0}^{\infty}a_{j}\phi_{j}\in L^{2}(\gamma_{N})$ , then

[TABLE]

•

The associated Dirichlet form is given by

[TABLE]

whenever $f_{1}$ and $f_{2}$ are twice-differentiable functions in $L^{2}(\gamma_{N})$ such that both expectations above are finite. In particular, if $f_{1}=f_{2}=\sum_{j=0}^{\infty}a_{j}\phi_{j}\in L^{2}(\gamma_{N})$ is twice-differentiable, then

[TABLE]
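The facts recalled above can be checked numerically in one dimension. The Python sketch below assumes the flow has the time-changed form $g_{t}=\operatorname{e}^{-t}g_{0}+\operatorname{e}^{-t}W(\operatorname{e}^{2t}-1)$, which is consistent with the generator $\mathcal{L}=\Delta-{\boldsymbol{x}}\cdot\nabla$ and with the filtration $\mathscr{F}_{t}$ defined earlier. It tests stationarity and the eigenfunction relation $\mathcal{P}_{t}\phi=\operatorname{e}^{-2t}\phi$ for the second Hermite polynomial $\phi(x)=x^{2}-1$. The function names are hypothetical.

```python
import math
import random


def ou_draw(x: float, t: float, rng: random.Random) -> float:
    """One draw of g_t given g_0 = x, using g_t = e^{-t} * (g_0 + W(e^{2t} - 1))."""
    # W(e^{2t} - 1) is a centered Gaussian with variance e^{2t} - 1.
    brownian = math.sqrt(math.exp(2.0 * t) - 1.0) * rng.gauss(0.0, 1.0)
    return math.exp(-t) * (x + brownian)


def hermite2(x: float) -> float:
    """Second Hermite polynomial, an eigenfunction of L with eigenvalue -2."""
    return x * x - 1.0


def semigroup_estimate(f, x: float, t: float, samples: int, rng: random.Random) -> float:
    """Monte Carlo estimate of (P_t f)(x) = E[f(g_t) | g_0 = x]."""
    return sum(f(ou_draw(x, t, rng)) for _ in range(samples)) / samples


def stationary_moments(t: float, samples: int, rng: random.Random):
    """Estimates of Var(g_t) and Cov(g_0, g_t) when g_0 is standard normal."""
    var_t = 0.0
    cov = 0.0
    for _ in range(samples):
        g0 = rng.gauss(0.0, 1.0)
        gt = ou_draw(g0, t, rng)
        var_t += gt * gt
        cov += g0 * gt
    return var_t / samples, cov / samples
```

Under these assumptions, `semigroup_estimate(hermite2, x, t, ...)` should be close to `math.exp(-2 * t) * hermite2(x)`, while `stationary_moments` should return values near `1` and `math.exp(-t)`.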

Lemma 4.3.

For any twice differentiable $f\in L^{2}(\gamma_{N})$ with $\mathcal{L}f\in L^{2}(\gamma_{N})$ , we have

[TABLE]

Proof.

Take any $0\leq s\leq t$ . By the law of total variance, we have

[TABLE]

In particular, if we write $f$ in the form $f=\sum_{j=0}^{\infty}a_{j}\phi_{j}$ , then

[TABLE]

Therefore,

[TABLE]

Hence

[TABLE]

∎

Proof of Proposition 4.1.

Let $({\boldsymbol{g}}_{t})_{t\geq 0}$ be the OU flow from (4.1), and write

[TABLE]

Recall that $\langle\cdot\rangle_{t}$ denotes expectation with respect to $\mu_{n,{\boldsymbol{g}}_{t}}^{\beta}$ . Let $Z_{n,t}(\beta)$ and $F_{n,t}(\beta)$ be the associated partition function and free energy, respectively. That is, with $H_{n,t}\coloneqq\sum_{i}g_{i}(t)\varphi_{i}$ , we have

[TABLE]

So that we can use the finite-dimensional facts discussed before, define $H_{n,t,N}\coloneqq\sum_{i=1}^{N}g_{i}(t)\varphi_{i}$ , as well as

[TABLE]

Define $f:\mathbb{R}^{N}\to\mathbb{R}$ by

[TABLE]

so that $f({\boldsymbol{g}}_{t})=F_{n,t,N}(\beta)$ , where ${\boldsymbol{g}}_{t}$ is understood to mean $(g_{1}(t),\dots,g_{N}(t))$ . Note that $f\in L^{2}(\gamma_{N})$ , since $\log^{2}x\leq x+x^{-1}$ for $x>0$ , and so using the same arguments as in Lemma 3.7 yields

[TABLE]

Similar to (3.1), for general $\mathfrak{f}\in L^{2}(\Sigma_{n})$ , we define

[TABLE]

Observe that

[TABLE]

which implies

[TABLE]

as well as

[TABLE]

where the derivative is with respect to $\beta$ . Note that

[TABLE]

Furthermore,

[TABLE]

We thus have

[TABLE]

From (4.16), it is clear that $\mathcal{L}f\in L^{2}(\gamma_{N})$ . Therefore, by Lemma 4.3 and (4.15),

[TABLE]

Moreover, from (4.10) we know

[TABLE]

We can now apply (3.2a) (together with (3.12)) and (3.11) to take the limit $N\to\infty$ in the two previous displays and obtain

[TABLE]

Consequently, for any $\varepsilon>0$ , Chebyshev’s inequality shows

[TABLE]

Now consider that

[TABLE]

Therefore, if $\beta$ is a point of differentiability for $p(\cdot)$ , then for any sequence $(t(n))_{n\geq 1}$ , Lemma 3.9 guarantees

[TABLE]

When $t=t(n)=T/n$ for fixed $T$ , (4.17) and (4.18) together show

[TABLE]

Assuming $p^{\prime}(\beta)<\beta$ , we let $\kappa=\kappa(\beta)\coloneqq\frac{\beta-p^{\prime}(\beta)}{\beta}>0$ . Then the previous display implies

[TABLE]

The proof is completed by taking $T=T(\beta,\varepsilon)$ sufficiently large that

[TABLE]

∎

4.2. Proof of Proposition 4.2

Let us rewrite (4.1) as

[TABLE]

Recall that $\langle\cdot\rangle_{0}=\langle\cdot\rangle$ . For any $f\in L^{2}(\Sigma_{n})$ , we have

[TABLE]

In light of Lemma 3.11, we anticipate that for $t=O(n^{-1})$ ,

[TABLE]

Indeed, the process that will satisfy the conclusions of Proposition 4.2 is

[TABLE]

To prove so, the following lemma will suffice. Recall that

[TABLE]

Lemma 4.4.

For any $T,\varepsilon>0$ , the following statements hold:

(a)

If $\beta$ is a point of differentiability for $p(\cdot)$ , then there is a sequence of nonnegative random variables $(M_{n})$ depending only on $\beta$ , $T$ , and $\varepsilon$ , such that

[TABLE]

and for every $f\in L^{2}(\Sigma_{n})$ , $t\in[0,\frac{T}{n}]$ ,

[TABLE]

(b)

There exist $\delta_{1}=\delta_{1}(\beta,T,\varepsilon)>0$ sufficiently small and $n_{0}=n_{0}(\beta,T,\varepsilon)$ sufficiently large such that for every $n\geq n_{0}$ , $f\in L^{2}(\Sigma_{n})$ , $t\in[0,\frac{T}{n}]$ , and $\delta\in(0,\delta_{1}]$ , we have

[TABLE]

Before checking these facts, let us use them to prove Proposition 4.2. The idea is to use the above sequence $M_{n}$ to control the differences $Q_{t}(\varphi_{i})^{2}-\langle\varphi_{i}\rangle^{2}$ simultaneously across all $i$ and $t\in[0,\frac{T}{n}]$ ; this will allow us to prove (4.3). On the other hand, (4.23) shows that when $\langle\mathcal{R}_{1,2}\rangle$ is small, $Q_{t}(\varphi_{i})^{2}$ remains close to $Q_{0}(\varphi_{i})^{2}=\langle\varphi_{i}\rangle^{2}$ . That this approximation holds uniformly over $t\in[0,\frac{T}{n}]$ will lead to (4.4).

Proof of Proposition 4.2.

First we prove part (a). Let $T,\varepsilon>0$ be fixed. From Lemma 4.4(a), we identify a sequence of random variables $(M_{n})$ such that (4.22) holds, and

[TABLE]

Under our definition (4.20), we have

[TABLE]

Now Markov’s inequality and (4.24) together imply

[TABLE]

which completes the proof of (a).

Next we prove part (b). Let $\varepsilon_{1},\varepsilon_{2}>0$ be given. Similar to above, for any $\delta>0$ we have

[TABLE]

From Lemma 4.4(b), we choose $\delta_{1}$ sufficiently small that (4.23) holds for all $\delta\in(0,\delta_{1}]$ , with $\varepsilon=\varepsilon_{1}\varepsilon_{2}$ . We then have, for all $n$ sufficiently large,

[TABLE]

Then applying Markov’s inequality yields (4.4). ∎

It now remains to prove Lemma 4.4. To do so, we will make use of the following preparatory result, which in fact is the common thread between the proofs of Theorems 1.4 and 1.5. Let ${\boldsymbol{h}}=(h_{i})_{i=1}^{\infty}$ be an independent copy of the disorder ${\boldsymbol{g}}$ . We will use $\mathbb{E}_{{\boldsymbol{h}}}$ and $\operatorname{Var}_{{\boldsymbol{h}}}$ to denote expectation and variance with respect to ${\boldsymbol{h}}$ , conditional on ${\boldsymbol{g}}$ . All statements involving these conditional quantities will be almost sure with respect to $\mathbb{P}$ , although we will not repeatedly write this.

Lemma 4.5.

Recall the constant $\mathscr{E}_{n}$ from (• ‣ 1.1). For any $t\geq 0$ , the following statements hold:

(a)

For any $f\in L^{2}(\Sigma_{n})$ ,

[TABLE]

(b)

For any measurable $f:\Sigma_{n}\to[0,1]$ ,

[TABLE]

Proof.

For any $f\in L^{2}(\Sigma_{n})$ ,

[TABLE]

Now, for all $x\in[-1,1]$ , we have $|\operatorname{e}^{t^{2}x}-1|\leq\operatorname{e}^{t^{2}}|x|$ . In particular, since

[TABLE]

we see from (LABEL:general_f) that

[TABLE]
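The elementary bound $|\operatorname{e}^{t^{2}x}-1|\leq\operatorname{e}^{t^{2}}|x|$ on $[-1,1]$ follows from convexity when $x\geq 0$ and from $1-\operatorname{e}^{-y}\leq y$ when $x<0$. As a sanity check, the following Python sketch evaluates both sides on a grid; the function name, grid, and floating-point slack are illustrative choices, not part of the paper.

```python
import math

def exp_increment_bound_holds(t, x):
    """Check |e^{t^2 x} - 1| <= e^{t^2} |x| up to floating-point slack."""
    lhs = abs(math.expm1(t * t * x))
    rhs = math.exp(t * t) * abs(x)
    return lhs <= rhs * (1.0 + 1e-12)

ts = [j / 10.0 for j in range(0, 31)]        # t in [0, 3]
xs = [j / 100.0 for j in range(-100, 101)]   # x in [-1, 1]
all_hold = all(exp_increment_bound_holds(t, x) for t in ts for x in xs)
print(all_hold)
```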

Alternatively, if $f:\Sigma_{n}\to[0,1]$ , then we can use the equalities in (LABEL:general_f) to write

[TABLE]

∎

We are now ready to prove Lemma 4.4.

Proof of Lemma 4.4.

Let $f\in L^{2}(\Sigma_{n})$ be arbitrary. Recall the random variable $Q_{t}(f)$ defined in (4.19). Observe that for fixed $t\geq 0$ , $\operatorname{e}^{-t}{\boldsymbol{W}}(\operatorname{e}^{2t}-1)$ is equal in law to $\sqrt{1-\operatorname{e}^{-2t}}{\boldsymbol{h}}$ , where ${\boldsymbol{h}}$ is an independent copy of ${\boldsymbol{g}}$ . Therefore, if we define

[TABLE]

then

[TABLE]
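The equality in law used here rests on the fact that $W(\operatorname{e}^{2t}-1)$ is centered Gaussian with variance $\operatorname{e}^{2t}-1$, so that $\operatorname{e}^{-t}W(\operatorname{e}^{2t}-1)$ has variance $1-\operatorname{e}^{-2t}$. The Monte Carlo sketch below illustrates this for a single coordinate. It assumes the OU flow takes the form $g(t)=\operatorname{e}^{-t}g(0)+\operatorname{e}^{-t}W(\operatorname{e}^{2t}-1)$ suggested by the text; the sample size, seed, and function names are hypothetical.

```python
import math
import random

def sample_ou_pair(t, rng):
    """Draw (g(0), g(t)) for one coordinate of the assumed OU flow."""
    g0 = rng.gauss(0.0, 1.0)
    # W(e^{2t} - 1) is N(0, e^{2t} - 1) and independent of g(0).
    w = rng.gauss(0.0, math.sqrt(math.exp(2.0 * t) - 1.0))
    gt = math.exp(-t) * (g0 + w)
    return g0, gt

def empirical_moments(t, n_samples=200_000, seed=0):
    """Estimate Var g(t) and Cov(g(0), g(t)) by simulation."""
    rng = random.Random(seed)
    pairs = [sample_ou_pair(t, rng) for _ in range(n_samples)]
    var_t = sum(gt * gt for _, gt in pairs) / n_samples
    cov = sum(g0 * gt for g0, gt in pairs) / n_samples
    return var_t, cov

var_t, cov = empirical_moments(t=0.5)
print(var_t, cov, math.exp(-0.5))
```

Under this assumption the marginal stays standard Gaussian, while the correlation with the initial disorder decays like $\operatorname{e}^{-t}$.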

Since the conclusions of Lemma 4.4 depend only on marginal distributions at fixed $t\leq T/n$ , it suffices to prove bounds of the form

[TABLE]

where $M_{n}$ satisfies (4.21), and

[TABLE]

So henceforth we fix $T,\varepsilon>0$ , and $t\in[0,\frac{T}{n}]$ . We will need the following four claims. In checking these claims, we will frequently use the following inequality, which holds for any $c\geq 0$ :

[TABLE]

Claim 4.6.

For any $q\in(-\infty,0]\cup[1,\infty)$ ,

[TABLE]

Claim 4.7.

For any $q\geq 2$ ,

[TABLE]

Claim 4.8.

Given any $q>0$ , set $k=\lfloor\log_{2}\frac{n}{qT}\rfloor$ . For all $n$ large enough that $k\geq 1$ ,

[TABLE]

Claim 4.9.

For any even $q\geq 2$ and $\varepsilon>0$ , the following inequalities hold for all $n\geq(2q+1)T$ :

[TABLE]

and thus

[TABLE]

Before proving the claims, we use them to obtain the desired statements.

4.2.1. Proof of Lemma 4.4(a)

First note that for any random variables $W$ and $Z$ ,

[TABLE]

Therefore,

[TABLE]

Let $\delta$ be a positive number to be chosen later. Anticipating the application of Claims 4.8 and 4.9, we condense notation by defining

[TABLE]

Because of (LABEL:prep_for_full_bound), we seek a bound of the form

[TABLE]

Therefore, once we set

[TABLE]

and take expectation, (LABEL:prep_for_full_bound) becomes

[TABLE]

which is exactly (4.27). To complete the proof of Lemma 4.4(a), we need to show that given any $\varepsilon>0$ , we can choose $\delta$ sufficiently small that (4.21) holds ( $M_{n}$ depends on $\delta$ through $W_{n}^{(4)}$ and $W_{n}^{(8)}$ ).

Indeed, by Cauchy–Schwarz we have

[TABLE]

Next we observe that for $q\geq 4$ and $n$ sufficiently large such that $k=\lfloor\log_{2}\frac{n}{qT}\rfloor\geq 1$ ,

[TABLE]

Meanwhile, if $q\geq 4$ and $n\geq 2(q+1)T$ , then

[TABLE]

By Lemma 3.11, the previous display shows

[TABLE]

In light of (4.37) and (4.38), it is clear from this inequality that $\delta$ can be chosen sufficiently small that (4.21) holds.

4.2.2. Proof of Lemma 4.4(b)

To establish (4.28), it will be easier to replace $X^{\prime}/Y^{\prime}$ by $X^{\prime\prime}/Y^{\prime\prime}$ , where

[TABLE]

By Lemma 4.5(a),

[TABLE]

and so

[TABLE]

as well as

[TABLE]

Because

[TABLE]

we have $\mathbb{E}_{{\boldsymbol{h}}}(Y^{\prime\prime})=1$ and can thus apply Chebyshev’s inequality to obtain

[TABLE]

We will use these inequalities in the following bound:

[TABLE]

Now,

[TABLE]

and

[TABLE]

In addition,

[TABLE]

Using (4.39), (4.40), and (4.42)–(4.44) in (LABEL:observation_0), we find

[TABLE]

In particular, for any $\delta>0$ and $n$ large enough that $\mathscr{E}_{n}\leq\delta/2$ ,

[TABLE]

and so (4.35) implies

[TABLE]

Given $\varepsilon>0$ , we choose $\theta$ and $\delta$ small enough (in that order, and depending only on $\beta$ , $T$ , and $\varepsilon$ ) so that the rightmost expression above is at most $\mathds{1}_{B_{\delta}}\varepsilon\langle f(\sigma)^{2}\rangle$ . Moreover, it is clear that once $\theta$ and $\delta$ are chosen, $\mathds{1}_{B_{\delta}}$ could be replaced by $\mathds{1}_{B_{\delta^{\prime}}}$ for any $\delta^{\prime}\in(0,\delta)$ , and the rightmost expression will be bounded from above by $\mathds{1}_{B_{\delta^{\prime}}}\varepsilon\langle f(\sigma)^{2}\rangle$ . Taking expectations on both sides yields (4.28).

4.2.3. Proof of Claim 4.6

Assume $q\leq 0$ or $q\geq 1$ . Using Jensen’s inequality, we have

[TABLE]

4.2.4. Proof of Claim 4.7

Assume $q\geq 2$ . By Cauchy–Schwarz and Jensen’s inequality, we have

[TABLE]

4.2.5. Proof of Claim 4.8

Assume $q>0$ . By Jensen’s inequality,

[TABLE]

Recall that $k=\lfloor\log_{2}\frac{n}{qT}\rfloor$ , and we assume $k\geq 1$ . By (4.29),

[TABLE]

which implies

[TABLE]

Repeated applications of Cauchy–Schwarz yield

[TABLE]

By similar manipulations,

[TABLE]

Together, (4.45)–(4.48) yield (4.32).

4.2.6. Proof of Claim 4.9

Assume $q\geq 2$ is even. By Cauchy–Schwarz and Jensen’s inequality, we have

[TABLE]

For any $L>0$ , we have the inequality $(\operatorname{e}^{x}-1)^{q}\leq C(L,q)|x|$ for all $x\leq L$ . Hence

[TABLE]

Assume $L\geq 2\beta Tp^{\prime}(\beta)$ so that whenever

[TABLE]

it follows that

[TABLE]

We thus have

[TABLE]

Combining (LABEL:next_1)–(LABEL:next_3), we have now shown that

[TABLE]

Finally, given $\varepsilon>0$ , we choose $L$ large enough that $\operatorname{e}^{-L}\leq\varepsilon$ , thereby producing (LABEL:bad_prep_var_bound). Then (4.34) is the special case when $f\equiv 1$ . ∎

5. Proof of Theorem 1.4

In this section, we consider perturbations to the environment of the form

[TABLE]

where the ${\boldsymbol{h}}^{(j)}$ ’s are independent copies of ${\boldsymbol{g}}$ . An important observation is that

[TABLE]

We will continue to use $\mathbb{E}$ to denote expectation with respect to ${\boldsymbol{g}}$ and the ${\boldsymbol{h}}^{(k)}$ ’s jointly, whereas $\mathbb{E}_{{\boldsymbol{h}}^{(k)}}$ will denote expectation with respect to ${\boldsymbol{h}}^{(k)}$ conditional on ${\boldsymbol{g}}$ and ${\boldsymbol{h}}^{(j)}$ , $1\leq j\leq k-1$ . As before, all statements involving $\mathbb{E}_{{\boldsymbol{h}}^{(k)}}$ and $\operatorname{Var}_{{\boldsymbol{h}}^{(k)}}$ are to be interpreted as almost sure statements.

As in Section 3, $\langle\cdot\rangle_{\beta}$ will denote expectation with respect to $\mu_{n,{\boldsymbol{g}}}^{\beta}$ . On the other hand, we will write $\llangle\cdot\rrangle_{k}$ to denote expectation under the measure $\mu_{n,{\boldsymbol{g}}^{(k)}}^{\beta}$ , where the dependence on $\beta$ is understood. That is,

[TABLE]

For $\delta>0$ , define the set

[TABLE]

where $\mathcal{A}_{\delta,0}=\mathcal{A}_{\delta}$ is the set under consideration in Theorem 1.4, whose proof will rely on Propositions 5.1 and 5.3 below.

Proposition 5.1.

For any $\delta_{0}>0$ , there exists $n_{0}=n_{0}(\delta_{0})$ such that for all $n\geq n_{0}$ , $k\geq 1$ , and $\delta\geq\delta_{0}$ ,

[TABLE]

Proof.

For any measurable $f:\Sigma_{n}\to[0,1]$ , an application of (5.2), followed by Cauchy–Schwarz and Jensen’s inequality, gives

[TABLE]

So we define the random variable

[TABLE]

and consider, for fixed $\sigma^{1}$ , the function $f_{\sigma^{1}}(\sigma^{2})=0\vee\frac{1}{n}\sum_{i}\varphi_{i}(\sigma^{1})\varphi_{i}(\sigma^{2})$ . By (4.26), $f_{\sigma^{1}}$ is $[0,1]$ -valued, and (• ‣ 1.1) implies

[TABLE]

So the above estimate shows

[TABLE]

In particular, when $n$ is sufficiently large that $\mathscr{E}_{n}\leq\delta$ ,

[TABLE]

We have thus shown $\mathcal{A}_{\delta,k-1}\subset\mathcal{A}_{X\sqrt{\delta},k}$ , which implies

[TABLE]

where in the second inequality we have used the fact that if $\delta_{1}\leq\delta_{2}$ , then $\mathcal{A}_{\delta_{1},k}\subset\mathcal{A}_{\delta_{2},k}$ . To handle the last term in the above display, we note that for any $p\geq 1$ ,

[TABLE]

Now, for any $\theta\in\mathbb{R}$ and any $k\geq 1$ ,

[TABLE]

Hence

[TABLE]

Choosing $t=\delta^{-1/4}$ and $p=4$ , we arrive at

[TABLE]

which holds for all $n$ such that $\mathscr{E}_{n}\leq\delta$ . ∎

Next we consider the event

[TABLE]

where $B_{\delta,0}=B_{\delta}$ is the event under consideration in Theorem 1.5.

Lemma 5.2.

Assume $\beta$ is a point of differentiability for $p(\cdot)$ , and $p^{\prime}(\beta)<\beta$ . For any $\varepsilon>0$ , there is $\delta=\delta(\beta,\varepsilon)>0$ sufficiently small that for any positive constant $K$ , the following is true. If $k(n)\in\{0,1,\dots,K\}$ for all $n$ , then

[TABLE]

Proof.

By Theorem 1.5, there is $\delta>0$ sufficiently small that

[TABLE]

Let us write $\beta_{n}\coloneqq\beta\sqrt{1+\frac{k(n)}{n}}$ , and then observe that

[TABLE]

Since $\sqrt{1+\frac{k(n)}{n}}\leq 1+\frac{k(n)}{n}\leq 1+\frac{K}{n}$ , we have $0\leq\beta_{n}-\beta\leq\frac{\beta K}{n}$ , and thus Lemma 3.12(c) gives

[TABLE]

By Lemma 3.9, the right-hand side above converges to [math] almost surely as $n\to\infty$ . In particular,

[TABLE]

and so (5.3) follows from (5.4) and (5.5). ∎
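The only arithmetic needed in this proof is the bound $0\leq\beta_{n}-\beta\leq\beta K/n$. A short Python sketch confirms it for a few sample values; the parameter values and function name are illustrative.

```python
import math

def perturbed_beta(beta, k, n):
    """beta_n = beta * sqrt(1 + k/n), as defined in the text."""
    return beta * math.sqrt(1.0 + k / n)

beta, K = 1.5, 10
gaps_ok = all(
    0.0 <= perturbed_beta(beta, k, n) - beta <= beta * K / n
    for n in (10, 100, 1000, 10**6)
    for k in range(K + 1)
)
print(gaps_ok)
```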

Proposition 5.3.

Given any $\alpha>0$ , there are positive constants $C_{1}(\alpha,\beta)$ and $C_{2}(\beta)$ such that the following holds for any $\delta_{0}\in(0,1)$ . There exists $n_{0}=n_{0}(\delta_{0})$ so that for every $n\geq n_{0}$ , $k\geq 1$ , and $\delta\in[\delta_{0},1)$ ,

[TABLE]

Proof.

Let $\delta_{0}\in(0,1)$ be given, and take $n_{0}$ such that $\mathscr{E}_{n}\leq\delta_{0}/2$ for all $n\geq n_{0}$ . Consider any $\delta\in[\delta_{0},1)$ , and define the random variables

[TABLE]

Step 1. Show that $X_{1}$ is concentrated at $Y_{1}$ , but $X_{2}$ is not concentrated at $Y_{2}$ when $B_{\alpha,k-1}^{\mathrm{c}}$ occurs.

First observe that for any $\theta\in(-\infty,0]\cup[1,\infty)$ , Jensen’s inequality implies

[TABLE]

In particular, for any $t>\operatorname{e}^{\frac{\beta^{2}}{2}}\geq Y_{2}$ ,

[TABLE]

On the other hand,

[TABLE]

We have the upper bound

[TABLE]

as well as the lower bound

[TABLE]

Meanwhile, we have $\mathscr{E}_{n}\leq\delta_{0}/2\leq\delta/2$ for all $n\geq n_{0}$ . Hence Lemma 4.5(b) implies

[TABLE]

Using (5.9)–(5.11) in (5.8) yields

[TABLE]

So on the event $B_{\alpha,k-1}^{\mathrm{c}}=\{\frac{1}{n}\sum_{i}\llangle\varphi_{i}\rrangle_{k-1}^{2}>\alpha\}$ , (5.12) shows

[TABLE]

for all $n\geq n_{0}$ . Given $\alpha$ and $\beta$ , we fix $t=t(\alpha,\beta)$ large enough such that

[TABLE]

Because of (5.14b), the inequalities (5.7) and (5.13) together yield

[TABLE]

for all $n\geq n_{0}$ .

Step 2. Since $X_{1}\approx Y_{1}$ , obtain an upper bound on the error in the following approximation:

[TABLE]

Simple algebra gives

[TABLE]

and

[TABLE]

Step 3. Since $X_{2}$ is not concentrated at $Y_{2}$ when $B_{\alpha,k-1}^{\mathrm{c}}$ occurs, obtain a lower bound on the gap in the following application of Jensen’s inequality:

[TABLE]

We consider the function $f:(-Y_{1},\infty)\to[0,1]$ given by

[TABLE]

In particular, we consider its Taylor series approximation about $Y_{2}$ ,

[TABLE]

where $\xi_{x}$ belongs to the interval between $x$ and $Y_{2}$ . We note that such an expansion exists because the identity $Y_{1}+Y_{2}=\operatorname{e}^{\frac{\beta^{2}}{2}}$ shows $Y_{2}>-Y_{1}$ . Jensen’s inequality implies

[TABLE]

We will now produce a lower bound on the Jensen gap.

First observe that $f^{\prime\prime}$ is decreasing on $(-Y_{1},\infty)$ . Consequently, if $x\in[Y_{2},t]$ , then $f^{\prime\prime}(\xi_{x})\geq f^{\prime\prime}(x)\geq f^{\prime\prime}(t)$ . Similarly, if $x\leq Y_{2}$ , then $f^{\prime\prime}(\xi_{x})\geq f^{\prime\prime}(Y_{2})\geq f^{\prime\prime}(t)$ . Therefore, for all $n\geq n_{0}$ , we have

[TABLE]

where the second term in the final expression need not depend on $\alpha$ since $Y_{1}/(8t^{3})\leq 1$ .

Step 4. Reckon the final bound.

In summary, for all $n\geq n_{0}$ ,

[TABLE]

∎

Proof of Theorem 1.4.

Let $\varepsilon>0$ be given. From Lemma 5.2, we fix $\alpha=\alpha(\beta,\varepsilon)>0$ so that for any bounded sequence $(k(n))_{n\geq 1}$ of nonnegative integers, we have

[TABLE]

We wish to find $\delta_{*}>0$ , depending only on $\beta$ and $\varepsilon$ , such that $\mathbb{E}\llangle\mathds{1}_{\mathcal{A}_{\delta_{*}}}\rrangle\leq\varepsilon$ .

Let $\delta_{0}\in(0,1)$ , its exact value to be decided later. From Proposition 5.3, we know that for all $n\geq n_{0}=n_{0}(\delta_{0})$ and $\delta\in[\delta_{0},1)$ ,

[TABLE]

And from Proposition 5.1, we can assume

[TABLE]

Linking the two inequalities, we find that

[TABLE]

where now we fix the constants $\mathbf{C}_{1}(\beta,\varepsilon)$ and $\mathbf{C}_{2}(\beta)$ . Note that $\delta_{0}\leq\delta\leq\delta^{1/4}<1$ , and so this reasoning can be iterated. Iterating $K$ times produces the estimate

[TABLE]

which implies the existence of some $k=k(n)\in\{0,1,\dots,K-1\}$ such that

[TABLE]
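Each iteration replaces the threshold $\delta$ by $\delta^{1/4}$, so after $k$ steps starting from $\delta_{0}$ the threshold is $\delta_{0}^{1/4^{k}}$. The sketch below (with an arbitrary $\delta_{0}$ and hypothetical function name) lists these thresholds and illustrates that they increase with $k$ while staying inside $[\delta_{0},1)$, which is what allows the two propositions to be applied repeatedly.

```python
def iterated_thresholds(delta0, K):
    """Return [delta0^(1/4^k) for k = 0, ..., K]."""
    return [delta0 ** (0.25 ** k) for k in range(K + 1)]

thresholds = iterated_thresholds(delta0=1e-6, K=10)
for k, value in enumerate(thresholds):
    print(k, value)
```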

So we take $K=K(\beta,\varepsilon)$ large enough that

[TABLE]

and then choose $\delta_{0}=\delta_{0}(\beta,K)$ small enough that

[TABLE]

We now have, for all $n\geq n_{0}$ ,

[TABLE]

Combining this bound with (5.18), we see that

[TABLE]

To now complete the proof, we must obtain from this result an analogous one with $k=0$ .

As in the proof of Lemma 5.2, we will write $\beta_{n}\coloneqq\beta\sqrt{1+\frac{k}{n}}$ . For $\eta>0$ , define the set

[TABLE]

It follows from (5.1) that

[TABLE]

Since $0\leq\beta_{n}-\beta\leq\frac{\beta K}{n}$ , Lemma 3.12(b) implies

[TABLE]

Denote the right-hand side above by $\Delta_{n}$ . Take $\delta_{*}\coloneqq\frac{1}{2}\delta_{0}\leq\frac{1}{2}\delta_{0}^{1/4^{k}}$ . From the above display, $\mathcal{A}_{\delta_{*},0}\subset\widetilde{\mathcal{A}}_{\delta_{*}+\Delta_{n},k}$ . Hence

[TABLE]

And by Lemma 3.12(a),

[TABLE]

From the previous two displays and (5.22), we have

[TABLE]

Finally, Lemma 3.9 shows that $\Delta_{n}\to 0$ almost surely and in $L^{1}$ as $n\to\infty$ . Consequently, $\limsup_{n\to\infty}\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{\delta_{*},0}}\rangle_{\beta}\leq\varepsilon$ . ∎

6. Proof of equivalence of Theorems 1.3 and 1.4

Theorem 1.3 is implied by Theorem 1.4 once we establish the following result. Recall the definitions (1.4) and (1.6).

Proposition 6.1.

Suppose $H_{n}$ is defined by (• ‣ 1.1), where $(g_{i})_{i=1}^{\infty}$ are i.i.d. random variables with zero mean and unit variance (not necessarily Gaussian). Assume (• ‣ 1.1)–(• ‣ 1.1). Then the following two statements are equivalent:

$\mathrm{(S1)}$

For every $\varepsilon>0$ , there exist integers $k=k(\beta,\varepsilon)$ and $n_{0}=n_{0}(\beta,\varepsilon)$ and a number $\delta=\delta(\beta,\varepsilon)>0$ such that the following is true for all $n\geq n_{0}$ . With $\mathbb{P}$ -probability at least $1-\varepsilon$ , there exist $\sigma^{1},\dots,\sigma^{k}\in\Sigma_{n}$ such that

[TABLE]

$\mathrm{(S2)}$

For every $\varepsilon>0$ , there exists $\delta=\delta(\beta,\varepsilon)>0$ sufficiently small that

[TABLE]

6.1. Proof of $\mathrm{(S2)}\Rightarrow\mathrm{(S1)}$

Let $\varepsilon>0$ be given. By $\mathrm{(S2)}$ , we can choose $\delta>0$ small enough and $n_{0}$ large enough so that for all $n\geq n_{0}$ ,

[TABLE]

It follows from Markov’s inequality that

[TABLE]

Now, by the Paley–Zygmund inequality, for any $j\neq k+1$ ,

[TABLE]

Therefore,

[TABLE]

Choosing $k=\lceil-\delta^{-2}\log(\varepsilon/2)\rceil\vee 0$ , we have

[TABLE]
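One elementary consequence of this choice of $k$ is that $(1-\delta^{2})^{k}\leq\operatorname{e}^{-\delta^{2}k}\leq\varepsilon/2$. The display itself is not reproduced here, so the Python sketch below only checks this consequence of the definition of $k$ for sample values; the function name and the values of $\delta$ and $\varepsilon$ are illustrative.

```python
import math

def choose_k(delta, eps):
    """k = ceil(-delta^{-2} * log(eps / 2)) v 0, as in the text."""
    return max(math.ceil(-math.log(eps / 2.0) / delta ** 2), 0)

delta, eps = 0.2, 0.1
k = choose_k(delta, eps)
geometric_factor = (1.0 - delta ** 2) ** k
print(k, geometric_factor)
```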

Therefore,

[TABLE]

This completes the proof, since

[TABLE]

6.2. Proof of $\mathrm{(S1)}\Rightarrow\mathrm{(S2)}$

We begin with a lemma that roughly states the following. If many random variables each have non-negligible positive correlation with a distinguished variable, then at least one pair of these variables has non-negligible positive correlation.

Lemma 6.2.

For any $\delta\in(0,1]$ , there exists $N_{0}=N_{0}(\delta)$ such that the following holds for any integer $N\geq N_{0}$ and any $\sigma^{0}\in\Sigma_{n}$ . If $\sigma^{1},\dots,\sigma^{N}\in\mathcal{B}(\sigma^{0},\delta)\subset\Sigma_{n}$ , then

[TABLE]

Proof.

Consider the $(N+1)\times(N+1)$ matrix $\mathcal{R}=(\mathcal{R}_{j,k})_{0\leq j,k\leq N}$ , where

[TABLE]

Observe that $\mathcal{R}$ is positive semi-definite: for any ${\boldsymbol{x}}\in\mathbb{R}^{N+1}$ ,

[TABLE]

Now let $\eta\coloneqq 0\vee\max_{1\leq j<k\leq N}\mathcal{R}_{j,k}$ . For ${\boldsymbol{x}}=(1,-x,\dots,-x)\in\mathbb{R}^{1+N}$ with $x\geq 0$ , our assumptions give

[TABLE]

We now take $x=\delta/(1+\eta N)$ to obtain

[TABLE]

Supposing that $\eta<\delta^{2}/2$ , we further see

[TABLE]

which yields a contradiction as soon as $\frac{\delta^{2}N}{1+\delta^{2}N/2}>1$ . ∎
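The contradiction can be traced numerically. Assume, as the argument suggests, that $\mathcal{R}_{j,j}\leq 1$, that $\mathcal{R}_{0,j}\geq\delta$ for $j\geq 1$, and that $\mathcal{R}_{j,k}\leq\eta$ for $1\leq j<k\leq N$. Then the quadratic form at ${\boldsymbol{x}}=(1,-x,\dots,-x)$ is at most $1-2N\delta x+x^{2}N(1+(N-1)\eta)$. The sketch below (function names are hypothetical) evaluates this upper bound at $x=\delta/(1+\eta N)$ and finds the smallest $N$ for which the final inequality of the proof holds.

```python
def quadratic_form_upper_bound(delta, eta, n_points):
    """Upper bound on x^T R x for x = (1, -x, ..., -x), x = delta / (1 + eta * N)."""
    x = delta / (1.0 + eta * n_points)
    diagonal_and_cross = n_points * (1.0 + (n_points - 1) * eta)
    return 1.0 - 2.0 * n_points * delta * x + x * x * diagonal_and_cross

def smallest_contradicting_n(delta):
    """Smallest N with delta^2 N / (1 + delta^2 N / 2) > 1."""
    n = 1
    while delta ** 2 * n / (1.0 + delta ** 2 * n / 2.0) <= 1.0:
        n += 1
    return n

delta = 0.5
n0 = smallest_contradicting_n(delta)
eta = 0.9 * delta ** 2 / 2.0          # any eta below delta^2 / 2
bound = quadratic_form_upper_bound(delta, eta, n0)
print(n0, bound)
```

A negative value of the bound is incompatible with positive semi-definiteness, so for such $N$ some pair must have overlap at least $\delta^{2}/2$.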

We will contrast Lemma 6.2 with the one below, which says that if $\delta$ is small enough, then any non-negligible subset of $\mathcal{A}_{n,\delta}$ has many nearly orthogonal elements.

Lemma 6.3.

For any $\varepsilon_{1},\varepsilon_{2}>0$ and positive integer $N$ , there is $\delta=\delta(\varepsilon_{1},\varepsilon_{2},N)>0$ such that the following holds. If $\mathcal{A}\subset\mathcal{A}_{n,\delta}$ with $\langle\mathds{1}_{\mathcal{A}}\rangle\geq\varepsilon_{1}$ , then there are $\sigma^{1},\dots,\sigma^{N}\in\mathcal{A}$ such that

[TABLE]

Proof.

Set $\delta\coloneqq\varepsilon_{1}\varepsilon_{2}/N$ . Observe that for any $\sigma\in\mathcal{A}$ , we have the following implication:

[TABLE]

Therefore, one can inductively choose

[TABLE]

where (6.3) guarantees that

[TABLE]

Hence $\sigma^{k}\in\mathcal{A}\setminus(\mathcal{B}(\sigma^{1},\varepsilon_{2})\cup\cdots\cup\mathcal{B}(\sigma^{k-1},\varepsilon_{2}))$ can be found so long as $k\leq N$ .

∎
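The inductive choice in this proof is a greedy selection. The following Python sketch illustrates it on a toy example in which states are unit vectors in $\mathbb{R}^{3}$ and the overlap is the Euclidean inner product. Both of these, and the reading of $\mathcal{B}(\sigma,\varepsilon_{2})$ as the set of states with overlap at least $\varepsilon_{2}$ with $\sigma$, are assumptions made for illustration. The code performs only the selection; in the lemma, the guarantee that $N$ points can be found comes from the mass bound, which is not modelled here.

```python
def dot(u, v):
    """Euclidean inner product, standing in for the overlap."""
    return sum(a * b for a, b in zip(u, v))

def greedy_nearly_orthogonal(candidates, eps2, n_target):
    """Pick up to n_target vectors whose pairwise overlaps are all < eps2."""
    chosen = []
    for v in candidates:
        # Accept v only if it lies outside every "ball" around earlier picks.
        if all(dot(v, c) < eps2 for c in chosen):
            chosen.append(v)
        if len(chosen) == n_target:
            break
    return chosen

norm = (1.0 + 0.1 ** 2) ** 0.5
candidates = [
    (1.0, 0.0, 0.0),
    (1.0 / norm, 0.1 / norm, 0.0),  # nearly parallel to the first vector, so it is skipped
    (0.0, 1.0, 0.0),
    (0.0, 0.0, 1.0),
]
selected = greedy_nearly_orthogonal(candidates, eps2=0.5, n_target=3)
print(selected)
```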

We can now complete the proof. Assume that $\mathrm{(S1)}$ holds. Suppose, contrary to $\mathrm{(S2)}$ , that there is some $\varepsilon\in(0,1)$ such that for every $\delta>0$ ,

[TABLE]

Note that for any $n$ such that $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{n,\delta}}\rangle\geq 4\varepsilon$ , we have

[TABLE]

and thus $\mathbb{P}(\langle\mathds{1}_{\mathcal{A}_{n,\delta}}\rangle\geq 2\varepsilon)\geq 2\varepsilon$ .

From $\mathrm{(S1)}$ , we choose $k$ and $\delta$ so that for all $n$ large enough (depending on $\varepsilon$ and $\beta$ ), the following is true with $\mathbb{P}$ -probability at least $1-\varepsilon$ : There exist $\sigma^{1},\dots,\sigma^{k}\in\Sigma_{n}$ such that

[TABLE]

Once $\delta$ has been determined, choose $N$ so that the conclusion of Lemma 6.2 holds. Then, given the values of $k$ and $N$ , choose $\delta^{\prime}$ so that the conclusion of Lemma 6.3 holds with $\varepsilon_{1}=\varepsilon/k$ and $\varepsilon_{2}=\delta^{2}/2$ .

In summary, if $n$ is large enough, and $\mathbb{E}\langle\mathds{1}_{\mathcal{A}_{n,\delta^{\prime}}}\rangle\geq 4\varepsilon$ (by (6.4), there are infinitely many $n$ for which this is the case), the following is true. With $\mathbb{P}$ -probability at least $2\varepsilon-\varepsilon=\varepsilon$ , we have both $\langle\mathds{1}_{\mathcal{A}_{n,\delta^{\prime}}}\rangle\geq 2\varepsilon$ and (6.5) for some $\sigma^{1},\dots,\sigma^{k}\in\Sigma_{n}$ . In this case, we have

[TABLE]

Therefore, there is some $j$ such that

[TABLE]

By our choice of $\delta^{\prime}$ , we can find $\sigma^{1},\dots,\sigma^{N}\in\mathcal{A}_{n,\delta^{\prime}}\cap\mathcal{B}(\sigma^{j},\delta)$ satisfying

[TABLE]

But $\sigma^{1},\dots,\sigma^{N}\in\mathcal{B}(\sigma^{j},\delta)$ , and so the above display contradicts (6.2).

7. Polymer measures are asymptotically non-atomic

In this section we prove that directed polymers on the lattice are asymptotically non-atomic. It is a striking phenomenon that at sufficiently small temperatures, the polymer endpoint distribution places a non-vanishing mass on a single element of $\mathbb{Z}^{d}$ (which is random and varies with $n$ ) [28]. The fact that the polymer measures themselves do not share this property, stated below as Theorem 7.1, justifies the investigation of replica overlap as an order parameter for path localization. To emphasize the fact that the Gaussian environment can be replaced by a general one, we reintroduce notation for directed polymers.

Let $(\omega(i,x):i\geq 1,x\in\mathbb{Z}^{d})$ be a collection of i.i.d. random variables. We will assume that

[TABLE]

and also that

[TABLE]

in order to avoid trivialities. Let $\mathcal{P}_{n}$ denote the set of nearest-neighbor paths of length $n$ in $\mathbb{Z}^{d}$ starting at the origin. Note that $|\mathcal{P}_{n}|=(2d)^{n}$ . To each ${\boldsymbol{x}}=(0,x_{1},\dots,x_{n})$ in $\mathcal{P}_{n}$ we associate the Hamiltonian energy

$$H_{n}({\boldsymbol{x}})\coloneqq\sum_{i=1}^{n}\omega(i,x_{i}).$$

The polymer measure is then defined by

[TABLE]

Theorem 7.1.

Assume (7.1). Then for any $d\geq 1$ and any $\beta\in[0,\infty)$ ,

[TABLE]

The remainder of Section 7 is devoted to proving Theorem 7.1. We begin by defining the passage time,

$$L_{n}\coloneqq\max_{{\boldsymbol{x}}\in\mathcal{P}_{n}}H_{n}({\boldsymbol{x}}).$$

We will denote the set of maximizing paths by

$$\mathcal{M}_{n}\coloneqq\{{\boldsymbol{x}}\in\mathcal{P}_{n}:H_{n}({\boldsymbol{x}})=L_{n}\}.\tag{7.4}$$

It is well-known (for instance, see [39]) that there is a finite constant $\lambda$ such that

[TABLE]

The first equality above is a consequence of the superadditivity of $L_{n}$ , and the second equality leads to a short proof of the following standard fact.

Lemma 7.2.

$\lambda>\mathbb{E}(\omega(i,x))$ .

Proof.

Let ${\boldsymbol{a}}=(1,0,\dots,0)\in\mathbb{Z}^{d}$ and ${\boldsymbol{0}}=(0,\dots,0)\in\mathbb{Z}^{d}$ . Observe that $L_{2}\geq\max\{\omega(1,{\boldsymbol{a}})+\omega(2,{\boldsymbol{0}}),\,\omega(1,-{\boldsymbol{a}})+\omega(2,{\boldsymbol{0}})\}$ , and so

[TABLE]

where the final inequality is strict because $\operatorname{Var}(\omega(i,x)^{2})>0$ . ∎
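To see the strictness in a concrete case, the sketch below computes $\mathbb{E}\max\{\omega,\omega^{\prime}\}$ exactly for two independent fair Bernoulli weights. This distribution is an illustrative choice only, not an assumption of the paper. Since $L_{2}$ dominates $\max\{\omega(1,{\boldsymbol{a}}),\omega(1,-{\boldsymbol{a}})\}+\omega(2,{\boldsymbol{0}})$, the quantity $(\mathbb{E}\max\{\omega,\omega^{\prime}\}+\mathbb{E}\omega)/2$ is a lower bound on $\mathbb{E}(L_{2})/2$.

```python
from fractions import Fraction
from itertools import product

def mean(dist):
    """E[w] for a finite distribution given as {value: probability}."""
    return sum(p * v for v, p in dist.items())

def expected_max_of_two(dist):
    """E[max(w, w')] for independent w, w' with the given finite distribution."""
    return sum(
        p * q * max(v, w)
        for (v, p), (w, q) in product(dist.items(), repeat=2)
    )

bernoulli = {0: Fraction(1, 2), 1: Fraction(1, 2)}
e_max = expected_max_of_two(bernoulli)
lower_bound = (e_max + mean(bernoulli)) / 2
print(e_max, lower_bound, mean(bernoulli))
```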

Definition 7.3.

For a nearest-neighbor path ${\boldsymbol{x}}=(x_{0},x_{1},\dots,x_{n})$ of length $n$ in $\mathbb{Z}^{d}$ , define the turns of ${\boldsymbol{x}}$ to be the following set of indices:

$$T({\boldsymbol{x}})\coloneqq\{1\leq i\leq n-1:x_{i+1}-x_{i}\neq x_{i}-x_{i-1}\}.\tag{7.6}$$

The number of turns of ${\boldsymbol{x}}$ will be denoted $t({\boldsymbol{x}})\coloneqq|T({\boldsymbol{x}})|$ .

Lemma 7.4.

For any $\varepsilon>0$ , there is $\delta=\delta(\varepsilon,d)>0$ small enough that

[TABLE]

Proof.

Given an integer $j$ , $0\leq j\leq n-1$ , we count the elements of $\{{\boldsymbol{x}}\in\mathcal{P}_{n}:t({\boldsymbol{x}})=j\}$ as follows. First, the number of choices for $x_{1}$ is $2d$ . Next, a turn should occur at exactly $j$ of the coordinates $x_{1},\dots,x_{n-1}$ . Moreover, if a turn occurs at $x_{i}$ , then there are $2d-1$ choices for $x_{i+1}-x_{i}$ (so as to avoid $x_{i}-x_{i-1}$ ). Finally, if a turn does not occur at $x_{i}$ , then there is only one choice for $x_{i+1}-x_{i}$ , namely $x_{i}-x_{i-1}$ . Therefore, for any positive integer $k\leq\frac{n-1}{2}$ ,

[TABLE]
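The counting argument gives $2d\binom{n-1}{j}(2d-1)^{j}$ paths with exactly $j$ turns. The Python sketch below checks this against brute-force enumeration for small $n$ and $d$; the function names are hypothetical, and a turn at index $i$ is taken to mean that consecutive increments differ, as in the proof.

```python
from itertools import product
from math import comb

def unit_steps(d):
    """All 2d signed unit vectors in Z^d."""
    steps = []
    for axis in range(d):
        for sign in (1, -1):
            steps.append(tuple(sign if a == axis else 0 for a in range(d)))
    return steps

def brute_force_turn_counts(n, d):
    """counts[j] = number of n-step nearest-neighbor paths with exactly j turns."""
    counts = [0] * n
    for increments in product(unit_steps(d), repeat=n):
        # increments[i] = x_{i+1} - x_i, so a turn at i means increments[i] != increments[i-1].
        turns = sum(1 for i in range(1, n) if increments[i] != increments[i - 1])
        counts[turns] += 1
    return counts

def formula_turn_counts(n, d):
    """Closed-form counts 2d * C(n-1, j) * (2d-1)^j for j = 0, ..., n-1."""
    return [2 * d * comb(n - 1, j) * (2 * d - 1) ** j for j in range(n)]

print(brute_force_turn_counts(5, 2))
print(formula_turn_counts(5, 2))
```

Summing the formula over $j$ recovers $(2d)^{n}$ by the binomial theorem, consistent with $|\mathcal{P}_{n}|=(2d)^{n}$.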

If $k=\lceil\delta n\rceil$ for $\delta\in(0,\frac{1}{2})$ , then Stirling’s approximation gives

[TABLE]

Therefore,

[TABLE]

Now choose $\delta$ sufficiently small that the right-hand side above is strictly less than $\log(1+\varepsilon)$ . Inverting the logarithm and choosing $C$ large enough now yields the desired result. ∎
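For a sense of scale, the sketch below computes the exponential growth rate $\frac{1}{n}\log|\{{\boldsymbol{x}}\in\mathcal{P}_{n}:t({\boldsymbol{x}})\leq\delta n\}|$ exactly from the count $2d\binom{n-1}{j}(2d-1)^{j}$ of paths with $j$ turns, and compares it with $\log(1+\varepsilon)$. The values of $n$, $d$, $\delta$, and $\varepsilon$ are arbitrary illustrative choices, and the function name is hypothetical.

```python
import math

def log_paths_with_at_most_k_turns(n, d, k):
    """log of the number of n-step paths in Z^d with at most k turns."""
    total = sum(
        2 * d * math.comb(n - 1, j) * (2 * d - 1) ** j
        for j in range(k + 1)
    )
    return math.log(total)

n, d, delta, eps = 2000, 2, 0.01, 0.1
k = math.ceil(delta * n)
rate = log_paths_with_at_most_k_turns(n, d, k) / n
print(rate, math.log(1.0 + eps))
```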

Lemma 7.5.

Let $\{(\omega_{i},\omega_{i}^{\prime})\}_{i=1}^{\infty}$ denote a sequence of i.i.d. pairs of independent random variables. For any $\varepsilon>0$ and $\nu>0$ , there exists $D>0$ large enough that

[TABLE]

Proof.

Choose $D>0$ large enough that $p\coloneqq\mathbb{P}(\{|\omega_{i}|\geq D/2\}\cup\{|\omega_{i}^{\prime}|\geq D/2\})$ satisfies $p^{\nu}\leq\varepsilon/2$ . We then have

[TABLE]

∎

Proof of Theorem 7.1.

Let $\omega$ denote a generic copy of $\omega(i,x)$ , and $\bar{\omega}\coloneqq\mathbb{E}(\omega)$ . Set $\kappa\coloneqq(\lambda-\bar{\omega})/2$ , which is positive by Lemma 7.2. By assumption, there is $t>0$ such that $\mathbb{E}(\operatorname{e}^{t\omega})<\infty$ . Take any $s\in(0,t)$ and observe that for any given ${\boldsymbol{x}}\in\mathcal{P}_{n}$ ,

[TABLE]

Using dominated convergence, it is easy to show that

[TABLE]

and so we may choose $s$ sufficiently small that $\operatorname{e}^{-s\kappa}\mathbb{E}(\operatorname{e}^{s(\omega-\bar{\omega})})<1$ . Set $\eta\coloneqq 1-\operatorname{e}^{-s\kappa}\mathbb{E}(\operatorname{e}^{s(\omega-\bar{\omega})})$ , and then choose $\varepsilon>0$ sufficiently small that $(1+\varepsilon)(1-\eta)<1$ . With $\delta$ as in Lemma 7.4, we have the union bound

[TABLE]
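The choices of $s$, $\eta$, and $\varepsilon$ can be made explicit in a special case. The sketch below assumes, purely for illustration, that $\omega$ is standard Gaussian, so that $\mathbb{E}(\operatorname{e}^{s(\omega-\bar{\omega})})=\operatorname{e}^{s^{2}/2}$; the value of $\kappa$ and the rule $\varepsilon=\eta/2$ are likewise hypothetical choices that satisfy the stated constraints.

```python
import math

def chernoff_factor(s, kappa):
    """e^{-s*kappa} * E[e^{s(omega - mean)}] for standard Gaussian omega."""
    return math.exp(-s * kappa + 0.5 * s * s)

kappa = 0.5
s = 0.5                    # any s in (0, 2*kappa) gives a factor below 1 here
eta = 1.0 - chernoff_factor(s, kappa)
epsilon = eta / 2.0        # one admissible choice of epsilon
check = (1.0 + epsilon) * (1.0 - eta)
print(eta, epsilon, check)
```

With $\varepsilon=\eta/2$ one has $(1+\varepsilon)(1-\eta)=1-\eta/2-\eta^{2}/2<1$, so the terms of the union bound are summable in $n$.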

By our choice of $\varepsilon$ , Borel–Cantelli implies that the following statement holds almost surely:

[TABLE]

On the other hand, it is apparent from (7.5) and our choice of $\kappa$ that almost surely, we have $L_{n}>(\bar{\omega}+\kappa)n$ for all large $n$ . For any such $n$ , we then have $H_{n}({\boldsymbol{x}})>(\bar{\omega}+\kappa)n$ for every ${\boldsymbol{x}}\in\mathcal{M}_{n}$ , the set of maximizing paths defined in (7.4). That is, almost surely:

[TABLE]

Together, the two previous displays show that almost surely,

[TABLE]

Recall from (7.6) that $T({\boldsymbol{x}})$ denotes the set of turns in the path ${\boldsymbol{x}}\in\mathcal{P}_{n}$ . For a given ${\boldsymbol{x}}\in\mathcal{P}_{n}$ and $i\in T({\boldsymbol{x}})$ , let ${\boldsymbol{x}}^{(i)}$ denote the unique element of $\mathcal{P}_{n}$ such that $x^{(i)}_{i}\neq x_{i}$ but $x^{(i)}_{j}=x_{j}$ for all $j\neq i$ . That is, $x^{(i)}_{i}-x^{(i)}_{i-1}=x_{i+1}-x_{i}$ while $x^{(i)}_{i+1}-x^{(i)}_{i}=x_{i}-x_{i-1}$ . Upon taking $\varepsilon=1/(4d)$ and $\nu=\delta/3$ in Lemma 7.5, a union bound gives

[TABLE]
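The modified path ${\boldsymbol{x}}^{(i)}$ is obtained by swapping the two increments on either side of $x_{i}$. A minimal Python sketch of this operation follows, with paths represented as lists of integer tuples; the representation and the function name are illustrative choices.

```python
def flip_at_turn(path, i):
    """Return the path equal to `path` except at index i, with the increments around x_i swapped."""
    prev_pt, cur_pt, next_pt = path[i - 1], path[i], path[i + 1]
    step_in = tuple(c - p for p, c in zip(prev_pt, cur_pt))
    step_out = tuple(nx - c for c, nx in zip(cur_pt, next_pt))
    if step_in == step_out:
        raise ValueError("index i is not a turn of the path")
    # New x_i = x_{i-1} + (x_{i+1} - x_i); the step into x_{i+1} then equals x_i - x_{i-1}.
    new_pt = tuple(p + s for p, s in zip(prev_pt, step_out))
    return path[:i] + [new_pt] + path[i + 1:]

path = [(0, 0), (1, 0), (1, 1), (2, 1)]
flipped = flip_at_turn(path, 1)
print(flipped)
```

Only the site visited at time $i$ changes, so $H_{n}({\boldsymbol{x}}^{(i)})-H_{n}({\boldsymbol{x}})$ involves just two environment variables.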

Therefore, we can again apply Borel–Cantelli to see that almost surely,

[TABLE]

Now combining this statement with (7.7), we arrive at the following almost sure event:

[TABLE]

In particular, since $\mathcal{M}_{n}$ has at least one element (call it ${\boldsymbol{y}}$ ), we have the following for all $n\geq n_{4}$ :

[TABLE]

Since $D$ and $\delta$ do not depend on $n$ , (7.3) follows. ∎

8. Acknowledgments

We are grateful to Francis Comets for valuable feedback and discussion, and to the referees for their beneficial comments, suggestions, and edits.

Bibliography

[1] Adler, R. J., and Taylor, J. E. Random fields and geometry. Springer Monographs in Mathematics. Springer, New York, 2007.
[2] Aizenman, M., and Contucci, P. On the stability of the quenched state in mean-field spin-glass models. J. Statist. Phys. 92, 5-6 (1998), 765–783.
[3] Aizenman, M., Lebowitz, J. L., and Ruelle, D. Some rigorous results on the Sherrington-Kirkpatrick spin glass model. Comm. Math. Phys. 112, 1 (1987), 3–20.
[4] Arguin, L.-P., and Zindy, O. Poisson-Dirichlet statistics for the extremes of a log-correlated Gaussian field. Ann. Appl. Probab. 24, 4 (2014), 1446–1481.
[5] Auffinger, A., and Chen, W.-K. On properties of Parisi measures. Probab. Theory Related Fields 161, 3-4 (2015), 817–850.
[6] Auffinger, A., and Chen, W.-K. On concentration properties of disordered Hamiltonians. Proc. Amer. Math. Soc. 146, 4 (2018), 1807–1815.
[7] Auffinger, A., Chen, W.-K., and Zeng, Q. The SK model is infinite step replica symmetry breaking at zero temperature. Comm. Pure Appl. Math. 73, 5 (2020), 921–943.
[8] Auffinger, A., and Louidor, O. Directed polymers in a random environment with heavy tails. Comm. Pure Appl. Math. 64, 2 (2011), 183–204.
