Localization in Gaussian disordered systems at low temperature
Erik Bates, Sourav Chatterjee

TL;DR
This paper demonstrates that in Gaussian disordered systems at low temperatures, the Gibbs measure concentrates around a few states, providing new insights into localization phenomena in spin glasses and directed polymers.
Contribution
It introduces a unified argument showing localization in Gaussian disordered systems, enabling results on path localization and Gibbs state exhaustiveness without relying on traditional identities.
Findings
Gibbs measure localizes in small neighborhoods of few states
Path localization for directed polymers achieved without exact solvability
Gibbs states are exhaustive in spin glasses without Ghirlanda-Guerra identities
Abstract
For a broad class of Gaussian disordered systems at low temperature, we show that the Gibbs measure is asymptotically localized in small neighborhoods of a small number of states. From a single argument, we obtain (i) a version of "complete" path localization for directed polymers that is not available even for exactly solvable models; and (ii) a result about the exhaustiveness of Gibbs states in spin glasses not requiring the Ghirlanda-Guerra identities.
Localization in Gaussian disordered systems at low temperature
Erik Bates
Department of Mathematics
University of California, Berkeley
1067 Evans Hall
Berkeley, CA 94720-3840
and
Sourav Chatterjee
Department of Statistics
Stanford University
Sequoia Hall, 390 Jane Stanford Way
Stanford, CA 94305
Key words and phrases:
Replica overlap, Gaussian disorder, spin glasses, directed polymers, path localization
2010 Mathematics Subject Classification:
60G15, 60G17, 60K37, 82B44, 82D30, 82D60
E.B. was partially supported by NSF grants DGE-114747 and DMS-1902734
S.C. was partially supported by NSF grant DMS-1608249.
1. Introduction
A ubiquitous theme in statistical mechanics is to understand how a system behaves differently at high and low temperatures. In a disordered system, where the interactions between its elements are governed by random quantities, the strength of the disorder is determined by temperature. Namely, high temperatures mean the disorder is weak, and the system is likely to resemble a generic one based on entropy. On the other hand, low temperatures indicate strong disorder, which creates dramatically different behavior in which the system is constrained to a small set of states that are energetically favorable. In the latter case, this concentration phenomenon is often called “localization”.
A useful statistic in distinguishing different temperature regimes is the so-called “replica overlap”. That is, given the disorder, one can study the similarity of two independently observed states. If the disorder is strong, then these two states should closely resemble one another with good probability, since we believe the system is bound to a relatively small number of possible realizations. Some version of this statement has been rigorously established in a number of contexts, most famously in spin glass theory but also in the settings of disordered random walks and disordered Brownian motion. Unfortunately, it does not follow that the number of realizable states is small, but only that there is a small number of states that are observed with positive probability.
In the present study, our entry point to this problem is to consider conditional overlap. Whereas previous results in the literature show the overlap distribution between two independent states has a nonzero component, we ask whether the same is true even if one conditions on the first state. That is, does a typical state always have positive expected overlap with an independent one? We show that for a broad class of Gaussian disordered systems, the answer is yes, the key implication being that the entire realizable state space is small. Specifically, there is an O(1) number of states such that all but a negligible fraction of samples from the system will have positive overlap with one of these states.
The general setting, notation, motivation, and results are given in Sections 1.1–1.4, respectively. The consequences for spin glasses, directed polymers, and other Gaussian fields are discussed in Sections 1.5 and 1.6.
1.1. Model and assumptions
Let be an abstract probability space, and a sequence of Polish spaces equipped respectively with probability measures . For each , we consider a centered Gaussian field indexed by and defined on . Viewing this field as a Hamiltonian, we have the associated Gibbs measure at inverse temperature :
[TABLE]
Our results concern the relationship between the free energy,
[TABLE]
and the covariance structure of . We make the following assumptions:
- There is a deterministic function such that
[TABLE]
- For every ,
[TABLE]
- For every ,
[TABLE]
where is a nonnegative constant tending to 0 as .
- For each , there exist measurable real-valued functions on and i.i.d. standard normal random variables defined on such that for each , with -probability ,
[TABLE]
where the series on the right converges in .
Remark 1.1.
In all applications of interest (see Section 1.5), the third assumption of Section 1.1 (asymptotically nonnegative correlations) is trivially satisfied with . Nevertheless, we assume throughout only that (at any rate). This modest relaxation is made so our results can apply to slightly more general models, for instance perturbations of the standard models we will soon describe.
Remark 1.2.
The fourth assumption of Section 1.1 (the series representation) is very mild: for example, it always holds when is finite. More generally, a sufficient condition for the existence of such a representation is that is compact in the metric defined by (namely, the metric that defines the distance between and as the distance between the random variables and ). For a proof of this standard result, see [1, Theorem 3.1.1]. Furthermore, in all applications of interest, will actually be explicitly defined using a sum of this form.
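To make the setting concrete, here is a minimal Python sketch of a Hamiltonian built from a finite sum of i.i.d. standard normals times fixed coefficient functions, together with its Gibbs weights and free energy. Everything specific here is an assumption for illustration: the two-level tree structure in `build_phi`, the sizes `n` and `K`, the uniform reference measure, and the conventions that the Gibbs density is proportional to exp(beta * H) and that the free energy is (1/n) log Z.

```python
import math
import random

def build_phi(n, K):
    """Coefficient rows for a hypothetical two-level tree with states (a, b), 0 <= a, b < K.

    Each row has squared norm n, so Var H(state) = n, and two states that share
    the first coordinate a have covariance n / 2 (otherwise 0).
    """
    c = math.sqrt(n / 2.0)
    rows = []
    for a in range(K):
        for b in range(K):
            row = [0.0] * (K + K * K)
            row[a] = c                # shared by all states with the same a
            row[K + a * K + b] = c    # private to the state (a, b)
            rows.append(row)
    return rows

def hamiltonian(phi, g):
    """H(state) = sum_i g[i] * phi[state][i] (finite-sum representation)."""
    return [sum(gi * ci for gi, ci in zip(g, row)) for row in phi]

def gibbs_weights(H, beta, ref):
    """Return (Gibbs probabilities, log Z), shifting by the maximum for stability."""
    m = max(beta * h for h in H)
    unnorm = [p * math.exp(beta * h - m) for p, h in zip(ref, H)]
    total = sum(unnorm)
    return [u / total for u in unnorm], m + math.log(total)

rng = random.Random(0)
n, K = 8, 4
phi = build_phi(n, K)
g = [rng.gauss(0.0, 1.0) for _ in range(K + K * K)]
ref = [1.0 / len(phi)] * len(phi)      # uniform reference measure (assumption)
H = hamiltonian(phi, g)
weights, log_Z = gibbs_weights(H, 1.5, ref)
free_energy = log_Z / n
```

In this toy example the covariances are nonnegative by construction, so it sits inside the class of models considered here, but it is far too small to show any asymptotic behavior.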
1.2. Notation
Unless stated otherwise, “almost sure” and “in ” statements are with respect to . We will use and to denote expectation with respect to and , respectively. Absent any decoration, will always denote expectation with respect to , meaning
[TABLE]
At various points in the paper, we will decorate to denote expectation with respect to some perturbation of . The type of perturbation will change between sections. The symbols , , shall denote independent samples from if appearing within , or from if appearing within . We will refer to the vector as the disorder or random environment. Sometimes we will consider multiple environments at the same time, which will necessitate that we write instead of to emphasize the dependence on the environment .
In the sequel, will always mean , and we will condense our notation to when we are dealing with some fixed . Similarly, will be shortened to and will be shortened to . Also, will indicate a positive constant that depends only on the argument(s). In particular, no such constant depends on or . We will not concern ourselves with the precise value, which may change from line to line.
1.3. Motivation
Our results will be stated in terms of the correlation or overlap function,
[TABLE]
Note that the second and third assumptions of Section 1.1 imply
[TABLE]
We will often abbreviate to .
The Gaussian process naturally defines a (pseudo)metric on , given by
[TABLE]
Given the metric topology, we can study the so-called “energy landscape” of on . The geometry of this landscape is intimately related to the free energy. By Jensen’s inequality,
[TABLE]
which in particular implies . In general, whether or not this inequality is strict determines the nature of the energy landscape: In order for , the fluctuations of must be relatively small so that the Jensen gap in (1.2) is . This behavior arises when the Gaussian deviations of are washed out by the entropy of , creating a more or less flat landscape. On the other hand, if , then these deviations will have overcome the entropy of , producing large peaks and valleys where is exceptionally positive or negative. From a physical perspective, this latter scenario is more interesting, as these peaks can account for an exponentially vanishing fraction of the state space even as their union accounts for a non-vanishing fraction of the mass of . The primary goal of this paper is to give a sufficient condition for when (in a sense Theorem 1.3 makes precise) places all of its mass on this union of peaks.
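The Jensen bound can be checked numerically in the simplest case. The sketch below is an illustration only, assuming a Random Energy Model with 2^n independent N(0, n) energies and uniform reference measure, for which the annealed value (1/n) log E Z equals beta^2 / 2. The sample sizes and the helper name `rem_free_energy` are hypothetical choices.

```python
import math
import random

def rem_free_energy(n, beta, rng):
    """(1/n) log Z for one sample: 2^n independent N(0, n) energies, uniform reference measure."""
    N = 2 ** n
    sd = math.sqrt(n)
    scaled = [beta * rng.gauss(0.0, sd) for _ in range(N)]
    m = max(scaled)
    # log of the average of exp(scaled), computed stably
    log_Z = m + math.log(sum(math.exp(e - m) for e in scaled)) - math.log(N)
    return log_Z / n

def annealed(beta):
    """(1/n) log E Z when Var H = n and the reference measure is a probability measure."""
    return beta ** 2 / 2.0

rng = random.Random(1)
n, samples = 10, 50
results = {}
for beta in (0.5, 2.0):
    avg = sum(rem_free_energy(n, beta, rng) for _ in range(samples)) / samples
    results[beta] = avg
    print(f"beta={beta}: average free energy {avg:.4f}, annealed bound {annealed(beta):.4f}")
```

At the smaller value of beta the two numbers should nearly agree (a flat landscape), while at the larger value the average free energy should fall visibly below the annealed bound, which is the regime of interest in this paper.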
Suppose that is differentiable at . Using Gaussian integration by parts, it is not difficult to show (as we do in Corollary 3.10) that
[TABLE]
This identity has been observed before (e.g. see [3, 27, 63, 47], [19, Lemma 7.1], and [24, Theorem 6.1]). For this reason, the condition in which we are interested is . To improve upon (1.3), a first step is to show that if is bounded away from 0, then the random variable is itself stochastically bounded away from 0. This is the content of Theorem 1.5. The more substantial contribution of this paper, however, is to bootstrap this result to a proof of Theorem 1.4, which roughly says that is stochastically bounded away from 0 even conditional on .
It follows from Corollary 3.10 that implies , but it is natural to ask whether the two conditions are equivalent. This equivalence is true for spin glasses [63, 47] and is believed to be true for directed polymers [24, Conjecture 6.1]. But at the level of generality considered in this paper, we are not aware of any conjecture. In any case, for the examples we consider in Section 1.5, both conditions will be true for sufficiently large .
1.4. Results
Our main result is Theorem 1.3, stated below. It says that at low temperatures, one can find a finite number of (random) states such that almost any sample from the Gibbs measure will have positive overlap with at least one of them. To state this precisely, let us define the sets
[TABLE]
In terms of the metric defined in (1.1), this is just the ball of radius centered at . Typically, such balls have vanishingly small size under as , which should be contrasted with the following behavior of the Gibbs measure.
Theorem 1.3.
Suppose the four assumptions of Section 1.1 hold. If is a point of differentiability for , and , then for every , there exist integers and and a number such that the following is true for all . With -probability at least , there exist such that
[TABLE]
It is worth noting that in some cases, such as the directed polymer model defined in Section 1.5.2, it is possible (although unproven) that can be taken equal to if is chosen sufficiently small. For other models, however, such as polymers on trees or the Random Energy Model discussed in Section 1.6, will necessarily diverge as .
We will derive Theorem 1.3 as a corollary of Theorem 1.4, stated below. In fact, Theorem 1.3 is actually equivalent to Theorem 1.4, although the latter has a less transparent statement, which is why we have stated Theorem 1.3 as our main result.
Theorem 1.4 concerns the following function on . For given , we will write the conditional expectation of as
[TABLE]
(Note that the expectation can be exchanged with the sum because of Fubini’s theorem, in light of (• ‣ 1.1).) Given , we consider the set
[TABLE]
With this notation, the quantity is the probability that a state sampled from has expected overlap at most with an independent sample from . Theorem 1.4 says that at low temperatures and for small , this probability is typically small.
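For a finite state space, the quantities in this paragraph are simple weighted sums. The following Python sketch is a hedged illustration: the function names, the three-state example, and the convention that the low-overlap set uses a non-strict inequality (conditional overlap at most delta) are assumptions, and `R` is taken to be a given symmetric overlap matrix.

```python
def conditional_overlap(weights, R):
    """For each state s, the Gibbs average of R[s][s'] over an independent sample s'."""
    return [sum(w * r for w, r in zip(weights, row)) for row in R]

def low_overlap_mass(weights, R, delta):
    """Gibbs mass of the states whose conditional expected overlap is at most delta."""
    cond = conditional_overlap(weights, R)
    return sum(w for w, c in zip(weights, cond) if c <= delta)

def expected_overlap(weights, R):
    """Gibbs average of the overlap between two independent samples."""
    cond = conditional_overlap(weights, R)
    return sum(w * c for w, c in zip(weights, cond))

# Hypothetical example: two heavy, correlated states and one light, isolated state.
weights = [0.6, 0.39, 0.01]
R = [[1.0, 0.5, 0.0],
     [0.5, 1.0, 0.0],
     [0.0, 0.0, 1.0]]
cond = conditional_overlap(weights, R)
mass = low_overlap_mass(weights, R, 0.05)
```

In this toy example the expected overlap between two samples is large, yet the third state has conditional overlap 0.01. Theorem 1.4 asserts that the total Gibbs mass of such states is typically small.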
Theorem 1.4.
Suppose the four assumptions of Section 1.1 hold. If is a point of differentiability for , and , then for every , there exists sufficiently small such that
[TABLE]
To prove Theorem 1.4, we first have to prove a weaker theorem stated below. This result considers the following event in the -algebra ,
[TABLE]
and shows that its probability is small at low temperature.
Theorem 1.5.
Suppose the four assumptions of Section 1.1 hold. If is a point of differentiability for , and , then for every , there exists sufficiently small such that
[TABLE]
Theorem 1.5 is proved in Section 4, Theorem 1.4 in Section 5, and the equivalence of Theorems 1.3 and 1.4 in Section 6. In Section 3, we provide some general facts that are needed in the main arguments. A detailed sketch of the proof technique is given in Section 2. We will often simplify notation by writing and , where the dependence on is understood and will not be a source of confusion.
1.5. Applications
For many applications, it would suffice to consider which is finite for every . Other applications, however, such as spherical spin glasses or directed polymers with a reference walk of unbounded support, require to be infinite. It is for this reason that we have stated the setting and results in the generality seen above. Now we discuss specific models of interest.
1.5.1. Spin glasses
Let (Ising case) or (spherical case), and take to be uniform measure on . In the mean-field models, the Hamiltonian is of the form
[TABLE]
We will assume
[TABLE]
which is more restrictive than what we require but standard in the literature. Standard applications of Gaussian concentration show that almost surely and in . The first assumption of Section 1.1 then follows from the convergence of , where is given by a formula depending on the model. In the Ising case, there is the celebrated Parisi formula [53, 54], proved by Talagrand [62] for even-spin models, building on the seminal work of Guerra [40]. It was later extended by Panchenko [51] to general mixed p-spin models. For the spherical model, there is a simpler and elegant formula predicted by Crisanti and Sommers [32], and proved by Talagrand [61] and Chen [22].
To accommodate the second and third assumptions of Section 1.1, one should assume the function satisfies
[TABLE]
This is because
[TABLE]
Note that the second assumption in (1.11) is automatic if for all odd . When , (1.9) is the classical Sherrington–Kirkpatrick (SK) model [57] if , or the spherical SK model [44] if .
In the spin glass literature, is the usual replica overlap that is studied as an order parameter for the system [59]. Roughly speaking, converges to 0 when , but converges in law to a non-trivial distribution when . In the latter case, the model exhibits what is known as replica symmetry breaking (RSB). If the limiting distribution of , called the Parisi measure, contains k+1 distinct atoms (one of which must be 0 [5]), then is said to be kRSB. For instance, spherical pure p-spin models are 1RSB for large [52], and it was recently shown that some spherical mixed spin models are 2RSB at zero temperature [9]. In the Ising case, however, the Parisi measure is expected to have an infinite support throughout the low-temperature phase (with 0 in the support but not as an atom; see [17, Page 15]), a behavior referred to as full-RSB (FRSB). Proving such a statement is a problem of great interest and has been solved at zero temperature [7]. For spherical models, the situation is somewhat clearer; in [23], sufficient conditions were given for both 1RSB and FRSB, again at zero temperature.
The simplest type of symmetry breaking, 1RSB, admits the following heuristic picture. The state space is (from the perspective of ) separated into many orthogonal parts called “pure states”, within which the intra-cluster overlap concentrates on some positive value . In the 2RSB picture, the pure states are not necessarily orthogonal, but rather grouped together into larger clusters which are themselves orthogonal. In this case, the overlap could be (same pure state), (same cluster but different pure state), or 0 (different clusters). The complexity increases in the same fashion for general kRSB. In FRSB, the clusters become infinitely nested, yielding a continuous spectrum of possible overlaps while maintaining “ultrametric” structure [49]. In any case, though, there should be asymptotically no part of the state space which is orthogonal to everything; that is, the pure states exhaust .
Absent the intricate hierarchical picture described above, the following rephrasing of Theorem 1.3 confirms this idea.
Theorem 1.6.
Assume (1.10) and (1.11), and that is a point of differentiability for such that . Then for every , there exist integers and and a number such that the following is true for all . With -probability at least , there exist such that
[TABLE]
The proof of the above theorem follows simply from Theorem 1.3 and the observation that by (1.10), is continuous at 0.
Under strong assumptions on and the overlap distribution, namely the (extended) Ghirlanda–Guerra identities, much more precise results were proved by Talagrand [64, Theorem 2.4] and later Jagannath [42, Corollary 2.8]. For spherical pure spin models, similar results were proved by Subag [58, Theorem 1]. An advantage of our approach, beyond its generality, is that our assumptions on are elementary to check and fairly loose (they include all even spin models), and the temperature condition is explicit and sharp.
While the literature on replica overlaps in spin glasses is vast, the reader will find much information in [45, 65, 66, 50]; see also [43] and references therein.
1.5.2. Directed polymers
Given a positive integer , let be the set of all maps from into , and let be the law, projected onto , of a homogeneous random walk on starting at the origin. That is, there is some probability mass function on such that
[TABLE]
Let be i.i.d. standard normal random variables. The Hamiltonian for the model of directed polymers in Gaussian environment is then given by
[TABLE]
In this case, the overlap between two paths is the fraction of time they intersect:
[TABLE]
The first assumption of Section 1.1 (convergence of the free energy) holds for any [14, Section 2], although typically is taken to be standard simple random walk; all the references below refer to this case. Alternatively, one can consider point-to-point polymer measures, meaning the endpoint of the polymer is fixed. This case is studied in [55, 39] and accommodates the same structure as above, up to changing the reference measure .
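The path overlap described above is easy to compute directly. In the following minimal Python sketch, a path is represented as the list of its positions at times 1, ..., n (an assumed encoding; the starting point at the origin is not stored), and `path_overlap` is a hypothetical helper name.

```python
def path_overlap(path_a, path_b):
    """Fraction of the times 1..n at which the two paths occupy the same site."""
    if len(path_a) != len(path_b):
        raise ValueError("paths must have the same length")
    n = len(path_a)
    return sum(1 for x, y in zip(path_a, path_b) if x == y) / n

# Two nearest-neighbor paths on the integers, both started from the origin.
a = [1, 2, 1, 0]
b = [1, 0, 1, 2]
overlap = path_overlap(a, b)   # the paths meet at times 1 and 3
```

For paths in higher dimensions the same function applies if each position is stored as a tuple of coordinates.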
Notice that the identity (1.3) immediately implies when . Theorem 1.5 goes a step further, showing that the random variable is itself stochastically bounded away from 0. For a certain class of bounded random environments, a quantitative version of Theorem 1.5 was proved by Chatterjee [21], but Theorem 1.4 is the first of its kind. Unlike some other conjectured polymer properties, the statement (1.7) has not been verified for the so-called exactly solvable models in [56, 31, 46, 13, 67]. For heavy-tailed environments, a stronger notion of localization is considered in [8, 68] and also discussed in [37, 16]. Historically, studying pathwise localization has found somewhat greater success in the context of continuous space-time polymer models [29, 30, 26, 25].
For polymers in Gaussian environment, it is known (see [24, Proposition 2.1(iii)]) that is bounded from above by a constant, and so as by (1.3). (While convexity guarantees is differentiable almost everywhere, it is an open problem to show that is everywhere differentiable, let alone analytic away from the critical value separating the high and low temperature phases.) In this sense, the polymer measure becomes completely localized near the maximizer of as . A main motivation for the present study was to formulate a version of “complete localization” for fixed in the low-temperature regime.
In [69, 15], complete localization was phrased in terms of the endpoint distribution: the law of under . Loosely speaking, what was shown is that if , then with probability at least , one can find sufficiently many (independent of ) random vertices in so that
[TABLE]
This behavior is called “asymptotic pure atomicity”, referring to the fact that even as grows large, the endpoint distribution remains concentrated on an O(1) number of sites (rather than diffuse polynomially as in simple random walk). This is analogous to the results of this paper, except that the endpoint statistic has been used to reduce the state space to . The pathwise localization in Theorem 1.3 describes a more global phenomenon occurring in the original state space . Rephrased below, it says that up to arbitrarily small probabilities, the Gibbs measure is concentrated on paths intersecting one of a few distinguished paths a positive fraction of the time.
Theorem 1.7.
Assume (1.12) and that is a point of differentiability for such that . Then for every , there exist integers and and a number such that the following is true for all . With -probability at least , there exist paths such that
[TABLE]
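The endpoint distribution discussed before Theorem 1.7 can be computed exactly for small systems by a transfer-matrix recursion. The sketch below is illustrative only and assumes the 1+1 dimensional case with simple random walk as the reference walk, an environment stored as a dictionary `env[(i, x)]` of i.i.d. standard normals, and a Hamiltonian equal to the sum of the environment along the path. The names `endpoint_distribution` and `top_mass` are hypothetical.

```python
import math
import random

def endpoint_distribution(n, beta, env):
    """Gibbs law of the endpoint at time n, as a dict mapping site -> probability.

    The weights are renormalized after every step to avoid overflow; this only
    changes a global constant, so the law at the final time is unaffected.
    """
    z = {0: 1.0}
    for i in range(1, n + 1):
        new = {}
        for x, w in z.items():
            for y in (x - 1, x + 1):              # simple random walk step
                new[y] = new.get(y, 0.0) + 0.5 * w
        for y in new:
            new[y] *= math.exp(beta * env[(i, y)])
        total = sum(new.values())
        z = {y: w / total for y, w in new.items()}
    return z

def top_mass(dist, k):
    """Total probability of the k most likely endpoint sites."""
    return sum(sorted(dist.values(), reverse=True)[:k])

rng = random.Random(2)
n = 6
env = {(i, x): rng.gauss(0.0, 1.0) for i in range(1, n + 1) for x in range(-n, n + 1)}
free_walk = endpoint_distribution(n, 0.0, env)   # beta = 0 recovers the reference walk
polymer = endpoint_distribution(n, 3.0, env)
```

Comparing `top_mass(free_walk, 2)` with `top_mass(polymer, 2)` for larger `n` gives a rough feel for endpoint concentration, although a system this small says nothing rigorous about the asymptotic statements.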
In Section 7, we demonstrate that path localization does not occur in the atomic sense (1.14). That is, any bounded number of paths will have a total mass under that decays to 0 as . For this reason, the definitions from [69, 15] of complete localization for the endpoint are inadequate for path localization, necessitating a statement in terms of overlap. This distinguishes the lattice polymer model from its mean-field counterpart on regular trees, which is simply the statistical mechanical version of branching random walk [36, 24]. For those models, the endpoint distribution on the leaves of the tree is obviously equivalent to the Gibbs measure because each leaf is the termination point of a unique path. Moreover, the results of [15] can be interpreted equally well (and improved upon) in that setting (see [12, 41]), and so we will not elaborate on the fact that polymers on trees also fit into the framework of this paper.
1.6. Other Gaussian fields
Here we mention several other models to which our results apply but for which they are not new. Indeed, each model below is known to exhibit Poisson–Dirichlet statistics for the masses assigned by to the “peaks” discussed in the motivating Section 1.3. In particular, asymptotically no mass is given to states having vanishing expected overlap with an independent sample.
- Derrida’s Random Energy Model (REM) [33, 34] is set on the hypercube with uniform measure, and has the simplest possible covariance structure: . With , the following formula holds [18, Theorem 9.1.2]:
[TABLE]
See also [60, Chapter 1], in particular Theorem 1.2.1.
- The generalized random energy models have non-trivial covariance structure [35], and can be tuned to have an arbitrary number of phase transitions. The condition is satisfied as soon as the first phase transition occurs. See also [18, Chapter 10].
- Finally, in [4] Arguin and Zindy studied a discretization of a log-correlated Gaussian field from [11, 10] which has the same free energy as the REM. Their particular model had the technical complication of correlations not following a tree structure, unlike for instance the discrete Gaussian free field.
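For the first of these models, the overlap picture can be explored by direct simulation. The Python sketch below rests on stated assumptions: energies at distinct states of the REM are independent N(0, n) variables, so the overlap of two states is 1 if they coincide and 0 otherwise, and the expected overlap of two independent Gibbs samples reduces to the sum of squared Gibbs weights. The function names and parameter values are hypothetical.

```python
import math
import random

def rem_gibbs_weights(n, beta, rng):
    """Gibbs weights for one REM sample on 2^n states with uniform reference measure."""
    N = 2 ** n
    sd = math.sqrt(n)
    H = [rng.gauss(0.0, sd) for _ in range(N)]
    m = max(H)
    unnorm = [math.exp(beta * (h - m)) for h in H]   # shifted by the maximum for stability
    total = sum(unnorm)
    return [u / total for u in unnorm]

def replica_overlap(weights):
    """Probability that two independent Gibbs samples coincide."""
    return sum(w * w for w in weights)

def top_mass(weights, k):
    """Total Gibbs mass of the k heaviest states."""
    return sum(sorted(weights, reverse=True)[:k])

rng = random.Random(4)
n = 10
flat = rem_gibbs_weights(n, 0.0, rng)
cold = rem_gibbs_weights(n, 3.0, rng)
print("overlap at beta=0:", replica_overlap(flat))
print("overlap at beta=3:", replica_overlap(cold), "top-5 mass:", top_mass(cold, 5))
```

At beta = 0 the overlap equals 2^(-n) exactly, while at large beta a handful of states typically carry most of the mass, in line with the Poisson–Dirichlet picture mentioned above.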
1.7. Open problems
There are a number of open questions which, if solved, would enhance the theory presented in this paper. A partial list is the following.
(1) Understand conditions under which the number of localizing regions is exactly one. As mentioned before, this requires more conditions than the four assumptions of Section 1.1, because it does not hold for some models (such as REM), whereas it is supposed to hold for many others.
(2) A close cousin of the above problem is to understand conditions under which is itself guaranteed to be away from zero with high probability. This would have important implications about the FRSB picture in mean-field spin glasses and path localization in directed polymers.
(3) Obtain a good quantitative bound on in terms of in Theorem 1.4. Our proof gives a very poor bound, since it is based on an iterative argument similar to those used in extremal combinatorics (see the proof sketch in Section 2.2).
(4) For directed polymers, prove a stronger theorem about path localization that says a typical path localizes within a narrow neighborhood of one or more fixed paths, rather than saying that a typical path has nonzero intersection with one or more fixed paths.
(5) Prove more general versions of Theorems 1.3, 1.4 and 1.5 that do not require the third assumption of Section 1.1, which guarantees asymptotically nonnegative correlations. This would allow the theory to include other models of interest, such as the Edwards–Anderson model [38] of lattice spin glasses. It is important to note, however, that the hypotheses and conclusions of these more general theorems may require adjustment in order to be physically meaningful.
(6) For any finite , prove estimates that stochastically bound away from . More ambitiously, determine conditions which guarantee that concentrates around its expectation as .
(7) Even when the spin glass correlation function takes negative values (recall that ), it is possible for the Gibbs measure to concentrate on a set such that . This is Talagrand’s positivity principle and is known to hold when the extended Ghirlanda–Guerra identities are satisfied; see [66, Section 12.3] or [50, Section 3.3]. Perhaps the methods of this paper can be adapted to use this input rather than the condition .
2. Proof sketches
The proofs of Theorems 1.4 and 1.5 are long, but they contain ideas that may be useful for other problems. Therefore, we have included this proof-sketch section which, while still rather lengthy, distills the arguments to their central ideas. It introduces some of the notations that will be used later in the manuscript; however, these notations will be reintroduced in the later sections, so it is safe to skip directly to Section 3 should the reader decide to do so.
2.1. Proof sketch of Theorem 1.5
For simplicity, let us assume that the series representation in the fourth assumption of Section 1.1 consists of only finitely many terms:
[TABLE]
Following the argument described below, the general case is handled by some routine calculations (made in Section 3.1) to check that sending poses no issues.
Given (1.3), it is clear that would imply (1.8) if we knew that concentrates around its mean as . Unfortunately, this may not be true in general. Therefore, as a way of artificially imposing concentration, we let the environment evolve as an Ornstein–Uhlenbeck (OU) flow, and then eventually take an average over a short time interval. Formally, this means we consider
[TABLE]
where are independent Brownian motions that are also independent of . Recall the OU generator , and the fact that for any with suitable regularity. By expanding in an orthonormal basis of eigenfunctions of , and expressing both and using the coefficients from this expansion, one can show that
[TABLE]
This inequality, established in Lemma 4.3, provides the proof’s essential estimate when applied to . For this , it is easy to verify that , and
[TABLE]
where and are the expected overlap and free energy, respectively, in the environment . Moreover, from standard methods (worked out in Section 3.2), it follows that with high probability. Combining these observations about with the general variance estimate (2.2), we arrive at
[TABLE]
In other words, averaging over a long enough interval, but whose size is still , results in a value close to the expectation suggested by (1.3). We choose large enough depending on , which determines the level of precision required in (2.3).
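The environment flow used in this argument can be simulated coordinate by coordinate. The Python sketch below assumes the standard stationary Ornstein–Uhlenbeck dynamics, for which an exact step of length dt is g -> exp(-dt) * g + sqrt(1 - exp(-2 dt)) * xi with xi a fresh standard normal. The helper names, the left Riemann sum in `time_average`, and all parameter values are assumptions for illustration.

```python
import math
import random

def ou_step(g, dt, rng):
    """Advance every coordinate of the environment by time dt (exact in law)."""
    a = math.exp(-dt)
    b = math.sqrt(1.0 - a * a)       # keeps the standard normal law stationary
    return [a * x + b * rng.gauss(0.0, 1.0) for x in g]

def time_average(f, g0, T, steps, rng):
    """Left Riemann sum approximation of (1/T) * integral_0^T f(g_t) dt."""
    dt = T / steps
    g = list(g0)
    total = 0.0
    for _ in range(steps):
        total += f(g) * dt
        g = ou_step(g, dt, rng)
    return total / T

# Sanity check: after time t the flow has correlation exp(-t) with its
# starting point, and each coordinate still has variance 1.
rng = random.Random(3)
N = 20000
g0 = [rng.gauss(0.0, 1.0) for _ in range(N)]
g = g0
for _ in range(10):                   # ten steps of length 0.1, total time 1
    g = ou_step(g, 0.1, rng)
corr = sum(x * y for x, y in zip(g0, g)) / N
var = sum(y * y for y in g) / N
```

In the proof sketch, `f` would be a statistic of the Gibbs measure in the evolving environment; averaging it over a short time window trades a small change of environment for concentration.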
Next comes the most crucial step in the proof, where we show that if for some small , then for each , the quantity is also small with high probability. If , this leads to a contradiction to (2.3) if is small enough. To avoid this contradiction, the probability of happening in the first place must be small, which is what we want to show.
To demonstrate our crucial claim, we consider any , where and is large. First, note that
[TABLE]
where comes from the Brownian part of (2.1), and comes from the initial environment:
[TABLE]
Since , we have
[TABLE]
By standard arguments (again presented in Section 3.2), and are both close to with high probability under the Gibbs measure. Thus, for fixed , the random variable behaves like a constant inside . Consequently, we can reduce (2.4) to
[TABLE]
Now let , so that . Again since , we have
[TABLE]
Thus, if denotes expectation in only, then
[TABLE]
In the event that is small, the third assumption of Section 1.1 (asymptotic nonnegativity of the overlap) implies that with high probability under the Gibbs measure. Therefore, conditional on this event (which depends only on , not ), we have
[TABLE]
By a similar argument, we also have
[TABLE]
In summary, if , then
[TABLE]
and thus, with high probability,
[TABLE]
By following exactly the same steps with instead of , we show that
[TABLE]
Combining (2.5)–(2.7), we conclude that if , then .
2.2. Proof sketch of Theorem 1.4
We begin this proof sketch where the previous section left off, namely the observation that if the average overlap in environment is small, then Gibbs averages of the type in (2.6) and (2.7) are well concentrated. By the same type of argument (see Lemma 4.5(b) and (5.11)), we can say something more general: no matter the size of , these averages remain concentrated so long as they are restricted to the set defined in (1.6), where conditional average overlap is small. That is, if is an independent Hamiltonian (i.e. defined with , an independent copy of ), then with high probability,
[TABLE]
In fact, the opposite is true off the set . If is not too small relative to , then the fluctuations of due to are as . This is again an elementary calculation; see (5.8)–(5.12).
On the other hand, a convenient consequence of Gaussianity is that . That is, an environment perturbation is equivalent in distribution to a temperature perturbation. (In fact, this simple observation underlies the Aizenman–Contucci identities [2], the predecessor of the Ghirlanda–Guerra identities.) Therefore, if we keep track of the dependence on by writing , and abbreviate to , we have
[TABLE]
By rewriting the denominator in a trivial way and using our observation (2.8), we see that with high probability,
[TABLE]
In the last expression above, the only term depending on is the second summand in the denominator. Therefore, Jensen’s inequality gives
[TABLE]
A more careful analysis shows that the Jensen gap is large enough that we can replace the lower bound by , where and are positive constants. One important caveat is that this stronger lower bound is valid only when is not too small (so that the fluctuations of are of order ), which is why Theorem 1.5 is needed beforehand. Reading (2.9)–(2.11) from start to end, we obtain
[TABLE]
While the above inequality is the most important step of the proof, a key shortcoming is that the set is defined using rather than . Since we will want to apply the inequality iteratively, we need to replace on the left-hand side by , where
[TABLE]
To make this replacement, we produce a complementary inequality, again using the equivalence of environment/temperature perturbations. For simplicity, let us assume , which is essentially realized by the third assumption of Section 1.1 for large . Observe that
[TABLE]
where we have applied Cauchy–Schwarz (and then ) and Jensen’s inequality (using the convexity of ). When , the final expression is at most , and so the inequality implies . Now, the random variable has moments of all orders (admitting simple upper bounds), and so it can be essentially regarded as a large constant. In particular, when is small, we will have with high probability, in which case . Combining these ideas with (2.12), we show
[TABLE]
More generally, for any integer ,
[TABLE]
This inequality can now be iterated, with being replaced by , then , and so on, as the expectation on the left is inserted on the right in the next iteration.
Since the left-hand side of (2.13) is always at most , we clearly obtain a contradiction if is larger than , where is the solution to . This would complete the proof of Theorem 1.4 if not for the subtlety that actually depends on in a non-trivial way. Nevertheless, (2.13) can still be used to derive a contradiction of the same spirit unless is small for some , where is large and tends to infinity as , but crucially does not depend on . This approach is reminiscent of tower-type arguments in extremal combinatorics.
Replacing by , we can then say is small. Finally, to deduce the smallness of from the smallness of , we make use of standard arguments showing that if an event is rare at inverse temperature , then it remains rare at inverse temperature .
2.3. Proof sketch of Theorem 1.3
To deduce Theorem 1.3 from Theorem 1.4, simply let be i.i.d. draws from the Gibbs measure. Then by the law of large numbers, when is large,
[TABLE]
with high probability. But by Theorem 1.4, we know that with high probability, is not close to zero. Therefore, with high probability, there must exist such that is not close to zero.
3. General preliminaries
In this preliminary section, we record several facts needed in the proofs of Theorems 1.4 and 1.5. These preparatory results are mostly elementary.
3.1. The Gibbs measure and partition function
In order for our results to apply to a broad collection of models, we have allowed the state space to be completely general, and the Hamiltonian to consist of countably infinite summands. We begin by checking that these assumptions pose no issues to computation. So for the remainder of Section 3.1, we fix the value of .
Let denote expectation with respect to the Gibbs measure when the Hamiltonian is replaced by the finite sum . That is,
[TABLE]
So that we can pass from to , we begin with the following lemma.
Lemma 3.1.
For all and any , the following limits hold almost surely and in for any :
[TABLE]
Proof.
We organize the proof into a sequence of claims.
Claim 3.2.
With -probability equal to ,
[TABLE]
Proof.
Observe that for fixed , the sequence is a martingale with respect to . Since
[TABLE]
the martingale convergence theorem guarantees that converges -almost surely as to a limit we call . Now Fubini’s theorem proves the claim:
[TABLE]
∎
Claim 3.3.
There exist nonnegative random variables and such that
[TABLE]
and
[TABLE]
Proof.
We simply take
[TABLE]
so that (3.3) is satisfied by definition. Since , we need only check (3.4) for . Observe that for any , is a submartingale. By Doob’s inequality, for any and any integer ,
[TABLE]
Therefore, for any ,
[TABLE]
which implies
[TABLE]
Since Tonelli’s theorem gives , (3.4) follows from the above display. ∎
Claim 3.4.
For any and any continuous function such that for all , for some , we have
[TABLE]
Proof.
By Claim 3.2 and the continuity of , we almost surely have that for -a.e. , as . And by hypothesis,
[TABLE]
Since
[TABLE]
and Claim 3.3 implies that almost surely , (3.5) now follows from dominated convergence (with respect to ). ∎
Claim 3.5.
For any and any continuous function such that for all , for some , we have
[TABLE]
Proof.
Recall that
[TABLE]
Since , the almost sure part of (3.7) is immediate from Claim 3.4. The convergence in is then a consequence of dominated convergence (with respect to ). Indeed, by Cauchy–Schwarz and Jensen’s inequality, we have the majorization
[TABLE]
where the final expression has moments of all orders by (3.4). ∎
We now complete the proof of Lemma 3.1 by taking for (3.2a), and , for (3.2b).
∎
Remark 3.6.
The essential feature of the above proof was checking in Claim 3.3 that (• ‣ 1.1) is enough to guarantee the first equality below:
[TABLE]
We will frequently use the above identity, an easy consequence of which is the following.
Lemma 3.7.
For any , we have
[TABLE]
as well as
[TABLE]
Proof.
By exchanging the order of expectation in the identity (which we are permitted to do by Tonelli’s theorem) and applying (3.8), we obtain (3.9). For (3.10), we apply Jensen’s inequality to obtain
[TABLE]
then take expectation of both sides, and again exchange the order of expectation. ∎
Let us also record two consequences of Lemma 3.1 that will be needed later in the paper.
Corollary 3.8.
For any , the following limits hold almost surely and in for any :
[TABLE]
Proof.
First we argue the almost sure statements. The statements will then follow from bounded convergence, since (• ‣ 1.1) gives the uniform bound
[TABLE]
So we fix the disorder . By Lemma 3.1, it is almost surely the case that for every , and as . We also know . In particular, given , we can choose so large that
[TABLE]
Given , there is such that for all ,
[TABLE]
In particular, for all ,
[TABLE]
and also
[TABLE]
∎
3.2. Derivative of free energy
This section records some important facts regarding convergence of the free energy’s derivative. By Lemma 3.1, it is almost surely the case that the random variable has exponential moments of all orders with respect to . Standard calculations then show that the free energy satisfies
[TABLE]
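The calculation behind the display above can be checked numerically on a toy model. The sketch below assumes a finite state space with a hypothetical list of energies and the unnormalized convention log Z(beta) = log sum_s exp(beta H(s)), which may differ from the paper's free energy by a normalization. It illustrates the standard facts that the derivative of log Z is the Gibbs average of H and that the second derivative is the Gibbs variance of H, hence nonnegative.

```python
import math

def log_partition(energies, beta):
    """log sum_s exp(beta * H(s)), computed stably via the log-sum-exp trick."""
    top = max(beta * h for h in energies)
    return top + math.log(sum(math.exp(beta * h - top) for h in energies))

def gibbs_mean_and_variance(energies, beta):
    """Mean and variance of H under the Gibbs weights exp(beta * H) / Z."""
    log_z = log_partition(energies, beta)
    weights = [math.exp(beta * h - log_z) for h in energies]
    mean = sum(w * h for w, h in zip(weights, energies))
    variance = sum(w * (h - mean) ** 2 for w, h in zip(weights, energies))
    return mean, variance

def central_difference(func, x, eps=1e-5):
    """Symmetric finite-difference approximation of func'(x)."""
    return (func(x + eps) - func(x - eps)) / (2 * eps)

# Hypothetical energies H(s) on a four-point state space (illustration only).
energies = [-1.3, 0.2, 0.7, 2.1]
beta = 0.8
mean, variance = gibbs_mean_and_variance(energies, beta)
numeric_derivative = central_difference(lambda b: log_partition(energies, b), beta)
```

Here `numeric_derivative` should agree with `mean` up to finite-difference error, and `variance` being nonnegative is the toy-model version of the convexity in the inverse temperature.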
Recall from (• ‣ 1.1) that . Since is convex for every , is necessarily convex. This assumption implies the following lemma, which is a general fact about the convergence of convex functions.
Lemma 3.9.
If is differentiable at , and with as , then
[TABLE]
Proof.
Let . By differentiability, we can choose sufficiently small that
[TABLE]
where the middle inequality is due to convexity. Given , we next choose such that
[TABLE]
which is possible by the continuity of . Now, convexity of implies the following for all such that :
[TABLE]
Upon defining
[TABLE]
it follows that for all sufficiently large ,
[TABLE]
Analogously, (3.13), (3.14b), and (3.15b) together yield the lower bound
[TABLE]
By (• ‣ 1.1), both and tend to 0 almost surely and in as . As is arbitrary, the desired result follows. ∎
Corollary 3.10.
For every at which is differentiable,
[TABLE]
In particular, , and there is thus some such that
[TABLE]
Proof.
Using the notation of Lemma 3.1, we have
[TABLE]
By Gaussian integration by parts,
[TABLE]
and then Lemma 3.9 allows us to write
[TABLE]
which completes the proof of (3.17). The inequalities now follow from
[TABLE]
For the second part of the claim, we recall that is convex and thus absolutely continuous. Since , we then have
[TABLE]
Since the integrand is nonnegative, it follows that is non-decreasing for . ∎
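The Gaussian integration by parts step used in the proof above rests on the one-dimensional identity E[g f(g)] = E[f'(g)] for a standard normal g and a sufficiently regular f. The following sketch checks this identity numerically for the illustrative choice f = tanh, which is an assumption made here for the example and not a function taken from the paper. The quadrature helper is a plain trapezoid rule.

```python
import math

def gaussian_expectation(func, half_width=12.0, num_points=24001):
    """Approximate E[func(g)] for g ~ N(0, 1) with the trapezoid rule."""
    step = 2 * half_width / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        x = -half_width + i * step
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        total += weight * func(x) * math.exp(-0.5 * x * x)
    return total * step / math.sqrt(2 * math.pi)

# Left side: E[g f(g)] with f = tanh.
lhs = gaussian_expectation(lambda x: x * math.tanh(x))
# Right side: E[f'(g)] with f'(x) = 1 - tanh(x)^2.
rhs = gaussian_expectation(lambda x: 1.0 - math.tanh(x) ** 2)
```

The two quantities `lhs` and `rhs` should coincide up to quadrature error.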
So that we can be explicit in the inverse temperature parameter , for the remainder of the section we will write for expectation with respect to . In light of (3.12), Lemma 3.9 implies
[TABLE]
We will require the following stronger form of this result, which also appears in [6, Theorem 3]. Our proof is adapted from the elegant approach of [48], and included for completeness.
Lemma 3.11.
If is a point of differentiability for , then
[TABLE]
Proof.
By Lemma 3.9, it suffices to show that if is a point of differentiability for , then
[TABLE]
Fix and choose small enough that
[TABLE]
Given , differentiability allows us to take sufficiently close to to satisfy
[TABLE]
By adding and subtracting , we have
[TABLE]
A simple calculation, followed by Cauchy–Schwarz, shows
[TABLE]
By another application of Cauchy–Schwarz, we have
[TABLE]
From the previous two displays, we find
[TABLE]
In light of this inequality, (LABEL:trivial_integrals) now shows
[TABLE]
where
[TABLE]
In summary,
[TABLE]
where
[TABLE]
Therefore, convexity of implies
[TABLE]
As , (• ‣ 1.1) shows that and each converge to 0 almost surely and in . Thus (3.21) and the above display together yield the desired result, as is arbitrary. ∎
3.3. Temperature perturbations
Here we derive upper bounds for the effects of temperature perturbations on certain expectations with respect to .
Lemma 3.12.
The following statements hold for any .
- (a)
For any measurable ,
[TABLE]
- (b)
For any ,
[TABLE]
- (c)
Finally,
[TABLE]
Proof.
All three claims follow from two crucial observations. First, for any ,
[TABLE]
And second,
[TABLE]
Then part (a) immediately follows, since
[TABLE]
For part (b), we first observe that if , then
[TABLE]
where now the right-hand side is independent of and (almost surely) finite. Moreover, we have the following finiteness condition when summing over :
[TABLE]
It thus follows that
[TABLE]
In particular,
[TABLE]
As in part (a), (3.25) now proves (3.22). For part (c), we can argue similarly in order to obtain
[TABLE]
from which (3.25) proves (3.23). ∎
4. Proof of Theorem 1.5
Recall the event under consideration:
[TABLE]
The proof of Theorem 1.5 is a perturbative argument using an Ornstein–Uhlenbeck (OU) flow on the environment,
[TABLE]
where is a collection of independent Brownian motions that are also independent of , and the above definition is understood coordinate-wise. Within Section 4, we denote expectation with respect to by , not to be confused with used in Section 3. We now prove Theorem 1.5 by juxtaposing the following two propositions. Notice that if , then there is nothing to be done; therefore, we may henceforth assume so that conditioning on is well-defined.
Proposition 4.1.
If is a point of differentiability for , and , then there exists such that the following holds: For any , there is sufficiently large that
[TABLE]
More specifically,
[TABLE]
For the statement of the second result, let denote the -algebra generated by and .
Proposition 4.2.
Assume is a point of differentiability for . Then there is a process adapted to the filtration , such that the following statements hold:
- (a)
For any ,
[TABLE]
- (b)
For any , there exist sufficiently small and sufficiently large, that
[TABLE]
Proof of Theorem 1.5.
Let be given, and assume the hypotheses of Proposition 4.1. By that result, there is and large enough that
[TABLE]
Let be the process guaranteed by Proposition 4.2, and define the events
[TABLE]
By Proposition 4.2(a),
[TABLE]
And by Proposition 4.2(b), we can choose sufficiently small and sufficiently large that
[TABLE]
Observe that , and clearly the events and are disjoint. We thus have
[TABLE]
On the other hand,
[TABLE]
Putting the two previous displays together, we find
[TABLE]
and so
[TABLE]
∎
4.1. Proof of Proposition 4.1
We will need to recall some facts about Ornstein–Uhlenbeck processes. To avoid technical complications, we restrict ourselves to finite-dimensional OU processes, and then take an appropriate limit at a later stage.
4.1.1. General OU theory
Fix a positive integer , and consider a vector of i.i.d. standard normal random variables. Let be an independent -dimensional Brownian motion. The OU flow starting at is given by
[TABLE]
This is a continuous-time, stationary Markov chain. Let denote the OU semigroup; that is, for ,
[TABLE]
Denote the OU generator by . It is especially useful to consider the spectral decomposition of , whose eigenfunctions are the multivariate Hermite polynomials. For our purposes, it suffices to recall the following well-known facts (see, for instance, [20, Chapter 6]):
- •
Let denote the -dimensional standard Gaussian measure. There is an orthonormal basis of consisting of eigenfunctions of , where , , and with for . Therefore, if , then
[TABLE]
Furthermore, if , then
[TABLE]
- •
The OU semigroup acts on by
[TABLE]
Therefore, if , then
[TABLE]
- •
The associated Dirichlet form is given by
[TABLE]
whenever and are twice-differentiable functions in such that both expectations above are finite. In particular, if is twice-differentiable, then
[TABLE]
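The spectral facts listed above can be illustrated in one dimension. The sketch below assumes the common normalization in which the OU semigroup has the Mehler representation P_t f(x) = E f(e^{-t} x + sqrt(1 - e^{-2t}) Z) with Z standard normal, so that the degree-k Hermite polynomial is an eigenfunction with eigenvalue e^{-kt}. The paper's normalization is fixed by its own displays and may differ; the function names here are hypothetical.

```python
import math

def gaussian_expectation(func, half_width=12.0, num_points=24001):
    """Approximate E[func(Z)] for Z ~ N(0, 1) with the trapezoid rule."""
    step = 2 * half_width / (num_points - 1)
    total = 0.0
    for i in range(num_points):
        z = -half_width + i * step
        weight = 0.5 if i in (0, num_points - 1) else 1.0
        total += weight * func(z) * math.exp(-0.5 * z * z)
    return total * step / math.sqrt(2 * math.pi)

def ou_semigroup(func, x, t):
    """Mehler formula (assumed normalization): E func(e^{-t} x + sqrt(1 - e^{-2t}) Z)."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - a * a)
    return gaussian_expectation(lambda z: func(a * x + b * z))

def hermite_2(x):
    """Probabilists' Hermite polynomial of degree 2."""
    return x * x - 1.0

def hermite_3(x):
    """Probabilists' Hermite polynomial of degree 3."""
    return x ** 3 - 3.0 * x

x, t = 0.7, 0.4
smoothed = ou_semigroup(hermite_3, x, t)
predicted = math.exp(-3 * t) * hermite_3(x)  # eigenvalue e^{-3t} for degree 3
```

Under the stated normalization, `smoothed` and `predicted` should agree up to quadrature error, which is the exponential decay of higher-order Hermite components that underlies variance bounds along the flow.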
Lemma 4.3.
For any twice differentiable with , we have
[TABLE]
Proof.
Take any . By the law of total variance, we have
[TABLE]
In particular, if we write in the form , then
[TABLE]
Therefore,
[TABLE]
Hence
[TABLE]
∎
Proof of Proposition 4.1.
Let be the OU flow from (4.1), and write
[TABLE]
Recall that denotes expectation with respect to . Let and be the associated partition function and free energy, respectively. That is, with , we have
[TABLE]
So that we can use the finite-dimensional facts discussed before, define , as well as
[TABLE]
Define by
[TABLE]
so that , where is understood to mean . Note that , since for , and so using the same arguments as in Lemma 3.7 yields
[TABLE]
Similar to (3.1), for general , we define
[TABLE]
Observe that
[TABLE]
which implies
[TABLE]
as well as
[TABLE]
where the derivative is with respect to . Note that
[TABLE]
Furthermore,
[TABLE]
We thus have
[TABLE]
From (4.16), it is clear that . Therefore, by Lemma 4.3 and (4.15),
[TABLE]
Moreover, from (4.10) we know
[TABLE]
We can now apply (3.2a) (together with (3.12)) and (3.11) to take the limit in the two previous displays and obtain
[TABLE]
Consequently, for any , Chebyshev’s inequality shows
[TABLE]
Now consider that
[TABLE]
Therefore, if is a point of differentiability for , then for any sequence , Lemma 3.9 guarantees
[TABLE]
When for fixed , (4.17) and (4.18) together show
[TABLE]
Assuming , we let . Then the previous display implies
[TABLE]
The proof is completed by taking sufficiently large that
[TABLE]
∎
4.2. Proof of Proposition 4.2
Let us rewrite (4.1) as
[TABLE]
Recall that . For any , we have
[TABLE]
In light of Lemma 3.11, we anticipate that for ,
[TABLE]
Indeed, the process that will satisfy the conclusions of Proposition 4.2 is
[TABLE]
To prove so, the following lemma will suffice. Recall that
[TABLE]
Lemma 4.4.
For any , the following statements hold:
- (a)
If is a point of differentiability for , then there is a sequence of nonnegative random variables depending only on , , and , such that
[TABLE]
and for every , ,
[TABLE]
- (b)
There exist sufficiently small and sufficiently large, that for every , , , and , we have
[TABLE]
Before checking these facts, let us use them to prove Proposition 4.2. The idea is to use the above sequence to control the differences simultaneously across all and ; this will allow us to prove (4.3). On the other hand, (4.23) shows that when is small, remains close to . That this approximation holds uniformly over will lead to (4.4).
Proof of Proposition 4.2.
First we prove part (a). Let be fixed. From Lemma 4.4(a), we identify a sequence of random variables such that (4.22) holds, and
[TABLE]
Under our definition (4.20), we have
[TABLE]
Now Markov’s inequality and (4.24) together imply
[TABLE]
which completes the proof of (a).
Next we prove part (b). Let be given. Similar to above, for any we have
[TABLE]
From Lemma 4.4(b), we choose sufficiently small that (4.23) holds for all , with . We then have, for all sufficiently large,
[TABLE]
Then applying Markov’s inequality yields (4.4). ∎
It now remains to prove Lemma 4.4. To do so, we will make use of the following preparatory result, which in fact is the common thread between the proofs of Theorems 1.4 and 1.5. Let be an independent copy of the disorder . We will use and to denote expectation and variance with respect to , conditional on . All statements involving these conditional quantities will be almost sure with respect to , although we will not repeatedly write this.
Lemma 4.5.
Recall the constant from (• ‣ 1.1). For any , the following statements hold:
- (a)
For any ,
[TABLE]
- (b)
For any measurable ,
[TABLE]
Proof.
For any ,
[TABLE]
Now, for all , we have . In particular, since
[TABLE]
we see from (LABEL:general_f) that
[TABLE]
Alternatively, if , then we can use the equalities in (LABEL:general_f) to write
[TABLE]
∎
We are now ready to prove Lemma 4.4.
Proof of Lemma 4.4.
Let be arbitrary. Recall the random variable defined in (4.19). Observe that for fixed , is equal in law to , where is an independent copy of . Therefore, if we define
[TABLE]
then
[TABLE]
Since the conclusions of Lemma 4.4 depend only on marginal distributions at fixed , it suffices to prove bounds of the form
[TABLE]
where satisfies (4.21), and
[TABLE]
So henceforth we fix , and . We will need the following four claims. In checking these claims, we will frequently use the following inequality, which holds for any :
[TABLE]
Claim 4.6.
For any ,
[TABLE]
Claim 4.7.
For any ,
[TABLE]
Claim 4.8.
Given any , set . For all large enough that ,
[TABLE]
Claim 4.9.
For any even and , the following inequalities hold for all :
[TABLE]
and thus
[TABLE]
Before proving the claims, we use them to obtain the desired statements.
4.2.1. Proof of Lemma 4.4(a)
First note that for any random variables and ,
[TABLE]
Therefore,
[TABLE]
Let be a positive number to be chosen later. Anticipating the application of Claims 4.8 and 4.9, we condense notation by defining
[TABLE]
Because of (LABEL:prep_for_full_bound), we seek a bound of the form
[TABLE]
Therefore, once we set
[TABLE]
and take expectation, (LABEL:prep_for_full_bound) becomes
[TABLE]
which is exactly (4.27). To complete the proof of Lemma 4.4(a), we need to show that given any , we can choose sufficiently small that (4.21) holds ( depends on through and ).
Indeed, by Cauchy–Schwarz we have
[TABLE]
Next we observe that for and sufficiently large such that ,
[TABLE]
Meanwhile, if and , then
[TABLE]
By Lemma 3.11, the previous display shows
[TABLE]
In light of (4.37) and (4.38), it is clear from this inequality that can be chosen sufficiently small that (4.21) holds.
4.2.2. Proof of Lemma 4.4(b)
To establish (4.28), it will be easier to replace by , where
[TABLE]
By Lemma 4.5(a),
[TABLE]
and so
[TABLE]
as well as
[TABLE]
Because
[TABLE]
we have and can thus apply Chebyshev’s inequality to obtain
[TABLE]
We will use these inequalities in the following bound:
[TABLE]
Now,
[TABLE]
and
[TABLE]
In addition,
[TABLE]
Using (4.39), (4.40), and (4.42)–(4.44) in (LABEL:observation_0), we find
[TABLE]
In particular, for any and large enough that ,
[TABLE]
and so (4.35) implies
[TABLE]
Given , we choose and small enough (in that order, and depending only on , , and ) so that the rightmost expression above is at most . Moreover, it is clear that once and are chosen, could be replaced by for any , and the rightmost expression will be bounded from above by . Taking expectations on both sides yields (4.28).
4.2.3. Proof of Claim 4.6
Assume or . Using Jensen’s inequality, we have
[TABLE]
4.2.4. Proof of Claim 4.7
Assume . By Cauchy–Schwarz and Jensen’s inequality, we have
[TABLE]
4.2.5. Proof of Claim 4.8
Assume . By Jensen’s inequality,
[TABLE]
Recall that , and we assume . By (4.29),
[TABLE]
which implies
[TABLE]
Repeated applications of Cauchy–Schwarz yield
[TABLE]
By similar manipulations,
[TABLE]
Together, (4.45)–(4.48) yield (4.32).
4.2.6. Proof of Claim 4.9
Assume is even. By Cauchy–Schwarz and Jensen’s inequality, we have
[TABLE]
For any , we have the inequality for all . Hence
[TABLE]
Assume so that whenever
[TABLE]
it follows that
[TABLE]
We thus have
[TABLE]
Combining (LABEL:next_1)–(LABEL:next_3), we have now shown that
[TABLE]
Finally, given , we choose large enough that , thereby producing (LABEL:bad_prep_var_bound). Then (4.34) is the special case when . ∎
5. Proof of Theorem 1.4
In this section, we consider perturbations to the environment of the form
[TABLE]
where the ’s are independent copies of . An important observation is that
[TABLE]
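The equivalence between environment and temperature perturbations can be seen at the level of covariances. The sketch below assumes a Hamiltonian that is linear in the disorder, H(s) = sum_i a_i(s) g_i, and a perturbation of the form g + t g' with g' an independent copy of g; the exact scaling used in the paper is given by its own displays and may differ. Since a centered Gaussian family is determined in law by its covariance, matching covariance matrices means that H(g + t g') and sqrt(1 + t^2) H(g) have the same distribution. The coefficient vectors are hypothetical.

```python
import math

def inner_product(u, v):
    return sum(x * y for x, y in zip(u, v))

def covariance_matrix(vectors):
    """Gram matrix: covariances of the centered Gaussians <a, g> for a in vectors."""
    return [[inner_product(u, v) for v in vectors] for u in vectors]

def environment_perturbation(vectors, t):
    """Coefficients of H(g + t g') as a linear form in the stacked vector (g, g')."""
    return [list(a) + [t * x for x in a] for a in vectors]

def temperature_perturbation(vectors, t):
    """Coefficients of sqrt(1 + t^2) * H(g) as a linear form in g."""
    scale = math.sqrt(1.0 + t * t)
    return [[scale * x for x in a] for a in vectors]

# Hypothetical coefficient vectors a(s) for three states (illustration only).
coefficients = [[1.0, 0.5, -0.2], [0.3, -1.1, 0.8], [0.0, 0.4, 0.9]]
t = 0.35
cov_env = covariance_matrix(environment_perturbation(coefficients, t))
cov_temp = covariance_matrix(temperature_perturbation(coefficients, t))
```

The matrices `cov_env` and `cov_temp` agree entrywise, so perturbing the environment by an independent copy acts in distribution like rescaling the inverse temperature by sqrt(1 + t^2).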
We will continue to use to denote expectation with respect to and the ’s jointly, whereas will denote expectation with respect to conditional on and , . As before, all statements involving and are to be interpreted as almost sure statements.
As in Section 3, will denote expectation with respect to . On the other hand, we will write to denote expectation under the measure , where the dependence on is understood. That is,
[TABLE]
For , define the set
[TABLE]
where is the set under consideration in Theorem 1.4, whose proof will rely on Propositions 5.1 and 5.3 below.
Proposition 5.1.
For any , there exists such that for all , , and ,
[TABLE]
Proof.
For any measurable , an application of (5.2), followed by Cauchy–Schwarz and Jensen’s inequality, gives
[TABLE]
So we define the random variable
[TABLE]
and consider, for fixed , the function . By (4.26), is -valued, and (• ‣ 1.1) implies
[TABLE]
So the above estimate shows
[TABLE]
In particular, when is sufficiently large that ,
[TABLE]
We have thus shown , which implies
[TABLE]
where in the second inequality we have used the fact that if , then . To handle the last term in the above display, we note that for any ,
[TABLE]
Now, for any and any ,
[TABLE]
Hence
[TABLE]
Choosing and , we arrive at
[TABLE]
which holds for all such that . ∎
Next we consider the event
[TABLE]
where is the event under consideration in Theorem 1.5.
Lemma 5.2.
Assume is a point of differentiability for , and . For any , there is sufficiently small that for any positive constant , the following is true. If for all , then
[TABLE]
Proof.
By Theorem 1.5, there is sufficiently small that
[TABLE]
Let us write , and then observe that
[TABLE]
Since , we have , and thus Lemma 3.12(c) gives
[TABLE]
By Lemma 3.9, the right-hand side above converges to 0 almost surely as . In particular,
[TABLE]
and so (5.3) follows from (5.4) and (5.5). ∎
Proposition 5.3.
Given any , there are positive constants and such that the following holds for any . There exists so that for every , , and ,
[TABLE]
Proof.
Let be given, and take such that for all . Consider any , and define the random variables
[TABLE]
Step 1. Show that is concentrated at , but is not concentrated at when occurs.
First observe that for any , Jensen’s inequality implies
[TABLE]
In particular, for any ,
[TABLE]
On the other hand,
[TABLE]
We have the upper bound
[TABLE]
as well as the lower bound
[TABLE]
Meanwhile, we have for all . Hence Lemma 4.5(b) implies
[TABLE]
Using (5.9)–(5.11) in (5.8) yields
[TABLE]
So on the event , (5.12) shows
[TABLE]
for all . Given and , we fix large enough such that
[TABLE]
Because of (5.14b), the inequalities (5.7) and (5.13) together yield
[TABLE]
for all .
Step 2. Since , obtain an upper bound on the error in the following approximation:
[TABLE]
Simple algebra gives
[TABLE]
and
[TABLE]
Step 3. Since is not concentrated at when occurs, obtain a lower bound on the gap in the following application of Jensen’s inequality:
[TABLE]
We consider the function given by
[TABLE]
In particular, we consider its Taylor series approximation about ,
[TABLE]
where belongs to the interval between and . We note that such an expansion exists because the identity shows . Jensen’s inequality implies
[TABLE]
We will now produce a lower bound on the Jensen gap.
First observe that is decreasing on . Consequently, if , then . Similarly, if , then . Therefore, for all , we have
[TABLE]
where the second term in the final expression need not depend on since .
Step 4. Reckon the final bound.
In summary, for all ,
[TABLE]
∎
Proof of Theorem 1.4.
Let be given. From Lemma 5.2, we fix so that for any bounded sequence of nonnegative integers, we have
[TABLE]
We wish to find , depending only on and , such that .
Let , its exact value to be decided later. From Proposition 5.3, we know that for all and ,
[TABLE]
And from Proposition 5.1, we can assume
[TABLE]
Linking the two inequalities, we find that
[TABLE]
where now we fix the constants and . Note that , and so this reasoning can be iterated. Iterating times produces the estimate
[TABLE]
which implies the existence of some such that
[TABLE]
So we take large enough that
[TABLE]
and then choose small enough that
[TABLE]
We now have, for all ,
[TABLE]
Combining this bound with (5.18), we see that
[TABLE]
To now complete the proof, we must obtain from this result an analogous one with .
As in the proof of Lemma 5.2, we will write . For , define the set
[TABLE]
It follows from (5.1) that
[TABLE]
Since , Lemma 3.12(b) implies
[TABLE]
Denote the right-hand side above by . Take . From the above display, Hence
[TABLE]
And by Lemma 3.12(a),
[TABLE]
From the previous two displays and (5.22), we have
[TABLE]
Finally, Lemma 3.9 shows that almost surely and in as . Consequently, . ∎
6. Proof of equivalence of Theorems 1.3 and 1.4
Theorem 1.3 is implied by Theorem 1.4 once we establish the following result. Recall the definitions (1.4) and (1.6).
Proposition 6.1.
Suppose is defined by (• ‣ 1.1), where are i.i.d. random variables with zero mean and unit variance (not necessarily Gaussian). Assume (• ‣ 1.1)–(• ‣ 1.1). Then the following two statements are equivalent:
For every , there exist integers and and a number such that the following is true for all . With -probability at least , there exist such that
[TABLE]
For every , there exists sufficiently small that
[TABLE]
6.1. Proof of
Let be given. By , we can choose small enough and large enough so that for all ,
[TABLE]
It follows from Markov’s inequality that
[TABLE]
Now, by the Paley–Zygmund inequality, for any ,
[TABLE]
Therefore,
[TABLE]
Choosing , we have
[TABLE]
Therefore,
[TABLE]
This completes the proof, since
[TABLE]
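For reference, the Paley–Zygmund inequality in its standard form says that a nonnegative random variable X with finite second moment satisfies P(X > theta E[X]) >= (1 - theta)^2 (E[X])^2 / E[X^2] for every theta in [0, 1]. The form used in the proof above may be stated differently. The sketch below checks the standard form exactly on a hypothetical three-point distribution.

```python
def paley_zygmund(values, probabilities, theta):
    """Return (tail, bound): P(X > theta * E[X]) and the Paley-Zygmund lower bound."""
    mean = sum(p * v for v, p in zip(values, probabilities))
    second_moment = sum(p * v * v for v, p in zip(values, probabilities))
    tail = sum(p for v, p in zip(values, probabilities) if v > theta * mean)
    bound = (1.0 - theta) ** 2 * mean ** 2 / second_moment
    return tail, bound

# Hypothetical nonnegative distribution: mostly zero, occasionally large.
values = [0.0, 1.0, 10.0]
probabilities = [0.7, 0.2, 0.1]
tail, bound = paley_zygmund(values, probabilities, theta=0.5)
```

In this example the mean is 1.2 and the second moment is 10.2, so the bound is small but strictly positive, which is the mechanism by which a second-moment estimate yields a lower bound on a probability.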
6.2. Proof of
We begin with a lemma that roughly states the following. If many random variables each have non-negligible positive correlation with a distinguished variable, then at least one pair of these variables has non-negligible positive correlation.
Lemma 6.2.
For any , there exists such that the following holds for any integer and any . If , then
[TABLE]
Proof.
Consider the matrix , where
[TABLE]
Observe that is positive semi-definite: for any ,
[TABLE]
Now let . For with , our assumptions give
[TABLE]
We now take to obtain
[TABLE]
Supposing that , we further see
[TABLE]
which yields a contradiction as soon as . ∎
We will contrast Lemma 6.2 with the one below, which says that if is small enough, then any non-negligible subset of has many nearly orthogonal elements.
Lemma 6.3.
For any and positive integer , there is such that the following holds. If with , then there are such that
[TABLE]
Proof.
Set . Observe that for any , we have the following implication:
[TABLE]
Therefore, one can inductively choose
[TABLE]
where (6.3) guarantees that
[TABLE]
Hence can be found so long as .
∎
We can now complete the proof. Assume that holds. Suppose, contrary to , that there is some such that for every ,
[TABLE]
Note that for any such that , we have
[TABLE]
and thus .
From , we choose and so that for all large enough (depending on ), the following is true with -probability at least : There exist such that
[TABLE]
Once has been determined, choose so that the conclusion of Lemma 6.2 holds. Then, given the values of and , choose so that the conclusion of Lemma 6.3 holds with and .
In summary, if is large enough, and (by (6.4), there are infinitely many for which this is the case), the following is true. With -probability at least , we have both and (6.5) for some . In this case, we have
[TABLE]
Therefore, there is some such that
[TABLE]
By our choice of , we can find satisfying
[TABLE]
But , and so the above display contradicts (6.2).
7. Polymer measures are asymptotically non-atomic
In this section we prove that directed polymers on the lattice are asymptotically non-atomic. It is a striking phenomenon that at sufficiently small temperatures, the polymer endpoint distribution places a non-vanishing mass on a single element of (which is random and varies with ) [28]. The fact that the polymer measures themselves do not share this property, stated below as Theorem 7.1, justifies the investigation of replica overlap as an order parameter for path localization. To emphasize the fact that the Gaussian environment can be replaced by a general one, we reintroduce notation for directed polymers.
Let be a collection of i.i.d. random variables. We will assume that
[TABLE]
and also that
[TABLE]
in order to avoid trivialities. Let denote the set of nearest-neighbor paths of length in starting at the origin. Note that . To each in we associate the Hamiltonian energy
[TABLE]
The polymer measure is then defined by
[TABLE]
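For very small path lengths, the polymer measure can be computed by brute force. The sketch below assumes the standard convention that the energy of a path is the sum of the environment variables at the space-time points it visits, H_n(s) = sum_{i=1}^n g(i, s_i), and that each path receives mass proportional to exp(beta H_n(s)). The seeded Gaussian environment and all helper names are illustrative choices, not taken from the paper.

```python
import itertools
import math
import random

def make_environment(seed):
    """Lazily sampled i.i.d. N(0, 1) variables indexed by (time, site)."""
    rng = random.Random(seed)
    cache = {}
    def environment(time, site):
        key = (time, site)
        if key not in cache:
            cache[key] = rng.gauss(0.0, 1.0)
        return cache[key]
    return environment

def nearest_neighbor_steps(dim):
    """The 2 * dim unit steps of the lattice Z^dim."""
    steps = []
    for axis in range(dim):
        for sign in (1, -1):
            step = [0] * dim
            step[axis] = sign
            steps.append(tuple(step))
    return steps

def path_energies(length, dim, environment):
    """Map each path (s_1, ..., s_n) started at the origin to its energy."""
    energies = {}
    for increments in itertools.product(nearest_neighbor_steps(dim), repeat=length):
        site = (0,) * dim
        path = []
        for inc in increments:
            site = tuple(s + d for s, d in zip(site, inc))
            path.append(site)
        energies[tuple(path)] = sum(environment(i + 1, x) for i, x in enumerate(path))
    return energies

def polymer_measure(length, dim, beta, environment):
    """Return {path: probability} with weights proportional to exp(beta * energy)."""
    energies = path_energies(length, dim, environment)
    top = max(energies.values())
    weights = {p: math.exp(beta * (h - top)) for p, h in energies.items()}
    total = sum(weights.values())
    return {p: w / total for p, w in weights.items()}

measure = polymer_measure(length=4, dim=1, beta=1.5, environment=make_environment(seed=0))
largest_atom = max(measure.values())
```

The quantity `largest_atom` is the mass of the single most likely path, which is the quantity that Theorem 7.1 concerns in the limit of long paths; enumeration is only feasible for tiny examples since there are (2 * dim) ** length paths.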
Theorem 7.1.
Assume (7.1). Then for any and any ,
[TABLE]
The remainder of Section 7 is to prove Theorem 7.1. We begin by defining the passage time,
[TABLE]
We will denote the set of maximizing paths by
[TABLE]
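The passage time can be computed without enumerating paths. The sketch below assumes the passage time is the maximum over nearest-neighbor paths from the origin of the energy H_n(s) = sum_{i=1}^n g(i, s_i), and compares a dynamic-programming computation over sites with a direct recursion over paths. The environment and helper names are hypothetical.

```python
import random

def make_environment(seed):
    """Lazily sampled i.i.d. N(0, 1) variables indexed by (time, site)."""
    rng = random.Random(seed)
    cache = {}
    def environment(time, site):
        key = (time, site)
        if key not in cache:
            cache[key] = rng.gauss(0.0, 1.0)
        return cache[key]
    return environment

def neighbors(site):
    """The 2 * dim nearest neighbors of a lattice site."""
    for axis in range(len(site)):
        for sign in (1, -1):
            yield site[:axis] + (site[axis] + sign,) + site[axis + 1:]

def passage_time(length, dim, environment):
    """Dynamic programming: best[x] is the maximal energy of a path ending at x."""
    best = {(0,) * dim: 0.0}
    for time in range(1, length + 1):
        updated = {}
        for site, value in best.items():
            for nb in neighbors(site):
                candidate = value + environment(time, nb)
                if candidate > updated.get(nb, float("-inf")):
                    updated[nb] = candidate
        best = updated
    return max(best.values())

def passage_time_brute_force(length, dim, environment):
    """Direct recursion over all (2 * dim) ** length paths, for cross-checking."""
    def extend(site, time):
        if time > length:
            return 0.0
        return max(environment(time, nb) + extend(nb, time + 1) for nb in neighbors(site))
    return extend((0,) * dim, 1)

environment = make_environment(seed=2)
fast = passage_time(6, 1, environment)
slow = passage_time_brute_force(6, 1, environment)
```

The values `fast` and `slow` should agree up to floating-point rounding; the dynamic program visits only polynomially many sites per step, while the recursion visits every path.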
It is well-known (for instance, see [39]) that there is a finite constant such that
[TABLE]
The first equality above is a consequence of the superadditivity of , and the second equality leads to a short proof of the following standard fact.
Lemma 7.2.
.
Proof.
Let and . Observe that , and so
[TABLE]
where the final inequality is strict because . ∎
Definition 7.3.
For a nearest-neighbor path of length in , define the turns of to be the following set of indices:
[TABLE]
The number of turns of will be denoted .
Lemma 7.4.
For any , there is small enough that
[TABLE]
Proof.
Given an integer , , we count the elements of as follows. First, the number of choices for is . Next, a turn should occur at exactly of the coordinates . Moreover, if a turn occurs at , then there are choices for (so as to avoid ). Finally, if a turn does not occur at , then there is only one choice for , namely . Therefore, for any positive integer ,
[TABLE]
If for , then Stirling’s approximation gives
[TABLE]
Therefore,
[TABLE]
Now choose sufficiently small that the right-hand side above is strictly less than . Inverting the logarithm and choosing large enough now yields the desired result. ∎
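The counting step in the proof above can be verified by enumeration for small cases. The sketch below assumes that a turn is an index at which the increment of the path differs from the previous increment, and that the counting argument gives 2d * C(n-1, k) * (2d-1)^k paths of length n in Z^d with exactly k turns: 2d choices for the first step, a choice of which k of the remaining n-1 steps are turns, and 2d-1 new directions at each turn. Function names are hypothetical.

```python
import itertools
from math import comb

def nearest_neighbor_steps(dim):
    """The 2 * dim unit steps of the lattice Z^dim."""
    steps = []
    for axis in range(dim):
        for sign in (1, -1):
            step = [0] * dim
            step[axis] = sign
            steps.append(tuple(step))
    return steps

def count_paths_by_turns(length, dim):
    """Enumerate all paths and tally them by their number of turns."""
    counts = {}
    for increments in itertools.product(nearest_neighbor_steps(dim), repeat=length):
        turns = sum(1 for i in range(1, length) if increments[i] != increments[i - 1])
        counts[turns] = counts.get(turns, 0) + 1
    return counts

def turn_count_formula(length, dim, k):
    """Closed-form count of paths with exactly k turns (assumed turn convention)."""
    return 2 * dim * comb(length - 1, k) * (2 * dim - 1) ** k

counts = count_paths_by_turns(length=4, dim=2)
```

For length 4 in dimension 2, `counts` tallies all 256 paths, and each tally matches `turn_count_formula(4, 2, k)`. Summing the formula over small k and applying Stirling's approximation is what yields the exponential bound on paths with few turns.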
Lemma 7.5.
Let denote a sequence of i.i.d. pairs of independent random variables. For any and , there exists large enough that
[TABLE]
Proof.
Choose large enough that satisfies . We then have
[TABLE]
∎
Proof of Theorem 7.1.
Let denote a generic copy of , and . Set , which is positive by Lemma 7.2. By assumption, there is such that . Take any and observe that for any given ,
[TABLE]
Using dominated convergence, it is easy to show that
[TABLE]
and so we may choose sufficiently small that . Set , and then choose sufficiently small that . With as in Lemma 7.4, we have the union bound
[TABLE]
By our choice of , Borel–Cantelli implies that the following statement holds almost surely:
[TABLE]
On the other hand, it is apparent from (7.5) and our choice of that almost surely, we have for all large . For any such , we then have for every , the set of maximizing paths defined in (7.4). That is, almost surely:
[TABLE]
Together, the two previous displays show that almost surely,
[TABLE]
Recall from (7.6) that denotes the set of turns in the path . For a given and , let denote the unique element of such that but for all . That is, while . Upon taking and in Lemma 7.5, a union bound gives
[TABLE]
Therefore, we can again apply Borel–Cantelli to see that almost surely,
[TABLE]
Now combining this statement with (7.7), we arrive at the following almost sure event:
[TABLE]
In particular, since has at least one element (call it ), we have the following for all :
[TABLE]
Since and do not depend on , (7.3) follows. ∎
8. Acknowledgments
We are grateful to Francis Comets for valuable feedback and discussion, and to the referees for their beneficial comments, suggestions, and edits.
References
- [1] Adler, R. J., and Taylor, J. E. Random fields and geometry. Springer Monographs in Mathematics. Springer, New York, 2007.
- [2] Aizenman, M., and Contucci, P. On the stability of the quenched state in mean-field spin-glass models. J. Statist. Phys. 92, 5-6 (1998), 765–783.
- [3] Aizenman, M., Lebowitz, J. L., and Ruelle, D. Some rigorous results on the Sherrington-Kirkpatrick spin glass model. Comm. Math. Phys. 112, 1 (1987), 3–20.
- [4] Arguin, L.-P., and Zindy, O. Poisson-Dirichlet statistics for the extremes of a log-correlated Gaussian field. Ann. Appl. Probab. 24, 4 (2014), 1446–1481.
- [5] Auffinger, A., and Chen, W.-K. On properties of Parisi measures. Probab. Theory Related Fields 161, 3-4 (2015), 817–850.
- [6] Auffinger, A., and Chen, W.-K. On concentration properties of disordered Hamiltonians. Proc. Amer. Math. Soc. 146, 4 (2018), 1807–1815.
- [7] Auffinger, A., Chen, W.-K., and Zeng, Q. The SK model is infinite step replica symmetry breaking at zero temperature. Comm. Pure Appl. Math. 73, 5 (2020), 921–943.
- [8] Auffinger, A., and Louidor, O. Directed polymers in a random environment with heavy tails. Comm. Pure Appl. Math. 64, 2 (2011), 183–204.
