Surface energy and boundary layers for a chain of atoms at low temperature
Sabine Jansen, Wolfgang König, Bernd Schmidt, Florian Theil

TL;DR
This paper investigates the low-temperature behavior of a chain of atoms with Lennard-Jones interactions, focusing on surface energy, boundary layers, and large deviations of Gibbs measures, revealing zero-temperature limits and correlation decay bounds.
Contribution
It establishes large deviation principles for Gibbs measures at low temperature, characterizes surface energy functionals, and links zero-temperature limits to boundary layers and free energy corrections.
Findings
Gibbs measures satisfy large deviations principles with specific energy functionals.
Surface correction converges to zero-temperature surface energy.
Bulk measures can be approximated by Gaussian measures.
Surface energy and boundary layers for a chain of atoms at low temperature
Sabine Jansen
Mathematisches Institut, Ludwig-Maximilians-Universität, Theresienstr. 39, 80333 München, Germany
,
Wolfgang König
Weierstrass Institute Berlin, Mohrenstr. 39, 10117 Berlin and Technische Universität Berlin, Str. des 17. Juni 136, 10623 Berlin, Germany
,
Bernd Schmidt
Institut für Mathematik, Universität Augsburg, Universitätsstr. 14, 86159 Augsburg, Germany
and
Florian Theil
Mathematics Institute, University of Warwick, Coventry, CV4 7AL, UK
(Date: April 12, 2019)
Abstract.
We analyze the surface energy and boundary layers for a chain of atoms at low temperature for an interaction potential of Lennard-Jones type. The pressure (stress) is assumed small but positive and bounded away from zero, while the temperature goes to zero. Our main results are: (1) As at fixed positive pressure , the Gibbs measures and for infinite chains and semi-infinite chains satisfy path large deviations principles. The rate functions are bulk and surface energy functionals and . The minimizer of the surface functional corresponds to zero temperature boundary layers. (2) The surface correction to the Gibbs free energy converges to the zero temperature surface energy, characterized with the help of the minimum of . (3) The bulk Gibbs measure and Gibbs free energy can be approximated by their Gaussian counterparts. (4) Bounds on the decay of correlations are provided, some of them uniform in .
Keywords: atomistic models of elasticity - surface energy and boundary layers - semi-classical limit of transfer operators - uniform decay of correlations - path large deviations for stationary processes.
MSC2010 classification: 82B21, 74B20, 74G65, 60F10.
1. Introduction
The purpose of the present article is to analyze the low-temperature behavior of a one-dimensional chain of atoms that interact via a Lennard-Jones type potential. The model is atomistic and formulated in terms of the Gibbs measures of classical statistical mechanics. Two limiting procedures are at play: the zero-temperature limit, for which the inverse temperature goes to infinity, and the thermodynamic limit, where the number of particles and the system size go to infinity. The order of the limits matters. When the zero-temperature limit is taken before the thermodynamic limit, the analysis of Gibbs measures is replaced by energy minimization, leading to variational models of non-linear elasticity. We instead perform the zero-temperature limit after the thermodynamic limit. The zero-temperature limit for infinite systems is far from trivial, see [vER07, CGU11, CH10] and the discussion in [BRS10].
For the one-dimensional Lennard-Jones interaction, it is known that energy minimizers (ground states) converge to a periodic lattice [GR79] (“crystallization”). For one-dimensional systems with pair potentials that decay faster than it is well-known that, in contrast, at positive temperature, no matter how small, there is no crystallization [BL15]. Nevertheless, some quantities can be approximated well by their zero-temperature counterpart. For the bulk free energy this is to be expected, for other quantities such as surface corrections this is already more subtle. For the decay of correlations, it is a priori not even clear what the zero-temperature counterpart should be; we propose a natural candidate, see Eqs (2.10) and (2.11).
At zero temperature, surface corrections and boundary layers have been studied, for example, in order to better understand variational models of fracture, see e.g. [BC07, SSZ11] and the references therein. Fracture might be expected for elongated chains, forced to stretch beyond their preferred length. At small positive temperature, large interparticle distances correspond to low pressure (stress) . We address this regime in a subsequent work and focus here on the elastic regime of positive pressure , though the case of small pressure is discussed in some comments.
Our main results come in four parts. They are listed in Sections 2.1–2.4 and proven in Sections 3–7. At zero temperature, we extend the result on bulk periodicity from [GR79] to a more general class of potentials and positive pressure, see Theorem 2.1. We prove the existence of bounded surface corrections, and characterize them with the help of an energy functional for semi-infinite chains (Theorem 2.2).
At positive temperature, we prove large deviations principles for the Gibbs measures and on and (product topology) as at fixed (Theorem 2.4). The speed is and the respective rate functions are energy functionals and whose minimizers are, respectively, the periodic bulk ground state and the zero-temperature boundary layer. The convergence of positive-temperature surface corrections to their zero-temperature counterpart is addressed in Theorem 2.5. These results are intimately related to path large deviations for Markov processes and Hamilton-Jacobi-Bellman equations [FK06], semi-classical analysis [Hel02], and a more direct approach to low-temperature expansions [SL17]. We remark that our results are valid for long range interactions which in particular are not assumed to have superlinear growth at infinity. The large deviations principle is complemented by a result on Gaussian approximations for the bulk Gibbs measure and the Gibbs free energy, valid for finite interaction range (Theorems 2.7 and 2.8).
Finally we study the temperature-dependence of correlations and informally discuss how correlations connect with effective interactions of defects and the decay of boundary layers. Theorem 2.9 provides a priori estimates that hold for all . In Theorem 2.11 we show that for finite and small positive pressure , the decay of correlations is exponential with a rate of decay that stays bounded as ; in other words, the associated Markov chain has a spectral gap bounded away from zero. This uniform estimate is proven with perturbation theory for the transfer operator. For infinite , we provide instead a uniform estimate for restricted Gibbs measures (Proposition 2.10), which follows from the convexity of the energy (in a neighborhood of the periodic ground state) and techniques from the realm of Brascamp-Lieb inequalities [Hel02]. At vanishing pressure or fixed high pressure , the spectral gap might become exponentially small because of fracture or metastable wells [BdH15] in non-convex energy landscapes.
Bringing statistical mechanics into atomistic models of crystals and elasticity has a rich tradition [BH98, Wei02, BCF86, Pen02]. Modern developments include: the study of gradient Gibbs measures [FS97] with sophisticated tools such as renormalization groups and cluster expansions [AKM16], random walk representations [BFS82], and Witten Laplacians [Hel02]; scaling limits and gradient Young-Gibbs measures [Pre09, KL14, Run15]; the extension of approximation schemes, e.g., the quasi-continuum method, to positive temperature [BLBLP10, TM11]. In addition, there have been some inroads into the open problem of proving crystallization in the form of orientational order for two-dimensional models [Aum15, HMR14].
To the best of our knowledge, all of the aforementioned mathematical literature, notably on Gibbs gradient measures, is limited to potentials with a superlinear growth at infinity. This is in stark contrast with the decay to zero typically imposed in statistical mechanics of point particles [Rue69]. We work with potentials , an additional linear term enters because we work in the constant pressure ensemble, which is the most convenient ensemble for one-dimensional systems [Rue69, Section 5.6.6]. As a consequence, the by now classical combination of Bakry-Émery estimates and Holley-Stroock perturbation principle, see [Men14] and the references therein, becomes potentially more delicate. We use instead estimates on energy penalties, some aspects of which might generalize to higher-dimensional models.
Another aspect that might generalize to higher dimension concerns the large deviations principle. The existence of a large deviations principle for the Gibbs measure as , proven using exponential tightness and a fixed point equation for the measure, amounts to the construction of an infinite volume energy functional that vanishes on ground states only. In higher dimension, the role of the fixed point equation is taken by the DLR conditions, named after Dobrushin, Lanford, and Ruelle [Geo11], and the proof of a large deviations principle reduces to the investigation of a higher-dimensional analogue of a Bellman equation. The theory of the latter, for non-unique ground states, might mirror possible intricacies of the zero-temperature limit of Gibbs measures described in [vER07].
Finally we remark that the results of this work allow for a detailed analysis of typical atomic configurations at low temperature and low density. In [JKST19] we will in particular prove that, when the density is strictly smaller than the density of the ground state lattice, a system with particles fills space by alternating approximately crystalline domains (“clusters”) with empty domains (“cracks”). The number of domains is of the order of with the surface energy from Theorem 2.2 below.
2. Main results
2.1. Zero temperature
Let be a pair potential, a truncation parameter and the pressure. At zero temperature we allow for , at positive temperature we impose . The Gibbs energy at zero temperature and pressure for a system of particles with positions and interparticle spacings , , is
[TABLE]
The parameter restricts the range of the interaction: corresponds to a next-nearest neighbor interaction. This section deals with the minimization problem
[TABLE]
in the limit . Throughout we assume that the following assumption holds.
Assumption 1.
The pair potential is equal to on for some and a function on . There exist and , such that the following holds.
- (i)
Shape of : is the unique minimizer of and satisfies . is decreasing on and increasing and non-positive on .
- (ii)
Growth of : for all and for all .
- (iii)
Shape of : is decreasing on and increasing and non-positive on .
- (iv)
Growth of : for all and .
The assumption is satisfied, for example, by the Lennard-Jones potential . As we will see, parts (i) and (ii) of the assumption guarantee that energy minimizers at have interparticle spacings in , parts (iii) and (iv) ensure that is uniformly strictly convex in ; moreover the Hessian is diagonally dominant with positive diagonal entries and negative off-diagonal entries.
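As a concrete illustration of the Gibbs energy described above, the following minimal sketch evaluates it for a finite chain given by its interparticle spacings. The displayed formulas are not reproduced in this text, so the sketch rests on assumptions: the energy is taken to be the sum of pair interactions between particles at most `m` neighbors apart plus the pressure `p` times the total length, and the Lennard-Jones potential is normalized as `v(r) = r**-12 - r**-6`. The function names are hypothetical.

```python
def lennard_jones(r):
    """Assumed normalization: v(r) = r^-12 - r^-6, minimized at r = 2^(1/6)."""
    return r ** -12 - r ** -6


def gibbs_energy(spacings, m, p, v=lennard_jones):
    """Assumed form of the zero-temperature Gibbs energy of a finite chain:
    pair interactions between particles at most m neighbors apart,
    plus the pressure p times the total length of the chain."""
    n_bonds = len(spacings)
    energy = p * sum(spacings)
    for i in range(n_bonds):
        dist = 0.0
        # dist runs over z_i, z_i + z_{i+1}, ..., up to m consecutive spacings
        for j in range(i, min(i + m, n_bonds)):
            dist += spacings[j]
            energy += v(dist)
    return energy


r_min = 2 ** (1 / 6)      # minimizer of the assumed potential
chain = [r_min] * 9       # 10 particles, 9 equal spacings
print(gibbs_energy(chain, m=1, p=0.0))   # nearest neighbors only, zero pressure
print(gibbs_energy(chain, m=2, p=0.1))   # next-nearest neighbors, small pressure
```

With `m=1` and `p=0` each bond contributes the minimum value of the assumed potential, which gives a quick sanity check of the bookkeeping.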
Assumption 2.
The pressure satisfies with .
At positive temperature we shall assume in addition that , , and for some results we need . The next theorem is the adaptation of a similar result by Gardner and Radin [GR79]. It is proven in Section 3.1.
Theorem 2.1 (Bulk properties).
Let and as in Assumption 2.
- (a)
For every , the map has a unique minimizer . The minimizer has all its spacings in .
- (b)
As along , we have where is the unique minimizer of .
- (c)
The limit exists and is given by
[TABLE]
Let be the space of sequences with none or at most finitely many elements different from . Define
[TABLE]
When , is a function of the whole sequence. is the Gibbs energy of a semi-infinite chain, with additive constant chosen in such a way that at spacings the Gibbs energy is zero; represents the interaction of the left-most particle with everybody else. Let be the space of square summable strains.
Theorem 2.2 (Surface energy).
Let and as in Assumption 2. Equip with the -metric. Then
- (a)
extends to a continuous functional on .
- (b)
On it is strictly convex.
- (c)
has a unique minimizer. The minimizer lies in .
- (d)
The limit exists and is given by
[TABLE]
The theorem is proven in Section 3.2. Note that is the surface energy for a clamped chain with all spacings equal to and encodes the effect of boundary layers. is multiplied by because finite chains have two ends. We note that is exactly the boundary layer energy introduced by Braides and Cicalese [BC07]; Braides and Cicalese dealt with the special case of next-nearest neighbor interactions but more general potentials. For finite , see [SS18, Theorem 4.2].
For later purpose we also define a bulk functional
[TABLE]
It is defined, a priori, on the space of positive bi-infinite sequences that have at most finitely many elements . Denoting the space of square summable strains , an analysis similar to the one for the surface functional yields the following result.
Proposition 2.3 (Limiting bulk properties).
Let and as in Assumption 2. Equip with the -metric. Then
- (a)
extends to a continuous functional on .
- (b)
On it is strictly convex.
- (c)
The unique minimizer of is the constant sequence . The minimum value is .
- (d)
For every one has
[TABLE]
where is the total interaction between the left and right half-infinite chain.
2.2. Small positive temperature
Next we analyze infinite volume Gibbs measures on and in the limit . We focus on fixed positive but comment on vanishing at the end of the section. Let be the probability measure on defined by
[TABLE]
where
[TABLE]
Standard arguments (see Section 4) show there is a uniquely defined probability measure on the product space such that for every , every bounded continuous test function ,
[TABLE]
Similarly, there is a uniquely defined probability measure on such that for all local test functions as above, and all sequences with and ,
[TABLE]
Moreover the measure is shift-invariant and mixing. The measure describes the bulk behavior of a semi-infinite chain, the measure is the equilibrium measure for a semi-infinite chain and encodes the probability distribution of boundary layers.
Our first result is a large deviations principle for the equilibrium measure as . The rate function is a suitable extension of : define by
[TABLE]
In the same way extends to a map from to . Both and are equipped with the product topology.
Theorem 2.4.
Fix and . Assume that and . Then as , the equilibrium measures and satisfy large deviations principles with speed and respective rate functions and . The rate functions are good, i.e., lower semi-continuous with compact level sets.
The theorem is proven in Section 5.3. The large deviations principle for says that for every closed set and every open set (product topology)
[TABLE]
It is essential that we work in the product topology. Indeed we shall later see that is mixing, therefore for every , the measure gives full mass to sequences that have infinitely many spacings . Thus for every ball , we have hence , to be contrasted with the lower bound in Eq. (2.5).
Another consequence concerns the evaluation of the Gibbs energies of localized defects: suppose that because of some impurity, the energy is not but , where is, say, continuous in the product topology, localized in the bulk, and bounded from below. Then by Varadhan’s lemma [DZ98], as , the effective Gibbs energy converges to the zero temperature energy of the defect,
[TABLE]
Surface energies occur as a specific type of defect, when cancels all interactions between two half-infinite chains (see Proposition 4.9(a)), which leads to the following theorem. Define
[TABLE]
the Gibbs free energy per particle in the bulk and the surface correction .
Theorem 2.5.
Fix and . The limits (2.6) exist. If in addition and , then the bulk and surface Gibbs energy approach their zero-temperature counterparts when :
[TABLE]
This proves that the thermodynamic limit and the zero temperature limit can be exchanged, which is non-trivial (and in fact, fails when the pressure goes to zero too fast, see below).
One last consequence of Theorem 2.4 concerns the distribution of spacings and the pressure-density (or stress-strain) relation. The Gibbs free energy and our partition functions correspond to an ensemble where the overall length of the system is not fixed, but instead may fluctuate with a law that depends on the pressure—high pressures favor compressed states. In the thermodynamic limit , though, the average spacing between particles becomes a well-defined quantity, given by
[TABLE]
By the contraction principle [DZ98, Theorem 4.2.1], the distribution of under satisfies a large deviations principle with good rate function . The unique minimizer of is the ground state spacing . Lemma 5.1 implies that the distribution of spacings has exponential tails
[TABLE]
for some -independent constant .
Corollary 2.6.
Under the assumptions of Theorem 2.5, we have
[TABLE]
In particular, for large , we have where is the minimizer of the zero-stress Cauchy-Born energy density . Conversely, spacings (elongated chains) imply vanishing pressure . This is clearly apparent for nearest neighbor interactions (, Takahashi nearest neighbor gas [Tak42, LM66]), for which
[TABLE]
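For nearest neighbor interactions the low-temperature behavior of the average spacing can be checked numerically. The sketch below assumes that the spacings are i.i.d. with density proportional to `exp(-beta * (v(z) + p * z))`, that the potential is normalized as `v(r) = r**-12 - r**-6`, and that truncating the integrals to the hypothetical window `[0.8, 4.0]` is harmless for the parameters used. It compares the mean spacing with the minimizer of `v(z) + p * z`, found by bisection.

```python
import math


def v(r):
    """Assumed Lennard-Jones normalization."""
    return r ** -12 - r ** -6


def dv(r):
    """Derivative of the assumed potential."""
    return -12 * r ** -13 + 6 * r ** -7


def mean_spacing(beta, p, z_lo=0.8, z_hi=4.0, n=20000):
    """Mean of z under the density proportional to exp(-beta * (v(z) + p z)),
    approximated by a midpoint Riemann sum on the truncated window [z_lo, z_hi]."""
    h = (z_hi - z_lo) / n
    zs = [z_lo + (k + 0.5) * h for k in range(n)]
    es = [v(z) + p * z for z in zs]
    e_min = min(es)  # subtract the minimum so the largest weight equals 1
    ws = [math.exp(-beta * (e - e_min)) for e in es]
    return sum(z * w for z, w in zip(zs, ws)) / sum(ws)


def ground_state_spacing(p, lo=0.9, hi=2 ** (1 / 6), tol=1e-12):
    """Solve v'(a) + p = 0 by bisection on the branch where v' is increasing."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if dv(mid) + p < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


a = ground_state_spacing(1.0)
for beta in (5.0, 50.0, 500.0):
    print(beta, mean_spacing(beta, 1.0), a)
```

In this toy setting the printed mean spacings approach `a` as `beta` grows, in line with the statement that the average spacing is governed by the ground state spacing at low temperature.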
Comments on vanishing pressure. We add a superscript to indicate that zero-temperature quantities are evaluated at . When slower than any exponential, it is still true that . When with , one can show with [JKM15, Jan12] that
[TABLE]
At pressures vanishing faster than , the most likely configurations have very large spacings (dilute gas phase, ) and the previous results no longer apply. For , we expect that large deviations principles with rate functions and still hold (in fact our proofs still show weak large deviations principles). However rate functions have non-compact level sets and exponential tightness is lost. Moreover large spacings may contribute to the average (2.7) and Corollary 2.6 need no longer be true, thus allowing for spacings .
2.3. Gaussian approximation
Here we complement the large deviations result by a Gaussian approximation. This section deals with finite and the bulk measure only. Remember . We will see that the Hessian of at is associated with a positive-definite, bounded operator in . It is represented by a doubly-infinite matrix that is diagonally dominant. Write for the matrix elements of the inverse operator and let be the uniquely defined measure on , equipped with the product topology and its associated Borel -algebra, such that
[TABLE]
for all , and every finite-dimensional marginal of is a multi-dimensional Gaussian distribution. Equivalently, is the distribution of a Gaussian process with mean zero and covariance . More concrete expressions for the probability density functions of -dimensional marginals of are provided in Proposition 6.17 below.
In the following we identify the measure on with the measure on . We exclude the trivial case .
Theorem 2.7.
Assume , , and . Then for every , the -dimensional marginals of and have probability density functions and , and
[TABLE]
It follows that the distribution of the spacings, suitably rescaled, converges locally to the Gaussian measure : for every bounded function that depends on finitely many spacings only (bounded cylinder functions), we have
[TABLE]
For example, in the limit , the distribution of a single spacing is approximately normal, with mean and variance . We expect that Theorem 2.7 stays true for but a proof or disproof is beyond the scope of this article.
The next theorem says that the Gibbs free energy is close to the Gibbs free energy of the approximate Gaussian model.
Theorem 2.8.
Assume , , and . The Gibbs free energy satisfies, as ,
[TABLE]
where and is a positive-definite matrix.
The matrix is introduced in Eq. (6.18), see also Lemma 6.7; it is a function of the Hessian of the energy.
Remark (Gaussian approximation and semi-classical expansions).
If is smooth and is fixed, the Gibbs energy should admit an asymptotic expansion of the form
[TABLE]
to arbitrarily high order , for some and coefficients . The first correction comes from a Gaussian approximation of the partition function (harmonic crystal), see Section 6, with the constant capturing the asymptotic behavior of the determinant of the Hessian around the energy minimum. Higher order corrections correspond to anharmonic effects. A similar expansion holds for . Rigorous results for finite are derived with semi-classical analysis [Hel02, Møl01, BM03] which build on the analogy with the limit from quantum mechanics. For and potentials with superlinear growth at infinity, independent results are given in [SL17].
2.4. Decay of correlations
Suppose that two defects change the energy functional from to , where we assume for simplicity that and depend on and alone. For large , we may expect that the Gibbs energies are approximately additive, i.e.,
[TABLE]
should be small when the defects are far apart. represents an effective interaction between the defects. In the study of systems with many defects it is important to understand how fast the effective interaction decreases at large distances. Some intuition is gained from the zero-temperature counterpart
[TABLE]
however in general the limits cannot be interchanged and a full study of (2.10) for large requires techniques beyond variational calculus.
A closely related problem is about the localization of changes induced by a defect: at zero temperature, if is a minimizer of , how fast does converge to the ground state spacing as ? On a similar note, how fast does for a minimizer of the surface energy (decay of boundary layers)? At positive temperature, the question is about the speed of convergence, for test functions , in
[TABLE]
as . Here , so that when denotes the left shift on . These questions naturally lead to the investigation of the decay of correlations. We start with a general result which holds for all .
Theorem 2.9.
Assume and . There exist such that for all , , and bounded ,
[TABLE]
When is finite and , we have the stronger bound
[TABLE]
The theorem is proven in Section 4.2. When is finite, it implies exponential decay of correlations as , however the rate can be exponentially small for large . When is infinite, Theorem 2.9 implies algebraic decay of correlations: for and sufficiently large , is negligible compared to and we find that as
[TABLE]
Better bounds are available for restricted Gibbs measures. Let be the measure conditioned on and the probability measure on obtained from the thermodynamic limit of .
Proposition 2.10.
Let . There exists such that for all , smooth , and ,
[TABLE]
Remark.
When is finite, the uniform algebraic decay for the restricted Gibbs measure is replaced with uniform exponential decay with -independent .
The proposition is proven in Section 7. It follows from the uniform convexity of the energy (Lemma 3.3) and known results from the realm of Brascamp-Lieb, Poincaré and Log-Sobolev inequalities. Proposition 2.10 differs from the estimate (2.12) in two ways: there is no exponentially large prefactor , and the rate of algebraic decay is instead of . Exponentially large prefactors are absent because the energy landscape has no local minimum. The improved algebraic decay arises, roughly, because the Gibbs measure is comparable to a Gaussian measure whose covariance is the inverse of the energy’s Hessian near the minimum, and instead of the tails of , it is the tails of that count.
We suspect that for large and small pressure, these improvements should carry over to the full Gibbs measure , but we have proofs for interactions involving finitely many neighbors only.
Theorem 2.11.
Assume , , and . There exists such that for all sufficiently large , suitable , all , and all , we have
[TABLE]
If , we can pick .
The theorem is proven in Section 6 with perturbation theory for compact integral operators in . When , the relevant operators are self-adjoint and spectral norms and operator norms coincide, leading to improved statements. We conclude with a few comments.
Lagrangian vs. Eulerian point of view. The theorems above formulate decay of correlations in terms of labelled spacings, which in the language of continuum mechanics is a Lagrangian viewpoint. On the other hand, in statistical mechanics of point particles it is more common to deal with unlabelled particles (Eulerian viewpoint) and correlations are between portions of space rather than labelled interparticle distances. The difference between the two approaches becomes quite clear for nearest neighbor interactions (, see Eq. (2.8)), for which the spacings are i.i.d. with probability density proportional to . Because of the independence of spacings, correlations in terms of spacings vanish, . On the other hand, the two-point function (footnote 1: Intuitively, represents the probability for having one particle at [math] and one particle at . Rigorously, and for every , is the average number of ordered particle pairs in .) studied in statistical mechanics of particles is a sum over the number of particles contained in ,
[TABLE]
with the -fold convolution of with itself. It is a well-known fact from renewal theory [Fel71, Chapter XI] that
[TABLE]
but in general the difference is non-zero finite for —in fact changing the convergence as can be arbitrarily slow, even though correlations of labelled interparticle spacings vanish identically. One should keep this difference in mind when browsing the literature.
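The renewal-theory fact invoked above has a simple discrete analogue that can be checked numerically. The sketch below is not the continuous two-point function of the text: it assumes integer-valued i.i.d. spacings with a hypothetical probability mass function `f`, computes the probability `u[n]` that some particle sits at site `n` given a particle at site 0, and compares it with the reciprocal of the mean spacing.

```python
def renewal_sequence(f, n_max):
    """Renewal recursion u[n] = sum_k f[k] * u[n - k] with u[0] = 1,
    for a spacing law f given as a dict {k: probability} on k = 1, 2, ..."""
    u = [1.0] + [0.0] * n_max
    for n in range(1, n_max + 1):
        u[n] = sum(prob * u[n - k] for k, prob in f.items() if k <= n)
    return u


f = {1: 0.5, 2: 0.5}  # hypothetical aperiodic spacing law
mean = sum(k * prob for k, prob in f.items())
u = renewal_sequence(f, 60)
print(u[1], u[2], u[60], 1 / mean)
```

Although the spacings are independent, `u[n]` differs from its limit `1 / mean` at every finite distance, and the speed of convergence depends on the chosen spacing law. This mirrors the distinction between labelled (Lagrangian) and unlabelled (Eulerian) correlations.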
Path-large deviations, non-linear semi-groups, Bellman equation. For , we may view as the law of a stationary Markov chain with state space and transition kernel defined in Eq. (6.6). Theorem 2.4 is a path-large deviations result for the Markov chain. Path large deviations are often investigated with the help of non-linear semi-groups and Hamilton-Jacobi-Bellman equations [FK06]. In our context, a natural non-linear semi-group is
[TABLE]
and for sufficiently smooth we have a convergence of the form
[TABLE]
where solves
[TABLE]
Similar equations, motivated by quantum mechanics and geometric optics, appear in semi-classical analysis [Hel02, Eq. (5.4.4)]. Proposition 3.9 below provides an infinite- ersatz and is instrumental in the proof of Theorem 2.4.
Vanishing pressure. When faster than (see (2.9)), the Gibbs measure should no longer be comparable to a Gaussian. Instead, it should be close to the ideal gas measure, for which spacings are i.i.d. exponentially distributed with parameter , and we may again expect uniform exponential decay of correlations (for finite ). When at a speed comparable to , we should instead expect an exponentially small spectral gap: the Markov chain has two metastable wells, one corresponding to the optimal spacing and another well at infinity. The exponentially small spectral gap is associated with the fracture of the chain of atoms, in the spirit of “fracture as a phase transition” [Tru96].
3. Energy estimates
In this section we analyze the variational problems arising at zero temperature. Throughout the section we assume that as in Assumption 2.
3.1. Bulk periodicity
Lemma 3.1.
Every minimizer of lies in .
Proof.
Let . If for some , define a new configuration by shrinking to , leaving all other spacings unchanged: for and . Since is a strict minimizer of and increases on , shrinking the bonds decreases strictly and the original configuration could not have been a minimizer.
If some interparticle spacing is smaller than , we remove a particle and reattach it to one end of the chain as follows. Assume and let with . Let and , be associated particle positions. Thus and for all . The interaction of with all other particles is
[TABLE]
For finite we note that, if for an , then by Assumption 1(i). Removing the particle thus leads to a configuration of atoms whose energy has decreased by at least
[TABLE]
The last inequality holds because of Assumption 1(ii) and . We define a new configuration by attaching the removed particle to either end of the chain at a distance . Since by Assumption 2, this decreases further, so overall the new configuration has strictly smaller energy, and the original sequence of spacings cannot be a minimizer of . ∎
At zero pressure, it is a well-known fact that the -particle energy is subadditive, . Indeed placing two ,-particle minimizers side by side with large mutual distance, because of at , yields an -particle configuration with energy . Positive pressure penalizes large mutual distances between two consecutive blocks, so the construction has to be modified.
Lemma 3.2.
Let and . Then for all , and the limit exists and satisfies for all .
Proof.
Let and be minimizers of and respectively. Define by concatenating and . By Lemma 3.1, all spacings are in . Therefore interactions that involve bonds from both blocks are for spacings , hence negative, and
[TABLE]
As a consequence, is subadditive. By Fekete’s subadditive lemma, the limit exists and is equal to the infimum of , hence . Notice that since
[TABLE]
(In the terminology of statistical mechanics, the energy is stable [Rue69, Chapter 3.2].) ∎
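Fekete's lemma, as used in the proof above, says that for a subadditive sequence the ratio of the N-th term to N converges to the infimum of these ratios. The toy sequence below, a bulk term plus a non-negative constant, is a hypothetical stand-in chosen only to make the statement tangible; it is not the minimal energy of the chain, and the constants are made up.

```python
def is_subadditive(a, n_max):
    """Check a(n + k) <= a(n) + a(k) for all n, k >= 1 with n + k <= n_max."""
    return all(
        a(n + k) <= a(n) + a(k) + 1e-12
        for n in range(1, n_max)
        for k in range(1, n_max - n + 1)
    )


E_BULK = -1.0   # hypothetical bulk energy per particle
SURFACE = 0.3   # hypothetical non-negative boundary cost


def toy_energy(n):
    """Toy subadditive sequence: bulk contribution plus a constant boundary term."""
    return E_BULK * n + SURFACE


ratios = [toy_energy(n) / n for n in (1, 10, 100, 1000)]
print(is_subadditive(toy_energy, 200), ratios)
```

The ratios decrease towards `E_BULK`, which is both their limit and their infimum, and every ratio stays above it, matching the lower bound obtained in the lemma.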
The next lemma in particular shows that is uniformly convex on . For later purposes, we state and prove this on a slightly larger set.
Lemma 3.3.
There are constants such that for all with , and with for , the Hessian of at satisfies
[TABLE]
for all . Moreover, the submatrix of the Hessian has strictly positive diagonal entries and non-positive off-diagonal entries . In particular, this matrix is monotone.
Note that the Hessian is independent of the pressure .
Proof.
Let be the collection of discrete intervals of length . Then for all
[TABLE]
For and we have hence ; it follows that the off-diagonal entries of the Hessian are non-positive. Next we show that the row-sums are bounded from below by some constant if .
[TABLE]
Assumption 1 guarantees that for sufficiently small. Thus row sums are positive, off-diagonal matrix elements non-positive, and consequently diagonal elements positive. Moreover, with the diagonal elements are bounded from above by . The proof of the lemma is then completed with the help of standard arguments, for example every eigenvalue of lies in a Gershgorin circle with center and radius . In particular, is an M-matrix and thus monotone. ∎
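The structure used in this proof (non-positive off-diagonal entries, positive row sums, and hence a positive Gershgorin lower bound on the eigenvalues) can be checked numerically in a small example. The sketch below assumes that the Hessian entry for spacings `i` and `j` is the sum of second derivatives of the pair potential over all index intervals of length at most `m` containing both, that the potential is `v(r) = r**-12 - r**-6`, and that all spacings equal the hypothetical value `2**(1/6)`; none of these specifics are taken from the displayed formulas.

```python
def v2(r):
    """Second derivative of the assumed potential v(r) = r^-12 - r^-6."""
    return 156 * r ** -14 - 42 * r ** -8


def hessian(spacings, m):
    """Assumed Hessian: H[i][j] is the sum of v''(z_a + ... + z_b) over all
    index intervals [a, b] with at most m spacings that contain both i and j."""
    n = len(spacings)
    H = [[0.0] * n for _ in range(n)]
    for a in range(n):
        dist = 0.0
        for b in range(a, min(a + m, n)):
            dist += spacings[b]
            w = v2(dist)
            for i in range(a, b + 1):
                for j in range(a, b + 1):
                    H[i][j] += w
    return H


z = [2 ** (1 / 6)] * 8          # hypothetical uniform spacings
H = hessian(z, m=3)
off_diag_max = max(H[i][j] for i in range(8) for j in range(8) if i != j)
# With non-positive off-diagonal entries, the Gershgorin lower bound
# H[i][i] - sum_{j != i} |H[i][j]| is exactly the row sum.
gershgorin_lower = min(sum(row) for row in H)
print(off_diag_max, gershgorin_lower)
```

For these parameters the largest off-diagonal entry is non-positive and the smallest row sum is strictly positive, so every eigenvalue of this toy Hessian is bounded below by a positive constant.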
Proof of Theorem 2.1.
(a) By Lemma 3.1 minimizers lie in the compact set . On that set the Hessian of is positive definite because of Lemma 3.3, so is strictly convex and the minimizer is unique.
(b) With the help of Lemma 3.3, the convergence as along , where is the unique minimizer of , is a straightforward adaptation of the corresponding proof in [GR79] and is omitted. By Assumption 1(ii) we even have . We remark that the proof in [GR79] also shows that for . This in turn implies that the convergence is in fact uniform away from a boundary layer of vanishing volume fraction.
(c) This observation in combination with Lemma 3.2 yields (c). Note that since by Assumptions 1 and 2.
∎
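For orientation, the bulk ground state spacing can be computed numerically in a toy setting. The sketch below makes two assumptions that are not quoted from the text: the pair potential is the Lennard-Jones form v(r) = r^{-12} - 2 r^{-6}, and the energy per particle of a chain with constant spacing z, interaction range m and pressure p is p z + sum_{k=1}^{m} v(k z). All function names are hypothetical.

```python
def lennard_jones(r: float) -> float:
    """Illustrative pair potential with minimum value -1 at r = 1 (an assumption)."""
    return r ** -12 - 2.0 * r ** -6


def energy_per_particle(z: float, p: float, m: int) -> float:
    """Assumed energy per particle for constant spacing z, range m and pressure p."""
    return p * z + sum(lennard_jones(k * z) for k in range(1, m + 1))


def minimize_unimodal(f, lo: float, hi: float, tol: float = 1e-9) -> float:
    """Ternary search for the minimizer of a unimodal function on [lo, hi]."""
    while hi - lo > tol:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


def ground_state_spacing(p: float, m: int) -> float:
    """Minimize the assumed energy per particle over spacings in [0.8, 1.2]."""
    return minimize_unimodal(lambda z: energy_per_particle(z, p, m), 0.8, 1.2)


a_nearest = ground_state_spacing(p=0.0, m=1)   # nearest neighbours, zero pressure
a_range_two = ground_state_spacing(p=0.0, m=2)  # next-nearest neighbours compress the chain
a_pressure = ground_state_spacing(p=0.1, m=1)   # positive pressure compresses the chain
print(a_nearest, a_range_two, a_pressure)
```

In this toy model the spacing equals the potential minimum only for nearest-neighbour interactions at zero pressure; longer range or positive pressure shortens it.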
Notice that also except for the exceptional cases in which only nearest neighbors interact, i.e. or for , and the pressure vanishes.
3.2. Surface energy
**Proposition 3.4.**
Let and . Then
[TABLE]
Proof.
For simplicity we write down the proof for ; the proof when is completely analogous. Fix and . Let with and . Let be the spacings of the -particle ground state, labelled by rather than . Choosing and large enough we may assume . Since the Hessian has matrix norm uniformly bounded from above (Lemma 3.3), changing the spacings to increases the energy by at most thus
[TABLE]
We decompose the energy of the modified configuration as where
[TABLE]
where gathers interactions that involve bonds from two consecutive blocks. The term represents the interactions between the left and right blocks. It satisfies
[TABLE]
which goes to zero as . Next we subtract from and distribute it as over the first three sums. The middle block contributes
[TABLE]
as . For the first block, we notice that
[TABLE]
Indeed, the only missing pieces are the negative interactions between the left block and the right tail of a semi-infinite chain. The contribution of the right block is estimated in a similar way. We combine the estimates and let first , then , and finally and find
[TABLE]
For the upper bound, we take approximate minimizers of and glue them together to an -particle configuration by assigning them to the left and right boundaries, with spacings in between. This yields an -particle configuration with energy , and the required upper bound follows. ∎
Next we extend to the space of sequences with .
**Lemma 3.5.**
Let . Let , . Then for all , we have
[TABLE]
The right-hand side is absolutely convergent for all .
Proof.
Let . Using , we have
[TABLE]
The equilibrium condition yields
[TABLE]
and the alternate expression for follows. Next consider with for all . Under Assumption 1 the derivatives behave as and as with . It follows that decays like so that . The Cauchy-Schwarz inequality then shows
[TABLE]
for some suitable -independent constant . In particular, when the sum is absolutely convergent. In order to show that the double sum over and in Eq. (3.2) is absolutely convergent, we proceed with estimates analogous to Lemma 3.3. Assume first that all spacings are larger than . Set and note that, by Assumption 1(iii) for all , . Hence
[TABLE]
More generally, if , then and because of , there is an such that for all . Let . Summands with can be estimated as before. For and , we proceed as before as well, except that we replace by . This leaves a finite sum over and overall, the sum is absolutely convergent. ∎
**Lemma 3.6.**
The map , $(z_{j})\mapsto\mathcal{E}_{\mathrm{surf}}\bigl((z_{j})_{j\in\mathbb{N}}\bigr)$, defined by (3.2) is continuous.
Proof.
Let be sequences in such that in . As uniformly in , the estimates above show that for every , we can find such that the sum over contributes to and an amount bounded by . In the remaining finite sum the continuity of allows us to pass to the limit. The proof is easily concluded with an argument. ∎
**Lemma 3.7.**
The restriction of to is strictly convex and satisfies
[TABLE]
for suitable -independent constants .
Proof.
The proof of the convexity is similar to Lemma 3.3 and therefore omitted. For the coercivity, consider first . Let , the truncated strain, and . Then
[TABLE]
thus
[TABLE]
where . Next we cut and paste into the middle of a large ground state chain: let with , and the spacings of the -particle ground state. Let . A Taylor expansion of around the minimizer together with Lemma 3.3 and Theorem 2.1 yields
[TABLE]
On the other hand, let be a bound for interactions between blocks and remember by Lemma 3.2 and . Then
[TABLE]
We combine with Eq. (3.3) and let first , then , and conclude that with the help of Lemma 3.6. This proves the coercivity in the case . The proof for finite is similar. ∎
**Lemma 3.8.**
The surface energy has a unique minimizer in . The minimizer is in .
Proof.
We proceed as in Section 3.1. Let . If one of the ’s is larger than , we can define a new configuration by shrinking this spacing to , leaving all other spacings unchanged. This decreases . If one of the ’s is smaller than , let be the smallest among them, and with . Then we can define a new configuration by removing a participating particle and possibly shrinking a bond, i.e., . Since , just as in Lemma 3.1, we see that this decreases the energy. Repeating these steps if necessary, the initial configuration is mapped to a new one that has strictly lower energy and all spacings in .
The existence of a minimizer now follows from the coercivity proven in Lemma 3.7, the compactness of with respect to the weak -convergence (shifted by ) and the weak lower semicontinuity of on that set due to Lemmas 3.6 and 3.7. The minimizer is unique because of the strict convexity from Lemma 3.7. ∎
Proof of Theorem 2.2.
Clear from Lemmas 3.6, 3.7, 3.8 and Proposition 3.4. ∎
Proof of Proposition 2.3.
In complete analogy to Lemma 3.5 we obtain
[TABLE]
for all , and as in Lemma 3.6, we see that (3.4) defines a continuous map . The proof of strict convexity, even on for some , is again similar to Lemma 3.3. As in Lemma 3.8 we have that has a unique minimizer in , which lies in . Since and for every by (3.4), the minimizer of is . Clearly, . Finally, the formula connecting and is clear on and follows on by approximation. ∎
3.3. A fixed point equation
In the following we assume that has a hard core:
**Assumption 3.**
and as .
We extend , defined by (2.1) on , to by setting
[TABLE]
Our main aim in this subsection is to obtain the following characterisation of , cf. (2.4).
**Proposition 3.9.**
Let . Then is the unique lower semi-continuous solution (product topology) of the equation
[TABLE]
such that and if for one of the ’s.
Note that, by induction, (3.6) is equivalent to
[TABLE]
for all and . (Observe that for all by the decay assumption on and .)
We begin with a technical auxiliary result.
**Lemma 3.10.**
If and are such that
[TABLE]
then . Moreover, any satisfies
[TABLE]
Proof.
Let . The partial sum is equal to the energy plus an interaction
[TABLE]
(the inner sum being [math] if ) which is bounded from below by
[TABLE]
By adding and spacings to the left and right respectively, we may view as a block of spacings in an -particle configuration where . Let . The new configuration satisfies
[TABLE]
for some suitable constant that depends on , and only. Let be the -particle ground state with spacings labelled by rather than . Since by Lemma 3.2 and , we get
[TABLE]
Suppose that all spacings are in . We use a Taylor approximation around the minimizer , apply Lemma 3.3 and Theorem 2.1, and obtain
[TABLE]
Letting we obtain an upper bound for the -norm of . If there are with or , we modify the configuration without increasing its energy as in the proof of Lemma 3.1 to obtain . When we shrink bonds to , leaving all other spacings unchanged, both and are strictly larger than so the truncated -norm $\sum_{j=1}^{k}\min\bigl((z_{j}-a)^{2},\varepsilon_{0}^{2}\bigr)$ is unaffected.
On the other hand, suppose . Then we remove the particle and reattach it at a distance to the left of the -particle block. This effects the change
[TABLE]
on the -norm. Both and are larger than , moreover
[TABLE]
So the truncated -norm increases by at most . Let be the number of times this step has to be performed. Iterating we arrive at a configuration with
[TABLE]
and for some , cf. (3.1). Making smaller if necessary we may assume . We combine with Eq. (3.8) for and and obtain
[TABLE]
We let and find that the truncated -norm of is finite. It follows in particular that there are only finitely many spacings , and is square summable. This establishes the first assertion.
In order to show the convergence of the partial sums to , first observe that satisfies (3.7) for . This is clear for and follows for general by continuity. If , the sequence of shifts converges to strongly and thus
[TABLE]
as . ∎
We have actually proven the following: for sufficiently small , suitable , and all ,
[TABLE]
Proof of Proposition 3.9.
Let . Observe that satisfies (3.6). This is clear for and for . For the remaining it follows from Lemma 3.6. We now show that is lower semi-continuous with respect to pointwise convergence. Without loss we suppose that converges to pointwise with for some constant . Passing to a subsequence (not relabelled) we may furthermore assume that . Fix an such that the estimate in Lemma 3.7 is satisfied. By (3.9)
[TABLE]
for some uniform constant since . For given we denote by the first index , if existent, with . Passing to a further subsequence (not relabelled) and choosing sufficiently large we may achieve that either such indices do not exist or that as . In both cases we get that for . In particular, for .
In the second case we define new configurations by applying the procedure detailed in the proof of Lemma 3.8 to the tails shrinking the bonds , , and deleting particles if , , so that
[TABLE]
In the first case we simply set . Since in the second case, we still have pointwise.
By (3.7) with we have
[TABLE]
From the decay properties of and it is easy to see that, for any , converges to . Since and , from Assumption 3 we also get for . So
[TABLE]
In particular, and so Lemma 3.7 implies that and in by coercivity and hence that
[TABLE]
by convexity. Summarizing we obtain
[TABLE]
Suppose, conversely, that a lower semi-continuous satisfies (3.6) with and if for some . We first note that, since , for any with one has
[TABLE]
by (3.7) and so by Lemma 3.10. It thus suffices to show that
[TABLE]
for all .
If , then is indeed finite by Lemma 3.5. We have $\lim_{k\to\infty}\sum_{j=1}^{k}\bigl(h(z_{j},\ldots,z_{j+m-1})-e_{0}\bigr)=\mathcal{E}_{\mathrm{surf}}(z)$ by Lemma 3.10. Since the sequence of shifts converges to pointwise as , taking the in (3.7) yields
[TABLE]
Note that, as , this inequality also shows that .
For the reverse inequality, by choosing large enough in (3.7) we first see that (3.10) holds true for all . We denote by the truncation with for and for . Since pointwise and in as , lower semi-continuity of and strong continuity of (see Lemma 3.6) give
[TABLE]
where we have used that for all . ∎
We now restrict to the case . Let . By (3.7) with we have
[TABLE]
for any , where
[TABLE]
Taking the infimum over , with fixed and setting
[TABLE]
(recall Lemma 3.6) leads to
[TABLE]
In Section 6 we will need the following estimate.
**Lemma 3.11.**
Set and . Then, for any there exists a such that
[TABLE]
for all .
Proof.
Suppose is such that , in particular, for . If we construct a new configuration without changing the first spacings similarly as in the proofs of Lemma 3.1 and 3.8.
If , we define by setting for and . Then
[TABLE]
Now assume . We choose an with and define by setting for , and for . As in Lemmas 3.1 and 3.8 (in particular using that ), we see that
[TABLE]
The estimates (3.12) and (3.13) show that, for any with and there is a with such that
[TABLE]
where $\delta=\min\bigl\{v(z_{\max}+\varepsilon)-v(z_{\max}),\ 2\alpha_{1}\sum_{n=m+1}^{\infty}(nz_{\min})^{-s}\bigr\}>0$. Using (3.11) we arrive at
[TABLE]
The claim now follows by taking the infimum over with fixed conditioned on . ∎
A simpler proof gives the following estimate that will also be needed in Section 6.
**Lemma 3.12.**
For any there exists a such that for all .
Proof.
By continuity we may assume that . If , we define by setting for and . Then
[TABLE]
If , we choose the smallest with and define by setting for , and for . As in (3.13) we get
[TABLE]
This concludes the proof. ∎
4. Gibbs measures for the infinite and semi-infinite chains
Here we prove the existence of , , , and check that is shift-invariant and mixing, hence ergodic; the results and methods are fairly standard. In addition, we provide an a priori estimate on the decay of correlations with explicit analysis of the -dependence (Theorem 4.4) which to the best of our knowledge is new. The results from this section need only very little on the pair potential: we only use that has a hard core and that , for large , with . The technical assumption of a hard core frees us from superstability estimates [LP76, Rue76]. The decay of the potential ensures that the infinite volume Gibbs measure is unique, see e.g. [Geo11, Chapter 8.3] and [Pap84a, Pap84b, Kle85].
We follow the classical treatment of one-dimensional systems with transfer operators. For compactly supported pair potentials with a hard core (or, in our case, when is chosen finite), the transfer operators are integral operators in [Rue69, Chapter 5.6], see Section 6. For long-range interactions, the transfer operator (also known as Ruelle operator or Ruelle-Perron-Frobenius operator) acts instead from the left on functions of infinitely many variables, and from the right on measures [Rue68, GMS70, Rue78]. The formalism of transfer operators keeps being developed in the context of dynamical systems and ergodic theory [Bal00b, Bal00a].
For the decay of correlations, we adapt [Pol00] to the present context of continuous unbounded spins and carefully track the -dependence in the bounds. In Section 5.3, transfer operators will also help us investigate the large deviations behavior of the Gibbs measures; notably the eigenvalue equation from Lemma 4.1 translates into a fixed point equation for the rate function (see Lemma 5.4).
The results of this section hold for all and ; the additional condition is not needed.
4.1. Transfer operator
For and we abbreviate , cf. (2.1) and (3.5). The transfer operator acts on functions as
[TABLE]
The dual action on measures is defined by and is given by
[TABLE]
**Lemma 4.1.**
There exist and a probability measure on such that
[TABLE]
Moreover and the pair is unique.
We will show in Proposition 4.9 that is the measure satisfying (2.2). The non-compactness of forms an obstacle to the application of a Schauder-Tychonoff fixed point theorem for the map , see e.g. [Rue68, Proposition 2]. It might be possible to remove the obstacle using tightness estimates, but we prefer to follow a different route and exploit the known uniqueness of infinite volume Gibbs measures [Geo11, Chapter 8.3] instead.
Proof.
Let be a probability measure on , , and . We show that if is a Gibbs measure, then is a Gibbs measure as well. Let us first introduce the kernels needed to formulate that is a Gibbs measure. By [Geo11, Theorem 1.33] it is enough to look at one-point kernels. Pick . For and , let
[TABLE]
where the sum runs over discrete intervals . Further define the kernel
[TABLE]
where and . The kernel acts on functions and measures in the usual way, in particular . Notice that for all . Indeed yields a function where -dependence has been integrated out, and integrating it against the probability measure does not change its value. Replacing with , we define in a completely analogous fashion conditional energies and kernels $\gamma_{k}^{0}\bigl((z_{j})_{j\in\mathbb{N}_{0}},B\bigr)$.
Suppose that is a Gibbs measure, i.e., for all . Let be a measurable test function. Treat as a measure on . We check that for all . For , this property is inherited from the Gibbsianness of : we have
[TABLE]
Set . Note . Therefore
[TABLE]
hence . For , the required property follows from the definition of . Notice and
[TABLE]
Let . Then
[TABLE]
The previous identities hold for all non-negative test functions , consequently for all and is a Gibbs measure as well.
By [Geo11, Theorem 8.39], the Gibbs measure exists and is unique. Treating and both as measures on , we must therefore have , i.e., the unique Gibbs measure is an eigenmeasure of and in particular, there exists an eigenmeasure. Conversely, let be an eigenmeasure. Arguments similar to the investigation of given above, based on the iterated fixed point equation , show that for all and all , hence for all . Every eigenmeasure is a Gibbs measure. Since the latter is unique, the eigenmeasure is unique as well. Finally, since for , the eigenmeasure must satisfy . This holds for all , hence . ∎
Let be the probability measure on obtained by flipping , i.e., is the image of under the map . The measures represent equilibrium measures for the left and right half-infinite chains. Let
[TABLE]
be the total interaction between left and right half-infinite chains, cf. Proposition 2.3(d). We abbreviate the shifted versions as . Define by
[TABLE]
Thus represents an averaged contribution to the Boltzmann weight from the left half-infinite chain.
**Lemma 4.2.**
We have and .
Proof.
The normalization is obvious, for the eigenvalue equation let and use the eigenvalue equation for
[TABLE]
See also [Rue78, Section 5.12]. ∎
Define the operator
[TABLE]
so that and . Let be the probability measure on given by
[TABLE]
We will show in Proposition 4.9 that is the measure satisfying (2.3). Notice that for every bounded measurable function that depends on right-chain variables only,
[TABLE]
Let be the shift .
**Lemma 4.3.**
(a) is shift-invariant.
(b) For all and all , we have .
The proof is standard [Rue78] and therefore omitted. The lemma can be rephrased as follows: let be a stochastic process with law , defined on some probability space . Then is stationary, and
[TABLE]
Our next task is to show that the process is not only stationary but in fact ergodic and to estimate the decay of correlations.
4.2. Ergodicity
Bounds on correlations are most conveniently expressed with the help of variations, semi-norms that quantify how much a function depends on faraway variables. Notice that . Let be a function and . The th variation of on is
[TABLE]
When the constraint on initial values is empty, is sometimes called the oscillation of [Geo11, Eq. (8.2)]. The oscillation vanishes if and only if is constant. Notice that decays algebraically: for , as ,
[TABLE]
It follows that the variation is summable, . Set
[TABLE]
Notice that for all , is independent of and . In fact the pressure only enters the oscillation . By a slight abuse of notation we identify a function with the function , and write instead of . The results of this subsection hold for all .
**Theorem 4.4.**
Let and . The measure is mixing with respect to shifts, i.e., as , for all . Moreover for and all bounded , , ,
[TABLE]
We prove Theorem 4.4 with Pollicott’s method of conditional expectations [Pol00]. For alternative approaches, see [Sar02] and the references therein. The principal idea is the following: for , let be the projection
[TABLE]
onto the subspace of functions that depend on the first coordinates only, i.e., . In terms of the stationary process with law ,
[TABLE]
Notice that
[TABLE]
where is the norm. Let . Then
[TABLE]
The difference enclosed in parentheses represents a truncation error; it is made small by choosing large. On the subspace of mean-zero functions, the truncated operator satisfies a contraction property uniformly in (Lemma 4.7), and goes to zero exponentially fast as .
**Lemma 4.5.**
We have for all and .
Proof.
Let , such that for all . Then
[TABLE]
and . The claim then follows from the definition (4.1) of the invariant function. ∎
**Lemma 4.6.**
Let be a bounded function. Then ,
[TABLE]
Proof.
Let on and on so that
[TABLE]
Pick so that for . Then
[TABLE]
We integrate out , observe , and deduce
[TABLE]
To conclude, we note
[TABLE]
∎
**Lemma 4.7.**
Let such that . Then for all and
[TABLE]
Proof.
We adapt [Rue68, Proposition 3]. Consider first a non-negative function that depends on only, i.e., . Let such that for and as in the proof of Lemma 4.6. Then
[TABLE]
By Inequality (4.5) with we have , uniformly in . Thus for all . For non-negative with we have by Lemma 4.3
[TABLE]
Next let with and . Then and
[TABLE]
We integrate against , use , and find . This holds for every local function with . For general , we may apply the bound to and use and , and we are done. ∎
**Lemma 4.8.**
Let be a bounded map with . Then for all ,
[TABLE]
Proof.
A telescope summation, the triangle inequality, and Lemma 4.7 yield
[TABLE]
where in the second step we use that $\nu_{\beta}\bigl((\mathcal{S}_{\beta}^{n}\Pi_{n})^{i}\bigl(\mathcal{S}_{\beta}^{n}\Pi_{n}-\mathcal{S}_{\beta}^{n}\bigr)(\mathcal{S}_{\beta}^{n})^{q-k-1}f\varphi_{\beta}\bigr)=\nu_{\beta}(f\varphi_{\beta})=0$ for by Lemma 4.3 and the third step follows from $|\mathcal{S}_{\beta}^{n}\bigl(\Pi_{n}-\mathrm{id}\bigr)(\mathcal{S}_{\beta}^{n})^{q-k-1}f|\leq\mathcal{S}_{\beta}^{n}|\bigl(\Pi_{n}-\mathrm{id}\bigr)(\mathcal{S}_{\beta}^{n})^{q-k-1}f|$ and Lemma 4.3. By Eq. (4.4) and Lemma 4.6, this can be further estimated as
[TABLE]
Proof of Theorem 4.4.
Let be bounded functions and , . Using Eq. (4.2) and Lemmas 4.7 and 4.8, we get
[TABLE]
since . The explicit estimate on the decay of correlations follows. That is mixing then follows from standard approximation arguments. ∎
Proof of Theorem 2.9.
The estimate for infinite is an immediate consequence of Theorem 4.4. For finite and , the truncation error in Lemma 4.8 for a function actually vanishes since and . The bound simplifies accordingly. ∎
4.3. Thermodynamic limit
**Proposition 4.9.**
Let and .
(a) The Gibbs free energy and its surface correction defined by the limits (2.6) exist and are given by
[TABLE]
(b) Eqs. (2.2) and (2.3) hold true.
Proof.
We compute
[TABLE]
Let . We note
[TABLE]
and with (4.3) deduce
[TABLE]
Now uniformly on . By Theorem 4.4, where . Consequently as
[TABLE]
from which part (a) of the proposition follows. A computation analogous to Eq. (4.6) shows that for every local test function ,
[TABLE]
Part (b) of the proposition then follows from Theorem 4.4. ∎
5. Large deviations as
Here we analyze the behavior of the bulk and surface Gibbs measures and and of the energies and . The large deviations result for the surface measure is a consequence of the eigenvalue equation from Lemma 4.1, exponential tightness, and the uniqueness of the solution to the fixed point equation in Proposition 3.9. Since the bulk measure is absolutely continuous with respect to the product measure of two independent half-infinite chains (Eq. (4.2) and Proposition 4.9(b)), we may go from the surface to the bulk measure with the help of Varadhan’s integral lemma [DZ98, Chapter 4.3]. The asymptotic behavior of is based on the representation from Proposition 4.9(a).
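Varadhan's integral lemma generalizes the classical Laplace principle, which states that -(1/β) log ∫ exp(-β f(x)) dx converges to min f as β tends to infinity. The sketch below checks this numerically for a hypothetical one-dimensional function f on the interval [0, 2]; the function, the interval and the quadrature grid are illustrative assumptions and are unrelated to the specific functionals of this section.

```python
import math


def laplace_functional(f, beta: float, lo: float, hi: float, n: int = 20_000) -> float:
    """Return -(1/beta) * log(integral of exp(-beta * f) over [lo, hi]).

    The integral is approximated by a midpoint rule. Subtracting the smallest
    sampled value of f before exponentiating avoids underflow for large beta.
    """
    dx = (hi - lo) / n
    values = [f(lo + (i + 0.5) * dx) for i in range(n)]
    f_min = min(values)
    total = sum(math.exp(-beta * (v - f_min)) for v in values) * dx
    return f_min - math.log(total) / beta


def f(x: float) -> float:
    """Hypothetical energy with minimum value 0.25 attained at x = 1."""
    return 0.25 + (x - 1.0) ** 2


betas = [1.0, 10.0, 100.0, 1000.0]
approximations = [laplace_functional(f, b, 0.0, 2.0) for b in betas]
for b, value in zip(betas, approximations):
    print(f"beta = {b:>6}: {value:.5f}")
```

For this quadratic example the deviation from the minimum 0.25 is of order log(β)/β, which is the size of the Gaussian fluctuation correction.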
5.1. A tightness estimate
The following estimate will help us prove that the infinite-volume measure is exponentially tight (see the proof of Lemma 5.3) which enters the proof of Theorem 2.4.
**Lemma 5.1.**
For all , , , and , we have
[TABLE]
Proof.
Fix and . For with we define a new configuration by setting and leaving all other spacings unchanged. This decreases the Gibbs energy by an amount at least
[TABLE]
A change of variables thus yields
[TABLE]
and the proof of the lemma is easily concluded. ∎
5.2. Gibbs free energy in the bulk
**Lemma 5.2.**
Let at fixed . Then
[TABLE]
Proof of Lemma 5.2.
The relation between and has been proven in Proposition 4.9. We proceed with an upper bound for and . For , define by . Revisiting the proof of Lemma 3.1, we see that
[TABLE]
It follows that
[TABLE]
and
[TABLE]
whence . For a lower bound, we let be the minimizer of and choose so small that by Lemma 3.3
[TABLE]
for every . We get
[TABLE]
This yields
[TABLE]
and . ∎
5.3. Large deviations principles for and
Here we prove Theorem 2.4.
**Lemma 5.3.**
Every sequence has a subsequence along which satisfies a large deviations principle with speed and some good rate function.
*Remark.*
If , we lose exponential tightness and only know that every sequence has a subsequence along which it satisfies a weak large deviations principle [DZ98, Lemma 4.1.23], which means that the upper bound in (2.5) is required to hold for compact sets rather than closed sets.
Proof.
The lemma is a consequence of exponential tightness. Let . Define . is compact in the product topology. Passing to the limit in Lemma 5.1, we find
[TABLE]
for all and . Therefore
[TABLE]
It follows that the family of measures is exponentially tight, i.e., for every , we can find a compact subset such that . endowed with the product topology is separable and metrizable and therefore has a countable base. Lemma 4.1.23 in [DZ98] applies and yields the claim. ∎
**Lemma 5.4.**
Suppose that Assumption 3 holds true and assume that along some subsequence the measure satisfies a large deviations principle with good rate function . Then satisfies
[TABLE]
on . In particular, if for some .
Proof.
Write instead of . We will see that the fixed point equation for follows from the eigenvalue equation in Lemma 4.1 and the asymptotics of the principal eigenvalue provided in Lemma 5.2. According to these,
[TABLE]
for any where the -term comes from and is independent of .
We first show that can only be finite on . Fix and for consider the open set . A repeated application of Lemma 4.1 and Lemma 5.2 gives
[TABLE]
Let be a lower bound for on . Then
[TABLE]
and
[TABLE]
Hence
[TABLE]
It follows that
[TABLE]
Since was arbitrary we have shown that on . In particular, as satisfies a large deviations principle on with rate function , the same large deviations principle holds on .
We now establish another (weak) large deviations principle on . Let be a (relatively) closed set and a compact interval. Then (5.1) with yields
[TABLE]
Write for the inner integral. As is bounded from below and for every fixed , is continuous with respect to the product topology, we deduce from Varadhan’s lemma [DZ98, Chapter 4.3] that
[TABLE]
for all . Next we note that for all , , and suitable ,
[TABLE]
For bounded away from we may exploit that the derivative of is bounded and drop the first term, making larger if need be. Plugging these estimates into the definition of , we find that for some and all ,
[TABLE]
It follows that the upper bound (5.2) is uniform on compact subsets of and
[TABLE]
A similar argument shows that for all and all (relatively) open subsets ,
[TABLE]
Taking monotone limits, the latter inequality is seen to extend to and . It follows that , as a family of probability measures on , satisfies a weak large deviations principle with rate function . (It is indeed sufficient to consider product sets. This is easy to see for the lower bound: If is open, then for any one finds with $h(\bar{z}_{1},\bar{z}_{2},\ldots)-e_{0}+I(\bar{z}_{2},\bar{z}_{3},\ldots)-\varepsilon\leq\inf_{z\in U}\bigl(h(z_{1},z_{2},\ldots)-e_{0}+I(z_{2},z_{3},\ldots)\bigr)$, from which it follows that (5.4) holds for instead of . The upper bound for a general compact is obtained by covering, for given , , where for each , and are chosen such that $h(x_{1},x_{2},\ldots)-e_{0}+I(x_{2},x_{3},\ldots)-\varepsilon\leq\inf_{z\in(\alpha_{x},b_{x})\times B_{\delta(x)}(x)}\bigl(h(z_{1},z_{2},\ldots)-e_{0}+I(z_{2},z_{3},\ldots)\bigr)$. This is possible since is lower semicontinuous. With the help of (5.3) we can now deduce that (5.3) holds for instead of .)
Since is a Polish space, the rate function in a weak large deviations principle is uniquely defined [DZ98, Chapter 4.1], hence on . To finish the proof it remains to observe that also on because both and are equal to on that set. ∎
Proof of Theorem 2.4.
The large deviations principle for with good rate function is an immediate consequence of Lemmas 5.3 and 5.4 and Proposition 3.9. As a consequence, satisfies a large deviations principle with good rate function on and on . The large deviations principle for thus follows from Eq. (4.2), Lemmas 4.3.4 and 4.3.6 in [DZ98], and
[TABLE]
by Proposition 2.3, and the observation that is continuous on . ∎
5.4. Surface corrections to the Gibbs free energy
Proof of Theorem 2.5.
The statements about have already been proven in Lemma 5.2. For , we start from the formula in Proposition 4.9(a), to which we apply Lemma 5.2, Theorem 2.4 and Varadhan’s lemma. This yields
[TABLE]
But now for with
[TABLE]
with . So
[TABLE]
6. Gaussian approximation
Here we prove Theorems 2.7 and 2.8 on the Gaussian approximation to the bulk measure when is finite. We start from a standard idea, namely perturbation theory for transfer operators [Hel02]; however, we need to put some work into a good choice of transfer operator, as the standard symmetrized choice (6.2) does not work well. This aspect is explained in more detail in Section 6.1. Throughout this section satisfies . Remember .
6.1. Decomposition of the energy. Choice of transfer operator
For finite , the treatment with transfer operators from Section 4.1 can be considerably simplified: instead of an operator that acts on functions of infinitely many variables, the transfer operator becomes an integral operator in ( space with respect to Lebesgue measure). There are several possible choices, corresponding each to an additive decomposition of the energy. Let and
[TABLE]
Let us block variables as . Then for we have
[TABLE]
with only finitely many non-zero summands. By Proposition 2.3 the sum extends to by continuity. The transfer operator associated with the representation (6.1) is the integral operator with kernel ; it is clearly related to the -th power of the transfer operator from Section 4.1. The analysis is simpler for a symmetrized operator with kernel
[TABLE]
which has the advantage of being Hilbert-Schmidt: The pressure term present in and ensures that decays exponentially fast when so that . The transfer operator corresponds to a rewriting of (6.1),
[TABLE]
For the analysis of the limit , we would like to have a transfer operator that concentrates in some sense around the optimal spacings so that we may approximate it with a Gaussian operator. When , unfortunately, the function need not have its minimum at , with . Therefore we introduce yet another variant of the transfer operator: we look for a function such that
[TABLE]
and , and work with the kernel
[TABLE]
By a slight abuse of notation we use the same letter for the integral operator
[TABLE]
in . The function is defined as follows. Set
[TABLE]
and
[TABLE]
Remember
[TABLE]
**Lemma 6.1.**
Assume , , and . Then:
(a) For all , we have .
(b) The function is bounded, and we have
[TABLE]
(c) for all .
Proof.
One easily checks
[TABLE]
which yields
[TABLE]
For , we have hence . This proves part (a) of the lemma. The symmetry in part (c) is immediate from the reversal symmetry of . For (b), we note that
[TABLE]
the formula for follows. Because of
[TABLE]
and , , we have
[TABLE]
The roles of and can be exchanged, hence is bounded. ∎
6.2. Some properties of the transfer operator
**Lemma 6.2.**
Assume , , and and . Then:
(a) The kernels and are related as follows:
[TABLE]
(b) The operator is a Hilbert-Schmidt operator in , and the kernel has the symmetry .
The lemma follows from Lemma 6.1; the elementary proofs are omitted.
By the Krein-Rutman theorem [KR48], [Dei85, Chapter 6], the operator norm is a simple eigenvalue of , the associated eigenfunction can be chosen strictly positive on , and the other eigenvalues of have absolute value strictly smaller than , i.e.,
[TABLE]
By Lemma 6.2(b), the function is a left eigenfunction of :
[TABLE]
Let be the rank-one projection in given by
[TABLE]
Then and an induction over shows
[TABLE]
Since is nothing else but the spectral radius of , it follows that
[TABLE]
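The convergence of normalized powers of the transfer operator to a rank-one projection can be observed on a finite-dimensional stand-in. The sketch below replaces the integral operator by a hypothetical entrywise positive 3×3 matrix K, for which the Perron-Frobenius theorem plays the role of the Krein-Rutman theorem, and compares K^n/λ^n with the projection φ ψ^T / (ψ · φ) built from the right and left Perron eigenvectors. The matrix and all helper names are assumptions made for illustration.

```python
def matvec(a, x):
    """Matrix-vector product for lists of lists."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def matmul(a, b):
    """Matrix-matrix product for lists of lists."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def perron(a, iterations: int = 500):
    """Power iteration: leading eigenvalue and positive eigenvector with entries summing to 1."""
    n = len(a)
    x = [1.0 / n] * n
    lam = 0.0
    for _ in range(iterations):
        y = matvec(a, x)
        lam = sum(y)  # x sums to 1, so sum(y) estimates the eigenvalue
        x = [v / lam for v in y]
    return lam, x


def normalised_power(a, lam: float, n: int):
    """Return (a / lam) ** n."""
    size = len(a)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    scaled = [[x / lam for x in row] for row in a]
    for _ in range(n):
        result = matmul(result, scaled)
    return result


def max_abs_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


# Hypothetical positive, non-symmetric kernel.
K = [[2.0, 1.0, 0.5],
     [0.5, 2.0, 1.0],
     [1.0, 0.5, 3.0]]
lam, phi = perron(K)                   # right eigenvector: K phi = lam phi
lam_left, psi = perron(transpose(K))   # left eigenvector: psi K = lam psi
norm = sum(p * q for p, q in zip(psi, phi))
projection = [[phi[i] * psi[j] / norm for j in range(3)] for i in range(3)]
errors = [max_abs_diff(normalised_power(K, lam, n), projection) for n in (1, 5, 20, 60)]
print("leading eigenvalue:", lam)
print("distance of (K/lam)^n to the projection:", errors)
```

The distance decays geometrically at a rate given by the ratio of the second largest eigenvalue modulus to the leading eigenvalue, which is the finite-dimensional counterpart of the spectral gap used above.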
The spectral properties of are related to the Gibbs free energy and the Gibbs measure as follows.
**Lemma 6.3.**
Assume , , and . Then:
(a) The Gibbs free energy is given by .
(b) The -dimensional marginals of the bulk Gibbs measure have probability density function
[TABLE]
with .
(c) For all and all bounded , writing $f_{0}\bigl((z_{j})_{j\in\mathbb{Z}}\bigr):=f(z_{0},\ldots,z_{d-1})$ and $g_{n}\bigl((z_{j})_{j\in\mathbb{Z}}\bigr):=g(z_{nj},\ldots,z_{nj+d-1})$, we have
[TABLE]
with some constant that does not depend on , , or . If , we can pick and .
Proof of Lemma 6.3.
For , the partition function is given by
[TABLE]
For the second identity we have used Lemma 6.2(a). The function is bounded by Lemma 6.1(b) and is integrable because grows linearly when . Therefore and are in , and as ,
[TABLE]
It follows that
[TABLE]
which proves part (a) of the lemma. The standard proof of part (b) is omitted (compare [Hel02, Chapter 4]). For (c), we use the formula for the -dimensional marginal provided by (b). Let us choose multiplicative constants in such a way that . Then
[TABLE]
Eq. (6.4) yields
[TABLE]
where refers to the -norm for functions and the operator norm for the operator. We further bound and and conclude with (6.5). If , the operators are symmetric, hence the operator norm is the same as the spectral radius and the estimates simplify accordingly. ∎
Remark (Associated Markov chain).
Define the kernel
[TABLE]
on . Then is a Markov kernel with invariant measure where
[TABLE]
If in the bulk Gibbs measure we group spacings in blocks as , we obtain a probability measure on . This measure is exactly the distribution of the two-sided stationary Markov chain with state space , transition kernel , and initial law .
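The passage from a positive kernel to a Markov kernel can be tried on a finite-state toy model. The sketch below assumes the standard ground-state transform `P(x, y) = K(x, y) * phi(y) / (lam * phi(x))`, with invariant weights proportional to `psi(x) * phi(x)`, where `phi` and `psi` are the right and left principal eigenvectors; the formulas in the remark may be normalised differently. The matrix `K` and the helper `principal_pair` are hypothetical and serve only as an illustration.

```python
def principal_pair(t, iters=500):
    """Power iteration: principal eigenvalue and positive right eigenvector."""
    n = len(t)
    v = [1.0] * n
    lam = 0.0
    for _ in range(iters):
        w = [sum(t[i][j] * v[j] for j in range(n)) for i in range(n)]
        lam = max(w)              # v has maximum 1, so max(K v) -> eigenvalue
        v = [x / lam for x in w]
    return lam, v


# Hypothetical strictly positive kernel on a three-point state space.
K = [[2.0, 1.0, 0.5],
     [0.3, 1.5, 1.0],
     [0.2, 0.4, 1.0]]
n = len(K)
K_transposed = [list(col) for col in zip(*K)]

lam, phi = principal_pair(K)             # right eigenvector: K phi = lam phi
_, psi = principal_pair(K_transposed)    # left eigenvector: psi K = lam psi

# Assumed ground-state transform of K into a Markov kernel.
P = [[K[i][j] * phi[j] / (lam * phi[i]) for j in range(n)] for i in range(n)]
row_sums = [sum(row) for row in P]

# Candidate invariant law, proportional to psi * phi.
weights = [psi[i] * phi[i] for i in range(n)]
total = sum(weights)
mu = [w / total for w in weights]

# One step of the chain started from mu.
mu_next = [sum(mu[i] * P[i][j] for i in range(n)) for j in range(n)]
invariance_gap = max(abs(mu_next[j] - mu[j]) for j in range(n))
```

Each row of `P` sums to one because `K phi = lam phi`, and `mu` is invariant because `psi K = lam psi`.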
6.3. Gaussian transfer operator
Here we introduce the Gaussian counterpart to the transfer operator and study its spectral properties. We start from the quadratic approximation to the bulk energy . The differentiability of in a neighborhood of the constant sequence is checked in Lemma 6.11 below; for the definition of the Gaussian transfer operator we only need the infinite matrix of partial derivatives at .
In the following we block variables as for and for . Remember the decomposition (6.1). Set and define the matrices
[TABLE]
We note the following relations:
[TABLE]
The Hessian at is a doubly infinite, band-diagonal matrix with block form
[TABLE]
Note that Lemma 3.3 implies that is positive definite. We look for a quadratic form on that is positive-definite and satisfies
[TABLE]
One candidate choice could be
[TABLE]
but it is not easily related to . We make a different choice which mimics the definition of and show later that this amounts to picking the Hessian of (see Lemma 6.12 below).
We introduce the quadratic counterparts to the functions , , and from Section 6.2. Remember the bulk Hessian from (6.9). Since it is positive-definite, there exist uniquely defined positive-definite matrices and such that
[TABLE]
for all . The quadratic forms associated with and are the Gaussian counterparts to the functions and , respectively. Finally set
[TABLE]
and
[TABLE]
We will see in the proof of Lemma 6.12 that , and are the Hessians of at , at and at , respectively. The relation between and is clarified in Lemma 6.7 below. We are going to work with the kernel
[TABLE]
and the associated integral operator . In Section 6.4 we show that is a good approximation for ; here we study the operator on its own. Clearly it is enough to understand the integral operator with kernel
[TABLE]
since and are related by the change of variables , see Eq. (6.21) below.
Lemma 6.4.
Assume , . Then the quadratic form is positive-definite: for some and all .
Proof.
First we show that is positive semi-definite, by an argument similar to Lemma 6.2(a). Define
[TABLE]
Clearly
[TABLE]
hence
[TABLE]
for all and is positive semi-definite. Next let be a zero of the quadratic form associated with . Then by (6.13), the function must be minimal at , hence . Similarly, the function must be minimal at , hence . Thus is a critical point of . But is strictly convex because is positive-definite, therefore the critical point is a global minimizer of which yields . It follows that is positive-definite. ∎
It follows from Lemma 6.4 that , hence is Hilbert-Schmidt with strictly positive integral kernel and the Krein-Rutman theorem is applicable. So we may ask for its principal eigenvalue and eigenvector and its spectral gap. It is natural to look for a Gaussian eigenfunction.
Lemma 6.5.
Let be a positive-definite, symmetric matrix. Then the following two statements are equivalent:
- (i)
is an eigenfunction of .
- (ii)
The function satisfies the quadratic Bellman equation
[TABLE]
Proof.
The proof is by a straightforward completion of squares: write
[TABLE]
with -matrices . The diagonal blocks and are positive-definite because is positive-definite, therefore is positive-definite as well. Then
[TABLE]
It follows that
[TABLE]
and
[TABLE]
Therefore (i) and (ii) hold true if and only if solves
[TABLE]
In particular, (i) and (ii) are equivalent. ∎
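The completion-of-squares mechanism can be illustrated in the scalar case. The sketch below is a toy model under stated assumptions, not the paper's operator: the kernel is taken to be `exp(-(a*x*x + 2*b*x*y + c*y*y) / 2)` with hypothetical coefficients `a`, `b`, `c`, and the trial function is `exp(-m*y*y / 2)`. Integrating out `y` gives a Gaussian in `x` with coefficient `a - b*b / (c + m)`, so the trial function is an eigenfunction exactly when `m = a - b*b / (c + m)`, which is the scalar form of the statement that `m*x*x` equals the minimum over `y` of `a*x*x + 2*b*x*y + (c + m)*y*y`.

```python
import math

# Hypothetical coefficients; a * c > b * b makes the quadratic form
# positive-definite.
a, b, c = 2.0, -0.5, 1.5

# Positive root of m = a - b^2 / (c + m),
# i.e. of m^2 + (c - a) m - (a c - b^2) = 0.
m = 0.5 * ((a - c) + math.sqrt((a + c) ** 2 - 4.0 * b * b))

# Gaussian integral over y produces the factor sqrt(2 pi / (c + m)).
eigenvalue = math.sqrt(2.0 * math.pi / (c + m))


def kernel(x, y):
    return math.exp(-0.5 * (a * x * x + 2.0 * b * x * y + c * y * y))


def trial(y):
    return math.exp(-0.5 * m * y * y)


def apply_operator(x, half_width=12.0, steps=4800):
    """Riemann sum for the integral of kernel(x, y) * trial(y) over y."""
    h = 2.0 * half_width / steps
    nodes = (-half_width + k * h for k in range(steps + 1))
    return h * sum(kernel(x, y) * trial(y) for y in nodes)


# The ratio (K f)(x) / f(x) should not depend on x.
ratios = [apply_operator(x) / trial(x) for x in (-1.0, 0.0, 0.7, 2.0)]

# Scalar Bellman check at one point: minimise over y in closed form.
x0 = 0.7
y_star = -b * x0 / (c + m)
minimum = a * x0 * x0 + 2.0 * b * x0 * y_star + (c + m) * y_star * y_star
bellman_gap = abs(minimum - m * x0 * x0)
```

The matrix case follows the same pattern, with the scalar division replaced by a matrix inverse.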
In Lemma 6.7 below we check that is of the form
[TABLE]
for some positive-definite matrix .
Lemma 6.6.
The principal eigenvalue of is and the principal eigenfunction is (up to scalar multiples).
Proof.
A close look at our definitions shows that solves (6.14) (it is positive-definite because is). Indeed, by the definition of , , we have
[TABLE]
Therefore, by Lemma 6.5, the function is an eigenfunction of . The matrix in (6.15) is equal to , and we find that the principal eigenvalue of is . ∎
In order to identify the block in (6.16), we introduce the quadratic analogue to the function . Let and be the matrices from (6.7) and . The infinite matrix is band-diagonal with block structure
[TABLE]
The matrix differs from the bulk Hessian (6.9) by the upper left corner : we have
[TABLE]
By a reasoning similar to Lemma 3.3, the Hessian of is positive-definite. Therefore there is a uniquely defined positive-definite -matrix such that
[TABLE]
for all . (Arguments analogous to those in the proof of Lemma 6.12 show that is the Hessian of at .) Set
[TABLE]
and
[TABLE]
(remember the symmetries (6.8)).
Lemma 6.7.
The matrix solves
[TABLE]
and Eq. (6.16) holds true. Moreover
[TABLE]
Proof.
Clearly
[TABLE]
hence
[TABLE]
by a completion of squares similar to the proof of Lemma 6.5. We add to both sides, remember (6.17), and obtain the equation for . It is easy to see that
[TABLE]
which proves (6.16). Furthermore,
[TABLE]
hence,
[TABLE]
Let us check that the two expressions for are indeed identical, and that . Combining with (6.17) and (6.19), the two expressions for become
[TABLE]
and
[TABLE]
The two expressions are indeed equal, and from the end formula and (6.8) we read off that . Actually
[TABLE]
which is the analogue of .
Now we compute . The off-diagonal blocks of are the same as those of . The upper left diagonal block is
[TABLE]
A similar computation yields the lower right block. Altogether we find
[TABLE]
and the lemma follows. ∎
Finally we come back to the -dependent operator .
Proposition 6.8.
Assume and . The principal eigenvalue of is
[TABLE]
and the normalized, positive principal eigenfunction is
[TABLE]
Proof.
Let be the unitary operator given by
[TABLE]
We have
[TABLE]
hence
[TABLE]
and the principal eigenvalue and eigenfunction of are obtained from those of in Lemma 6.6 by straightforward transformations. ∎
Remark.
When , all eigenvalues and eigenfunctions of (hence ) can be computed explicitly, and the eigenfunctions are expressed with Hermite polynomials. See [Hel02, Section 5.2] on the harmonic Kac operator.
6.4. Perturbation theory
Remember the unitary operator from (6.20) and the relation . The main technical result of this section is the following.
Proposition 6.9.
Assume , , and . We have as .
Before we come to the proof of the proposition, we state a corollary on the principal eigenvalue and eigenfunction. Remember the quantities , , defined before Lemma 6.3. We choose multiplicative constants so that . Let , , be an enumeration of the eigenvalues of with and
[TABLE]
Corollary 6.10.
Under the assumptions of Proposition 6.9: Let and be as in Proposition 6.8. Then as ,
[TABLE]
and
[TABLE]
The corollary follows from Proposition 6.9 and standard perturbation theory for compact operators [RS78]. The proof of Proposition 6.9 builds on several lemmas. First we show that is in a neighborhood of its global minimizer.
Lemma 6.11.
The mapping is in some open neighborhood in of the constant sequence .
Proof.
Note that
[TABLE]
defines a function in a neighborhood of which vanishes for . Moreover, using that minimizes on and so , we see that also
[TABLE]
For all the derivative of at is given by
[TABLE]
for all with for all but finitely many . So
[TABLE]
Since
[TABLE]
for in a neighborhood of with a uniform constant , the right-hand side of (6.22) extends to a uniformly continuous function there. Writing
[TABLE]
for , a standard approximation argument shows that indeed is in a neighborhood of also in with given by (6.22). In fact, is even on a neighborhood of in and
[TABLE]
This follows similarly as above by extending the derivative of , where we now use that the mappings , and , are uniformly continuous in a neighborhood of and so extends to a continuous mapping from a neighborhood of to (the space of bounded linear operators on ) given by (6.23). ∎
Next we show that is in fact the Hessian of .
Lemma 6.12.
Assume , , and . We have for all , moreover as ,
[TABLE]
The lemma leaves open whether is the unique global minimizer of .
Proof.
The first part of the lemma has already been proven in Lemma 6.2(a). With , as in (6.10) and (6.11) we let as in (6.12). It remains to show that . Since, for a suitable , is convex on , see (the proof of) Proposition 2.3, Lemma 3.12 shows that there is a unique function on a neighborhood of in with values in , such that
[TABLE]
As is positive definite, the implicit function theorem shows that this mapping is and satisfies
[TABLE]
as well as
[TABLE]
The latter identity implies
[TABLE]
so that is indeed near and
[TABLE]
In particular, since ,
[TABLE]
The same analysis applied to the quadratic approximation , leads to
[TABLE]
too. So we have . A completely analogous reasoning gives and it follows that . ∎
Lemma 6.13.
Assume . For some and all ,
[TABLE]
Proof.
Since the pair potential is bounded from below, we have for some constant
[TABLE]
In combination with Lemma 6.1 this yields the claim. ∎
In order to estimate , we split the configuration space into a neighborhood of and its complement and treat blocks separately. For , we write for the multiplication operator with the indicator function .
Lemma 6.14.
Suppose that is compact, contains an open neighborhood of , and is such that for all . Then
[TABLE]
Proof.
By Lemma 6.12, for every , there is a such that for all with and , we have
[TABLE]
Choosing small enough we may assume without loss of generality that . We estimate
[TABLE]
for some . On , the function stays bounded away from [math], therefore
[TABLE]
A similar estimate clearly holds true for as well. Hence
[TABLE]
This holds true for every , so the left-hand side converges to zero. Since operator norms are bounded by Hilbert-Schmidt norms, the lemma follows. ∎
Lemma 6.15.
Assume that is such that and is invariant under reversals, . Then .
Proof.
We may view as an operator in . The Krein-Rutman theorem is applicable and shows that is a simple eigenvalue and there exists an eigenfunction that is strictly positive on . Because of the symmetry , the function is a left eigenfunction. Moreover for all , we have
[TABLE]
so for all strictly positive functions ,
[TABLE]
We choose and . The scalar product becomes
[TABLE]
with . By Lemma 6.1(b), remembering , we have
[TABLE]
Define and for , . Then we recognize
[TABLE]
where the constant depends on , , and alone. As stay bounded away from , we obtain
[TABLE]
for some and all and . It follows that . ∎
Lemma 6.16.
Suppose that and are such that
[TABLE]
for some and all , . Assume also that is invariant under reversals, . Then
[TABLE]
Proof.
Revisiting the proof of Lemma 6.1, we see that
[TABLE]
Eqs. (6.25), (6.3) and (6.24) show that for all and . This estimate together with the growth estimate from Lemma 6.13 shows
[TABLE]
hence . The estimate on follows from the symmetry . ∎
Proof of Proposition 6.9.
Let , , and . The sets and are clearly invariant under reversals, moreover by Theorem 2.1(b), so is in the interior of and bounded away from . Thus and satisfy the assumptions of Lemmas 6.14 and 6.15. By Lemma 3.11, they also satisfy the condition (6.24) from Lemma 6.16. By the triangle inequality,
[TABLE]
The first term on the right-hand side, multiplied by , goes to zero by Lemma 6.14. For the second term, we estimate
[TABLE]
and conclude from Lemmas 6.15 and 6.16 that d . Bounding Hilbert-Schmidt norms, it is straightforward to check that as well, and the proof is complete. ∎
6.5. Proof of Theorems 2.7, 2.8 and 2.11
Proof of Theorem 2.8.
Combining Lemma 6.3(a) and Corollary 6.10, we obtain
[TABLE]
Proof of Theorem 2.11.
The theorem is an immediate consequence of Lemma 6.3(c) and Corollary 6.10. ∎
For the proof of Theorem 2.7, we first express the marginals of in terms of the matrices and from Eq. (6.7) and the matrix from (6.18). We group variables in blocks as usual and view as a measure on .
Proposition 6.17.
Under the assumptions of Theorem 2.7, the distributions of , , and () under have probability density functions proportional to
- (a)
,
- (b)
,
- (c)
respectively.
Proof.
We recall a standard fact on marginals of multivariate Gaussians and Schur complements. Suppose we are given a positive-definite -matrix in block form
[TABLE]
where are , and matrices, respectively. Think of as the Hessian of the energy. Consider the Gaussian measure on with covariance matrix and probability density function
[TABLE]
Then for all ,
[TABLE]
with the Schur complement of in . The inverse is equal to the upper left block of . Another characterization is provided by a completion of squares, similar to the proof of Lemma 6.5: we have
[TABLE]
Now let be the Hessian of at . By definition of , the distribution of is Gaussian with mean zero and covariance matrix . Let be the -matrix defined by . It is not difficult to check that the considerations above generalize to the infinite matrices at hand, hence for all ,
[TABLE]
Eq. (6.27) provides a variational description of the covariance matrix of the -dimensional marginal of . For , with and , Eq. (6.27) shows , by the definition (6.10) of . Combining with (6.16) we get
[TABLE]
This proves part (b) of the proposition. The proof of (c) is similar. Part (a) follows from (b) and a relation similar to (6.26). ∎
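The standard fact on Gaussian marginals and Schur complements used in the proof above can be verified on a small example. The sketch below is a minimal illustration with a hypothetical positive-definite 3x3 matrix `H`, split into a 2x2 block `A`, a column `B`, and a scalar block `C`; the names and numbers are assumptions made for the example. It checks that the inverse of the Schur complement `S = A - B C^{-1} B^T` equals the upper left block of `H^{-1}`, and that minimising the quadratic form over the last variable returns the quadratic form of `S`.

```python
def inverse(mat):
    """Gauss-Jordan inverse without pivoting, which is safe for the
    positive-definite matrices used in this example."""
    n = len(mat)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(mat)]
    for col in range(n):
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Hypothetical positive-definite matrix (strictly diagonally dominant).
H = [[4.0, 1.0, 1.0],
     [1.0, 3.0, 0.5],
     [1.0, 0.5, 2.0]]
A = [row[:2] for row in H[:2]]     # upper left 2x2 block
B = [H[0][2], H[1][2]]             # off-diagonal column
C = H[2][2]                        # lower right scalar block

# Schur complement of C in H.
S = [[A[i][j] - B[i] * B[j] / C for j in range(2)] for i in range(2)]

# Covariance of the marginal: S^{-1} is the upper left block of H^{-1}.
S_inv = inverse(S)
H_inv = inverse(H)
block_gap = max(abs(S_inv[i][j] - H_inv[i][j])
                for i in range(2) for j in range(2))

# Completion of squares: min over y of (x, y).H(x, y) equals x.S x.
x = [0.3, -1.2]
y_star = -(B[0] * x[0] + B[1] * x[1]) / C
z = x + [y_star]
full = sum(z[i] * H[i][j] * z[j] for i in range(3) for j in range(3))
reduced = sum(x[i] * S[i][j] * x[j] for i in range(2) for j in range(2))
square_gap = abs(full - reduced)
```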
Proof of Theorem 2.7.
It is enough to treat the -dimensional marginals with . Let be the principal eigenfunction of , with multiplicative constant chosen so that . Set and
[TABLE]
By Lemma 6.3, the probability density for satisfies
[TABLE]
By Proposition 6.17, the analogous representation for the Gaussian density is
[TABLE]
with and the principal eigenfunction of , normalized so that . It follows that
[TABLE]
Using and , we get
[TABLE]
which goes to zero by Proposition 6.9 (see also Corollary 6.10). ∎
7. A Brascamp-Lieb type covariance estimate for
Here we prove Proposition 2.10. Key to the proof is a matrix lower bound for the Hessian of . For Gaussian measures with probability density proportional to and test functions , , we end up estimating the covariance . We follow [Men14], see also [OR07].
Proof of Proposition 2.10.
Revisiting the proof of Lemma 3.3, we obtain bounds on matrix elements of the Hessian. Let , . For we have
[TABLE]
with
[TABLE]
For we also have
[TABLE]
by Assumption 1(iv). Moreover
[TABLE]
again by Assumption 1(iv). Let be the -matrix with diagonal and off-diagonal entries ; notice that do not depend on . is symmetric and positive-definite.
The previous estimates together with [Men14, Remark 2.6] show that the energy satisfies the assumptions of [Men14, Theorem 2.3 and Proposition 3.5]. It follows that for all smooth ,
[TABLE]
Let be i.i.d. random variables with law
[TABLE]
and . We may decompose as plus an off-diagonal matrix, write a Neumann series for the inverse, and find that for
[TABLE]
Clearly
[TABLE]
By (7.1), we have for some constant . Following [Men14, Proposition 3.5] we may estimate, for each ,
[TABLE]
Similar estimates apply to other . Combining with (7.3) we find
[TABLE]
It follows that
[TABLE]
Notice that the series is convergent. The bound is plugged into the estimate (7.2) and the proposition follows by passing to the limit . ∎
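The Neumann series step in the proof above (diagonal part plus off-diagonal part) can be illustrated on a small matrix. The sketch below is a toy example under stated assumptions: the 3x3 matrix `A` is hypothetical, chosen with a dominant diagonal so that the series converges, and the helper `matmul` exists only for this illustration. Writing `A = D (I - R)` with `D` the diagonal part and `R = -D^{-1} N` for the off-diagonal part `N`, the inverse is the sum over `k` of `R^k D^{-1}`.

```python
def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


# Hypothetical symmetric matrix with dominant diagonal.
A = [[2.0, -0.5, -0.1],
     [-0.5, 2.0, -0.5],
     [-0.1, -0.5, 2.0]]
n = len(A)
diag = [A[i][i] for i in range(n)]

# R = -D^{-1} N; its rows have absolute sums at most 0.5 here,
# so the series converges geometrically.
R = [[0.0 if i == j else -A[i][j] / diag[i] for j in range(n)]
     for i in range(n)]

# Partial sums of I + R + R^2 + ...
term = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
total = [[0.0] * n for _ in range(n)]
for _ in range(80):
    total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    term = matmul(term, R)

# A^{-1} = (I - R)^{-1} D^{-1}: divide column j by the j-th diagonal entry.
approx_inverse = [[total[i][j] / diag[j] for j in range(n)] for i in range(n)]

product = matmul(A, approx_inverse)
residual = max(abs(product[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))
```

Because the off-diagonal entries of this `A` are negative, every term of the series is entrywise nonnegative, so each entry of the inverse is bounded termwise by the series.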
Acknowledgments
We thank Nils Berglund, Andrew Duncan, André Schlichting, and Martin Slowik for helpful discussions.
References
- [AKM16] S. Adams, M. Kotecký, and S. Müller, Strict convexity of the surface tension for non-convex potentials, Online preprint arXiv:1606.09541v1 [math-ph], 2016.
- [Aum15] S. Aumann, Spontaneous breaking of rotational symmetry with arbitrary defects and a rigidity estimate, J. Stat. Phys. 160 (2015), no. 1, 168–208.
- [Bal00a] V. Baladi, The magnet and the butterfly: thermodynamic formalism and the ergodic theory of chaotic dynamics, Development of mathematics 1950–2000, Birkhäuser, Basel, 2000, pp. 97–133.
- [Bal00b] V. Baladi, Positive transfer operators and decay of correlations, Advanced Series in Nonlinear Dynamics, vol. 16, World Scientific Publishing Co., Inc., River Edge, NJ, 2000.
- [BC07] A. Braides and M. Cicalese, Surface energies in nonconvex discrete systems, Math. Models Methods Appl. Sci. 17 (2007), no. 7, 985–1037.
- [BCF86] F. Bavaud, Ph. Choquard, and J.-R. Fontaine, Statistical mechanics of elastic moduli, J. Statist. Phys. 42 (1986), no. 3-4, 621–646.
- [BdH15] A. Bovier and F. den Hollander, Metastability: A potential-theoretic approach, Grundlehren der Mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 351, Springer, Cham, 2015.
- [BFS82] D. Brydges, J. Fröhlich, and T. Spencer, The random walk representation of classical spin systems and correlation inequalities, Comm. Math. Phys. 83 (1982), no. 1, 123–150.
