Metastable Markov chains: from the convergence of the trace to the convergence of the finite-dimensional distributions

Claudio Landim; Michail Loulakis; Mustapha Mourragui

arXiv:1703.09481·math.PR·October 3, 2019

TL;DR

This paper studies metastable continuous-time Markov chains with multiple wells, establishing conditions for their finite-dimensional distributions to converge to a finite state Markov chain and representing the process as a convex combination of metastable states.

Contribution

It introduces sufficient conditions for convergence of finite-dimensional distributions and provides a representation of the process as a convex combination of metastable states.

Findings

1. The finite-dimensional distributions of the order parameter converge to those of a finite state Markov chain.

2. The state of the process can be represented as a time-dependent convex combination of metastable states.

3. Sufficient conditions for this convergence are stated explicitly.

Abstract

We consider continuous-time Markov chains which display a family of wells at the same depth. We provide sufficient conditions which entail the convergence of the finite-dimensional distributions of the order parameter to the ones of a finite state Markov chain. We also show that the state of the process can be represented as a time-dependent convex combination of metastable states, each of which is supported on one well.

Equations

E_{N}\;:=\;\big{\{}\eta\in{\mathbb{N}}^{{\mathbb{T}}_{L}}:\sum_{x\in{\mathbb{T}}_{L}}\eta_{x}=N\big{\}}\;.

g(0)=0\;,\quad g(1)=1\;,\quad\text{and}\quad g(n)=\frac{a(n)}{a(n-1)}\;,\quad n\geq 2\;,

(\sigma^{x,y}\eta)_{z}\;=\;\left\{\begin{array}[]{ll}\eta_{x}-1&\textrm{for $z=x$}\\ \eta_{y}+1&\textrm{for $z=y$}\\ \eta_{z}&\rm{otherwise}\;.\\ \end{array}\right.

(L_{N}f)(\eta)\;=\;\sum_{\stackrel{{\scriptstyle x,y\in{\mathbb{T}}_{L}}}{{x\not=y}}}g(\eta_{x})\,p(y-x)\,\big{\{}f(\sigma^{x,y}\eta)-f(\eta)\big{\}}\;.

\mu_{N}(\eta)\;=\;\frac{N^{\alpha}}{Z_{N}}\prod_{x\in{\mathbb{T}}_{L}}\frac{1}{a(\eta_{x})}\;,

{\mathscr{E}}^{x}_{N}\;:=\;\Big{\{}\eta\in E_{N}:\eta_{x}\geq N-\ell_{N}\Big{\}}\;.

{\mathscr{E}}_{N}\;=\;{\mathscr{E}}^{1}_{N}\sqcup\cdots\sqcup{\mathscr{E}}^{{\mathfrak{n}}}_{N}\;,\quad\breve{{\mathscr{E}}}^{x}_{N}\;=\;\bigsqcup_{y\not=x}{\mathscr{E}}^{y}_{N}\;.

\Phi_{N}(\eta)\;=\;\sum_{x=1}^{{\mathfrak{n}}}x\,{\mathbf{1}}\{\eta\in{\mathscr{E}}^{x}_{N}\}\;,\quad\Psi_{N}(\eta)\;=\;\sum_{x=1}^{{\mathfrak{n}}}x\,{\mathbf{1}}\{\eta\in{\mathscr{E}}^{x}_{N}\}\;.

T_{{\mathscr{A}}}(t)\;=\;\int_{0}^{t}{\mathbf{1}}\{\xi^{N}(s)\in{\mathscr{A}}\}\,ds\;,

S_{{\mathscr{A}}}(t)\;=\;\sup\{s\geq 0:T_{{\mathscr{A}}}(s)\leq t\}\;.

\lim_{N\to\infty}\max_{\eta\in{\mathscr{E}}_{N}}{\mathbb{E}}_{\eta}\Big{[}\int_{0}^{t}{\mathbf{1}}\{X_{N}(s)=0\}\,ds\Big{]}\;=\;0\;.

\lim_{N\to\infty}\max_{\eta\in{\mathscr{E}}_{N}}{\mathbb{E}}_{\eta}\Big{[}\int_{0}^{t}{\mathbf{1}}\{\xi^{N}(s)\in\Delta_{N}\}\,ds\Big{]}\;=\;0\;.

\lim_{\delta\to 0}\limsup_{N\to\infty}\max_{\eta\in{\mathscr{E}}^{x}_{N}}\sup_{2\delta\leq s\leq 3\delta}{\mathbb{P}}_{\eta}\big[\xi^{N}(s)\in\Delta_{N}\big]\;=\;0\;.

\mu^{y}_{N}(\xi)\;=\;\frac{\mu_{N}(\xi)}{\mu_{N}({\mathscr{E}}^{y}_{N})}\,{\mathbf{1}}\{\xi\in{\mathscr{E}}^{y}_{N}\}\;,\quad\xi\in E_{N}\;.

\|\mu-\nu\|_{\rm TV}\;=\;\frac{1}{2}\,\sum_{\eta\in\Omega}|\mu(\eta)-\nu(\eta)|\;=\;\sum_{\eta\in\Omega}\big{(}\mu(\eta)-\nu(\eta)\big{)}^{+}\;,

\begin{gathered}T^{N,R,x}_{\rm mix}\;=\;\inf\Big{\{}t>0:\max_{\eta\in{\mathscr{E}}^{x}_{N}}\|\delta_{\eta}\mathcal{S}^{R,x}_{N}(t)-\mu^{x}_{N}\|_{\rm TV}\leq\frac{1}{2e}\Big{\}}\;,\\ T^{N,T,x}_{\rm mix}\;=\;\inf\Big{\{}t>0:\max_{\eta\in{\mathscr{E}}^{x}_{N}}\|\delta_{\eta}\mathcal{S}^{T,x}_{N}(t)-\mu^{x}_{N}\|_{\rm TV}\leq\frac{1}{2e}\Big{\}}\;,\end{gathered}

H_{{\mathscr{A}}}\;=\;\inf\big{\{}t>0:\xi^{N}(t)\in{{\mathscr{A}}}\big{\}}\;,\quad H^{+}_{{\mathscr{A}}}\;=\;\inf\big{\{}t>\tau_{1}:\xi^{N}(t)\in{{\mathscr{A}}}\big{\}}\;,

H^{{\mathscr{B}}}_{{\mathscr{A}}}\;=\;\int_{0}^{H_{{\mathscr{A}}}}{\mathbf{1}}\{\xi^{N}(s)\in{\mathscr{B}}\}\,ds\;.

\lim_{N\to\infty}\max_{x\in S}\,\sup_{\eta\in{\mathscr{E}}^{x}_{N}}{\mathbb{P}}_{\eta}\big{[}H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}^{x}}>\delta\big{]}\;=\;0\;.

\lim_{N\to\infty}\max_{x\in S}\,\sup_{\eta\in{\mathscr{B}}_{N}^{x}}{\mathbb{P}}_{\eta}\big{[}H_{\Delta_{N}}\leq 2\,\varepsilon_{N}\big{]}\;=\;0\;.

\lim_{N\to\infty}\max_{x\in S}\,\sup_{\eta\in{\mathscr{B}}_{N}^{x}}\big\|\delta_{\eta}\mathcal{S}^{R,x}_{N}(\varepsilon_{N})-\mu^{x}_{N}\big\|_{\rm TV}\;=\;0\;.

\big\|\delta_{\eta}\mathcal{S}^{R,y}_{N}(\varepsilon_{N})-\mu^{y}_{N}\big\|_{\rm TV}\;=\;\frac{1}{2}\sum_{\zeta\in{\mathscr{E}}^{y}_{N}}\big|f_{t}(\zeta)-1\big|\,\mu^{y}_{N}(\zeta)\;,

\|f_{0}\|_{\mu^{y}_{N}}^{2}\;=\;\sum_{\zeta\in{\mathscr{E}}^{y}_{N}}f_{0}(\zeta)^{2}\,\mu^{y}_{N}(\zeta)\;=\;\sum_{\zeta\in{\mathscr{E}}^{y}_{N}}\frac{\delta_{\eta,\zeta}}{\mu^{y}_{N}(\zeta)^{2}}\,\mu^{y}_{N}(\zeta)\;=\;\frac{1}{\mu^{y}_{N}(\eta)}\;,

\big\|\delta_{\eta}\mathcal{S}^{R,y}_{N}(\varepsilon_{N})-\mu^{y}_{N}\big\|_{\rm TV}\;\leq\;\frac{1}{\mu^{y}_{N}(\eta)^{1/2}}\,e^{-\lambda_{R,y}\varepsilon_{N}}\;.

\lim_{N\to\infty}\max_{y\in S}\,\sup_{\eta\in{\mathscr{B}}_{N}^{y}}\frac{1}{\mu^{y}_{N}(\eta)^{1/2}}\,e^{-\lambda_{R,y}\varepsilon_{N}}\;=\;0\;.

\lim_{N\to\infty}\frac{\mu_{N}(\Delta_{N})}{\mu_{N}({\mathscr{E}}^{y}_{N})}\;=\;0\;,

\frac{1}{C_{0}}\;\leq\;\frac{\mu_{N}({\mathscr{E}}^{z}_{N})}{\mu_{N}({\mathscr{E}}^{y}_{N})}\;\leq\;C_{0}\;.

\lim_{N\to\infty}\frac{T^{N,T,x}_{\rm mix}}{\varepsilon_{N}}\,\frac{1}{\mu_{N}^{x}({\mathscr{B}}_{N}^{x})}\,\Big{(}1+\ln\big{(}\frac{1}{\mu_{N}^{x}({\mathscr{B}}_{N}^{x})}\big{)}\Big{)}\;=\;0\;.

\lim_{N\to\infty}\big{\|}\mu^{N,\eta^{N}}_{t_{1},\ldots,t_{k}}-\sum_{y_{1},\ldots,y_{k}\in S}\!\!{\boldsymbol{P}}_{x}\big{[}{\boldsymbol{X}}(t_{1})=y_{1},\ldots,{\boldsymbol{X}}(t_{k})=y_{k}\big{]}\,\mu^{y_{1}}_{N}\times\cdots\times\mu^{y_{k}}_{N}\,\big{\|}_{\rm TV}\;=\;0\;.

\lim_{N\to\infty}\max_{\eta\in{\mathscr{E}}_{N}}{\mathbb{E}}_{\eta}\big{[}t-T_{N}(t)\big{]}\;=\;0\;.

Full text

Metastable Markov chains: from the convergence of the trace to the convergence of the finite-dimensional distributions

C. Landim, M. Loulakis, M. Mourragui

IMPA, Estrada Dona Castorina 110, CEP 22460 Rio de Janeiro, Brasil and CNRS UMR 6085, Université de Rouen, Avenue de l’Université, BP.12, Technopôle du Madrillet, F76801 Saint-Étienne-du-Rouvray, France.

School of Applied Mathematical and Physical Sciences, National Technical University of Athens, 15780 Athens, Greece and Institute of Applied and Computational Mathematics, Foundation for Research and Technology- Hellas (FORTH), 70013 Heraklion Crete, Greece.

CNRS UMR 6085, Université de Rouen, Avenue de l’Université, BP.12, Technopôle du Madrillet, F76801 Saint-Étienne-du-Rouvray, France.

Abstract.

We consider continuous-time Markov chains which display a family of wells at the same depth. We provide sufficient conditions which entail the convergence of the finite-dimensional distributions of the order parameter to the ones of a finite state Markov chain. We also show that the state of the process can be represented as a time-dependent convex combination of metastable states, each of which is supported on one well.

1. Introduction

Several different methods to prove the metastable behavior of Markov chains have been proposed in recent years [39, 11, 17, 18, 20, 10, 21].

Inspired by the potential theoretic approach to metastability, proposed by Bovier, Eckhoff, Gayrard and Klein in [12, 13], Beltrán and Landim introduced a general method, known as the martingale method, to derive the metastable behavior of a Markov process [3, 6, 7]. The reader will find in [7] a discussion on the similarities and differences between the martingale approach, the pathwise approach, put forward in [16] and presented in [39], and the potential theoretic approach, proposed in [12, 13] and reviewed in [11].

To place the main results of the article in context, we recall below the martingale method in the setting of condensing zero-range processes [5, 30, 43]. Denote by ${\mathbb{N}}$ the non-negative integers, ${\mathbb{N}}=\{0,1,2,\dots\}$, by ${\mathbb{T}}_{L}$, $L\geq 1$, the discrete, one-dimensional torus with $L$ points, and by $\eta$ the elements of ${\mathbb{N}}^{{\mathbb{T}}_{L}}$, called configurations. The total number of particles at $x\in{\mathbb{T}}_{L}$ for a configuration $\eta\in{\mathbb{N}}^{{\mathbb{T}}_{L}}$ is represented by $\eta_{x}$. Let $E_{N}$, $N\geq 1$, be the set of configurations with $N$ particles:

$$E_{N}\;:=\;\Big\{\eta\in{\mathbb{N}}^{{\mathbb{T}}_{L}}:\sum_{x\in{\mathbb{T}}_{L}}\eta_{x}=N\Big\}\;.$$

Fix $\alpha>1$ , and define $g:{\mathbb{N}}\to{\mathbb{R}}_{+}$ as

$$g(0)=0\;,\quad g(1)=1\;,\quad\text{and}\quad g(n)=\frac{a(n)}{a(n-1)}\;,\quad n\geq 2\;,$$

where $a(0)=1$ , $a(n)=n^{\alpha}$ , $n\geq 1$ . In this way, $\prod_{i=1}^{n}g(i)=a(n)$ , $n\geq 1$ , and $\{g(n):n\geq 2\}$ is a strictly decreasing sequence converging to $1$ as $n\uparrow\infty$ .

Fix $1/2\leq p\leq 1$ , and denote by $p(x)$ the transition probability given by $p(1)=p$ , $p(-1)=1-p$ , $p(x)=0$ , otherwise. Let $\sigma^{x,y}\eta$ be the configuration obtained from $\eta$ by moving a particle from $x$ to $y$ :

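$$(\sigma^{x,y}\eta)_{z}\;=\;\begin{cases}\eta_{x}-1&\text{for $z=x$}\\ \eta_{y}+1&\text{for $z=y$}\\ \eta_{z}&\text{otherwise}\;.\end{cases}$$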

The nearest-neighbor, zero-range process associated to the jump rates $\{g(k):k\geq 0\}$ and the transition probability $p(x)$ is the continuous-time, $E_{N}$ -valued Markov process $\{\eta^{N}(t):t\geq 0\}$ whose generator $L_{N}$ acts on functions $f:E_{N}\to{\mathbb{R}}$ as

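$$(L_{N}f)(\eta)\;=\;\sum_{\substack{x,y\in{\mathbb{T}}_{L}\\ x\not=y}}g(\eta_{x})\,p(y-x)\,\big\{f(\sigma^{x,y}\eta)-f(\eta)\big\}\;.$$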

Hence, if there are $k$ particles at site $x$ , at rate $pg(k)$ , resp. $(1-p)g(k)$ , one of them jumps to the right, resp. left. Since $g(k)$ decreases to $1$ as $k\to\infty$ , the more particles there are at some site $x$ the slower they jump, but the rate remains bounded below by $1$ .

This Markov process is irreducible. The stationary probability measure, denoted by $\mu_{N}$ , is given by

$$\mu_{N}(\eta)\;=\;\frac{N^{\alpha}}{Z_{N}}\prod_{x\in{\mathbb{T}}_{L}}\frac{1}{a(\eta_{x})}\;,$$

where $Z_{N}$ is the normalizing constant.
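As a sanity check on the product form of $\mu_{N}$, with weights proportional to $\prod_{x}1/a(\eta_{x})$, the following minimal Python sketch enumerates $E_{N}$ for a small torus and verifies, in exact rational arithmetic, that the probability flowing into each configuration equals the probability flowing out. The parameter values ($L=3$, $N=4$, $\alpha=2$, $p=3/4$) and all function names are illustrative assumptions, not taken from the article. The constant $N^{\alpha}/Z_{N}$ is dropped because stationarity does not depend on the normalization.

```python
from fractions import Fraction
from itertools import product


def a(n, alpha):
    """a(0) = 1 and a(n) = n**alpha for n >= 1 (alpha is an integer here)."""
    return Fraction(1) if n == 0 else Fraction(n) ** alpha


def g(n, alpha):
    """Jump rate: g(0) = 0 and g(n) = a(n) / a(n - 1) for n >= 1, so g(1) = 1."""
    if n == 0:
        return Fraction(0)
    return a(n, alpha) / a(n - 1, alpha)


def configurations(L, N):
    """All configurations of N particles on a torus with L sites."""
    return [eta for eta in product(range(N + 1), repeat=L) if sum(eta) == N]


def jumps(eta, p, alpha):
    """Yield (target, rate) for each nearest-neighbor jump out of eta."""
    L = len(eta)
    for x in range(L):
        for step, prob in ((1, p), (-1, 1 - p)):
            rate = g(eta[x], alpha) * prob
            if rate == 0:
                continue  # empty site, or a direction with zero probability
            y = (x + step) % L
            target = list(eta)
            target[x] -= 1
            target[y] += 1
            yield tuple(target), rate


def weight(eta, alpha):
    """Unnormalized stationary weight: product over sites of 1 / a(eta_x)."""
    w = Fraction(1)
    for k in eta:
        w /= a(k, alpha)
    return w


def is_stationary(L, N, alpha, p):
    """Check that inflow equals outflow of mass at every configuration."""
    inflow = {eta: Fraction(0) for eta in configurations(L, N)}
    outflow = dict(inflow)
    for eta in inflow:
        for target, rate in jumps(eta, p, alpha):
            inflow[target] += weight(eta, alpha) * rate
            outflow[eta] += weight(eta, alpha) * rate
    return inflow == outflow


print(is_stationary(L=3, N=4, alpha=2, p=Fraction(3, 4)))
```

Using `Fraction` rather than floats makes the balance check an exact equality, which is why an integer $\alpha$ is chosen for the illustration.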

Fix a sequence $\{\ell_{N}:N\geq 1\}$ such that $\ell_{N}\to\infty$ , $N/\ell_{N}\to\infty$ , and let ${\mathscr{E}}^{x}_{N}$ , $x\in{\mathbb{T}}_{L}$ , be the set of configurations in which all but $\ell_{N}$ particles sit at $x$ :

$${\mathscr{E}}^{x}_{N}\;:=\;\Big\{\eta\in E_{N}:\eta_{x}\geq N-\ell_{N}\Big\}\;.$$

According to equation (3.2) in [5], for each $x\in{\mathbb{T}}_{L}$ , $\mu_{N}({\mathscr{E}}^{x}_{N})\to 1/L$ as $N\uparrow\infty$ .
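The concentration of $\mu_{N}$ on the wells can be observed numerically in the smallest case. The sketch below is only an illustration under assumed parameters: it takes $L=2$ sites (so a configuration is a pair $(k,N-k)$), $\alpha=2$ and $\ell_{N}=\lfloor\sqrt{N}\rfloor$, none of which are prescribed by the article, and computes the mass of one well from the product weights $1/(a(k)\,a(N-k))$. By symmetry both wells carry the same mass, and the printed values should approach $1/2$ as $N$ grows.

```python
from fractions import Fraction
from math import isqrt


def a(n, alpha):
    """a(0) = 1 and a(n) = n**alpha for n >= 1 (alpha is an integer here)."""
    return Fraction(1) if n == 0 else Fraction(n) ** alpha


def well_mass(N, alpha, site):
    """mu_N of the well at `site` (0 or 1) for L = 2, with ell_N = isqrt(N)."""
    ell = isqrt(N)
    # Unnormalized weight of the configuration (k, N - k), k = 0, ..., N.
    w = [1 / (a(k, alpha) * a(N - k, alpha)) for k in range(N + 1)]
    total = sum(w)
    if site == 0:
        return sum(w[N - ell:]) / total   # eta_0 = k >= N - ell
    return sum(w[:ell + 1]) / total       # eta_1 = N - k >= N - ell


for N in (50, 200, 800):
    print(N, float(well_mass(N, alpha=2, site=0)))
```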

By the ergodic theorem, the process stays most of the time in the set $\sqcup_{x\in{\mathbb{T}}_{L}}{\mathscr{E}}^{x}_{N}$ . Since these sets are far apart, one expects the sets ${\mathscr{E}}^{x}_{N}$ to behave as wells of the dynamics: the process remains for a very long time in each of the sets ${\mathscr{E}}^{x}_{N}$ at the end of which it performs a quick transition to another set ${\mathscr{E}}^{y}_{N}$ .

If the process evolves as described in the previous paragraph, it is reasonable to call the depth of the well ${\mathscr{E}}^{x}_{N}$ the average time the process remains in ${\mathscr{E}}^{x}_{N}$ before hitting another well. The symmetry of the model implies that in the zero-range process introduced above all wells have the same depth. This is an important difference between this dynamics and the previous ones in which metastable behavior has been observed. In the latter, cf. [39, 11], the models feature one shallow and one deep well, and the problem consists in describing the transition from the shallow well to the deep one, or in estimating the mean value of the transition time. In contrast, in the zero-range process, the presence of many wells of the same depth turns the problem into the characterization of the evolution of the process among the wells.

Beltrán and Landim proposed in [3, 6] a mathematical formulation of this phenomenon which we present below in the context of a sequence of Markov chains, each of which takes values in a finite set.

Consider a sequence of finite sets $(E_{N}:N\geq 1)$ whose cardinality tends to infinity with $N$ . The elements of $E_{N}$ are called configurations and are denoted by the Greek letters $\eta$ , $\xi$ , $\zeta$ . Let $\{\eta^{N}(t):t\geq 0\}$ be a continuous-time, $E_{N}$ -valued, irreducible Markov chain.

The wells. Consider a partition ${\mathscr{E}}^{1}_{N},\dots,{\mathscr{E}}^{{\mathfrak{n}}}_{N}$ , $\Delta_{N}$ , ${\mathfrak{n}}\geq 2$ , of the set $E_{N}$ , and let

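$${\mathscr{E}}_{N}\;=\;{\mathscr{E}}^{1}_{N}\sqcup\cdots\sqcup{\mathscr{E}}^{{\mathfrak{n}}}_{N}\;,\quad\breve{{\mathscr{E}}}^{x}_{N}\;=\;\bigsqcup_{y\not=x}{\mathscr{E}}^{y}_{N}\;. \tag{1.3}$$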

Here and below we use the notation ${\mathscr{A}}\sqcup{\mathscr{B}}$ to represent the union of two disjoint sets ${\mathscr{A}}$ , ${\mathscr{B}}$ : ${\mathscr{A}}\sqcup{\mathscr{B}}={\mathscr{A}}\cup{\mathscr{B}}$ , and ${\mathscr{A}}\cap{\mathscr{B}}=\varnothing$ . As in the example above, the sets ${\mathscr{E}}^{x}_{N}$ have to be understood as the wells of the dynamics, the sets where the process remains most of the time, and $\Delta_{N}$ as the set which separates the wells.

The time scale. Let $(\theta_{N}:N\geq 1)$ be the time-scale at which one observes a transition from a well ${\mathscr{E}}^{x}_{N}$ to the set $\breve{{\mathscr{E}}}^{x}_{N}$ which consists of the union of all the other wells. This time-scale has to be determined in each model. As it can be expressed in terms of capacities (cf. Lemma 6.8 in [3]), its derivation corresponds to the calculation of the capacity between ${\mathscr{E}}^{x}_{N}$ and $\breve{{\mathscr{E}}}^{x}_{N}$ .

Denote by $\xi^{N}(t)$ the process $\eta^{N}(t)$ sped up by $\theta_{N}$: $\xi^{N}(t)=\eta^{N}(t\theta_{N})$. Note that the transitions between wells occur in time-intervals of order $1$ for the process $\xi^{N}(t)$. This is the reason for changing the time scale and introducing $\xi^{N}(t)$.

Model reduction. We expect the process to remain for a very long time in each well, a time much longer than the time it needs to equilibrate inside the well. If this description is correct, the hitting time of a new well should be asymptotically Markovian due to the loss of memory entailed by the equilibration.

Let $\Phi_{N}:E_{N}\to\{0,1,\dots,{\mathfrak{n}}\}$ , $\Psi_{N}:{\mathscr{E}}_{N}\to\{1,\dots,{\mathfrak{n}}\}$ be the projections defined by

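$$\Phi_{N}(\eta)\;=\;\sum_{x=1}^{{\mathfrak{n}}}x\,{\mathbf{1}}\{\eta\in{\mathscr{E}}^{x}_{N}\}\;,\quad\Psi_{N}(\eta)\;=\;\sum_{x=1}^{{\mathfrak{n}}}x\,{\mathbf{1}}\{\eta\in{\mathscr{E}}^{x}_{N}\}\;.$$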

Note that $\Phi_{N}(\eta)=0$ for $\eta\in\Delta_{N}$ , while $\Psi_{N}$ is not defined on the set $\Delta_{N}$ . In general, $\Phi_{N}(\xi^{N}(t))$ is not a Markov chain, but only a hidden Markov chain. As the cardinality of $E_{N}$ increases to $\infty$ with $N$ , $\Phi_{N}(\xi^{N}(t))$ takes values in a much smaller state space than $\xi^{N}(t)$ . For this reason it is called the reduced chain.

The argument laid down above on equilibration and loss of memory suggests that $\Phi_{N}(\xi^{N}(t))$ converges to a Markov chain taking values in $\{0,1,\dots,{\mathfrak{n}}\}$ . However, the brief sojourns at $\Delta_{N}$ create an obstacle to the convergence. Starting from the well ${\mathscr{E}}^{x}_{N}$ , the process $\xi^{N}(t)$ makes many unsuccessful attempts before hitting a new well ${\mathscr{E}}^{y}_{N}$ . These attempts correspond to brief visits to $\Delta_{N}$ . A typical path of $\Phi_{N}(\xi^{N}(t))$ is illustrated in Figure 1. These short sojourns at $\Delta_{N}$ , which disappear in the limit, prevent the convergence (in the usual Skorohod topology) of the process $\Phi_{N}(\xi^{N}(t))$ to a $\{1,\dots,{\mathfrak{n}}\}$ -valued Markov chain.

To overcome this difficulty, we perform a small surgery in the trajectories by removing from them the pieces of the paths in $\Delta_{N}$ . This is done by considering the trace of the process $\xi^{N}(t)$ on ${\mathscr{E}}_{N}$ .

Trace process. Fix a proper subset ${\mathscr{A}}$ of $E_{N}$ . The trace of the process $\xi^{N}(t)$ on the set ${\mathscr{A}}$ , denoted by $\xi^{{\mathscr{A}}}(t)$ , is the process obtained from $\xi^{N}(t)$ by stopping its evolution when it leaves the set ${\mathscr{A}}$ and by restarting it when it returns to the set ${\mathscr{A}}$ . More precisely, denote by $T_{{\mathscr{A}}}(t)$ the total time spent at ${\mathscr{A}}$ before time $t$ :

$$T_{{\mathscr{A}}}(t)\;=\;\int_{0}^{t}{\mathbf{1}}\{\xi^{N}(s)\in{\mathscr{A}}\}\,ds\;,$$

where ${\mathbf{1}}\{B\}$ represents the indicator of the set $B$. Note that the function $T_{{\mathscr{A}}}$ is piecewise differentiable and that its derivative takes only the values $1$ and $0$. It is equal to $1$ when the process is in ${\mathscr{A}}$ and it is equal to $0$ when it is not. Let $S_{{\mathscr{A}}}(t)$ be the generalized inverse of $T_{{\mathscr{A}}}(t)$:

$$S_{{\mathscr{A}}}(t)\;=\;\sup\{s\geq 0:T_{{\mathscr{A}}}(s)\leq t\}\;.$$

The trace process is defined as $\xi^{{\mathscr{A}}}(t)=\xi^{N}(S_{{\mathscr{A}}}(t))$ . It is shown in [3, Proposition 6.1] that if $\xi^{N}(t)$ is a continuous-time, irreducible Markov chain, then $\xi^{{\mathscr{A}}}(t)$ is a continuous-time, ${\mathscr{A}}$ -valued, irreducible Markov chain whose jump rates can be expressed in terms of the probabilities of hitting times of the original chain.
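For a piecewise-constant trajectory the trace construction amounts to deleting the excursions outside ${\mathscr{A}}$ and gluing the remaining pieces together. The Python sketch below illustrates this on a single path. It assumes, purely for illustration, that a path is stored as a list of `(state, holding_time)` pairs; the function names and the example path are hypothetical and do not come from the article.

```python
def time_in_set(path, A, t):
    """T_A(t): total time the path spends in A during [0, t]."""
    elapsed, total = 0.0, 0.0
    for state, hold in path:
        if elapsed >= t:
            break
        dt = min(hold, t - elapsed)  # part of this sojourn that lies before t
        if state in A:
            total += dt
        elapsed += dt
    return total


def trace(path, A):
    """Trace of the path on A: drop sojourns outside A, glue the rest."""
    out = []
    for state, hold in path:
        if state not in A:
            continue
        if out and out[-1][0] == state:
            # Returning to the same state after an excursion outside A:
            # the trace never left it, so the holding times add up.
            out[-1] = (state, out[-1][1] + hold)
        else:
            out.append((state, hold))
    return out


# Hypothetical path of an order parameter; 0 marks a short visit outside A.
path = [(1, 2.0), (0, 0.5), (1, 1.0), (2, 3.0)]
print(time_in_set(path, {1, 2}, 3.0))  # 2.5
print(trace(path, {1, 2}))             # [(1, 3.0), (2, 3.0)]
```

In the example, the brief visit to state `0` disappears from the traced path, and the two sojourns at state `1` merge into one of length `3.0`.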

Denote by $\xi^{{\mathscr{E}}_{N}}(t)$ the trace of the process $\xi^{N}(t)$ on ${\mathscr{E}}_{N}$ . By the previous paragraph, $\xi^{{\mathscr{E}}_{N}}(t)$ is an ${\mathscr{E}}_{N}$ -valued Markov process. If the time spent on $\Delta_{N}$ is negligible, we only removed from the original trajectory the short sojourns in $\Delta_{N}$ .

Metastability. Denote by $X_{N}(t)$ , $X^{T}_{N}(t)$ the hidden Markov chains given by $X_{N}(t)=\Phi_{N}(\xi^{N}(t))$ , $X^{T}_{N}(t)=\Psi_{N}(\xi^{{\mathscr{E}}_{N}}(t))$ , respectively. Note that $X_{N}(t)$ takes values in $\{0,1,\dots,{\mathfrak{n}}\}$ , while $X^{T}_{N}(t)$ takes values on the set $S:=\{1,\dots,{\mathfrak{n}}\}$ . Moreover, $X^{T}_{N}(t)$ is the trace on the set $S$ of the process $X_{N}(t)$ .

Let $D({\mathbb{R}}_{+},E_{N})$ be the space of right-continuous functions $\omega:{\mathbb{R}}_{+}\to E_{N}$ with left-limits endowed with the Skorohod topology. Let ${\mathbb{P}}_{\eta}={\mathbb{P}}^{N}_{\eta}$ , $\eta\in E_{N}$ , be the probability measure on the path space $D({\mathbb{R}}_{+},E_{N})$ induced by the Markov chain $\xi^{N}(t)$ starting from $\eta$ . Expectation with respect to ${\mathbb{P}}_{\eta}$ is represented by ${\mathbb{E}}_{\eta}$ .

In [3, 6, 7], conditions have been introduced which yield that

(H1) The dynamics $X^{T}_{N}(t)=\Psi_{N}(\xi^{{\mathscr{E}}_{N}}(t))$ is asymptotically Markovian: For all $x\in S$, and sequences $\eta^{N}\in{\mathscr{E}}^{x}_{N}$, under the measure ${\mathbb{P}}_{\eta^{N}}$ the process $X^{T}_{N}(t)$ converges in the Skorohod topology to a Markov chain denoted by ${\boldsymbol{X}}(t)$;

(H2) The time spent in $\Delta_{N}$ is negligible: For all $t>0$,

$$\lim_{N\to\infty}\max_{\eta\in{\mathscr{E}}_{N}}{\mathbb{E}}_{\eta}\Big[\int_{0}^{t}{\mathbf{1}}\{X_{N}(s)=0\}\,ds\Big]\;=\;0\;.$$

The first condition asserts that the trace on $S$ of the process $X_{N}(t)$ converges to a Markov chain, while the second one states that the amount of time the process $X_{N}(t)$ spends outside $S$ vanishes as $N\uparrow\infty$ , uniformly over initial configurations in ${\mathscr{E}}_{N}$ .

The second condition can be restated as

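$$\lim_{N\to\infty}\max_{\eta\in{\mathscr{E}}_{N}}{\mathbb{E}}_{\eta}\Big[\int_{0}^{t}{\mathbf{1}}\{\xi^{N}(s)\in\Delta_{N}\}\,ds\Big]\;=\;0\;.$$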

Soft topology. It is clear that the convergence of the process $X_{N}(t)$ to ${\boldsymbol{X}}(t)$ in the Skorohod topology does not follow from conditions (H1) and (H2). Consider, for example, a continuous-time, $S$ -valued Markov chain $Y(t)$ , and a sequence $\delta_{N}>0$ , $\delta_{N}\downarrow 0$ . Fix $t_{0}>0$ , and define the process $Y_{N}(t)$ by $Y_{N}(t)=Y(t){\mathbf{1}}\{t\not\in[t_{0}-\delta_{N},t_{0}+\delta_{N})\}$ . The sequence of processes $Y_{N}(t)$ fulfills properties (H1) and (H2), but $Y_{N}(t)$ does not converge to $Y(t)$ in the Skorohod topology. Actually, not even the $1$ -dimensional distributions converge.

This example is artificial, but in almost all models in which a metastable behavior has been observed (cf. the examples of Section 5), as mentioned in the subsection Model Reduction, due to the many and very short sojourns of $X_{N}(t)$ in $0$, the process $X_{N}(t)$ cannot converge in any of the Skorohod topologies to ${\boldsymbol{X}}(t)$. To overcome this obstacle, a weaker topology, called the soft topology, has been proposed in [31], in which the convergence takes place.

The soft topology is, however, quite weak. For instance, the function which associates to a trajectory $\omega\in D([0,T],S\cup\{0\})$ the value $\sup_{0\leq t\leq T}|\omega(t)|$ is not continuous. For this reason, we put forward in this article an alternative definition of metastability. We propose to declare that the sequence of Markov chains $\eta^{N}(t)$ is metastable in the time-scale $\theta_{N}$ if the finite-dimensional distributions of $X_{N}(t)=\Phi_{N}(\xi^{N}(t))$ converge to the ones of ${\boldsymbol{X}}(t)$ . Moreover, we show that the conditions (H1), (H2) together with an extra condition on the visits to the set $\Delta_{N}$ , stated below in equation (2.1), entail the metastability of the Markov chains $\eta^{N}(t)$ in the FDD sense. This latter result, stated in Proposition 2.1 below, is the main contribution of this article.

We also show, in Proposition 2.2, that conditions (H1), (H2) together with slightly stronger assumptions entail the convergence of the state of the process to a time-dependent convex combination of metastable states.

2. Notation and Results

We present in this section the main results of the article. We adopt the notation introduced in the previous section: $\eta^{N}(t)$ is an $E_{N}$ -valued, irreducible Markov chain, whose state space can be decomposed as in (1.3).

Convergence of the finite-dimensional distributions. The main result of the article reads as follows:

Proposition 2.1.

Beyond (H1) and (H2), suppose that for all $x\in S$ ,

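$$\lim_{\delta\to 0}\limsup_{N\to\infty}\max_{\eta\in{\mathscr{E}}^{x}_{N}}\sup_{2\delta\leq s\leq 3\delta}{\mathbb{P}}_{\eta}\big[\xi^{N}(s)\in\Delta_{N}\big]\;=\;0\;. \tag{2.1}$$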

Then, for all $x\in S$ , and all sequences $\{\eta^{N}:N\geq 1\}$ , $\eta^{N}\in{\mathscr{E}}^{x}_{N}$ , under ${\mathbb{P}}_{\eta^{N}}$ the finite-dimensional distributions of $X_{N}(t)$ converge to the finite-dimensional distributions of the chain ${\boldsymbol{X}}(t)$ .

The proof of this result is presented in Section 3, together with several, easier to verify, sufficient conditions for (2.1) to hold.

Slow variables. In all models where metastability has been proved the time-scale $\theta_{N}$ increases to $\infty$ with $N$ . Since it follows from the previous paragraphs that the finite-dimensional distributions of $\Phi_{N}(\xi^{N}(t))$ converge to the ones of the Markov chain ${\boldsymbol{X}}(t)$ , we say that $\Phi_{N}$ is a slow variable. In this sense, metastability consists in discovering the slow variables of the system and in deriving their asymptotic dynamics.

Convergence of the states. We have coined properties (H1) and (H2) as the metastable behavior of the Markov chain $\eta^{N}(t)$ in the time-scale $\theta_{N}$. However, it has been pointed out that in mathematical physics metastability means the convergence of the state of the process. The second result of this note fills the gap between these two concepts by establishing that properties (H1), (H2) together with conditions (M1), (M2) below lead to the convergence of the state of the process to a convex combination of states supported on the wells ${\mathscr{E}}^{x}_{N}$. The precise statement of this result requires some notation.

Recall that we denote by $\xi^{N}(t)$ the Markov chain $\eta^{N}(t)$ sped up by $\theta_{N}$: $\xi^{N}(t)=\eta^{N}(t\theta_{N})$. Denote by $\mu_{N}$ the unique stationary state of the chain $\xi^{N}(t)$, and by $\mu^{y}_{N}$, $y\in S$, the probability measure $\mu_{N}$ conditioned to ${\mathscr{E}}^{y}_{N}$:

$$\mu^{y}_{N}(\xi)\;=\;\frac{\mu_{N}(\xi)}{\mu_{N}({\mathscr{E}}^{y}_{N})}\,{\mathbf{1}}\{\xi\in{\mathscr{E}}^{y}_{N}\}\;,\quad\xi\in E_{N}\;.$$

Note that $\mu^{y}_{N}$ is defined on $E_{N}$ and it is supported on ${\mathscr{E}}^{y}_{N}$ .

Reflected process. For $x\in S$ , denote by $\{\xi^{N}_{R,x}(t):t\geq 0\}$ the Markov chain $\xi^{N}(t)$ reflected at ${\mathscr{E}}^{x}_{N}$ . This is the Markov chain obtained from $\xi^{N}(t)$ by forbidding jumps from ${\mathscr{E}}^{x}_{N}$ to its complement $({\mathscr{E}}^{x}_{N})^{c}$ . This mechanism produces a new Markov chain whose state space is ${\mathscr{E}}^{x}_{N}$ , which might be reducible.

We assume that for each $x\in S$ the reflected chain $\xi^{N}_{R,x}(t)$ is irreducible and that $\mu^{x}_{N}$ is its unique stationary state. In the reversible case this latter assumption follows from the irreducibility. In the non-reversible case, if the Markov chain $\eta^{N}(t)$ is a cycle chain (cf. [22, 35]), it is easy to define the sets ${\mathscr{E}}^{x}_{N}$ so that the reflected chain on ${\mathscr{E}}^{x}_{N}$ is irreducible. Let $(\mathcal{S}^{R,x}_{N}(t):t\geq 0)$ be the semigroup of the Markov chain $\xi^{N}_{R,x}(t)$.

Trace process. Similarly, we denote by $\xi^{N}_{T,x}(t)$ the trace on ${\mathscr{E}}_{N}^{x}$ of the process $\xi^{N}(t)$ , and by $(\mathcal{S}^{T,x}_{N}(t):t\geq 0)$ the semigroup of the Markov chain $\xi^{N}_{T,x}(t)$ .

Mixing times. Denote by $\|\mu-\nu\|_{\rm TV}$ the total variation distance between two probability measures defined on the same denumerable set $\Omega$ :

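$$\|\mu-\nu\|_{\rm TV}\;=\;\frac{1}{2}\,\sum_{\eta\in\Omega}|\mu(\eta)-\nu(\eta)|\;=\;\sum_{\eta\in\Omega}\big(\mu(\eta)-\nu(\eta)\big)^{+}\;,$$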

where $x^{+}=\max\{x,0\}$ denotes the positive part of $x\in{\mathbb{R}}$ . Hereafter, the set $\Omega$ will be either the set $E_{N}$ or one of the wells ${\mathscr{E}}^{x}_{N}$ , $x\in S$ .
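The two expressions for the total variation distance agree because the positive and negative parts of $\mu-\nu$ have the same total mass when both measures are probabilities. The short Python sketch below checks this on a hypothetical pair of measures, stored as dictionaries from points to exact rational masses; the function names and the example measures are illustrative assumptions.

```python
from fractions import Fraction


def tv_half_l1(mu, nu):
    """Total variation distance as half of the L1 distance."""
    support = set(mu) | set(nu)
    return sum(abs(mu.get(k, 0) - nu.get(k, 0)) for k in support) / 2


def tv_positive_part(mu, nu):
    """Total variation distance as the mass of the positive part of mu - nu."""
    support = set(mu) | set(nu)
    return sum(max(mu.get(k, 0) - nu.get(k, 0), 0) for k in support)


# Hypothetical probability measures on the set {"a", "b", "c"}.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}
print(tv_half_l1(mu, nu), tv_positive_part(mu, nu))  # 1/2 1/2
```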

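Denote by $T^{N,R,x}_{\rm mix}$, $T^{N,T,x}_{\rm mix}$ the $(\frac{1}{2e})$-mixing times of the reflected and trace processes, respectively:

$$\begin{gathered}T^{N,R,x}_{\rm mix}\;=\;\inf\Big\{t>0:\max_{\eta\in{\mathscr{E}}^{x}_{N}}\|\delta_{\eta}\mathcal{S}^{R,x}_{N}(t)-\mu^{x}_{N}\|_{\rm TV}\leq\frac{1}{2e}\Big\}\;,\\ T^{N,T,x}_{\rm mix}\;=\;\inf\Big\{t>0:\max_{\eta\in{\mathscr{E}}^{x}_{N}}\|\delta_{\eta}\mathcal{S}^{T,x}_{N}(t)-\mu^{x}_{N}\|_{\rm TV}\leq\frac{1}{2e}\Big\}\;,\end{gathered}$$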

where $\delta_{\eta}$ stands for the Dirac measure concentrated on the configuration $\eta$ .

Hitting times. For a subset ${\mathscr{A}}$ of $E_{N}$ , denote by $H_{{\mathscr{A}}}$ , $H^{+}_{{\mathscr{A}}}$ the hitting time and the time of the first return to ${{\mathscr{A}}}$ :

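$$H_{{\mathscr{A}}}\;=\;\inf\big\{t>0:\xi^{N}(t)\in{\mathscr{A}}\big\}\;,\quad H^{+}_{{\mathscr{A}}}\;=\;\inf\big\{t>\tau_{1}:\xi^{N}(t)\in{\mathscr{A}}\big\}\;,$$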

where $\tau_{1}$ represents the time of the first jump of the chain $\xi^{N}(t)$ : $\tau_{1}=\inf\{t>0:\xi^{N}(t)\not=\xi^{N}(0)\}$ .

For two subsets ${\mathscr{A}}\subset{\mathscr{B}}\subset E_{N}$ , denote by $H^{{\mathscr{B}}}_{{\mathscr{A}}}$ the hitting time on the set ${\mathscr{A}}$ of the trace process on ${\mathscr{B}}$ :

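$$H^{{\mathscr{B}}}_{{\mathscr{A}}}\;=\;\int_{0}^{H_{{\mathscr{A}}}}{\mathbf{1}}\{\xi^{N}(s)\in{\mathscr{B}}\}\,ds\;.$$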

Let $(\alpha_{N}:N\geq 1)$ , $(\beta_{N}:N\geq 1)$ be two sequences of positive numbers. The relation $\alpha_{N}\ll\beta_{N}$ means that $\lim_{N\to\infty}\alpha_{N}/\beta_{N}=0$ . In the next result, we assume that for each $x\in S$ there exists a set ${\mathscr{B}}_{N}^{x}\subset{\mathscr{E}}_{N}^{x}$ fulfilling the following conditions:

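(M1) For every $\delta>0$ we have

$$\lim_{N\to\infty}\max_{x\in S}\,\sup_{\eta\in{\mathscr{E}}^{x}_{N}}{\mathbb{P}}_{\eta}\big[H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}^{x}}>\delta\big]\;=\;0\;.$$

(M2) There exists a time-scale $(\varepsilon_{N}:N\geq 1)$ such that $\varepsilon_{N}\ll 1$,

$$\lim_{N\to\infty}\max_{x\in S}\,\sup_{\eta\in{\mathscr{B}}_{N}^{x}}{\mathbb{P}}_{\eta}\big[H_{\Delta_{N}}\leq 2\,\varepsilon_{N}\big]\;=\;0\;,$$

and

$$\lim_{N\to\infty}\max_{x\in S}\,\sup_{\eta\in{\mathscr{B}}_{N}^{x}}\big\|\delta_{\eta}\mathcal{S}^{R,x}_{N}(\varepsilon_{N})-\mu^{x}_{N}\big\|_{\rm TV}\;=\;0\;. \tag{2.8}$$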

Condition (M1) requires that the process restricted to ${\mathscr{E}}^{x}_{N}$ reaches the set ${\mathscr{B}}^{x}_{N}$ quickly. Additionally, condition (M2) imposes that it takes longer to leave the set ${\mathscr{E}}^{x}_{N}$, when starting from ${\mathscr{B}}^{x}_{N}$, than it takes to mix in ${\mathscr{E}}^{x}_{N}$. Slightly more precisely, condition (M2) requests the existence of a time scale $\varepsilon_{N}$, longer than the mixing time of the reflected process and shorter than the exit time from the set ${\mathscr{E}}^{x}_{N}$. Note, however, that in condition (2.8) the initial configuration belongs to the set ${\mathscr{B}}^{x}_{N}$, while in the definition of the mixing time the initial configuration may be any element of the set ${\mathscr{E}}^{x}_{N}$. In any case, condition (2.8) is in force if $\varepsilon_{N}\gg T^{N,R,x}_{\rm mix}$.

Assume that the chain is reversible. Fix $y\in S$ , denote by $p^{R,y}_{t}(\zeta,\xi)$ the transition probabilities of the reflected process $\xi^{N}_{R,y}(t)$ , and fix $\eta\in{\mathscr{B}}^{y}_{N}$ . By definition,

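$$\big\|\delta_{\eta}\mathcal{S}^{R,y}_{N}(\varepsilon_{N})-\mu^{y}_{N}\big\|_{\rm TV}\;=\;\frac{1}{2}\sum_{\zeta\in{\mathscr{E}}^{y}_{N}}\big|f_{t}(\zeta)-1\big|\,\mu^{y}_{N}(\zeta)\;,$$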

where $f_{t}(\zeta)=p^{R,y}_{t}(\eta,\zeta)/\mu^{y}_{N}(\zeta)$ and $t=\varepsilon_{N}$. By the Schwarz inequality and a decomposition of $f_{t}$ along the eigenfunctions of the generator of the reflected process (cf. equation (12.5) in [37]), the square of the previous expression is bounded by $\exp\{-2\lambda_{R,y}t\}\|f_{0}\|_{\mu^{y}_{N}}^{2}$, where $\lambda_{R,y}$ represents the spectral gap of $\xi^{N}_{R,y}(t)$ and $\|f_{0}\|_{\mu^{y}_{N}}$ the norm of $f_{0}$ in $L^{2}(\mu^{y}_{N})$. Since

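$$\|f_{0}\|_{\mu^{y}_{N}}^{2}\;=\;\sum_{\zeta\in{\mathscr{E}}^{y}_{N}}f_{0}(\zeta)^{2}\,\mu^{y}_{N}(\zeta)\;=\;\sum_{\zeta\in{\mathscr{E}}^{y}_{N}}\frac{\delta_{\eta,\zeta}}{\mu^{y}_{N}(\zeta)^{2}}\,\mu^{y}_{N}(\zeta)\;=\;\frac{1}{\mu^{y}_{N}(\eta)}\;,$$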

as $t=\varepsilon_{N}$ , we conclude that

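$$\big\|\delta_{\eta}\mathcal{S}^{R,y}_{N}(\varepsilon_{N})-\mu^{y}_{N}\big\|_{\rm TV}\;\leq\;\frac{1}{\mu^{y}_{N}(\eta)^{1/2}}\,e^{-\lambda_{R,y}\varepsilon_{N}}\;.$$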

Therefore, in the reversible case, condition (2.8) of (M2) is fulfilled provided

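$$\lim_{N\to\infty}\max_{y\in S}\,\sup_{\eta\in{\mathscr{B}}_{N}^{y}}\frac{1}{\mu^{y}_{N}(\eta)^{1/2}}\,e^{-\lambda_{R,y}\varepsilon_{N}}\;=\;0\;.$$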
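The spectral-gap bound on the total variation distance can be checked by hand on the simplest reversible example. The Python sketch below uses a hypothetical two-state chain (jumping from $0$ to $1$ at rate `a` and back at rate `b`), which is not a model from the article and only stands in for a reflected process on a well. For this chain the stationary measure, the spectral gap $a+b$ and the transition probabilities are known in closed form, so the distance to equilibrium at time $t$ can be compared directly with $\mu(\eta)^{-1/2}e^{-\lambda t}$.

```python
import math


def two_state_check(a, b, t):
    """Return (TV distance, spectral bound) for the chain started at state 0.

    The chain jumps 0 -> 1 at rate a and 1 -> 0 at rate b.
    """
    gap = a + b                      # spectral gap of the two-state generator
    pi0, pi1 = b / gap, a / gap      # stationary (reversible) measure
    decay = math.exp(-gap * t)
    p00 = pi0 + pi1 * decay          # transition probabilities from state 0
    p01 = pi1 - pi1 * decay
    tv = 0.5 * (abs(p00 - pi0) + abs(p01 - pi1))
    bound = decay / math.sqrt(pi0)   # mu(eta)^{-1/2} * exp(-lambda * t)
    return tv, bound


for t in (0.0, 0.5, 2.0):
    tv, bound = two_state_check(a=1.0, b=3.0, t=t)
    print(t, tv, bound, tv <= bound)
```

Here the exact distance equals $\pi_{1}e^{-(a+b)t}$, which is always below the bound since $\pi_{1}\leq 1\leq\pi_{0}^{-1/2}$. The bound is therefore far from sharp in this toy case, but it decays at the same exponential rate.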

Proposition 2.2.

Assume that conditions (H1), (H2), and (M1), (M2) are in force. Suppose, furthermore, that for all $y\in S$

$$\lim_{N\to\infty}\frac{\mu_{N}(\Delta_{N})}{\mu_{N}({\mathscr{E}}^{y}_{N})}\;=\;0\;,$$

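and that one of the following three conditions (a), (b) or (c) holds.

(a) The process $\{\eta^{N}(t):t\geq 0\}$ is reversible.

(b) There exists a constant $0<C_{0}<\infty$ such that for all $y,z\in S$, $N\geq 1$,

$$\frac{1}{C_{0}}\;\leq\;\frac{\mu_{N}({\mathscr{E}}^{z}_{N})}{\mu_{N}({\mathscr{E}}^{y}_{N})}\;\leq\;C_{0}\;.$$

(c) The sets ${\mathscr{B}}_{N}^{x}$ referred to in (M1) and (M2) further satisfy

$$\lim_{N\to\infty}\frac{T^{N,T,x}_{\rm mix}}{\varepsilon_{N}}\,\frac{1}{\mu_{N}^{x}({\mathscr{B}}_{N}^{x})}\,\Big(1+\ln\Big(\frac{1}{\mu_{N}^{x}({\mathscr{B}}_{N}^{x})}\Big)\Big)\;=\;0\;.$$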

For $k\in\mathbb{N}$ , $t_{1},\ldots,t_{k}>0$ , let $\mu^{N,\eta^{N}}_{t_{1},\ldots,t_{k}}$ stand for the joint law of $\big{(}\xi^{N}(t_{1}),\ldots,\xi^{N}(t_{k})\big{)}$ , with $\xi^{N}(0)=\eta^{N}$ . Then, for every $x\in S$ and every sequence $\{\eta^{N}:N\geq 1\}$ , $\eta^{N}\in{\mathscr{E}}^{x}_{N}$ ,

[TABLE]

Remark 2.3 (On the hypotheses of Proposition 2.2).

Separation of scales, in the sense that the process mixes in a well before jumping, is a common feature of metastable Markov chains and it is usually hidden in the proof of (H1). Conditions (M1)–(M2) are a mathematically concrete way to elicit this fact in generality. In the proof of the metastability of zero-range processes in [2], (M1)–(M2) are actually the way (H1) is established. On the other hand, the three alternative conditions (a), (b) and (c) are not so hard to check. This is clear for reversibility. Condition (b) can be readily checked when $\mu_{N}$ is known. Moreover, it is always satisfied if the rates of the limiting process are the rescaled rates of jumping between wells (which is an assumption for (H1)) and the limiting Markov chain is irreducible. As for (c), there are standard tools to estimate mixing times (cf. [37] in a general set-up, [2] in the context of metastability and Remark 3.9 below).

The article is organized as follows. Propositions 2.1 and 2.2 are proved in Sections 3 and 4, respectively. In Section 5 we show that the assumptions of these propositions are in force for four different classes of dynamics. In the last section, we present a general bound, in terms of capacities, for the probability that the hitting time of a set is smaller than a given value (the capacities can be evaluated by the Dirichlet and the Thomson principles). Throughout this article, $c_{0}$ and $C_{0}$ are finite positive constants, independent of $N$ , whose values may change from line to line.

3. Convergence of the finite-dimensional distributions

In this section, we prove Proposition 2.1, and we present some sufficient conditions for (2.1). We will use the shorthand $T_{N}(t)$ for the time $T_{{\mathscr{E}}_{N}}(t)$ spent by the process $\xi^{N}(t)$ in ${\mathscr{E}}_{N}$ before time $t$ . Likewise, we will denote the generalized inverse of $T_{N}(t)$ by $S_{N}(t)$ . Note that condition (H2) can be stated as

[TABLE]

Since $\{S_{N}(t)\geq t+\delta\}=\{T_{N}(t+\delta)\leq t\}=\{t+\delta-T_{N}(t+\delta)\geq\delta\}$ , it follows from the previous equation that for all $t\geq 0$ , $\delta>0$ ,

[TABLE]
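The set identity relating $S_{N}$ and $T_{N}$ used above can be checked on a deterministic path. This is a minimal sketch, assuming the convention $S(t)=\sup\{s\geq 0:T(s)\leq t\}$ for the generalized inverse; the path `PATH`, the helper names and the use of exact rational times are illustrative assumptions.

```python
from fractions import Fraction

def time_in_set(segments, t):
    """T(t): time spent inside the set during [0, t].

    `segments` is a list of (start, inside) pairs with increasing starts,
    beginning at 0; the last segment extends to +infinity.
    """
    total = Fraction(0)
    for i, (start, inside) in enumerate(segments):
        if start >= t:
            break
        end = segments[i + 1][0] if i + 1 < len(segments) else t
        if inside:
            total += min(end, t) - start
    return total

def generalized_inverse(segments, t):
    """S(t) = sup{s >= 0 : T(s) <= t}; requires the last segment to be inside."""
    acc = Fraction(0)  # time accumulated inside the set so far
    for i, (start, inside) in enumerate(segments):
        if not inside:
            continue
        if i + 1 == len(segments):
            return start + (t - acc)
        length = segments[i + 1][0] - start
        if acc + length > t:
            return start + (t - acc)
        acc += length
    raise ValueError("the last segment must lie inside the set")

def identity_holds(segments, t, delta):
    """Check {S(t) >= t + delta} = {T(t + delta) <= t} on one path."""
    lhs = generalized_inverse(segments, t) >= t + delta
    rhs = time_in_set(segments, t + delta) <= t
    return lhs == rhs

# Illustrative path: inside on [0,2), outside on [2,3), inside on [3,5), ...
PATH = [(0, True), (2, False), (3, True), (5, False), (6, True)]
```

For instance, `generalized_inverse(PATH, 2)` equals 3, because the excursion outside the set on $[2,3)$ does not advance the clock $T$.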

The proof of Proposition 2.1 is based on the next technical result, which provides an estimate for the distribution of the trace process $X^{T}_{N}$ in terms of the distribution of the process $X_{N}$ .

Lemma 3.1.

Assume conditions (H1) and (H2). Then, for all $N\geq 1$ , $\delta>0$ , $y\in S$ , $\eta\in{\mathscr{E}}_{N}$ , and $r>3\delta$ ,

[TABLE]

where

[TABLE]

for all $r>0$ , $y\in S$ and

[TABLE]

Proof.

Fix $N\geq 1$ , $\delta>0$ , $y\in S$ , $\eta\in{\mathscr{E}}_{N}$ and $r>3\delta$ . By definition of $X^{T}_{N}$ , and since $S_{N}(r-3\delta)\geq r-3\delta$ ,

[TABLE]

where

[TABLE]

and

[TABLE]

By (3.2) with $t=r-3\delta$ ,

[TABLE]

On the other hand,

[TABLE]

Denote by $H$ the first time the process $X_{N}(s)$ hits the point $y$ after $r-3\delta$ :

[TABLE]

By the strong Markov property, the second term on the right hand side of the penultimate equation is equal to

[TABLE]

Recall from (1.3) the definition of $\breve{{\mathscr{E}}}^{y}_{N}$ . The previous probability is bounded by

[TABLE]

Since $s\leq 3\delta$ , the first term is bounded by

[TABLE]

By condition (H1), $J^{(2)}_{N}(y,\delta)$ vanishes as $N\to\infty$ and then $\delta\to 0$ . To complete the proof of the lemma, it remains to set $R^{(1)}_{N}(y,r,\delta)=\max_{\eta\in{\mathscr{E}}_{N}}J^{(1)}_{N}(\eta,r,\delta)+J^{(2)}_{N}(y,\delta)$ and to recall the estimate (3.3). ∎

Denote by ${\boldsymbol{P}}_{x}$ , $x\in S$ , the probability measure on $D({\mathbb{R}}_{+},S)$ induced by the Markov chain ${\boldsymbol{X}}(t)$ starting from $x$ . Since ${\boldsymbol{P}}_{x}[{\boldsymbol{X}}(t)\not={\boldsymbol{X}}(t-)]=0$ for all $t\geq 0$ , the finite-dimensional distributions of $X^{T}_{N}$ converge to the ones of ${\boldsymbol{X}}(t)$ .

Proof of Proposition 2.1.

We prove the result for one-dimensional distributions. The extension to higher order is immediate. Fix $x$ , $y\in S$ , $r>0$ , and a sequence $\{\eta^{N}:N\geq 1\}$ , $\eta^{N}\in{\mathscr{E}}^{x}_{N}$ . By assumption (H1), by Lemma 3.1, and by (2.1),

[TABLE]

Since

[TABLE]

the inequality in the penultimate formula must be an identity for each $y\in S$ , which completes the proof of the proposition. ∎

3.1. The assumption (2.1)

We conclude this section presenting three sets of sufficient conditions for the bound (2.1).

Remark 3.2.

To prove condition (2.1), one is tempted to argue that for all $2\delta\leq s\leq 3\delta$ , $\eta\in{\mathscr{E}}_{N}$ ,

[TABLE]

In many examples, however, it is not true that the right hand side vanishes, uniformly over configurations in ${\mathscr{E}}_{N}$ , as $N\to\infty$ and then $\delta\to 0$ . In condensing zero-range processes or in random walks in a potential field, starting from certain configurations in a valley ${\mathscr{E}}^{x}_{N}$ , in a time interval $[0,\delta]$ , the chain $\xi^{N}(s)$ visits the set $\Delta_{N}$ many times and the right hand side of the previous inequality, for such configurations $\eta$ , is close to $1$ .

The next lemma provides sufficient conditions for assumption (2.1) to hold. It is tailor-made to cover the case where the metastable sets are singletons. This includes spin models on finite sets [38, 42, 4, 8, 18, 32, 36, 19], inclusion processes [9, 25], and random walks among random traps [26, 27].

Lemma 3.3.

Assume that for each $x\in S$ ,

[TABLE]

Then, (2.1) holds.

Proof.

Fix $x\in S$ , $\eta\in{\mathscr{E}}^{x}_{N}$ and $s>0$ . Multiplying and dividing the probability ${\mathbb{P}}_{\eta}[\xi^{N}(s)\in\Delta_{N}]$ by $\mu_{N}(\eta)$ , we obtain that

[TABLE]

In particular, condition (2.1) follows from the assumption of the lemma. ∎
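The multiply-and-divide step can be written out in full. The computation below uses only the stationarity of $\mu_{N}$; it is a sketch of the argument as we read it and is not a transcription of the display in the proof.

```latex
\mathbb{P}_{\eta}\big[\xi^{N}(s)\in\Delta_{N}\big]
\;=\; \frac{\mu_{N}(\eta)\,\mathbb{P}_{\eta}\big[\xi^{N}(s)\in\Delta_{N}\big]}{\mu_{N}(\eta)}
\;\leq\; \frac{1}{\mu_{N}(\eta)} \sum_{\zeta\in E_{N}} \mu_{N}(\zeta)\,
\mathbb{P}_{\zeta}\big[\xi^{N}(s)\in\Delta_{N}\big]
\;=\; \frac{\mu_{N}(\Delta_{N})}{\mu_{N}(\eta)}\;.
```

The bound does not depend on $s$, so it is uniform over the time window appearing in (2.1), and it vanishes whenever $\mu_{N}(\Delta_{N})$ is negligible compared with the mass of each configuration of the well. This is why the lemma is suited to singleton wells.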

The next condition is satisfied by random walks in a potential field [12, 33, 35, 34], illustrated by Example 5.3. It is instructive to think of the sets ${\mathscr{B}}^{x}_{N}\subset{\mathscr{E}}^{x}_{N}$ below, as the deep part of the well ${\mathscr{E}}^{x}_{N}$ . The first condition requires the process to reach the set ${\mathscr{B}}_{N}^{x}$ quickly, while the second one imposes that it will not attain the set $\Delta_{N}$ in a short time interval when starting from ${\mathscr{B}}_{N}^{x}$ .

Lemma 3.4.

Assume that for each $x\in S$ there exists a set ${\mathscr{B}}_{N}^{x}\subset{\mathscr{E}}_{N}^{x}$ such that

[TABLE]

Then, condition (2.1) is in force.

Proof.

Fix $x\in S$ , $\eta\in{\mathscr{E}}^{x}_{N}$ , $\delta>0$ , $s\in[2\delta,3\delta]$ , and write

[TABLE]

On the event $\{H_{{\mathscr{B}}_{N}^{x}}<+\infty\}$ let us denote $\xi_{{\mathscr{B}}}^{N}=\xi^{N}(H_{{\mathscr{B}}_{N}^{x}})$ . By the strong Markov property and since $s$ belongs to the interval $[2\delta,3\delta]$ , the first term on the right hand side is bounded by

[TABLE]

which completes the proof of the lemma. ∎

In Lemmata 3.6 and 3.8 below we present some conditions which imply that, for all $\delta>0$ , $\sup_{\eta\in{\mathscr{E}}^{x}_{N}}{\mathbb{P}}_{\eta}[H_{{\mathscr{B}}^{x}_{N}}\geq\delta]$ vanishes as $N\to\infty$ .

Recall from (2.2), (2.3) that $\mu^{x}_{N}$ represents the stationary measure $\mu_{N}$ conditioned to ${\mathscr{E}}^{x}_{N}$ , and $\mathcal{S}^{R,x}_{N}(t)$ the semigroup of the reflected process on ${\mathscr{E}}^{x}_{N}$ . The third set of conditions which yield (2.1) relies on the next estimate.

Lemma 3.5.

For every $0<T<\delta<s$ , $x\in S$ , and configuration $\eta\in{\mathscr{E}}^{x}_{N}$ ,

[TABLE]

Proof.

Fix $x\in S$ , $\eta\in{\mathscr{E}}^{x}_{N}$ , and $0<T<\delta<s$ . Clearly,

[TABLE]

On the set $\{H_{({\mathscr{E}}^{x}_{N})^{c}}>T\}$ , up to time $T$ , we may couple the chain $\xi^{N}(s)$ with the chain reflected at the boundary of ${\mathscr{E}}^{x}_{N}$ , which has been denoted by $\xi^{N}_{R,x}(s)$ . By the Markov property at time $T$ and replacing $\xi^{N}(s)$ by $\xi^{N}_{R,x}(s)$ , the second term of the previous equation becomes

[TABLE]

By definition of the total variation distance, and since, by assumption, the stationary measure of the reflected process is given by $\mu^{x}_{N}=\mu_{N}/\mu_{N}({\mathscr{E}}_{N}^{x})$ , this sum is less than or equal to

[TABLE]

The second term is equal to $\mu_{N}(\Delta_{N})/\mu_{N}({\mathscr{E}}_{N}^{x})$ , which completes the proof of the lemma. ∎

Recall from (2.5) that we denote by $H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}}$ ( $H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}^{x}}$ , respectively) the hitting time on the set ${\mathscr{B}}_{N}^{x}$ for the trace process on ${\mathscr{E}}_{N}$ ( ${\mathscr{E}}_{N}^{x}$ , respectively).

Lemma 3.6.

Under assumptions (H1) and (H2), for every $x\in S$ and $\eta\in{\mathscr{E}}_{N}^{x}$ ,

[TABLE]

Proof.

By definition of $H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}}$ , for any $\eta\in{\mathscr{E}}_{N}^{x}\setminus{\mathscr{B}}_{N}^{x}$ ,

[TABLE]

The first term on the right hand side of the preceding equation vanishes, as $N\to\infty$ , by (3.1). The second term is bounded by

[TABLE]

Since the event $\{H_{{\mathscr{E}}_{N}\setminus{\mathscr{E}}_{N}^{x}}^{{\mathscr{E}}_{N}}\leq\delta/2\}$ can be expressed as $\{X^{T}_{N}(s)\not=X^{T}_{N}(0)$ for some $0<s\leq\delta/2\}$ , by assumption (H1), the first term on the right hand side of the last equation vanishes as $N\to\infty$ . Just as in the proof of Lemma 3.5, on the event $\{H_{{\mathscr{E}}_{N}\setminus{\mathscr{E}}_{N}^{x}}^{{\mathscr{E}}_{N}}>\delta/2\}$ we may couple the trace process $\xi^{{\mathscr{E}}_{N}}$ with the trace process on the well ${\mathscr{E}}_{N}^{x}$ . This allows us to bound the last term in the preceding equation by ${\mathbb{P}}_{\eta}[H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}^{x}}>\delta/2]$ . Hence,

[TABLE]

The reverse implication is trivial, since $H_{{\mathscr{B}}_{N}^{x}}\geq H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}^{x}}$ pointwise. ∎

Corollary 3.7.

Assume that conditions (H1), (H2), (M1), (M2) and (2.10) are in force. Then, condition (2.1) is satisfied. In particular, under the assumptions of Proposition 2.2, the finite-dimensional distributions of the projected chain $X_{N}(t)$ converge, as $N\to\infty$ , to the finite-dimensional distributions of the Markov chain ${\boldsymbol{X}}(t)$ appearing in condition (H1).

Proof of Corollary 3.7.

The first assertion of the corollary is a straightforward consequence of the assumptions and Lemma 3.4, Lemma 3.5, Lemma 3.6 with $\eta\in{\mathscr{B}}_{N}^{x}$ . The second assertion follows from Proposition 2.1. ∎

Denote by $\lambda_{N}(\eta)$ , $\eta\in E_{N}$ , the holding rates of the Markov chain $\xi^{N}(t)$ . For two disjoint subsets ${\mathscr{A}}$ , ${\mathscr{B}}$ of $E_{N}$ , denote by ${\rm cap}_{N}({\mathscr{A}},{\mathscr{B}})$ the capacity between ${\mathscr{A}}$ and ${\mathscr{B}}$ :

[TABLE]

Similarly, for two disjoint subsets ${\mathscr{A}}$ , ${\mathscr{B}}$ of ${\mathscr{E}}^{x}_{N}$ we represent by ${\rm cap}_{N,x}({\mathscr{A}},{\mathscr{B}})$ the capacity between ${\mathscr{A}}$ and ${\mathscr{B}}$ for the trace process $\xi^{N}_{T,x}(t)$ :

[TABLE]

where $\lambda^{T,x}_{N}(\eta)$ stands for the holding rates of the trace process $\xi^{N}_{T,x}(t)$ .

The following lemma offers sufficient conditions for (M1), in terms of mixing time or capacity estimates. In view of Lemma 3.6, together with (H1) and (H2) these conditions also imply that

[TABLE]

Lemma 3.8.

Let $T^{N,T,x}_{\rm mix}$ represent the $\big{(}\frac{1}{2e}\big{)}$ -mixing time of the trace process on ${\mathscr{E}}_{N}^{x}$ . If, for every $x\in S$ either

[TABLE]

or

[TABLE]

is satisfied, then (M1) holds. If $\{\eta^{N}(t):t\geq 0\}$ is reversible, the logarithmic term in (3.9) can be dropped.

Proof.

To prove the assertion of the lemma under the assumption (3.8), note that by Proposition A.2 in [6],

[TABLE]

where the last equality follows from identity (A.10) in [6].

Assume, now, that (3.9) is in force. The following argument is inspired by Theorem 6 in [1] and Theorem 1.1 in [40]. We include it here for completeness. Recall from (2.3) that we denote by $\mathcal{S}^{T,x}_{N}(t)$ the semigroup of the trace process on ${\mathscr{E}}^{x}_{N}$ . Pick a time-scale $(\vartheta_{N}:N\geq 1)$ such that

[TABLE]

We may choose, for example,

[TABLE]

Recall that we denote by $\xi^{N}_{T,x}(t)$ the trace of the Markov chain $\xi^{N}(t)$ on ${\mathscr{E}}^{x}_{N}$ . For any $\eta\in{\mathscr{E}}_{N}^{x}$ , by definition of $\vartheta_{N}$

[TABLE]

Since this estimate is uniform in $\eta$ , we may iterate it, using the Markov property, to get

[TABLE]

This expression vanishes, as $N\to\infty$ , if (3.9) is satisfied and if we choose $\vartheta_{N}$ according to (3.10).
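The iteration step rests on a general fact: if $\sup_{\eta}{\mathbb{P}}_{\eta}[H>\vartheta]\leq q$, then the Markov property gives ${\mathbb{P}}_{\eta}[H>k\vartheta]\leq q^{k}$ for every $k\geq 1$. The sketch below checks this on a discrete-time analogue; the sub-stochastic matrix `Q_EXAMPLE` (a chain killed upon hitting the target set) and the block length `m` are hypothetical and serve only to illustrate the mechanism.

```python
def survival(Q, n):
    """Vector eta -> P_eta[H > n] for a chain killed on hitting the target set.

    Q is the transition matrix restricted to the states outside the target,
    so its rows may sum to less than one.
    """
    size = len(Q)
    v = [1.0] * size
    for _ in range(n):
        v = [sum(Q[i][j] * v[j] for j in range(size)) for i in range(size)]
    return v

def geometric_bound_holds(Q, m, k_max):
    """Check sup_eta P_eta[H > k*m] <= (sup_eta P_eta[H > m])**k for k <= k_max."""
    q = max(survival(Q, m))
    return all(
        max(survival(Q, k * m)) <= q ** k + 1e-12   # small slack for rounding
        for k in range(1, k_max + 1)
    )

# Illustrative three-state example; the missing row mass is the probability
# of jumping into the target set.
Q_EXAMPLE = [
    [0.5, 0.3, 0.0],
    [0.2, 0.5, 0.2],
    [0.0, 0.4, 0.5],
]
assert geometric_bound_holds(Q_EXAMPLE, m=4, k_max=6)
```

The uniformity in the starting point is essential: the bound at block $k$ is applied to whatever state the chain occupies at time $(k-1)m$, which is why the supremum over $\eta$ appears.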

Finally, if the process is reversible, by Theorem 5 in [1], there exists a finite universal constant $C_{0}$ such that

[TABLE]

Hence, (3.7) follows from (3.5) by Markov’s inequality. ∎

The preceding lemma evidences the importance of an upper bound for the mixing time of the trace process. This is the content of Remark 3.9 below.

Denote by $R_{N}(\eta,\xi)$ , $\eta$ , $\xi\in E_{N}$ , the jump rates of the Markov chain $\xi^{N}(t)$ , and by $R^{T,x}_{N}(\eta^{\prime},\xi^{\prime})$ , $\eta^{\prime}$ , $\xi^{\prime}\in{\mathscr{E}}^{x}_{N}$ , the jump rates of the trace process $\xi^{N}_{T,x}(t)$ . Assume that the Markov chain $\xi^{N}(t)$ is reversible and denote by $D_{N}$ , $D_{N,T,x}$ the Dirichlet form of the processes $\xi^{N}(t)$ , $\xi^{N}_{T,x}(t)$ , respectively:

[TABLE]

for functions $f:E_{N}\to{\mathbb{R}}$ , $g:{\mathscr{E}}^{x}_{N}\to{\mathbb{R}}$ . By replacing, in the first line of the previous formula, the measure $\mu_{N}$ by the conditioned measure $\mu^{x}_{N}$ , and by restricting the sum to configurations $\eta,\xi\in{\mathscr{E}}^{x}_{N}$ , we obtain the Dirichlet form of the reflected process, denoted by $D_{N,R,x}(f)$ .

Denote by $T^{N,T,x}_{\rm rel}$ , $T^{N,R,x}_{\rm rel}$ the relaxation times of the trace process $\xi^{N}_{T,x}(t)$ , the reflected process $\xi^{N}_{R,x}(t)$ , respectively:

[TABLE]

where the supremum is carried over all functions $g:{\mathscr{E}}^{x}_{N}\to{\mathbb{R}}$ with mean zero with respect to $\mu^{x}_{N}$ , and $\|g\|_{\mu^{x}_{N}}$ represents the $L^{2}(\mu^{x}_{N})$ norm of $g$ : $\|g\|^{2}_{\mu^{x}_{N}}=\sum_{\eta\in{\mathscr{E}}^{x}_{N}}\mu^{x}_{N}(\eta)\,g(\eta)^{2}$ .

Remark 3.9.

Obtaining estimates for the mixing time $T^{N,T,x}_{\rm mix}$ of the trace process on the well ${\mathscr{E}}_{N}^{x}$ is often not harder than doing so for the mixing time $T_{\rm mix}^{N,R,x}$ of the reflected process on the well. Both processes have the same invariant measure $\mu_{N}^{x}$ and the former has higher jump rates. Indeed, by [3, Corollary 6.2], for any $\eta,\zeta\in{\mathscr{E}}_{N}^{x}$ , $\eta\not=\zeta$ ,

[TABLE]

Hence, the Dirichlet form corresponding to the trace on ${\mathscr{E}}_{N}^{x}$ dominates the Dirichlet form corresponding to the reflected process on ${\mathscr{E}}_{N}^{x}$ and, consequently, the relaxation time $T_{\text{\rm rel}}^{N,T,x}$ of the former is smaller than the relaxation time $T_{\text{\rm rel}}^{N,R,x}$ of the latter. Then, by the proof of [37, Theorem 12.3],

[TABLE]

The right hand side of the preceding inequality, which is often used as an upper bound for the mixing time $T_{\text{mix}}^{N,R,x}$ of the chain $\xi^{N}(\cdot)$ restricted in the well ${\mathscr{E}}_{N}^{x}$ , is also a bound for the mixing time of the trace process.

Remark 3.10.

In many interesting cases, e.g. random walks on a potential field [12, 33, 35, 34] or condensing zero-range processes [5, 30], the set ${\mathscr{B}}_{N}^{x}$ may be taken as a singleton.

4. Convergence of the state

In this section, we prove Proposition 2.2. From now on, we assume that the number of valleys is fixed and that the sequence of Markov chains fulfills conditions (H1), (H2), (M1), (M2) and (2.10).

Proof of Proposition 2.2.

We will prove the assertion for $k=1$ . The general case follows easily from Corollary 3.7 and the Markov property. The proof is divided into several steps. At each stage we write the main expression as the sum of a simpler one and a negligible remainder.

Fix $t>0$ , $x\in S$ , a sequence $\{\eta^{N}:N\geq 1\}$ , $\eta^{N}\in{\mathscr{E}}^{x}_{N}$ , and $0<\delta<t$ . By definition, $[\delta_{\eta^{N}}S^{N}(t)](\xi)={\mathbb{P}}_{\eta^{N}}[\xi^{N}(t)=\xi]$ can be written as

[TABLE]

where

[TABLE]

By Corollary 3.7, for every $0<\delta<t$ ,

[TABLE]

By the Markov property, the sum appearing in (4.1) is equal to

[TABLE]

Let $p(\eta)={\mathbb{P}}_{\eta^{N}}[\xi^{N}(t-\delta)=\eta]$ . We may rewrite this expression as

[TABLE]

where

[TABLE]

By (3.5) and condition (M1),

[TABLE]

By the strong Markov property, using the notation $\xi_{{\mathscr{B}}}^{N}=\xi^{N}(H_{{\mathscr{B}}_{N}^{y}})$ , the first term in (4.2) is equal to

[TABLE]

Let us now define $A_{\xi}=\{\xi^{N}(\delta-H_{{\mathscr{B}}_{N}^{y}})=\xi\}$ , $B_{N}=\{H_{{\mathscr{B}}_{N}^{y}}\leq\frac{\delta}{2}\}$ and recall the definition of the time-scale $\varepsilon_{N}$ introduced in condition (M2). Rewrite the previous sum as

[TABLE]

where

[TABLE]

By (2.7), this latter expression vanishes as $N\to\infty$ .

By the Markov property, the sum appearing in (4.3) is equal to

[TABLE]

where $A^{\prime}_{\xi}=\{\xi^{N}(\delta-H_{{\mathscr{B}}_{N}^{y}}-\varepsilon_{N})=\xi\}$ . On the set $\{H_{\Delta_{N}}>\varepsilon_{N}\}$ , we may replace the chain $\xi^{N}(t)$ by the reflected chain at ${\mathscr{E}}^{y}_{N}$ , denoted by $\xi^{N}_{R,y}(t)$ . The previous expression is thus equal to

[TABLE]

This sum can be rewritten as

[TABLE]

where, by (2.7) and a similar argument to the one following (4.3)

[TABLE]

Since, for every $\eta\in{\mathscr{B}}_{N}^{y},\,\xi\in E_{N}$ ,

[TABLE]

the first term of (4.4) is equal to

[TABLE]

where the remainder $R^{(5)}_{N}(t,\delta,\xi)$ is given by

[TABLE]

Therefore,

[TABLE]

so that, by (2.8),

[TABLE]

The first term in (4.5) can be written as

[TABLE]

where

[TABLE]

Therefore,

[TABLE]

The probability inside the expectation is less than or equal to

[TABLE]

where $\breve{{\mathscr{E}}}^{y}_{N}$ has been introduced in (1.3). Since $\mu^{y}_{N}(\zeta)=\mu_{N}({\mathscr{E}}^{y}_{N})^{-1}\mu_{N}(\zeta\cap{\mathscr{E}}^{y}_{N})\leq\mu_{N}({\mathscr{E}}^{y}_{N})^{-1}\mu_{N}(\zeta)$ , the first term is bounded by $\mu_{N}(\Delta_{N})/\mu_{N}({\mathscr{E}}^{y}_{N})$ . On the other hand, the second term is less than or equal to

[TABLE]

Therefore,

[TABLE]

and, by assumption (H1) and (2.11),

[TABLE]

Lemma 4.1 below shows that the first term in (4.6) is equal to

[TABLE]

where

[TABLE]

We may rewrite the sum in (4.7) as

[TABLE]

where

[TABLE]

By (3.5) and condition (M1), for every $0<\delta<t$ ,

[TABLE]

In view of the definition of $p(\eta)$ , the first term in (4.8) can be written as

[TABLE]

where

[TABLE]

Clearly, $\sum_{\xi\in E_{N}}|R^{(9)}_{N}(t,\delta,\xi)|$ is less than or equal to

[TABLE]

where the supremum is carried over real numbers $r$ , $s$ in $[0,t]$ . By assumption (H1) and Corollary 3.7,

[TABLE]

Up to this point we proved that

[TABLE]

where

[TABLE]

Therefore, in view of (4.9),

[TABLE]

which completes the proof of the proposition, in view of (4.10) and Corollary 3.7. ∎

Lemma 4.1.

Under (H1), (M1), (M2), (2.10) and any of the assumptions (a), (b) or (c) of Proposition 2.2, for any $y\in S,\,s\in(\delta/2,\delta)$ we have

[TABLE]

Proof.

For all $\xi\in{\mathscr{E}}_{N}^{y}$ ,

[TABLE]

Hence,

[TABLE]

By (2.10), the first term of this sum vanishes, as $N\to\infty$ . It remains to show that the second term also vanishes under assumption (a), (b) or (c).

Assume first that (a) holds. Then, by reversibility, the last term in (4.11) is equal to

[TABLE]

This expression vanishes, as $N\to\infty$ , by assumption (H1). This completes the proof of the lemma under the hypothesis (a).

Assume now that condition (b) is in force. In this case, the last term in (4.11) is bounded by

[TABLE]

Here again, by assumption (H1), this expression vanishes, as $N\to\infty$ . This completes the proof of the lemma under the hypothesis (b).

Assume, finally, that condition (c) is fulfilled. Note that

[TABLE]

by Markov’s inequality. The last expression vanishes as $N\to\infty$ by (2.10). Define the stopping time $\sigma_{N}$ as

[TABLE]

By repeating the arguments that led to (3.5) and (3.11) we obtain that

[TABLE]

Let

[TABLE]

Conditioning first on $\sigma_{N}$ , and using (2.7), (2.8) and (4.12) yields that

[TABLE]

This concludes the argument. ∎

5. Examples

We present in this section four examples to evaluate the conditions introduced in the previous sections. The first example belongs to the class of models in which the metastable sets are singletons. In the second and third examples the metastable sets are not singletons, but the process visits all configurations of a metastable set before hitting a new metastable set. These processes are said to visit points. In the second example the assumptions of Lemma 3.3 are in force, but not in the third. For this latter class, we show that the conditions of Corollary 3.7 are fulfilled for an appropriate singleton set ${\mathscr{B}}^{x}_{N}$ . In the last example, the process does not visit all configurations of a metastable set before reaching a new metastable set. In these models the entropy plays an important role in the metastable behavior of the system. For this last model, we prove that the hypotheses of Lemma 3.4 hold.

The purpose of this section is not to show that the conditions of Lemmata 3.3, 3.4 or Corollary 3.7 are in force in great generality. Actually, in some cases, this requires lengthy arguments and a detailed analysis of the dynamics. We just want to convince the reader that this is possible. In other words, that one can deduce the convergence of the finite-dimensional distributions and the convergence of the state of the process from conditions (H1), (H2) and some reasonable additional conditions.

In the arguments below we use the Dirichlet and the Thomson principles for the capacities between two disjoint sets of $E_{N}$ . We do not recall these results here and we refer to [11, Section 7.3].

Example 5.1 (Inclusion process [25, 9]).

The inclusion process describes the evolution of particles on a countable set. Recall from (1.1) that we denote by ${\mathbb{T}}_{L}$ , $L\geq 1$ , the discrete, one-dimensional torus with $L$ points, by $E_{N}$ the set of configurations on ${\mathbb{T}}_{L}$ with $N$ particles, and by $\eta_{x}$ , $x\in{\mathbb{T}}_{L}$ , the total number of particles at $x$ for the configuration $\eta$ .

Fix a sequence $(d_{N}:N\geq 1)$ of strictly positive numbers. Recall from (1.2) the definition of the configuration $\sigma^{x,y}\eta$ . The reversible, nearest-neighbor, inclusion process associated to the sequence $d_{N}$ is the continuous-time, $E_{N}$ -valued Markov process $\{\eta^{N}(t):t\geq 0\}$ whose generator $L_{N}$ acts on functions $f:E_{N}\to{\mathbb{R}}$ as

[TABLE]

where $r(-1)=r(1)=1$ , $r(x)=0$ , otherwise.

The inclusion process is clearly irreducible and it is reversible with respect to the probability measure $\mu_{N}$ given by

[TABLE]

where $Z_{N}$ is the normalizing constant, $w_{N}(k)=\Gamma(k+d_{N})/k!\,\Gamma(d_{N})$ , and $\Gamma$ is the gamma function.

Assume that $d_{N}\log N\to 0$ , as $N\uparrow\infty$ . Denote by $\xi^{x,N}$ the configuration in which all particles are placed at site $x$ , $\xi^{x,N}_{x}=N$ , $\xi^{x,N}_{y}=0$ for $y\not=x$ , and let ${\mathscr{E}}^{x}_{N}=\{\xi^{x,N}\}$ . By [9, Proposition 2.1], $\mu_{N}({\mathscr{E}}^{x}_{N})\to 1/L$ as $N\uparrow\infty$ .
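The concentration of $\mu_{N}$ on the $L$ condensed configurations can be observed numerically for small systems. The following is a minimal sketch, assuming the product form $\mu_{N}(\eta)=Z_{N}^{-1}\prod_{x\in{\mathbb{T}}_{L}}w_{N}(\eta_{x})$ for the stationary measure; the function names and the parameter values are illustrative.

```python
import math

def inclusion_weights(n_particles, d):
    """w(k) = Gamma(k + d) / (k! * Gamma(d)) for k = 0, ..., n_particles."""
    return [
        math.exp(math.lgamma(k + d) - math.lgamma(k + 1) - math.lgamma(d))
        for k in range(n_particles + 1)
    ]

def condensate_mass(n_sites, n_particles, d):
    """Mass of one fully condensed configuration under the product-form measure."""
    w = inclusion_weights(n_particles, d)
    # Z_N is the coefficient of z^N in (sum_k w(k) z^k)^L, built by convolution.
    coeffs = [1.0] + [0.0] * n_particles
    for _ in range(n_sites):
        coeffs = [
            sum(coeffs[j] * w[k - j] for j in range(k + 1))
            for k in range(n_particles + 1)
        ]
    z_n = coeffs[n_particles]
    return w[n_particles] / z_n   # the L - 1 empty sites contribute w(0) = 1

# With d * log(N) small, each condensed configuration carries mass close to 1/L.
mass = condensate_mass(n_sites=3, n_particles=50, d=1e-4)
assert abs(mass - 1.0 / 3.0) < 1e-2
```

For comparison, with $d=1$ the weights are all equal to one, the measure is uniform on $E_{N}$, and the same function returns the reciprocal of the number of configurations, so no condensation occurs.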

The metastable behavior of the inclusion process in the sense of conditions (H1), (H2) has been proved in [9, Theorem 2.3]. The time-scale at which a metastable behavior is observed is given by $\theta_{N}=1/d_{N}$ .

In this model the metastable sets ${\mathscr{E}}^{x}_{N}$ are singletons. This phenomenon occurs in many other models, for instance in spin systems evolving in large, but fixed, volumes as the temperature vanishes (cf. the Ising model with an external field under the Glauber dynamics [38, 42, 4] and the Blume-Capel model with zero chemical potential and a small magnetic field [18, 32, 19]). It also occurs for random walks evolving among random traps [26, 27].

We claim that all hypotheses of Propositions 2.1, 2.2 are in force. Actually, with the exception of (H1) and (H2), all assumptions trivially hold because the metastable sets are singletons. Set ${\mathscr{B}}^{x}_{N}={\mathscr{E}}^{x}_{N}=\{\xi^{x,N}\}$ .

A. Conditions (H1) and (H2). We already mentioned that assumptions (H1) and (H2) have been proved in [9] with the time-scale $\theta_{N}=1/d_{N}$ .

B. Condition (2.1). By [9, Proposition 2.1], $\mu_{N}(\xi^{x,N})\to 1/L$ . In particular, the inclusion process satisfies the assumption of Lemma 3.3.

C. Condition (M1). Condition (M1) is empty because the sets ${\mathscr{E}}^{x}_{N}$ and ${\mathscr{B}}^{x}_{N}$ coincide.

D. Condition (2.7) of (M2). Since ${\mathscr{E}}^{x}_{N}=\{\xi^{x,N}\}$ , starting from $\xi^{x,N}$ , $H_{\Delta_{N}}$ corresponds to the first jump of the Markov chain $\xi^{N}(t)$ , denoted hereafter by $\tau_{1}$ : ${\mathbb{P}}_{\xi^{x,N}}[H_{\Delta_{N}}=\tau_{1}]=1$ . Since the process has been sped up by $\theta_{N}=1/d_{N}$ , $\tau_{1}$ is an exponential random variable of rate $2N$ . It is thus enough to choose a sequence $\varepsilon_{N}$ such that $\varepsilon_{N}\ll 1/N$ .
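The choice of $\varepsilon_{N}$ can be made explicit. The short computation below assumes that condition (2.7) asks for the probability of reaching $\Delta_{N}$ within a time of order $\varepsilon_{N}$ to vanish; it uses only the exponential law of $\tau_{1}$ and the inequality $1-e^{-u}\leq u$.

```latex
\mathbb{P}_{\xi^{x,N}}\big[H_{\Delta_{N}}\leq\varepsilon_{N}\big]
\;=\; \mathbb{P}\big[\tau_{1}\leq\varepsilon_{N}\big]
\;=\; 1-e^{-2N\varepsilon_{N}}
\;\leq\; 2N\,\varepsilon_{N}\;\longrightarrow\;0
\quad\text{whenever}\quad \varepsilon_{N}\ll 1/N\;.
```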

E. Condition (2.8) of (M2). This condition is empty because ${\mathscr{E}}^{x}_{N}=\{\xi^{x,N}\}$ . It holds for any sequence $\varepsilon_{N}>0$ .

F. Condition (2.10) of Proposition 2.2. This is a consequence of [9, Proposition 2.1] which asserts that $\mu_{N}(\xi^{x,N})\to 1/L$ .

G. Conditions (a), (b) or (c). Assumption (a) of Proposition 2.2 is in force as the process is reversible.

Example 5.2 (Condensing zero-range processes [5, 30, 43]).

This model has been introduced at the beginning of Section 2. Set $\theta_{N}=N^{1+\alpha}$ .

The condensing zero-range process is an example of a process which visits points in the sense that, starting from a well ${\mathscr{E}}^{x}_{N}$ , the dynamics visits all configurations of ${\mathscr{E}}^{x}_{N}$ before reaching another well. This property reads as follows. For all $x\in S={\mathbb{T}}_{L}$ ,

[TABLE]

where $\breve{{\mathscr{E}}}^{x}_{N}$ has been introduced in (1.3). Other examples of metastable dynamics which visit points are random walks in a potential field [16, 12, 33, 35].

We show below that all hypotheses of Propositions 2.1, 2.2 are in force. In certain cases we impose further assumptions on the dynamics, e.g., that it is reversible or that $|S|=2$ , to avoid lengthy arguments. The main tool to prove this assertion is the fact that the process visits points. Recall from (3.6) that we denote by ${\rm cap}_{N}({\mathscr{A}},{\mathscr{B}})$ the capacity between two disjoint subsets ${\mathscr{A}}$ and ${\mathscr{B}}$ of $E_{N}$ . Since $\xi^{N}(t)$ is the process $\eta^{N}(t)$ sped up by $\theta_{N}$ , by [5, Theorem 2.2] and [43, Theorem 6.3], for any disjoint subsets $A$ , $B$ of $S$ ,

[TABLE]

where $C(A,B)$ is the capacity between $A$ and $B$ for the random walk on $S$ with transition probabilities $p(y\!-\!x)$ , for $x,y\in S$ .

A. Conditions (H1) and (H2). Assumptions (H1) and (H2) have been proved in [5] in the reversible case, in [30] in the totally asymmetric case, $p=1$ , and in [43] in the asymmetric case $1/2<p<1$ .

B. Condition (2.1). We prove that the assumptions of Lemma 3.4 are in force in the reversible case for ${\mathscr{B}}^{x}_{N}=\{\xi^{x,N}\}$ , where $\xi^{x,N}$ represents the configuration in which all particles are placed at site $x$ .

Fix $x\in S$ and $\eta\in{\mathscr{E}}^{x}_{N}$ . By the Markov inequality and [3, Proposition 6.10],

[TABLE]

By (H1), page 806 in [5],

[TABLE]

Therefore, by (5.1), for every $\delta>0$ ,

[TABLE]

On the other hand, for every $s>0$ ,

[TABLE]

By equation (3.2) in [5], $\mu_{N}(\Delta_{N})\to 0$ , and by [5, Proposition 2.1], $\mu_{N}(\xi^{x,N})\to 1/Z_{S}>0$ . This shows that the second assumption of Lemma 3.4 is in force.

I. Seo extended the previous result to the asymmetric case $1/2<p<1$ in [43, Proposition 6.3].

C. Condition (M1). Since $H_{{\mathscr{B}}_{N}^{x}}^{{\mathscr{E}}_{N}^{x}}\leq H_{{\mathscr{B}}_{N}^{x}}$ , condition (M1) follows from (5.2).

D. Condition (2.7) of (M2). Since the exterior boundary of ${\mathscr{E}}^{x}_{N}$ is contained in $\Delta_{N}$ , under ${\mathbb{P}}_{\xi^{x,N}}$ , $H_{({\mathscr{E}}^{x}_{N})^{c}}=H_{\Delta_{N}}$ . We claim that

[TABLE]

for some finite constant $C_{0}$ . In particular, condition (2.7) of (M2) is fulfilled provided we choose $\varepsilon_{N}\,\theta_{N}\ll\ell^{\alpha}_{N}$ .

We turn to the proof of (5.3). By Corollary 6.4 we have

[TABLE]

On the other hand, by monotonicity of capacities

[TABLE]

Since the holding rates $\lambda_{N}(\eta)$ are uniformly bounded by $C_{0}\theta_{N}$ , if we denote by $\partial{\mathscr{E}}^{x}_{N}$ the interior boundary of the set ${\mathscr{E}}^{x}_{N}$ , the previous sum is bounded by $C_{0}\,\theta_{N}\,\mu_{N}(\partial{\mathscr{E}}^{x}_{N})$ . An explicit computation shows that the measure of $\partial{\mathscr{E}}^{x}_{N}$ is bounded by $\ell_{N}^{-\alpha}$ . The proof of this assertion is similar to the one of [5, Lemma 3.1] and is omitted. Hence, ${\rm cap}_{N}(\xi^{x,N},\Delta_{N})\;\leq\;C_{0}\,\theta_{N}\,\ell_{N}^{-\alpha}$ . Together with (5.4) and [5, Proposition 2.1], this gives (5.3). (Remark: In the case $|S|=2$ , it is possible to compute exactly ${\rm cap}_{N}(\xi^{x,N},\Delta_{N})$ and one gets that it is of order $\theta_{N}\,\ell_{N}^{-(1+\alpha)}$ . We lost a factor $1/\ell_{N}$ at the first estimate in the preceding display.)

E. Condition (2.8) of (M2). The proof relies on an estimate of the spectral gap. We prove this condition in the case of two sites; the general case can be handled using the martingale approach developed by Lu and Yau [28, Appendix 2].

Assume that $|S|=2$ , and denote by $\lambda_{R,1}$ the spectral gap of the process $\xi^{N}(t)$ reflected at ${\mathscr{E}}^{1}_{N}=\{0,\dots,\ell_{N}\}$ . We claim that

[TABLE]

On two sites, the zero-range process is a birth and death process, and the reflected process on ${\mathscr{E}}^{1}_{N}$ is the continuous-time Markov chain whose generator is given by

[TABLE]

where $g_{R,N}(\zeta)=\theta_{N}\,g(\zeta)$ for all $\zeta\not=N-\ell_{N}$ , and $g_{R,N}(N-\ell_{N})=0$ , due to the reflection at ${\mathscr{E}}^{1}_{N}$ . Denote by $\mu^{1}_{N}$ the stationary measure $\mu_{N}$ conditioned to ${\mathscr{E}}^{1}_{N}$ .

In order to prove (5.5), we have to show that there exists a finite constant $C_{0}$ such that

[TABLE]

for all $N\geq 1$ and all functions $f:\{0,\dots,\ell_{N}\}\to{\mathbb{R}}$ , where $\langle f,g\rangle_{\mu^{1}_{N}}$ represents the scalar product in $L^{2}(\mu^{1}_{N})$ .

Fix a function $f:\{0,\dots,\ell_{N}\}\to{\mathbb{R}}$ . By Schwarz inequality,

[TABLE]

The sum over $\xi^{\prime}$ is bounded by $C_{0}\eta^{1+\alpha}$ . Hence, since $\mu^{1}_{N}(\eta)\leq C_{0}\eta^{-\alpha}$ , changing the order of summations the previous expression is seen to be less than or equal to

[TABLE]

This expression is bounded by $C_{0}\,(\ell^{2}_{N}/\theta_{N})\,\langle f,(-{\mathcal{L}}^{R,1}_{N})f\rangle_{\mu^{1}_{N}}$ because $g$ is bounded below by a positive constant and the process is sped up by $\theta_{N}$ . This proves claim (5.6), and therefore (5.5).

We turn to condition (2.8) of (M2). We claim that this condition is fulfilled provided $\varepsilon_{N}\theta_{N}\gg\ell^{2}_{N}$ . Indeed, since $\mu^{y}_{N}({\xi^{x,N}})\geq c_{0}$ , in view of (2.9), we have to show that

[TABLE]

which follows from (5.5) if $\varepsilon_{N}\theta_{N}\gg\ell^{2}_{N}$ .

For $|S|=2$ , in view of (D) and (E) above, conditions (2.7) and (2.8) of (M2) are fulfilled for any sequence $\varepsilon_{N}$ such that $\ell^{2}_{N}\ll\varepsilon_{N}\theta_{N}\ll\ell^{1+\alpha}_{N}$ .
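The window for $\varepsilon_{N}$ is nonempty only when $\ell^{2}_{N}\ll\ell^{1+\alpha}_{N}$. A concrete choice, assuming $\alpha>1$ and $\ell_{N}\to\infty$ (both are assumptions made here for the illustration), is the geometric mean of the two endpoints:

```latex
\varepsilon_{N}\;=\;\frac{\ell_{N}^{(3+\alpha)/2}}{\theta_{N}}
\quad\Longrightarrow\quad
\frac{\ell_{N}^{2}}{\varepsilon_{N}\,\theta_{N}}
\;=\;\ell_{N}^{-(\alpha-1)/2}\;\to\;0\;,
\qquad
\frac{\varepsilon_{N}\,\theta_{N}}{\ell_{N}^{1+\alpha}}
\;=\;\ell_{N}^{-(\alpha-1)/2}\;\to\;0\;.
```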

F. Condition (2.10) of Proposition 2.2. By [5, Remark 2.5],

[TABLE]

G. Conditions (a), (b) or (c). Assumption (b) of Proposition 2.2 is in force since $\mu_{N}({\mathscr{E}}^{x}_{N})=\mu_{N}({\mathscr{E}}^{y}_{N})$ for all $x$ , $y\in S$ .

Example 5.3 (Random walk in a potential field).

In this example, the sets ${\mathscr{B}}^{x}_{N}$ are still reduced to singletons, ${\mathscr{B}}^{x}_{N}=\{\xi^{x,N}\}$ , but $\mu_{N}(\xi^{x,N})\to 0$ . To simplify the discussion as much as possible, we assume that the process is reversible and that the potential has two wells of the same height, but the arguments apply to the more general situations considered in [12, 33, 35].

Let $\Xi$ be an open, bounded and connected subset of ${\mathbb{R}}^{d}$ with a smooth boundary $\partial\,\Xi$ . Fix a smooth function $F:\Xi\cup\partial\,\Xi\to{\mathbb{R}}$ , with three critical points, satisfying the following assumptions:

(RW1) There are two local minima, denoted by $m_{1}$ , $m_{2}$ . All the eigenvalues of the Hessian of $F$ at these points are strictly positive. Moreover, $F(m_{1})=F(m_{2})=:h$ .

(RW2) The other critical point of $F$ is denoted by $\sigma$ . The Hessian of $F$ at $\sigma$ has one strictly negative eigenvalue, all the other ones being strictly positive.

(RW3) For every $x\in\partial\,\Xi$ , $(\nabla F)(x)\cdot n(x)>0$ , where $n(x)$ represents the exterior normal to the boundary of $\Xi$ , and $x\cdot y$ the scalar product of $x$ , $y\in{\mathbb{R}}^{d}$ . This hypothesis guarantees that $F$ has no local minima at the boundary of $\Xi$ .
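As an illustration of (RW1)-(RW3), the sketch below checks the three assumptions for a hypothetical double-well potential in dimension $d=2$, namely $F(x,y)=(x^{2}-1)^{2}/4+y^{2}/2$ on the disc of radius $2$. The potential, the domain and the helper names are assumptions chosen for illustration and are not taken from the paper or from [33, 35]. The boundary condition is only checked on a sample of boundary points.

```python
import math

# Hypothetical double-well potential on the disc Xi = {x^2 + y^2 < 4}.
def F(x, y):
    return (x * x - 1.0) ** 2 / 4.0 + y * y / 2.0

def grad_F(x, y):
    return (x ** 3 - x, y)

def hessian_eigs(x, y):
    # For this F the Hessian is diagonal: diag(3x^2 - 1, 1).
    return sorted([3.0 * x * x - 1.0, 1.0])

m1, m2, sigma = (-1.0, 0.0), (1.0, 0.0), (0.0, 0.0)

# (RW1): two nondegenerate local minima at the same height h.
rw1 = all(grad_F(*m) == (0.0, 0.0) and hessian_eigs(*m)[0] > 0 for m in (m1, m2))
same_depth = F(*m1) == F(*m2)
h = F(*m1)

# (RW2): the remaining critical point has exactly one negative eigenvalue.
eigs = hessian_eigs(*sigma)
rw2 = grad_F(*sigma) == (0.0, 0.0) and eigs[0] < 0 < eigs[1]
H = F(*sigma)

# (RW3): grad F points outward on the boundary circle (sampled angles).
def outward_flux(angle):
    x, y = 2.0 * math.cos(angle), 2.0 * math.sin(angle)
    gx, gy = grad_F(x, y)
    return (gx * x + gy * y) / 2.0  # exterior normal n(x, y) = (x, y) / 2

rw3 = all(outward_flux(2.0 * math.pi * k / 360) > 0 for k in range(360))

print(rw1, same_depth, rw2, rw3, H - h)
```

For this choice the barrier height is $H-h=1/4$, the quantity entering the exponential factor of the time-scale $\theta_{N}$ in this example.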

Denote by $\Xi_{N}$ the discretization of $\Xi$ : $\Xi_{N}=\Xi\cap(N^{-1}{\mathbb{Z}})^{d}$ , $N\geq 1$ . Let $\mu_{N}$ be the probability measure on $\Xi_{N}$ defined by

$$\mu_{N}(\eta)\;=\;\frac{1}{Z_{N}}\,e^{-NF(\eta)}\;,\quad\eta\in\Xi_{N}\;,$$

where $Z_{N}$ is the partition function $Z_{N}=\sum_{\eta\in\Xi_{N}}\exp\{-NF(\eta)\}$ . By equation (2.3) in [33],

[TABLE]

where ${\rm Hess}\,F(x)$ represents the Hessian of $F$ calculated at $x$ and $\det{\rm Hess}\,F(x)$ its determinant.

Let $\{\eta^{N}(t):t\geq 0\}$ be the continuous-time Markov chain on $\Xi_{N}$ whose generator $L_{N}$ is given by

[TABLE]

where $\|\,\cdot\,\|$ represents the Euclidean norm of ${\mathbb{R}}^{d}$ .

Recall that $m_{i}$ , $i=1$ , $2$ , represent the two local minima of $F$ in $\Xi$ , and $\sigma$ the saddle point. Let $H:=F(\sigma)>F(m_{1})=F(m_{2})=h$ . Denote by $V_{i}=B_{\kappa}(m_{i})$ , $\kappa>0$ , two balls of radius $\kappa$ centered at the local minima. Assume that $\kappa$ is small enough that $\sup_{x\in V_{i}}F(x)<H$ . Denote by ${\mathscr{E}}^{i}_{N}$ the discretization of the sets $V_{i}$ : ${\mathscr{E}}^{i}_{N}=\Xi_{N}\cap V_{i}$ .

Let $\theta_{N}=2\pi N\exp\{[H-h]N\}$ . It has been proved in [33, 35] that the process $X^{\rm T}_{N}(t)$ fulfills conditions (H1) and (H2). We claim that the assumptions of Propositions 2.1 and 2.2 are in force.

We prove condition (2.1) through Corollary 3.7 with ${\mathscr{B}}^{i}_{N}=\{\xi^{i,N}\}$ , where $\xi^{i,N}$ is a point in $\Xi_{N}$ which approximates the local minimum $m_{i}$ .

A. Condition (2.6). Fix $\eta\in{\mathscr{E}}^{i}_{N}$ . Since $H_{{\mathscr{B}}_{N}^{i}}^{{\mathscr{E}}_{N}^{i}}\leq H_{{\mathscr{B}}_{N}^{i}}$ , by the Markov inequality, it is enough to prove that

[TABLE]

By [3, Proposition 6.10], the expectation is bounded by $1/{\rm cap}_{N}(\eta,{\mathscr{B}}_{N}^{i})$ . Consider a path $(\eta_{0}=\eta,\eta_{1},\dots,\eta_{M}=\xi^{i,N})$ such that $M\leq C_{0}N$ , $\eta_{i}\in\Xi_{N}$ , $\|\eta_{i}-\eta_{i+1}\|=1/N$ , $F(\eta_{i})\leq H-\epsilon$ for some $\epsilon>0$ . Let $\Phi$ be the unit flow from $\eta$ to $\xi^{i,N}$ such that $\Phi(\eta_{i},\eta_{i+1})=1$ . By Thomson’s principle,

[TABLE]

The factor $\theta_{N}$ appears because the process has been speeded-up. This expression vanishes as $N\to\infty$ in view of (5.8), the definition of $\theta_{N}$ , and because $F(\eta_{i})\leq H-\epsilon$ , $M\leq C_{0}N$ .

B. Condition (2.7). Let $h_{i}=\inf_{x\in\partial V_{i}}F(x)$ . We claim that this condition is in force provided

[TABLE]

Since, under ${\mathbb{P}}_{\xi^{i,N}}$ , $H_{({\mathscr{E}}^{i}_{N})^{c}}=H_{\Delta_{N}}$ , we need to estimate ${\mathbb{P}}_{\xi^{i,N}}[H_{\Delta_{N}}\leq 2\varepsilon_{N}]$ . By Corollary 6.4,

[TABLE]

where $\partial_{-}{\mathscr{E}}^{i}_{N}$ stands for the inner boundary of ${\mathscr{E}}^{i}_{N}$ :

[TABLE]

By definition of ${\mathscr{E}}^{i}_{N}$ , the right-hand side of the penultimate formula is bounded above by $C_{0}\,\varepsilon_{N}\,\theta_{N}\,N^{d}\,\exp\{-N[h_{i}-h]\}$ , which proves the claim.

C. Condition (2.8). We claim that this condition is fulfilled provided

[TABLE]

for some $b>0$ .

We first estimate the spectral gap of the reflected process $\xi^{N}_{R,i}(t)$ , denoted by $\lambda_{R,i}$ . We claim that $\lambda_{R,i}\geq c_{0}\,\theta_{N}\,N^{-(d+1)}$ . To prove this assertion, we have to show that

[TABLE]

for all $N\geq 1$ and all functions $f:{\mathscr{E}}^{i}_{N}\to{\mathbb{R}}$ , where $\langle f,g\rangle_{\mu^{i}_{N}}$ represents the scalar product in $L^{2}(\mu^{i}_{N})$ . For each $\eta\in{\mathscr{E}}^{i}_{N}$ , denote by $\gamma(\eta)=(\eta_{0}=\eta,\dots,\eta_{M}=\xi^{i,N})$ a discrete version of the path from $\eta$ to $\xi^{i,N}$ given by $\dot{x}(t)=-(\nabla F)(x(t))$ . This means that $\|\eta_{j+1}-\eta_{j}\|=\frac{1}{N}$ , $M\leq C_{0}N$ , and $\eta_{j}$ is the closest point of the lattice $\Xi_{N}$ to $x(t_{j})$ for some increasing sequence of times $\{t_{j}\}_{0\leq j\leq M}$ . Clearly, $|F(\eta_{j})-F\big{(}x({t_{j}})\big{)}|\leq\frac{c_{0}}{N}$ and since $\frac{d}{dt}F\big{(}x(t)\big{)}=-\|(\nabla F)\big{(}x(t))\|^{2}\leq 0$ , for all $0\leq k\leq j\leq M$ we have

[TABLE]

In particular,

[TABLE]

Since $M\leq C_{0}N$ , by the Schwarz inequality,

[TABLE]

where the last inequality follows from (5.13). Fix an edge $(\zeta,\zeta^{\prime})$ and consider all configurations $\eta\in{\mathscr{E}}^{i}_{N}$ whose path $\gamma(\eta)$ contains this pair (that is $(\zeta,\zeta^{\prime})=(\eta_{j},\eta_{j+1})$ for some $0\leq j<M$ ). Of course, there are at most $|{\mathscr{E}}^{i}_{N}|\leq C_{0}N^{d}$ such configurations. Hence, changing the order of summation, the previous sum is seen to be bounded above by

[TABLE]

This proves claim (5.12) since the double sum is equal to $(2/\theta_{N})\langle f,(-{\mathcal{L}}^{R,i}_{N})f\rangle_{\mu^{i}_{N}}$ .

We turn to the proof of condition (2.8). Fix a sequence $\varepsilon_{N}$ satisfying (5.11) for some $b>0$ . By (5.8), $\mu_{N}(\xi^{i,N})\geq c_{0}N^{-d/2}$ . Hence, by (5.12),

[TABLE]

By (5.11) this expression vanishes as $N\to\infty$ . This proves condition (2.8) in view of (2.9).

Conditions (2.10) and (2.11) are elementary. Hence, as claimed, all conditions of Propositions 2.1 and 2.2 are in force. Similar arguments apply in the case of several wells and critical points, as well as in the non-reversible setting.

Example 5.4 (Random walk on a singular graph [41, 7]).

In this example, the metastable behavior is not due to an energy landscape but to the presence of bottlenecks. After attaining a well, the system remains there for a time long enough to relax inside the well before it hits a point from which it can jump to another well. In this example, to fulfil condition (M1) the set ${\mathscr{B}}^{x}_{N}$ cannot be taken as a singleton.

In many other models the entropy plays an important role in the metastable behavior. In the majority of them, the time-scale in which the metastable behavior is observed cannot be computed explicitly and is given in terms of the spectral gap or the expectation of hitting times. This is the case of polymers in the depinned phase [15, 14, 29], or the evolution of a droplet in the Ising model with the Kawasaki dynamics [8, 24].

We consider below a random walk on a graph $E_{N}$ which is illustrated in Figure 2 in the two-dimensional case. For $N\geq 1$ , $d\geq 2$ , let $I_{N}=\{0,\dots,N\}$ , $Q^{+}_{N}=I^{2}_{N}\times I^{d-2}_{N}$ , $Q^{-}_{N}=I^{2}_{N}\times(-I_{N})^{d-2}$ be $d$ -dimensional cubes of length $N$ . Let $w_{i}=w^{N}_{i}$ , $0\leq i\leq 3$ , be the points in ${\mathbb{Z}}^{d}$ given by $w_{0}=(0,N,{\boldsymbol{0}})$ , $w_{1}=(N,0,{\boldsymbol{0}})$ , $w_{2}=(0,-N,{\boldsymbol{0}})$ , $w_{3}=(-N,0,{\boldsymbol{0}})$ , where ${\boldsymbol{0}}$ is the $(d-2)$ -dimensional vector with all coordinates equal to $0$ . Set $Q^{i}_{N}=w_{i}+Q^{+}_{N}$ , $i=0$ , $2$ , $Q^{j}_{N}=w_{j}+Q^{-}_{N}$ , $j=1$ , $3$ , $E_{N}=\sqcup_{0\leq i\leq 3}Q^{i}_{N}$ . Note that the sets $Q^{i}_{N}\cap Q^{i+1}_{N}$ are singletons in all dimensions. This explains the rather intricate definition of the sets $Q^{i}_{N}$ .
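The construction of the cubes $Q^{i}_{N}$ can be checked by brute force for small $N$. The sketch below builds the four cubes as sets of lattice points and counts the points shared by consecutive cubes (indices taken modulo $4$). The function names and the `alternate` flag are hypothetical helpers introduced here. Setting `alternate=False` uses $Q^{+}_{N}$ for all four cubes, which shows why the alternation between $Q^{+}_{N}$ and $Q^{-}_{N}$ is needed when $d\geq 3$.

```python
from itertools import product

def cube(w, signs, N):
    """Return the set w + prod_j (signs[j] * {0, ..., N})."""
    axes = [[w[j] + s * k for k in range(N + 1)] for j, s in enumerate(signs)]
    return set(product(*axes))

def build_cubes(N, d, alternate=True):
    zero = (0,) * (d - 2)
    w = [(0, N) + zero, (N, 0) + zero, (0, -N) + zero, (-N, 0) + zero]
    plus = (1, 1) + (1,) * (d - 2)    # Q^+_N = I_N^2 x I_N^{d-2}
    minus = (1, 1) + (-1,) * (d - 2)  # Q^-_N = I_N^2 x (-I_N)^{d-2}
    return [
        cube(w[i], minus if (alternate and i % 2 == 1) else plus, N)
        for i in range(4)
    ]

def corner_sizes(N, d, alternate=True):
    """Sizes of the intersections of Q^i and Q^{i+1}, indices modulo 4."""
    Q = build_cubes(N, d, alternate)
    return [len(Q[i] & Q[(i + 1) % 4]) for i in range(4)]

for d in (2, 3, 4):
    print(d, corner_sizes(3, d), corner_sizes(3, d, alternate=False))
```

With the alternation each intersection is a single point. Without it, in dimension $d=3$ consecutive cubes share a whole segment of $N+1$ points, so the bottleneck would no longer be a single site.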

Denote by $e_{1},\dots,e_{d}$ the canonical basis of ${\mathbb{R}}^{d}$ . Let $\eta^{N}(t)$ be the continuous-time Markov chain on $E_{N}$ which jumps from a configuration $\eta\in E_{N}$ to $\eta\pm e_{j}\in E_{N}$ at rate $1$ if $\eta\mp e_{j}\in E_{N}$ and at rate $2$ if $\eta\mp e_{j}\not\in E_{N}$ . With these jump rates the Markov chain on the cube $I^{d}_{N}$ can be thought of as the projection on $I^{d}_{N}$ of a simple random walk on ${\mathbb{Z}}^{d}$ .

Denote by $n(\eta)\in\{0,1,\dots,d\}$ , $\eta\in E_{N}$ , the number of neighbors of $\eta$ which do not belong to $E_{N}$ , and by ${\mathscr{C}}$ the four corners of $E_{N}$ : ${\mathscr{C}}=\{\eta\in E_{N}:\eta\in Q^{x}_{N}\cap Q^{y}_{N}\text{ for some }x\not=y\}$ . Let $\mu_{N}$ be the probability measure on $E_{N}$ given by

[TABLE]

where $Z_{N}$ is the normalizing factor. The measure $\mu_{N}$ is the unique stationary (actually, reversible) state. Denote by $\theta_{N}$ the inverse of the spectral gap of this chain. By [41, Example 3.2.5], there exist constants $0<c(d)<C(d)<\infty$ such that for all $N\geq 1$ ,

[TABLE]

and

[TABLE]

Fix sequences $\{\ell_{N}:N\geq 1\}$ , $\{M_{N}:N\geq 1\}$ , $1\ll\ell_{N}\ll M_{N}\ll N$ , such that

[TABLE]

Recall that we denote by ${\mathscr{C}}$ the four corners of $E_{N}$ . Let $\Delta_{N}$ be the points at graph distance less than $\ell_{N}$ from one of the corners:

$$\Delta_{N}\;=\;\big\{\eta\in E_{N}:d(\eta,\xi)<\ell_{N}\ \text{for some}\ \xi\in{\mathscr{C}}\big\}\;,$$

where $d(\eta,\xi)$ stands for the graph distance from $\eta$ to $\xi$ . Finally, let ${\mathscr{E}}^{x}_{N}=Q^{x}_{N}\setminus\Delta_{N}$ , $J_{N}=\{M_{N},\dots,N-M_{N}\}$ , and ${\mathscr{B}}^{x}_{N}=w_{x}+J_{N}^{2}\times\big{(}(-1)^{x}J_{N}\big{)}^{d-2}$ . Note that ${\mathscr{B}}^{x}_{N}\subset{\mathscr{E}}^{x}_{N}$ . We refer to Figure 2 for an illustration of these sets.

Assumptions (H1) and (H2) for this model follow from the arguments presented in [7, Proposition 8.3]. Condition (2.1) follows from Lemma 3.4.

A. First condition of Lemma 3.4. If $d\geq 3$ , this condition follows easily from Lemma 3.8. Indeed, since the mixing time of a random walk on a $d$ -dimensional cube of length $N$ is of order $N^{2}$ , condition (3.9) is an easy consequence of (3.13). The following argument also works for $d=2$ .

Fix $\delta>0$ , $\eta\in{\mathscr{E}}^{0}_{N}$ , and recall that we denote by ${\mathscr{C}}$ the set of corners. Let $\varepsilon_{N}\ll 1$ be a sequence such that $N^{2}\ll\varepsilon_{N}\,\theta_{N}$ . By equation (6.18) in [26],

[TABLE]

We may therefore assume that the process $\xi^{N}(t)$ does not hit ${\mathscr{C}}$ before $\varepsilon_{N}$ . On this event, we may couple $\xi^{N}(t)$ with a speeded-up random walk $\widehat{\xi}_{N}(t)$ on $I^{d}_{N}$ , and $\xi^{N}(t)$ hits ${\mathscr{B}}_{N}^{x}$ when $\widehat{\xi}_{N}(t)$ hits $J_{N}^{d}$ . By Theorem 5 in [1] applied to $\widehat{\xi}_{N}(t)$ ,

[TABLE]

Since $\mu_{N}^{x}({\mathscr{B}}_{N}^{x})\geq c_{0}>0$ and $\theta_{N}\varepsilon_{N}\gg N^{2}$ , this proves that

[TABLE]

and in particular the first condition of Lemma 3.4.

B. Second condition of Lemma 3.4. The argument is based on the fact that the process relaxes to equilibrium inside each cube much before it hits the corners. Fix $\delta>0$ , $\delta<s<3\delta$ , $\eta\in{\mathscr{E}}^{0}_{N}$ , and let $\varepsilon_{N}$ be as in A, i.e. $N^{2}\ll\varepsilon_{N}\,\theta_{N}\ll\theta_{N}$ . By (5.15), we may insert the event $\{H_{{\mathscr{C}}}>\varepsilon_{N}\}$ inside the probability appearing in the second displayed equation in Lemma 3.4. After this operation, applying the Markov property, the probability becomes

[TABLE]

On the set $\{H_{{\mathscr{C}}}>\varepsilon_{N}\}$ , we may couple the process $\xi^{N}(t)$ with the speeded-up random walk reflected at $Q^{0}_{N}$ . Denote by ${\mathbb{P}}^{0}_{N}$ the distribution with respect to this dynamics and by ${\mathbb{E}}^{0}_{N}$ the expectation.

Up to this point we proved that

[TABLE]

Since the mixing time of the (speeded-up) random walk on $Q^{0}_{N}$ is of order $N^{2}/\theta_{N}\ll\varepsilon_{N}$ , the previous expression is bounded by

[TABLE]

where $\mu^{0}_{N}$ is the stationary state of the reflected random walk. As $\mu^{0}_{N}(\eta)\leq C_{0}\mu_{N}(\eta)$ , and since $\mu_{N}$ is the stationary state, the previous expression is bounded by

[TABLE]

which completes the proof of the second condition of Lemma 3.4.

The convergence of the finite-dimensional distributions has been addressed in [7]. We now turn to the assumptions of Proposition 2.2. Condition (M1) has been proved above in A. We show below that (M2) is in force in dimension $d\geq 3$ .

C. Condition (2.7). Recall from (5.14) that $N^{2}\ll M^{d}_{N}/\ell^{d-2}_{N}$ . Let $\varepsilon_{N}$ be a sequence such that $N^{2}\ll\varepsilon_{N}\,\theta_{N}\ll M^{d}_{N}/\ell^{d-2}_{N}$ .
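To see when such a sequence $\varepsilon_{N}$ exists, it helps to specialise to power laws. The sketch below assumes the hypothetical choice $M_{N}=N^{a}$ , $\ell_{N}=N^{b}$ with $0<b<a<1$ , which is not made in the text. Under this assumption, $N^{2}\ll M^{d}_{N}/\ell^{d-2}_{N}$ reads $ad-b(d-2)>2$ , and any $\varepsilon_{N}\theta_{N}=N^{c}$ with $2<c<ad-b(d-2)$ is admissible. Condition (5.14) may impose further requirements that are not captured here.

```python
def window(d, a, b):
    """Exponent window (2, a*d - b*(d-2)) for eps_N * theta_N = N**c.

    Assumes the hypothetical power-law choice M_N = N**a, ell_N = N**b
    with 0 < b < a < 1. Returns None when the window is empty.
    """
    if not (0 < b < a < 1):
        raise ValueError("need 0 < b < a < 1")
    upper = a * d - b * (d - 2)
    return (2.0, upper) if upper > 2 else None

print(window(3, 0.9, 0.1))    # nonempty window in d = 3
print(window(2, 0.99, 0.01))  # empty in d = 2, since 2a < 2
```

In dimension $d=2$ the requirement becomes $2a>2$ , which is incompatible with $M_{N}\ll N$ . This is consistent with the text establishing (M2) only for $d\geq 3$ .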

Fix $\eta\in{\mathscr{B}}^{0}_{N}$ . Up to the hitting time of the set $\Delta_{N}$ the process $\xi^{N}(t)$ behaves as the chain $\widehat{\xi}_{N}(t)$ introduced below (5.15). It is therefore enough to prove condition (2.7) for this latter process. Let $\Delta^{(1)}_{N}$ , $\Delta^{(2)}_{N}$ be the simplexes given by

[TABLE]

We have to show that for $i=1$ , $2$ ,

[TABLE]

where ${\mathbf{P}}_{\eta}$ stands for the distribution of $\widehat{\xi}_{N}(t)$ starting from $\eta$ . By symmetry, it suffices to do so for $i=1$ .

Set $\gamma_{N}=\varepsilon^{-1}_{N}$ , and denote by $\zeta^{\star}_{N}(t)$ the $\gamma_{N}$ -enlargement of the process $\widehat{\xi}_{N}(t)$ . We refer to Section 6 for the definition of the enlargement and the statement of some properties. Denote by ${\mathbf{P}}^{\star}_{\eta}$ the distribution of the process $\zeta^{\star}_{N}(t)$ starting from $\eta$ , and by $V^{\star}$ the equilibrium potential between $\Delta^{(1)}_{N}$ and ${\mathbf{E}}^{\star}_{N}$ : $V^{\star}(\eta)={\mathbf{P}}^{\star}_{\eta}\big{[}H_{\Delta^{(1)}_{N}}\leq H_{{\mathbf{E}}^{\star}_{N}}\big{]}$ . By (6.5), (5.16) follows from

[TABLE]

To bound the equilibrium potential $V^{\star}$ , we follow a strategy proposed in [7]. We first claim that

[TABLE]

Fix $L_{N}=2\ell_{N}$ , and let $f:{\mathbb{N}}\to{\mathbb{R}}_{+}$ be the function given by $f(k)=1$ for $0\leq k<\ell_{N}$ , $f(k)=0$ for $k\geq L_{N}$ and $f(k)=A\sum_{k\leq j<L_{N}}j^{-(d-1)}$ for $\ell_{N}\leq k<L_{N}$ , where $A$ is chosen so that $f(\ell_{N})=1$ . Let $F:{\mathbf{E}}_{N}\to{\mathbb{R}}$ , $F^{\star}:{\mathbf{E}}_{N}\sqcup{\mathbf{E}}^{\star}_{N}\to{\mathbb{R}}$ be given by $F(x)=f(\sum_{1\leq i\leq d}x_{i})$ , $F^{\star}(\eta)=F(\eta)$ , $\eta\in{\mathbf{E}}_{N}$ , $F^{\star}(\eta)=0$ , $\eta\in{\mathbf{E}}^{\star}_{N}$ . By the Dirichlet principle, ${\rm cap}^{\star}_{N}(\Delta^{(1)}_{N},{\mathbf{E}}^{\star}_{N})\leq D^{\star}_{N}(F^{\star})$ , where $D^{\star}_{N}$ represents the Dirichlet form of the enlarged process $\zeta^{\star}_{N}(t)$ .
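The cutoff function $f$ is easy to tabulate. The sketch below implements the definition above for given integers $\ell_{N}$ and $d$ and returns the normalising constant $A$ together with $f$ . The function name `make_cutoff` is a hypothetical helper. The increments satisfy $f(k)-f(k+1)=A\,k^{-(d-1)}$ for $\ell_{N}\leq k<L_{N}$ , and for $d\geq 3$ the constant $A$ is of order $\ell^{d-2}_{N}$ , which is presumably the origin of the factor $\ell^{d-2}_{N}$ in the capacity bound.

```python
def make_cutoff(ell, d):
    """Return (A, f) where f = 1 on [0, ell), f = 0 on [2*ell, inf) and
    f(k) = A * sum_{k <= j < 2*ell} j**-(d-1) for ell <= k < 2*ell."""
    L = 2 * ell
    total = sum(j ** -(d - 1) for j in range(ell, L))
    A = 1.0 / total  # normalisation so that f(ell) = 1

    def f(k):
        if k < ell:
            return 1.0
        if k >= L:
            return 0.0
        return A * sum(j ** -(d - 1) for j in range(k, L))

    return A, f

for ell in (10, 100, 1000):
    A, f = make_cutoff(ell, 3)
    print(ell, f(ell), f(2 * ell - 1), A / ell)
```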

There are two contributions to the Dirichlet form $D^{\star}_{N}(F^{\star})$ . The first one corresponds to edges whose vertices belong to the set $\Lambda_{N}=\{x\in{\mathbf{E}}_{N}:\ell_{N}\leq\sum_{i}x_{i}\leq L_{N}\}$ . This contribution is bounded by

[TABLE]

The other contribution is due to the edges between the sets $\Lambda_{N}$ and $\Lambda^{\star}_{N}$ . Since $F^{\star}$ is bounded by 1, this contribution is bounded by $\frac{1}{4}\gamma_{N}\mu_{N}(\Lambda_{N})\leq C_{0}\gamma_{N}\ell^{d}_{N}/N^{d}$ . This completes the proof of (5.18).

We turn to (5.17). Let $\prec$ be the partial order on $J_{N}^{d}$ defined by $\eta\prec\xi$ if $\eta_{i}\leq\xi_{i}$ for $1\leq i\leq d$ . We may couple two copies of the process $\widehat{\xi}_{N}(t)$ , denoted by $\zeta^{\eta}_{N}(t)$ , $\zeta^{\xi}_{N}(t)$ , starting from $\eta\prec\xi$ , respectively, in such a way that $\zeta^{\eta}_{N}(t)\prec\zeta^{\xi}_{N}(t)$ for all $t\geq 0$ . In particular, $\zeta^{\eta}_{N}(t)$ hits $\Delta^{(1)}_{N}$ before $\zeta^{\xi}_{N}(t)$ , so that

[TABLE]

Suppose that (5.17) does not hold. There exists, therefore, $\delta>0$ , a subsequence $N_{j}$ , still denoted by $N$ , and a configuration $\eta^{N}\in{J_{N}^{d}}$ such that $V^{\star}(\eta^{N})\geq\delta$ . By the previous inequality and by definition of ${J_{N}^{d}}$ , $V^{\star}(\xi)\geq\delta$ for all $\xi$ such that $\max_{i}\xi_{i}\leq M_{N}$ . In particular,

[TABLE]

Comparing this bound with (5.18) we deduce that $\delta^{2}\,\gamma_{N}\,M^{d}_{N}\leq C_{0}\ell^{d-2}_{N}\,\theta_{N}$ , which is a contradiction since $\gamma_{N}=\varepsilon^{-1}_{N}$ and $\varepsilon_{N}\,\theta_{N}\ll M^{d}_{N}/\ell^{d-2}_{N}$ .

D. Condition (2.8). It is well known that the mixing time of a random walk on a $d$ -dimensional cube of length $N$ is of order $N^{2}$ , which proves that condition (2.8) is fulfilled since $\varepsilon_{N}\,\theta_{N}\gg N^{2}$ .
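The order $N^{2}$ can be seen explicitly on one coordinate of the chain before the speed-up. The sketch below considers the one-dimensional chain on $\{0,\dots,N\}$ with rate $1$ to each neighbour in the interior and rate $2$ inward at the two endpoints, and checks numerically that $x\mapsto\cos(\pi kx/N)$ is an eigenfunction of the generator with eigenvalue $-2(1-\cos(\pi k/N))$ . The function names are hypothetical. The spectral gap controls the relaxation time rather than the mixing time itself, so this is only an indication of the $N^{2}$ scaling, not a proof of the mixing-time statement.

```python
import math

def generator_apply(f, N):
    """Apply the generator of the one-dimensional chain on {0, ..., N}:
    rate 1 to each neighbour in the interior, rate 2 inward at 0 and N."""
    out = []
    for x in range(N + 1):
        if x == 0:
            out.append(2.0 * (f[1] - f[0]))
        elif x == N:
            out.append(2.0 * (f[N - 1] - f[N]))
        else:
            out.append(f[x + 1] + f[x - 1] - 2.0 * f[x])
    return out

def check_mode(N, k=1):
    """Return (lam, err): candidate eigenvalue and max residual of L f + lam f."""
    a = math.pi * k / N
    f = [math.cos(a * x) for x in range(N + 1)]
    lam = 2.0 * (1.0 - math.cos(a))
    Lf = generator_apply(f, N)
    return lam, max(abs(Lf[x] + lam * f[x]) for x in range(N + 1))

for N in (10, 100, 1000):
    lam, err = check_mode(N)
    print(N, lam * N * N, err)
```

The product `lam * N * N` approaches $\pi^{2}$ as $N$ grows, so the gap of one coordinate is of order $N^{-2}$ . Since the coordinates of the walk on the cube evolve independently, the gap of the $d$ -dimensional chain is the same.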

E. Last conditions of Proposition 2.2. Condition (2.10) is clearly in force by definition of $\Delta_{N}$ . On the other hand the chain is reversible.

6. Appendix

We present in this section a general estimate for the hitting time of a set in Markovian dynamics. Fix a finite set $E$ and let $\{\eta(t):t\geq 0\}$ be a continuous-time, irreducible, $E$ -valued Markov chain. Denote by $\pi$ the unique stationary state of the process, by $R(\eta,\xi)$ , $\eta$ , $\xi\in E$ its jump rates, and by ${\mathbb{P}}_{\eta}$ its distribution starting from $\eta$ .

We start with an elementary lemma.

Lemma 6.1.

Let $X$ , $T_{\gamma}$ be two independent random variables defined on some probability space $(\Omega,{\mathcal{F}},P)$ . Assume that $T_{\gamma}$ has an exponential distribution of parameter $\gamma>0$ . Then, for all $b>0$ ,

$$P\big[X\leq b\big]\;\leq\;e^{\gamma b}\,P\big[X\leq T_{\gamma}\big]\;.$$

Proof.

Since $X$ and $T_{\gamma}$ are independent, for every $b>0$ ,

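$$P\big[X\leq T_{\gamma}\big]\;\geq\;P\big[X\leq b\,,\,T_{\gamma}>b\big]\;=\;P\big[X\leq b\big]\,P\big[T_{\gamma}>b\big]\;.$$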

The last term is equal to $e^{-\gamma b}\,P\big{[}X\leq b\big{]}$ , which completes the proof of the lemma. ∎

Note that if $X$ is an exponential random variable of parameter $\theta$ , the inequality reduces to

$$1-e^{-\theta b}\;\leq\;e^{\gamma b}\,\frac{\theta}{\theta+\gamma}\;\cdot$$

Hence, choosing $\gamma=1/b$ , if $\theta b$ is small, the inequality is sharp in the sense that the left-hand side is equal to $\theta\,b+O([\theta\,b]^{2})$ , while the right-hand side is equal to $e\,\theta\,b+O([\theta\,b]^{2})$ .
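The exponential case can be checked in closed form. The sketch below evaluates both sides of the inequality of Lemma 6.1, read here as $P[X\leq b]\leq e^{\gamma b}P[X\leq T_{\gamma}]$ , when $X$ is exponential of parameter $\theta$ , using $P[X\leq T_{\gamma}]=\theta/(\theta+\gamma)$ for independent exponential variables. The function name `lemma_sides` is a hypothetical helper.

```python
import math

def lemma_sides(theta, b, gamma):
    """Both sides of P[X <= b] <= exp(gamma*b) * P[X <= T_gamma]
    when X ~ Exp(theta) and T_gamma ~ Exp(gamma) are independent."""
    lhs = -math.expm1(-theta * b)  # 1 - exp(-theta*b), computed stably
    rhs = math.exp(gamma * b) * theta / (theta + gamma)
    return lhs, rhs

b = 1.0
for theta_b in (1e-1, 1e-2, 1e-3):
    lhs, rhs = lemma_sides(theta_b / b, b, 1.0 / b)
    # Ratios to theta*b: the first tends to 1, the second to e.
    print(theta_b, lhs / theta_b, rhs / theta_b)
```

With $\gamma=1/b$ the two sides differ asymptotically by the factor $e$ , so the bound loses only a constant in the regime $\theta b\ll 1$ .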

Enlargement of a chain [10, 7]. Let $E^{\star}$ be a copy of $E$ and denote by $\eta^{\star}\in E^{\star}$ the copy of $\eta\in E$ . Denote by $\xi^{\gamma}(t)$ , $\gamma>0$ , the Markov process on $E\sqcup E^{\star}$ whose jump rates $R^{\gamma}(\eta,\xi)$ are given by

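$$R^{\gamma}(\eta,\xi)\;=\;\begin{cases}R(\eta,\xi)&\text{if $\eta$, $\xi\in E$,}\\ \gamma&\text{if $\xi=\eta^{\star}$ or $\eta=\xi^{\star}$,}\\ 0&\text{otherwise.}\end{cases}$$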

Hence, being at some state $\xi^{\star}$ in $E^{\star}$ , the process may only jump to $\xi$ and this happens at rate $\gamma$ . In contrast, being at some state $\xi$ in $E$ , the process $\xi^{\gamma}(t)$ jumps with rate $R(\xi,\xi^{\prime})$ to some state $\xi^{\prime}\in E$ , and jumps with rate $\gamma$ to $\xi^{\star}$ . We call the process $\xi^{\gamma}(t)$ the $\gamma$ -enlargement of the process $\xi(t)$ . Note that the trace of the enlargement $\xi^{\gamma}(t)$ on $E$ coincides with the original process $\xi(t)$ .

The chain $\xi^{\gamma}(t)$ is clearly irreducible and its invariant probability measure, denoted by $\pi^{\star}$ , is given by

$$\pi^{\star}(\eta)\;=\;\pi^{\star}(\eta^{\star})\;=\;\frac{1}{2}\,\pi(\eta)\;,\quad\eta\in E\;.$$
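The enlargement and its invariant measure can be checked on a small example. The sketch below builds the jump rates of the $\gamma$ -enlargement from the rates of a chain on a three-point set and verifies that the measure giving mass $\pi(\eta)/2$ to both $\eta$ and $\eta^{\star}$ is stationary. The three-state chain, the value of $\gamma$ and the function names are assumptions made for illustration. A rotational term is added to the rates so that the example is not reversible.

```python
def enlarge(R, gamma):
    """Rates of the gamma-enlargement on the disjoint union of E and E*.
    A state x in E is encoded as (x, 0) and its copy x* as (x, 1)."""
    Rg = {}
    for (x, y), r in R.items():
        Rg[((x, 0), (y, 0))] = r
    states = {x for edge in R for x in edge}
    for x in states:
        Rg[((x, 0), (x, 1))] = gamma  # jump from x to its copy x*
        Rg[((x, 1), (x, 0))] = gamma  # jump from x* back to x
    return Rg

def balance_defect(R, pi):
    """max_y | sum_x pi(x) R(x, y) - pi(y) sum_z R(y, z) |."""
    inflow = {y: 0.0 for y in pi}
    outflow = {y: 0.0 for y in pi}
    for (x, y), r in R.items():
        inflow[y] += pi[x] * r
        outflow[x] += pi[x] * r
    return max(abs(inflow[y] - outflow[y]) for y in pi)

# Hypothetical chain on E = {0, 1, 2} with stationary measure pi.
pi = {0: 0.5, 1: 0.3, 2: 0.2}
conductance = {(0, 1): 0.3, (1, 2): 0.1, (0, 2): 0.2}
R = {}
for (x, y), c in conductance.items():
    R[(x, y)] = c / pi[x]
    R[(y, x)] = c / pi[y]
kappa = 0.05  # divergence-free cycle 0 -> 1 -> 2 -> 0 breaks reversibility
for x in range(3):
    R[(x, (x + 1) % 3)] += kappa / pi[x]

Rg = enlarge(R, gamma=2.0)
pi_star = {(x, s): 0.5 * p for x, p in pi.items() for s in (0, 1)}

print(balance_defect(R, pi), balance_defect(Rg, pi_star))
```

Both defects vanish up to rounding errors, for any choice of $\gamma>0$ : the jumps between $\eta$ and $\eta^{\star}$ occur at the same rate in both directions and therefore balance exactly.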

The process $\xi^{\gamma}(t)$ reversed in time is the Markov chain, denoted by $\xi^{\gamma,*}(t)$ , whose jump rates $R^{\gamma,*}$ are given by

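$$R^{\gamma,*}(\eta,\xi)\;=\;\begin{cases}R^{*}(\eta,\xi)&\text{if $\eta$, $\xi\in E$,}\\ \gamma&\text{if $\xi=\eta^{\star}$ or $\eta=\xi^{\star}$,}\\ 0&\text{otherwise,}\end{cases}$$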

where $R^{*}(\eta,\xi)$ represents the jump rates of the process $\xi(t)$ reversed in time.

Denote by ${\mathbb{P}}^{\star}_{\eta}$ the distribution of the chain $\xi^{\gamma}(t)$ starting from $\eta$ , and by ${\rm cap}^{\star}({\mathscr{C}},{\mathscr{D}})$ the capacity between two disjoint subsets ${\mathscr{C}}$ , ${\mathscr{D}}$ of $E\sqcup E^{\star}$ .

Lemma 6.2.

Fix two disjoint subsets ${\mathscr{A}}$ , ${\mathscr{B}}$ of $E$ . Then

[TABLE]

and

[TABLE]

Proof.

By equation (2.6) in [23],

[TABLE]

where $D^{\star}(f)$ represents the Dirichlet form of a function $f:E\sqcup E^{\star}\to{\mathbb{R}}$ for the enlarged process, and $V^{\star}_{{\mathscr{C}},{\mathscr{D}}}$ the equilibrium potential between two disjoint subsets ${\mathscr{C}}$ , ${\mathscr{D}}$ of $E\sqcup E^{\star}$ : $V^{\star}_{{\mathscr{C}},{\mathscr{D}}}(\eta)={\mathbb{P}}^{\star}_{\eta}[H_{{\mathscr{C}}}<H_{{\mathscr{D}}}]$ . On the one hand, by definition of the enlargement, for every $\eta\in E$ , $V^{\star}_{{\mathscr{A}},{\mathscr{B}}}(\eta^{\star})=V^{\star}_{{\mathscr{A}},{\mathscr{B}}}(\eta)$ . Hence, the contribution to the Dirichlet form $D^{\star}(V^{\star}_{{\mathscr{A}},{\mathscr{B}}})$ of the edges between $E$ and $E^{\star}$ vanishes. On the other hand, since the trace of the enlargement $\xi^{\gamma}(t)$ on $E$ coincides with the original process $\xi(t)$ , for all $\eta\in E$ , $V^{\star}_{{\mathscr{A}},{\mathscr{B}}}(\eta)={\mathbb{P}}^{\star}_{\eta}[H_{{\mathscr{A}}}<H_{{\mathscr{B}}}]={\mathbb{P}}_{\eta}[H_{{\mathscr{A}}}<H_{{\mathscr{B}}}]=V_{{\mathscr{A}},{\mathscr{B}}}(\eta)$ . Hence, the sum appearing on the right-hand side of the previous displayed equation is equal to

[TABLE]

Since, for $\eta$ , $\xi\in E$ , $R^{\star}(\eta,\xi)=R(\eta,\xi)$ , $\pi^{\star}(\eta)=(1/2)\pi(\eta)$ , the previous sum is equal to

[TABLE]

as claimed in (6.2).

Let ${\mathscr{A}}^{\star}=\{\xi^{\star}\in E^{\star}:\ \xi\in{\mathscr{A}}\}$ and $\lambda^{\star}(\eta)$ stand for the holding rate of $\xi^{\gamma}(t)$ at $\eta$ . We have

[TABLE]

where in the last equality we have split the inner sum over $\xi\in{\mathscr{A}}^{\star}$ and $\xi\in E$ . Taking into account that for every $\xi\in E$ we have ${\mathbb{P}}^{\star}_{\xi}\big{[}H_{{\mathscr{A}}^{\star}}>H_{{\mathscr{A}}}\big{]}=1$ because points $\eta^{\star}\in{\mathscr{A}}^{\star}$ are only accessible from $\eta\in{\mathscr{A}}$ , the preceding computation gives

[TABLE]

Inequality (6.3) now follows by monotonicity of capacities. ∎

Denote by $\nu^{\star}_{{\mathscr{A}},{\mathscr{B}}}$ the equilibrium measure between ${\mathscr{A}}$ , ${\mathscr{B}}$ for the chain $\xi^{\gamma}(t)$ , which is concentrated on the set ${\mathscr{A}}$ and is given by

[TABLE]

If ${\mathscr{A}}$ is a set with small measure with respect to the stationary measure, it is expected that, for most configurations $\eta\in E$ , $H_{{\mathscr{A}}}$ is approximately exponentially distributed under ${\mathbb{P}}_{\eta}$ . Let $\lambda^{-1}$ be its expectation, so that ${\mathbb{P}}_{\eta}\big{[}H_{{\mathscr{A}}}\leq b\big{]}\approx 1-\exp\{-b\lambda\}\approx b\lambda$ , provided $b\lambda\ll 1$ . On the one hand, by [6, Proposition A.2],

[TABLE]

where $V^{*}_{\eta,{\mathscr{A}}}$ is the equilibrium potential between $\eta$ and ${\mathscr{A}}$ for the time-reversed dynamics, and ${\rm cap}(\eta,{\mathscr{A}})$ the capacity between $\eta$ and ${\mathscr{A}}$ . If $\langle V^{*}_{\eta,{\mathscr{A}}}\rangle_{\pi}\approx 1$ (for instance, because $\pi(\eta)\approx 1$ ), we conclude that $\lambda\approx{\rm cap}(\eta,{\mathscr{A}})$ . On the other hand, choosing $\gamma=b^{-1}$ as the parameter for the enlarged process, for every $\eta\in E$ ,

[TABLE]

Once more, if $\langle V^{\star,*}_{\eta,E^{\star}}\rangle_{\pi^{\star}}\approx 1$ , we conclude that $b^{-1}\approx{\rm cap}^{\star}(\eta,E^{\star})$ , so that

[TABLE]

The next lemma establishes this estimate.

Lemma 6.3.

Fix a proper subset ${\mathscr{A}}$ of $E$ . For every $b>0$ and $\eta\in E\setminus{\mathscr{A}}$ ,

[TABLE]

and

[TABLE]

Proof.

Fix a proper subset ${\mathscr{A}}$ of $E$ , $b>0$ and $\eta\in E\setminus{\mathscr{A}}$ . Fix $\gamma>0$ , and consider the $\gamma$ -enlarged process. Denote by $H_{E^{\star}}$ the hitting time of the set $E^{\star}$ . By definition of the enlargement, under ${\mathbb{P}}^{\star}_{\eta}$ , $H_{E^{\star}}$ has an exponential distribution of parameter $\gamma$ and is independent of $H_{{\mathscr{A}}}$ . Hence, by Lemma 6.1,

[TABLE]

The previous probability is the value of the equilibrium potential between ${\mathscr{A}}$ and $E^{\star}$ computed at the configuration $\eta$ , denoted hereafter by $V^{\star}_{{\mathscr{A}},E^{\star}}$ . By equation (3.3) in [32] and by (6.2), the previous expression is bounded by

[TABLE]

This proves the first assertion of the lemma.

We may also rewrite the right-hand side of (6.5) as

[TABLE]

where ${\mathbf{1}}\{\eta\}$ represents the indicator of the set $\{\eta\}$ . By [6, Proposition A.2], the previous sum is equal to

[TABLE]

where ${\mathbb{P}}^{\star,*}$ represents the distribution of the process $\xi^{\gamma}(t)$ reversed in time, and $\nu_{{\mathscr{A}},E^{\star}}$ the equilibrium measure given by (6.4). By definition of the enlarged process, for every initial condition $\eta\in E$ , $H_{E^{\star}}$ has an exponential distribution of parameter $\gamma$ . The penultimate displayed equation is thus bounded by $\gamma^{-1}{\rm cap}^{\star}({\mathscr{A}},E^{\star})$ , which completes the proof of the lemma. ∎

Denote by $\partial_{+}{\mathscr{A}}$ the exterior boundary of a set ${\mathscr{A}}$ :

[TABLE]

Corollary 6.4.

Fix a proper subset ${\mathscr{A}}$ of $E$ . For every $b>0$ and $\eta\in E\setminus{\mathscr{A}}$ ,

[TABLE]

where $R(\xi,{\mathscr{A}})=\sum_{\zeta\in{\mathscr{A}}}R(\xi,\zeta)$ .

Proof.

In view of (6.3), the first result of the preceding lemma gives

[TABLE]

It suffices now to pick $\gamma=b^{-1}$ . For the second inequality note that

[TABLE]

∎

Acknowledgements. C. Landim has been partially supported by FAPERJ CNE E-26/201.207/2014, by CNPq Bolsa de Produtividade em Pesquisa PQ 303538/2014-7, and by ANR-15-CE40-0020-01 LSD of the French National Research Agency.

Bibliography

[1] D. J. Aldous. Some inequalities for reversible Markov chains. J. London Math. Soc. (2), 25(3):564–576, 1982.
[2] I. Armendáriz, S. Grosskinsky, and M. Loulakis. Metastability in a condensing zero-range process in the thermodynamic limit. Probab. Theory Related Fields, 169(1-2):105–175, 2017.
[3] J. Beltrán and C. Landim. Tunneling and metastability of continuous time Markov chains. J. Stat. Phys., 140(6):1065–1114, 2010.
[4] J. Beltrán and C. Landim. Metastability of reversible finite state Markov processes. Stochastic Process. Appl., 121(8):1633–1677, 2011.
[5] J. Beltrán and C. Landim. Metastability of reversible condensed zero range processes on a finite set. Probab. Theory Related Fields, 152(3-4):781–807, 2012.
[6] J. Beltrán and C. Landim. Tunneling and metastability of continuous time Markov chains II, the nonreversible case. J. Stat. Phys., 149(4):598–618, 2012.
[7] J. Beltrán and C. Landim. A martingale approach to metastability. Probab. Theory Related Fields, 161(1-2):267–307, 2015.
[8] J. Beltrán and C. Landim. Tunneling of the Kawasaki dynamics at low temperatures in two dimensions. Ann. Inst. Henri Poincaré Probab. Stat., 51(1):59–88, 2015.
