Finite Markov chains coupled to general Markov processes and an application to metastability I
Thomas G. Kurtz, Jason Swanson

Thomas G. Kurtz
University of Wisconsin-Madison
Jason Swanson
University of Central Florida
Supported in part by the VIGRE grant of University of Wisconsin-Madison and by NSA grant H98230-09-1-0079.
(January 6, 2021)
Abstract
We consider a diffusion given by a small noise perturbation of a dynamical system driven by a potential function with a finite number of local minima. The classical results of Freidlin and Wentzell show that the time this diffusion spends in the domain of attraction of one of these local minima is approximately exponentially distributed and hence the diffusion should behave approximately like a Markov chain on the local minima. By the work of Bovier and collaborators, the local minima can be associated with the small eigenvalues of the diffusion generator. Applying a Markov mapping theorem, we use the eigenfunctions of the generator to couple this diffusion to a Markov chain whose generator has eigenvalues equal to the eigenvalues of the diffusion generator that are associated with the local minima and establish explicit formulas for conditional probabilities associated with this coupling. The fundamental question then becomes to relate the coupled Markov chain to the approximate Markov chain suggested by the results of Freidlin and Wentzell. In Part II of this work, we provide a complete analysis of this relationship in the special case of a double-well potential in one dimension. More generally, the coupling can be constructed for a general class of Markov processes and any finite set of eigenvalues of the generator.
AMS subject classifications: Primary 60J60; secondary 60H10, 60F10, 60J27, 60J28, 34L10
Keywords and phrases: conditional distributions, coupling, eigenfunctions, Freidlin and Wentzell, Markov mapping theorem, Markov processes, metastability
1 Introduction
Fix and consider the stochastic process,
[TABLE]
where and is a standard -dimensional Brownian motion. For the precise assumptions on , see Section 3.1. Let be the solution to the differential equation . We will use to denote the solution with . The process is a small-noise perturbation of the deterministic process .
Suppose that is the set of local minima of the potential function . The points are stable points for the process . For , however, they are not stable. The process will initially gravitate toward one of the and move about randomly in a small neighborhood of this point. But after an exponential amount of time, a large fluctuation of the noise term will move the process out of the domain of attraction of and into the domain of attraction of one of the other minima. We say that each point is a point of metastability for the process .
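The metastable picture described above can be illustrated with a short simulation. This is only a sketch under stated assumptions: the display (1.1) is not reproduced in this text, so the one-dimensional form dX = -F'(X) dt + sqrt(2*eps) dW used here is assumed, and the double-well potential F(x) = x^4/4 - x^2/2 (minima at -1 and +1), the step size, and all function names are hypothetical choices made for illustration.

```python
import math
import random

def grad_F(x):
    # Derivative of the hypothetical double-well potential F(x) = x**4/4 - x**2/2.
    return x**3 - x

def simulate_path(x0, eps, dt, n_steps, rng):
    """Euler-Maruyama path for the assumed SDE dX = -F'(X) dt + sqrt(2*eps) dW."""
    noise_scale = math.sqrt(2.0 * eps * dt)
    x = x0
    path = [x]
    for _ in range(n_steps):
        x = x - grad_F(x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        path.append(x)
    return path

def count_well_switches(path, threshold=0.5):
    """Count moves between the neighbourhoods x < -threshold and x > threshold."""
    switches = 0
    current = None
    for x in path:
        if x > threshold:
            well = 1
        elif x < -threshold:
            well = -1
        else:
            continue  # between the wells: keep the last recorded well
        if current is not None and well != current:
            switches += 1
        current = well
    return switches

rng = random.Random(0)
# eps = 0: the deterministic flow settles at the minimum +1.
noiseless = simulate_path(0.5, 0.0, 0.01, 2000, rng)
# eps > 0: the path lingers near one minimum and occasionally hops to the other.
noisy = simulate_path(0.5, 0.15, 0.01, 20000, rng)
print("noiseless endpoint:", noiseless[-1])
print("well switches with noise:", count_well_switches(noisy))
```

With eps set to zero the path converges to a minimum and stays there; with positive eps the switch count gives a rough view of how often large fluctuations carry the path across the barrier.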
If is a cadlag process in a complete, separable metric space adapted to a right continuous filtration (assumptions that are immediately satisfied for all processes considered here) and is either open or closed, then is a stopping time (see, for example, [8, Proposition 1.5]). If , let . We may sometimes also write , and if the process is understood, we may omit the superscript.
Let
[TABLE]
be the domains of attraction of the local minima. It is well-known (see, for example, [9], [4, Theorem 3.2], [5, Theorems 1.2 and 1.4], and [7]) that as , is asymptotically exponentially distributed under . It is therefore common to approximate the process by a continuous time Markov chain on the set (or equivalently on ). In fact, metastability can be defined in terms of convergence, in an appropriate sense, to a continuous time Markov chain. (See the survey article [15] for details.) Beltrán and Landim [2, 3] introduced a general method for proving the metastability of a Markov chain. Along similar lines, Rezakhanlou and Seo [19] developed such a method for diffusions. For an alternative approach using intertwining relations, see [1].
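The growth of exit times as the noise shrinks can be seen in a crude Monte Carlo experiment. The sketch below again assumes the one-dimensional SDE dX = -F'(X) dt + sqrt(2*eps) dW with the hypothetical potential F(x) = x^4/4 - x^2/2, so that the domain of attraction of the minimum -1 is the negative half-line and the exit time is the first time the path reaches 0. The function names, the step size, and the censoring horizon are illustrative assumptions, and the discretization slightly biases the hitting times.

```python
import math
import random

def exit_time_from_left_well(eps, dt, rng, max_steps=200_000):
    """First time the assumed diffusion, started at the minimum -1, reaches the saddle at 0."""
    x = -1.0
    noise_scale = math.sqrt(2.0 * eps * dt)
    for step in range(1, max_steps + 1):
        x = x - (x**3 - x) * dt + noise_scale * rng.gauss(0.0, 1.0)
        if x >= 0.0:
            return step * dt
    return max_steps * dt  # censored: no exit observed within the horizon

def mean_exit_time(eps, n_paths, dt=0.01, seed=0):
    """Average exit time over n_paths independent simulated paths."""
    rng = random.Random(seed)
    times = [exit_time_from_left_well(eps, dt, rng) for _ in range(n_paths)]
    return sum(times) / n_paths

for eps in (0.25, 0.125):
    print("eps =", eps, "mean exit time ~", mean_exit_time(eps, n_paths=100))
```

Halving eps should lengthen the mean exit time noticeably, in line with the exponential scaling in the cited results; a histogram of the simulated times would be the natural next check of the approximately exponential shape.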
In this project, for each , we wish to capture this approximate Markov chain behavior by coupling to a continuous time Markov chain, , on . We will refer to the indexed collection of coupled processes, , as a coupling sequence. Our objective is to investigate the possibility of constructing a coupling sequence which satisfies
[TABLE]
as , for all . We also want the transition rate for to go from to to be asymptotically equivalent as to the transition rate for to go from a neighborhood of to a neighborhood of . That is, we would like
[TABLE]
as , for all and , where is the ball of radius centered at .
In this paper (Part I), we develop our general coupling construction. The construction goes beyond the specific case of interest here. It is a construction that builds a coupling between a Markov process on a complete and separable metric space and a continuous-time Markov chain where the generators of the two processes have common eigenvalues. The coupling is done in such a way that observations of the chain yield quantifiable conditional probabilities about the process. This coupling construction is built in Section 2 and uses the Markov mapping theorem (Theorem A.3). In Section 3, we apply this construction method to reversible diffusions in driven by a potential function with a finite number of local minima.
With this coupling construction in hand, we can build the coupling sequences described above. In our follow-up work (Part II), we take up the question of the existence and uniqueness of a coupling sequence that satisfies requirements (1.3) and (1.4).
2 The general coupling
2.1 Assumptions and definitions
Given a Markov process with generator satisfying Assumption 2.1, we will use the Markov mapping theorem to construct a coupled pair, , in such a way that for a specified class of initial distributions, is a continuous-time Markov chain on a finite state space. The construction then allows us to explicitly compute the conditional distribution of given observations of .
For explicit definitions of the notation used here and throughout, see Section A.1.
Assumption 2.1.
Let be a complete and separable metric space.
(i) .
(ii) has a stationary distribution , which implies for all .
(iii) For some , there exist signed measures on and positive real numbers such that, for each and ,
[TABLE]
We define and .
Remark 2.2.
If , then (2.1) implies (2.3).
Remark 2.3.
In what follows, we will make use of the assumption that the functions are continuous. However, this assumption can be relaxed by appealing to the methods in Kurtz and Stockbridge [14].
Assumption 2.4.
Let be a complete and separable metric space. Let , , , and .
(i) and satisfy Assumption 2.1.
(ii) is the generator of a continuous-time Markov chain with state space and eigenvalues .
(iii) The vectors are right eigenvectors of , corresponding to the eigenvalues .
(iv) For each , the function
[TABLE]
satisfies for all .
We define , so that the function is given by .
Remark 2.5.
Given satisfying (i) and (ii) of Assumption 2.4, it is always possible to choose vectors satisfying (iii) and (iv). This follows from the fact that each is a bounded function.
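As a small illustration of parts (ii) and (iii) of Assumption 2.4, the following sketch builds a two-state chain generator with a prescribed nonzero eigenvalue and checks its right eigenvectors. The rates a and b, the target value lam, and the helper names are hypothetical. The sign convention (recording the eigenvalue as lam or as -lam) is not visible in this text, and the positivity condition in (iv) is not checked because its formula does not appear here. The last check only shows that rescaling an eigenvector leaves it an eigenvector, which is presumably the freedom the remark relies on.

```python
def two_state_generator(a, b):
    """Generator of a chain on {0, 1} with jump rates a (0 -> 1) and b (1 -> 0)."""
    return [[-a, a], [b, -b]]

def mat_vec(Q, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(q * x for q, x in zip(row, v)) for row in Q]

lam = 0.3          # hypothetical target: the nonzero eigenvalue of Q is -lam
a, b = 0.1, 0.2    # any positive rates with a + b equal to lam
Q = two_state_generator(a, b)

ones = [1.0, 1.0]                    # right eigenvector for the eigenvalue 0
xi = [a, -b]                         # right eigenvector for the eigenvalue -(a + b)
scaled_xi = [0.01 * c for c in xi]   # rescaling keeps the eigenvector property

print(mat_vec(Q, ones))        # zero vector up to rounding
print(mat_vec(Q, xi))          # -(a + b) * xi up to rounding
print(mat_vec(Q, scaled_xi))   # -(a + b) * scaled_xi up to rounding
```

For more than two states the choice of a generator with prescribed eigenvalues is less direct; this example only covers the smallest case.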
Definition 2.6.
Suppose satisfies Assumption 2.4. For , define
[TABLE]
Note that . Let . Define by
[TABLE]
where we take
[TABLE]
In particular, .
For each , define the measure on by
[TABLE]
for all . Note that by (2.4), (2.3), and (2.2), these are probability measures.
2.2 Construction of the coupling
We are now ready to construct our coupled pair, , which will have generator , to prove, for appropriate initial conditions, that the marginal process is a Markov chain with generator , and to establish our conditional probability formulas. We first require two lemmas.
Lemma 2.7.
In the setting of Definition 2.6, let be a cadlag solution of the martingale problem for . Then there exists a cadlag process such that solves the (local) martingale problem for . If is Markov, then is Markov. If the martingale problem for is well-posed, then the martingale problem for is well-posed.
Remark 2.8.
We are not requiring the to be bounded, so for the process we construct,
[TABLE]
may only be a local martingale.
Proof of Lemma 2.7.
Let be a cadlag solution to the martingale problem for . Let be a family of independent unit rate Poisson processes, which is independent of . Then the equation
[TABLE]
has a unique solution, and as in [12], the process is a solution of the (local) martingale problem for . If is Markov, the uniqueness of the solution of (2.9) ensures that is Markov. Similarly, well-posed implies is well posed.
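The construction in this proof can be sketched numerically. Equation (2.9) is not reproduced in this text, so the code below assumes it is the usual random-time-change representation: each ordered pair of states (i, j) carries its own unit-rate Poisson process, run at the integrated intensity accumulated while the chain sits in state i. The rate function `rate(i, j, x)`, the driving path, and the time step are hypothetical, and the discretization fires at most one jump per step.

```python
import math
import random

def coupled_jump_path(x_path, dt, y0, states, rate, rng):
    """Discretized jump process driven by x_path through state-dependent rates.

    clocks[(i, j)] is the integrated intensity fed so far to the unit-rate
    Poisson process attached to the transition i -> j; thresholds[(i, j)]
    is the next jump time of that Poisson process on its own time scale.
    """
    clocks = {(i, j): 0.0 for i in states for j in states if i != j}
    thresholds = {pair: rng.expovariate(1.0) for pair in clocks}
    y = y0
    y_path = [y]
    for x in x_path[:-1]:
        fired = None
        for j in states:
            if j == y:
                continue
            pair = (y, j)
            clocks[pair] += rate(y, j, x) * dt  # only clocks leaving the current state advance
            if fired is None and clocks[pair] >= thresholds[pair]:
                fired = pair
        if fired is not None:
            thresholds[fired] += rng.expovariate(1.0)  # schedule the next jump of that Poisson process
            y = fired[1]
        y_path.append(y)
    return y_path

# Hypothetical driving path and nonnegative rate function, for illustration only.
x_path = [math.sin(0.01 * k) for k in range(5000)]
y_path = coupled_jump_path(x_path, 0.01, 0, [0, 1],
                           lambda i, j, x: 1.0 + x, random.Random(1))
print("number of jumps:", sum(u != v for u, v in zip(y_path, y_path[1:])))
```

Because the Poisson processes are drawn independently of the driving path, the jump process is a deterministic functional of the pair, which mirrors the uniqueness argument used in the proof.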
Lemma 2.9.
Let satisfy Assumption 2.1. Taking , if satisfies Condition A.1, then satisfies Condition A.1 with replaced by .
Proof.
Since is closed under multiplication, defined in (2.7) is closed under multiplication.
Since we are assuming that , for each , there exists such that .
Condition A.1(iii) for and the separability of implies Condition A.1(iii) for .
Since is a pre-generator and is a perturbation of by a jump operator, is a pre-generator.
Theorem 2.10.
Suppose satisfies Condition A.1 and satisfies Assumption 2.4. Let be given by (2.6) and for , , define
[TABLE]
If is a cadlag -valued Markov chain with generator and initial distribution , then there exists a solution of the martingale problem for such that and have the same distribution on , and
[TABLE]
for all and all .
Proof.
We apply Theorem A.3 to the operator .
Let be the coordinate projection. Let be the transition function from into given by the product measure , where is given by (2.8). Then and
[TABLE]
for each . Define
[TABLE]
The result follows from Theorem A.3, if we can show that for every vector . Given , let
[TABLE]
Note that
[TABLE]
Since , by (2.1) and the definition of ,
[TABLE]
By assumption , so and
[TABLE]
This gives
[TABLE]
It follows that is a solution to the martingale problem for .
By Theorem A.3(a), there exists a solution of the martingale problem for such that and have the same distribution on . Theorem A.3(b) implies (2.10).
Remark 2.11.
In what follows, we may still write expectations with the notation or , even when we have a coupled process, . The meaning will be determined by context, depending on whether the integrand of the expectation involves only or only .
3 Reversible diffusions
3.1 Assumptions on the potential function
We now consider the special case of our coupling when is a reversible diffusion on driven by a potential function and a small white noise perturbation. We will need to use several results from the literature about the eigenvalues and eigenfunctions of the generator of . We assume the following on .
Assumption 3.1.
(i) and .
(ii) has local minima .
(iii) There exist constants and such that , and
[TABLE]
Remark 3.2.
Note that . To see this, observe that (3.1) implies . Thus, , which implies .
Lemma 3.3.
Under Assumption 3.1, there exist constants such that
[TABLE]
where .
Proof.
Since
[TABLE]
it follows from (3.1) that
[TABLE]
and the upper bound in (3.3) follows immediately.
Since , there exists such that for all , and since , there exists such that whenever .
Recall that satisfies and , and define
[TABLE]
Suppose there exists such that . Then, for all ,
[TABLE]
Therefore, for all , a contradiction, and we must have for all .
Let . By (3.1) and the fact that , we may choose and such that and whenever .
Fix with , so that . Since , it follows that . By the continuity of , we may choose such that . We then have
[TABLE]
Let . Note that implies , and therefore . Moreover, for all , we have , which implies
[TABLE]
Thus,
[TABLE]
But is the length of from to , which is bounded below by
[TABLE]
Therefore, for all , we have , where , and this proves the lower bound in (3.3).
3.2 Spectral properties of the generator
Having established our assumptions on , we now turn our attention to the diffusion process, , given by (1.1). To simplify notation, we may sometimes omit the . The process has generator . To show that meets the requirements of our coupling from Section 2, we must prove certain results about its eigenvalues and eigenfunctions. For this, we begin with some notation, a lemma, and two results from the literature.
Define . Let
[TABLE]
Lemma 3.4.
Let be given by (3.4), where satisfies Assumption 3.1. Recall the constants from (3.1)-(3.2). For all , there exist constants such that
[TABLE]
In particular, for all .
Proof.
Fix . By (3.1) and (3.2), for sufficiently large,
[TABLE]
and
[TABLE]
for some . Note that
[TABLE]
Hence, for sufficiently large, . Also,
[TABLE]
Since , it follows that for sufficiently large, . Therefore, there exist constants such that
[TABLE]
and
[TABLE]
for all . Note that
[TABLE]
so that
[TABLE]
From here, the lemma follows easily.
The following two theorems are from [6]. Theorem 3.5 is a consequence of [6, Theorem 4.5.4] and [6, Lemma 4.2.2]. Theorem 3.6 is part of [6, Theorem 2.1.4].
Theorem 3.5.
Let , where is continuous with . Let denote the smallest eigenvalue of , and the corresponding eigenfunction, normalized so that . Define and . If
[TABLE]
where , , and , then is an ultracontractive symmetric Markov semigroup on . That is, for each , the operator is a bounded operator mapping to .
Theorem 3.6.
Let be an ultracontractive symmetric Markov semigroup on , where is a locally compact, second countable Hausdorff space and is a Borel measure on . If , then each eigenfunction of belongs to .
This next proposition establishes the spectral properties of that are needed to carry out the construction of our coupling.
Proposition 3.7.
Fix . The operator is a self-adjoint operator on with discrete, nonnegative spectrum and corresponding orthonormal eigenfunctions . Each is locally Hölder continuous. Moreover, is simple and is proportional to . We define by and , where . The operator given by is a self-adjoint operator on with eigenvalues and orthogonal eigenfunctions . The functions have norm one in , whereas the functions have norm one in .
For , we have . Hence, if we define by
[TABLE]
then is the generator for the diffusion process given by (1.1). For each , (1.1) has a unique, global solution for all time, so that the process with is a solution to the martingale problem for . The operator is graph separable, and is separating and closed under multiplication. The measure is a stationary distribution for . Moreover,
[TABLE]
where and . The signed measures satisfy , and each belongs to , the space of bounded, continuous functions on .
Proof.
Note that by Lemma 3.4. Therefore, by [18, Theorem XIII.67], we have that is a self-adjoint operator on with compact resolvent. It follows (see [6, pp. 108–109, 119–120, and Proposition 1.4.3]) that has a purely discrete spectrum and there exists a complete, orthonormal set of eigenfunctions with corresponding eigenvalues . Moreover, is simple and is strictly positive.
Since is locally bounded, and , [10, Theorem 8.22] implies that, for each compact , is Hölder continuous on with exponent .
Define by , so that . Since is an isometry, is self-adjoint on and has the same eigenvalues as . Note that, for any , it follows from Green’s identity that
[TABLE]
Using the product rule, , this simplifies to
[TABLE]
showing that cannot have a negative eigenvalue. Hence, .
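For orientation, the same conclusion can be seen directly in the weighted space. The following is a standard computation and not the display from the paper, which is not reproduced in this text. It assumes the generator has the form $A = \varepsilon\Delta - \nabla F\cdot\nabla$, that the reference measure is proportional to $e^{-F/\varepsilon}\,dx$, and that $f$ is smooth with compact support; these symbols are assumptions made for the sketch.

```latex
% Assumed (not confirmed by the text above): A = \varepsilon\Delta - \nabla F\cdot\nabla,
% reference measure proportional to e^{-F/\varepsilon}\,dx, f smooth with compact support.
\begin{align*}
Af &= \varepsilon\, e^{F/\varepsilon}\, \nabla\cdot\bigl(e^{-F/\varepsilon}\,\nabla f\bigr), \\
\int f\,(-Af)\, e^{-F/\varepsilon}\,dx
  &= -\varepsilon \int f\, \nabla\cdot\bigl(e^{-F/\varepsilon}\,\nabla f\bigr)\,dx
   = \varepsilon \int |\nabla f|^{2}\, e^{-F/\varepsilon}\,dx \;\ge\; 0 .
\end{align*}
```

The first line is the product rule for the divergence, and the second is integration by parts with no boundary term because $f$ has compact support. Under these assumptions the quadratic form of $-A$ is nonnegative, so $-A$ has no negative eigenvalues.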
By (3.3), we have , so that with . Hence, since is nonnegative and has multiplicity one, it follows that and is proportional to .
Observe that, if , then, using the product rule for the Laplacian and the identity , we have
[TABLE]
Since , we have .
Since is locally Lipschitz, (1.1) has a unique solution up to an explosion time (see [17, Theorem V.38]). Since by assumption and by Lemma 4.2, it follows that is a Liapunov function for proving that does not explode.
By [13, Remark 2.5], is graph separable. Clearly is closed under multiplication. Since separates points and is complete and separable, is separating (see [8, Theorem 3.4.5]).
If , then
[TABLE]
so that is a stationary distribution for . For , since , we have
[TABLE]
Also, , since and are orthogonal.
Finally, since and is locally Hölder continuous, it follows that each belongs to , and the fact that they are bounded follows from Theorems 3.5 and 3.6.
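The stationarity statement in the proposition has a simple finite-dimensional analogue that can be checked numerically. The sketch below is not the paper's construction: it replaces the diffusion by a nearest-neighbour jump chain on a grid, with rates chosen so that detailed balance holds with respect to weights proportional to exp(-F/eps). The potential F(x) = x^4/4 - x^2/2, the grid, the rate formula, and the test function are all hypothetical choices, and the rates are intended only as a finite-difference analogue of a generator of the form eps*f'' - F'*f'.

```python
import math

def F(x):
    # Hypothetical double-well potential.
    return x**4 / 4.0 - x**2 / 2.0

def build_chain(eps, h, n):
    """Grid x_k = -2 + k*h for k = 0..n and nearest-neighbour jump rates q(k, l)."""
    xs = [-2.0 + k * h for k in range(n + 1)]
    def q(k, l):
        # Symmetrized rates: pi(x_k) * q(k, l) == pi(x_l) * q(l, k) for pi ~ exp(-F/eps).
        return (eps / h**2) * math.exp(-(F(xs[l]) - F(xs[k])) / (2.0 * eps))
    return xs, q

def apply_generator(f_vals, q):
    """(Qf)(k) = sum over neighbours l of q(k, l) * (f(l) - f(k)); no jumps leave the grid."""
    n = len(f_vals) - 1
    out = []
    for k in range(n + 1):
        total = 0.0
        if k > 0:
            total += q(k, k - 1) * (f_vals[k - 1] - f_vals[k])
        if k < n:
            total += q(k, k + 1) * (f_vals[k + 1] - f_vals[k])
        out.append(total)
    return out

eps, h, n = 0.1, 0.05, 80
xs, q = build_chain(eps, h, n)
weights = [math.exp(-F(x) / eps) for x in xs]
Z = sum(weights)
pi = [w / Z for w in weights]           # candidate stationary distribution

f_vals = [math.sin(x) + x**2 for x in xs]   # arbitrary test function
Qf = apply_generator(f_vals, q)
stationarity_defect = sum(p * g for p, g in zip(pi, Qf))
print("sum of pi * Qf:", stationarity_defect)   # zero up to rounding
```

Detailed balance makes the weighted sum of Qf vanish for every test function, which is the discrete counterpart of the integral of the generator against the stationary distribution being zero.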
3.3 The coupled process
By Proposition 3.7, the pair satisfies Assumption 2.1 with , so we have the following.
Theorem 3.8.
Let be the generator for (1.1) where satisfies Assumption 3.1, and let be the first eigenvalues and eigenvectors of . Let be the generator of a continuous-time Markov chain with state space and eigenvalues and eigenvectors such that defined by (2.4) is strictly positive. Let be defined as in Definition 2.6.
Let be a continuous time Markov chain with generator and initial distribution . Then there exists a cadlag Markov process with generator and initial distribution given by
[TABLE]
such that and have the same distribution on , and
[TABLE]
for all , all , and all .
Remark 3.9.
That with these properties exists can be seen from [16, Theorem 1]. Remark 2.5 ensures the existence of the eigenvectors.
Proof.
Note that under the assumptions of the theorem, satisfies Assumption 2.4. By Proposition 3.7, the rest of the hypotheses of Theorem 2.10 are also satisfied. Consequently, the process exists, and by uniqueness of the martingale problem for , is Markov.
We can now construct the coupling sequences described in the introduction. For each , choose a matrix and eigenvectors that satisfy the assumptions of Theorem 3.8. If is the Markov process described in Theorem 3.8, then the family, , forms a coupling sequence.
The coupling sequence is determined by the collection, . By making different choices for the matrices and eigenvectors, we can obtain different coupling sequences. In our follow-up paper, we will consider the question of existence and uniqueness of a coupling sequence that satisfies conditions (1.3) and (1.4).
Acknowledgments
This paper was completed while the first author was visiting the University of California, San Diego with the support of the Charles Lee Powell Foundation. The hospitality of that institution, particularly that of Professor Ruth Williams, was greatly appreciated.
Appendix A
A.1 The Markov mapping theorem
Let be a complete and separable metric space, the -algebra of Borel subsets of , and the family of Borel probability measures on . Let be the collection of all real-valued, Borel measurable functions on , and the Banach space of bounded functions with . Let be the subspace of bounded continuous functions, while denotes the collection of continuous, real-valued functions on . A collection of functions is separating if and for all implies .
Condition A.1.
(i) and is closed under multiplication and separating.
(ii) There exists , , such that for each , there exists a constant such that
[TABLE]
(We write even though we do not exclude the possibility that is multivalued. In the multivalued case, each element of must satisfy the inequality.)
(iii) There exists a countable subset such that every solution of the (local) martingale problem for is a solution of the (local) martingale problem for .
(iv) is a pre-generator, that is, is dissipative and there are sequences of functions and such that for each ,
[TABLE]
for each .
Remark A.2.
Condition A.1(iii) holds if is graph-separable, that is, there is a countable subset of such that is a subset of the bounded, pointwise closure of .
An operator is a pre-generator if for each , there exists a solution of the martingale problem for .
For a measurable -valued process , where is a complete and separable metric space, let
[TABLE]
Theorem A.3.
Let and be complete, separable metric spaces. Let satisfy Condition A.1. Let be measurable, and let be a transition function from into (that is, satisfies for all and for all ) satisfying , , , that is, . Assume that for each and define
[TABLE]
Let and define .
a) If satisfies a.s. for all and is a solution of the martingale problem for , then there exists a solution of the martingale problem for such that has the same distribution on as . If and are cadlag, then and have the same distribution on .
b) Let (which holds for Lebesgue-almost every ). Then for ,
[TABLE]
c) If, in addition, uniqueness holds for the martingale problem for , then uniqueness holds for the -martingale problem for . If has sample paths in , then uniqueness holds for the -martingale problem for .
d) If uniqueness holds for the martingale problem for , then restricted to is a Markov process.
Remark A.4.
If is cadlag with no fixed points of discontinuity (that is a.s. for all ), then for all .
Remark A.5.
The main precursor of this Markov mapping theorem is [13, Corollary 3.5]. The result stated here is a special case of Corollary 3.3 of [11].
References
[1] Luca Avena, Fabienne Castell, Alexandre Gaudillière, and Clothilde Melot. Approximate and exact solutions of intertwining equations through random spanning forests, 2017.
[2] J. Beltrán and C. Landim. Tunneling and metastability of continuous time Markov chains. J. Stat. Phys., 140(6):1065–1114, 2010.
[3] J. Beltrán and C. Landim. Tunneling and metastability of continuous time Markov chains II, the nonreversible case. J. Stat. Phys., 149(4):598–618, 2012.
[4] Anton Bovier, Michael Eckhoff, Véronique Gayrard, and Markus Klein. Metastability in reversible diffusion processes. I. Sharp asymptotics for capacities and exit times. J. Eur. Math. Soc. (JEMS), 6(4):399–424, 2004.
[5] Anton Bovier, Véronique Gayrard, and Markus Klein. Metastability in reversible diffusion processes. II. Precise asymptotics for small eigenvalues. J. Eur. Math. Soc. (JEMS), 7(1):69–99, 2005.
[6] E. B. Davies. Heat kernels and spectral theory, volume 92 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 1990.
[7] Michael Eckhoff. Precise asymptotics of small eigenvalues of reversible diffusions in the metastable regime. Ann. Probab., 33(1):244–299, 2005.
[8] Stewart N. Ethier and Thomas G. Kurtz. Markov processes: Characterization and convergence. Wiley Series in Probability and Mathematical Statistics: Probability and Mathematical Statistics. John Wiley & Sons Inc., New York, 1986.
