Lower bound for the coarse Ricci curvature of continuous-time pure jump processes
Denis Villemonais

Abstract
We obtain a lower bound for the coarse Ricci curvature of continuous time pure jump Markov processes, with an emphasis on interacting particle systems. Applications to several models are provided, with a detailed study of the herd behavior of a simple model of interacting agents.
1 Université de Lorraine, IECN, Campus Scientifique, B.P. 70239, Vandœuvre-lès-Nancy Cedex, F-54506, France.
2 Inria, TOSCA team, Villers-lès-Nancy, F-54600, France.
E-mail: [email protected]
1 Introduction
Let be a Polish space. Fix and consider a continuous time pure jump particle system of particles evolving in . We assume that the process is non-explosive and that its infinitesimal generator is given, for all and any bounded measurable function , by
[TABLE]
where the terms are finite non-negative measures on , measurable with respect to and and such that, for some (and hence for all) , . Our main result, stated in Section 2, provides a lower bound for the coarse Ricci curvature of evolving in endowed with the metric
[TABLE]
We recall that the coarse Ricci curvature of the continuous-time Markov process $X$ is the largest constant $\sigma$ satisfying, for all $x,y\in E^N$ and all $t\geq 0$,
$$\mathcal{W}_d(\delta_x P_t,\delta_y P_t)\leq e^{-\sigma t}\,d(x,y),$$
where $\delta_x P_t$ is the law of $X_t$ started from $x$ and $\mathcal{W}_d$ denotes the Wasserstein distance (see Section 2 for references and details). A lower bound on $\sigma$ provides a measure of the instantaneous convergence rate to a unique stationary distribution (see for instance [13]). It also entails spectral gap inequalities and concentration inequalities (see [45, 35, 33, 34, 56, 57, 24]).
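As a minimal illustration of this definition, consider a two-state pure jump process with hypothetical rates `a` (from state 0 to state 1) and `b` (from state 1 to state 0), endowed with the trivial distance. The sketch below is not taken from the paper and all names in it are assumptions. It approximates the transition matrix by uniformization and measures the exponential rate at which the Wasserstein distance between the laws started from the two states decays. For this toy chain the rate is expected to be `a + b`.

```python
import math


def transition_matrix(a, b, t, terms=80):
    """Approximate exp(tQ) for Q = [[-a, a], [b, -b]] by uniformization."""
    lam = max(a, b)  # uniformization rate, so that I + Q/lam is stochastic
    step = [[1.0 - a / lam, a / lam], [b / lam, 1.0 - b / lam]]
    power = [[1.0, 0.0], [0.0, 1.0]]  # step^k, starting at k = 0
    result = [[0.0, 0.0], [0.0, 0.0]]
    weight = math.exp(-lam * t)  # Poisson(lam * t) probability of k = 0
    for k in range(terms):
        for i in range(2):
            for j in range(2):
                result[i][j] += weight * power[i][j]
        power = [[sum(power[i][m] * step[m][j] for m in range(2))
                  for j in range(2)] for i in range(2)]
        weight *= lam * t / (k + 1)  # next Poisson weight
    return result


def wasserstein_trivial(p, q):
    """Wasserstein distance for the trivial distance: half the L1 distance."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


def empirical_rate(a, b, t):
    """Return -log(W(delta_0 P_t, delta_1 P_t)) / t, using d(0, 1) = 1."""
    p_t = transition_matrix(a, b, t)
    return -math.log(wasserstein_trivial(p_t[0], p_t[1])) / t


if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0):
        print(t, empirical_rate(1.0, 2.0, t))
```

The uniformization series is truncated at a fixed number of terms, which is adequate only for moderate values of `max(a, b) * t`.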
In Section 2.4, as a first application of our main result, we provide a general lower bound for the coarse Ricci curvature of simple ($N=1$) continuous time pure jump processes. The time-continuous version of the coarse Ricci curvature has often been considered impractical because of the lack of general and practical lower bounds (see [45] and [16]), in contrast with the discrete time case. The computation of our lower bound mainly requires the computation of a Wasserstein distance between measures, similarly to the discrete time case. We refer the reader to [1] for a different approach based on Kantorovich potentials. In Example 2, we consider the case where $d$ is the trivial distance and where the jump measures admit a density with respect to a common non-negative measure on $E$. In Example 3, we check that the lower bound provided by our main result in the case of birth and death processes is in fact equal to the coarse Ricci curvature, as computed explicitly in [13, 34]. This entails that, at least in some simple cases, this lower bound is sharp. We also show in Example 4 how to compute nontrivial lower bounds for the coarse Ricci curvature of a modified birth and death process, using our result and a slight extension of Vallender's Theorem [53] for the computation of the Wasserstein distance between probability measures on the real line (see Lemma 2.6).
In Section 3, we study a simple model of interacting agents whose individual behavior is influenced in a non-linear way by the behavior of the other agents: each agent wanders randomly in a complete graph and also changes its position to a new one, depending on a function of the number of agents in this position. This dynamic is modeled by a system of particles evolving in the complete finite graph of size : we assume that there exist and a function such that any agent jumps from state to with the following rate
[TABLE]
In this model, is the temperature of the system and is a preference function. For instance, with an increasing function with high convexity, the agents will give higher preferences to positions that are already favored by many other agents; with a larger temperature , the agents act more independently. Our aim is to determine characteristics of and values of for which a herd behavior occurs or not in this model. By a herd behavior, we mean a meta-stable state of the whole particle system where a majority of the agents share the same position for a long time. Note that this model can be written in the settings of the present paper, by setting, for all and ,
[TABLE]
The existence of the phase without herd behavior is obtained using the results of Section 2, while the existence of the phase with herd behavior is proved using large deviation results obtained in [26, 27].
In Section 4, lower bounds of the coarse Ricci curvature for several models are obtained: we consider zero range dynamics in Subsection 4.1, Fleming-Viot type systems and some natural extensions in Subsection 4.2, birth and death processes in mean-field type interaction in Subsection 4.3 and systems of particles whose jump measures admit a density with respect to the Lebesgue measure or the counting measure in Subsection 4.4.
2 Definitions and main result
2.1 Definitions and reminders about the Wasserstein distance
Fix and consider the Polish space . Let (respectively ) denote the set of probability measures (respectively of non-negative finite measures) on such that, for some (and hence for all) , . The Wasserstein distance between two probability measures and on belonging to is defined as
[TABLE]
where the infimum is taken over all probability measures on such that and ( is called a coupling measure for and ). It is well known that the infimum in the above definition is attained and the state space is a complete state space (see for instance Lemma 5.2 and Theorem 5.4 in [14]). The Wasserstein distance is also referred to as the Kantorovich metric (which one may consider a more suitable name given the historical precedence [55]) and is a particular instance of the Kantorovich-Rubinstein norm (with replaced by a suitable cost function and taken in the set of measures such that , see [48, Chapter 6] for relations between the different types of norms).
The Wasserstein distance can also be easily extended to positive measures with the same mass: for all and any probability measures on , we set
[TABLE]
where the infimum is taken over all measures on with mass and such that and . Note that if a coupling realizes the minimum in the definition of , then realizes the minimum in the definition of . Such couplings are also referred to as optimal couplings.
Given a continuous time Markov process evolving in , the coarse Ricci curvature of (as coined by Ollivier [45], see also [33] and [34, Remark 2.3] where this quantity is called the Wasserstein curvature) is the largest constant satisfying, for all ,
[TABLE]
In the discrete time setting, we refer the reader to [62, Theorem 2.1 and Lemma 2.1] for a first use of this concept in a general setting and to [45] for a systematic study. If is positive, then the completeness of implies that the process admits a unique stationary distribution , that and that, for all and any initial distribution ,
[TABLE]
Note that this concept is closely related to the optimal coupling theory developed by Chen (see for instance [13, 15]). Several implications of this notion have been proved in [33, 34], where Joulin obtains Poisson type deviation inequalities for jump type processes. We also refer the reader to [10, Section 3.2] for a link between coarse Ricci curvature and functional inequalities. For general state space processes and for diffusion processes, we refer the reader to the works of Veysseire, where a systematic study of the coarse Ricci curvature has been conducted (see [56, 57]) with nice implications on concentration inequalities and spectral gap estimates. Let us also mention that estimates on the coarse Ricci curvature of a continuous time process immediately provide estimates for the curvature of its discrete time embedded Markov chain, which also implies several interesting properties (see the works of Ollivier [45, 46] and references therein).
Estimates on the coarse Ricci curvature can be obtained using the coupling of Markov processes. Let be the infinitesimal generator of . We recall (see [14, Definition 5.12]) that a coupling operator of is an operator acting on functions and such that
[TABLE]
for some function . Since is the infinitesimal generator of a pure jump non-explosive process (see [15, Chapter 2]), any coupling operator is also non-explosive and is well defined for all . A common way to prove that the coarse Ricci curvature of a pure jump Markov process is bounded from below by a constant is to prove that there exists a coupling operator of such that
[TABLE]
Indeed, standard localization arguments and Dynkin’s formula entail that, for a Markov process with generator satisfying the above inequality,
[TABLE]
Now, since the law of is a coupling measure for and , we deduce that, for all ,
[TABLE]
and hence that the coarse Ricci curvature of is bounded from below by .
Remark 1.
The above strategy also applies to Markov processes that are not of pure jump type and to cost functions that are not distance functions. For diffusion processes, we refer the reader to [17] and to [60, Corollary 1.4] for necessary and sufficient conditions in the case where the drift derives from a potential. We also refer the reader to [28, 29] for an introduction to parallel coupling and the construction of ad hoc distances on the state space. Computation of the coarse Ricci curvature for diffusion processes on manifolds has also been studied by Veysseire [57]. For piecewise deterministic processes, we refer the reader to [19, Lemma 5.2] and [11, Theorem 2.3]. Original coupling approaches are also provided in [41, 40, 8].
2.2 Main result
We introduce the family of functions from to , defined for all by
[TABLE]
where denotes the Dirac measure at point and is the product of the scalar by . Note that the finite measures and can have different masses. Properties of are provided in Subsection 2.3 and explicit computations of lower bounds for are provided in the subsequent sections.
The following theorem is the main result of this paper. The particular case is detailed in Subsection 2.4 and applications to particle systems are provided in Sections 3 and 4.
Theorem 2.1.
Consider the Markov process with generator given in the introduction. Then there exists a coupling operator of such that, for all ,
[TABLE]
In particular, the coarse Ricci curvature of the process satisfies
[TABLE]
Remark 2.
This result remains valid under a more general setting. For instance, if $E$ is a subset of a Polish space and if $d$ is a continuous non-negative function, then the infimum in the definition of the Wasserstein distance is attained [58, Theorem 4.1] and there exists a measurable selection of such optimal couplings [58, Corollary 5.22], so that the proof of Theorem 2.1 holds true. Another important setting, which will be used in the following sections, is the case where $E$ is a separable metric space endowed with its Borel $\sigma$-field and $d$ is the trivial distance (i.e. $d(x,y)=\mathbf{1}_{x\neq y}$ for all $x,y\in E$). In this case, $\mathcal{W}_d$ is one half of the total variation distance, that is
$$\mathcal{W}_d(\mu,\nu)=\frac{1}{2}\,\|\mu-\nu\|_{TV},$$
and the optimal coupling in the definition of $\mathcal{W}_d$ is a measurable function of the Jordan-Hahn decomposition of signed measures (which is itself measurable because of the regularity of Borel probability measures on metric spaces [3, Theorem 1.1] and because of the separability assumption), so that the proof of Theorem 2.1 still applies.
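For discrete measures, the statement about the trivial distance can be checked by hand. The following sketch (function names are illustrative assumptions, not notation from the paper) builds a coupling that keeps as much mass as possible on the diagonal and computes its cost, which should coincide with half of the sum of the absolute differences of the weights. Here the total variation norm of the signed measure is taken to be that sum, which is an assumed convention.

```python
from fractions import Fraction


def maximal_coupling(p, q):
    """Coupling matrix of p and q carrying the largest possible diagonal mass."""
    n = len(p)
    pi = [[0] * n for _ in range(n)]
    rest_p, rest_q = list(p), list(q)
    for i in range(n):
        common = min(p[i], q[i])  # mass that can stay on the diagonal
        pi[i][i] = common
        rest_p[i] -= common
        rest_q[i] -= common
    # The leftovers have disjoint supports: pair them off greedily.
    i = j = 0
    while i < n and j < n:
        if rest_p[i] <= 0:
            i += 1
            continue
        if rest_q[j] <= 0:
            j += 1
            continue
        moved = min(rest_p[i], rest_q[j])
        pi[i][j] += moved
        rest_p[i] -= moved
        rest_q[j] -= moved
    return pi


def trivial_cost(pi):
    """Cost of a coupling for the trivial distance: the off-diagonal mass."""
    n = len(pi)
    return sum(pi[i][j] for i in range(n) for j in range(n) if i != j)


def half_l1(p, q):
    """Half of the sum of absolute differences of the weights."""
    return sum(abs(a - b) for a, b in zip(p, q)) / 2


if __name__ == "__main__":
    p = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]
    q = [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
    print(trivial_cost(maximal_coupling(p, q)), half_l1(p, q))
```

Exact rational weights are used in the demonstration so that the two quantities can be compared without rounding.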
Remark 3.
In Theorem 2.1, we obtain, using coupling methods, lower bounds on the coarse Ricci curvature of a system of particles from the behavior of individual particles. This idea of reconstituting transport distance bounds on Markov chains on product spaces from the behavior of marginals via suitable couplings was already used by Talagrand and Marton, see for instance [42, 43] and references therein.
Proof of Theorem 2.1.
Fix and . We define the operator
[TABLE]
where is the element of the canonical basis of and where is a coupling measure between the positive measures and such that
[TABLE]
Note that can be constructed as a measurable function of by [62, Theorem 1.1], so that is the infinitesimal generator of a pure jump process.
Let us first check that is a coupling operator for . We have, for any bounded measurable function such that for some function (so that only depends on ),
[TABLE]
for all . For each couple , we observe that the integral with respect to the first marginal of is equal to the integral with respect to . Hence, since the integral of with respect to is [math], we obtain that
[TABLE]
By symmetry of the roles of and , we deduce that is indeed a coupling operator for .
Now our aim is to prove that, for all ,
[TABLE]
which will conclude the proof of Theorem 2.1. We have
[TABLE]
where, by definition of ,
[TABLE]
and hence, using equality (2.3),
[TABLE]
∎
Remark 4.
One can use the results of this section to study Markov processes obtained from other types of infinitesimal generators. For instance, let be the infinitesimal generator of independent diffusion processes or piecewise deterministic processes and consider the infinitesimal generator , which can be seen as a perturbation of independent random paths (given by ) by jumps with dependence (given by ). If there exists a coupling of such that for some constant (see Remark 1), then one can expect to prove that
[TABLE]
using the coupling of . Two difficulties arise: first, one needs to ensure that this coupling operator defines a proper Markov process; second, one needs to check that it is possible to apply this coupling operator to . Since it is more intricate to check these properties for general Markov processes, we mainly restrict our attention to the case of pure jump type infinitesimal generators. However, the method used here and in the particular examples of the next sections can be adapted to these situations, as in the following example.
Example 1.
Consider a process evolving in with generator
[TABLE]
where is a constant and is the pure jump type infinitesimal generator of the introduction. This is the generator of a system of particles evolving as independent Ornstein-Uhlenbeck processes between their jumps (several properties of similar processes with jumps are investigated in [61]). The jumps occur with respect to a jump measure which depends on the position of the whole system. Now, consider the following coupling generator
[TABLE]
where is the basic coupling (also called the parallel coupling) for Ornstein-Uhlenbeck processes (see for instance [17, Example 2.5]), which satisfies, for all , and where is the coupling for obtained from Theorem 2.1. If is the lower bound provided by Theorem 2.1 for the coarse Ricci curvature of the pure jump part, then
[TABLE]
so that the coarse Ricci curvature of the process generated by is bounded from below by .
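The contribution of the diffusive part can be visualized with a small simulation. The sketch below is an illustration under stated assumptions, not the construction of the paper: it treats a one-dimensional Ornstein-Uhlenbeck process with a hypothetical mean-reversion rate `lam`, discretized by an Euler scheme, and drives two copies with the same Gaussian increments (the parallel coupling). The noise cancels in the difference, so the gap between the two copies shrinks deterministically at a rate close to `lam`.

```python
import math
import random


def coupled_ou_gaps(x0, y0, lam, dt, steps, seed=0):
    """Euler scheme for two OU paths sharing the same noise; returns |X - Y|."""
    rng = random.Random(seed)
    x, y = x0, y0
    gaps = [abs(x - y)]
    for _ in range(steps):
        noise = rng.gauss(0.0, math.sqrt(dt))  # shared Brownian increment
        x += -lam * x * dt + noise
        y += -lam * y * dt + noise
        gaps.append(abs(x - y))
    return gaps


if __name__ == "__main__":
    gaps = coupled_ou_gaps(x0=4.0, y0=1.0, lam=1.0, dt=1e-3, steps=1000)
    # Compare the final gap with the continuous-time prediction.
    print(gaps[-1], 3.0 * math.exp(-1.0))
```

In the discretized scheme the gap is multiplied by `1 - lam * dt` at each step, which approaches the exponential decay as `dt` tends to zero.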
2.3 Some properties of
One of the difficulties of the continuous time setting is that the jump measures do not, in general, share the same mass, in contrast with the discrete time case, where one can use the standard Wasserstein distance to compare transition probabilities [45]. In the definition of , the quantity is used to compare measures with different masses. However, this is clearly not a proper distance between non-negative measures since this quantity is equal to zero for every pair such that and , where is defined as the set of non-negative measures . Proper generalizations of the Wasserstein distance exist in the literature (such as the flat metric [25] and the generalized Wasserstein distance [47], see also the recent developments in [18, 37, 39] with applications to convergence of measure valued dynamical systems), but are not directly relevant in our context.
Fix . The aim of this section is to provide some properties of , which will be useful to derive upper bounds and hence to apply Theorem 2.1.
Proposition 2.2.
For all and all , we have
[TABLE]
and
[TABLE]
Proof.
Equality (2.4) is an immediate consequence of the definition of and of (2.2).
Let and be two coupling measures realizing the minimum in the definition of and of respectively. Then is a coupling measure for and , so that
[TABLE]
Subtracting leads to (2.5). ∎
The following inequality is in general a crude estimate, but it is in some cases useful and sharp (as in Example 3).
Proposition 2.3.
We have, for all ,
[TABLE]
Proof.
Since is a coupling measure for and , we have
[TABLE]
Subtracting , one obtains the desired inequality. ∎
The following property implies in particular that, if and are two probability measures, then is smaller than . It also implies that, for measures and on such that , then
[TABLE]
Proposition 2.4.
We have, for all ,
[TABLE]
where are taken in the set of real numbers such that and are non-negative measures on with equal mass, i.e. such that $m_{1}(E)+a\geq 0$, $m_{2}(E)+b\geq 0$ and $m_{1}(E)+a=m_{2}(E)+b$. In addition, the minimum is attained for all (or equivalently ).
Proof.
Taking and , one deduces that is larger than the right hand side.
Let us now prove the converse inequality. Let and be two real numbers such that and are non-negative measures on with equal mass and denote by a coupling which realizes the minimum in the definition of .
If (and hence ), then is a coupling measure for and , so that
[TABLE]
Subtracting implies that
[TABLE]
If (and hence ), then
[TABLE]
We deduce that . Hence is a non-negative measure and it is a coupling measure for and . Since it is a restriction of and since optimality is inherited by restriction (see [58, Theorem 4.6]), it is an optimal coupling for its marginals. We deduce that
[TABLE]
Subtracting on both sides concludes the proof. ∎
2.4 The particular case
In this section, we state our result in the simpler case . The following corollary is an immediate consequence of Theorem 2.1.
Corollary 2.5.
Let be the infinitesimal generator of a pure jump non-explosive Markov process on defined, for any bounded measurable function , by
[TABLE]
where is a jump kernel of finite non-negative measures. Then the coarse Ricci curvature of the Markov process generated by satisfies
[TABLE]
In Example 2, we apply Corollary 2.5 to the case where is the trivial distance and admits a density with respect to a common non-negative measure on . In Example 3, we show that the lower bound obtained in Corollary 2.5 is in fact equal to the coarse Ricci curvature in the case of birth and death processes. In Example 4, we compute a lower bound for a modified version of birth and death processes, using a slight extension of a lemma by Vallender in order to compute the Wasserstein distance between probability measures on the real line.
Remark 5.
For continuous time birth and death processes, Mielke [44] recently computed a lower bound for another notion of discrete Ricci curvature, related to the fact that the evolution of the law of a continuous time birth and death process can be described through a gradient flow system. Relating the two definitions is still an open problem, but the lower bound obtained in Mielke's work has a similar expression (see Section 5 in [44] and Example 3 below) and may be a good starting point to compare both approaches. This example has also been considered by Fathi and Maas in [30, Theorem 4.1] in the setting of entropic Ricci curvature.
Example 2.
In this example, is the trivial distance on . Assume that there exist a non-negative measure on and a measurable function such that
[TABLE]
Without loss of generality, we assume that for all . Then, using the fact that the Wasserstein distance is one half of the total variation distance, we obtain, for all ,
[TABLE]
where we used the fact that and . Rearranging the terms, we obtain
[TABLE]
In particular, the coarse Ricci curvature of the process satisfies
[TABLE]
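On a finite state space with the counting measure as reference measure, a bound of this type can be evaluated mechanically. The sketch below is a hedged illustration: it assumes that the two jump measures are brought to the same mass by adding to each a Dirac mass at the current position (an assumption suggested by the discussion around Proposition 2.4, not a transcription of the displayed formulas), and it uses the fact that, for the trivial distance, the optimal coupling puts on the diagonal the common mass of the two padded measures. The names `rates` and `curvature_bound_trivial` are hypothetical.

```python
def curvature_bound_trivial(rates):
    """Lower bound on the contraction rate for the trivial distance.

    rates[x][z] is the jump rate from x to z (diagonal entries are ignored).
    For each pair x != y, the jump measure of x is padded with a Dirac mass
    at x of weight total[y], and symmetrically for y (assumed construction).
    The bound is the smallest common mass of the two padded measures.
    """
    n = len(rates)
    total = [sum(rates[x][z] for z in range(n) if z != x) for x in range(n)]
    best = None
    for x in range(n):
        for y in range(x + 1, n):
            m1 = [rates[x][z] if z != x else 0 for z in range(n)]
            m2 = [rates[y][z] if z != y else 0 for z in range(n)]
            m1[x] += total[y]  # padding: x stays put while y jumps
            m2[y] += total[x]  # padding: y stays put while x jumps
            common = sum(min(a, b) for a, b in zip(m1, m2))
            best = common if best is None else min(best, common)
    return best


if __name__ == "__main__":
    # Two-state chain with rates 1 and 2.
    print(curvature_bound_trivial([[0, 1], [2, 0]]))
    print(curvature_bound_trivial([[0, 1, 2], [1, 0, 1], [3, 1, 0]]))
```

For the two-state chain the value returned is the sum of the two rates, which is the exact exponential rate of convergence in total variation for that chain.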
Example 3.
Consider the particular case where and is the infinitesimal generator of a birth and death process with birth rates $(b_{x})_{x\in\mathbb{N}^{0}}$ and death rates $(d_{x})_{x\in\mathbb{N}^{0}}$, all positive but . In this case, for all $x,y\in\mathbb{N}^{0}$,
[TABLE]
We also assume that the distance is given by
[TABLE]
where is a sequence of positive numbers. Using Proposition 2.3, we obtain, for all ,
[TABLE]
with the convention . Hence Corollary 2.5 entails that the coarse Ricci curvature of the process satisfies
[TABLE]
In [13], [34] and [10], it is shown that there is equality in the above equation. This implies that, at least in some cases, Corollary 2.5 and hence Theorem 2.1 are sharp. Note that, in this case, Proposition 2.3 provides an explicit expression for the quantity .
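A numerical version of this computation may help fix ideas. The sketch below assumes the classical expression from the birth and death literature, with the distance between neighbors `x` and `x + 1` equal to a weight `u[x]` and the convention that the weight to the left of 0 is zero. These conventions, the truncation of the state space and all names are assumptions made for illustration. The second function recomputes the same quantity from the coupling in which the two copies jump independently.

```python
def curvature_terms(birth, death, u):
    """Per-site terms whose infimum is the classical birth-death bound.

    Term at x: d[x+1] - d[x]*u[x-1]/u[x] + b[x] - b[x+1]*u[x+1]/u[x],
    with the assumed convention u[-1] = 0. Sequences are truncated lists.
    """
    terms = []
    for x in range(len(birth) - 1):
        u_prev = u[x - 1] if x > 0 else 0.0
        terms.append(death[x + 1] - death[x] * u_prev / u[x]
                     + birth[x] - birth[x + 1] * u[x + 1] / u[x])
    return terms


def independent_coupling_drift(birth, death, u, x):
    """Drift of the distance from (x, x+1) when both copies jump independently."""
    u_prev = u[x - 1] if x > 0 else 0.0
    return (-u[x] * birth[x]            # lower copy jumps up: copies merge
            + u[x + 1] * birth[x + 1]   # upper copy jumps up: gap grows
            + u_prev * death[x]         # lower copy jumps down: gap grows
            - u[x] * death[x + 1])      # upper copy jumps down: copies merge


if __name__ == "__main__":
    n = 50
    birth = [2.0] * n                      # constant arrival rate
    death = [0.5 * x for x in range(n)]    # linear death rate
    u = [1.0] * n                          # usual distance on the integers
    print(min(curvature_terms(birth, death, u)))
```

With constant birth rates and linear death rates, every term equals the per-individual death rate, in line with the known contraction rate of that model.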
Example 4.
The choice of the classical coupling (i.e. the use of Proposition 2.3) in the previous example was judicious because the measures involved for a birth and death process are stochastically ordered: the jump measures are such that is always dominated by for , so that an optimal coupling between the measures involved is obtained by the classical coupling (this also explains why, in [34, Theorem 4.3] for instance, the classical coupling is sufficient to recover the exact coarse Ricci curvature). This is not the case in the present example.
We assume that
[TABLE]
In this case, a similar computation as above shows that, for ,
[TABLE]
Using the same method (which relies on Proposition 2.3) in the case would lead to the following bound
[TABLE]
Instead, we use Lemma 2.6 below to obtain, when ,
[TABLE]
and hence
[TABLE]
Note that this quantity is always strictly smaller than the bound obtained using Proposition 2.3 (which corresponds to the classical coupling). We deduce that the coarse Ricci curvature of the process satisfies
[TABLE]
Lemma 2.6 allowed us to provide a computable bound for the coarse Ricci curvature. This method can be easily generalized to other jump measures on the real line and, although the coupling operator realizing this bound might be quite difficult to build explicitly, our result shows that such a coupling operator indeed exists.
The following lemma, which is a slight extension of [53], can be useful to compute the Wasserstein distance between laws on the real line when the distance is similar to the one of the two previous examples. Note that in this statement, we define as the infimum in (2.1), although might not be a distance in general. Of course, this result immediately extends to arbitrary non-negative measures and sharing the same mass.
Lemma 2.6.
Let be a positive measure on and consider the functional on defined by for all . Then, for any probability measures and belonging to , we have
[TABLE]
where and are the cumulative distribution functions of and respectively. Moreover, the infimum in the definition of is attained.
Proof of Lemma 2.6.
Let be a random variable with uniform law on and define and where
[TABLE]
It is well known that the laws of and are and respectively. We consider the left-continuous non-decreasing function and define the random variables and . We denote by and their respective cumulative distribution functions, and our first aim is to prove (in Step 1) that
[TABLE]
We conclude the proof of the lemma in Step 2, using a well known explicit expression for the Wasserstein distance between the laws of and when the underlying distance is the Euclidean one.
Step 1. Fix and and let us prove that .
We set
[TABLE]
Since , there exists such that , and, since is left continuous and non-decreasing, we deduce that there exists which is the largest number such that . Since, by definition of , for all , , we also observe that
[TABLE]
so that by definition of . Finally, we deduce that .
The definition of also entails that, for all , . As a consequence,
[TABLE]
We deduce that and hence that .
Now, setting , we have and hence since is non-decreasing. This implies that .
This concludes Step 1 of the proof.
Step 2. Let us now conclude the proof of the lemma.
Denoting by the Euclidean distance on and by the corresponding Wasserstein distance, we obtain using [53] (see also the Addendum by the same author in 1980 and references therein, and [21] for an earlier look at the problem) that
[TABLE]
since and almost surely. But, setting , we have, for all and ,
[TABLE]
Hence
[TABLE]
In order to verify the last equality, one simply checks that, for all , the integral of the function with respect to the measure is
[TABLE]
On the one hand, we deduce from (2.7) that
[TABLE]
On the other hand, for any coupling with marginal laws and , the coupling is a coupling with the same marginal laws as and . As a consequence,
[TABLE]
This and Equation (2.7) entail that
[TABLE]
and that the law of realizes the minimum in the definition of . ∎
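A discrete analogue of this lemma is easy to test numerically. In the sketch below, which is an illustration under assumptions and not the statement of the lemma, the state space is a finite set of integers, the distance between two points is the sum of hypothetical weights `u[k]` over the sites separating them, and the Wasserstein distance is computed in two ways: from the cumulative distribution functions, and as the cost of the monotone (quantile) coupling.

```python
from fractions import Fraction


def wasserstein_from_cdfs(mu, nu, u):
    """Sum over k of u[k] * |F_mu(k) - F_nu(k)| on the sites 0..n-1."""
    total, f_mu, f_nu = 0, 0, 0
    for k in range(len(mu) - 1):
        f_mu += mu[k]
        f_nu += nu[k]
        total += u[k] * abs(f_mu - f_nu)
    return total


def monotone_coupling_cost(mu, nu, u):
    """Cost of the coupling that matches quantiles of mu and nu in order."""
    n = len(mu)
    position = [0] * n  # position[x] = u[0] + ... + u[x-1]
    for x in range(1, n):
        position[x] = position[x - 1] + u[x - 1]
    rest_mu, rest_nu = list(mu), list(nu)
    cost, i, j = 0, 0, 0
    while i < n and j < n:
        if rest_mu[i] <= 0:
            i += 1
            continue
        if rest_nu[j] <= 0:
            j += 1
            continue
        moved = min(rest_mu[i], rest_nu[j])
        cost += moved * abs(position[i] - position[j])
        rest_mu[i] -= moved
        rest_nu[j] -= moved
    return cost


if __name__ == "__main__":
    mu = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
    nu = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
    u = [1, 2]  # distances between sites 0-1 and 1-2
    print(wasserstein_from_cdfs(mu, nu, u), monotone_coupling_cost(mu, nu, u))
```

The agreement of the two values reflects the fact that, on the line, the monotone coupling is optimal for costs of this form.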
3 A model of interacting agents
In this section, the set is the complete graph of size endowed with the distance . In particular, the Wasserstein distance associated to equals half the total variation distance.
Fix and consider the particle system described in the introduction, where each particle represents an agent’s choice in the complete graph . We recall that this model can be written in the settings of Theorem 2.1, by setting, for all ,
[TABLE]
where is a non-negative function and is a fixed constant (called the temperature of the system). Note that this process is exponentially ergodic, and that the marginal of its empirical stationary distribution is the uniform probability measure on (this is an immediate consequence of the symmetry of the state space and of the dynamic of the particles).
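To experiment with this model, one can simulate it with a standard Gillespie scheme. The sketch below is only illustrative and rests on loudly stated assumptions, since the displayed rate is not reproduced here: each agent located at `x` is assumed to jump to a site `y` different from `x` at rate `temperature / n_sites + preference(fraction of agents at y)`, and all agents are assumed to start at site 0. The function and parameter names are hypothetical.

```python
import random


def simulate_agents(n_agents, n_sites, temperature, preference, t_max, seed=0):
    """Gillespie simulation; returns final counts and time site 0 held a majority."""
    rng = random.Random(seed)
    counts = [0] * n_sites
    counts[0] = n_agents  # assumed initial condition: everyone at site 0
    t, majority_time = 0.0, 0.0
    while True:
        moves, weights = [], []
        for x in range(n_sites):
            for y in range(n_sites):
                if x == y or counts[x] == 0:
                    continue
                # Assumed rate: uniform wandering plus preference for y.
                rate = counts[x] * (temperature / n_sites
                                    + preference(counts[y] / n_agents))
                if rate > 0:
                    moves.append((x, y))
                    weights.append(rate)
        total = sum(weights)
        dt = rng.expovariate(total) if total > 0 else float("inf")
        herd = 2 * counts[0] > n_agents
        if t + dt >= t_max:
            if herd:
                majority_time += t_max - t
            break
        if herd:
            majority_time += dt
        t += dt
        x, y = rng.choices(moves, weights=weights)[0]
        counts[x] -= 1
        counts[y] += 1
    return counts, majority_time


if __name__ == "__main__":
    for temp in (0.1, 5.0):
        print(temp, simulate_agents(50, 3, temp, lambda r: r * r, t_max=5.0))
```

Comparing the time during which site 0 keeps a majority for a small and a large temperature gives a rough empirical picture of the two regimes discussed in this section, without proving anything about them.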
The following results are proved at the end of this section. In this first proposition, we assume that is Lipschitz and provide a coarse Ricci curvature’s lower bound that does not depend on .
Proposition 3.1.
Assume that is a Lipschitz function and define the Lipschitz constant of as . Then the coarse Ricci curvature of the particle system described above satisfies
[TABLE]
where the infimum is taken over the probability measures on . Moreover, if is monotone, then
[TABLE]
In the next proposition, we assume that is a non-decreasing strictly convex function and show that, for small values of , the process exhibits a meta-stable state, so that the agents have a herd behavior for large values of : if all the agents start with the same choice , then, during a time of order , for some constant , is favored by the majority of the agents. Note that this is true despite the fact that, during this very same interval of time, the vast majority of the agents have changed their choices multiple times.
Proposition 3.2.
Assume that is a strictly convex function such that , let such that
[TABLE]
and set
[TABLE]
If the temperature is sufficiently small, namely if
[TABLE]
then there exists a positive constant such that, for all ,
[TABLE]
uniformly in and where .
In order to check that in the above result, one simply uses the fact that is strictly convex with , so that, for all , .
Example 5.
Assume that is an affine function: for some and such that . Then is Lipschitz with and for any probability measure on . Hence Proposition 3.1 implies that the Wasserstein curvature of the process is bounded from below by . In particular, it is positive since
[TABLE]
and hence the system of agents does not exhibit a herd behavior.
Example 6.
Assume that . Then and
[TABLE]
Moreover,
[TABLE]
and
[TABLE]
Hence we deduce from Proposition 3.1 and Proposition 3.2 that
- if $T>2-1/\#E$, then the Wasserstein curvature of the particle system is positive (bounded from below by $T-2+1/\#E$) and the system of agents does not exhibit a herd behavior;
- if , then the system of agents exhibits a herd behavior.
Proof of Proposition 3.1.
Fix and . We set and . We assume, without loss of generality, that . If , one has
[TABLE]
If , then
[TABLE]
In both expressions, we have
[TABLE]
and hence, since in the case and in the case , we deduce that
[TABLE]
We deduce that
[TABLE]
Since , this concludes the first part of the proof of Proposition 3.1.
If is non-decreasing (and similarly if is decreasing), then one can replace the inequality (3.1) by (we use the fact that )
[TABLE]
which, as above, allows us to conclude the proof of the second part of Proposition 3.1. ∎
Proof of Proposition 3.2.
The particle system is a mean-field particle system and hence its empirical measure process, defined as
[TABLE]
is a Markov process evolving in the simplex of .
Denote by the solution to the ODE
[TABLE]
with defined as the following operator acting on functions
[TABLE]
Then satisfies the following upper bound large deviation principle proved in [26], where we use the fact that the state space is compact (we also refer the reader to the more recent [27] for more general mean-field interactions with multiple-particle jumps): for any closed set , and all , there exists such that, for all ,
[TABLE]
where if or if is not absolutely continuous, and, otherwise,
[TABLE]
with defined as
[TABLE]
In the following, we choose in the definition of .
Step 1: Our first aim is to prove that there exists a constant such that
[TABLE]
where denotes the Euclidean norm. The main difficulty is that we require to be independent of . Otherwise, the property would be directly obtained from the fact that is strictly convex in its second variable, which is a consequence of [49, Theorem 12.2], as pointed out in [27, Lemma 7.2].
In the case where , we have
[TABLE]
Now, if , then, using the fact that is uniformly bounded over ,
[TABLE]
where is a constant that does not depend on nor . If , then one can choose in order to obtain . If , then one can choose and obtain Finally, we deduce that (3.3) holds true.
Step 2: Our aim is to prove that there exist and such that for every probability measure on such that .
Let be a probability measure on such that for some . We have
[TABLE]
Since is convex with , we have
[TABLE]
Hence
[TABLE]
Now, since the right hand side of the above term is continuous in and strictly positive when (by assumption on ), we conclude that there exist two positive constants and such that the above term is larger than for all . Since one can assume without loss of generality that , this concludes Step 2.
Step 3: Let be such that . Using the results of the previous steps, we show that any function with values in the simplex, such that and such that for some , satisfies
[TABLE]
for some constant .
Consider satisfying the above property. If or if is not absolutely continuous, then and the property is immediate. Otherwise, there exist two times , such that , and for all . In particular, Step 1 entails
[TABLE]
since Step 2 implies that for all . Setting
[TABLE]
and using Cauchy-Schwarz inequality, we obtain
[TABLE]
But
[TABLE]
hence one of the two terms in the left hand side is larger than , so that
[TABLE]
for some constant . This concludes Step 3.
Step 4. We conclude the proof by a classical renewal argument. Using Step 3, the deviation principle (3.2) and using the Markov property, one obtains
[TABLE]
and, for every integer ,
[TABLE]
Hence, for all ,
[TABLE]
Since can be chosen arbitrarily small uniformly in , taking the logarithm allows us to conclude the proof of Proposition 3.2.
∎
4 Application to other models
In this section, we compute lower bounds for the coarse Ricci curvature of several interacting particle systems. In Subsection 4.1, we consider zero range dynamics. In Subsection 4.2, we study the case of Fleming-Viot type systems and some natural extensions. In Subsection 4.3, we consider birth and death processes in mean-field type interaction. Finally, we conclude in Subsection 4.4 with systems of particles whose jump measures admit a density with respect to the Lebesgue measure or the counting measure (we consider exponential laws on , Gaussian measures on or finitely supported discrete measures on ). For the sake of clarity, we choose independent of , but the approach and most computations remain unchanged in the dependent case.
4.1 Zero range dynamics
Let be a finite or countable space equipped with the trivial distance (defined by for all ), let be a stochastic matrix, and consider a particle system whose infinitesimal generator is given by
[TABLE]
where, for all , is a non-negative function. The particles of this system jump with respect to the transition probability at a rate .
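The dynamics described above can be illustrated by a single step of a Gillespie-type simulation. This is only a minimal sketch under stated assumptions: the names `gillespie_step`, `config`, `rate` and `p` are hypothetical, `rate(x, config)` stands for the jump rate of a particle located at site `x` when the system is in configuration `config`, and `p[x][y]` stands for the entries of the stochastic matrix. It is not taken from the paper.

```python
import random


def gillespie_step(config, rate, p, rng):
    """Simulate one jump of the particle system (illustrative sketch).

    config : list of sites; config[i] is the position of particle i
    rate   : function (x, config) -> non-negative jump rate of a particle at x
             (hypothetical stand-in for the rate function of the generator)
    p      : dict of dicts; p[x][y] is the probability of jumping from x to y
    rng    : random.Random instance

    Returns (holding_time, new_config). The holding time is infinite when
    every particle has jump rate zero.
    """
    rates = [rate(x, config) for x in config]
    total = sum(rates)
    if total <= 0.0:
        return float("inf"), list(config)

    # Time until the next jump: exponential with parameter the total rate.
    holding_time = rng.expovariate(total)

    # The jumping particle is chosen proportionally to its own rate.
    i = rng.choices(range(len(config)), weights=rates)[0]

    # Its new position is drawn from the row p[x] of the stochastic matrix.
    targets = list(p[config[i]].keys())
    probs = list(p[config[i]].values())
    new_site = rng.choices(targets, weights=probs)[0]

    new_config = list(config)
    new_config[i] = new_site
    return holding_time, new_config
```

For instance, a rate that depends only on the number of particles sharing the site of the jumping particle, such as `lambda x, cfg: 1.0 + 0.5 * cfg.count(x)`, gives a zero range type example on a small state space.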
In the following corollary, for all , is the coarse Ricci curvature (in discrete time) of the transition probability matrix along , in the sense of [45, Definition 3]:
[TABLE]
In our case, is the trivial distance, so that is equal to . We refer the reader to [45, 46], where the author provides general properties and explicit bounds of for several choices of . In the following result, with a slight abuse of notation, we write if is a coordinate of .
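As an illustration of the trivial distance case, the sketch below computes the discrete-time coarse Ricci curvature of a finite stochastic matrix along a pair of distinct states. It relies on two standard facts: the curvature in the sense of Ollivier is one minus the Wasserstein distance between the two transition laws divided by the distance between the states, and for the trivial distance the Wasserstein distance coincides with the total variation distance. The function names and the example matrix are hypothetical.

```python
def total_variation(mu, nu):
    """Total variation distance between two probability vectors
    indexed by the same finite state space."""
    return 0.5 * sum(abs(a - b) for a, b in zip(mu, nu))


def coarse_ricci_trivial(P, x, y):
    """Coarse Ricci curvature of the stochastic matrix P along (x, y),
    for the trivial distance d(x, y) = 1 if x != y.

    P is a list of rows; P[x] is the transition law from state x.
    """
    if x == y:
        raise ValueError("the curvature is only defined for x != y")
    # d(x, y) = 1, so the curvature is 1 - W_d(P[x], P[y]) = 1 - TV(P[x], P[y]).
    return 1.0 - total_variation(P[x], P[y])


# Hypothetical example: lazy nearest-neighbour walk on three states.
P = [
    [0.5, 0.5, 0.0],
    [0.25, 0.5, 0.25],
    [0.0, 0.5, 0.5],
]
kappa_01 = coarse_ricci_trivial(P, 0, 1)
kappa_02 = coarse_ricci_trivial(P, 0, 2)
```

Taking the infimum of these values over all pairs of distinct states gives a global curvature lower bound for the transition matrix.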
Corollary 4.1.
The coarse Ricci curvature of the particle system with infinitesimal generator given by (4.1) satisfies
[TABLE]
where $\|c_{x}\|_{Lip}:=\sup_{\bar{x}\neq\bar{y}\in E^{N}\text{ s.t. }x\in\bar{x}\text{ and }x\in\bar{y}}\frac{|c_{x}(\bar{x})-c_{x}(\bar{y})|}{d(\bar{x},\bar{y})}$.
One says that the infinitesimal generator (4.1) defines a zero range dynamic if the jump rate and the jump distribution of one particle do not depend on the position of the particles located at other sites. This means that, for each , there exists a non-negative function , where , such that the infinitesimal generator of the particle system on is given by
[TABLE]
where denotes the number of components of equal to . In this situation and when is the trivial distance on , the bound obtained in the above corollary can be refined, as stated in the following result.
Corollary 4.2.
The coarse Ricci curvature of the zero range particle system with infinitesimal generator given by (4.3) satisfies
[TABLE]
An interesting feature of a zero range dynamic is that the empirical measure of the process is a measure valued Markov process, whose coarse Ricci curvature is bounded from below by the coarse Ricci curvature of the dynamic of the full particle system in . In [7, Section 4], the authors study the mixing properties of the empirical measure dynamic, with the assumption that is finite, that is the uniform measure on for all and that there exist such that, for all ,
[TABLE]
(beware that the rate denoted by in [7] is the jump rate for particles, and hence it corresponds to in our settings). Under this set of assumptions, they obtain a modified logarithmic Sobolev inequality with rate , which provides a lower bound for the rate of exponential convergence to equilibrium of the process, in the relative entropy sense. In [4], the authors prove that the spectral gap for the empirical measure process is lower bounded by under weaker assumptions (namely with ).
Under this particular set of assumptions and considering equal to the trivial distance on , one has , , and for all . Hence Corollary 4.2 implies that is a lower bound for the coarse Ricci curvature of the particle system.
As expected, we obtain a weaker lower bound for the rate of convergence to the equilibrium of this zero range dynamic than in [7], since we consider the dynamic of the full particle system instead of its empirical measure. However, it is interesting to note that both bounds share a similar structure. Note also that, contrary to [7], we do not require the functions to be non-decreasing and hence provide a new result for the convergence rate to equilibrium of such zero range dynamics (both for the full particle system and for the empirical measure). Of course, one expects that the actual rate of convergence to equilibrium for the empirical measure is higher than the one we found.
Finally, one interesting aspect of our result on convergence is that we do not require that the process is reversible, allowing various choices of .
Remark 6.
The entropy Ricci curvature of zero range dynamic models has also been studied by Fathi and Maas in [30, Section 4.2]. Under the assumptions of [7, Section 4] and assuming that , the authors prove that the entropy Ricci curvature is bounded from below by and hence that this system satisfies a gradient flow structure with positive curvature when .
Proof of Corollary 4.1.
With the notation of Theorem 2.1, the jump measures of this interacting particle system are
[TABLE]
Using properties (2.5) and (2.4), we obtain for all and such that
[TABLE]
As a consequence, for all ,
[TABLE]
This and Theorem 2.1 allow us to conclude the proof. ∎
Proof of Corollary 4.2.
The same calculations as above up to (4.4) lead to
[TABLE]
But , so that
[TABLE]
Observing that , one can use Theorem 2.1 to conclude the proof. ∎
4.2 Some simple variants of Fleming-Viot type systems
Assume that the distance is bounded by over . We consider the situation where there exist a measurable function and a Markovian kernel such that
[TABLE]
where is the empirical distribution of . In contrast to the zero range dynamics of the previous subsection, the jump rate of a particle only depends on its position, while its jump measure depends on the position of the whole system.
Note that, in the case where for all , we recover the Fleming-Viot type system introduced in [5, 23] and whose coarse Ricci curvature with respect to the trivial distance has been studied in [20] (see also [2, 23, 32, 6, 59] for general properties).
Setting , where is the discrete time coarse Ricci curvature of (see Subsection 4.1), we have, for all such that ,
[TABLE]
This and Theorem 2.1 entail the following corollary.
Corollary 4.3.
The coarse Ricci curvature of the particle system defined by (4.5) satisfies
[TABLE]
Remark 7.
One could also consider the infinitesimal generator
[TABLE]
and prove that
[TABLE]
for any coupling of . This is also true if is not a pure jump infinitesimal generator (see an application in Example 9 below).
Example 7.
If is the trivial distance , then we obtain
[TABLE]
Note that, in several cases, $J_{d}^{x,y}(q(x,\cdot),q(y,\cdot))$ can be bounded from above using the results of Example 2. In particular, if (so that ) and is a discrete state space, one gets
[TABLE]
and hence recovers [20, Theorem 1.1, Remark 2.4].
Example 8.
We assume that $E=\mathbb{N}^{0}$, that for some and that is the jump kernel of a birth and death process with birth and death rates given respectively by $(b_{x})_{x\in\mathbb{N}^{0}}$ and $(d_{x})_{x\in\mathbb{N}^{0}}$ (), that is
[TABLE]
We also assume that the process comes down from infinity, which means that $\sup_{x\in\mathbb{N}^{0}}\mathbb{E}_{x}(T_{0})<\infty$, where is the first hitting time of [math] for the birth and death process. This is equivalent to
[TABLE]
with (see for instance [54]).
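The finiteness of the supremum of the expected hitting times can be explored numerically. The following sketch assumes that the target state is 0 and uses the classical first-step recursion for birth and death processes, $d_n\,\mathbb{E}_n(T_{n-1}) = 1 + b_n\,\mathbb{E}_{n+1}(T_{n})$, which gives $\mathbb{E}_n(T_{n-1}) = \frac{1}{d_n}\bigl(1+\sum_{j\geq 1}\prod_{i=0}^{j-1}\frac{b_{n+i}}{d_{n+i+1}}\bigr)$. The function names and the truncation parameter `tail` are hypothetical, and a truncated sum only gives numerical evidence, not a proof of convergence or divergence.

```python
import math


def expected_descent_times(b, d, n_max, tail=2000):
    """Approximate E_n(T_{n-1}) for n = 1, ..., n_max.

    b, d : functions giving the birth and death rates at state n >= 1
    tail : number of terms kept in the inner series (truncation, an assumption)
    """
    times = []
    for n in range(1, n_max + 1):
        term = 1.0
        series = 1.0
        for j in range(tail):
            # Multiply by b(n + j) / d(n + j + 1) to get the next product.
            term *= b(n + j) / d(n + j + 1)
            if term == 0.0:
                break
            series += term
        times.append(series / d(n))
    return times


def partial_hitting_time(b, d, n_max, tail=2000):
    """Approximate E_{n_max}(T_0) as the sum of the expected descent times."""
    return sum(expected_descent_times(b, d, n_max, tail))


# Pure death process with quadratic death rate: the partial sums stay bounded.
quadratic = partial_hitting_time(lambda n: 0.0, lambda n: float(n * n), 1000)

# Pure death process with linear death rate: the partial sums keep growing.
linear_1000 = partial_hitting_time(lambda n: 0.0, lambda n: float(n), 1000)
linear_2000 = partial_hitting_time(lambda n: 0.0, lambda n: float(n), 2000)
```

In the quadratic case the partial sums approach $\pi^{2}/6$, which is consistent with coming down from infinity, whereas in the linear case they grow like a harmonic sum.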
In this case, there exist a bounded function $\eta:\mathbb{N}^{0}\rightarrow\mathbb{R}_{+}$ and a constant such that for all ( and are the first eigenvalue and the corresponding eigenfunction for the infinitesimal generator of the birth and death process killed when it reaches [math], see [12], where the definition of clearly implies that it is increasing and bounded for birth and death processes coming down from infinity).
Let us choose the geodesic distance on $\mathbb{N}^{0}$ defined by
[TABLE]
and deduce from the computations of Example 3 that the coarse Ricci curvature of the particle system satisfies
[TABLE]
Consider now the Fleming-Viot type system case, i.e. . In this case, we have , so that
[TABLE]
In particular, since this bound does not depend on and because of the convergence result of [59], one can deduce that, if , then the coarse Ricci curvature is positive, uniformly in . As a consequence, a birth and death process with birth and death rates $(b_{x})_{x\in\mathbb{N}^{0}}$ and $(d_{x})_{x\in\mathbb{N}^{0}}$ and absorption rate converges exponentially fast toward its unique quasi-stationary distribution, conditionally on non-absorption (the details are the same as in [20], where the total variation norm case is considered).
Example 9.
In this example, we consider a piecewise deterministic Markov process (PDMP) evolving in (see [22] for a reference on PDMPs), with generator
[TABLE]
and the distance
[TABLE]
Each particle in this process evolves following the deterministic dynamic and undergoes jumps of size at rate , and jumps with respect to at rate .
Consider the pure jump part of , defined by
[TABLE]
In the setting of Section 2, this corresponds to the jump measures . Hence Theorem 2.1 provides a coupling operator for which satisfies (following the above calculations),
[TABLE]
Then, considering the coupling operator for defined by
[TABLE]
we deduce that
[TABLE]
which entails that
[TABLE]
Note that, if is small enough and smooth enough, this provides a positive lower bound for the coarse Ricci curvature, which does not depend on . In particular, applying this result to the Fleming-Viot type case and using the convergence result of [59], i.e. , letting and interpreting as a killing rate, one easily obtains new contraction results in for the conditional distribution of this PDMP and also new existence/uniqueness results for the quasi-stationary distribution of this PDMP.
4.3 Birth and death processes in mean field type interaction
In [52], the author studies, among other things, the coarse Ricci curvature of a system of particles evolving as birth and death processes whose birth and death rates depend on the norm of the whole system, with . As in the cited article, we use the notation for the death rate and for the birth rate ( and are allowed to depend on the position of the whole system in our case). Using the notation of Theorem 2.1, this means that
[TABLE]
The same calculation as in Example 3 (with for all ) shows that, for all $x,y\in\mathbb{N}^{0}$ and $\bar{x},\bar{y}\in(\mathbb{N}^{0})^{N}$, we have, if and respectively,
[TABLE]
Hence, if there exist some constants and such that
[TABLE]
and such that
[TABLE]
then, by Theorem 2.1, the coarse Ricci curvature of the particle system satisfies
[TABLE]
In the particular case of the assumptions and notation of [52, Theorem 1.1], we can take and , so that and we recover the result of the cited paper. Note that we did not need to explicitly describe a coupling in order to obtain this bound and to slightly relax the assumptions of [52]. Also, this approach can be easily extended to other processes as in Example 4 for instance.
4.4 System of particles with absolutely continuous jump measures
In this section, we assume that , , endowed with the Euclidean distance and we assume that there exist a probability measure and two measurable functions and such that, for all and , is the density of a probability measure with respect to and such that
[TABLE]
or equivalently that
[TABLE]
For the sake of clarity, we assume that does not depend on (and we will set in the rest of this subsection). However, most of the calculations considered in this section can be worked out in the general case.
The following lemma will be used together with Theorem 2.1 in order to compute a lower bound for the coarse Ricci curvature of such interacting particle systems. This is particularly interesting if one knows how to find bounds for the first moment of any probability of type and for the Wasserstein distance between any probability distributions of the same type. This is the case for instance if the are exponential laws (see Example 10), Gaussian measures (see Example 11) or finitely supported discrete measures on (see Example 12).
Lemma 4.4.
Under the above settings, we have, for all such that ,
[TABLE]
Proof.
For all such that , we obtain from (2.6), (2.5) and (2.4) that
[TABLE]
On the one hand, we have
[TABLE]
and, on the other hand,
[TABLE]
This concludes the proof of Lemma 4.4. ∎
Remark 8.
Theorem 2.1 used in conjunction with Lemma 4.4 can only provide non-positive lower bounds for the coarse Ricci curvature of the particle system. However, one can use such results to recover positive lower bounds in the case of the perturbation of a system of particles with known positive lower bound. More precisely, if an infinitesimal generator can be written , where is known to have a positive curvature (obtained using a coupling generator ) and where one gets a non-positive lower bound on the curvature of (obtained using the above results and hence using a coupling operator ), then one deduces, using the coupling operator , that, for all , has a positive curvature (this idea can typically be applied in the context of Remark 4 and Example 1).
Remark 9.
Lemma 4.4 is general but usually not sharp, since we used a crude upper bound in (4.6) and (4.7). For instance, the case studied in Subsection 4.3 falls within the setting of Lemma 4.4, but we obtain a better bound using a precise computation of the Wasserstein distance between measures with only three atoms. However, in the general case, the computation of the Wasserstein distance between two discrete probability measures with finite support is a difficult task.
Example 10.
In this example, we consider a process evolving in with exponential jump measures (in particular, the jumps are almost surely positive). More precisely, we assume that , where is a positive measurable function of and . We also assume that and are anti-monotone (the larger , the smaller ).
Using [53], we obtain
[TABLE]
We also refer the reader to [38, Examples 3.8 and 3.9] for the generalization of this result to the canonical regular exponential family and to Gamma distributions respectively.
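For exponential laws on the half-line, the Wasserstein distance has a simple closed form that can be checked numerically. The sketch below assumes that the laws are parametrized by their rates (the parametrization used in the example above may differ) and that the formula borrowed from [53] is the classical one-dimensional identity $W_1(\mu,\nu)=\int|F_\mu(x)-F_\nu(x)|\,dx$. Since the two cumulative distribution functions are ordered, the integral reduces to the difference of the means. All function names are hypothetical.

```python
import math


def w1_exponential(rate_a, rate_b):
    """Closed-form W1 distance between Exp(rate_a) and Exp(rate_b):
    the absolute difference of the means 1/rate_a and 1/rate_b."""
    return abs(1.0 / rate_a - 1.0 / rate_b)


def w1_from_cdfs(cdf_f, cdf_g, upper, steps=200000):
    """Midpoint-rule approximation of the integral of |F - G| over [0, upper]."""
    h = upper / steps
    return h * sum(
        abs(cdf_f((k + 0.5) * h) - cdf_g((k + 0.5) * h)) for k in range(steps)
    )


def exponential_cdf(rate):
    """Cumulative distribution function of the exponential law with given rate."""
    return lambda x: 1.0 - math.exp(-rate * x)


closed_form = w1_exponential(1.0, 2.0)
numerical = w1_from_cdfs(exponential_cdf(1.0), exponential_cdf(2.0), upper=60.0)
```

The two values agree up to the quadrature error, which illustrates why the bound in this example only involves the variation of the parameter of the exponential law.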
We deduce from Lemma 4.4 that, if , then
[TABLE]
We deduce from Theorem 2.1 that the coarse Ricci curvature of the particle system satisfies
[TABLE]
where is the Lipschitz norm of the function .
Example 11.
We consider a process evolving in with Gaussian jump measures. More precisely, we assume that is the law of a centered Gaussian vector with covariance matrix . For simplicity, we assume that the matrices , all belong to the same commutative family of matrices.
In this case, the -Wasserstein distance between the probability measures and is bounded from above (see [31, 36, 50, 51] and [9] for a pedagogical account) by
[TABLE]
In particular, since the distance dominates the distance (this is an easy application of Hölder’s inequality) and using the commutation of the product , we deduce from Lemma 4.4 that, if , then
[TABLE]
where is the Frobenius norm of a matrix . Hence, Theorem 2.1 entails
[TABLE]
where and are respectively the supremum norm and the Lipschitz norm of , is the Lipschitz norm of and is the supremum norm of the function
[TABLE]
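The role of the commutation assumption in Example 11 can be illustrated in the simplest commuting family, namely diagonal covariance matrices. The sketch below uses the standard closed form for the 2-Wasserstein distance between centered Gaussian laws, $W_2^2 = \operatorname{tr}(A)+\operatorname{tr}(B)-2\operatorname{tr}\bigl((A^{1/2}BA^{1/2})^{1/2}\bigr)$, which reduces to the squared Frobenius norm of $A^{1/2}-B^{1/2}$ when $A$ and $B$ commute. Whether this is exactly the expression used in the example is an assumption, and the function names are hypothetical.

```python
import math


def w2_gaussian_diag(var_a, var_b):
    """W2 distance between N(0, diag(var_a)) and N(0, diag(var_b)),
    computed as the Frobenius norm of the difference of the square roots."""
    return math.sqrt(
        sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(var_a, var_b))
    )


def w2_gaussian_diag_trace_form(var_a, var_b):
    """Same distance computed from the general trace formula,
    specialised to diagonal (hence commuting) covariance matrices."""
    squared = (
        sum(var_a)
        + sum(var_b)
        - 2.0 * sum(math.sqrt(a * b) for a, b in zip(var_a, var_b))
    )
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(squared, 0.0))


frobenius_value = w2_gaussian_diag([1.0, 4.0], [4.0, 9.0])
trace_value = w2_gaussian_diag_trace_form([1.0, 4.0], [4.0, 9.0])
```

Both functions return the same value here, which is the reason why the Frobenius norm of the difference of the square roots of the covariance matrices appears in the bound.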
Example 12.
Let and assume that is a discrete, finitely supported probability measure. More precisely, we assume that there exists such that
[TABLE]
The cumulative distribution function of this measure is
[TABLE]
Hence, using [53], we obtain
[TABLE]
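The computation through cumulative distribution functions can be sketched for finitely supported measures on the real line. The code assumes that the formula taken from [53] is the classical one-dimensional identity $W_1(\mu,\nu)=\int|F_\mu(x)-F_\nu(x)|\,dx$; the function name `w1_discrete` and the representation of measures as dictionaries mapping atoms to masses are hypothetical.

```python
def w1_discrete(atoms_mu, atoms_nu):
    """W1 distance between two finitely supported probability measures on R.

    atoms_mu, atoms_nu : dicts {location: mass}, each summing to one.
    The distance is the integral of |F_mu - F_nu|, where both cumulative
    distribution functions are piecewise constant between consecutive atoms.
    """
    points = sorted(set(atoms_mu) | set(atoms_nu))
    cdf_mu = 0.0
    cdf_nu = 0.0
    total = 0.0
    for left, right in zip(points, points[1:]):
        # Values of the two distribution functions on [left, right).
        cdf_mu += atoms_mu.get(left, 0.0)
        cdf_nu += atoms_nu.get(left, 0.0)
        total += abs(cdf_mu - cdf_nu) * (right - left)
    return total


# Moving a mass 1/2 from location 1 to location 2 costs 1/2.
example_cost = w1_discrete({0: 0.5, 1: 0.5}, {0: 0.5, 2: 0.5})
```

Because the integrand is piecewise constant, the integral is an exact finite sum over the gaps between consecutive atoms, which is what makes the one-dimensional case tractable.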
We deduce from Lemma 4.4 that, if , then
[TABLE]
Theorem 2.1 implies that
[TABLE]
where is the first absolute moment of and where is the Lipschitz norm of the function
[TABLE]
with endowed with the norm .
References
- [1] A. Alfonsi, J. Corbetta, and B. Jourdain. Evolution of the Wasserstein distance between the marginals of two Markov processes. ArXiv e-prints, June 2016. To appear in Bernoulli Journal.
- [2] A. Asselah, P. A. Ferrari, and P. Groisman. Quasistationary distributions and Fleming-Viot processes in finite spaces. J. Appl. Probab., 48(2):322–332, 2011.
- [3] P. Billingsley. Convergence of probability measures. Wiley Series in Probability and Statistics: Probability and Statistics. John Wiley & Sons, Inc., New York, second edition, 1999. A Wiley-Interscience Publication.
- [4] A.-S. Boudou, P. Caputo, P. Dai Pra, and G. Posta. Spectral gap estimates for interacting particle systems via a Bochner-type identity. J. Funct. Anal., 232(1):222–258, 2006.
- [5] K. Burdzy, R. Hołyst, D. Ingerman, and P. March. Configurational transition in a Fleming-Viot-type model and probabilistic interpretation of Laplacian eigenfunctions. J. Phys. A, 29(29):2633–2642, 1996.
- [6] K. Burdzy, R. Hołyst, and P. March. A Fleming-Viot particle representation of the Dirichlet Laplacian. Comm. Math. Phys., 214(3):679–703, 2000.
- [7] P. Caputo, P. Dai Pra, and G. Posta. Convex entropy decay via the Bochner-Bakry-Emery approach. Ann. Inst. H. Poincaré Probab. Statist., 45(3):734–753, 2009.
- [8] P. Cattiaux and A. Guillin. Semi log-concave Markov diffusions. In Séminaire de Probabilités XLVI, volume 2123 of Lecture Notes in Math., pages 231–292. Springer, Cham, 2014.
