Entropic curvature and convergence to equilibrium for mean-field dynamics on discrete spaces
Matthias Erbar, Max Fathi, André Schlichting

TL;DR
This paper introduces a notion of entropic curvature for mean-field dynamics on discrete spaces, linking curvature bounds to convergence rates and establishing explicit bounds for classical models.
Contribution
It extends the concept of curvature bounds from linear Markov chains to non-linear mean-field dynamics, providing new tools for analyzing convergence to equilibrium.
Findings
Positive curvature bounds imply functional inequalities for convergence.
Explicit curvature bounds are derived for classical statistical mechanics models.
The framework generalizes existing curvature notions to non-linear mean-field systems.
Abstract
We consider non-linear evolution equations arising from mean-field limits of particle systems on discrete spaces. We investigate a notion of curvature bounds for these dynamics based on convexity of the free energy along interpolations in a discrete transportation distance related to the gradient flow structure of the dynamics. This notion extends the one for linear Markov chain dynamics studied by Erbar and Maas. We show that positive curvature bounds entail several functional inequalities controlling the convergence to equilibrium of the dynamics. We establish explicit curvature bounds for several examples of mean-field limits of various classical models from statistical mechanics.
Matthias Erbar
Institut für Angewandte Mathematik, Endenicher Allee 60, Universität Bonn
,
Max Fathi
CNRS & Institut de Mathématiques de Toulouse
Université de Toulouse
118 route de Narbonne, 31062 Toulouse, France
and
André Schlichting
Institut für Angewandte Mathematik, Endenicher Allee 60, Universität Bonn
1. Introduction
This work is about long-time behavior for mean-field systems on discrete spaces. Mean-field equations describe the large-scale limit of interacting particle systems where the total force exerted on any given particle is the average of the forces exerted by all other particles on the tagged particle. They are used to describe collective behavior in many areas of science. Examples include the modeling of granular flows in physics [2] and collective behavior and self-organization for groups of animals [16, 11]. We refer to [36] for an introduction to the mathematical theory.
One of the important questions in the mathematical analysis of these equations is their long-time behavior. In [9], Carrillo, McCann and Villani obtained quantitative bounds on the rate of convergence to equilibrium for McKean-Vlasov equations in a continuous setting of the form
[TABLE]
under strong convexity assumptions on the potentials , and . The core idea underlying their method was the fact that the PDE has a gradient flow structure, i.e. it can be recast as a gradient descent equation of the free energy functional in the space of probability measures with respect to the Kantorovitch-Wasserstein distance , which has a formal Riemannian description via Otto calculus [32, 33]. Such structures are useful in the study of long-time behavior because, as soon as the driving functional satisfies some uniform convexity property (with respect to the particular metric structure), it must decay exponentially fast towards its minimal value along solutions of the evolution equation. Moreover, we can use convexity to derive strong functional inequalities relating the distance, the entropy functional and the entropy dissipation functional [33].
1.1. Setup and main results
Our main motivation here is to adapt the approach of [9] to mean-field equations in a discrete setting. We consider discrete mean-field dynamics of the form
[TABLE]
where is a flow of probability measures on a finite set and is a parametrized collection of Markov kernels. These dynamics naturally arise as scaling limits of interacting particle systems on graphs where the interaction only depends on the normalized empirical measure of the system (which indeed corresponds to mean-field interactions). They generalize linear Markov chains on discrete spaces, which correspond to the case where is a constant Markov kernel, independent of .
While the Wasserstein gradient flow approach works well on continuous spaces, it fails in the discrete setting, since the Wasserstein -transport distance does not admit any non-trivial absolutely continuous curve. In our previous work [19], we derived a gradient flow structure for (1.1) by replacing the role of the Wasserstein distance with a distance constructed via a suitable modification of the Benamou-Brenier formula for optimal transport, extending similar earlier results for linear reversible Markov chains obtained in [29, 31, 13]. Under the condition that the rates are Gibbs with a potential (see Assumption 2.1), i.e. is reversible with respect to a local Gibbs measure of the form $\pi_{x}(\mu)=Z(\mu)^{-1}\exp\bigl(-H_{x}(\mu)\bigr)$, with given in terms of the potential , we showed that this dynamic is the gradient flow of the free energy functional
[TABLE]
with respect to the distance on the simplex of probability measures on , see Proposition 2.2. This built on previous works [4, 5] that showed that is indeed a Lyapunov functional for the flow. An archetypical example, which we shall discuss in some detail later on, is the classical Curie-Weiss model, which corresponds to a mean-field dynamic on a two-point space. Already this easy model exhibits interesting behavior, such as a phase transition at an explicit critical value of a temperature parameter.
In the present work, we exploit this gradient flow structure to analyze the long-term behaviour of (1.1), inspired by the approach in [9], by investigating convexity properties of the free energy along discrete optimal transport paths for a non-linear Markov triple as above. Following the works of Lott, Sturm and Villani [28, 35] for metric measure spaces and [21, 31] for linear Markov chains, we make the following
Definition 1.1 (Entropic Ricci curvature lower bound).
We say that has Ricci curvature bounded below by (for short ) if for any -geodesic :
[TABLE]
We will show, see Theorem 3.7, that Ricci curvature lower bounds can be characterized in terms of a discrete Bochner-type inequality by deriving the Hessian of in the Riemannian structure , as well as in terms of the Evolution Variational inequality EVIκ for the solutions to (1.1):
[TABLE]
Further, we will show that a positive lower bound on the Ricci curvature entails a number of functional inequalities that control the convergence to equilibrium of the mean-field systems. These involve a discrete Fisher information functional given by
[TABLE]
which arises from the dissipation of along solutions to (1.1) as . One of our main results is the following theorem which can be seen as a discrete analog of [9, Thm. 2.1].
Theorem 1.2.
Assume that for some . Then the following hold:
- (i) there exists a unique stationary point for the evolution (1.1), it is the unique minimizer of . Let ;
- (ii) the modified logarithmic Sobolev inequality with constant holds, i.e. for all ,
[TABLE]
- (iii) for any solution to (1.1) we have exponential decay of the free energy:
[TABLE]
- (iv) the entropy-transport inequality with constant holds, i.e. for all ,
[TABLE]
1.2. Examples
We will establish explicit curvature bounds for several examples of (relatively simple) mean-field dynamics, such as the Curie-Weiss model, zero-range mean-field dynamics and misanthrope processes. We compute a formula for the second derivative of entropy along geodesics, and generalize techniques developed in [22, 20] to the present non-linear situation in order to bound this second derivative. The nonlinearity of the dynamic gives rise to several extra terms when computing the Hessian of the free energy functional, which complicates the analysis.
In the case of the Curie–Weiss model, we will show that a positive lower curvature bound holds down to the critical temperature, see Section 5.1.
Another particular family of dynamics we shall be interested in is when the flux of particles from some site to a site is a function of the particle density at site , that is . In the situation where is constant, this would correspond to the scaling limit of independent particles on the complete graph. As in [22, 20], our approach is in some sense perturbative in nature, and we shall consider rates of the form , and show that if is not too large in some sense, relative to , then we can derive a rate of convergence to equilibrium. This is inspired by recent work of Villemonais [39], who proved that the particle system has a positive Ollivier-Ricci (or coarse Ricci) curvature (another notion of curvature, corresponding to a contraction rate for the Markovian dynamic) independently of the system size, and hence converges to equilibrium in distance, via a uniform estimate on the Poincaré constant of the dynamic. Our approach has the advantage of yielding rates of convergence in relative entropy via Theorem 1.2, which is a strictly stronger notion of convergence.
1.3. Connection to the literature
The approach of [9] was later extended to other potentials [10, 3]. Other approaches developed later include using uniform convergence estimates for a stochastic particle approximation [12] and coupling arguments [17, 18]. Without convexity, deriving rates of convergence can be quite delicate, since there may be multiple equilibria [38], unlike what happens for linear diffusions.
Our approach developed here builds on earlier work [29, 31, 13] constructing gradient flow structures for linear Markov chain dynamics and [21] studying Ricci curvature and its impact on functional inequalities in this context. It is also related to, but different from, the one developed in [26], which uses convexity of the entropy along a different type of paths, the so-called entropic interpolations, rather than geodesic paths, to establish functional inequalities involving relative entropy. In the continuous setting, entropic interpolations are regularizations of geodesics in Wasserstein space, but in the discrete case it seems that the entropic interpolations of [26] are related to a gradient flow structure different from the one of [19] we use here.
Organization
The plan of the paper is as follows. Section 2 introduces the mathematical framework we shall work in. Section 3 introduces the notion of curvature bounds in our setting, and contains the computation of the Hessian for general dynamics. Section 4 investigates the consequences of Ricci curvature bounds in terms of functional inequalities and convergence to equilibrium for the nonlinear dynamics. Finally, Section 5 investigates curvature bounds for several examples of mean-field dynamics inspired by classical models of statistical physics.
2. Setup
2.1. Gradient-flow formulation
The main definitions and results from [19] on which this work will build are collected in this section. The gradient flow structure of (1.1) is based on the existence of a suitable potential, which is ensured by the following constraint, which we shall assume to hold throughout the article. We recall that a rate matrix of a Markov chain in the continuous time setting satisfies
[TABLE]
Assumption 2.1.
Let be such that for each is twice continuously differentiable. Let be a family of rate matrices that is Gibbs with respect to the potential function , i.e. for each , is the rate matrix of an irreducible, reversible ergodic Markov chain with respect to the probability measure
[TABLE]
with
[TABLE]
In particular satisfies the detailed balance condition with respect to , that is for all
[TABLE]
holds. Moreover, we assume that for each the map is Lipschitz continuous over .
We will refer to the triple as above for short as a non-linear Markov triple.
The specific form of (2.1) with (2.2) emerges from the detailed balance condition of an underlying -particle system, from which the dynamics we are interested in arise in the limit (see [19]). Associated to a non-linear Markov triple is the non-linear master equation
[TABLE]
which is the deterministic evolution equation describing the mean-field limit of the underlying particle system. Based on the above assumption a gradient flow formulation of (2.4) is established in [19, Proposition 2.13] as we shall briefly recall.
Consider the Onsager operator given by
[TABLE]
where is the logarithmic mean. Then the master equation can be written in gradient flow form using the functional from (1.2):
[TABLE]
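The logarithmic mean entering the Onsager operator can be evaluated numerically as in the following minimal Python sketch. It assumes the standard definition Λ(a, b) = (a − b)/(log a − log b) for distinct positive arguments, extended by Λ(a, a) = a and Λ(a, b) = 0 when one argument vanishes. The function name `log_mean` is ours and does not appear in the original text.

```python
import math


def log_mean(a: float, b: float) -> float:
    """Logarithmic mean (a - b) / (log a - log b), extended continuously.

    Assumed conventions: log_mean(a, a) = a, and the value is 0 as soon as
    one of the two arguments is 0.
    """
    if a < 0.0 or b < 0.0:
        raise ValueError("arguments must be non-negative")
    if a == 0.0 or b == 0.0:
        return 0.0
    d = a - b
    if d == 0.0:
        return a
    # log a - log b = log1p((a - b) / b); log1p stays accurate when a is close to b.
    return d / math.log1p(d / b)


if __name__ == "__main__":
    print(log_mean(1.0, math.e))  # equals e - 1, since log e - log 1 = 1
    print(log_mean(1.0, 4.0))     # lies between the geometric and arithmetic means
```

The logarithmic mean is symmetric and lies between the geometric and the arithmetic mean of its arguments, which the sketch can be used to check on examples.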
In other words, (2.4) is the gradient flow of with respect to the Riemannian structure on induced by the metric tensor . Since this Riemannian metric degenerates at the boundary of we note the following characterization in metric terms. We consider the distance function on that is formally induced by the Riemannian metric , i.e. for we set
[TABLE]
where CE is the set of curves with continuous, measurable and integrable in time, and satisfying the continuity equation
[TABLE]
in distribution sense, and the action functional is given by
[TABLE]
Proposition 2.2 (Gradient flow structure of the mean-field system).
Let be a non-linear Markov triple satisfying Assumption 2.1. Then any solution to (2.4) is a gradient flow of with respect to the distance .
The distance and the above gradient flow structure are extensions of the discrete transport distance constructed in [29] and the gradient flow structure of linear Markov chains to the non-linear case. See [19, Section 2.3] for more background on the construction of the distance .
An immediate consequence of the gradient flow formulation (2.5) is the free energy dissipation relation established in [19, Remark 2.14]:
[TABLE]
Here, the discrete Fisher information or dissipation is defined by
[TABLE]
with defined by . In this framework, the Fisher information can be reinterpreted as the squared modulus of the gradient of the entropy with respect to the discrete transport metric , i.e. we have
[TABLE]
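The dissipation relation can be checked numerically on a small example. The Python sketch below is illustrative only and rests on several assumptions that are not taken from the text: a quadratic potential U(μ) = ½ μ·Wμ with a symmetric matrix `W` chosen arbitrarily, so that H(μ) = Wμ; rates Q_xy(μ) = sqrt(π_y(μ)/π_x(μ)) on the complete graph, which satisfy detailed balance with respect to π(μ); and the Fisher information written as ½ Σ Θ(μ_x Q_xy, μ_y Q_yx) with Θ(a, b) = (a − b)(log a − log b). All function names are hypothetical.

```python
import numpy as np


def rates(mu, W):
    """Assumed Gibbs rates: Q[x, y] = sqrt(pi_y / pi_x) with pi_x ~ exp(-H_x), H = W @ mu."""
    H = W @ mu
    Q = np.exp(-0.5 * (H[None, :] - H[:, None]))
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows of a rate matrix sum to zero
    return Q


def free_energy(mu, W):
    """F(mu) = sum_x mu_x log mu_x + U(mu), with the assumed U(mu) = mu.W.mu / 2."""
    return float(np.sum(mu * np.log(mu)) + 0.5 * mu @ W @ mu)


def fisher_information(mu, W):
    """I(mu) = 1/2 sum_{x != y} Theta(mu_x Q_xy, mu_y Q_yx), Theta(a, b) = (a - b)(log a - log b)."""
    flux = mu[:, None] * rates(mu, W)  # flux[x, y] = mu_x Q_xy
    np.fill_diagonal(flux, 1.0)        # diagonal then contributes Theta(1, 1) = 0
    a, b = flux, flux.T
    return 0.5 * float(np.sum((a - b) * (np.log(a) - np.log(b))))


# Hypothetical symmetric interaction matrix and an interior probability vector.
W = np.array([[0.0, 0.3, -0.2], [0.3, 0.0, 0.5], [-0.2, 0.5, 0.0]])
mu = np.array([0.5, 0.3, 0.2])

velocity = mu @ rates(mu, W)  # right-hand side of the master equation at mu
h = 1e-5
# Central finite difference of F along the direction of the flow.
dF_dt = (free_energy(mu + h * velocity, W) - free_energy(mu - h * velocity, W)) / (2 * h)
dissipation = fisher_information(mu, W)

if __name__ == "__main__":
    print(dF_dt, -dissipation)  # the two numbers should agree up to discretization error
```

Under these assumptions the time derivative of the free energy along the flow matches minus the Fisher information, which is the content of the dissipation relation.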
2.2. Notation
We will use the following notation throughout the paper.
Given a function we will denote by its discrete gradient, given by
[TABLE]
For a function we denote by its discrete divergence, given by
[TABLE]
For and we will denote the Euclidean inner products by
[TABLE]
Then we have the integration by parts formula
[TABLE]
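The discrete calculus above can be made concrete in a few lines of Python. The sketch assumes the conventions usually used in this setting (they are not visible in the extracted text): (∇ψ)(x, y) = ψ(y) − ψ(x), (∇·Ψ)(x) = ½ Σ_y (Ψ(x, y) − Ψ(y, x)), ⟨φ, ψ⟩ = Σ_x φ_x ψ_x on sites and ⟨Φ, Ψ⟩ = ½ Σ_{x,y} Φ_xy Ψ_xy on pairs of sites. With these conventions the integration by parts formula reads ⟨∇ψ, Ψ⟩ = −⟨ψ, ∇·Ψ⟩. The function names are ours.

```python
import numpy as np


def grad(psi):
    """Discrete gradient: grad(psi)[x, y] = psi[y] - psi[x]."""
    return psi[None, :] - psi[:, None]


def div(Psi):
    """Discrete divergence: div(Psi)[x] = 1/2 * sum_y (Psi[x, y] - Psi[y, x])."""
    return 0.5 * (Psi - Psi.T).sum(axis=1)


def inner_sites(phi, psi):
    """Euclidean inner product of two functions on the state space."""
    return float(np.dot(phi, psi))


def inner_pairs(Phi, Psi):
    """Inner product of two functions on pairs of sites, with the factor 1/2."""
    return 0.5 * float(np.sum(Phi * Psi))


rng = np.random.default_rng(0)
psi = rng.normal(size=4)       # arbitrary function on a 4-point space
Psi = rng.normal(size=(4, 4))  # arbitrary function on pairs of sites

lhs = inner_pairs(grad(psi), Psi)
rhs = -inner_sites(psi, div(Psi))

if __name__ == "__main__":
    print(lhs, rhs)  # equal up to rounding
```

The factor ½ in both the divergence and the inner product on pairs is what makes the two sides agree without further constants.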
For functions in , we denote by the componentwise product. Using the shorthand notation $\Lambda(\mu)_{xy}:=\Lambda\bigl(\mu_{x}Q(\mu)_{xy},\mu_{y}Q(\mu)_{yx}\bigr)$ we can thus write the continuity equation (2.6) and the action functional (2.7) compactly as
[TABLE]
We will switch freely between notations for the components of functions , as or depending on what is more readable in the presence of other indices, e.g. a time parameter .
2.3. Equilibria and qualitative longtime behavior
From the gradient flow formulation, it is straightforward to obtain the following characterization of stationary states, which is completely analogous to the McKean-Vlasov equation on [8, Proposition 2.4 and Corollary 2.5].
Proposition 2.3 (Characterization of stationary points).
Let be a non-linear Markov triple satisfying Assumption 2.1. Then, the following statements are equivalent:
- (1) is a stationary solution to (1.1), that is .
- (2) is a fixed point of the map (2.1), that is .
- (3) is a critical point of (1.2) on .
- (4) is a global minimizer of (2.9), that is .
The set of all stationary points is denoted by .
Moreover, it holds that , i.e. each stationary point has strictly positive density.
Proof.
(1)(2): Let . The rate matrix is by assumption the rate matrix of an irreducible reversible Markov chain with unique reversible measure . In particular, it is also the unique stationary solution to and hence . If , we calculate using the local detailed balance condition (2.3) and find
[TABLE]
since is a rate matrix.
(2)(3): Take and any . Let the standard linear interpolation. Then, it holds
[TABLE]
where we used the relations (2.1) and (2.2). Now, if the right hand side is zero and hence a critical point if . On the other hand, if the right hand side is zero for all , it follows that for a constant . Since , we have that and hence critical points are fixed points.
(2)(4): Let . Since for all , we immediately find from the local detailed balance condition (2.3) that . Likewise, any global minimizer satisfies by the definition of that , that is the local detailed balance condition (2.3). Since again by assumption has the unique reversible measure , we conclude that .
Finally, the positivity follows from the definition of in (2.1) and the assumptions on implying that is finite. Hence, for all implies in particular that . ∎
Another useful piece of information provided by the gradient flow formulation is the free energy dissipation relation (2.8), which immediately shows that is a Lyapunov function for the evolution (1.1). By standard theory, we can conclude the following qualitative longtime behavior.
Proposition 2.4 (Convergence to stationary points).
Let satisfy Assumption 2.1, then for some as .
Proof.
The proof follows along standard arguments from the theory of dynamical systems (see for instance [37, Section 6]).
By Assumption 2.1, is Lipschitz on , which implies by standard well-posedness for ODEs, that the solutions to (1.1) are globally defined and generate a semigroup on . The -limit is given by
[TABLE]
Since is compact, each orbit for any is also compact in and the -limit is non-empty and quasi-invariant, that is for it holds . Moreover, again thanks to the compactness of , it follows for any that as (see also [37, Lemma 6.7]).
Since the free energy functional is continuous on and monotone along the flow, it follows that consists of complete orbits along which has the constant value with . By the free energy dissipation relation (2.8), it follows that for any and any we have
[TABLE]
and hence the nonnegativity of and continuity of trajectories imply for all . Hence, consists of all states such that , which by Proposition 2.3 entails and moreover also that is a stationary solution . ∎
Our purpose in this work can be summarized as giving sufficient conditions for which the above statement on convergence to equilibrium can be made quantitative (but which shall automatically enforce that contains a single element).
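The qualitative statements of Propositions 2.3 and 2.4 can be illustrated on a two-point, Curie-Weiss-type toy model. The Python sketch below is a hedged illustration rather than the model analyzed later in the paper: it assumes the potential U(μ) = −(β/2)(μ_0 − μ_1)², encoded through a hypothetical matrix `W`, and rates Q_xy(μ) = sqrt(π_y(μ)/π_x(μ)), which satisfy detailed balance with respect to π(μ) but are not the Glauber rates of Section 5.1. The explicit Euler scheme, step size and horizon are also our choices. With β = 0.5 the stationarity condition m = tanh(βm) for m = μ_1 − μ_0 has the single solution m = 0.

```python
import numpy as np


def rates(mu, W):
    """Assumed Gibbs rates: Q[x, y] = sqrt(pi_y / pi_x) with pi_x ~ exp(-H_x), H = W @ mu."""
    H = W @ mu
    Q = np.exp(-0.5 * (H[None, :] - H[:, None]))
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))  # rows of a rate matrix sum to zero
    return Q


def free_energy(mu, W):
    """F(mu) = sum_x mu_x log mu_x + U(mu), with the assumed U(mu) = mu.W.mu / 2."""
    return float(np.sum(mu * np.log(mu)) + 0.5 * mu @ W @ mu)


def simulate(mu0, W, dt=1e-3, steps=20000):
    """Explicit Euler discretization of the master equation d/dt mu = mu Q(mu)."""
    mu = np.array(mu0, dtype=float)
    energies = [free_energy(mu, W)]
    for _ in range(steps):
        mu = mu + dt * (mu @ rates(mu, W))
        energies.append(free_energy(mu, W))
    return mu, np.array(energies)


beta = 0.5  # assumed inverse temperature, below the critical value 1 of this toy model
W = -beta * np.array([[1.0, -1.0], [-1.0, 1.0]])  # gives U(mu) = -(beta/2) (mu_0 - mu_1)^2
mu_final, energies = simulate([0.9, 0.1], W)

if __name__ == "__main__":
    print("final state:", mu_final)
    print("free energy at start and end:", energies[0], energies[-1])
```

In this run the free energy decreases along the discretized trajectory and the state approaches the uniform measure, the unique stationary point for this choice of β. For β > 1 the same toy model has several stationary points, and the limit then depends on the initial datum.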
3. Curvature for non-linear Markov chains
In this section, we introduce a notion of Ricci curvature lower bounds for non-linear Markov chains based on geodesic convexity of the entropy. This generalizes the notion of curvature for linear Markov chains developed in [21] inspired by the approach of Lott, Sturm and Villani [28, 35] to a synthetic notion of lower bounds on Ricci curvature for geodesic metric measure spaces.
Let be a non-linear Markov chain according to Assumption 2.1 and let be the associated free energy functional (1.2) and the associated transport distance.
Definition 3.1 (Entropic Ricci curvature lower bound).
We say that has Ricci curvature bounded below by (for short ) if for any -geodesic :
[TABLE]
We will show that a lower bound on the Ricci curvature can be characterized equivalently by a lower bound on the Hessian of the free energy functional with respect to the Riemannian structure on induced by , or via an Evolution Variational Inequality for the non-linear Markov dynamics.
To this end, we first derive the geodesic equation for the distance as well as an expression for the first variation of the free energy.
Lemma 3.2 (Geodesic equation).
Let be a constant speed geodesic contained in . Then the unique potential such that solves
[TABLE]
or explicitly
[TABLE]
where is the derivative with respect to .
Remark 3.3.
In the case of a linear Markov chain, where is independent of , the expression (3.1) simplifies to
[TABLE]
recovering the geodesic equation derived in [21, Prop. 3.4].
Proof.
Since is a smooth Riemannian manifold, uniqueness and smoothness of geodesics imply that the curve is smooth, and that there exists a unique (up to constants) potential such that and achieves in the infimum for the action
[TABLE]
and moreover is then also a smooth curve. We will derive (3.1) as the corresponding Euler–Lagrange equation. So let for be a smooth perturbation of such that and for all . Let be the unique potentials such that . Note that is smooth in and . Then we have
[TABLE]
We compute
[TABLE]
From the continuity equation we infer that for any
[TABLE]
Plugging this into (3.2) for and integrating by parts in yields:
[TABLE]
The claim then follows by noting that
[TABLE]
and using that the perturbation was arbitrary. ∎
In order to give convenient expressions for the first and second variation of the free energy along a geodesic, we introduce the following notation.
We set
[TABLE]
and note that , so is the adjoint of . The master equation (1.1) then reads . Note further that we can write
[TABLE]
where we set $\bigl(Q(\mu)\pi(\mu)\bigr)_{xy}=Q(\mu)_{xy}\pi(\mu)_{x}$, which is symmetric in .
Lemma 3.4 (First variation of the free energy).
Let be a solution to the continuity equation. Then it holds
[TABLE]
Note that when the curve is a solution to the gradient flow equation, the right-hand side is indeed the discrete Fisher information, in accordance with (2.8).
Proof.
Starting from the expression
[TABLE]
recalling that , and setting , we obtain from the continuity equation
[TABLE]
Here, we have also used in the last step that
[TABLE]
and integrated by parts. ∎
To give an expression of the second variation of , we further introduce the following notation.
Let denote the partial derivative of with respect to . Then we write
[TABLE]
Furthermore, let us write
[TABLE]
Then, we set
[TABLE]
Finally, we can define the following quantity:
[TABLE]
Remark 3.5.
Note that in the case of a linear Markov chain, the last two terms in the definition of vanish and we recover the formula of [21] for the second derivative of the entropy along geodesics.
Lemma 3.6 (Second variation of the free energy).
Let be a -geodesic contained in and let be the unique potential such that . Then it holds
[TABLE]
Proof.
From (3.4) we get
[TABLE]
To calculate , first note that
[TABLE]
where denotes the Kronecker delta. Hence, we infer from the geodesic equation (3.1) and (3.5) that
[TABLE]
The continuity equation $\dot{\mu}_{t}=-\nabla\cdot\bigl(\Lambda(\mu_{t})\cdot\nabla\psi_{t}\bigr)$ readily yields that
[TABLE]
To calculate , note that for any we have
[TABLE]
while for any and we have
[TABLE]
Thus, we get . As , this yields the claim. ∎
We can now state the following equivalent characterizations of lower Ricci bounds:
Theorem 3.7.
Let . For a non-linear Markov triple the following assertions are equivalent:
- (1) ;
- (2) For all and we have
[TABLE]
- (3) The following Evolution Variational Inequality EVIκ holds: for all and all :
[TABLE]
where denotes the solution to the non-linear Fokker–Planck equation starting from , i.e. and .
By Lemma 3.6, (2) corresponds to a lower bound on the Hessian of in the Riemannian structure on induced by . Note that the equivalence of (1) and (2) is a non-trivial assertion, since the Riemannian metric degenerates at the boundary of .
Proof.
The proof is based on an argument of Daneri and Savaré [15] suitably adapted to the discrete setting. We can follow verbatim the proof of [21, Thm. 4.5] where the analogue of Thm. 3.7 is proven for linear Markov chains. The core of the argument is a variation of the action along the evolution equation, [21, Lem. 4.6]. To accommodate the additional terms arising from the non-linear structure in the present situation, we have to replace that lemma with Lemma 3.8 below. ∎
Lemma 3.8.
Let be a smooth curve in . For each let denote the solution of the non-linear Fokker–Planck equation at time starting from and let be a smooth curve in satisfying the continuity equation
[TABLE]
Then the identity
[TABLE]
holds for every and .
Proof.
First of all, setting we compute as in Lemma 3.4 that
[TABLE]
Furthermore,
[TABLE]
In order to further manipulate we first note that
[TABLE]
Further, we observe that for any
[TABLE]
To show (3.9), note that the left-hand side equals , while the right-hand side equals . Integrating by parts repeatedly and using (3.9) we obtain
[TABLE]
Thus, we arrive at
[TABLE]
To conclude, it suffices to note that
[TABLE]
further remark that for any we have
[TABLE]
and then use again (3.8). ∎
To end this section, we use Theorem 3.7 to give an expression of the optimal lower Ricci bound on the two point space.
Lemma 3.9 (Two-point space).
Let $\bigl(\{0,1\},Q,\pi\bigr)$ be a non-linear Markov triple on the base space and let and as well as and . Then, the optimal constant such that is given by
[TABLE]
Remark 3.10.
Note that in the case of a linear Markov chain, where and are independent of , and in particular , we recover the formula in [29, Remark 2.11].
Proof.
First, we compute from (3.7) for any and non-constant :
[TABLE]
Now, note that , yielding
[TABLE]
Furthermore, . Thus by Theorem 3.7 we get the optimal curvature bound by dividing the above identity by and minimizing in . Now, we use the identities
[TABLE]
to get rid of the partial derivatives and obtain after some further simplifications the result (3.10). ∎
4. Consequences of Ricci bounds
In this section we derive consequences of Ricci curvature lower bounds for non-linear Markov chains in terms of functional inequalities and the trend to equilibrium for the dynamics. Throughout this section, let be a non-linear Markov triple satisfying Assumption 2.1.
We first note the following expansion bound for the transport distance between solutions to the non-linear Markov dynamics.
Proposition 4.1.
Assume that for some . Then for any two solutions to the non-linear evolution equation , we have
[TABLE]
In particular, when , solutions with different initial data get closer at an exponential speed.
Proof.
This is a consequence of the EVIκ. It follows from [15, Prop. 3.1] applied to the functional on the metric space . ∎
Next we prove some consequences of Ricci bounds in terms of different functional inequalities. These results can be seen as non-linear discrete analogues of classical results of Bakry and Émery [1] and of Otto and Villani [33]. They extend results that have been obtained in [21] for linear Markov chains, and are reminiscent of results of Carrillo, McCann and Villani [9] obtained for McKean–Vlasov equations in a continuous setting.
Let be the free energy functional associated with given by
[TABLE]
and recall that attains its minimum on . We set
[TABLE]
so that . Recall that is the discrete Fisher information, given by
[TABLE]
provided and else. Recall that gives the dissipation of along a solution to the non-linear Fokker–Planck equation . More precisely, we have
[TABLE]
Note further that with we have the expression . The next result relates , and the transport distance under a Ricci bound.
Theorem 4.2.
Assume that for some . Then the inequality holds with constant , i.e. for all ,
[TABLE]
Proof.
Fix and assume without restriction that since otherwise there is nothing to prove. Denote by the solution to with and set . Theorem 3.7 yields that EVIκ holds, so in particular for :
[TABLE]
From the triangle inequality and the fact that is continuous with respect to we obtain
[TABLE]
Now, note that since we can estimate
[TABLE]
Since is a continuous function, we obtain
[TABLE]
which yields the claim. ∎
Theorem 4.3.
Assume that for some . Then the following hold:
- (i) there exists a unique stationary point , it is the unique minimizer of ;
- (ii) the modified logarithmic Sobolev inequality with constant holds, i.e. for all ,
[TABLE]
- (iii) for any solution to we have exponential decay of the free energy:
[TABLE]
- (iv) the transport-entropy inequality with constant holds, i.e. for all ,
[TABLE]
Proof.
(i) From Proposition 2.3 we know that the set of stationary points is non-empty and that it coincides with the set of local minimizers of . Assume by contradiction that has two distinct local minima at points and with and let be a constant speed geodesic connecting and . Then we infer from that
[TABLE]
Since is a local minimum, there is an such that . This leads to
[TABLE]
a contradiction. Hence, is a singleton and is the unique global minimizer of .
(ii) By Theorem 4.2, we have that holds. Applying with and , noting that , and using Young’s inequality
[TABLE]
with , and yields the claim.
(iii) From we infer that for a solution we have
[TABLE]
and we obtain (4.2) as a consequence of Gronwall’s lemma.
(iv) It suffices to establish ET for any . The inequality for general can then be obtained by approximation, taking into account the continuity of with respect to the Euclidean metric on . So fix , and let be the solution to the non-linear Fokker–Planck equation starting from . From Proposition 2.4 we have that as and that
[TABLE]
The last property follows from the continuity of with respect to the Euclidean distance. We now define the function by
[TABLE]
Obviously we have and by (4.3) we have that as . Hence it is sufficient to show that is non-increasing. To this end we show that its upper right derivative is non-positive. If we deduce from (4.1) that
[TABLE]
where we used in the last inequality. If , then the relation also holds true, since this implies that for all . ∎
5. Some examples of curvature bounds
We shall now compute lower bounds on the curvature for several examples of mean-field dynamics, inspired by classical models of statistical physics.
5.1. Curie-Weiss model
Let us consider the following example also mentioned in [5, Example 4.2], which is the infinite particle limit of the classical Curie-Weiss model, one of the simplest examples of Markovian dynamic exhibiting a phase transition. Let us take and define for by
[TABLE]
with and . Hence, we have
[TABLE]
The free energy for the Curie-Weiss model is given by
[TABLE]
Since , we have that the free energy is essentially given by the function
[TABLE]
Hence, is convex on for and non-convex for .
The local detailed balance state (2.1) is given by
[TABLE]
Therefore, it holds
[TABLE]
We use Glauber rates and set
[TABLE]
With this choice, we can estimate the Ricci curvature of the limit with the help of Lemma 3.9.
Proposition 5.1 (-Convexity of Curie-Weiss model with Glauber rates).
It holds for
[TABLE]
As a consequence of this curvature bound, one can derive the modified logarithmic Sobolev inequality for the nonlinear dynamic. This inequality could also be derived from a logarithmic Sobolev inequality for the particle Gibbs sampler of [30] and passing to the limit in the number of particles. In [26], the mLSI was also derived via convexity of the entropy, but along a different family of interpolations of probability measures. At a technical level, the proof of [26] requires differentiating the entropy three times rather than two, which involves more technical estimates (this is not much of an issue for a two-point space system like Curie-Weiss, but gets much more complicated for more involved systems, such as the ones we shall see later in this section).
Proof.
We set and , for which the rates become . First, we note that with the notation of Lemma 3.9, we have and .
The expression in the infimum of (3.10) to optimize becomes
[TABLE]
It will be convenient to do the variable substitution . For obtaining the expression in a compact manner, we introduce two auxiliary functions
[TABLE]
We then obtain, using the identities , and and after some rewriting,
[TABLE]
A simple evaluation yields , where we note as .
For the lower bound, we proceed in several steps. We first observe that
[TABLE]
Now, the claim follows once we have shown
[TABLE]
Indeed, the last term in (5.2), combined with the above estimate, is bounded from below by , which proves (5.1). To prove (5.3), we do another substitution and set . Therefore, the function becomes after transformations by hyperbolic trigonometric identities
[TABLE]
and we can estimate the left hand side of (5.3) by
[TABLE]
where we used the bound for all . This can be further estimated by observing that for all and by using the identity , to obtain
[TABLE]
by the substitution , which proves (5.3). ∎
5.2. General zero-range/misanthrope processes
In this section, we consider mean-field limits of particle systems with rate matrix of the form
[TABLE]
These systems generalize the usual linear Markov chains encoded in by an additional dependence of the jump rate on the population density at the departure and arrival sites of the jump. This model, first introduced in [14], incorporates many examples, such as the zero-range process, for which , but also interacting agent/voter models [39], for which .
Since our method in this section is perturbative in nature, we restrict to the complete graph as the underlying graph, that is for all . In this case the mean-field limit from the -particle system was derived in [23] and the limit equation was investigated in [34]. Since positive curvature is known in the case of independent particles on the complete graph [21], we expect that for with bounded , we should also obtain positive entropic curvature for the nonlinear models when is sufficiently large.
To have a gradient flow formulation, the chain has to satisfy the local detailed balance condition (2.3)
[TABLE]
For the further analysis, we will focus on the separable case, where for some holds
[TABLE]
It is easy to verify that (5.4) is satisfied for
[TABLE]
This is of the form (2.2) for a potential given e.g. by
[TABLE]
i.e. for given by .
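To illustrate the separable case numerically, the following is a minimal sketch of the mean-field evolution on the complete graph. It assumes that the rates take the form Q_xy(μ) = a(μ_x) b(μ_y) for x ≠ y; this form, the particular functions `a` and `b`, the explicit Euler scheme, and all function names are assumptions made for illustration only. Under this assumption the flux from x to y is μ_x a(μ_x) b(μ_y), so the uniform distribution balances every pair of sites.

```python
import numpy as np

def mean_field_rhs(mu, a, b):
    """d mu_x/dt = sum_{y != x} [mu_y Q_yx(mu) - mu_x Q_xy(mu)]
    for the assumed separable rates Q_xy(mu) = a(mu_x) * b(mu_y)."""
    out_weight = mu * a(mu)   # mu_x * a(mu_x): tendency to leave site x
    in_weight = b(mu)         # b(mu_x): attractiveness of site x
    gain = in_weight * (out_weight.sum() - out_weight)
    loss = out_weight * (in_weight.sum() - in_weight)
    return gain - loss

def simulate(mu0, a, b, dt=0.01, steps=2000):
    """Explicit Euler integration of the mean-field equation (illustrative only)."""
    mu = np.asarray(mu0, dtype=float).copy()
    for _ in range(steps):
        mu = mu + dt * mean_field_rhs(mu, a, b)
    return mu

# Hypothetical rate functions: small Lipschitz perturbations of constant rates.
a = lambda r: 1.0 + 0.2 * r
b = lambda r: 1.0 + 0.1 * r

mu0 = np.array([0.7, 0.1, 0.1, 0.1])
mu_final = simulate(mu0, a, b)
total_mass = mu_final.sum()
distance_to_uniform = np.max(np.abs(mu_final - 0.25))
```

With these choices the total mass is conserved by construction, and the solution relaxes to the uniform distribution, in line with the perturbative regime in which the rates are close to constant.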
**Example 5.2.**
There are two subclasses of models of particular interest:
[TABLE]
Both models satisfy the local detailed balance condition (2.3) for
[TABLE]
and
[TABLE]
For the first, the interacting agent model from [39] is recovered by setting , where is the (constant) degree of the complete graph. For these models, [39] proves a spectral gap via another notion of discrete curvature, which, however, is not strong enough to derive the mLSI. In the second case, the dynamics corresponds to (a scaling limit of) a zero-range process. This type of particle system is commonly used in statistical physics as a toy model for understanding various large-scale features of interacting systems (scaling limits, long-time behavior, phase transitions). We refer to [27] for an overview. The long-time behavior of the -particle system in various situations was studied, for example, in [7, 6, 22, 24]. Recently, Hermon and Salez [25] significantly improved on the state of the art using a combination of the Lu-Yau martingale method and a monotone coupling argument, establishing a modified logarithmic Sobolev inequality independent of the number of particles for mean-field zero-range processes in a non-perturbative setting, even in some inhomogeneous situations where the curvature approach cannot work.
In this separable case, we can prove the following statement.
**Theorem 5.3 (Curvature for separable kernels).**
Assume the rates are separable, given by
[TABLE]
Suppose that
[TABLE]
Moreover, assume that
[TABLE]
Then in the sense of Theorem 3.7 with given by
[TABLE]
In particular, in the regime $\eta := \frac{\max\{\operatorname{Lip} a, \operatorname{Lip} b\}}{\min\{\underline{a}, \underline{b}\}} \ll 1$, it holds
[TABLE]
Proof.
First, we evaluate some of the quantities occurring in the derivation of the curvature estimate. Let us start with (3.3), for which we have
[TABLE]
and
[TABLE]
The next quantity (3.5) becomes
[TABLE]
The last quantity is (3.6)
[TABLE]
from which after symmetrization, we obtain the identity
[TABLE]
We will use the following identity for the logarithmic mean
[TABLE]
To compensate off-diagonal terms, we need the following estimate for the logarithmic mean [22, Lemma A.2]
[TABLE]
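For orientation, some standard properties of the logarithmic mean Λ(s,t) = (s − t)/(log s − log t) can be checked numerically. The sketch below is an illustration only: it verifies well-known facts (one-homogeneity, the integral representation, the comparison with the geometric and arithmetic means, and Euler's identity for one-homogeneous functions) and does not claim to reproduce the specific identity (5.11) or estimate (5.12). The function names are hypothetical.

```python
import math

def log_mean(s, t):
    """Logarithmic mean of s, t > 0, extended by continuity to s == t."""
    if s == t:
        return s
    return (s - t) / (math.log(s) - math.log(t))

def log_mean_integral(s, t, n=2000):
    """Midpoint-rule approximation of the integral of s**(1-alpha) * t**alpha over [0, 1]."""
    h = 1.0 / n
    return h * sum(s ** (1.0 - (k + 0.5) * h) * t ** ((k + 0.5) * h) for k in range(n))

def euler_identity_residual(s, t, h=1e-5):
    """Residual of s*d1(Lambda) + t*d2(Lambda) - Lambda, with central differences."""
    d1 = (log_mean(s + h, t) - log_mean(s - h, t)) / (2.0 * h)
    d2 = (log_mean(s, t + h) - log_mean(s, t - h)) / (2.0 * h)
    return s * d1 + t * d2 - log_mean(s, t)

s, t, c = 2.0, 0.5, 3.0
lam = log_mean(s, t)
geometric, arithmetic = math.sqrt(s * t), 0.5 * (s + t)
homogeneity_gap = abs(log_mean(c * s, c * t) - c * lam)
integral_gap = abs(log_mean_integral(s, t) - lam)
euler_gap = abs(euler_identity_residual(s, t))
```

One-homogeneity is the property used when rewriting expressions in terms of tilted measures, and Euler's identity is its differential form.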
The above basic identities shall be used to estimate the four terms in (3.7), which we denote by and in this order of occurrence.
First term : Let us start estimating the first term in (3.7) and use the identity (5.11)
[TABLE]
where we introduced . Although is non-negative, we will keep it to compensate for terms from and . To do so, we compactify notation further by introducing the tilted measure
[TABLE]
With this definition and with the one-homogeneity of , we can rewrite
[TABLE]
and likewise for the zero-homogeneous derivatives
[TABLE]
With this notation we want to employ the estimate (5.12) in the form
[TABLE]
Now, we can bound from below by
[TABLE]
Second term : Let us continue with the second term in (3.7) for which we use (5.8), symmetrize the sum and obtain
[TABLE]
where we used the Young inequality .
Third term : For estimating the third term in (3.7), denoted by , we use (5.9), make a crude estimate, and again apply (5.11)
[TABLE]
To bound the infimum, we observe that
[TABLE]
Hence, in total we obtain
[TABLE]
Fourth term : For estimating the fourth term in (3.7), denoted by , we use (5.10) and compensate it partly by from (5.14)
[TABLE]
*Conclusion:* We combine all the estimates of the individual terms in (3.7) from the rewriting . There is one small catch. After having applied the first bound (5.13) to , we split for the non-negative part into and , where the bound (5.14) is applied only to the second term. The other three estimates (5.15), (5.16) and (5.17) are applied in a straightforward manner to , and , respectively, to arrive at the lower bound
[TABLE]
If is chosen according to (5.5) and by , we arrive at the bound with given in (5.6). The final statement (5.7) follows by simple calculus from the bound , similar for and observing that in this case. ∎
Acknowledgments: This work was supported by the PHC Procope project “Entropic Ricci curvature bounds of interacting particle systems and their mean-field limits”. MF was additionally supported by ANR-11-LABX-0040-CIMI within the program ANR-11-IDEX-0002-02, as well as Projects EFI (ANR-17-CE40-0030) and MESA (ANR-18-CE40-006) of the French National Research Agency (ANR). ME and AS were additionally supported by the German Research Foundation (DFG) through the “Hausdorff Center for Mathematics” and the CRC 1060 “The Mathematics of Emergent Effects”.
References
- [1] Dominique Bakry and Michel Émery. Diffusions hypercontractives. In Séminaire de probabilités, XIX, 1983/84, volume 1123 of Lecture Notes in Math., pages 177–206. Springer, Berlin, 1985.
- [2] Dario Benedetto, Emanuele Caglioti, and Mario Pulvirenti. A kinetic equation for granular media. RAIRO Modél. Math. Anal. Numér., 31(5):615–641, 1997.
- [3] François Bolley, Ivan Gentil, and Arnaud Guillin. Uniform convergence to equilibrium for granular media. Arch. Ration. Mech. Anal., 208(2):429–445, 2013.
- [4] Amarjit Budhiraja, Paul Dupuis, Markus Fischer, and Kavita Ramanan. Limits of relative entropies associated with weakly interacting particle systems. Electron. J. Probab., 20:no. 80, 22, 2015.
- [5] Amarjit Budhiraja, Paul Dupuis, Markus Fischer, and Kavita Ramanan. Local stability of Kolmogorov forward equations for finite state nonlinear Markov processes. Electron. J. Probab., 20:no. 81, 30, 2015.
- [6] Pietro Caputo, Paolo Dai Pra, and Gustavo Posta. Convex entropy decay via the Bochner-Bakry-Emery approach. Ann. Inst. Henri Poincaré Probab. Stat., 45(3):734–753, 2009.
- [7] Pietro Caputo and Gustavo Posta. Entropy dissipation estimates in a zero-range dynamics. Probab. Theory Related Fields, 139(1-2):65–87, 2007.
- [8] J. A. Carrillo, R. S. Gvalani, G. A. Pavliotis, and A. Schlichting. Long-time behaviour and phase transitions for the McKean–Vlasov equation on the torus. Archive for Rational Mechanics and Analysis, Jul 2019.
