Convex Sobolev inequalities related to unbalanced optimal transport
Stanislav Kondratyev, Dmitry Vorotnikov

TL;DR
This paper investigates the decay of relative entropies in nonlinear drift-diffusion-reaction equations modeled as gradient flows in unbalanced optimal transport, establishing new inequalities without requiring convexity of the functionals.
Contribution
It introduces novel isoperimetric-type inequalities for controlling relative entropies in gradient flows over Radon measures, even without geodesic convexity.
Findings
Proves exponential decay of relative entropies in the studied equations.
Establishes new inequalities linking entropies and their productions.
Extends analysis to non-convex functionals in unbalanced optimal transport.
Abstract
We study the behaviour of various Lyapunov functionals (relative entropies) along the solutions of a family of nonlinear drift-diffusion-reaction equations coming from statistical mechanics and population dynamics. These equations can be viewed as gradient flows over the space of Radon measures equipped with the Hellinger-Kantorovich distance. The driving functionals of the gradient flows are not assumed to be geodesically convex or semi-convex. We prove new isoperimetric-type functional inequalities, allowing us to control the relative entropies by their productions, which yields the exponential decay of the relative entropies.
Stanislav Kondratyev
CMUC, Department of Mathematics, University of Coimbra, 3001-501 Coimbra, Portugal
and
Dmitry Vorotnikov
CMUC, Department of Mathematics, University of Coimbra, 3001-501 Coimbra, Portugal
Keywords: functional inequalities, optimal transport, reaction-diffusion, fitness-driven dispersal, entropy, exponential decay
MSC [2010] 26D10, 35K57, 35B40, 49Q20, 58B20
1. Introduction
The unbalanced optimal transport [36, 30, 13, 35, 14, 43] interpolates between the classical Monge-Kantorovich transport [45, 46] and the optimal information transport [41]. It equips the space of finite Radon measures with a formal Riemannian structure so that certain classes of reaction-diffusion equations and systems can be interpreted as gradient flows. This paper continues our investigation [30, 29, 31, 33, 32] of such gradient flows and associated functional inequalities, see also [12, 24, 23] for related studies.
The class of PDEs that we consider in this paper is
[TABLE]
Here is a nonlinear function of and which is required to have a certain structure specified below in (1.12), and is an open connected bounded domain admitting the relative isoperimetric inequality, cf. [40],
[TABLE]
All our results remain valid if is a periodic box ; in this case (1.2) is omitted.
The drift-diffusion-reaction equation (1.1) appears in statistical mechanics [19]. It also describes nonlinear fitness-driven models of population dynamics, cf. [38, 15, 16, 25, 33], where it is assumed that the dispersal strategy is determined by a local intrinsic characteristic of organisms called fitness. We refer to Section 2 and to [33] for more detailed discussions.
Let and be fixed -smooth functions, which satisfy the following assumptions:
[TABLE]
Let be a fixed smooth strictly positive function satisfying
[TABLE]
Define
[TABLE]
Thus, the functions and determine the problem (1.1)–(1.3), and the function is merely needed to define a Lyapunov functional for this problem,
[TABLE]
which will be referred to as the relative entropy. Obviously, if and only if . Formally calculating along a solution of (1.1)–(1.3) we obtain
[TABLE]
where the entropy production is defined by
[TABLE]
Setting
[TABLE]
we can write
[TABLE]
Note that problem (1.1)–(1.3) can be viewed as a formal gradient flow (with respect to the unbalanced Hellinger-Kantorovich Riemannian structure) of the driving functional , where
[TABLE]
see Section 2 for the details. We are interested in the exponential decay of the Lyapunov functional (1.14) along the trajectories of this gradient flow. This is related to the entropy-entropy production inequalities of the form
[TABLE]
They can be viewed as unbalanced generalizations of the convex Sobolev inequalities [2, 3, 27], see Section 2.
The main results of the paper are convex Sobolev inequalities akin to (1.17), see Theorems 3.5 and 4.1, and existence and asymptotics of weak solutions to (1.1)–(1.3), see Theorem 3.6.
2. Background and discussion
Assume for a while that is a torus or is convex, although this is not required for our main results. The gradient of a scalar functional on the space of finite Radon measures over with respect to the Hellinger-Kantorovich Riemannian structure (also known as the Wasserstein-Fisher-Rao one) was calculated in [30, 35]:
[TABLE]
The first term on the right-hand side is the Otto-Wasserstein gradient , cf. [42, 45], and the second one is the Hellinger-Fisher-Rao gradient , cf. [28]. It is easy to compute that , hence (1.1)–(1.3) may be interpreted as a gradient flow:
[TABLE]
The production of the relative entropy along the Otto-Wasserstein gradient flow
[TABLE]
is
[TABLE]
Similarly, the production of the same entropy along the Hellinger gradient flow
[TABLE]
is
[TABLE]
In the case of non-convex we can abuse the terminology and still refer to (1.1)–(1.3) as a gradient flow.
It is clear that
[TABLE]
Generally speaking, neither the Otto-Wasserstein nor the Fisher-Rao entropy production is able to control the relative entropy, so (1.17) is a result of an interplay between the reaction, diffusion and drift. A simple counterexample to
[TABLE]
is with being a proper subset of . Indeed, due to (1.5), (1.9) and (1.10). It is easy to construct a smooth example by mollifying this one. A trivial counterexample to
[TABLE]
is where is a non-negative constant.
Remark 2.1.
Note that the two counterexamples intersect at , which violates our target inequality (1.17). However, we will observe, cf. Theorems 3.5 and 4.1, that it suffices to keep the total mass bounded away from zero to secure (1.17).
In view of (1.11), in order to obtain more interesting and instructive examples we should restrict ourselves to probability densities . The sequence
[TABLE]
of probability densities on is a counterexample to (2.4). Indeed, the left-hand side of (2.4) is of order and the right-hand side is .
Inequality (2.5) for deserves a more detailed discussion.
Let us start with considering . In this case, as first observed in the seminal paper [26], the gradient flow (2.2) is the linear Fokker-Planck equation, and the celebrated Bakry-Émery approach allows one to prove (2.5) for [2, 3, 27]. However, it is crucial to have concavity of , which we never assume in this work. These instances of (2.5) are referred to as convex Sobolev inequalities, which inspired the title of our paper. The particular case
[TABLE]
implies the log-Sobolev inequality for , the Poincaré inequality for and Beckner’s inequalities [4] for . Namely, (2.5) may be rewritten as
[TABLE]
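As a numerical sanity check of the classical inequalities just mentioned, the following sketch evaluates Beckner's inequality and the log-Sobolev inequality for the standard Gaussian measure on the real line. This is the setting of [4], not the bounded domain with reference density considered in this paper. The test function, the quadrature order and all helper names are illustrative assumptions.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

# Gauss-Hermite nodes and weights for the weight exp(-x^2 / 2),
# normalised so that they integrate against the standard Gaussian measure.
nodes, weights = hermegauss(80)
weights = weights / np.sqrt(2.0 * np.pi)

def gauss_mean(values):
    # Quadrature approximation of the Gaussian expectation.
    return float(np.sum(weights * values))

def beckner_gap(f, df, p):
    # Right-hand side minus left-hand side of Beckner's inequality
    #   E[f^2] - (E[|f|^p])^(2/p) <= (2 - p) E[|f'|^2],  1 <= p < 2.
    lhs = gauss_mean(f(nodes) ** 2) - gauss_mean(np.abs(f(nodes)) ** p) ** (2.0 / p)
    rhs = (2.0 - p) * gauss_mean(df(nodes) ** 2)
    return rhs - lhs

def log_sobolev_gap(f, df):
    # Right-hand side minus left-hand side of the Gaussian log-Sobolev inequality
    #   E[f^2 log f^2] - E[f^2] log E[f^2] <= 2 E[|f'|^2].
    f2 = f(nodes) ** 2
    ent = gauss_mean(f2 * np.log(f2)) - gauss_mean(f2) * np.log(gauss_mean(f2))
    return 2.0 * gauss_mean(df(nodes) ** 2) - ent

# Assumed smooth positive test function and its derivative.
f = lambda x: 2.0 + np.sin(x)
df = lambda x: np.cos(x)

gaps = [beckner_gap(f, df, p) for p in (1.0, 1.25, 1.5, 1.75)]
ls_gap = log_sobolev_gap(f, df)
```

The case p = 1 is the Poincaré inequality. All computed gaps should be nonnegative for any smooth test function.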
In contrast, our assumptions on admit any in (2.6), which yields the following “Beckner-Hellinger inequality”:
[TABLE]
Consider now the case , , . Assume for simplicity that and . Then (2.2) is the porous medium equation, cf. [42]. The alleged inequality (2.5) for the relative entropy (2.6), , reads
[TABLE]
Setting , , , we rewrite (2.9) in the form
[TABLE]
The inequality
[TABLE]
similar to (2.10) appears in [11], see also [10, 18]. It holds for , that is, for , . Assume for a moment that the relative entropy, i.e., the left-hand side of (2.11), is a priori bounded. Since , the mass is a priori bounded. Consequently, (2.11) is weaker than (2.10) since the exponent is less than , and it is plausible that (2.10) cannot be true. Inequality (2.11) for is equivalent to Beckner’s inequality (2.7). As explained in [18], inequality (2.11) is wrong for . In this connection, our results yield the following variant of (2.10):
[TABLE]
for any , , , that is, any , , .
The counterparts of the alleged inequalities (2.9) and (2.10) for are
[TABLE]
[TABLE]
Here . This resembles the inequality
[TABLE]
which was established in [10, 18]. Since , (2.15) is weaker than (2.14), so it seems that (2.14) cannot be true. Our results imply the following variant of (2.14):
[TABLE]
Remark 2.2.
Inequalities (2.8), (2.12), (2.16) are obtained assuming (so that (3.4) is automatically satisfied), but hold without this normalization due to their homogeneity.
Many authors studied (2.5) or related inequalities in the particular case , that is, when the driving entropy is compared to its production, cf., e.g., [42, 45, 46, 1, 9]. In this connection, the strict geodesic convexity of the driving entropy normally plays the pivotal role. In [33] (see also [30]) we studied (1.17) for without assuming either Otto-Wasserstein or Hellinger-Kantorovich geodesic convexity (we also never assume any similar condition in the present paper). The inequalities obtained there can be further refined [32] by means of studying gradient flows in the spherical Hellinger-Kantorovich space [34, 7], which is beyond the scope of the present paper (though it may seem strange, even non-negativity of the entropy production is uncertain for the spherical Hellinger-Kantorovich flows in the case ). The proofs in the present paper are more direct and simple than in [33] due to the “quasihomogeneous structure” (1.12).
Our last example concerns , which corresponds to the arctangential heat equation [6]. The relative entropy generated by this is geodesically convex neither in the Otto-Wasserstein nor in the Hellinger-Kantorovich sense, cf. [32]. Take . Then we infer the following inequality resembling the log-Sobolev one:
[TABLE]
provided is bounded away from zero.
Nonlinear Fokker-Planck equations akin to (2.2) model behaviour of various stochastic systems, see [20, 44, 27, 5]. The related drift-diffusion-reaction equation (1.1) was suggested in [19]. On the other hand, equation (1.1) belongs to the class of nonlinear models (cf. [16, 25, 47, 33, 32, 38, 15]) for the spatial dynamics of populations which tend to achieve the ideal free distribution [22, 21] (the distribution which arises if every individual is free to choose its location) in a heterogeneous environment. The dispersal strategy is determined by a local intrinsic characteristic of organisms called fitness. The fitness manifests itself as a growth rate, and simultaneously affects the dispersal as the species move along its gradient towards the most favorable environment. In (1.1), is the density of organisms, and is the fitness. The equilibrium when the fitness is constantly zero corresponds to the ideal free distribution. The works [17, 8, 37, 47, 30, 29, 31, 33] perform mathematical analysis of some of these fitness-driven models. Our Theorem 3.6 indicates that the populations converge to the ideal free distribution with an exponential rate.
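The role of the fitness as a growth rate, and the need to keep the mass away from zero noted in Remark 2.1, can be seen in a spatially homogeneous toy model. The sketch below is not equation (1.1). It assumes a single scalar density u with the hypothetical fitness 1 - u, so that u' = u(1 - u), and the Boltzmann-type entropy u log u - u + 1 relative to the equilibrium u = 1. All function names are illustrative.

```python
import numpy as np

def entropy(u):
    # Boltzmann-type relative entropy with respect to the equilibrium u = 1.
    return u * np.log(u) - u + 1.0

def production(u):
    # Minus the time derivative of the entropy along u' = u * (1 - u),
    # which equals u * (u - 1) * log(u) and is nonnegative.
    return u * (u - 1.0) * np.log(u)

def logistic(t, u0):
    # Exact solution of u' = u * (1 - u) with u(0) = u0.
    return u0 * np.exp(t) / (1.0 + u0 * (np.exp(t) - 1.0))

def decay_constant(u0, samples=2000):
    # Largest ratio entropy / production over the densities in [u0, 1),
    # which are the values visited by a trajectory starting from u0 < 1.
    u = np.linspace(u0, 1.0, samples, endpoint=False)
    return float(np.max(entropy(u) / production(u)))

u0 = 0.1
C = decay_constant(u0)
times = np.linspace(0.0, 10.0, 201)
E = entropy(logistic(times, u0))

# Gronwall-type bound implied by entropy <= C * production along the trajectory.
bound = entropy(u0) * np.exp(-times / C)
```

In this toy setting the ratio of entropy to production stays bounded along a trajectory starting from a positive density, which gives exponential decay of the entropy. The same ratio blows up as the starting density approaches zero, mirroring the requirement that the mass be bounded away from zero.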
3. Main results
We start by introducing the weak solutions to (1.1)–(1.3), following the lines of [33, 32].
Define
[TABLE]
where the integral exists by (1.9). Observe that
[TABLE]
so that is a nonnegative continuous increasing function on .
Set
[TABLE]
As in [33], we can write (1.1) in the form
[TABLE]
where stands for .
Definition 3.1.
Let ; . A function is called a weak solution of (1.1)–(1.3) on if for we have and
[TABLE]
for any function such that . A function is called a weak solution of (1.1)–(1.3) on if for any it is a weak solution on .
Remark 3.2.
For we automatically have , so the condition is equivalent to . Here .
Formally, the integrand vanishes if . Otherwise it can be written as
[TABLE]
This motivates the following extension of the entropy production suitable for weak solutions.
Definition 3.3.
If and , then the entropy production is defined by
[TABLE]
Remark 3.4.
Observe that although the integrand with the gradient in (3.3) is a nonnegative measurable function on , the integral, and hence the entropy production, may be infinite.
The following entropy-entropy production inequality applicable to weak solutions is based on an isoperimetric-type inequality established in Section 4.
Theorem 3.5 (Entropy-entropy production inequality).
Suppose that and satisfy (1.5)–(1.10). Let be a set of functions such that for any and , we have and
[TABLE]
Then there exists such that
[TABLE]
Proof.
The idea is to use the isoperimetric-type inequality provided by Theorem 4.1 (see Section 4). Since we are dealing with a less regular setting at the moment, we argue by approximation.
Take and as usual, put . Arguing as in [33, proof of Theorem 1.7], we see that there exists a sequence of functions taking values in , where , such that
[TABLE]
Set and , so that . Clearly, and are positive and reasonably smooth, the sequences and are bounded in (specifically, the former is bounded by ), and by the continuity of we have
[TABLE]
In particular, this implies that converges to in . Further, by the Lebesgue Dominated Convergence Theorem we have
[TABLE]
Thus, if we denote the infimum in (3.4) by and the supremum in (3.5) by , there is no loss of generality in assuming that and . It follows from Theorem 4.1 that there exist and both depending on and (but not on the approximation nor on itself) such that
[TABLE]
By the Lebesgue Dominated Convergence Theorem we have
[TABLE]
Further, we have
[TABLE]
On one hand, in . On the other hand, the functions
[TABLE]
are uniformly bounded in , and since we obviously have
[TABLE]
we also have
[TABLE]
Using Reverse Fatou’s Lemma for products (Lemma A.1 in the Appendix), we obtain
[TABLE]
Combining this with (3.7) and (3.9), we see that we can pass to the limit in (3.8) and obtain (3.6) with . ∎
Theorem 3.6 (Existence and asymptotics of weak solutions).
Assume (1.5)–(1.10). Then for any there exists a nonnegative weak solution of problem (1.1)–(1.3) which enjoys the following properties:
(1) satisfies the entropy dissipation inequality in the sense of measures: for any smooth nonnegative compactly supported function we have
[TABLE]
(2) the initial entropy satisfies
[TABLE]
(3) satisfies the lower -bound
[TABLE]
(4) exponentially converges to in the sense of entropy:
[TABLE]
where can be chosen uniformly over initial data satisfying
[TABLE]
with some ;
(5) for any ,
[TABLE]
where can be chosen uniformly over initial data satisfying
[TABLE]
Proof.
For the proof of existence, the approximating procedure used in [33] is still applicable in the current setting. As a matter of fact, the existence result in [33] requires that is either large or does not depend on when is near [math] or near . A similar requirement was imposed for large . However, these assumptions are only needed in order to ensure that any can be bounded from above by a function satisfying and that can be bounded from below by another such function provided that is uniformly bounded away from [math]. This is still the case in the current setting. Indeed, assume for simplicity that is continuous on . Set and put , then clearly ; moreover, it follows from the monotonicity of that , as required. The existence of a lower bound is proved in a similar way, cf. [33, Remark 3.4].
Inequality (3.11) is proved in the same way as the analogous inequality in [33].
We prove that the solution constructed as in [33] satisfies (3.10). To this end it suffices to check that this inequality is preserved under the passage to the limit. Specifically, assume that smooth enough approximate solutions are uniformly bounded in and converge to a. e. in , while
[TABLE]
By the Lebesgue Dominated Convergence Theorem we have
[TABLE]
Arguing as in [33, proof of Theorem 3.9] and, in particular, taking into account that a. e. on the set and a. e. on the set , we conclude that for any we have
[TABLE]
so sending and applying Beppo Levi’s theorem, we obtain
[TABLE]
or, equivalently,
[TABLE]
Combining this with (3.17) and (3.18), we obtain (3.10).
We now prove the exponential convergence of the solution to the steady state. Let be a weak solution of (1.1)–(1.3) with the initial data satisfying (3.14). Let be the set of functions such that for any , we have and , with the same and as in (3.14). By Theorem 3.5 we have the entropy-entropy production inequality (3.6) for . It follows from the bounds (3.11) and (3.12) that for a. a. . Combining the entropy dissipation and entropy-entropy production inequalities, we get
[TABLE]
in the sense of measures. Set and . It is easy to check that in the sense of measures, whence a. e. coincides with a nonincreasing function. Moreover,
[TABLE]
by virtue of (3.11), so for a. a. , which implies (3.13).
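The Gronwall-type step used here can be written out as follows. This is a formal sketch in assumed notation: E(t) denotes the relative entropy along the solution, D(t) its production, and C the constant from Theorem 3.5. The rigorous argument works in the sense of measures and for a. a. t, as described above.

```latex
\begin{align*}
\partial_t E \le -D, \qquad E \le C\,D
  \quad&\Longrightarrow\quad \partial_t E + \tfrac{1}{C}\,E \le 0, \\
\partial_t\!\left(e^{t/C} E(t)\right)
  = e^{t/C}\left(\partial_t E + \tfrac{1}{C}\,E\right) \le 0
  \quad&\Longrightarrow\quad E(t) \le e^{-t/C}\, E(0).
\end{align*}
```

The weighted quantity in the second line is the nonincreasing function referred to in the proof, and the decay rate is the reciprocal of the constant in the entropy-entropy production inequality.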
We will now use (3.13) with , which is a -function for , and satisfies the assumptions (1.6)–(1.8). We immediately get
[TABLE]
where . Uniform boundedness of implies a bound on . ∎
4. Inequality
In this section we prove a refined version of our unbalanced convex Sobolev inequality in the smooth case.
Theorem 4.1.
Assume (1.5)–(1.10). Let be such that
[TABLE]
Then there exist constants (independent of ) , , such that
[TABLE]
The proof of Theorem 4.1 is based on the next two lemmas.
Lemma 4.2.
Fix . Then
[TABLE]
Proof.
If the minimum on the right-hand side vanishes, there is nothing to prove. Otherwise the set has nonzero measure. In what follows, we use some facts from geometric measure theory, which can be found in [39]. The relative perimeter of a Lebesgue measurable set of locally finite perimeter with respect to is where is the Gauss-Green measure associated with . The support of is contained in the topological boundary of .
We have:
[TABLE]
The last integral is the variation of over , which can be computed using the coarea formula:
[TABLE]
where we first use the observation that the support of the Gauss–Green measure associated with is disjoint from whenever or , and then we notice that if , then the part of the support of the Gauss–Green measure of lying in is contained in .
Invoking the relative isoperimetric inequality (1.4), we estimate
[TABLE]
and since for we have
[TABLE]
we see that
[TABLE]
Combining this estimate with (4.3) and (4.4), we obtain (4.2). ∎
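The coarea step can be illustrated in one dimension, where the variation of a function equals the integral over levels of the number of level crossings. The sketch below is a toy check under assumed choices: the function sin(2πx) on [0, 1], the grids and all names are hypothetical, and this is not the setting of Lemma 4.2.

```python
import numpy as np

# Sample an assumed test function on a uniform grid of [0, 1].
x = np.linspace(0.0, 1.0, 20001)
f = np.sin(2.0 * np.pi * x)

# Left-hand side: total variation, i.e. the integral of |f'|.
total_variation = float(np.sum(np.abs(np.diff(f))))

def crossings(level):
    # Number of grid cells on which f - level changes sign,
    # a discrete stand-in for the perimeter of the set [f > level].
    g = f - level
    return int(np.sum(g[:-1] * g[1:] < 0.0))

# Right-hand side: midpoint rule in the level variable over (-1, 1).
n_levels = 400
h = 2.0 / n_levels
levels = -1.0 + h * (np.arange(n_levels) + 0.5)
coarea_integral = h * sum(crossings(t) for t in levels)
```

Both quantities should agree up to discretization error. In higher dimensions the crossing count is replaced by the relative perimeter of the superlevel sets, which is where the relative isoperimetric inequality enters.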
Lemma 4.3.
Given , there exists such that
[TABLE]
Proof.
Applying L’Hôpital’s rule for , and remembering that is an increasing function, we obtain
[TABLE]
[TABLE]
In (4.6) and (4.7) we have used the fact that for , the signs of and coincide, while . Obviously, (4.6) and (4.7) imply (4.5). ∎
Proof of Theorem 4.1.
We claim that there exists such that
[TABLE]
Indeed, it follows from (1.8) (L’Hôpital’s rule) that
[TABLE]
As the entropy is bounded on , by de la Vallée Poussin’s theorem the set is uniformly integrable. Put
[TABLE]
for any we have
[TABLE]
where is the modulus of integrability of . Hence
[TABLE]
which clearly implies a lower bound on $\big|[\rho\geq m]\big|$ and a fortiori on $\big|[r\geq\beta]\big|$ with .
Clearly, there is no loss of generality in assuming in (4.8).
In what follows we fix and such that and satisfies (4.8). Denote
[TABLE]
and also
[TABLE]
Assume for now that . Using Lemma 4.2, we have
[TABLE]
Taking into account (4.8), we can write
[TABLE]
with independent of . Estimating
[TABLE]
we obtain
[TABLE]
If , this estimate trivially holds with any . Since is a priori bounded from above by , (4.9) implies that
[TABLE]
Invoking Lemma 4.3, we obtain
[TABLE]
Using (4.10) to estimate by , we obtain (4.1). ∎
Appendix A Reverse Fatou’s Lemma for products
Lemma A.1.
Let be a measure space. Suppose that is bounded in and converges to a nonnegative limit in . Then
[TABLE]
Proof.
As we have , we can use Reverse Fatou’s Lemma obtaining
[TABLE]
Further, it is clear that
[TABLE]
Using (A.2) and (A.3) we obtain
[TABLE]
as claimed. ∎
Acknowledgment
The research was partially supported by the Portuguese Government through FCT/MCTES and by the ERDF through PT2020 (projects UID/MAT/00324/2019, PTDC/MAT-PUR/28686/2017 and TUBITAK/0005/2014).
Conflict of interest statement
We have no conflict of interest to declare.
References
- [1] L. Ambrosio, N. Gigli, and G. Savaré. Gradient Flows: in Metric Spaces and in the Space of Probability Measures. Basel: Birkhäuser Basel, 2008.
- [2] A. Arnold, P. Markowich, G. Toscani, and A. Unterreiter. On convex Sobolev inequalities and the rate of convergence to equilibrium for Fokker-Planck type equations. Comm. Partial Differential Equations, 26(1-2):43–100, 2001.
- [3] D. Bakry and M. Émery. Diffusions hypercontractives. In Séminaire de Probabilités XIX 1983/84, pages 177–206. Springer, 1985.
- [4] W. Beckner. A generalized Poincaré inequality for Gaussian measures. Proc. Amer. Math. Soc., 105(2):397–400, 1989.
- [5] T. Bodineau, J. Lebowitz, C. Mouhot, and C. Villani. Lyapunov functionals for boundary-driven nonlinear drift-diffusion equations. Nonlinearity, 27(9):2111–2132, 2014.
- [6] Y. Brenier. Geometric origin and some properties of the arctangential heat equation. Tunis. J. Math., 1(4):561–584, 2019.
- [7] Y. Brenier and D. Vorotnikov. On optimal transport of matrix-valued measures. ArXiv e-prints, Aug. 2018.
- [8] R. S. Cantrell, C. Cosner, Y. Lou, and C. Xie. Random dispersal versus fitness-dependent dispersal. J. Differential Equations, 254(7):2905–2941, 2013.
