Nonlinear Fokker-Planck equations with reaction as gradient flows of the free energy
Stanislav Kondratyev, Dmitry Vorotnikov

TL;DR
This paper interprets a class of nonlinear Fokker-Planck equations with reaction as gradient flows on the space of measures using the Hellinger-Kantorovich distance, establishing convergence to equilibrium and applications in ecology.
Contribution
It introduces a new gradient flow framework for nonlinear Fokker-Planck equations with reaction terms, without requiring convexity of the entropy, and proves convergence results.
Findings
Proves entropic exponential convergence to equilibrium.
Establishes new dissipation inequalities controlling entropy.
Provides existence of weak solutions under mild conditions.
Abstract
We interpret a class of nonlinear Fokker-Planck equations with reaction as gradient flows over the space of Radon measures equipped with the recently introduced Hellinger-Kantorovich distance. The driving entropy of the gradient flow is not assumed to be geodesically convex or semi-convex. We prove new generalized dissipation inequalities, which allow us to control the relative entropy by its production. We establish the entropic exponential convergence of the trajectories of the flow to the equilibrium. Along with other applications, this result has an ecological interpretation as a trend to the ideal free distribution for a class of fitness-driven models of population dynamics. Our existence theorem for weak solutions under mild assumptions on the nonlinearity is new even in the absence of the reaction term.
Stanislav Kondratyev
CMUC, Department of Mathematics, University of Coimbra, 3001-501 Coimbra, Portugal
and
Dmitry Vorotnikov
CMUC, Department of Mathematics, University of Coimbra, 3001-501 Coimbra, Portugal
Keywords: functional inequalities, optimal transport, Hellinger-Kantorovich distance, geodesic non-convexity
MSC [2010] 26D10, 35Q84, 49Q20, 58B20
1. Introduction
1.1. Setting
Let be an open connected bounded domain in with sufficiently smooth boundary and let be the outward unit normal along . We are interested in nonnegative solutions of
[TABLE]
Here is the unknown function, is a known nonlinear function of and , equation (1.2) is the no-flux boundary condition and the initial data are nonnegative. We refer to Section 1.3 for the motivation and background.
When considering problem (1.1)–(1.3), we always make the following assumptions concerning the function :
[TABLE]
When needed, we also assume that
[TABLE]
Remark 1.1.
We make comfortable assumptions about the smoothness of . We do not insist that should be defined for so as not to exclude the interesting cases such as (which corresponds to the linear Fokker-Planck equation, cf. [15, 20]) and , , (the fast diffusion, cf. [40]). However, we assume in (1.5) that the functions and admit continuous extensions to . This ensures that the terms in (1.1) make sense. Moreover, we assume (1.10) to avoid certain complications with the entropy production to be defined below.
Remark 1.2.
Assumption (1.6) is essential: it ensures the parabolicity of (1.1). The equation may become degenerate or singular only if or is large. The latter does not bother us as we only consider bounded solutions in what follows.
Remark 1.3.
Assumptions (1.7), (1.8) ensure the existence of a positive equilibrium, see below.
Remark 1.4.
Estimate (1.9) ensures that the entropy and energy of the equation are well-defined and well-behaved. Note that at least some restrictions on the growth of as are inevitable, as the related very fast diffusion equation is known to behave abnormally [39].
Remark 1.5.
Conditions (1.11) and (1.12) are convenient technical assumptions needed for -bounds (hence for the existence theorem) and for controlling the energy for large in the proof of Theorem 1.8. However, they are not necessary everywhere, so we explicitly mention them when the need arises.
Remark 1.6.
The results of the paper remain valid if is the periodic box .
It follows from (1.6)–(1.8) that for any there exists a unique such that
[TABLE]
Clearly, . It is a stationary solution of (1.1), (1.2). As we will see, all non-zero solutions of the problem converge to .
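The existence and uniqueness of the equilibrium density rest on the fitness being strictly decreasing in the density, positive for small densities and negative for large ones. Under exactly these qualitative assumptions the equilibrium can be located pointwise by bisection. The sketch below is illustrative only: the function names and the sample non-logistic fitness `sample_fitness` are hypothetical and are not taken from the paper.

```python
import math


def equilibrium_density(fitness, x, lo=1e-12, hi=1.0, iterations=100):
    """Find the unique zero in u of u -> fitness(x, u) by bisection.

    Assumes fitness(x, .) is continuous and strictly decreasing, positive
    near u = 0 and negative for large u (qualitative assumptions only).
    """
    if fitness(x, lo) <= 0.0:
        raise ValueError("fitness must be positive for small densities")
    # Enlarge the bracket until the fitness becomes negative.
    while fitness(x, hi) > 0.0:
        hi *= 2.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if fitness(x, mid) > 0.0:
            lo = mid  # the zero lies to the right of mid
        else:
            hi = mid  # the zero lies to the left of mid (or at mid)
    return 0.5 * (lo + hi)


def sample_fitness(x, u):
    """Hypothetical fitness: resource level minus a quadratic crowding term."""
    return 1.0 + 0.5 * math.cos(math.pi * x) - u * u


if __name__ == "__main__":
    for x in (0.0, 0.25, 0.5, 1.0):
        print(x, equilibrium_density(sample_fitness, x))
```

For this sample fitness the zero is known in closed form (the square root of the resource level), which gives a convenient check of the bisection.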
1.2. Energy and entropy
Now we will introduce the energy and entropy functionals for equation (1.1) as well as the notion of weak solution.
Put
[TABLE]
It is easy to see that
[TABLE]
Observe that both and are nonnegative and strictly increase with respect to .
Note that if is a nonnegative function of and possibly of , an -bound on is translated into an -bound on , i. e., the superposition operator associated with is -bounded. The same is true of .
Let be a classical solution of (1.1)–(1.3). Equation (1.1) can be cast in the equivalent form
[TABLE]
where we write for , etc. Multiplying by and integrating over , we obtain
[TABLE]
We call the functional
[TABLE]
the energy of problem (1.1)–(1.3) and equation (1.14), the energy identity. Thus, any classical solution of (1.1)–(1.3) satisfies the energy identity (1.14).
For our purposes, the energy identity is useful because it allows us to control the integral . In particular, we can define the weak solution of (1.1)–(1.3) in a class of functions such that . It is easier to exploit this assumption in the case of equation (1.13). Thus, we define the weak solution as follows:
Definition 1.7.
Let . A function is called a weak solution of (1.1)–(1.3) on if and
[TABLE]
for any function such that . A function is called a weak solution of (1.1)–(1.3) on if for any it is a weak solution on .
Now, let us address the entropy of the problem. Define
[TABLE]
It follows from (1.9) that is well-defined and continuous on . As decreases with respect to and , it is clear that and if and only if . The relative entropy of equation (1.1) is the functional
[TABLE]
Observe that it is well-defined at least for as the superposition operator is bounded in the spaces .
A straightforward computation shows that for a positive classical solution of (1.1)–(1.3) we have
[TABLE]
Equation (1.17) is called the entropy dissipation identity and the integral on the right-hand side of (1.17) is called the entropy production. However, the term may make no sense for vanishing or non-smooth . In order to generalise the definition of the entropy production, we use the identity
[TABLE]
Given a function such that , the right-hand side of the last identity is a nonnegative measurable function on , so we can define the entropy production for such functions by the formula
[TABLE]
where the second integral on the right-hand side may be infinite. Thus, we see that any positive classical solution of (1.1)–(1.3) satisfies the entropy dissipation identity
[TABLE]
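As a worked instance of this computation, consider a hedged special case. Assume that (1.1) has the fitness-driven form $\partial_t u = -\operatorname{div}(u\nabla f) + uf$ with the no-flux condition $u\,\partial f/\partial\nu = 0$ on $\partial\Omega$, and take the logistic fitness $f(x,u) = m(x) - u$ with a smooth positive $m$. Both the form of the equation and the choice of $f$ are assumptions made here for illustration. In this case the entropy density is $(u-m)^2/2$, and for a positive classical solution one obtains:

```latex
\begin{align*}
\frac{d}{dt}\int_\Omega \frac{(u-m)^2}{2}\,dx
  &= \int_\Omega (u-m)\,\partial_t u\,dx
   = -\int_\Omega f\,\bigl(-\operatorname{div}(u\nabla f) + u f\bigr)\,dx \\
  &= \int_\Omega f\,\operatorname{div}(u\nabla f)\,dx - \int_\Omega u f^2\,dx \\
  &= \int_{\partial\Omega} f\,u\,\frac{\partial f}{\partial\nu}\,dS
     - \int_\Omega u\,|\nabla f|^2\,dx - \int_\Omega u f^2\,dx \\
  &= -\int_\Omega u\,|\nabla f|^2\,dx - \int_\Omega u f^2\,dx \;\le\; 0 .
\end{align*}
```

The boundary term vanishes by the no-flux condition, and both remaining integrals are nonnegative for nonnegative $u$, so the entropy is non-increasing in this example.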
As usual, in the case of weak solutions we establish not the identities (1.14) and (1.18) but rather corresponding inequalities, viz. the energy inequality
[TABLE]
and the entropy dissipation inequality
[TABLE]
For functions such that we understand (1.19) and (1.20) in the sense of measures, i. e., that for any smooth nonnegative compactly supported function we respectively have
[TABLE]
If (1.20) holds in the sense of measures, the derivative is a nonpositive distribution and hence a measure, while the entropy itself a. e. coincides with a non-increasing function.
An important question is whether the entropy can be controlled by the entropy production, since this would imply the exponential stability of the equilibrium. It turns out that this is true provided that the -norm of is bounded away from [math]. Specifically, we have
Theorem 1.8 (Entropy-entropy production inequality).
Suppose that satisfies (1.4)–(1.10) as well as (1.11). Let be a set of functions such that for any , we have and
[TABLE]
Then there exists such that
[TABLE]
Theorem 1.8 is a consequence of a fairly general functional inequality established in Section 2.
Theorem 1.9 (Existence of weak solutions).
Suppose that satisfies (1.4)–(1.10) as well as (1.11) and (1.12). Then for any there exists a nonnegative weak solution of problem (1.1)–(1.3) enjoying the following properties:
(1) (upper -bound)
[TABLE]
(2) satisfies the energy inequality (1.19) in the sense of measures and
[TABLE]
(3) satisfies the entropy dissipation inequality (1.20) in the sense of measures and
[TABLE]
(4) (lower -bound)
[TABLE]
Remark 1.10.
Theorem 1.9, mutatis mutandis, is also valid in the case of the pure Fokker-Planck equation (1.29). Even in this case, our conditions on the nonlinearity are more relaxed than the ones available in the literature, see, e.g., [1, 4, 22, 23, 13, 40, 7, 3] and the references therein.
Remark 1.11.
In the general case, uniqueness of solutions cannot be expected due to the non-Lipschitz reaction term. However, our weak solutions are unique provided the initial data is bounded away from zero, see Theorem 3.9.
Remark 1.12.
Under the hypotheses of Theorem 1.9, the right-hand side of (1.23) is always finite (see Remark 3.5). Moreover, if satisfies an estimate , inequality (1.23) provides an estimate .
The next theorem shows that the solutions that we have constructed exponentially converge to . Note that (1.12) is not needed for the long-time convergence.
Theorem 1.13 (Convergence to equilibrium).
Assume (1.11) and suppose that a weak solution of (1.1)–(1.3) with the initial data satisfies the entropy dissipation inequality (1.20), inequality (1.25), and the lower -bound (1.26). Then exponentially converges to in the sense of entropy:
[TABLE]
where can be chosen uniformly over initial data satisfying
[TABLE]
with some .
Theorems 1.8, 1.9, and 1.13 are proved in Section 3.3.
1.3. Motivation and background
The nonlinear Fokker-Planck equation
[TABLE]
is intended to express the behaviour of stochastic systems coming from various branches of physics, chemistry and biology, see [15, 38, 21, 5]. In order to take into account the creation and annihilation of mass, the general drift-diffusion-reaction equation (1.1) was suggested in [14]. In the considerations of [14] (cf. also [15]), the crucial role is played by the free energy functional that up to an additive constant coincides with our relative entropy functional from (1.16). We opt for this change of terminology (though for thermodynamicists the free energy involves the (physical) entropy, the internal energy, and the temperature) because in mathematical analysis it is convenient to refer to the basic Lyapunov functional of a system as the entropy, cf. [41, p. 270].
On the other hand, equation (1.1) is a general nonlinear model for the spatial dynamics of a population that is tending to achieve the ideal free distribution [17, 16] (the distribution that happens if everybody is free to choose its location) in a heterogeneous environment. The dispersal strategy is determined by a local intrinsic characteristic of organisms called fitness (see, e.g., [10, 11]). The fitness manifests itself as a growth rate, and simultaneously affects the dispersal as the species move along its gradient towards the most favorable environment. In (1.1), is the density of organisms, and is the fitness. The equilibrium when the fitness is constantly zero corresponds to the ideal free distribution. The original model [32, 10] assumes a linear logistic fitness
[TABLE]
but in general it can be any nonlinear function of the spatial variable and the density, cf. [11]. The assumptions (1.6), (1.7), (1.8) are natural as they simply mean that the fitness is decreasing with respect to the population density (as the resources are limited), being positive for very small densities and negative for very large densities. Our Theorem 1.13 indicates that the populations converge to the ideal free distribution with an exponential rate.
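The trend to the ideal free distribution can be observed in a small numerical experiment. The following is a minimal sketch, not the authors' method. It assumes that (1.1) has the fitness-driven form u_t = -div(u grad f) + u f with the logistic fitness f = m(x) - u, works in one space dimension on (0, 1) with no-flux boundaries, and uses a hypothetical resource profile `m`, a hypothetical initial datum, and an explicit finite-volume scheme chosen only for illustration. For this fitness the relative entropy reduces to the integral of (u - m)^2 / 2, which the code records along the trajectory.

```python
import numpy as np


def simulate(n_cells=32, t_end=3.0, record_every=512):
    """Explicit finite-volume sketch for u_t = -(u f_x)_x + u f, f = m - u.

    Assumed setting (illustrative only): domain (0, 1), zero flux at both
    ends, a hypothetical resource profile m and initial datum u.
    Returns the grid, m, the final u, and the recorded entropy values.
    """
    dx = 1.0 / n_cells
    x = (np.arange(n_cells) + 0.5) * dx          # cell centres
    m = 1.0 + 0.05 * np.cos(np.pi * x)           # hypothetical resources
    u = 0.5 + 0.3 * np.cos(2.0 * np.pi * x)      # positive initial density
    dt = 0.1 * dx**2                             # small step for stability
    n_steps = int(round(t_end / dt))

    def entropy(density):
        # Relative entropy for the logistic fitness: integral of (u-m)^2/2.
        return 0.5 * dx * np.sum((density - m) ** 2)

    history = [entropy(u)]
    for step in range(1, n_steps + 1):
        f = m - u                                 # logistic fitness
        u_face = 0.5 * (u[:-1] + u[1:])           # density at interior faces
        flux = u_face * (f[1:] - f[:-1]) / dx     # transport flux u * f_x
        flux = np.concatenate(([0.0], flux, [0.0]))  # no-flux boundaries
        u = u + dt * (-(flux[1:] - flux[:-1]) / dx + u * f)
        if step % record_every == 0:
            history.append(entropy(u))
    return x, m, u, np.array(history)


if __name__ == "__main__":
    x, m, u, history = simulate()
    print("initial entropy:", history[0])
    print("final entropy:  ", history[-1])
    print("max |u - m|:    ", np.max(np.abs(u - m)))
```

In this sketch the recorded entropy decreases along the run and the density approaches the profile `m`, which plays the role of the ideal free distribution. The rate observed this way depends on the assumed data and is not a substitute for the constant in Theorem 1.13.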
The existence of weak solutions for the fitness-driven dispersal model (1.1)–(1.3) with the logistic fitness (1.30) was shown in [12], and the entropic exponential convergence to was established in [25]. The same kind of results for cross-diffusion systems involving several interacting populations (with logistic fitnesses) can be found in [24]. Related two-species models were investigated in [6, 31], where one population uses the fitness-driven dispersal strategy and the other diffuses freely or does not move at all. A system of two interacting populations with a particular nonlinear fitness function has recently been considered in [43], which is the only existing mathematical treatment of a non-logistic fitness model that we are aware of.
But perhaps our main motivation to study (1.1) is that it is a gradient flow of the entropy functional with respect to an intriguing, recently introduced distance on the space of Radon measures. This distance is related to unbalanced optimal transport (i.e., transport that fails to preserve the total transported mass) and is referred to as the Hellinger-Kantorovich distance or the Wasserstein-Fisher-Rao distance [25, 8, 30, 29, 9]. It endows the set of Radon measures with a formal (infinite dimensional) Riemannian metric , and provides first- and second-order differential calculus [25] in the spirit of Otto [35, 41, 42]. In particular, one can compute the metric gradients of the functionals of the form
[TABLE]
by the formula
[TABLE]
where stands for the first variation with respect to and is the usual gradient in space. We refer to [25] for further details and explanations. Since , we can recast (1.1) as a gradient flow
[TABLE]
The entropy dissipation identity (1.17), which by the way was already known to Frank [14], is then nothing but the archetypal property of gradient flows
[TABLE]
In this connection, we recall that for the metric gradient flows like (1.32), the geodesic convexity of the driving entropy functional (or at least semi-convexity, i.e., -convexity with a negative constant ) makes a difference [35, 2, 41, 42, 36]. The presence of convexity allows one to apply minimizing movement schemes [2, 20] to construct solutions to the gradient flow. Moreover, -convexity with strictly positive enables the Bakry-Emery procedure that usually yields the exponential convergence of the relative entropy to zero. Minimizing movement schemes for Hellinger-Kantorovich gradient flows of geodesically convex functionals and for related reaction-diffusion equations were suggested in [19, 18].
Our entropy is geodesically -convex with respect to the Hellinger-Kantorovich structure if , , but fails to be semi-convex for , and for (the latter option corresponds to the interesting case of the Boltzmann entropy). The spatial heterogeneity further complicates the situation. The quadratic (logistic) multicomponent entropy considered in [24, 26] is not even semi-convex. All this can be observed by computing the Hessian of the entropy, cf. [25, Section 3.4]; the non-convexity of the Boltzmann entropy with respect to the Hellinger-Kantorovich metric was also mentioned in [19, 18, 30, 29]. We refer to [28] for a more detailed discussion of examples of and the corresponding geodesic non-convexity. However, Santambrogio [36] emphasizes that the lack of geodesic convexity is not a universal obstacle for the study of gradient flows; our results in the current paper and in [24, 26, 25, 27, 28, 37] illustrate this idea.
2. Generalized dissipation inequalities
2.1. Setting
Motivated by the expressions for the entropy and entropy production, we forget for a while problem (1.1)–(1.3) and consider the integrals
[TABLE]
in their own right. Here a domain in ; ; the functions
[TABLE]
are fixed, and varies over a set of functions . Observe that the nonnegativity of and ensures the existence of the integrals (2.1) and (2.2), although they need not be finite.
The functions and introduced in Section 1.2 are, of course, prototypes for the ones appearing in (2.1) and (2.2), but we assume no formal relationship between them. In particular, in this section we do not suppose that satisfies (1.4)–(1.12).
We would like to know whether (2.1) can be controlled by (2.2) uniformly with respect to . In general, this is not the case, cf. a related discussion in [27]. However, we show that under suitable assumptions on the functions , , and , (2.2) does indeed control (2.1) provided that the set of admissible is separated from [math] in some sense.
For simplicity, we concentrate on the regular case. Section 2.4 contains a discussion of possible generalisations.
Theorem 2.1.
Let be a bounded, connected, open domain in admitting the relative isoperimetric inequality. Let . Suppose that functions and satisfy
[TABLE]
Finally, suppose that a set consisting of strictly positive functions contains no sequence such that is bounded in and converges to [math] in measure. Then there exists a constant such that
[TABLE]
Remark 2.2.
The isoperimetric inequality for reads
[TABLE]
where denotes the relative perimeter of a Lebesgue measurable set of locally finite perimeter with respect to , cf. [33, Remark 12.39], [34]. We recall that the relative perimeter is defined as
[TABLE]
where is the Gauss-Green measure associated with . The support of is contained [33] in the topological boundary of .
Remark 2.3.
If , condition (2.4) is automatically true. If the set is compact, the right-hand side of (2.6) is simplified to and likewise, if , the left-hand side of (2.6) can be written as . As for (2.5), it is more tricky. In Section 2.4 we show that it always holds in a particular setting relevant for gradient flows (Theorem 2.9).
Remark 2.4.
The infimum in (2.5) depends on and may tend to zero as , otherwise the claim would be trivial.
2.2. Strategy of the proof of Theorem 2.1
Before starting the proof of Theorem 2.1, we would like to informally outline the underlying ideas.
For simplicity, we will opt for an argument by contradiction. Of course, a direct proof could be presented (as we have recently done in [27] for a related inequality), and a quantitative constant could be derived from it. However, this would be much more cumbersome, and the constant obtained in this way would anyway not be optimal. Any discussion of quantitative constants lies beyond the scope of this article.
It easily follows from (2.5) that controls from above unless is small. Moreover, we infer (Lemma 2.5) that if the constant in (2.7) blows up, the sets where either or are small tend to grow and together occupy nearly all of , while the ‘transitional annulus’—where neither is small—collapses. At this point we must be prepared to face the situation where the integral
[TABLE]
is controlled neither by
[TABLE]
(because (2.5) is not applicable), nor by
[TABLE]
(because may be small), nor by
[TABLE]
(because the ‘annulus’ is too small).
This is where the term with the gradient comes into play. The crucial observation is that the total variation of over the ‘annulus’ can be estimated from below. Actually, condition (2.6) gives a universal lower bound on the variation of between the ‘inner boundary’ of the annulus (say, where is small) and its ‘outer boundary’ (where is small). All in all, the integral (2.9) is controlled by the area of the set (due to (2.4)), which is controlled by the perimeter of this set (by the isoperimetric inequality), which is in turn controlled by the total variation of over the ‘annulus’. This eventually leads to a contradiction. Naturally, when this idea is implemented in Lemma 2.7 and the subsequent reasoning, we must relate the total variation and the integral . Then we use the coarea formula and estimate the total variation of by the perimeters of its superlevel sets.
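The last step of this outline, estimating the total variation through the perimeters of superlevel sets, can be checked numerically in one dimension. The sketch below is an illustration under simplifying assumptions and is not part of the proof: the domain is the interval (0, 1), the function is a hypothetical piecewise-linear sample, and the relative perimeter of a superlevel set is simply the number of its boundary points inside the interval. The coarea formula then states that the total variation equals the integral of this count over all levels.

```python
import numpy as np


def total_variation(values):
    """Total variation of the piecewise-linear interpolant of the samples."""
    return float(np.sum(np.abs(np.diff(values))))


def coarea_integral(values, n_levels=20000):
    """Integrate over levels t the relative perimeter of {values > t}.

    In one dimension the relative perimeter is the number of interior
    boundary points of the superlevel set, i.e. the number of sign changes
    of the indicator between neighbouring samples. The integral over t is
    approximated by the midpoint rule.
    """
    lo, hi = float(values.min()), float(values.max())
    dt = (hi - lo) / n_levels
    levels = lo + (np.arange(n_levels) + 0.5) * dt
    inside = values[None, :] > levels[:, None]        # shape (levels, samples)
    crossings = np.sum(inside[:, 1:] != inside[:, :-1], axis=1)
    return float(dt * np.sum(crossings))


if __name__ == "__main__":
    x = np.linspace(0.0, 1.0, 201)
    u = np.sin(2.0 * np.pi * x) + 0.5 * np.sin(6.0 * np.pi * x)  # sample function
    print("total variation:", total_variation(u))
    print("coarea integral:", coarea_integral(u))
```

The two printed numbers agree up to the quadrature error of the midpoint rule, which shrinks as `n_levels` grows.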
2.3. Proof of Theorem 2.1
Here we prove Theorem 2.1. We start with the following observations.
Under the hypotheses of Theorem 2.1, integral (2.1) is finite for whenever so is
[TABLE]
Indeed, according to (2.4) we can choose such that
[TABLE]
By (2.5), we have
[TABLE]
(possibly ). Then whenever , so
[TABLE]
as claimed.
Take sequences and such that , ,
[TABLE]
(this is possible according to (2.5)), and .
Assume that Theorem 2.1 is not true. Then there exists a sequence of functions such that
[TABLE]
where
[TABLE]
Clearly, and . Moreover, it easily follows from (2.3)–(2.6) that
[TABLE]
and according to the choice of , we have
[TABLE]
We want to show that the sequence is bounded in and in measure, thus obtaining a contradiction.
We use (2.10) to estimate
[TABLE]
Thus, we have
[TABLE]
For large , the first term on the right-hand side is negative, so we conclude that
[TABLE]
From (2.15) we get
[TABLE]
and by (2.12), the last expression is bounded uniformly with respect to . Hence the sequence is bounded in .
Lemma 2.5.
Given ,
[TABLE]
Proof.
Using (2.17), we have:
[TABLE]
where we have taken into account (2.12), so (2.18) is proved. ∎
Lemma 2.6.
Given , for large we have
[TABLE]
Proof.
Using the estimate
[TABLE]
obtained in the proof of Lemma 2.5, we get
[TABLE]
and the lemma follows. ∎
It follows from (2.13) that we can choose , , and , all independent of , such that for large we have
[TABLE]
We can assume that the limit
[TABLE]
exists. It follows from (2.20) that for large the sets and are disjoint, so in view of Lemma 2.5 we have
[TABLE]
Thus, we actually face three logical possibilities:
[TABLE]
As , (2.22) clearly implies in measure, a contradiction.
In what follows we show that (2.23) and (2.24) are in fact impossible. The following lemma is crucial.
Lemma 2.7.
We have
[TABLE]
Proof.
We have
[TABLE]
Using the coarea formula, we get:
[TABLE]
Fix . Invoking the definition of the relative perimeter, we have
[TABLE]
where is the Gauss-Green measure. Obviously, we have
[TABLE]
for any . It follows from (2.20) that
[TABLE]
so
[TABLE]
and continuing (2.29), we obtain
[TABLE]
Combining this with (2.27) and (2.28), we obtain (2.25). ∎
Let us show that (2.23) is impossible. Assume that it holds.
If at a point we have , , (2.20) guarantees that . Hence, . It follows from (2.23) and (2.21) that , and thus , so we conclude that is uniformly in small when is large. For such large we can apply the isoperimetric inequality:
[TABLE]
Now it follows from (2.20) that , so we have
[TABLE]
Plugging this estimate into (2.25), we obtain
[TABLE]
Estimating
[TABLE]
by virtue of (2.19), we obtain
[TABLE]
where is independent of .
Combining the obtained estimate with (2.16), we get:
[TABLE]
whence
[TABLE]
as and the suprema are bounded by (2.12). This contradicts the fact that the left-hand side is a positive constant independent of . Thus, (2.23) is impossible.
It remains to show that (2.24) is also impossible. Assume that it holds.
It is easy to check that in this case we have
[TABLE]
where is independent of and . Indeed, we have the inclusions
[TABLE]
and as in our case the measure of the first and third terms goes to as , we also have
[TABLE]
Now it suffices to apply the isoperimetric inequality to if and to otherwise.
Plugging (2.30) into (2.25), we get
[TABLE]
Comparing this with (2.16), we obtain
[TABLE]
As , the left-hand side remains bounded away from 0, while the right-hand side goes to 0, a contradiction.
2.4. Generalisations and specialisations
We start with the remark that Theorem 2.1 can often be applied if is a subset of a space of functions defined on provided that is dense in and the integrals (2.1) and (2.2) are continuous with respect to the topology of . Indeed, if is dense in , we apply the theorem to and proceed by density to make sure that the same constant works for as well. On the other hand, if is not dense in , we replace with its small enlargement in the cone of nonnegative functions in and apply the same reasoning to . A more complicated density argument is used in the proof of Theorem 1.8 given in Section 3.3.
Another question is whether the constant can be chosen uniformly with respect to if the latter triple is allowed to vary over a set . It turns out that Theorem 2.1 can be easily extended to handle this case. Specifically, if the suprema and infima in (2.4)–(2.6) are additionally taken over , the constant can be chosen independently of . The proof remains essentially the same. Assuming the converse, we have violating sequences and such that (2.10) holds with
[TABLE]
Moreover, the functions , , and satisfy (2.11)–(2.14). The rest of the proof can be reused verbatim.
It should also be noted that the bare on the right-hand side of (2.7) can be replaced by a nonnegative function . Of course, in this case it no longer makes sense to require that should consist exclusively of positive functions. The separation from [math] should be taken in the sense that no sequence , where and the sequence is bounded in , converges to [math] in measure. However, if is, for example, an increasing function vanishing at [math], this new condition is clearly equivalent to the original one.
Again, the proof remains essentially unchanged, the sets and being replaced by and , respectively (here ).
Summarising, we have the following strengthened version of Theorem 2.1:
Theorem 2.8.
Let be a bounded, connected, open domain in admitting the relative isoperimetric inequality. Let and be an interval (possibly unbounded). Let be a set of tuples such that , , and
[TABLE]
Finally, suppose that a set satisfies the following requirement: for any sequences and such that the sequence is bounded in , the sequence does not converge to [math] in measure. Then there exists a constant depending only on , , and such that
[TABLE]
The proof is left to the reader.
Another option would be to allow for nonnegative instead of strictly positive in Theorem 2.1. In this case one assumes that and that the supremum in (2.4) is taken over and . The resulting inequality differs from (2.7) in that the integral on the right-hand side is taken over . The only modification needed in the proof is that whenever or are integrated over , the domain of integration should be changed to . Note that this does not fit into the previous theorem because can be undefined on .
We conclude by showing that Theorem 2.1 is applicable in a situation relevant for gradient flows. In the subsequent formulation, and denote the derivatives of the functions and , respectively, with respect to their second argument.
Theorem 2.9.
Suppose that functions , , and satisfy
[TABLE]
and let be a set of strictly positive functions having the property that no sequence such that is bounded in , converges to [math] in measure. Finally, let and
[TABLE]
Then we have
[TABLE]
where depends on , , , and .
Remark 2.10.
Observe that under the hypotheses of Theorem 2.9, the functions and are uniquely determined by . Indeed, if is fixed, as a function of attains its minimum at , so , i. e., , according to (2.38). This uniquely defines , as it follows from (2.39) that strictly decreases with respect to . Now, is the antiderivative of with respect to vanishing at .
Proof.
We check the hypotheses of Theorem 2.8 with , , , and the set consisting of the single tuple . Clearly, we have (2.31), while (2.32)–(2.34) are equivalent to (2.4)–(2.6).
Recalling Remark 2.3, we see that (2.4) holds.
Let us check (2.6). Fix . The function is strictly convex in and attains its zero minimum only at . As , we see that
[TABLE]
On the other hand, as decreases with respect to , we have
[TABLE]
so (2.6) indeed holds.
It remains to check (2.5). Without loss of generality, assume that is such that
[TABLE]
By Cauchy’s mean value theorem, for any , , , we have
[TABLE]
where is some point between and .
By uniform continuity, there exists such that
[TABLE]
implies
[TABLE]
Then from (2.44) and (2.41) we see that
[TABLE]
Further, using (2.45) and (2.42), we have
[TABLE]
whence, recalling that is negative and is decreasing, we conclude
[TABLE]
Now, if , the point also satisfies , so we use (2.46) to conclude from (2.43) that
[TABLE]
If , then either and we again obtain (2.48), or and then we use (2.47) to get
[TABLE]
Thus,
[TABLE]
since the function is continuous and positive on the compact set
[TABLE]
We have shown that (2.5) holds.
Thus, the hypotheses of Theorem 2.8 are fulfilled and the inequality follows. ∎
3. Technicalities
3.1. Positive classical solutions
Let
[TABLE]
be the Heaviside step function.
Lemma 3.1.
If nonnegative satisfy the no-flux boundary condition (1.2), then
[TABLE]
where and stand for and , respectively.
Proof.
Without loss of generality, the functions and are defined and smooth on . Consider the set . First let us assume that [math] is a regular value of the function , then the boundary of is smooth. Employing De Giorgi’s Gauss-Green formula [33, Theorem 15.9] and the formula for the Gauss-Green measure of an intersection [33, Theorem 16.3], we compute
[TABLE]
where is the measure-theoretic outward unit normal vector along the reduced boundary of the intersection [33]. Due to the no-flux boundary condition, the last two integrals vanish. On , we have and consequently, . Thus, we can write
[TABLE]
Due to the monotonicity of , we have . We see then that whenever on , is an outward normal vector along . Thus, and equality (3.2) gives (3.1).
In the general case, take a decreasing sequence such that [math] is a regular value of . Set
[TABLE]
By the above, we have
[TABLE]
As is left-continuous, we have
[TABLE]
moreover, it is clear that
[TABLE]
Passing to the limit in (3.3), we obtain (3.1). ∎
Lemma 3.2 (-contraction for positive classical solutions).
Let and be classical solutions of (1.1)–(1.3) on with different initial data. Suppose that and satisfy
[TABLE]
with some and let be such that
[TABLE]
Then for a. a. ,
[TABLE]
Proof.
We have:
[TABLE]
where and stand for and , respectively. By Lemma 3.1, we have . To estimate , we use (3.4) and the observation that the integrand vanishes where , thus obtaining
[TABLE]
Inequality (3.5) follows. ∎
For , define by
[TABLE]
As is monotone in , we see that the function is unique, but it does not need to exist for a given . Note that .
Remark 3.3.
There is a simple formula for the -norm of :
[TABLE]
It follows from the fact that due to the monotonicity of , the inequality or, equivalently, for all , holds if and only if for all , i. e., when
[TABLE]
Remark 3.4.
If (1.11) holds, for any the function with
[TABLE]
is well-defined and a. e. in . Indeed, if the second alternative in (1.11) holds, for any , the function assumes all the values in the interval as varies over ; in particular, attains the value . If, on the other hand, the first alternative in (1.11) holds, take such that is independent of and negative. Clearly, for any fixed , the function takes all the values in the interval as varies over . Now it suffices to observe that due to the monotonicity of , we have . One can prove in the same way that if (1.12) holds, for any function essentially bounded away from [math] on , there exists such that a. e. in , and .
Remark 3.5.
It follows from Remarks 3.4 and 3.3 that if (1.11) holds, the right-hand side of (1.23) is finite for any .
Lemma 3.6 (Restricted -contraction).
Let be a classical solution of (1.1)–(1.3) on . Then for we have
[TABLE]
and likewise, for we have
[TABLE]
provided that exists.
Proof.
Let us prove (3.9) for . Computing the derivative of the left-hand side, for a. a. we get
[TABLE]
As , we can use Lemma 3.1 to get . Now, the integrand of can only be non-zero where , in which case due to the monotonicity of ; consequently, . Thus, we have
[TABLE]
and (3.9) follows. Inequality (3.10) is proved in much the same way. ∎
Lemma 3.7.
Suppose that satisfies (1.11) and (1.12). Then for any smooth satisfying the no-flux boundary condition, problem (1.1)–(1.3) has a classical solution.
Proof.
Equation (1.1) can be cast in the form
[TABLE]
If we show that a classical solution is a priori bounded and stays away from 0, we can ignore the fact that the coefficient can be degenerate or singular at and infer the existence of the solution from the classical theory of quasilinear parabolic equations.
Indeed, according to Remark 3.4, we can find and such that and
[TABLE]
Then it follows from Lemma 3.6 that
[TABLE]
providing the required bounds. ∎
3.2. Positive initial data
If the initial data (1.3) is bounded away from [math], we approximate it with smooth functions and prove the existence and uniqueness of weak solutions to (1.1)–(1.3) stated in Theorem 3.9 below.
Lemma 3.8.
Suppose that satisfies the energy inequality (1.19) in the sense of measures; then
[TABLE]
where is determined by an upper bound on .
Proof.
The function
[TABLE]
has a non-positive derivative in the sense of measures, so it a. e. coincides with a non-increasing function. In other words, for a. a. , , we have
[TABLE]
An upper bound on defines essential upper bounds on , , , and , so for a. a. we can estimate
[TABLE]
whence
[TABLE]
Passing to the essential upper limit as and estimating , we obtain
[TABLE]
whence (3.11) and (3.12) follow. ∎
Theorem 3.9 (Solvability for positive data).
Suppose that satisfies (1.4)–(1.10) as well as (1.11) and (1.12). Then for any such that
[TABLE]
with some constant , there exists a unique weak solution
[TABLE]
satisfying the following properties: i) the upper bound (1.23) and lower bound (1.26); ii) the energy and entropy dissipation inequalities as well as (1.24) and (1.25); iii) the restricted contraction
[TABLE]
whenever is defined; iv) if is another such solution with the initial data , the -contraction holds:
[TABLE]
where is defined by (3.4).
Proof.
Let be a sequence of smooth functions satisfying the no-flux boundary condition such that
[TABLE]
and
[TABLE]
Let be the classical solution of (1.1)–(1.3) on with the initial data . For any , it follows from Lemma 3.2 that
[TABLE]
so is a Cauchy sequence in . As is arbitrary, we see that converges in to some function . We claim that it is the sought-for solution.
By Remark 3.4, there exists () such that ; then dominates the initial data and thus, the solutions as well, which follows from Lemma 3.6. Consequently, the sequence is bounded in , so it converges to weakly* in this space, whence .
Put
[TABLE]
Fix . As the sequence is bounded in , so are the sequences , , , , , and . Thus, there is no loss of generality in assuming
[TABLE]
where we write for , etc. It follows from (3.18) that in the sense of distributions. The approximate solutions satisfy the energy inequality and (1.24) while their initial energy is bounded, so we see from (3.12) that the sequence is bounded in . Consequently, and
[TABLE]
Let us check that is a weak solution of (1.1)–(1.3) on . Take an admissible test function . Writing the weak setting for the approximate solution, we have
[TABLE]
It follows from (3.17), (3.18), and (3.19) that we can pass to the limit in (3.20) and obtain (1.15) for . Thus, is indeed a weak solution.
Let us show that satisfies the energy inequality on in the sense of measures. Taking a smooth nonnegative test function vanishing outside of , we write the energy inequality in the sense of measures for the approximate solutions:
[TABLE]
Convergences (3.18) ensure that we can pass to the limit in all the terms but for the first one on the right-hand side. As for the latter, it follows from (3.19) that weakly in , whence
[TABLE]
and the energy inequality follows.
Let us check (1.24). The approximate solutions satisfy
[TABLE]
so by virtue of (3.11) we obtain
[TABLE]
It follows from (3.17) and (3.18) that
[TABLE]
so we get
[TABLE]
Now sending we recover (1.24).
Let us show that satisfies the entropy dissipation inequality on in the sense of measures. Let be a smooth nonnegative test function vanishing outside of . The approximate solutions satisfy the entropy dissipation inequality in the sense of measures, so we have
[TABLE]
Consequently, for any we have
[TABLE]
Observe that
[TABLE]
We claim that
[TABLE]
Then, taking into account (3.18), we can pass to the limit in (3.21) obtaining
[TABLE]
On the set we have (by virtue of (1.10)), and , whence also a. e. on this set. Thus, we can write
[TABLE]
Letting , by Beppo Levi’s theorem we obtain the entropy dissipation inequality.
To prove the technical claim (3.27), we use a variant of the Banach-Alaoglu theorem in varying spaces:
Lemma 3.10.
Let be an open set, a sequence of finite non-negative Radon measures narrowly converging to , and a sequence of vector fields on . If
[TABLE]
then there exists such that, up to extraction of some subsequence,
[TABLE]
and
[TABLE]
The proof of this fact by optimal transport techniques can be found in [2]; this lemma also follows from a variant of the Banach-Alaoglu theorem [25, Proposition 5.3]. We will apply this lemma with , from (3.26), and the sequence of measures , which converges narrowly to due to the strong convergence (3.25). Extracting a subsequence if needed, we see that there is a vector field satisfying (3.28) and (3.29). On the other hand, by (3.25) and (3.26),
[TABLE]
weakly in . Invoking (3.28), we find that
[TABLE]
for all test functions . By density, we conclude that in , and (3.27) follows from (3.29).
Inequality (1.25) is proved in the same way as (1.24) given that it holds for the approximate solutions.
Inequalities (3.13)–(3.15) follow from the corresponding inequalities for the approximate solutions (Lemmas 3.2 and 3.6), as we obviously have
[TABLE]
where the approximations are constructed in the same way as .
Contraction (3.15) implies the uniqueness of .
To obtain the upper bound (1.23), we define by (3.8) and thus have on , whence in view of contraction (3.13),
[TABLE]
Recalling the formula (3.7) for the norm of , we obtain the upper bound.
To obtain the lower -bound (1.26), we take in (3.14), obtaining
[TABLE]
as required. ∎
3.3. Nonnegative initial data
If initial data (1.3) is only nonnegative, we approximate it with positive functions and reuse the proof of Theorem 3.9 to establish the existence of solutions to (1.1)–(1.3) as stated in Theorem 1.9 (but not uniqueness, owing to the loss of contraction).
Proof of Theorem 1.9.
Take a decreasing sequence and set
[TABLE]
By Theorem 3.9, there exists a weak solution of (1.1)–(1.3) with the initial data . Contraction (3.15) ensures the comparison principle for this sequence of solutions, whence a. e. in . Consequently, there exists the monotone limit and moreover, we obviously have the convergences (3.18). From this point on, the proof repeats that of Theorem 3.9 except that (3.13) and (3.14) hold almost everywhere rather than everywhere. ∎
We conclude by proving Theorems 1.8 and 1.13.
Proof of Theorem 1.8.
Let and consider the function implicitly defined by the equation
[TABLE]
As is monotone with respect to its second argument, is uniquely defined. Clearly, is .
Fix . We claim that there exists a sequence of functions such that
[TABLE]
Indeed, take a sequence , where and , put , and let be the mollification of . Observe that is strictly positive and so is . It suffices to show that for any sufficiently large there exists such that whenever , we have
[TABLE]
If the second alternative in (1.11) holds, for every we have
[TABLE]
as . This implies that , so (3.30) obviously holds with any .
Assume the first alternative in (1.11). Take such that does not depend on if and set
[TABLE]
We have:
[TABLE]
Thus, for large we have
[TABLE]
Upon mollification,
[TABLE]
For a fixed , the function is continuous on , so the mollifications converge to it uniformly on as . Consequently,
[TABLE]
for all , proving (3.30).
Taking a sequence as above, we can set , so that . Clearly, and . Further, the sequence is bounded in because so is , and due to the continuity of we have
[TABLE]
As a consequence, for and we have
[TABLE]
where we write for , etc. In particular, there is no loss of generality in assuming a lower bound
[TABLE]
(positivity by virtue of (1.21)), where is obviously independent not only of but of as well.
Define
[TABLE]
By Theorem 2.9, there exist a function
[TABLE]
where , and a constant such that
[TABLE]
In particular, as , we see that
[TABLE]
where .
Let us check that we can pass to the limit in (3.33). First, it follows from (3.32) that
[TABLE]
Next, note that we clearly have
[TABLE]
and thus, again using (3.32), we obtain
[TABLE]
Finally, as is smooth and positive, we can write
[TABLE]
On the set we have by (1.10), , and , the last equality implying a. e. on . Thus, we can write
[TABLE]
To sum up, we have
[TABLE]
which is even stronger than (1.22). ∎
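The proof above relies on the standard fact that mollifications of a continuous function converge to it uniformly on compact sets. The Python sketch below illustrates this fact in one dimension under stated assumptions: the kernel is the usual bump $\exp(-1/(1 - s^2))$ rescaled to $[-\varepsilon, \varepsilon]$, the convolution is approximated by a normalized midpoint rule, and the names `mollify` and `sup_error` are hypothetical. Since the discrete weights are nonnegative and normalized, the result is a convex combination of values of $g$ within distance $\varepsilon$ of $x$, so for a $1$-Lipschitz $g$ the error is at most $\varepsilon$.

```python
import math


def mollify(g, x, eps, n=200):
    """Approximate the mollification (g * rho_eps)(x) by a normalized midpoint rule.

    rho is the standard bump exp(-1 / (1 - s^2)) on (-1, 1); the weights are
    normalized by their sum, so the output is a convex combination of values
    g(x - y) with |y| < eps.
    """
    h = 2.0 * eps / n
    num = 0.0
    den = 0.0
    for k in range(n):
        y = -eps + (k + 0.5) * h  # midpoint of the k-th cell, |y| < eps
        s = y / eps
        w = math.exp(-1.0 / (1.0 - s * s))
        num += w * g(x - y)
        den += w
    return num / den


def sup_error(g, eps, xs):
    """Largest deviation of the mollification from g over the sample points xs."""
    return max(abs(mollify(g, x, eps) - g(x)) for x in xs)


if __name__ == "__main__":
    xs = [-1.0 + 0.01 * k for k in range(201)]
    for eps in (0.1, 0.01, 0.001):
        print(eps, sup_error(abs, eps, xs))
```

The sample function `abs` is continuous but not differentiable at the origin, where the error is largest; the deviation still shrinks with $\varepsilon$.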
Proof of Theorem 1.13.
Let be the set of functions such that for any , we have and . By Theorem 1.8 we have the entropy-entropy production inequality (1.22) for .
Let be a weak solution of (1.1)–(1.3) with the initial data satisfying (1.28). It follows from the lower -bound (1.26) that for a. a. . Combining the entropy dissipation and entropy-entropy production inequalities, we obtain
[TABLE]
Letting , we see that in the sense of measures, whence a. e. coincides with a nonincreasing function. Moreover,
[TABLE]
by virtue of (1.25), so for a. a. , yielding (1.27) with . ∎
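The way a dissipation inequality combines with a lower bound on the solution to give exponential decay can be checked on a spatially homogeneous toy problem. Everything in the sketch below is an assumption made for illustration and is not the entropy or the model of this paper: the logistic equation $u' = u(1 - u)$, the quadratic functional $E(u) = (u - 1)^2/2$, and the names `logistic`, `toy_entropy`, and `decay_check`. Here $E' = (u - 1)u' = -2uE$, and since $u(t) \ge c = \min(u_0, 1)$, one gets $E(u(t)) \le e^{-2ct} E(u_0)$.

```python
import math


def logistic(u0, t):
    """Exact solution of u' = u (1 - u) with u(0) = u0 > 0."""
    e = math.exp(t)
    return u0 * e / (1.0 + u0 * (e - 1.0))


def toy_entropy(u):
    """Hypothetical quadratic functional E(u) = (u - 1)^2 / 2."""
    return 0.5 * (u - 1.0) ** 2


def decay_check(u0, times):
    """Return (t, E(u(t)), exp(-2 c t) E(u0)) with c = min(u0, 1) for each t."""
    c = min(u0, 1.0)  # lower bound on the solution, preserved by the flow
    e0 = toy_entropy(u0)
    rows = []
    for t in times:
        e_t = toy_entropy(logistic(u0, t))
        bound = math.exp(-2.0 * c * t) * e0
        rows.append((t, e_t, bound))
    return rows


if __name__ == "__main__":
    for row in decay_check(0.2, [0.0, 0.5, 1.0, 2.0, 5.0]):
        print(row)
```

The decay rate in this toy example depends on the lower bound $c$ of the solution.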
Acknowledgment
The idea of this paper originated from conversations of the second author with Goro Akagi and Yann Brenier during a stay at ESI in Vienna. He would like to thank Goro Akagi and Yann Brenier for the inspiring discussions and correspondence, Ulisse Stefanelli for the invitation to the thematic program Nonlinear Flows at ESI, and ESI for hospitality. The research was partially supported by the Portuguese Government through FCT/MCTES and by the ERDF through PT2020 (projects UID/MAT/00324/2019, PTDC/MAT-PUR/28686/2017 and TUBITAK/0005/2014).
Conflict of interest: none
References
- [1] H. W. Alt and S. Luckhaus. Quasilinear elliptic-parabolic differential equations. Math. Z., 183(3):311–341, 1983.
- [2] L. Ambrosio, N. Gigli, and G. Savaré. Gradient Flows: in Metric Spaces and in the Space of Probability Measures. Basel: Birkhäuser Basel, 2008.
- [3] V. Barbu. Generalized solutions to nonlinear Fokker-Planck equations. J. Differential Equations, 261(4):2446–2471, 2016.
- [4] M. Bertsch and D. Hilhorst. A density dependent diffusion equation in population dynamics: stabilization to equilibrium. SIAM J. Math. Anal., 17(4):863–883, 1986.
- [5] T. Bodineau, J. Lebowitz, C. Mouhot, and C. Villani. Lyapunov functionals for boundary-driven nonlinear drift-diffusion equations. Nonlinearity, 27(9):2111–2132, 2014.
- [6] R. S. Cantrell, C. Cosner, Y. Lou, and C. Xie. Random dispersal versus fitness-dependent dispersal. J. Differential Equations, 254(7):2905–2941, 2013.
- [7] J. A. Carrillo, A. Jüngel, P. A. Markowich, G. Toscani, and A. Unterreiter. Entropy dissipation methods for degenerate parabolic problems and generalized Sobolev inequalities. Monatsh. Math., 133(1):1–82, 2001.
- [8] L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. An interpolating distance between optimal transport and Fisher–Rao metrics. Foundations of Computational Mathematics, 18(1):1–44, 2018.
