On the total variation Wasserstein gradient flow and the TV-JKO scheme
Guillaume Carlier, Clarice Poon

TL;DR
This paper investigates the JKO scheme for total variation, characterizes its optimizers, and proves convergence to a nonlinear fourth-order PDE under certain boundedness conditions, with results specific to one-dimensional and radially symmetric cases.
Contribution
It provides a detailed analysis of the TV-JKO scheme, including optimizer properties and convergence results, extending understanding of total variation gradient flows.
Findings
Characterization of optimizers for the TV-JKO scheme
Proof of maximum and minimum principles in certain cases
Convergence to a fourth-order nonlinear PDE with bounded density assumptions
Abstract
We study the JKO scheme for the total variation, characterize the optimizers, prove some of their qualitative properties (in particular a form of maximum principle and in some cases, a minimum principle as well). Finally, we establish a convergence result as the time step goes to zero to a solution of a fourth-order nonlinear evolution equation, under the additional assumption that the density remains bounded away from zero. This lower bound is shown in dimension one and in the radially symmetric case.
Guillaume Carlier: Ceremade, UMR CNRS 7534, Université Paris Dauphine, Pl. de Lattre de Tassigny, 75775 Paris Cedex 16, France, and MOKAPLAN, INRIA-Paris.
Clarice Poon: Centre for Mathematical Sciences, University of Cambridge, Wilberforce Rd, Cambridge CB3 0WA, United Kingdom.
Keywords: total variation, Wasserstein gradient flows, JKO scheme, fourth-order evolution equations.
MS Classification: 35G31, 49N15.
1 Introduction
Variational schemes based on total variation are extremely popular in image processing for denoising purposes. In particular, the seminal work of Rudin, Osher and Fatemi [25] has been extremely influential and is still the object of an intense stream of research, see [10] and the references therein. Continuous-time counterparts are well known to be related to the gradient flow of the total variation, see Bellettini, Caselles and Novaga [3], and to the mean-curvature flow, see Evans and Spruck [14]. The gradient flow of the total variation for other Hilbertian structures may be natural as well; in particular, the $H^{-1}$ case leads to a singular fourth-order evolution equation studied by Giga and Giga [15] and by Giga, Kuroda and Matsuoka [16]. In the present work, we consider another metric, namely the Wasserstein one.
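Given an open subset $\Omega$ of $\mathbb{R}^d$ and $u \in L^1(\Omega)$, recall that the total variation of $u$ is given by
\[ J(u) := \sup\Big\{ \int_\Omega u \, \mathrm{div}(z) \, dx \; : \; z \in C_c^1(\Omega, \mathbb{R}^d), \ \|z\|_{L^\infty} \le 1 \Big\} \tag{1.1} \]
and $BV(\Omega)$ is by definition the subspace of $L^1(\Omega)$ consisting of those $u$'s in $L^1(\Omega)$ such that $J(u)$ is finite. The following fourth-order nonlinear evolution equation
\[ \partial_t \rho + \mathrm{div}\Big( \rho \, \nabla \mathrm{div}\Big( \frac{\nabla \rho}{|\nabla \rho|} \Big) \Big) = 0, \qquad \rho|_{t=0} = \rho_0, \tag{1.2} \]
supplemented by the zero-flux boundary condition
\[ \rho \, \nabla \mathrm{div}\Big( \frac{\nabla \rho}{|\nabla \rho|} \Big) \cdot \nu = 0 \quad \text{on } \partial\Omega, \tag{1.3} \]
has been proposed in [7] for the purpose of denoising image densities. Numerical schemes for approximating the solutions of this equation have been investigated in [7, 13, 4]. One should consider weak solutions and in particular interpret the nonlinear term $\mathrm{div}(\nabla\rho/|\nabla\rho|)$ as the negative of an element of the subdifferential of $J$ at $\rho$.
At least formally, when $\rho_0$ is a probability density on $\Omega$, (1.2)-(1.3) can be viewed as the Wasserstein gradient flow of $J$ (we refer to the textbooks of Ambrosio, Gigli, Savaré [1] and Santambrogio [26] for a detailed exposition). Following the seminal work of Jordan, Kinderlehrer and Otto [17] for the Fokker-Planck equation, it is reasonable to expect that solutions of (1.2) can be obtained as limits, as $\tau \to 0^+$, of the JKO implicit Euler scheme:
\[ \rho_0^\tau = \rho_0, \qquad \rho_{k+1}^\tau \in \mathrm{argmin}\Big\{ \frac{1}{2\tau} W_2^2(\rho_k^\tau, \rho) + J(\rho), \ \rho \in BV(\Omega) \cap \mathcal{P}_2(\Omega) \Big\} \tag{1.4} \]
where $\mathcal{P}_2(\Omega)$ is the space of Borel probability measures on $\Omega$ with finite second moment and $W_2$ is the quadratic Wasserstein distance:
\[ W_2^2(\rho_0, \rho_1) := \inf_{\gamma \in \Pi(\rho_0, \rho_1)} \int_{\mathbb{R}^d \times \mathbb{R}^d} |x - y|^2 \, d\gamma(x, y), \tag{1.5} \]
$\Pi(\rho_0, \rho_1)$ denoting the set of transport plans between $\rho_0$ and $\rho_1$, i.e. the set of probability measures on $\mathbb{R}^d \times \mathbb{R}^d$ having $\rho_0$ and $\rho_1$ as marginals. Our aim is to study in detail the discrete TV-JKO scheme (1.4) as well as its connection with (suitable weak solutions of) the PDE (1.2). Although the assertion that (1.2) is the TV Wasserstein gradient flow is central to the numerical schemes described in [7, 13, 4], there has been so far, to the best of our knowledge, no theoretical justification of this fact.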
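To make the two ingredients of one JKO step concrete, here is a minimal one-dimensional sketch (not taken from the paper) that evaluates the total variation and the squared quadratic Wasserstein distance for histogram densities, and combines them into the objective of a single TV-JKO step. It assumes piecewise-constant densities with positive cell masses, extends densities by zero outside their grid (whole-line setting), and uses the one-dimensional identity $W_2^2 = \int_0^1 |F_0^{-1}(s) - F_1^{-1}(s)|^2 ds$; all function names are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def total_variation_1d(density, extend_by_zero=True):
    """Total variation of a piecewise-constant density: the sum of its jumps.

    With extend_by_zero=True the density is extended by 0 outside the grid,
    so the two boundary jumps are counted as well.
    """
    jumps = sum(abs(b - a) for a, b in zip(density, density[1:]))
    if extend_by_zero:
        jumps += abs(density[0]) + abs(density[-1])
    return jumps


def make_quantile(edges, masses):
    """Inverse CDF of a histogram (cell edges, positive cell masses summing to 1)."""
    cdf = [0.0] + list(accumulate(masses))

    def quantile(s):
        # Index of the cell whose CDF range contains the level s.
        i = min(max(bisect_right(cdf, s) - 1, 0), len(masses) - 1)
        frac = (s - cdf[i]) / masses[i]
        return edges[i] + frac * (edges[i + 1] - edges[i])

    return quantile


def w2_squared_1d(edges0, masses0, edges1, masses1, n_quantiles=4000):
    """Squared W_2 distance in dimension one, midpoint rule on the quantile formula."""
    q0 = make_quantile(edges0, masses0)
    q1 = make_quantile(edges1, masses1)
    levels = ((k + 0.5) / n_quantiles for k in range(n_quantiles))
    return sum((q0(s) - q1(s)) ** 2 for s in levels) / n_quantiles


def jko_objective_1d(edges0, masses0, edges1, masses1, tau):
    """Value of W_2^2 / (2 tau) + TV for the candidate histogram (edges1, masses1)."""
    density1 = [m / (b - a) for m, a, b in zip(masses1, edges1, edges1[1:])]
    w2 = w2_squared_1d(edges0, masses0, edges1, masses1)
    return w2 / (2.0 * tau) + total_variation_1d(density1)


if __name__ == "__main__":
    # Uniform density on [-1, 1] as previous iterate, uniform on [-2, 2] as candidate.
    e0 = [-1.0 + 0.02 * i for i in range(101)]
    e1 = [-2.0 + 0.04 * i for i in range(101)]
    m = [0.01] * 100
    print(jko_objective_1d(e0, m, e1, m, tau=0.1))
```

The quantile formula is specific to dimension one; in higher dimensions the Wasserstein term requires a genuine optimal transport solver.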
Fourth-order equations which are Wasserstein gradient flows of functionals involving the gradient of $\rho$, such as the Dirichlet energy or the Fisher information, have been studied by McCann, Matthes and Savaré [22], who found a new method, the flow interchange technique, to prove higher-order estimates; we refer to [18] for a recent reference on this topic. The total variation is however too singular for such arguments to be directly applicable, as far as we know. We shall prove the convergence of JKO steps as $\tau \to 0^+$ under the extra assumption that densities remain bounded away from zero. Whether this extra assumption is reasonable or not is related to a minimum principle issue, interesting in its own right, namely the monotonicity of the infimum along JKO steps. We shall see that, in a convex domain, JKO steps obey a maximum principle (the maximum of the density is nonincreasing along JKO steps). The corresponding minimum principle seems more difficult to prove and we have been able to establish it only in some particular cases, namely in dimension one and in the radially symmetric case, even though we conjecture it is satisfied in more general situations.
The paper is organized as follows. In section 2, we start with the discussion of a few examples. Section 3 establishes optimality conditions for JKO steps thanks to an entropic regularization scheme. Section 4 is devoted to some properties of solutions of JKO steps, in particular a maximum principle based on a result of [11]; we also establish a minimum principle in dimension one and in the radially symmetric case. Finally, in section 5, we prove a conditional convergence result: the JKO scheme converges, as $\tau \to 0^+$, under the extra assumption that the density remains bounded away from zero. This covers the unidimensional case as well as the radially symmetric case when the initial condition is strictly positive.
2 Some examples
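We first recall the Kantorovich dual formulation of $W_2^2$:
\[ \frac{1}{2} W_2^2(\rho_0, \rho_1) = \sup\Big\{ \int \varphi \, d\rho_0 + \int \psi \, d\rho_1 \; : \; \varphi(x) + \psi(y) \le \frac{1}{2} |x - y|^2 \Big\}, \tag{2.1} \]
an optimal pair $(\varphi, \psi)$ for this problem is called a pair of Kantorovich potentials. The existence of Kantorovich potentials is well known and such potentials can be taken to be conjugates of each other, i.e. such that
\[ \varphi(x) = \inf_{y} \Big\{ \frac{1}{2} |x - y|^2 - \psi(y) \Big\}, \qquad \psi(y) = \inf_{x} \Big\{ \frac{1}{2} |x - y|^2 - \varphi(x) \Big\}, \]
which implies that $\varphi$ and $\psi$ are semi-concave (more precisely $\frac{1}{2}|x|^2 - \varphi$ is convex). If $\rho_0$ is absolutely continuous with respect to the $d$-dimensional Lebesgue measure, $\varphi$ is differentiable a.e. and the map $T = \mathrm{id} - \nabla\varphi$ is the gradient of a convex function pushing forward $\rho_0$ to $\rho_1$, which is in fact the optimal transport between $\rho_0$ and $\rho_1$ thanks to Brenier’s theorem [5]. In such a case, we will simply refer to $\varphi$ as a Kantorovich potential between $\rho_0$ and $\rho_1$. We refer the reader to [28] and [26] for details.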
In this section, we will consider some explicit examples which rely on the following sufficient optimality condition (details for a rigorous derivation of the Euler-Lagrange equation for JKO steps will be given in section 3) in the case of the whole space i.e. . Let us also recall that by Sobolev inequality is continuously embedded in .
Lemma 2.1.
Let , and (so is the total variation on the whole space), if is such that
[TABLE]
where is a Kantorovich potential between and and , with , (so that ), and
[TABLE]
Then, setting
[TABLE]
one has
[TABLE]
Proof.
For all , , and it follows from the Kantorovich duality formula that
[TABLE]
The claim then directly follows from (2.2). ∎
2.1 The case of a characteristic function
A simple illustration of Lemma 2.1 in dimension 1 concerns the case of a uniform , (here and in the sequel we shall denote by the characteristic function of the set ):
[TABLE]
It is natural to make the ansatz that the minimizer of defined by (2.4) remains of the form for some . The optimal transport between and being the linear map , a direct computation gives
[TABLE]
which is minimal when is the only root in of
[TABLE]
To check that this is the correct guess, we shall check that the conditions of Lemma 2.1 are met. It is easy to check that the potential defined by
[TABLE]
is a Kantorovich potential between and . (The guess for the following construction comes from integrating the Euler-Lagrange equation on the support of .) Define then by
[TABLE]
extended by on and on . By construction (use the fact that it is odd and nondecreasing on thanks to (2.5)), also so that and , and one easily checks that (here and in the sequel denotes the Radon measure which is the distributional derivative of the function ). Moreover with an equality on . The optimality of then directly follows from Lemma 2.1.
Of course, the argument can be iterated so as to obtain the full TV-JKO sequence:
[TABLE]
where is defined inductively by
[TABLE]
which is nothing but the implicit Euler discretization of the ODE
[TABLE]
whose solution is . Extending in a piecewise constant way: for , it is not difficult to check that converges (in and in for any and any ) to given by . Since is the velocity field associated to , solves the continuity equation
[TABLE]
In addition, where
[TABLE]
extended by (respectively ) on (respectively ). The function is , and (in the sense of measures). In other words the limit of satisfies
[TABLE]
with and which is the natural weak form of (1.2) since in dimension one.
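The iteration of this subsection can be simulated directly. The sketch below is a re-derivation under an assumed parametrization, not a transcription of the displays above: taking $\rho_0 = \frac{1}{2\alpha_0}\chi_{[-\alpha_0,\alpha_0]}$ and the ansatz $\rho = \frac{1}{2\alpha}\chi_{[-\alpha,\alpha]}$, the linear transport map gives $W_2^2(\rho_0,\rho) = (\alpha-\alpha_0)^2/3$ and $J(\rho) = 1/\alpha$, so one step minimizes $(\alpha-\alpha_0)^2/(6\tau) + 1/\alpha$, whose critical point is the root $\alpha > \alpha_0$ of $\alpha^2(\alpha - \alpha_0) = 3\tau$. This is the implicit Euler scheme for $\alpha' = 3/\alpha^2$, with solution $\alpha(t) = (\alpha_0^3 + 9t)^{1/3}$. The function names are hypothetical.

```python
def next_half_width(alpha_prev, tau, tol=1e-14):
    """Root alpha > alpha_prev of alpha**2 * (alpha - alpha_prev) = 3 * tau (bisection).

    Assumes the parametrization rho = chi_[-alpha, alpha] / (2 * alpha).
    """
    lo = alpha_prev
    # The left-hand side is >= alpha_prev**2 * (alpha - alpha_prev) for alpha >= alpha_prev,
    # so the root lies below this upper bracket.
    hi = alpha_prev + 3.0 * tau / alpha_prev ** 2
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mid ** 2 * (mid - alpha_prev) < 3.0 * tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tv_jko_half_widths(alpha0, tau, n_steps):
    """Half-widths alpha_0, alpha_1, ..., alpha_n of the TV-JKO iterates under the ansatz."""
    alphas = [alpha0]
    for _ in range(n_steps):
        alphas.append(next_half_width(alphas[-1], tau))
    return alphas


def limit_half_width(alpha0, t):
    """Solution of alpha' = 3 / alpha**2 with alpha(0) = alpha0."""
    return (alpha0 ** 3 + 9.0 * t) ** (1.0 / 3.0)


if __name__ == "__main__":
    tau, horizon = 1e-3, 1.0
    alphas = tv_jko_half_widths(1.0, tau, int(round(horizon / tau)))
    print(alphas[-1], limit_half_width(1.0, horizon))
```

Since implicit Euler is first-order accurate, the gap between the last iterate and the continuous-time value is expected to shrink proportionally to the time step.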
2.2 Instantaneous creation of discontinuities
We now consider the case where and will show that the JKO scheme instantaneously creates a discontinuity at the level of , the minimizer of when is small enough. We indeed look for in the form:
[TABLE]
for some well-chosen . The optimal transport map between such a and is odd and given explicitly by
[TABLE]
The Kantorovich potential which vanishes at (extended in an even way to ) is then given by
[TABLE]
where
[TABLE]
Let us now integrate on with initial condition , i.e. for
[TABLE]
Note that is nondecreasing on (because , and is convex on so that on ), our aim now is to find in such a way that i.e. replacing in the previous formula
[TABLE]
the right hand-side is a continuous function of taking value [math] for and for , hence as soon as one may find a such that indeed . Extend then by on and to in an odd way. We then have built a function which is (), such that , and such that . Thanks to Lemma 2.1, we conclude that is optimal. This example shows that discontinuities may appear at the very first iteration of the TV-JKO scheme.
3 Euler-Lagrange equation for JKO steps
The aim of this section is to establish optimality conditions for (3.1). Despite the fact that it is a convex minimization problem, it involves two nonsmooth terms and , so some care is needed to justify the arguments rigorously. In the next subsection, we introduce an entropic regularization; the advantage of this strategy is that the minimizer will be positive everywhere, giving some differentiability of the transport term.
3.1 Entropic approximation
In this whole section, we assume that is an open bounded connected (not necessarily convex) subset of with Lipschitz boundary and denote by the set of Borel probability measures on that are absolutely continuous with respect to the Lebesgue measure (and will use the same notation for both for the measure and its density). Given and , we consider one step of the TV-JKO scheme:
[TABLE]
It is easy by the direct method of the calculus of variations to see that (3.1) has at least one solution, moreover being convex and being strictly convex whenever (see [26]), the minimizer is in fact unique, and in the sequel we denote it by . Given we consider the following approximation of (3.1):
[TABLE]
where
[TABLE]
It is easy to see that (3.2) admits a unique solution . Moreover, since is bounded, is lower bounded, hence is bounded. Recalling that the embedding is compact for every , one may therefore (up to extraction) assume that converges as a.e. and strongly in for every to some , which, by a standard $\Gamma$-convergence argument, is easily seen to be the solution of (3.1). The advantage of this regularization is that not only each is bounded from below but also that is bounded from below uniformly in (but not in which is fixed throughout this section):
Proposition 3.1.
Up to passing to a subsequence, the family is uniformly bounded from below. Moreover, is bounded in for any and converges strongly to $0$ in for any .
Proof.
Let be such that the set has positive measure and finite perimeter (recall that ). Let us assume that there is an such that
[TABLE]
and
[TABLE]
We aim to show that cannot be arbitrarily small. Define then that is on and elsewhere. Defining and observing that , we see that (3.3) implies that and so that and are disjoint. Finally, set
[TABLE]
See Figure 2, where we set .
By construction hence , in this difference we have four terms, namely
- the Wasserstein term, which, using the Kantorovich duality formula (2.1) and the fact that is bounded can be estimated in terms of :
[TABLE]
for a constant that depends on but neither on nor ,
- the TV term: : outside we have replaced by a $1$-Lipschitz function of which decreases the TV semi-norm, on on the contrary we have created a jump of magnitude so
[TABLE]
where denotes the perimeter of (in ),
- the entropy variation on , on this set both and are less than so that whenever which by the mean value theorem yields
[TABLE]
- the last term is the entropy variation on . It is convenient to split into and . The entropy variation on the first part is easy to control. Indeed, is nondecreasing on . Since, on , , we have . As for the second part, we observe that , so on this set, both and remain in the interval . We thus have
[TABLE]
where
[TABLE]
Putting together (3.6)-(3.7)-(3.8)-(3.9), we arrive at
[TABLE]
which for small enough is possible only when i.e. . More precisely, either we have the lower bound:
[TABLE]
or (3.3) is impossible i.e. . To prove that is bounded from below uniformly in , it is therefore enough to show that we can find a family , bounded and bounded away from $0$, such that remains bounded away from $0$, and is uniformly bounded from above as . First note that, since is bounded, there exists such that in and a.e. up to a subsequence; note also that and is a probability density. Setting , , if , since converges a.e. to , we have a.e. . It then follows from Fatou’s Lemma that when , , hence choosing so that , we deduce that there exists and such that for all and all , we have . For an upper bound on perimeters, we observe that since , thanks to the co-area formula, we have
[TABLE]
So, there exists such that
Finally, since converges in , we may assume that, up to a subsequence, for some (see Theorem IV.9 in [6]). Then, by dominated convergence and since for every , we have that converges a.e. and in ; in particular this implies that converges to $0$ strongly in , and we have just seen that is bounded in .
∎
Let us also recall some well-known facts (see [9]) about the total variation functional viewed as a convex l.s.c. and one-homogeneous functional on . Define
[TABLE]
where on are to be understood in the weak sense
[TABLE]
Note that is closed and convex in and is its support function:
[TABLE]
As for the Wasserstein term, recalling the Kantorovich dual formulation (2.1), its derivative will be expressed in terms of a Kantorovich potential between and .
We then have the following characterization for :
Proposition 3.2.
There exists such that for every , , on , and
[TABLE]
where is the Kantorovich potential between and .
Proof.
Let such that . Thanks to Proposition 3.1, we know that is bounded away from $0$ hence for small enough , is positive hence a probability density. Also, as a consequence of Theorem 1.52 in [26], we have that
[TABLE]
where is the (unique up to an additive constant) Kantorovich potential between and , in particular is Lipschitz and semi concave ( in the sense of measures and is the optimal transport between and ). By the optimality of and the fact that is a semi-norm, we get
[TABLE]
where
[TABLE]
Since is defined up to an additive constant, we may choose it in such a way that has zero mean; doing so, (3.16) holds for any (not necessarily with zero mean). Being Lipschitz, is bounded; also observe that is in for every since and is thanks to Proposition 3.1, hence we have for every .
By approximation and observing that , (3.16) extends to all . In particular, we have
[TABLE]
but since is convex and closed in , it follows from Hahn-Banach’s separation theorem that . Finally, getting back to (3.16) (without the zero mean restriction on ) and taking gives , and we then deduce that this should be an equality.
∎
3.2 Euler-Lagrange equation
We are now in position to rigorously establish the Euler-Lagrange equation for (3.1):
Theorem 3.3.
If solves (3.1), there exists a Kantorovich potential between and (in particular is the optimal transport between and ), , and such that
[TABLE]
and
[TABLE]
Remark 3.4.
It is not difficult (since (3.1) is a convex problem) to check that (3.17)-(3.18) are also sufficient optimality conditions. The main point here is that the right hand side in (3.17) which is a multiplier associated with the nonnegativity constraint is better than a measure, it is actually an function.
Proof.
As in section 3.1, we denote by the solution of the entropic approximation (3.2). Up to passing to a subsequence (not explicitly written), we may assume that converges a.e. and strongly in (for any ) to (the solution of (3.1), again by a standard $\Gamma$-convergence argument). We then rewrite the Euler-Lagrange equation from Proposition 3.2 as
[TABLE]
where , , and
[TABLE]
It follows from Proposition 3.1 that converges to $0$ strongly in any , and that is bounded in . Up to subsequences, we may therefore assume that and weakly-$*$ converge in respectively to some and with , on and . As for the Kantorovich potentials , since the transport map a.e. takes values in we have , hence is an equi-Lipschitz family because is bounded. Moreover which remains bounded, hence we may assume that converges uniformly to some potential and it is well known (see [26]) that is a Kantorovich potential between and . Letting tend to $0$ gives (3.17).
Since converges strongly in to and converges weakly-$*$ to in we have
[TABLE]
hence . Thanks to (3.13), we obviously have (since , ), for the converse inequality, it is enough to observe that
[TABLE]
and that converges to weakly in for every . Since converges strongly to in when we deduce that which completes the proof of (3.18).
∎
A first consequence of the high integrability of is that one can give a meaning to for any . Indeed, if and denotes its conjugate exponent, following Anzellotti [2], if and is such that , one can define the distribution by
[TABLE]
Then is a Radon measure which satisfies (in the sense of measures) hence is absolutely continuous with respect to . Moreover one can also define a weak notion of normal trace of , such that the following integration by parts formula holds
[TABLE]
We refer to [2] for proofs. These considerations of course apply to and and in particular enable one to see as a measure and to interpret the optimality condition as in the sense of measures. Finally, the fact that in Theorem 3.3 and the theory of variational mean curvature (see Tamanini [27], Massari [20, 21], Theorem 3.6 of Gonzalez and Massari [19]) allow for conclusions about the regularity of the level sets of , the solution of (3.1). We do not elaborate further on this regularity here (it only holds, anyway, for a fixed time step ).
4 Maximum and minimum principles for JKO steps
Throughout this section, we further assume that is a convex open bounded subset of ; our aim is to establish bounds on the TV-JKO iterates given by (3.1). Since the TV-JKO scheme aims at minimizing total variation at the fastest rate in the Wasserstein metric, it is natural to wonder whether, when the initial condition is bounded from above and from below, the JKO iterates remain so (with the same bounds). We shall answer affirmatively for the upper bound (maximum principle). As for the propagation of the lower bound (minimum principle), we have been able to prove it only in special cases (dimension one and the radially symmetric setting).
4.1 Convexity along generalized geodesics
Our aim is to deduce some bounds on from bounds on . To do so, we shall combine some convexity arguments and a remarkable estimate due to De Philippis et al. [11]. First we recall the notion of generalized geodesic from Ambrosio, Gigli and Savaré [1]. Given , and in , and denoting by (respectively ) the optimal transport (Brenier) map between and (respectively ), the generalized geodesic with base joining to is by definition the curve of measures:
[TABLE]
A key property of these curves introduced in [1] is the strong convexity of the squared distance estimate:
[TABLE]
It is well-known that if : is a proper convex lower semi-continuous (l.s.c.) internal energy density, bounded from below such that and which satisfies McCann’s condition (see [23])
[TABLE]
then defining the generalized geodesic curve by (4.1), one has
[TABLE]
In particular and uniform bounds are stable along generalized geodesics:
[TABLE]
and
[TABLE]
An immediate consequence of (4.2) (see chapter 4 of [1] for general contraction estimates) is the following
Lemma 4.1.
Let be a nonempty subset of , let , , if is a Wasserstein projection of onto , and if the generalized geodesic with base joining to remains in then
[TABLE]
Proof.
Since we have , applying (4.2) to the generalized geodesics with base joining to we thus get
[TABLE]
dividing by and then taking therefore gives the desired result.
∎
The other result we shall use to derive bounds is a $BV$ estimate of De Philippis et al. [11], which states that, given , and : , proper convex l.s.c., the solution of
[TABLE]
is with the bound
[TABLE]
Taking in particular,
[TABLE]
this implies that the Wasserstein projection of onto the set defined by the constraint has a smaller total variation than .
4.2 Maximum principle
Theorem 4.2.
Let and let be the solution of (3.1), then with
[TABLE]
Proof.
Thanks to (4.5) the set has the property that the generalized geodesics (with any base) joining two of its points remains in . Let then be the projection of onto i.e. the solution of . Thanks to Lemma 4.1 we have and thanks to Theorem 1.1 of [11], . The optimality of for (3.1) therefore implies i.e. .
∎
Remark 4.3.
In section 3, we have used an approximation of (3.1) with an additional small entropy term, the same bound as in Theorem 4.2 will remain valid in this case. Indeed, consider a proper convex l.s.c. and bounded from below internal energy density and consider given , the variant of (3.1)
[TABLE]
Then we claim that the solution still satisfies . Indeed we have seen in the previous proof that the Wasserstein projection of onto the constraint both diminishes and the Wasserstein distance to . It turns out that it also diminishes the internal energy. Indeed, thanks to Proposition 5.2 of [11], there is a measurable set such that , it thus follows that . So, from the convexity of and Jensen’s inequality,
[TABLE]
thus yielding the same conclusion as above.
4.3 Minimum principle in special cases
In dimension one, it turns out that we can obtain bounds from below by the same convexity arguments as for the maximum principle of Theorem 4.2:
Proposition 4.4.
Assume that , that is a bounded interval and that a.e. on then the solution of (3.1) also satisfies a.e. on .
Proof.
The proof is similar to that of Theorem 4.2 but using the Wasserstein projection on the set , the only thing to check to be able to use Lemma 4.1 is that for any basepoint and any and in , the generalized geodesic with base point joining to remains in . The optimal transport maps and from to and respectively are nondecreasing and continuous and setting , one has
[TABLE]
which is easily seen to imply that a.e.. ∎
As a consequence of the previous minimum principle, integrating the Euler-Lagrange equation one can deduce higher regularity for the dual variable :
Corollary 4.5.
Assume that and is a bounded interval. If solves (3.1) and is as in Theorem 3.3 then . If in addition a.e. on , then .
Proof.
The first claim is obvious because both and (, and are as in Theorem 3.3) are bounded hence so is . As for the second one when , thanks to Proposition 4.4, we also have hence in (3.17) and in this case is Lipschitz i.e. . One can actually go one step further because where is the optimal (monotone) transport between and . This map is explicit in terms of the cumulative distribution function of , , and the inverse of , the cumulative distribution function of , namely . But is Lipschitz since its derivative is which is hence bounded and is Lipschitz as well since . This gives that hence . ∎
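The explicit form of the monotone transport used in this proof, a composition of one cumulative distribution function with the inverse of another, is easy to illustrate numerically. The following is a minimal sketch for piecewise-constant densities with positive cell masses; the histogram representation and the function names are assumptions made for illustration and are not part of the paper.

```python
from bisect import bisect_right
from itertools import accumulate


def make_cdf(edges, masses):
    """CDF of a histogram density (cell edges, positive cell masses summing to 1)."""
    cum = [0.0] + list(accumulate(masses))

    def cdf(x):
        x = min(max(x, edges[0]), edges[-1])  # clamp to the support
        i = min(max(bisect_right(edges, x) - 1, 0), len(masses) - 1)
        return cum[i] + masses[i] * (x - edges[i]) / (edges[i + 1] - edges[i])

    return cdf


def make_quantile(edges, masses):
    """Inverse of the CDF above (well defined because all masses are positive)."""
    cum = [0.0] + list(accumulate(masses))

    def quantile(s):
        i = min(max(bisect_right(cum, s) - 1, 0), len(masses) - 1)
        return edges[i] + (s - cum[i]) / masses[i] * (edges[i + 1] - edges[i])

    return quantile


def monotone_transport(edges_src, masses_src, edges_dst, masses_dst):
    """Nondecreasing map pushing the source histogram onto the target one."""
    cdf_src = make_cdf(edges_src, masses_src)
    quantile_dst = make_quantile(edges_dst, masses_dst)
    return lambda x: quantile_dst(cdf_src(x))


if __name__ == "__main__":
    # Uniform density on [-1, 1] transported onto the uniform density on [-2, 2].
    e_src = [-1.0 + 0.02 * i for i in range(101)]
    e_dst = [-2.0 + 0.04 * i for i in range(101)]
    m = [0.01] * 100
    transport = monotone_transport(e_src, m, e_dst, m)
    print([transport(x) for x in (-0.5, 0.0, 0.5)])
```

When the target density is bounded from below, the quantile function has bounded slope, which is the discrete counterpart of the Lipschitz property invoked in the proof.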
The proof of Proposition 4.4 unfortunately does not generalize to higher dimensions, because densities which are bounded from below by are not stable by generalized geodesics. In the radially symmetric case, we can use the Euler-Lagrange equation to derive a minimum principle. We believe that JKO steps preserve lower bounds in more general situations but have not been able to prove it.
Proposition 4.6.
Assume that is the ball centered at $0$ of radius in , and that is radially symmetric with a.e. on then the solution of (3.1) also satisfies a.e. on .
Proof.
Let us write with , since (3.1) is invariant by rotation and strictly convex, it is easy to see that its unique solution is also radially symmetric, let us write it as . Denoting by the -Hausdorff measure of the unit sphere , and setting , , observe that is the minimizer of the one-dimensional convex functional
[TABLE]
among nonnegative densities on such that and is a bounded Radon measure on . Arguing as in the proof of Theorem 3.3, the minimizer is characterized by the Euler-Lagrange equation
[TABLE]
where is a Kantorovich potential between and and is such that
[TABLE]
Note that (4.12) implies that is Lipschitz so that is locally Lipschitz and
[TABLE]
Since , we can perform a Hahn-Jordan decomposition of :
[TABLE]
and set
[TABLE]
Next, we observe that, using (4.14), we have , we thus deduce that -a.e and since is continuous we actually have on . In a similar way, on .
Now let us show that . Assume, by contradiction, that the set where has positive measure in , and let be a continuity point of such that , define then
[TABLE]
We then have . Let us assume that , we claim then that since otherwise, would be nondecreasing in a neighbourhood of which would imply for small , contradicting the definition of , we thus have . Since is in a neighbourhood of , it has a right and a left limit at , again by minimality of , the left limit of at cannot be strictly smaller than , so there is an such that on . Hence on , (4.12) becomes
[TABLE]
moreover, on , is actually of class with where is the (continuous) optimal transport between and obtained by the relation (where is the cumulative distribution function of for ). One can therefore differentiate (4.17) on so as to obtain
[TABLE]
Since is maximal at , we first have
[TABLE]
but recalling (4.12) we also have
[TABLE]
which shows that is differentiable at with , this enables us to deduce that , with (4.18) this gives
[TABLE]
If , since , the same conclusion is reached with an equality. In a similar way, we obtain (again with an equality in case ). Using the fact that on (with strict inequality in a neighbourhood of ) together with and , we get
[TABLE]
which yields the desired contradiction.
∎
Let us remark that the proof of Proposition 4.6 gives an alternative proof of the minimum principle in dimension one.
5 Convergence of the TV-JKO scheme under a lower bound estimate
We are now interested in the convergence of the TV-JKO scheme to a solution of the fourth-order nonlinear equation (1.2) as the time step goes to $0$. Throughout this section, we assume that is a bounded open convex subset of and that the initial condition satisfies
[TABLE]
We fix a time horizon , and for small , define the sequence by
[TABLE]
for with . Thanks to Theorem 4.2, (5.1) ensures that the JKO-iterates defined by (5.2) also remain bounded . We shall also assume that remains bounded from below by :
[TABLE]
which holds, as we have seen in subsection 4.3 when or when is a ball and is radially symmetric.
We extend this discrete sequence by piecewise constant interpolation i.e.
[TABLE]
We shall see that converges to a solution of
[TABLE]
with the no-flux boundary condition
[TABLE]
Let us introduce the spaces
[TABLE]
Since is no more than in , one has to be slightly cautious about the meaning of , which can be conveniently done by interpreting this term as the negative of an element in the subdifferential of (in the sense). For every let us define
[TABLE]
This leads to the following definition:
Definition 5.1.
A weak solution of (5.5)-(5.6) is a such that there exists with
[TABLE]
and is a weak solution of
[TABLE]
i.e. for every
[TABLE]
We then have
Theorem 5.2.
If satisfies (5.6) and the JKO iterates obey the lower bound (5.3), there exists a vanishing sequence of time steps such that the sequence constructed by (5.2)-(5.4) converges strongly in for any and in to a weak solution of (5.5)-(5.6).
Proof.
First, being , we have a uniform bound on thanks to Theorem 4.2, and from our extra lower bound assumption (5.3) we have
[TABLE]
Moreover, by construction of the TV-JKO scheme (5.2), one has
[TABLE]
By using an Aubin-Lions type compactness theorem of Savaré and Rossi (Theorem 2 in [24]), the fact that the embedding of into is compact for every as well as a refinement of the Arzelà-Ascoli theorem (Proposition 3.3.1 in [1]), one obtains (see section 4 of [12] or section 5 of [8] for details) that, up to taking a suitable sequence of vanishing time steps , we may assume that
[TABLE]
and
[TABLE]
for some limit curve . From (5.9) and Lebesgue’s dominated convergence Theorem, we deduce that the convergence in (5.11) actually holds for any . It also follows from (5.9) and (5.10), that and that .
We deduce from the fact that and Theorem 3.3 that for each , there exists such that and
[TABLE]
and the optimal (backward) transport from to is related to by
[TABLE]
We extend in a piecewise constant way i.e. set
[TABLE]
We then observe that
[TABLE]
Thanks to (5.10) we thus deduce that is bounded in , since has zero-mean, with Poincaré-Wirtinger inequality, we obtain
[TABLE]
We may therefore assume (up to further suitable extractions) that there is some such that converges to weakly in and converges weakly in to . Of course and on for a.e. . Note also that converges weakly in to .
The limiting equation can now be derived using standard computations (see the proof of Theorem 5.1 of the seminal work [17], or chapter 8 of [26]): Let and observe that
[TABLE]
Recalling that , and applying Taylor’s theorem, we have
[TABLE]
where . Note also that for , . Therefore,
[TABLE]
with
[TABLE]
Passing to the limit as the time step goes to $0$ in (5.17) yields that is a weak solution to
[TABLE]
It remains to prove that , for a.e. . The inequality is obvious since , on and . To prove the converse inequality, we use Fatou’s Lemma, the lower semi-continuity of , (5.13) and the weak-convergence in of to :
[TABLE]
which concludes the proof.
∎
Acknowledgements: The authors wish to thank Vincent Duval and Gabriel Peyré for suggesting the TV-Wasserstein problem to them as well as for fruitful discussions. They also thank Maxime Laborde and Filippo Santambrogio for helpful remarks in particular regarding the maximum principle.
- [1] Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré. Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2008.
- [2] G. Anzellotti. Pairing between measures and bounded functions and compensated compactness. Ann. di Matematica Pura ed Appl., IV(135):293–318, 1983.
- [3] Giovanni Bellettini, Vicent Caselles, and Matteo Novaga. The total variation flow in $\mathbb{R}^{n}$. Journal of Differential Equations, 184(2):475–525, 2002.
- [4] Martin Benning, Luca Calatroni, Bertram Düring, and Carola-Bibiane Schönlieb. A primal-dual approach for a total variation Wasserstein flow. In Geometric Science of Information, pages 413–421. Springer, 2013.
- [5] Yann Brenier. Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math., 44(4):375–417, 1991.
- [6] Haïm Brezis. Analyse fonctionnelle. Collection Mathématiques Appliquées pour la Maîtrise. [Collection of Applied Mathematics for the Master’s Degree]. Masson, Paris, 1983. Théorie et applications. [Theory and applications].
- [7] Martin Burger, Marzena Franek, and Carola-Bibiane Schönlieb. Regularized regression and density estimation based on optimal transport. Applied Mathematics Research eXpress, 2012(2):209–253, 2012.
- [8] Guillaume Carlier and Maxime Laborde. A splitting method for nonlinear diffusions with nonlocal, nonpotential drifts. Nonlinear Anal., 150:1–18, 2017.
