On the total variation Wasserstein gradient flow and the TV-JKO scheme

Guillaume Carlier; Clarice Poon

arXiv:1703.00243·math.AP·July 9, 2018


TL;DR

This paper studies the JKO scheme for the total variation, characterizes its optimizers, and proves convergence, as the time step goes to zero, to a solution of a nonlinear fourth-order PDE under the assumption that the density stays bounded away from zero. This lower bound is established in dimension one and in the radially symmetric case.

Contribution

It analyzes the TV-JKO scheme in detail: optimality conditions for each step, a maximum principle (and, in some cases, a minimum principle), and a conditional convergence result for the total variation Wasserstein gradient flow.

Findings

01 Characterization of optimizers for the TV-JKO scheme

02 Maximum principle for JKO steps in convex domains, and minimum principle in dimension one and in the radially symmetric case

03 Convergence to a fourth-order nonlinear PDE, assuming the density stays bounded away from zero

Abstract

We study the JKO scheme for the total variation, characterize the optimizers, prove some of their qualitative properties (in particular a form of maximum principle and in some cases, a minimum principle as well). Finally, we establish a convergence result as the time step goes to zero to a solution of a fourth-order nonlinear evolution equation, under the additional assumption that the density remains bounded away from zero. This lower bound is shown in dimension one and in the radially symmetric case.

Equations

$$J(\rho):=\sup\Big\{\int_{\Omega}\mathrm{div}(z)\rho\;:\;z\in C_{c}^{1}(\Omega),\;\|z\|_{L^{\infty}}\leq 1\Big\}$$

$$\partial_{t}\rho+\mathrm{div}\Big(\rho\,\nabla\mathrm{div}\Big(\frac{\nabla\rho}{|\nabla\rho|}\Big)\Big)=0\ \text{ in } (0,T)\times\Omega,\quad \rho_{|_{t=0}}=\rho_{0},$$

$$\rho\,\nabla\mathrm{div}\Big(\frac{\nabla\rho}{|\nabla\rho|}\Big)\cdot\nu=0\ \text{ on } \partial\Omega$$

$$\rho_{0}^{\tau}=\rho_{0},\quad \rho_{k+1}^{\tau}\in\mathrm{argmin}\Big\{\frac{1}{2\tau}W_{2}^{2}(\rho_{k}^{\tau},\rho)+J(\rho),\;\rho\in\mathrm{BV}(\Omega)\cap{\cal P}_{2}(\overline{\Omega})\Big\}$$

$$W_{2}^{2}(\rho_{0},\rho_{1}):=\inf_{\gamma\in\Pi(\rho_{0},\rho_{1})}\Big\{\int_{\mathbb{R}^{d}\times\mathbb{R}^{d}}|x-y|^{2}\,\mathrm{d}\gamma(x,y)\Big\},$$

$$\frac{1}{2}W_{2}^{2}(\mu_{0},\mu_{1})=\sup\Big\{\int_{\mathbb{R}^{d}}\psi\,\mathrm{d}\mu_{0}+\int_{\mathbb{R}^{d}}\varphi\,\mathrm{d}\mu_{1}\;:\;\psi(x)+\varphi(y)\leq\frac{|x-y|^{2}}{2}\Big\}$$

$$\varphi(x)=\inf_{y\in\mathbb{R}^{d}}\Big\{\frac{1}{2}|x-y|^{2}-\psi(y)\Big\},\quad \psi(y)=\inf_{x\in\mathbb{R}^{d}}\Big\{\frac{1}{2}|x-y|^{2}-\varphi(x)\Big\},$$

$$\frac{\varphi}{\tau}+\mathrm{div}(z)\geq 0,\ \text{ with equality } \rho_{1}\text{-a.e.}$$

$$J(\rho_{1})=\int_{\mathbb{R}^{d}}\mathrm{div}(z)\rho_{1}.$$

$$\Phi_{\tau,\rho_{0}}(\rho):=\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)+J(\rho),\quad\forall\rho\in\mathrm{BV}(\mathbb{R}^{d})\cap{\cal P}_{2}(\mathbb{R}^{d})$$

$$\Phi_{\tau,\rho_{0}}(\rho_{1})\leq\Phi_{\tau,\rho_{0}}(\rho),\quad\forall\rho\in\mathrm{BV}(\mathbb{R}^{d})\cap{\cal P}_{2}(\mathbb{R}^{d}).$$

$$\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)\geq\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho_{1})+\int_{\mathbb{R}^{d}}\frac{\varphi}{\tau}(\rho-\rho_{1}).$$

$$\rho_{0}=\rho_{\alpha_{0}},\quad\alpha_{0}>0,\quad\rho_{\alpha}:=\frac{1}{2\alpha}\chi_{[-\alpha,\alpha]}.$$

$$\Phi_{\tau,\rho_{0}}(\rho_{\alpha_{1}})=\frac{1}{\alpha_{1}}+\frac{1}{6\tau}(\alpha_{1}-\alpha_{0})^{2}$$

$$\alpha_{1}^{2}(\alpha_{1}-\alpha_{0})=3\tau.$$

$$\varphi(x)=\frac{1}{2\alpha_{1}}(\alpha_{1}-\alpha_{0})x^{2}-\frac{3\tau}{2\alpha_{1}}$$

$$\tau z_{1}(x):=-\frac{(\alpha_{1}-\alpha_{0})}{6\alpha_{1}}x^{3}+\frac{3\tau x}{2\alpha_{1}},\quad x\in[-\alpha_{1},\alpha_{1}]$$

$$\rho_{k+1}^{\tau}=\mathrm{argmin}\;\Phi_{\tau,\rho_{k}^{\tau}}=\Big(\frac{\alpha_{k+1}^{\tau}}{\alpha_{k}^{\tau}}\mathrm{id}\Big)_{\#}\rho_{k}^{\tau}=\Big(\frac{\alpha_{k+1}^{\tau}}{\alpha_{0}}\mathrm{id}\Big)_{\#}\rho_{0}$$

$$(\alpha_{k+1}^{\tau}-\alpha_{k}^{\tau})(\alpha_{k+1}^{\tau})^{2}=3\tau,\quad\alpha_{0}^{\tau}=\alpha_{0}$$

$$\alpha^{\prime}\alpha^{2}=3,\quad\alpha(0)=\alpha_{0},$$

$$\partial_{t}\rho+(\rho v)_{x}=0.$$

$$z(t,x)=\frac{-\alpha^{\prime}(t)}{6\alpha(t)}x^{3}+\frac{3x}{2\alpha(t)},\quad x\in[-\alpha(t),\alpha(t)],$$

$$\partial_{t}\rho-(\rho z_{xx})_{x}=0$$

$$\rho_{1}(x)=\begin{cases}1-\beta/2&\text{ if } |x|<\beta,\\ (1-|x|)_{+}&\text{ if } |x|\geq\beta,\end{cases}$$

$$T(x)=\begin{cases}1-\sqrt{1-x(2-\beta)}&\text{ if } x\in[0,\beta),\\ x&\text{ if } x\geq\beta.\end{cases}$$

$$\varphi(x)=\begin{cases}\frac{x^{2}}{2}-x-\frac{(1-x(2-\beta))^{3/2}}{3(1-\beta/2)}+C&\text{ if } x\in[0,\beta),\\ 0&\text{ if } x>\beta,\end{cases}$$

$$C=-\frac{\beta^{2}}{2}+\beta+\frac{2(1-\beta)^{3}}{3(2-\beta)}.$$

$$\begin{split}\tau z(x)=&-\frac{x^{3}}{6}+\frac{x^{2}}{2}-\frac{4}{15(2-\beta)^{2}}[1-(2-\beta)x]^{\frac{5}{2}}\\ &+\Big(\frac{\beta^{2}}{2}-\beta-\frac{2(1-\beta)^{3}}{3(2-\beta)}\Big)x+\frac{4}{15(2-\beta)^{2}}\end{split}$$

$$\tau=\frac{\beta^{3}}{3}-\frac{\beta^{2}}{2}+\frac{4(1-(1-\beta)^{5})}{15(2-\beta)^{2}}-\frac{2(1-\beta)^{3}\beta}{3(2-\beta)}$$

$$\inf_{\rho\in{\cal P}_{\rm{ac}}(\Omega)}\Big\{\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)+J(\rho)\Big\}.$$


Full text


Guillaume Carlier Ceremade, UMR CNRS 7534, Université Paris Dauphine, Pl. de Lattre de Tassigny, 75775, Paris Cedex 16, France, and MOKAPLAN, INRIA-Paris, E-mail: [email protected]

Clarice Poon Centre for Mathematical Sciences, University of Cambridge, Wilberforce Rd, Cambridge CB3 0WA, United Kingdom, Email: [email protected]


Keywords: total variation, Wasserstein gradient flows, JKO scheme, fourth-order evolution equations.

MS Classification: 35G31, 49N15.

1 Introduction

Variational schemes based on total variation are extremely popular in image processing for denoising purposes. In particular, the seminal work of Rudin, Osher and Fatemi [25] has been extremely influential and is still the object of an intense stream of research, see [10] and the references therein. Continuous-time counterparts are well known to be related to the $L^{2}$ gradient flow of the total variation, see Bellettini, Caselles and Novaga [3], and to the mean-curvature flow, see Evans and Spruck [14]. The gradient flow of the total variation for other Hilbertian structures may be natural as well; in particular, the $H^{-1}$ case leads to a singular fourth-order evolution equation studied by Giga and Giga [15] and by Giga, Kuroda and Matsuoka [16]. In the present work, we consider another metric, namely the Wasserstein one.

Given an open subset $\Omega$ of $\mathbb{R}^{d}$ and $\rho\in L^{1}(\Omega)$ , recall that the total variation of $\rho$ is given by

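$$J(\rho):=\sup\Big\{\int_{\Omega}\mathrm{div}(z)\rho\;:\;z\in C_{c}^{1}(\Omega),\;\|z\|_{L^{\infty}}\leq 1\Big\} \tag{1.1}$$

and $\mathrm{BV}(\Omega)$ is by definition the subspace of $L^{1}(\Omega)$ consisting of those $\rho$'s in $L^{1}(\Omega)$ such that $J(\rho)$ is finite. The following fourth-order nonlinear evolution equation

$$\partial_{t}\rho+\mathrm{div}\Big(\rho\,\nabla\mathrm{div}\Big(\frac{\nabla\rho}{|\nabla\rho|}\Big)\Big)=0\ \text{ in } (0,T)\times\Omega,\quad \rho_{|_{t=0}}=\rho_{0}, \tag{1.2}$$

supplemented by the zero-flux boundary condition

$$\rho\,\nabla\mathrm{div}\Big(\frac{\nabla\rho}{|\nabla\rho|}\Big)\cdot\nu=0\ \text{ on } \partial\Omega \tag{1.3}$$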

has been proposed in [7] for the purpose of denoising image densities. Numerical schemes for approximating the solutions of this equation have been investigated in [7, 13, 4]. One should consider weak solutions and in particular interpret the nonlinear term $\mathrm{div}(\frac{\nabla\rho}{|\nabla\rho|})$ as the negative of an element of the subdifferential of $J$ at $\rho$ .

At least formally, when $\rho_{0}$ is a probability density on $\Omega$ , (1.2)-(1.3) can be viewed as the Wasserstein gradient flow of $J$ (we refer to the textbooks of Ambrosio, Gigli, Savaré [1] and Santambrogio [26], for a detailed exposition). Following the seminal work of Jordan, Kinderlehrer and Otto [17] for the Fokker-Planck equation, it is reasonable to expect that solutions of (1.2) can be obtained, at the limit $\tau\to 0^{+}$ , of the JKO Euler implicit scheme:

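$$\rho_{0}^{\tau}=\rho_{0},\quad \rho_{k+1}^{\tau}\in\mathrm{argmin}\Big\{\frac{1}{2\tau}W_{2}^{2}(\rho_{k}^{\tau},\rho)+J(\rho),\;\rho\in\mathrm{BV}(\Omega)\cap{\cal P}_{2}(\overline{\Omega})\Big\} \tag{1.4}$$

where ${\cal P}_{2}(\overline{\Omega})$ is the space of Borel probability measures on $\overline{\Omega}$ with finite second moment and $W_{2}$ is the quadratic Wasserstein distance:

$$W_{2}^{2}(\rho_{0},\rho_{1}):=\inf_{\gamma\in\Pi(\rho_{0},\rho_{1})}\Big\{\int_{\mathbb{R}^{d}\times\mathbb{R}^{d}}|x-y|^{2}\,\mathrm{d}\gamma(x,y)\Big\},$$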

$\Pi(\rho_{0},\rho_{1})$ denoting the set of transport plans between $\rho_{0}$ and $\rho_{1}$ i.e. the set of probability measures on $\mathbb{R}^{d}\times\mathbb{R}^{d}$ having $\rho_{0}$ and $\rho_{1}$ as marginals. Our aim is to study in detail the discrete TV-JKO scheme (1.4) as well as its connection with (suitable weak solutions) of the PDE (1.2). Although the assertion that (1.2) is the TV Wasserstein gradient flow is central to the numerical schemes described in [7, 13, 4], there has been so far, to the best of our knowledge, no theoretical justification of this fact.

Fourth-order equations which are Wasserstein gradient flows of functionals involving the gradient of $\rho$, such as the Dirichlet energy or the Fisher information, have been studied by McCann, Matthes and Savaré [22], who found a new method, the flow interchange technique, to prove higher-order estimates; we refer to [18] for a recent reference on this topic. The total variation is however too singular for such arguments to be directly applicable, as far as we know. We shall prove the convergence of JKO steps as $\tau\to 0^{+}$ under the extra assumption that densities remain bounded away from zero. Whether this extra assumption is reasonable or not is related to a minimum principle issue, interesting in its own right, namely the monotonicity of the infimum along JKO steps. We shall see that, in a convex domain, JKO steps obey a maximum principle (the maximum of the density is nonincreasing along JKO steps). The corresponding minimum principle seems more difficult to prove and we have been able to establish it only in some particular cases, namely in dimension one and in the radially symmetric case, even though we conjecture it is satisfied in more general situations.

The paper is organized as follows. In section 2, we start with the discussion of a few examples. Section 3 establishes optimality conditions for JKO steps thanks to an entropic regularization scheme. Section 4 is devoted to some properties of solutions of JKO steps, in particular a maximum principle based on a result of [11]; we also establish a minimum principle in dimension one and in the radially symmetric case. Finally, in section 5, we prove a conditional convergence result: we establish convergence of the JKO scheme, as $\tau\to 0^{+}$, under the extra assumption that the density remains bounded away from zero. This covers the unidimensional case as well as the radially symmetric case when the initial condition is strictly positive.

2 Some examples

We first recall the Kantorovich dual formulation of $W_{2}^{2}$ :

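$$\frac{1}{2}W_{2}^{2}(\mu_{0},\mu_{1})=\sup\Big\{\int_{\mathbb{R}^{d}}\psi\,\mathrm{d}\mu_{0}+\int_{\mathbb{R}^{d}}\varphi\,\mathrm{d}\mu_{1}\;:\;\psi(x)+\varphi(y)\leq\frac{|x-y|^{2}}{2}\Big\} \tag{2.1}$$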

an optimal pair $(\psi,\varphi)$ for this problem is called a pair of Kantorovich potentials. The existence of Kantorovich potentials is well-known and such potentials can be taken to be conjugates of each other, i.e. such that

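$$\varphi(x)=\inf_{y\in\mathbb{R}^{d}}\Big\{\frac{1}{2}|x-y|^{2}-\psi(y)\Big\},\quad \psi(y)=\inf_{x\in\mathbb{R}^{d}}\Big\{\frac{1}{2}|x-y|^{2}-\varphi(x)\Big\},$$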

which implies that $\varphi$ and $\psi$ are semi-concave (more precisely $\frac{1}{2}|.|^{2}-\varphi$ is convex). If $\mu_{1}$ is absolutely continuous with respect to the $d$ -dimensional Lebesgue measure, $\varphi$ is differentiable $\mu_{1}$ a.e. and the map $T=\mathrm{id}-\nabla\varphi$ is the gradient of a convex function pushing forward $\mu_{1}$ to $\mu_{0}$ which is in fact the optimal transport between $\mu_{0}$ and $\mu_{1}$ thanks to Brenier’s theorem [5]. In such a case, we will simply refer to $\varphi$ as a Kantorovich potential between $\mu_{1}$ and $\mu_{0}$ . We refer the reader to [28] and [26] for details.
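In dimension one the Brenier map is the monotone rearrangement, so $W_{2}^{2}$ can be evaluated from quantile functions. The following is a minimal Python sketch, assuming the standard one-dimensional identity $W_{2}^{2}(\mu_{0},\mu_{1})=\int_{0}^{1}|Q_{0}(u)-Q_{1}(u)|^{2}\,\mathrm{d}u$ (not stated in the text above) and two uniform densities on symmetric intervals chosen purely for illustration. The helper names `w2_squared_1d` and `uniform_quantile` and the half-widths are hypothetical.

```python
def w2_squared_1d(quantile0, quantile1, n=100_000):
    """Approximate W_2^2 between two 1D measures from their quantile functions.

    Assumes the 1D identity W_2^2 = int_0^1 |Q0(u) - Q1(u)|^2 du and
    discretizes it with a midpoint rule on n cells.
    """
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n                      # midpoint of the i-th cell of (0, 1)
        total += (quantile0(u) - quantile1(u)) ** 2
    return total / n


def uniform_quantile(alpha):
    """Quantile function of the uniform density on [-alpha, alpha]."""
    return lambda u: alpha * (2.0 * u - 1.0)


alpha0, alpha1 = 1.0, 1.5                      # hypothetical half-widths
w2sq = w2_squared_1d(uniform_quantile(alpha0), uniform_quantile(alpha1))

# For two such uniform densities the optimal map is linear, T = (alpha0/alpha1) id,
# which gives W_2^2 = (alpha1 - alpha0)^2 / 3.
closed_form = (alpha1 - alpha0) ** 2 / 3.0
print(w2sq, closed_form)
```

The two printed values should agree up to the quadrature error of the midpoint rule.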

In this section, we will consider some explicit examples which rely on the following sufficient optimality condition (details for a rigorous derivation of the Euler-Lagrange equation for JKO steps will be given in section 3) in the case of the whole space i.e. $\Omega=\mathbb{R}^{d}$ . Let us also recall that by Sobolev inequality $\mathrm{BV}(\mathbb{R}^{d})$ is continuously embedded in $L^{\frac{d}{d-1}}(\mathbb{R}^{d})$ .

Lemma 2.1.

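Let $\rho_{0}\in{\cal P}_{2}(\mathbb{R}^{d})$, $\tau>0$ and $\Omega=\mathbb{R}^{d}$ (so $J$ is the total variation on the whole space). If $\rho_{1}\in\mathrm{BV}(\mathbb{R}^{d})\cap{\cal P}_{2}(\mathbb{R}^{d})$ is such that

$$\frac{\varphi}{\tau}+\mathrm{div}(z)\geq 0,\ \text{ with equality } \rho_{1}\text{-a.e.} \tag{2.2}$$

where $\varphi$ is a Kantorovich potential between $\rho_{1}$ and $\rho_{0}$ and $z\in C^{1}(\mathbb{R}^{d})$, with $\|z\|_{L^{\infty}}\leq 1$, $\mathrm{div}(z)\in L^{d}(\mathbb{R}^{d})$ (so that $\mathrm{div}(z)\rho_{1}\in L^{1}(\mathbb{R}^{d})$), and

$$J(\rho_{1})=\int_{\mathbb{R}^{d}}\mathrm{div}(z)\rho_{1}, \tag{2.3}$$

then, setting

$$\Phi_{\tau,\rho_{0}}(\rho):=\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)+J(\rho),\quad\forall\rho\in\mathrm{BV}(\mathbb{R}^{d})\cap{\cal P}_{2}(\mathbb{R}^{d}) \tag{2.4}$$

one has

$$\Phi_{\tau,\rho_{0}}(\rho_{1})\leq\Phi_{\tau,\rho_{0}}(\rho),\quad\forall\rho\in\mathrm{BV}(\mathbb{R}^{d})\cap{\cal P}_{2}(\mathbb{R}^{d}).$$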

Proof.

For all $\rho\in\mathrm{BV}(\mathbb{R}^{d})\cap{\cal P}_{2}(\mathbb{R}^{d})$ , $J(\rho)\geq\int_{\mathbb{R}^{d}}\mathrm{div}(z)\rho=J(\rho_{1})+\int_{\mathbb{R}^{d}}\mathrm{div}(z)(\rho-\rho_{1})$ , and it follows from the Kantorovich duality formula that

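$$\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)\geq\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho_{1})+\int_{\mathbb{R}^{d}}\frac{\varphi}{\tau}(\rho-\rho_{1}).$$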

The claim then directly follows from (2.2). ∎

2.1 The case of a characteristic function

A simple illustration of Lemma 2.1 in dimension 1 concerns the case of a uniform $\rho_{0}$ , (here and in the sequel we shall denote by $\chi_{A}$ the characteristic function of the set $A$ ):

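$$\rho_{0}=\rho_{\alpha_{0}},\quad\alpha_{0}>0,\quad\rho_{\alpha}:=\frac{1}{2\alpha}\chi_{[-\alpha,\alpha]}.$$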

It is natural to make the ansatz that the minimizer of $\Phi_{\tau,\rho_{0}}$ defined by (2.4) remains of the form $\rho_{1}=\rho_{\alpha_{1}}$ for some $\alpha_{1}>\alpha_{0}$ . The optimal transport between $\rho_{\alpha_{1}}$ and $\rho_{0}$ being the linear map $T=\frac{\alpha_{0}}{\alpha_{1}}\mathrm{id}$ , a direct computation gives

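$$\Phi_{\tau,\rho_{0}}(\rho_{\alpha_{1}})=\frac{1}{\alpha_{1}}+\frac{1}{6\tau}(\alpha_{1}-\alpha_{0})^{2}$$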

which is minimal when $\alpha_{1}$ is the only root in $(\alpha_{0},+\infty)$ of

$$\alpha_{1}^{2}(\alpha_{1}-\alpha_{0})=3\tau. \tag{2.5}$$

To check that this is the correct guess, we shall check that the conditions of Lemma 2.1 are met. It is easy to check that the potential defined by

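$$\varphi(x)=\frac{1}{2\alpha_{1}}(\alpha_{1}-\alpha_{0})x^{2}-\frac{3\tau}{2\alpha_{1}}$$

is a Kantorovich potential between $\rho_{1}=\rho_{\alpha_{1}}$ and $\rho_{0}$. Define then $z_{1}$ (the guess for this construction comes from integrating the Euler-Lagrange equation on the support of $\rho_{\alpha_{1}}$) by

$$\tau z_{1}(x):=-\frac{(\alpha_{1}-\alpha_{0})}{6\alpha_{1}}x^{3}+\frac{3\tau x}{2\alpha_{1}},\quad x\in[-\alpha_{1},\alpha_{1}]$$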

extended by $1$ on $[\alpha_{1},+\infty)$ and $-1$ on $(-\infty,-\alpha_{1}]$ . By construction $-1\leq z_{1}\leq 1$ (use the fact that it is odd and nondecreasing on $[0,\alpha_{1}]$ thanks to (2.5)), also $z_{1}^{\prime}(\pm\alpha_{1})=0$ so that $z_{1}\in C^{1}(\mathbb{R})$ and $z_{1}(\alpha_{1})=1$ , $z_{1}(-\alpha_{1})=-1$ and one easily checks that $J(\rho_{1})=-\int_{\mathbb{R}}z_{1}D\rho_{1}=\int_{\mathbb{R}}z^{\prime}_{1}\rho_{1}$ (here and in the sequel $D\rho_{1}$ denotes the Radon measure which is the distributional derivative of the $\mathrm{BV}$ function $\rho_{1}$ ). Moreover $\tau z_{1}^{\prime}+\varphi\geq 0$ with an equality on $[-\alpha_{1},\alpha_{1}]$ . The optimality of $\rho_{1}=\rho_{\alpha_{1}}$ then directly follows from Lemma 2.1.
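The conditions of Lemma 2.1 for this example can also be checked numerically. The sketch below is only an illustration under stated assumptions: the values of `alpha0` and `tau` are hypothetical, the root $\alpha_{1}$ of $\alpha_{1}^{2}(\alpha_{1}-\alpha_{0})=3\tau$ is found by bisection (a choice made here, not taken from the paper), and the inequalities are tested on a finite grid rather than proved.

```python
def solve_alpha1(alpha0, tau, tol=1e-14):
    """Root in (alpha0, +inf) of a^2 (a - alpha0) = 3 tau, found by bisection."""
    f = lambda a: a * a * (a - alpha0) - 3.0 * tau
    lo, hi = alpha0, alpha0 + 1.0
    while f(hi) < 0.0:                      # grow the bracket until f changes sign
        hi = alpha0 + 2.0 * (hi - alpha0)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


alpha0, tau = 1.0, 0.05                     # hypothetical test values
alpha1 = solve_alpha1(alpha0, tau)
c = (alpha1 - alpha0) / alpha1              # recurring coefficient


def phi(x):
    """Kantorovich potential phi(x) = c x^2 / 2 - 3 tau / (2 alpha1)."""
    return 0.5 * c * x * x - 1.5 * tau / alpha1


def z1_inner(x):
    """Polynomial expression of z_1, valid on [-alpha1, alpha1]."""
    return (-c * x ** 3 / 6.0 + 1.5 * tau * x / alpha1) / tau


def z1(x):
    """z_1 extended by +1 to the right and -1 to the left of the support."""
    if x >= alpha1:
        return 1.0
    if x <= -alpha1:
        return -1.0
    return z1_inner(x)


def dz1(x):
    """Derivative of the extended z_1 (zero outside the support)."""
    if abs(x) >= alpha1:
        return 0.0
    return (-0.5 * c * x * x + 1.5 * tau / alpha1) / tau


xs = [2.0 * alpha1 * (2.0 * i / 4000 - 1.0) for i in range(4001)]
slack = [tau * dz1(x) + phi(x) for x in xs]

sup_norm = max(abs(z1(x)) for x in xs)                              # should be <= 1
min_slack = min(slack)                                              # should be >= 0
eq_residual = max(abs(s) for x, s in zip(xs, slack) if abs(x) < alpha1)
tv_pairing = (z1_inner(alpha1) - z1_inner(-alpha1)) / (2.0 * alpha1)  # int z1' rho1
tv_exact = 1.0 / alpha1                                             # J(rho_alpha1)
print(alpha1, sup_norm, min_slack, eq_residual, tv_pairing, tv_exact)
```

Up to rounding, the grid check reports $|z_{1}|\leq 1$, $\tau z_{1}^{\prime}+\varphi\geq 0$ with equality inside the support, and $\int z_{1}^{\prime}\rho_{1}=1/\alpha_{1}=J(\rho_{1})$.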

Of course, the argument can be iterated so as to obtain the full TV-JKO sequence:

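$$\rho_{k+1}^{\tau}=\mathrm{argmin}\;\Phi_{\tau,\rho_{k}^{\tau}}=\Big(\frac{\alpha_{k+1}^{\tau}}{\alpha_{k}^{\tau}}\mathrm{id}\Big)_{\#}\rho_{k}^{\tau}=\Big(\frac{\alpha_{k+1}^{\tau}}{\alpha_{0}}\mathrm{id}\Big)_{\#}\rho_{0}$$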

where $\alpha_{k}^{\tau}$ is defined inductively by

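$$(\alpha_{k+1}^{\tau}-\alpha_{k}^{\tau})(\alpha_{k+1}^{\tau})^{2}=3\tau,\quad\alpha_{0}^{\tau}=\alpha_{0}$$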

which is nothing but the implicit Euler discretization of the ODE

$$\alpha^{\prime}\alpha^{2}=3,\quad\alpha(0)=\alpha_{0},$$

whose solution is $\alpha(t)=(\alpha_{0}^{3}+9t)^{\frac{1}{3}}$ . Extending $\rho_{k}^{\tau}$ in a piecewise constant way: $\rho^{\tau}(t)=\rho_{k+1}^{\tau}$ for $t\in(k\tau,(k+1)\tau]$ , it is not difficult to check that $\rho^{\tau}$ converges (in $L^{\infty}((0,T),({\cal P}_{2}(\mathbb{R}),W_{2}))$ and in $L^{p}((0,T)\times\mathbb{R})$ for any $p\in(1,\infty)$ and any $T>0$ ) to $\rho$ given by $\rho(t,.)=(\frac{\alpha(t)}{\alpha_{0}}\rm{id})_{\#}\rho_{0}$ . Since $v(t,x)=\frac{\alpha^{\prime}(t)}{\alpha(t)}x$ is the velocity field associated to $X(t,x)=\frac{\alpha(t)}{\alpha_{0}}x$ , $\rho$ solves the continuity equation

$$\partial_{t}\rho+(\rho v)_{x}=0.$$

In addition, $\rho v=-\rho z_{xx}$ where

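$$z(t,x)=\frac{-\alpha^{\prime}(t)}{6\alpha(t)}x^{3}+\frac{3x}{2\alpha(t)},\quad x\in[-\alpha(t),\alpha(t)],$$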

extended by $1$ (respectively $-1$ ) on $[\alpha(t),+\infty)$ (respectively $(-\infty,-\alpha(t)]$ ). The function $z$ is $C^{1}$ , $\|z\|_{L^{\infty}}\leq 1$ and $z\cdot D\rho=-|D\rho|$ (in the sense of measures). In other words the limit $\rho$ of $\rho^{\tau}$ satisfies

$$\partial_{t}\rho-(\rho z_{xx})_{x}=0$$

with $|z|\leq 1$ and $z\cdot D\rho=-|D\rho|$ which is the natural weak form of (1.2) since $z_{xx}=\nabla\mathrm{div}(z)$ in dimension one.
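The relation between the discrete half-widths $\alpha_{k}^{\tau}$ and the ODE solution $\alpha(t)=(\alpha_{0}^{3}+9t)^{\frac{1}{3}}$ can be explored numerically. The sketch below iterates the implicit Euler recursion $(\alpha_{k+1}^{\tau}-\alpha_{k}^{\tau})(\alpha_{k+1}^{\tau})^{2}=3\tau$ and compares the result at a final time with the closed form. The bisection solver, the function names, the initial half-width, the final time and the list of time steps are all illustrative assumptions.

```python
def implicit_euler_step(alpha_k, tau, tol=1e-14):
    """Solve (a - alpha_k) * a^2 = 3 * tau for a > alpha_k by bisection."""
    f = lambda a: (a - alpha_k) * a * a - 3.0 * tau
    lo, hi = alpha_k, alpha_k + 1.0
    while f(hi) < 0.0:                      # grow the bracket until f changes sign
        hi = alpha_k + 2.0 * (hi - alpha_k)
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def jko_half_widths(alpha0, tau, n_steps):
    """Half-widths alpha_0, ..., alpha_n of the TV-JKO iterates for a uniform rho_0."""
    alphas = [alpha0]
    for _ in range(n_steps):
        alphas.append(implicit_euler_step(alphas[-1], tau))
    return alphas


def exact_alpha(alpha0, t):
    """Solution of alpha' alpha^2 = 3 with alpha(0) = alpha0."""
    return (alpha0 ** 3 + 9.0 * t) ** (1.0 / 3.0)


alpha0, T = 1.0, 1.0                        # hypothetical initial half-width and horizon
taus = [0.1, 0.05, 0.025]                   # hypothetical time steps
errors = []
for tau in taus:
    n = round(T / tau)
    alphas = jko_half_widths(alpha0, tau, n)
    errors.append(exact_alpha(alpha0, n * tau) - alphas[-1])

for tau, err in zip(taus, errors):
    print(f"tau = {tau:<6} alpha(T) - alpha_n = {err:.5f}")
```

With these values the gap at time $T$ shrinks as $\tau$ is halved, which is the behavior one expects from a first-order implicit scheme.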

2.2 Instantaneous creation of discontinuities

We now consider the case where $\rho_{0}(x)=(1-|x|)_{+}$ and will show that the JKO scheme instantaneously creates a discontinuity at the level of $\rho_{1}$ , the minimizer of $\Phi_{\tau,\rho_{0}}$ when $\tau$ is small enough. We indeed look for $\rho_{1}$ in the form:

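$$\rho_{1}(x)=\begin{cases}1-\beta/2&\text{ if } |x|<\beta,\\ (1-|x|)_{+}&\text{ if } |x|\geq\beta,\end{cases}$$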

for some well-chosen $\beta\in(0,1)$ . The optimal transport map $T$ between such a $\rho_{1}$ and $\rho_{0}$ is odd and given explicitly by

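$$T(x)=\begin{cases}1-\sqrt{1-x(2-\beta)}&\text{ if } x\in[0,\beta),\\ x&\text{ if } x\geq\beta.\end{cases}$$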

The Kantorovich potential which vanishes at $\beta$ (extended in an even way to $\mathbb{R}_{-}$ ) is then given by

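$$\varphi(x)=\begin{cases}\frac{x^{2}}{2}-x-\frac{(1-x(2-\beta))^{3/2}}{3(1-\beta/2)}+C&\text{ if } x\in[0,\beta),\\ 0&\text{ if } x>\beta,\end{cases}$$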

where

$$C=-\frac{\beta^{2}}{2}+\beta+\frac{2(1-\beta)^{3}}{3(2-\beta)}.$$

Let us now integrate $\tau z^{\prime}=-\varphi$ on $[0,\beta]$ with initial condition $z(0)=0$ , i.e. for $x\in[0,\beta]$

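$$\begin{split}\tau z(x)=&-\frac{x^{3}}{6}+\frac{x^{2}}{2}-\frac{4}{15(2-\beta)^{2}}[1-(2-\beta)x]^{\frac{5}{2}}\\ &+\Big(\frac{\beta^{2}}{2}-\beta-\frac{2(1-\beta)^{3}}{3(2-\beta)}\Big)x+\frac{4}{15(2-\beta)^{2}}\end{split}$$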

Note that $z$ is nondecreasing on $[0,\beta]$ (because $\varphi(0)<0$ , $\varphi(\beta)=0$ and $\varphi$ is convex on $[0,\beta]$ so that $\varphi\leq 0$ on $[0,\beta]$ ), our aim now is to find $\beta\in(0,1)$ in such a way that $z(\beta)=1$ i.e. replacing in the previous formula

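$$\tau=\frac{\beta^{3}}{3}-\frac{\beta^{2}}{2}+\frac{4(1-(1-\beta)^{5})}{15(2-\beta)^{2}}-\frac{2(1-\beta)^{3}\beta}{3(2-\beta)}$$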

the right hand-side is a continuous function of $\beta\in[0,1]$ taking value $0$ for $\beta=0$ and $\frac{1}{10}$ for $\beta=1$, hence as soon as $10\tau<1$ one may find a $\beta\in(0,1)$ such that indeed $z(\beta)=1$. Extend then $z$ by $1$ on $[\beta,+\infty)$ and to $\mathbb{R}_{-}$ in an odd way. We then have built a function $z$ which is $C^{1}$ (since $\varphi(\beta)=0$), such that $|z|\leq 1$, $z\cdot D\rho_{1}=-|D\rho_{1}|$ and such that $z^{\prime}+\frac{\varphi}{\tau}=0$. Thanks to Lemma 2.1, we conclude that $\rho_{1}$ is optimal. This example shows that discontinuities may appear at the very first iteration of the TV-JKO scheme.
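For a concrete time step, the value of $\beta$ can be found numerically from the relation $\tau=\frac{\beta^{3}}{3}-\frac{\beta^{2}}{2}+\frac{4(1-(1-\beta)^{5})}{15(2-\beta)^{2}}-\frac{2(1-\beta)^{3}\beta}{3(2-\beta)}$. The sketch below is an illustration under assumptions: the time step `tau = 0.05` is hypothetical, bisection is a choice made here, and the expression used for $\tau z$ is the primitive of $-\varphi$ vanishing at $0$, with the power term written as $[1-(2-\beta)x]^{5/2}$.

```python
def tau_of_beta(b):
    """Right-hand side of the relation tau = tau(beta) obtained from z(beta) = 1."""
    return (b ** 3 / 3.0 - b ** 2 / 2.0
            + 4.0 * (1.0 - (1.0 - b) ** 5) / (15.0 * (2.0 - b) ** 2)
            - 2.0 * (1.0 - b) ** 3 * b / (3.0 * (2.0 - b)))


def solve_beta(tau, tol=1e-14):
    """Bisection on [0, 1]; needs tau_of_beta(0) = 0 < tau < 1/10 = tau_of_beta(1)."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if tau_of_beta(mid) < tau:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def tau_z(x, b):
    """Primitive of -phi on [0, b] that vanishes at x = 0."""
    C = -b ** 2 / 2.0 + b + 2.0 * (1.0 - b) ** 3 / (3.0 * (2.0 - b))
    k = 4.0 / (15.0 * (2.0 - b) ** 2)
    return -x ** 3 / 6.0 + x ** 2 / 2.0 - k * (1.0 - (2.0 - b) * x) ** 2.5 - C * x + k


tau = 0.05                                   # hypothetical time step, below 1/10
beta = solve_beta(tau)

z_at_beta = tau_z(beta, beta) / tau          # should be 1
zs = [tau_z(beta * i / 1000, beta) / tau for i in range(1001)]
z_monotone = all(b2 >= b1 - 1e-12 for b1, b2 in zip(zs, zs[1:]))

mass = 2.0 * ((1.0 - beta / 2.0) * beta + (1.0 - beta) ** 2 / 2.0)  # integral of rho_1
jump = (1.0 - beta / 2.0) - (1.0 - beta)     # size of the jump of rho_1 at |x| = beta
print(beta, z_at_beta, z_monotone, mass, jump)
```

The jump of $\rho_{1}$ at $|x|=\beta$ equals $\beta/2$, so any $\beta\in(0,1)$ returned by the solver corresponds to a genuinely discontinuous minimizer.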

3 Euler-Lagrange equation for JKO steps

The aim of this section is to establish optimality conditions for (3.1). Despite the fact that it is a convex minimization problem, it involves two nonsmooth terms $J$ and $W^{2}_{2}(\rho_{0},.)$, so some care should be taken to justify the arguments rigorously. In the next subsection, we introduce an entropic regularization; the advantage of this strategy is that the minimizer will be positive everywhere, giving some differentiability of the transport term.

3.1 Entropic approximation

In this whole section, we assume that $\Omega$ is an open bounded connected (not necessarily convex) subset of $\mathbb{R}^{d}$ with Lipschitz boundary and denote by ${\cal P}_{\rm{ac}}(\Omega)$ the set of Borel probability measures on $\Omega$ that are absolutely continuous with respect to the Lebesgue measure (and will use the same notation for $\mu\in{\cal P}_{\rm{ac}}(\Omega)$ both for the measure $\mu$ and its density). Given $\rho_{0}\in{\cal P}_{\rm{ac}}(\Omega)$ and $\tau>0$ , we consider one step of the TV-JKO scheme:

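$$\inf_{\rho\in{\cal P}_{\rm{ac}}(\Omega)}\Big\{\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)+J(\rho)\Big\}. \tag{3.1}$$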

It is easy by the direct method of the calculus of variations to see that (3.1) has at least one solution, moreover $J$ being convex and $\rho\mapsto W_{2}^{2}(\rho,\rho_{0})$ being strictly convex whenever $\rho_{0}\in{\cal P}_{\rm{ac}}(\Omega)$ (see [26]), the minimizer is in fact unique, and in the sequel we denote it by $\rho_{1}$ . Given $h>0$ we consider the following approximation of (3.1):

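$$\inf_{\rho\in{\cal P}_{\rm{ac}}(\Omega)}\Big\{{\cal F}_{h}(\rho):=\frac{1}{2\tau}W_{2}^{2}(\rho_{0},\rho)+J(\rho)+h{\cal E}(\rho)\Big\} \tag{3.2}$$

where

$${\cal E}(\rho):=\int_{\Omega}\rho(x)\log(\rho(x))\,\mathrm{d}x.$$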

It is easy to see that (3.2) admits a unique solution $\rho_{h}$. Moreover, since $\Omega$ is bounded, ${\cal E}$ is lower bounded, hence $J(\rho_{h})$ is bounded. Recalling that the embedding $BV(\Omega)\subset L^{p}(\Omega)$ is compact for every $p\in[1,\frac{d}{d-1})$, one may therefore (up to extraction) assume that $\rho_{h}$ converges as $h\to 0$ a.e. and strongly in $L^{p}(\Omega)$ for every $p\in[1,\frac{d}{d-1})$ to some $\rho_{1}$, which, by a standard $\Gamma$-convergence argument, is easily seen to be the solution of (3.1). The advantage of this regularization is that not only each $\rho_{h}$ is bounded from below but also that $h\log(\rho_{h})$ is bounded from below uniformly in $h$ (but not in $\tau$ which is fixed throughout this section):

Proposition 3.1.

Up to passing to a subsequence, the family $\beta_{h}:=h\log(\rho_{h})$ is uniformly bounded from below. Moreover, $\beta_{h}$ is bounded in $L^{p}(\Omega)$ for any $p>1$ and $\max(0,\beta_{h})$ converges strongly to $0$ in $L^{p}(\Omega)$ for any $p>1$ .

Proof.

Let $t_{h}>0$ be such that the set $F_{t_{h}}^{h}:=\{\rho_{h}>t_{h}\}$ has positive measure and finite perimeter (recall that $\rho_{h}\in\mathrm{BV}$ ). Let us assume that there is an $\varepsilon\in(0,1)$ such that

[TABLE]

and

[TABLE]

We aim to show that $\varepsilon$ cannot be arbitrarily small. Define then $\mu_{\varepsilon,h}:=\max(\rho_{h},\varepsilon)$ that is $\varepsilon$ on $A_{\varepsilon,h}$ and $\rho_{h}$ elsewhere. Defining $c_{\varepsilon,h}:=\int_{\Omega}(\mu_{\varepsilon,h}-\rho_{h})$ and observing that $c_{\varepsilon,h}\leq\varepsilon|\Omega|$ , we see that (3.3) implies that $c_{\varepsilon,h}\leq\frac{1}{2}t_{h}|F_{t_{h}}^{h}|$ and $t_{h}\geq 2\varepsilon$ so that $A_{\varepsilon,h}$ and $F_{t_{h}}^{h}$ are disjoint. Finally, set

[TABLE]

See Figure 2, where we set $\tilde{c}_{\varepsilon,h}:=c_{\varepsilon,h}/|F_{t_{h}}^{h}|$ .

By construction $\rho_{\varepsilon,h}\in{\cal P}(\Omega)$ hence $0\leq{\cal F}_{h}(\rho_{\varepsilon,h})-{\cal F}_{h}(\rho_{h})$ , in this difference we have four terms, namely

• the Wasserstein term, which, using the Kantorovich duality formula (2.1) and the fact that $\Omega$ is bounded can be estimated in terms of $\|\rho_{\varepsilon,h}-\rho_{h}\|_{L^{1}}=2c_{\varepsilon,h}$ :

[TABLE]

for a constant $C$ that depends on $\Omega$ but neither on $\varepsilon$ nor $h$ ,

• the TV term: $J(\rho_{\varepsilon,h})-J(\rho_{h})$ : outside $F_{t_{h}}^{h}$ we have replaced $\rho_{h}$ by a $1$ -Lipschitz function of $\rho_{h}$ which decreases the TV semi-norm, on $F_{t_{h}}^{h}$ on the contrary we have created a jump of magnitude $c_{\varepsilon,h}/|F_{t_{h}}^{h}|$ so

[TABLE]

where $\mathrm{Per}(F_{t_{h}}^{h})=J(\chi_{F_{t_{h}}^{h}})$ denotes the perimeter of $F_{t_{h}}^{h}$ (in $\Omega$ ),

• the entropy variation on $A_{\varepsilon,h}$ , on this set both $\rho_{\varepsilon,h}$ and $\rho_{h}$ are less than $\varepsilon$ so that $(1+\log(t))\leq(1+\log(\varepsilon))$ whenever $t\in[\rho_{h},\rho_{\varepsilon,h}]$ which by the mean value theorem yields

[TABLE]

• the last term is the entropy variation on $F_{t_{h}}^{h}$ . It is convenient to split $F_{t_{h}}^{h}$ into $F_{t_{h}}^{h}\cap\{\rho_{\varepsilon,h}\geq\frac{1}{e}\}$ and $F_{t_{h}}^{h}\cap\{\rho_{\varepsilon,h}<\frac{1}{e}\}$ . The entropy variation on the first part is easy to control. Indeed, $t\mapsto t\log(t)$ is nondecreasing on $[\frac{1}{e},+\infty)$ . Since, on $F_{t_{h}}^{h}\cap\{\rho_{\varepsilon,h}\geq\frac{1}{e}\}$ , $\rho_{h}\geq\rho_{\varepsilon,h}\geq\frac{1}{e}$ , we have $(\rho_{\varepsilon,h}\log(\rho_{\varepsilon,h})-\rho_{h}\log(\rho_{h}))\leq 0$ . As for the second part, we observe that $F_{t_{h}}^{h}\cap\{\rho_{\varepsilon,h}<\frac{1}{e}\}\subset\{\rho_{h}\leq\frac{1}{e}+\frac{t_{h}}{2}\}$ , so on this set, both $\rho_{\varepsilon,h}$ and $\rho_{h}$ remain in the interval $[\frac{t_{h}}{2},\frac{1}{e}+\frac{t_{h}}{2}]$ . We thus have

[TABLE]

where

[TABLE]

Putting together (3.6)-(3.7)-(3.8)-(3.9), we arrive at

[TABLE]

which for small enough $\varepsilon$ is possible only when $c_{\varepsilon,h}=0$ i.e. $|A_{\varepsilon,h}|=0$ . More precisely, either we have the lower bound:

[TABLE]

or (3.3) is impossible i.e. $\rho_{h}\geq\frac{t_{h}|F_{t_{h}}^{h}|}{2|\Omega|}$ . To prove that $\beta_{h}=h\log(\rho_{h})$ is bounded from below uniformly in $h$ , it is therefore enough to show that we can find a family $t_{h}$ , bounded and bounded away from $0$, such that $|F_{t_{h}}^{h}|$ remains bounded away from $0$, and $\mathrm{Per}(F_{t_{h}}^{h})$ is uniformly bounded from above as $h\to 0$ . First note that, since $J(\rho_{h})$ is bounded, there exists $\rho$ such that $\rho_{h}\to\rho$ in $L^{1}$ and a.e. up to a subsequence, note also that $\rho\in\mathrm{BV}$ and $\rho$ is a probability density. Setting $F_{t}:=\{\rho>t\}$ , $F_{t}^{h}:=\{\rho_{h}>t\}$ , if $s>t$ , since $\rho_{h}$ converges a.e. to $\rho$ , we have a.e. $\liminf_{h}\chi_{F_{t}^{h}}\geq\chi_{F_{s}}$ . It then follows from Fatou’s Lemma that when $s>t$ , $\liminf_{h}|F_{t}^{h}|\geq|F_{s}|$ , hence choosing $0<\beta_{1}<\beta_{2}<\beta$ so that $|F_{\beta}|>0$ , we deduce that there exists $h_{0}>0$ and $c_{1}>0$ such that for all $t\in[\beta_{1},\beta_{2}]$ and all $h\in(0,h_{0}]$ , we have $c_{1}\leq|F_{t}^{h}|\leq|\Omega|$ . For an upper bound on perimeters, we observe that since $J(\rho_{h})\leq C$ , thanks to the co-area formula, we have

[TABLE]

So, there exists $t_{h}\in[\beta_{1},\beta_{2}]$ such that $\mathrm{Per}(F_{t_{h}}^{h})\leq C/(\beta_{2}-\beta_{1}).$

Finally, since $\rho_{h}$ converges in $L^{1}$ , we may assume that, up to a subsequence, $\rho_{h}\leq\phi$ for some $\phi\in L^{1}$ (see Theorem IV.9 in [6]). Then, by dominated convergence and since $\log(\max(\phi,1))\in L^{p}(\Omega)$ for every $p>1$ , we have that $\log(\max(\rho_{h},1))$ converges a.e. and in $L^{p}$ , in particular this implies that $\max(0,\beta_{h})$ converges to $0$ strongly in $L^{p}(\Omega)$ , and we have just seen that $\min(0,\beta_{h})$ is bounded in $L^{\infty}(\Omega)$ .

∎

Let us also recall some well-known facts (see [9]) about the total variation functional $J$ viewed as a convex l.s.c. and one-homogeneous functional on $L^{\frac{d}{d-1}}(\Omega)$ . Define

[TABLE]

where $\mathrm{div}(z)=\xi,\;z\cdot\nu=0$ on $\partial\Omega$ are to be understood in the weak sense

[TABLE]

Note that $\Gamma_{d}$ is closed and convex in $L^{d}(\Omega)$ and $J$ is its support function:

[TABLE]

As for the Wasserstein term, recalling the Kantorovich dual formulation (2.1), the derivative of $\rho\mapsto W^{2}_{2}(\rho_{0},\rho)$ will be expressed in terms of a Kantorovich potential between $\rho$ and $\rho_{0}$ .

We then have the following characterization for $\rho_{h}$ :

Proposition 3.2.

There exists $z_{h}\in L^{\infty}(\Omega,\mathbb{R}^{d})$ such that $\mathrm{div}(z_{h})\in L^{p}(\Omega)$ for every $p\in[1,+\infty)$ , $\|z_{h}\|_{L^{\infty}}\leq 1$ , $z_{h}\cdot\nu=0$ on $\partial\Omega$ , $J(\rho_{h})=\int_{\Omega}\mathrm{div}(z_{h})\rho_{h}$ and

[TABLE]

where $\varphi_{h}$ is the Kantorovich potential between $\rho_{h}$ and $\rho_{0}$ .

Proof.

Let $\mu\in L^{\infty}(\Omega)\cap\mathrm{BV}(\Omega)$ be such that $\int_{\Omega}\mu=0$ . Thanks to Proposition 3.1, we know that $\rho_{h}$ is bounded away from $0$, hence for small enough $t>0$ , $\rho_{h}+t\mu$ is positive, hence a probability density. Also, as a consequence of Theorem 1.52 in [26], we have that

[TABLE]

where $\varphi_{h}$ is the (unique up to an additive constant) Kantorovich potential between $\rho_{h}$ and $\rho_{0}$ , in particular $\varphi_{h}$ is Lipschitz and semi-concave ( $D^{2}\varphi_{h}\leq\mathrm{id}$ in the sense of measures and $\mathrm{id}-\nabla\varphi_{h}$ is the optimal transport between $\rho_{h}$ and $\rho_{0}$ ). By the optimality of $\rho_{h}$ and the fact that $J$ is a semi-norm, we get

[TABLE]

where

[TABLE]

Since $\varphi_{h}$ is defined up to an additive constant, we may choose it in such a way that $\xi_{h}$ has zero mean; doing so, (3.16) holds for any $\mu\in L^{\infty}(\Omega)\cap\mathrm{BV}(\Omega)$ (not necessarily with zero mean). Being Lipschitz, $\varphi_{h}$ is bounded. Also observe that $h(\log(\rho_{h}))_{+}=h\log(\max(1,\rho_{h}))$ is in $L^{p}(\Omega)$ for every $p\in[1,+\infty)$ since $\rho_{h}\in L^{\frac{d}{d-1}}(\Omega)$ and $h\log(\rho_{h})_{-}=-h\log(\min(1,\rho_{h}))$ is in $L^{\infty}(\Omega)$ thanks to Proposition 3.1, hence we have $\xi_{h}\in L^{p}(\Omega)$ for every $p\in[1,+\infty)$ .

By approximation and observing that $\xi_{h}\in L^{d}(\Omega)$ , (3.16) extends to all $\mu\in L^{\frac{d}{d-1}}(\Omega)$ . In particular, we have

[TABLE]

but since $\Gamma_{d}$ is convex and closed in $L^{d}(\Omega)$ , it follows from Hahn-Banach’s separation theorem that $\xi_{h}\in\Gamma_{d}$ . Finally, getting back to (3.16) (without the zero mean restriction on $\mu$ ) and taking $\mu=-\rho_{h}$ gives $J(\rho_{h})\leq\int_{\Omega}\xi_{h}\rho_{h}$ , and we then deduce that this should be an equality.

∎

3.2 Euler-Lagrange equation

We are now in position to rigorously establish the Euler-Lagrange equation for (3.1):

Theorem 3.3.

If $\rho_{1}$ solves (3.1), there exists $\varphi$ a Kantorovich potential between $\rho_{1}$ and $\rho_{0}$ (in particular $\mathrm{id}-\nabla\varphi$ is the optimal transport between $\rho_{1}$ and $\rho_{0}$ ), $\beta\in L^{\infty}(\Omega)$ , $\beta\geq 0$ and $z\in L^{\infty}(\Omega,\mathbb{R}^{d})$ such that

[TABLE]

and

[TABLE]

Remark 3.4.

It is not difficult (since (3.1) is a convex problem) to check that (3.17)-(3.18) are also sufficient optimality conditions. The main point here is that the right-hand side $\beta$ in (3.17), which is a multiplier associated with the nonnegativity constraint, is better than a measure: it is actually an $L^{\infty}$ function.

Proof.

As in section 3.1, we denote by $\rho_{h}$ the solution of the entropic approximation (3.2). Up to passing to a subsequence (not explicitly written), we may assume that $\rho_{h}$ converges a.e. and strongly in $L^{p}(\Omega)$ (for any $p\in[1,\frac{d}{d-1})$ ) to $\rho_{1}$ (the solution of (3.1), again by a standard $\Gamma$ -convergence argument). We then rewrite the Euler-Lagrange equation from Proposition 3.2 as

[TABLE]

where $\beta_{h}^{+}:=h\log(\max(\rho_{h},1))$ , $\beta_{h}^{-}:=-h\log(\min(\rho_{h},1))$ , and

[TABLE]

It follows from Proposition 3.1 that $\beta_{h}^{+}$ converges to $0$ strongly in any $L^{p}$, $p\in[1,+\infty)$, and that $\beta_{h}^{-}$ is bounded in $L^{\infty}$. Up to subsequences, we may therefore assume that $z_{h}$ and $\beta_{h}^{-}$ weakly-$*$ converge in $L^{\infty}$ respectively to some $z$ and $\beta$ with $\|z\|_{L^{\infty}}\leq 1$, $z\cdot\nu=0$ on $\partial\Omega$ and $\beta\geq 0$. As for the Kantorovich potentials $\varphi_{h}$, since the transport map $(\mathrm{id}-\nabla\varphi_{h})$ a.e. takes values in $\Omega$, we have $\|\nabla\varphi_{h}\|_{L^{\infty}}\leq{\rm{diam}}(\Omega)$, hence $\varphi_{h}$ is an equi-Lipschitz family because $\Omega$ is bounded. Moreover $\int_{\Omega}\varphi_{h}=\tau\int_{\Omega}(\beta_{h}^{-}-\beta_{h}^{+})$, which remains bounded, hence we may assume that $\varphi_{h}$ converges uniformly to some potential $\varphi$, and it is well-known (see [26]) that $\varphi$ is a Kantorovich potential between $\rho_{1}$ and $\rho_{0}$. Letting $h$ tend to $0$ gives (3.17).

Since $\rho_{h}$ converges strongly in $L^{1}$ to $\rho_{1}$ and $\beta_{h}^{-}$ converges weakly- $*$ to $\beta$ in $L^{\infty}$ we have

[TABLE]

hence $\beta\rho_{1}=0$. Thanks to (3.13), we obviously have $J(\rho_{1})\geq\int_{\Omega}\mathrm{div}(z)\rho_{1}$ (since $\mathrm{div}(z)\in L^{\infty}$, $\mathrm{div}(z)\in\Gamma_{d}$). For the converse inequality, it is enough to observe that

[TABLE]

and that $\mathrm{div}(z_{h})=-\frac{\varphi_{h}}{\tau}-\beta_{h}^{+}+\beta_{h}^{-}$ converges to $\mathrm{div}(z)$ weakly in $L^{q}$ for every $q\in[1,+\infty)$ . Since $\rho_{h}$ converges strongly to $\rho_{1}$ in $L^{q}$ when $q\in[1,\frac{d}{d-1})$ we deduce that $J(\rho_{1})=\int_{\Omega}\mathrm{div}(z)\rho_{1}$ which completes the proof of (3.18).

∎

A first consequence of the high integrability of $\mathrm{div}(z)$ is that one can give a meaning to $z\cdot\nabla u$ for any $u\in\mathrm{BV}(\Omega)$ . Indeed, if $q\in[\frac{d}{d-1},+\infty]$ and $q^{\prime}$ denotes its conjugate exponent, following Anzellotti [2], if $u\in\mathrm{BV}(\Omega)\cap L^{q}(\Omega)$ and $\sigma\in L^{\infty}(\Omega,\mathbb{R}^{d})$ is such that $\mathrm{div}(\sigma)\in L^{q^{\prime}}(\Omega)$ , one can define the distribution $\sigma\cdot Du$ by

[TABLE]

Then $\sigma\cdot Du$ is a Radon measure which satisfies $|\sigma\cdot Du|\leq\|\sigma\|_{L^{\infty}}|Du|$ (in the sense of measures) hence is absolutely continuous with respect to $|Du|$ . Moreover one can also define a weak notion of normal trace of $\sigma$ , $\sigma\cdot\nu\in L^{\infty}(\partial\Omega)$ such that the following integration by parts formula holds

[TABLE]

We refer to [2] for proofs. These considerations of course apply to $\sigma=z$ and $u=\rho_{1}\in\mathrm{BV}(\Omega)$, and in particular enable one to see $z\cdot D\rho_{1}$ as a measure and to interpret the optimality condition $J(\rho_{1})=\int_{\Omega}\mathrm{div}(z)\rho_{1}$ as $|D\rho_{1}|=-z\cdot D\rho_{1}$ in the sense of measures. Finally, the fact that $\mathrm{div}(z)\in L^{\infty}$ in Theorem 3.3 and the theory of variational mean curvature (see Tamanini [27], Massari [20, 21], Theorem 3.6 of Gonzalez and Massari [19]) allow for conclusions about the regularity of the level sets $F_{t}=\{\rho_{1}>t\}$ of $\rho_{1}$, the solution of (3.1). We do not elaborate further on this regularity here (which, anyway, only holds for fixed time step $\tau>0$).
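The identity $|D\rho_{1}|=-z\cdot D\rho_{1}$ can be illustrated on a one-dimensional grid. The following is a minimal sketch, not taken from the paper: it assumes a piecewise-constant density given by cell values, a field $z$ stored at cell interfaces, and the hypothetical helper names `discrete_tv`, `calibrating_field` and `pairing`. Choosing $z=-\mathrm{sign}(D\rho)$ at interior interfaces and $z=0$ at the two boundary points (the discrete no-flux condition) makes the discrete analogue of $\int_{\Omega}\mathrm{div}(z)\rho$ equal to the discrete total variation, while any other field with $|z|\leq 1$ gives a smaller or equal value.

```python
import numpy as np

def discrete_tv(rho):
    """Discrete total variation of cell values on a 1D grid: sum of the absolute jumps."""
    return float(np.sum(np.abs(np.diff(rho))))

def calibrating_field(rho):
    """Interface values z = -sign(D rho), padded with z = 0 at both boundary points."""
    z_inner = -np.sign(np.diff(rho))                 # one value per interior interface
    return np.concatenate(([0.0], z_inner, [0.0]))   # discrete no-flux condition

def pairing(z, rho):
    """Discrete analogue of int div(z) rho: sum_i (z_{i+1/2} - z_{i-1/2}) * rho_i."""
    return float(np.sum(np.diff(z) * rho))

# Illustrative cell values (an assumption, not data from the paper).
rho = np.array([0.5, 0.5, 2.0, 1.0, 1.0, 3.0, 0.2])

z = calibrating_field(rho)
tv = discrete_tv(rho)
dual = pairing(z, rho)            # equals tv by summation by parts

# Any other admissible field (|z| <= 1, zero at the boundary) cannot do better.
rng = np.random.default_rng(0)
z_other = np.concatenate(([0.0], rng.uniform(-1.0, 1.0, rho.size - 1), [0.0]))
dual_other = pairing(z_other, rho)
```

The grid spacing does not appear because the factor $1/\Delta x$ in the discrete divergence cancels against the $\Delta x$ of the quadrature.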

4 Maximum and minimum principles for JKO steps

Throughout this section, we further assume that $\Omega$ is a convex open bounded subset of $\mathbb{R}^{d}$. Our aim is to establish bounds on the TV-JKO iterates given by (3.1). Since the TV-JKO scheme aims at minimizing total variation at the fastest rate in the Wasserstein metric, it is natural to wonder whether, when the initial condition is bounded from above and from below, the JKO iterates remain so (with the same bounds). We shall answer affirmatively for the upper bound (maximum principle). As for the propagation of the lower bound (minimum principle), we have been able to prove it only in special cases (dimension one and the radially symmetric setting).

4.1 Convexity along generalized geodesics

Our aim is to deduce some bounds on $\rho_{1}$ from bounds on $\rho_{0}$ . To do so, we shall combine some convexity arguments and a remarkable $\mathrm{BV}$ estimate due to De Philippis et al. [11]. First we recall the notion of generalized geodesic from Ambrosio, Gigli and Savaré [1]. Given $\overline{\mu}$ , $\mu_{0}$ and $\mu_{1}$ in ${\cal P}_{\rm{ac}}(\Omega)$ , and denoting by $T_{0}$ (respectively $T_{1}$ ) the optimal transport (Brenier) map between $\overline{\mu}$ and $\mu_{0}$ (respectively $\mu_{1})$ , the generalized geodesic with base $\overline{\mu}$ joining $\mu_{0}$ to $\mu_{1}$ is by definition the curve of measures:

$$\mu_{t}:=\big((1-t)T_{0}+tT_{1}\big)_{\#}\overline{\mu},\quad t\in[0,1]. \tag{4.1}$$

A key property of these curves introduced in [1] is the strong convexity of the squared distance estimate:

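$$W_{2}^{2}(\overline{\mu},\mu_{t})\leq(1-t)W_{2}^{2}(\overline{\mu},\mu_{0})+tW_{2}^{2}(\overline{\mu},\mu_{1})-t(1-t)W_{2}^{2}(\mu_{0},\mu_{1}). \tag{4.2}$$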

It is well-known that if $G$ : $\mathbb{R}_{+}\to\mathbb{R}\cup\{+\infty\}$ is a proper convex lower semi-continuous (l.s.c.) internal energy density, bounded from below such that $G(0)=0$ and which satisfies McCann’s condition (see [23])

[TABLE]

then defining the generalized geodesic curve $(\mu_{t})_{t\in[0,1]}$ by (4.1), one has

[TABLE]

In particular $L^{p}$ and uniform bounds are stable along generalized geodesics:

[TABLE]

and

[TABLE]

An immediate consequence of (4.2) (see chapter 4 of [1] for general contraction estimates) is the following

Lemma 4.1.

Let $K$ be a nonempty subset of ${\cal P}_{\rm{ac}}(\Omega)$ , let $\mu_{0}\in K$ , $\mu_{1}\in{\cal P}_{\rm{ac}}(\Omega)$ , if $\hat{\mu}_{1}\in\mathrm{argmin}_{\mu\in K}W^{2}_{2}(\mu_{1},\mu)$ is a Wasserstein projection of $\mu_{1}$ onto $K$ , and if the generalized geodesic with base $\mu_{1}$ joining $\mu_{0}$ to $\hat{\mu}_{1}$ remains in $K$ then

$$W_{2}^{2}(\mu_{0},\hat{\mu}_{1})\leq W_{2}^{2}(\mu_{0},\mu_{1})-W_{2}^{2}(\mu_{1},\hat{\mu}_{1}).$$

Proof.

Since $\mu_{t}\in K$ we have $W^{2}_{2}(\mu_{1},\hat{\mu}_{1})\leq W_{2}^{2}(\mu_{1},\mu_{t})$ , applying (4.2) to the generalized geodesics with base $\mu_{1}$ joining $\mu_{0}$ to $\hat{\mu}_{1}$ we thus get

[TABLE]

dividing by $(1-t)$ and then taking $t=1$ therefore gives the desired result.

∎

The other result we shall use to derive bounds is a $\mathrm{BV}$ estimate of De Philippis et al. [11], which states that, given $\mu\in{\cal P}_{\rm{ac}}(\Omega)\cap\mathrm{BV}(\Omega)$ and $G$ : $\mathbb{R}_{+}\to\mathbb{R}\cup\{+\infty\}$ proper, convex and l.s.c., the solution of

[TABLE]

is $\mathrm{BV}$ with the bound

[TABLE]

Taking in particular,

[TABLE]

this implies that the Wasserstein projection of $\mu$ onto the set defined by the constraint $\rho\leq M$ has a smaller total variation than $\mu$ .

4.2 Maximum principle

Theorem 4.2.

Let $\rho_{0}\in{\cal P}_{\rm{ac}}(\Omega)\cap L^{\infty}(\Omega)$ and let $\rho_{1}$ be the solution of (3.1), then $\rho_{1}\in L^{\infty}(\Omega)$ with

$$\|\rho_{1}\|_{L^{\infty}(\Omega)}\leq\|\rho_{0}\|_{L^{\infty}(\Omega)}.$$

Proof.

Thanks to (4.5) the set $K:=\{\rho\in{\cal P}_{\rm{ac}}(\Omega):\;\rho\leq\|\rho_{0}\|_{L^{\infty}(\Omega)}\mbox{ a.e.}\}$ has the property that the generalized geodesics (with any base) joining two of its points remain in $K$. Let then $\hat{\rho}_{1}$ be the $W_{2}$ projection of $\rho_{1}$ onto $K$, i.e. the solution of $\inf_{\rho\in K}W^{2}_{2}(\rho_{1},\rho)$. Thanks to Lemma 4.1 we have $W^{2}_{2}(\rho_{0},\hat{\rho}_{1})\leq W_{2}^{2}(\rho_{0},\rho_{1})-W_{2}^{2}(\rho_{1},\hat{\rho}_{1})$ and thanks to Theorem 1.1 of [11], $J(\hat{\rho}_{1})\leq J(\rho_{1})$. The optimality of $\rho_{1}$ for (3.1) therefore implies $W_{2}(\rho_{1},\hat{\rho}_{1})=0$, i.e. $\rho_{1}\leq\|\rho_{0}\|_{L^{\infty}(\Omega)}$.

∎

Remark 4.3.

In section 3, we have used an approximation of (3.1) with an additional small entropy term; the same bound as in Theorem 4.2 remains valid in this case. Indeed, consider a proper, convex, l.s.c. and bounded from below internal energy density $G$ and consider, for given $h\geq 0$, the variant of (3.1)

[TABLE]

Then we claim that the solution $\rho_{h}$ still satisfies $\rho_{h}\leq\|\rho_{0}\|_{L^{\infty}(\Omega)}$. Indeed we have seen in the previous proof that the Wasserstein projection $\hat{\rho}_{h}$ of $\rho_{h}$ onto the constraint $\rho\leq\|\rho_{0}\|_{L^{\infty}(\Omega)}$ diminishes both $J$ and the Wasserstein distance to $\rho_{0}$. It turns out that it also diminishes the internal energy. Indeed, thanks to Proposition 5.2 of [11], there is a measurable set $A$ such that $\hat{\rho}_{h}=\chi_{A}\rho_{h}+\chi_{\Omega\setminus A}\|\rho_{0}\|_{L^{\infty}}$; it thus follows that $|\Omega\setminus A|\|\rho_{0}\|_{L^{\infty}}=\int_{\Omega\setminus A}\rho_{h}$. So, from the convexity of $G$ and Jensen’s inequality,

[TABLE]

thus yielding the same conclusion as above.
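The Jensen step of Remark 4.3 can be checked numerically. The sketch below is an illustration under stated assumptions, not the paper's construction: it takes $G(r)=r\log r$ (the entropy used in Section 3), represents $\Omega\setminus A$ by 200 cells of equal size, fills them with arbitrary positive values standing in for $\rho_{h}$, and sets the constant $M$ to their mean so that $|\Omega\setminus A|\,M=\int_{\Omega\setminus A}\rho_{h}$ holds by construction. The names `entropy_density`, `rho_on_B` and `M` are hypothetical.

```python
import numpy as np

def entropy_density(r):
    """G(r) = r log r with the convention G(0) = 0, a convex function on [0, +inf)."""
    r = np.asarray(r, dtype=float)
    safe = np.where(r > 0, r, 1.0)            # avoid log(0); the product is zeroed below
    return np.where(r > 0, r * np.log(safe), 0.0)

rng = np.random.default_rng(0)
n_cells = 200
dx = 1.0 / n_cells                            # equal cell size on the set B = Omega \ A

# Stand-in for rho_h on B (illustrative values only).
rho_on_B = rng.uniform(0.5, 3.0, size=n_cells)

# Constant value taken by the projection on B; mass on B is preserved by construction.
M = rho_on_B.mean()

energy_before = float(np.sum(entropy_density(rho_on_B)) * dx)   # int_B G(rho_h)
energy_after = float(entropy_density(M) * n_cells * dx)         # |B| * G(M)
```

By Jensen's inequality `energy_after` cannot exceed `energy_before`, which is the discrete counterpart of the inequality used in the remark.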

4.3 Minimum principle in special cases

In dimension one, it turns out that we can obtain bounds from below by the same convexity arguments as for the maximum principle of Theorem 4.2:

Proposition 4.4.

Assume that $d=1$, that $\Omega$ is a bounded interval and that $\rho_{0}\geq\alpha>0$ a.e. on $\Omega$. Then the solution $\rho_{1}$ of (3.1) also satisfies $\rho_{1}\geq\alpha>0$ a.e. on $\Omega$.

Proof.

The proof is similar to that of Theorem 4.2 but using the Wasserstein projection on the set $K:=\{\rho\in{\cal P}_{\rm{ac}}(\Omega)\;:\;\rho\geq\alpha\}$ , the only thing to check to be able to use Lemma 4.1 is that for any basepoint $\overline{\mu}$ and any $\mu_{0}$ and $\mu_{1}$ in $K$ , the generalized geodesic with base point $\overline{\mu}$ joining $\mu_{0}$ to $\mu_{1}$ remains in $K$ . The optimal transport maps $T_{0}$ and $T_{1}$ from $\overline{\mu}$ to $\mu_{0}$ and $\mu_{1}$ respectively are nondecreasing and continuous and setting $T_{t}:=(1-t)T_{0}+tT_{1}$ , one has

[TABLE]

which is easily seen to imply that $\mu_{t}\geq\alpha$ a.e. ∎
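The stability of the lower bound along one-dimensional generalized geodesics can be checked numerically. The sketch below relies on a standard one-dimensional fact that is not spelled out in the text: for an atomless base $\overline{\mu}$, the maps $T_{i}$ equal $Q_{i}\circ F_{\overline{\mu}}$, where $Q_{i}$ is the quantile function of $\mu_{i}$, so the quantile function of $\mu_{t}$ is $(1-t)Q_{0}+tQ_{1}$ whatever the base. The densities, the grid sizes and the helper names `quantile_function` and `geodesic_density` are illustrative assumptions.

```python
import numpy as np

n = 400
x_edges = np.linspace(0.0, 1.0, n + 1)            # cells of the interval Omega = (0, 1)
x_mid = 0.5 * (x_edges[:-1] + x_edges[1:])
dx = np.diff(x_edges)

def normalise(rho):
    """Rescale piecewise-constant cell values so that they integrate to one."""
    return rho / np.sum(rho * dx)

def quantile_function(rho, s):
    """Quantile function of a piecewise-constant probability density, evaluated at s."""
    cdf = np.concatenate(([0.0], np.cumsum(rho * dx)))
    cdf /= cdf[-1]                                 # remove rounding drift at the endpoint
    return np.interp(s, cdf, x_edges)              # inverse of the piecewise-linear CDF

# Two illustrative densities, both bounded from below by alpha > 0.
rho0 = normalise(1.0 + 0.8 * np.sin(2.0 * np.pi * x_mid))
rho1 = normalise(np.where(x_mid < 0.3, 0.4, 1.5))
alpha = min(rho0.min(), rho1.min())

s = np.linspace(0.0, 1.0, 2001)                    # mass levels
q0 = quantile_function(rho0, s)
q1 = quantile_function(rho1, s)

def geodesic_density(t):
    """Average density of mu_t between consecutive quantiles: mass / length."""
    q_t = (1.0 - t) * q0 + t * q1
    return np.diff(s) / np.diff(q_t)

min_density = min(geodesic_density(t).min() for t in np.linspace(0.0, 1.0, 11))
```

Each quantile increment of $\mu_{0}$ and $\mu_{1}$ is at most the mass increment divided by $\alpha$, and this bound passes to convex combinations, so `min_density` stays above `alpha` up to rounding.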

As a consequence of the previous minimum principle, integrating the Euler-Lagrange equation one can deduce higher regularity for the dual variable $z$ :

Corollary 4.5.

Assume that $d=1$ and $\Omega$ is a bounded interval. If $\rho_{1}$ solves (3.1) and $z$ is as in Theorem 3.3 then $z\in W^{1,\infty}_{0}(\Omega)$ . If in addition $\rho_{0}\geq\alpha>0$ a.e. on $\Omega$ , then $z\in W^{3,\infty}(\Omega)$ .

Proof.

The first claim is obvious because both $\varphi$ and $\beta$ ( $\varphi$ , $\beta$ and $z$ are as in Theorem 3.3) are bounded hence so is $z^{\prime}$ . As for the second one when $\rho_{0}\geq\alpha>0$ , thanks to Proposition 4.4, we also have $\rho_{1}\geq\alpha$ hence $\beta=0$ in (3.17) and in this case $\mathrm{div}(z)=z^{\prime}=-\frac{\varphi}{\tau}$ is Lipschitz i.e. $z\in W^{2,\infty}$ . One can actually go one step further because $x-\varphi^{\prime}(x)=T(x)$ where $T$ is the optimal (monotone) transport between $\rho_{1}$ and $\rho_{0}$ . This map is explicit in terms of the cumulative distribution function of $\rho_{1}$ , $F_{1}$ , and $F_{0}^{-1}$ the inverse of $F_{0}$ , the cumulative distribution function of $\rho_{0}$ , namely $T=F_{0}^{-1}\circ F_{1}$ . But $F_{1}$ is Lipschitz since its derivative is $\rho_{1}$ which is $\mathrm{BV}$ hence bounded and $F_{0}^{-1}$ is Lipschitz as well since $\rho_{0}\geq\alpha>0$ . This gives that $\varphi\in W^{2,\infty}$ hence $z\in W^{3,\infty}$ . ∎
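The formula $T=F_{0}^{-1}\circ F_{1}$ from the proof above is easy to evaluate on a grid. The following is a minimal sketch under illustrative assumptions: both densities are piecewise constant on a uniform partition of $(0,1)$, the function name `monotone_transport` is hypothetical, and the two sample densities are not from the paper. It also checks the Lipschitz bound $T^{\prime}\leq\|\rho_{1}\|_{L^{\infty}}/\alpha$ that underlies the claim $\varphi\in W^{2,\infty}$.

```python
import numpy as np

def monotone_transport(rho_src, rho_dst, x_edges):
    """Values at x_edges of T = F_dst^{-1} o F_src for piecewise-constant densities."""
    dx = np.diff(x_edges)
    F_src = np.concatenate(([0.0], np.cumsum(rho_src * dx)))
    F_dst = np.concatenate(([0.0], np.cumsum(rho_dst * dx)))
    F_src, F_dst = F_src / F_src[-1], F_dst / F_dst[-1]   # both CDFs end exactly at 1
    return np.interp(F_src, F_dst, x_edges)               # invert F_dst by interpolation

n = 500
x_edges = np.linspace(0.0, 1.0, n + 1)
x_mid = 0.5 * (x_edges[:-1] + x_edges[1:])
dx = np.diff(x_edges)

# Illustrative densities: rho1 plays the role of the JKO step, rho0 of the previous one.
rho1 = 1.0 + 0.5 * np.cos(2.0 * np.pi * x_mid)
rho1 /= np.sum(rho1 * dx)
rho0 = np.where(x_mid < 0.5, 0.6, 1.4)
rho0 /= np.sum(rho0 * dx)
alpha = rho0.min()                         # lower bound on rho0

T = monotone_transport(rho1, rho0, x_edges)   # transport from rho1 to rho0
slopes = np.diff(T) / dx                      # cell averages of T'
lipschitz_bound = rho1.max() / alpha          # T' = rho1 / rho0(T) <= max(rho1) / alpha
phi_prime = x_edges - T                       # phi'(x) = x - T(x)
```

Since $z^{\prime}=-\varphi/\tau$ when $\beta=0$, a Lipschitz bound on $\varphi^{\prime}$ such as the one on `phi_prime` here is what gives the third derivative of $z$ in $L^{\infty}$.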

The proof of Proposition 4.4 unfortunately does not generalize to higher dimensions, because densities which are bounded from below by $\alpha$ are not stable by generalized geodesics. In the radially symmetric case, we can use the Euler-Lagrange equation to derive a minimum principle. We believe that JKO steps preserve lower bounds in more general situations but have not been able to prove it.

Proposition 4.6.

Assume that $\Omega=B(0,R)$ is the ball centered at $0$ of radius $R>0$ in $\mathbb{R}^{d}$, and that $\rho_{0}$ is radially symmetric with $\rho_{0}\geq\alpha>0$ a.e. on $\Omega$. Then the solution $\rho_{1}$ of (3.1) also satisfies $\rho_{1}\geq\alpha>0$ a.e. on $\Omega$.

Proof.

Let us write $\rho_{0}(x)={\widetilde{\rho}}_{0}(r)$ with $r=|x|\in[0,R]$ , since (3.1) is invariant by rotation and strictly convex, it is easy to see that its unique solution $\rho_{1}$ is also radially symmetric, let us write it as $\rho_{1}(x)={\widetilde{\rho}}_{1}(r)$ . Denoting by $c_{d}$ the $(d-1)$ -Hausdorff measure of the unit sphere $S^{d-1}$ , and setting ${\widetilde{\mu}}_{0}:=c_{d}r^{d-1}{\widetilde{\rho}}_{0}$ , ${\widetilde{\mu}}_{1}:=c_{d}r^{d-1}{\widetilde{\rho}}_{1}$ , observe that ${\widetilde{\rho}}_{1}$ is the minimizer of the one-dimensional convex functional

[TABLE]

among nonnegative densities ${\widetilde{\rho}}$ on $(0,R)$ such that $c_{d}\int_{0}^{R}r^{d-1}{\widetilde{\rho}}=1$ and $r^{d-1}D{\widetilde{\rho}}$ is a bounded Radon measure on $(0,R)$ . Arguing as in the proof of Theorem 3.3, the minimizer ${\widetilde{\rho}}_{1}$ is characterized by the Euler-Lagrange equation

[TABLE]

where ${\widetilde{\varphi}}$ is a Kantorovich potential between ${\widetilde{\mu}}_{1}$ and ${\widetilde{\mu}}_{0}$ and ${\widetilde{z}}\in L^{\infty}(0,R)$ is such that

[TABLE]

Note that (4.12) implies that $r^{d-1}{\widetilde{z}}$ is Lipschitz so that ${\widetilde{z}}$ is locally Lipschitz and

[TABLE]

Since ${\widetilde{\rho}}_{1}\in\mathrm{BV}_{\rm{loc}}(0,R)$ , we can perform a Hahn-Jordan decomposition of $D{\widetilde{\rho}}_{1}$ :

[TABLE]

and set

[TABLE]

Next, we observe that, using (4.14), we have $|D{\widetilde{\rho}}_{1}|=\nu^{+}+\nu^{-}=-{\widetilde{z}}(\nu^{+}-\nu^{-})$. We thus deduce that ${\widetilde{z}}=-1=\min{\widetilde{z}}$ $\nu^{+}$-a.e., and since ${\widetilde{z}}$ is continuous we actually have ${\widetilde{z}}=-1$ on $A^{+}=\mathrm{spt}(\nu^{+})$. In a similar way, ${\widetilde{z}}=1=\max{\widetilde{z}}$ on $A^{-}:=\mathrm{spt}(\nu^{-})$.

Now let us show that ${\widetilde{\rho}}_{1}\geq\alpha$ . Assume, by contradiction, that the set where ${\widetilde{\rho}}_{1}<\alpha$ has positive measure in $(0,R)$ , and let $r_{0}\in(0,R)$ be a continuity point of ${\widetilde{\rho}}_{1}$ such that ${\widetilde{\rho}}_{1}(r_{0})<\alpha$ , define then

[TABLE]

We then have $0\leq a_{-}<a_{+}\leq R$. Let us assume that $a_{-}>0$. We claim then that $a_{-}\in A^{-}$, since otherwise ${\widetilde{\rho}}_{1}$ would be nondecreasing in a neighbourhood of $a_{-}$, which would imply ${\widetilde{\rho}}_{1}(a_{-}-\varepsilon)\leq\alpha$ for small $\varepsilon>0$, contradicting the definition of $a_{-}$; we thus have ${\widetilde{z}}(a_{-})=1$. Since ${\widetilde{\rho}}_{1}$ is $\mathrm{BV}$ in a neighbourhood of $a_{-}$, it has a right and a left limit at $a_{-}$. Again by minimality of $a_{-}$, the left limit of ${\widetilde{\rho}}_{1}$ at $a_{-}$ cannot be strictly smaller than $\alpha$, so there is an $\varepsilon>0$ such that ${\widetilde{\rho}}_{1}>0$ on $I_{-}:=[a_{-}-\varepsilon,a_{-})$. Hence on $I_{-}$, (4.12) becomes

[TABLE]

moreover, on $I_{-}$ , ${\widetilde{\varphi}}$ is actually of class $C^{1}$ with ${\widetilde{\varphi}}^{\prime}(r)=r-{\widetilde{T}}(r)$ where ${\widetilde{T}}$ is the (continuous) optimal transport between ${\widetilde{\mu}}_{1}$ and ${\widetilde{\mu}}_{0}$ obtained by the relation $F_{{\widetilde{\mu}}_{0}}\circ{\widetilde{T}}=F_{{\widetilde{\mu}}_{1}}$ (where $F_{{\widetilde{\mu}}_{i}}$ is the cumulative distribution function of ${\widetilde{\mu}}_{i}$ for $i=0,1$ ). One can therefore differentiate (4.17) on $I_{-}$ so as to obtain

[TABLE]

Since ${\widetilde{z}}$ is maximal at $a_{-}$ , we first have

[TABLE]

but recalling (4.12) we also have

[TABLE]

which shows that ${\widetilde{z}}$ is differentiable at $a_{-}$ with ${\widetilde{z}}^{\prime}(a_{-})=0$ , this enables us to deduce that ${\widetilde{z}}^{\prime\prime}(a_{-}^{-}):=\lim_{\delta\to 0^{+}}{\widetilde{z}}^{\prime\prime}(a_{-}-\delta)\leq 0$ , with (4.18) this gives

[TABLE]

If $a_{-}=0$ , since ${\widetilde{T}}(0)=0$ , the same conclusion is reached with an equality. In a similar way, we obtain ${\widetilde{T}}(a_{+})\geq a_{+}$ (again with an equality in case $a_{+}=R$ ). Using the fact that ${\widetilde{\rho}}_{1}\leq\alpha$ on $(a_{-},a_{+})$ (with strict inequality in a neighbourhood of $r_{0}$ ) together with $F_{{\widetilde{\mu}}_{0}}\circ{\widetilde{T}}=F_{{\widetilde{\mu}}_{1}}$ and ${\widetilde{\rho}}_{0}\geq\alpha$ , we get

[TABLE]

which yields the desired contradiction.

∎

Let us remark that the proof of Proposition 4.6 gives an alternative proof of the minimum principle in dimension one.

5 Convergence of the TV-JKO scheme under a lower bound estimate

We are now interested in the convergence of the TV-JKO scheme to a solution of the fourth-order nonlinear equation (1.2) as the time step $\tau$ goes to $0$. Throughout this section, we assume that $\Omega$ is a bounded open convex subset of $\mathbb{R}^{d}$ and that the initial condition $\rho_{0}$ satisfies

[TABLE]

We fix a time horizon $T$ , and for small $\tau>0$ , define the sequence $\rho_{k}^{\tau}$ by

[TABLE]

for $k=0,\ldots,N_{\tau}$ with $N_{\tau}:=[\frac{T}{\tau}]$. Thanks to Theorem 4.2, (5.1) ensures that the JKO iterates $\rho_{k}^{\tau}$ defined by (5.2) also remain bounded: $\rho_{k}^{\tau}\leq\|\rho_{0}\|_{L^{\infty}(\Omega)}$. We shall also assume that $\rho_{k}^{\tau}$ remains bounded from below by $\alpha$:

[TABLE]

which holds, as we have seen in subsection 4.3 when $d=1$ or when $\Omega$ is a ball and $\rho_{0}$ is radially symmetric.

We extend this discrete sequence by piecewise constant interpolation i.e.

[TABLE]
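The piecewise constant interpolation can be sketched as a small lookup. This is only an illustration: it assumes the left-open convention $t\in(k\tau,(k+1)\tau]$ that appears later in the proof of Theorem 5.2, with the value $\rho_{k+1}^{\tau}$ on that interval and $\rho_{0}$ at $t=0$. The function name `piecewise_constant_interpolant` and the string labels standing in for the densities are hypothetical.

```python
import math

def piecewise_constant_interpolant(iterates, tau):
    """Return t -> rho^tau(t), equal to iterates[k + 1] for t in (k*tau, (k+1)*tau]."""
    def rho_tau(t):
        if t <= 0.0:
            return iterates[0]                      # initial condition at t = 0
        k = math.ceil(t / tau) - 1                  # index with t in (k*tau, (k+1)*tau]
        return iterates[min(k + 1, len(iterates) - 1)]   # clamp beyond the last step
    return rho_tau

# Labels stand in for the densities rho_0, rho_1, ... (assumption for illustration).
iterates = ["rho_0", "rho_1", "rho_2", "rho_3", "rho_4"]
rho_tau = piecewise_constant_interpolant(iterates, tau=0.25)

value_at_quarter = rho_tau(0.25)      # right endpoint of the first interval
value_just_after = rho_tau(0.26)      # first point of the second interval
```

The time step $0.25$ is chosen because it is exactly representable in floating point; for a general $\tau$, the ratio `t / tau` may need a small tolerance before rounding up.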

We shall see that $\rho^{\tau}$ converges to a solution $\rho$ of

[TABLE]

with the no-flux boundary condition

[TABLE]

Let us introduce the spaces

[TABLE]

Since $\rho$ is no more than $\mathrm{BV}$ in $x$, one has to be slightly cautious about the meaning of $\mathrm{div}(\frac{\nabla\rho}{|\nabla\rho|})$, which can be conveniently done by interpreting this term as the negative of an element of the subdifferential of $J$ (in the $L^{2}$ sense). For every $\rho\in\mathrm{BV}(\Omega)\cap L^{2}(\Omega)$ let us define

[TABLE]

This leads to the following definition:

Definition 5.1.

A weak solution of (5.5)-(5.6) is a $\rho\in L^{\infty}((0,T),\mathrm{BV}(\Omega)\cap L^{\infty}(\Omega))\cap C^{0}([0,T],({\cal P}(\overline{\Omega}),W_{2}))$ such that there exists $z\in L^{\infty}((0,T)\times\Omega)\cap L^{2}((0,T),H^{2}_{\mathrm{div}}(\Omega))$ with

[TABLE]

and $\rho$ is a weak solution of

[TABLE]

i.e. for every $u\in C_{c}^{\infty}([0,T)\times\overline{\Omega})$

[TABLE]

We then have

Theorem 5.2.

If $\rho_{0}$ satisfies (5.1) and the JKO iterates $\rho_{k}^{\tau}$ obey the lower bound (5.3), there exists a vanishing sequence of time steps $\tau_{n}\to 0$ such that the sequence $\rho^{\tau_{n}}$ constructed by (5.2)-(5.4) converges strongly in $L^{p}((0,T)\times\Omega)$ for any $p\in[1,+\infty)$ and in $L^{\infty}((0,T),({\cal P}(\overline{\Omega}),W_{2}))$ to a weak solution of (5.5)-(5.6).

Proof.

First, $\rho_{0}$ being $L^{\infty}$ , we have a uniform $L^{\infty}$ bound on $\rho^{\tau}$ thanks to Theorem 4.2, and from our extra lower bound assumption (5.3) we have

[TABLE]

Moreover, by construction of the TV-JKO scheme (5.2), one has

[TABLE]

By using an Aubin-Lions type compactness theorem of Savaré and Rossi (Theorem 2 in [24]), the fact that the embedding of $\mathrm{BV}(\Omega)$ into $L^{p}(\Omega)$ is compact for every $p\in[1,\frac{d}{d-1})$, as well as a refinement of the Arzelà-Ascoli theorem (Proposition 3.3.1 in [1]), one obtains (see section 4 of [12] or section 5 of [8] for details) that, up to taking a suitable sequence of vanishing time steps $\tau_{n}\to 0$, we may assume that

[TABLE]

and

[TABLE]

for some limit curve $\rho\in C^{0,\frac{1}{2}}([0,T],({\cal P}(\overline{\Omega}),W_{2}))\cap L^{q}((0,T)\times\Omega)$ . From (5.9) and Lebesgue’s dominated convergence Theorem, we deduce that the convergence in (5.11) actually holds for any $p\in[1,+\infty)$ . It also follows from (5.9) and (5.10), that $\rho\in L^{\infty}((0,T),\mathrm{BV}(\Omega)\cap L^{\infty}(\Omega))$ and that $\rho\geq\alpha$ .

We deduce from the fact that $\rho_{k}^{\tau}\geq\alpha>0$ and Theorem 3.3 that for each $k=0,\ldots,N_{\tau}$ , there exists $z_{k}^{\tau}\in L^{\infty}(\Omega,\mathbb{R}^{d})$ such that $\mathrm{div}(z_{k}^{\tau})\in W^{1,\infty}(\Omega)$ and

[TABLE]

and the optimal (backward) transport $T_{k+1}^{\tau}$ from $\rho_{k+1}^{\tau}$ to $\rho_{k}^{\tau}$ is related to $z_{k+1}^{\tau}$ by

[TABLE]

We extend $z_{k}^{\tau}$ in a piecewise constant way i.e. set

[TABLE]

We then observe that

[TABLE]

Thanks to (5.10) we thus deduce that $\nabla\mathrm{div}z^{\tau}$ is bounded in $L^{2}((0,T)\times\Omega)$. Since $\mathrm{div}(z^{\tau})$ has zero mean, the Poincaré-Wirtinger inequality gives

[TABLE]

We may therefore assume (up to further suitable extractions) that there is some $z\in L^{\infty}((0,T)\times\Omega)\cap L^{2}((0,T),H^{2}_{\mathrm{div}}(\Omega))$ such that $z^{\tau}$ converges to $z$ weakly $*$ in $L^{\infty}((0,T)\times\Omega)$ and $(\mathrm{div}(z^{\tau}),\nabla\mathrm{div}(z^{\tau}))$ converges weakly in $L^{2}((0,T)\times\Omega)$ to $(\mathrm{div}(z),\nabla\mathrm{div}(z))$ . Of course $\|z\|_{L^{\infty}}\leq 1$ and $z(t,.)\cdot\nu=0$ on $\partial\Omega$ for a.e. $t$ . Note also that $\rho^{\tau}\nabla\mathrm{div}(z^{\tau})$ converges weakly in $L^{1}((0,T)\times\Omega)$ to $\rho\nabla\mathrm{div}(z)$ .

The limiting equation can now be derived using standard computations (see the proof of Theorem 5.1 of the seminal work [17], or chapter 8 of [26]): Let $u\in C_{c}^{2}([0,T)\times\overline{\Omega})$ and observe that

[TABLE]

Recalling that $\rho_{k}^{\tau}={T_{k+1}^{\tau}}_{\#}\rho_{k+1}^{\tau}$ , and applying Taylor’s theorem, we have

[TABLE]

where $|\tilde{R}_{\tau}(x)|\leq C\|D^{2}u(k\tau,\cdot)\|_{L^{\infty}}|T_{k+1}^{\tau}(x)-x|^{2}$ . Note also that for $t\in(k\tau,(k+1)\tau]$ , $|\nabla u(k\tau,\cdot)-\nabla u(t,\cdot)|\leq\tau\|\partial_{t}\nabla u\|_{L^{\infty}}$ . Therefore,

[TABLE]

with

[TABLE]

Passing to the limit as $\tau$ tends to $0$ in (5.17) yields that $\rho$ is a weak solution to

[TABLE]

It remains to prove that $J(\rho(t,.))=\int_{\Omega}\mathrm{div}(z(t,x))\rho(t,x)\mbox{d}x$ , for a.e. $t\in(0,T)$ . The inequality $J(\rho(t,.))\geq\int_{\Omega}\mathrm{div}(z(t,x))\rho(t,x)\mbox{d}x$ is obvious since $z(t,.)\in H^{1}_{\mathrm{div}}(\Omega)$ , $z(t,.)\cdot\nu=0$ on $\partial\Omega$ and $\|z(t,.)\|_{L^{\infty}}\leq 1$ . To prove the converse inequality, we use Fatou’s Lemma, the lower semi-continuity of $J$ , (5.13) and the weak-convergence in $L^{1}((0,T)\times\Omega)$ of $\rho^{\tau}\mathrm{div}(z^{\tau})$ to $\rho\mathrm{div}(z)$ :

[TABLE]

which concludes the proof.

∎

Acknowledgements: The authors wish to thank Vincent Duval and Gabriel Peyré for suggesting the TV-Wasserstein problem to them as well as for fruitful discussions. They also thank Maxime Laborde and Filippo Santambrogio for helpful remarks in particular regarding the maximum principle.

Bibliography

[1] Luigi Ambrosio, Nicola Gigli, and Giuseppe Savaré. Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2008.
[2] G. Anzellotti. Pairing between measures and bounded functions and compensated compactness. Ann. di Matematica Pura ed Appl., IV(135):293–318, 1983.
[3] Giovanni Bellettini, Vicent Caselles, and Matteo Novaga. The total variation flow in $\mathbb{R}^{n}$. Journal of Differential Equations, 184(2):475–525, 2002.
[4] Martin Benning, Luca Calatroni, Bertram Düring, and Carola-Bibiane Schönlieb. A primal-dual approach for a total variation Wasserstein flow. In Geometric Science of Information, pages 413–421. Springer, 2013.
[5] Yann Brenier. Polar factorization and monotone rearrangement of vector-valued functions. Comm. Pure Appl. Math., 44(4):375–417, 1991.
[6] Haïm Brezis. Analyse fonctionnelle. Collection Mathématiques Appliquées pour la Maîtrise. [Collection of Applied Mathematics for the Master’s Degree]. Masson, Paris, 1983. Théorie et applications. [Theory and applications].
[7] Martin Burger, Marzena Franek, and Carola-Bibiane Schönlieb. Regularized regression and density estimation based on optimal transport. Applied Mathematics Research eXpress, 2012(2):209–253, 2012.
[8] Guillaume Carlier and Maxime Laborde. A splitting method for nonlinear diffusions with nonlocal, nonpotential drifts. Nonlinear Anal., 150:1–18, 2017.
