On the extremal points of the ball of the Benamou-Brenier energy

Kristian Bredies; Marcello Carioni; Silvio Fanzon; Francisco Romero

arXiv:1907.11589·math.OC·April 26, 2023

TL;DR

This paper characterizes the extremal points of the unit ball of the Benamou–Brenier energy under the continuity equation constraint, showing that they are pairs of measures concentrated on characteristic curves, and applies this to sparse solutions of dynamic inverse problems.

Contribution

It characterizes the extremal points of the unit ball of the Benamou–Brenier energy and of a coercive generalization of it, and uses this characterization to describe sparse solutions of dynamic inverse problems.

Findings

01

Extremal points are pairs of measures on characteristic curves.

02

Representation formula for sparse solutions in inverse problems.

03

Extension to a coercive generalization of the energy.

Abstract

In this paper we characterize the extremal points of the unit ball of the Benamou–Brenier energy and of a coercive generalization of it, both subjected to the homogeneous continuity equation constraint. We prove that extremal points consist of pairs of measures concentrated on absolutely continuous curves which are characteristics of the continuity equation. Then, we apply this result to provide a representation formula for sparse solutions of dynamic inverse problems with finite dimensional data and optimal-transport based regularization.

Equations

$$\frac{1}{2}\int_{0}^{1}\int_{\overline{\Omega}}|v_{t}(x)|^{2}\,d\rho_{t}(x)\,dt,$$

$$\partial_{t}\rho_{t}+\operatorname{div}(\rho_{t}v_{t})=0\quad\text{subject to}\quad\rho_{t=0}=\rho_{0},\ \rho_{t=1}=\rho_{1}.$$

$$B(\rho,m):=\frac{1}{2}\int_{X}\left|\frac{dm}{d\rho}(t,x)\right|^{2}d\rho(t,x),$$

$$\partial_{t}\rho+\operatorname{div}m=0.$$

$$J_{\alpha,\beta}(\rho,m):=\beta B(\rho,m)+\alpha\lVert\rho\rVert_{\mathcal{M}(X)}\quad\text{subject to}\quad\partial_{t}\rho+\operatorname{div}m=0,$$

$$C_{\alpha,\beta}:=\{(\rho,m)\in\mathcal{M}:J_{\alpha,\beta}(\rho,m)\leq 1\}.$$

$$\rho=a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad m=\dot{\gamma}(t)\,a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad a_{\gamma}=\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)^{-1},$$

$$\dot{\gamma}(t)=v(t,\gamma(t))\quad\text{for a.e. }t\in(0,1),$$

$$\min_{u\in\mathcal{U}}R(u)\quad\text{subject to}\quad Au=y,$$

$$\mathcal{M}:=\mathcal{M}(X)\times\mathcal{M}(X;\mathbb{R}^{d}),$$

$$\mathcal{D}:=\{(\rho,m)\in\mathcal{M}:\partial_{t}\rho+\operatorname{div}m=0\ \text{in }X\},$$

$$\int_{X}\partial_{t}\varphi\,d\rho+\int_{X}\nabla\varphi\cdot dm=0\quad\text{for all }\varphi\in C_{c}^{\infty}(X).$$

$$\Psi(t,x):=\begin{cases}\dfrac{|x|^{2}}{2t} & \text{if }t>0,\\ 0 & \text{if }t=|x|=0,\\ +\infty & \text{otherwise.}\end{cases}$$

$$B(\rho,m):=\int_{X}\Psi\left(\frac{d\rho}{d\lambda},\frac{dm}{d\lambda}\right)d\lambda,$$

$$J_{\alpha,\beta}(\rho,m):=\begin{cases}\beta B(\rho,m)+\alpha\lVert\rho\rVert_{\mathcal{M}(X)} & \text{if }(\rho,m)\in\mathcal{D},\\ +\infty & \text{otherwise,}\end{cases}$$

$$\int_{X}\varphi(t,x)\,d\rho(t,x)=\int_{0}^{1}\int_{\overline{\Omega}}\varphi(t,x)\,d\rho_{t}(x)\,dt\quad\text{for all }\varphi\in L^{1}_{\rho}(X),$$

$$t\mapsto\int_{\overline{\Omega}}\varphi(x)\,d\rho_{t}(x)$$

$$B(\rho,m)=\int_{X}\Psi(1,v)\,d\rho=\frac{1}{2}\int_{X}|v|^{2}\,d\rho.$$

$$\int_{0}^{1}\int_{\overline{\Omega}}|v|^{2}\,d\rho_{t}(x)\,dt<+\infty,$$

$$\alpha\lVert\rho\rVert_{\mathcal{M}(X)}\leq J_{\alpha,\beta}(\rho,m),\qquad\min(2\alpha,\beta)\lVert m\rVert_{\mathcal{M}(X;\mathbb{R}^{d})}\leq J_{\alpha,\beta}(\rho,m).$$

$$\sup_{n}J_{\alpha,\beta}(\rho^{n},m^{n})<+\infty,$$

$$\begin{cases}(\rho^{n},m^{n})\overset{*}{\rightharpoonup}(\rho,m) & \text{weakly* in }\mathcal{M},\\ \rho_{t}^{n}\overset{*}{\rightharpoonup}\rho_{t} & \text{weakly* in }\mathcal{M}(\overline{\Omega})\text{ for every }t\in[0,1].\end{cases}$$

$$C_{\alpha,\beta}:=\{(\rho,m)\in\mathcal{M}:J_{\alpha,\beta}(\rho,m)\leq 1\}.$$

$$\rho=a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad m=\dot{\gamma}(t)\,a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad a_{\gamma}:=\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)^{-1},$$

$$\operatorname{Ext}(C_{\alpha,\beta})=\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}.$$

$$\Gamma:=\{\gamma\colon[0,1]\to\mathbb{R}^{d}:\gamma\text{ continuous}\}$$

$$\Gamma_{v}(\mathbb{R}^{d}):=\{\gamma\in\Gamma:\gamma\in{\rm AC}^{2}([0,1];\mathbb{R}^{d}),\ \dot{\gamma}(t)=v(t,\gamma(t))\text{ for a.e. }t\in(0,1)\}.$$

$$\Gamma_{v}(\overline{\Omega}):=\{\gamma\in\Gamma_{v}(\mathbb{R}^{d}):\gamma(t)\in\overline{\Omega}\text{ for all }t\in[0,1]\}.$$

$$\int_{0}^{1}\int_{\overline{\Omega}}|v(t,x)|^{2}\,d\rho_{t}(x)\,dt<+\infty.$$

$$\int_{\overline{\Omega}}\varphi(x)\,d\rho_{t}(x)=\int_{\Gamma}\varphi(\gamma(t))\,d\sigma(\gamma)\quad\text{for every }\varphi\in C(\overline{\Omega}),\ t\in[0,1].$$

Full text

On the extremal points of the ball of the Benamou–Brenier energy

Kristian Bredies

Institute of Mathematics and Scientific Computing, University of Graz, Heinrichstraße 36, 8010 Graz, Austria

Marcello Carioni

University of Cambridge, Department of Applied Mathematics and Theoretical Physics, Wilberforce Road, Cambridge CB3 0WA, UK

Silvio Fanzon

Institute of Mathematics and Scientific Computing, University of Graz, Heinrichstraße 36, 8010 Graz, Austria

Francisco Romero

Institute of Mathematics and Scientific Computing, University of Graz, Heinrichstraße 36, 8010 Graz, Austria

Keywords: Benamou–Brenier energy, extremal points, continuity equation, superposition principle, dynamic inverse problems, sparsity

Mathematics Subject Classification (2010): 52A05, 49N45, 49J45, 35F05

1. Introduction

The classical theory of Optimal Transport deals with the problem of efficiently transporting mass from a probability distribution into a target one. In the last thirty years, great advances in the understanding of the underlying theory have been achieved [2, 46, 49]. However, only recently have these techniques started to be applied to computational problems in a wide variety of fields, with logistics problems [8, 18, 19, 20], crowd dynamics [37, 38], image processing [29, 35, 39, 41, 44, 47, 48], inverse problems [16, 32] and machine learning [5, 27, 28, 40, 45, 52] being a few examples.

In this paper we focus on the so-called Benamou–Brenier formula, which provides an equivalent dynamic formulation of the classical Monge–Kantorovich transport problem [31]. Introduced by Benamou and Brenier in [6], this formula allows one to compute an optimal transport between two probability measures $\rho_{0}$ and $\rho_{1}$ on a closed bounded domain $\overline{\Omega}\subset\mathbb{R}^{d}$ through the minimization of the kinetic energy

$$\frac{1}{2}\int_{0}^{1}\int_{\overline{\Omega}}|v_{t}(x)|^{2}\,d\rho_{t}(x)\,dt, \tag{1}$$

among all the pairs $(\rho_{t},v_{t})$ , where $t\mapsto\rho_{t}$ is a curve of probability measures on $\overline{\Omega}$ , $v_{t}\colon\overline{\Omega}\to\mathbb{R}^{d}$ is a time-dependent vector field and the pair $(\rho_{t},v_{t})$ satisfies distributionally the continuity equation

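$$\partial_{t}\rho_{t}+\operatorname{div}(\rho_{t}v_{t})=0\quad\text{subject to}\quad\rho_{t=0}=\rho_{0},\ \rho_{t=1}=\rho_{1}. \tag{2}$$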

The interest around the Benamou–Brenier formulation is motivated by its remarkable properties. First, it allows one to compute an optimal transport in an efficient way [6] by means of a convex reformulation of (1), obtained by introducing the momentum $m_{t}=\rho_{t}v_{t}$. More precisely, denoting by $X:=(0,1)\times\overline{\Omega}$ the time-space cylinder, the Benamou–Brenier energy (1) can be equivalently defined as a convex functional on the space of bounded Borel measures $\mathcal{M}:=\mathcal{M}(X)\times\mathcal{M}(X;\mathbb{R}^{d})$ by setting

$$B(\rho,m):=\frac{1}{2}\int_{X}\left|\frac{dm}{d\rho}(t,x)\right|^{2}d\rho(t,x), \tag{3}$$

whenever $(\rho,m)\in\mathcal{M}$ are such that $\rho\geq 0$ , $m\ll\rho$ , and $B:=+\infty$ otherwise. With this change of variables, the continuity equation at (2) assumes the form

$$\partial_{t}\rho+\operatorname{div}m=0. \tag{4}$$
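The benefit of the momentum variable is already visible pointwise: the density $|m|^{2}/(2\rho)$ is jointly convex in $(\rho,m)$, while $\rho|v|^{2}/2$ is not jointly convex in $(\rho,v)$. The following Python sketch checks midpoint convexity at one pair of sample points in dimension one. The function names and the sample values are illustrative assumptions and do not come from the paper.

```python
def kinetic_rho_v(rho, v):
    """Kinetic energy density written in the variables (rho, v)."""
    return 0.5 * rho * v * v

def kinetic_rho_m(rho, m):
    """The same density in the variables (rho, m), with m = rho * v and rho > 0."""
    return 0.5 * m * m / rho

def midpoint_gap(f, p, q):
    """f(midpoint) minus the average of f(p) and f(q).
    A strictly positive value contradicts convexity of f."""
    mid = tuple(0.5 * (a + b) for a, b in zip(p, q))
    return f(*mid) - 0.5 * (f(*p) + f(*q))

# Sample points chosen by hand for illustration.
gap_v = midpoint_gap(kinetic_rho_v, (3.0, 1.0), (1.0, 3.0))
gap_m = midpoint_gap(kinetic_rho_m, (3.0, 3.0), (1.0, 3.0))

print(gap_v)  # positive: convexity fails in (rho, v)
print(gap_m)  # non-positive: consistent with convexity in (rho, m)
```

A single pair of points can only disprove convexity, so the second value is a consistency check and not a proof.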

In addition, the dynamic nature of the Benamou–Brenier reformulation of optimal transport is at the core of many recent developments in the fields of PDEs, optimal transport and inverse problems. Indeed, the dynamic formulation allows one to endow the space of probability measures with a differentiable structure [2, 49], making possible the characterization of differential equations as gradient flows in spaces of measures [4, 30, 43] or the derivation of sharp inequalities [23, 36, 42]. Moreover, it motivated recent developments in unbalanced optimal transport theory [21, 22, 33, 34], i.e., when the marginals are arbitrary positive measures. Finally, as the Benamou–Brenier energy provides a description of the optimal flow of the transported mass at each time $t$, which is valuable information in applications, it was recently employed as a regularizer for variational inverse problems [12, 13, 16, 29, 35, 50].

The goal of this paper is to characterize the extremal points of the unit ball of the Benamou–Brenier energy $B$ at (3), and of a coercive version of it, which is obtained by adding the total variation of $\rho$ to $B$ . Both functionals are constrained via the continuity equation (2). Precisely, we introduce the functional

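$$J_{\alpha,\beta}(\rho,m):=\beta B(\rho,m)+\alpha\lVert\rho\rVert_{\mathcal{M}(X)}\quad\text{subject to}\quad\partial_{t}\rho+\operatorname{div}m=0, \tag{5}$$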

defined for all $(\rho,m)\in\mathcal{M}$ and $\alpha\geq 0$ , $\beta>0$ . We then characterize the extremal points of the subset of $\mathcal{M}$ defined by

$$C_{\alpha,\beta}:=\{(\rho,m)\in\mathcal{M}:J_{\alpha,\beta}(\rho,m)\leq 1\}.$$

We emphasize that we do not enforce initial conditions on the continuity equation in (5). To be more specific, we prove the following result (see Theorem 6).

Theorem.

Let $\alpha\geq 0$ , $\beta>0$ . The extremal points of the set $C_{\alpha,\beta}$ are exactly given by the zero measure $(0,0)$ and the pairs of measures $(\rho,m)$ such that

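$$\rho=a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad m=\dot{\gamma}(t)\,a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad a_{\gamma}=\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)^{-1},$$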

where $\gamma:[0,1]\to\overline{\Omega}$ is an absolutely continuous curve with weak derivative $\dot{\gamma}\in L^{2}$ , and such that $a_{\gamma}<+\infty$ . If $\alpha=0$ the condition $a_{\gamma}<+\infty$ is satisfied if and only if $\gamma$ is not constant.

We therefore show that the extremal points of the set $C_{\alpha,\beta}$ are pairs of measures $(\rho,m)$ , with $\rho$ concentrated on some absolutely continuous curve $\gamma$ in $\overline{\Omega}$ , and the density of $m$ with respect to $\rho$ is given by $\dot{\gamma}$ . Notice that such conditions are equivalent to the existence of a measurable field $v\colon X\to\mathbb{R}^{d}$ such that

$$\dot{\gamma}(t)=v(t,\gamma(t))\quad\text{for a.e. }t\in(0,1), \tag{6}$$

thus showing that $\gamma$ is a characteristic associated to the continuity equation (2) with respect to the field $v$. We prove the above Theorem in Section 3, with the aid of a probabilistic version of the superposition principle for positive measure solutions to the continuity equation (2) on the domain $(0,1)\times\overline{\Omega}$ (see Theorem 7). We mention that the ideas behind such a superposition principle are not new: they were originally introduced in [3] for positive measures on $(0,1)\times\mathbb{R}^{d}$ (see also [2, 7, 51]). The result of Theorem 7 allows one to decompose any measure solution $(\rho,m)$ of the continuity equation (4) with bounded Benamou–Brenier energy as a superposition of measures concentrated on absolutely continuous characteristics of (4), that is, curves solving (6) with $v=dm/d\rho$. As a consequence, we show that any pair of measures that is not of such a form can be written as a proper convex combination of elements of $C_{\alpha,\beta}$, and thus it is not an extremal point. The opposite inclusion follows from the convexity of the energy and the properties of the continuity equation.

The interest in characterizing extremal points of the Benamou–Brenier energy is not only theoretical. It has been recently shown in [15] and [11] that in the context of variational inverse problems with finite-dimensional data, the structure of sparse solutions is linked to the extremal points of the unit ball of the regularizer. In the classical theory of variational inverse problems one aims to solve

$$\min_{u\in\mathcal{U}}R(u)\quad\text{subject to}\quad Au=y, \tag{7}$$

where $\mathcal{U}$ is the target space, $R$ is a convex regularizer, $A$ is a linear observation operator mapping to a finite-dimensional space and $y$ is the observation. It has been empirically observed that the presence of the regularizer $R$ promotes the existence of sparse solutions, namely minimizers that can be represented as a finite linear combination of simpler atoms. While this effect is well understood when $\mathcal{U}$ is finite dimensional, the infinite-dimensional case has been addressed only recently [11, 15, 24, 25, 53, 54, 55]. In particular, in [11, 15], it has been shown that, under suitable assumptions on $R$ and $A$, there exists a minimizer of (7) that can be represented as a finite linear combination of extremal points of the unit ball of $R$; namely, the atoms forming a sparse solution are the extremal points of the ball of the regularizer.
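A finite-dimensional analogue may help to fix ideas. Take $\mathcal{U}=\mathbb{R}^{n}$, $R=\|\cdot\|_{1}$ and a single scalar measurement $Au=a\cdot u$. The extremal points of the $\ell^{1}$ unit ball are the signed coordinate vectors $\pm e_{i}$, and since $|y|=|a\cdot u|\leq\max_{i}|a_{i}|\,\|u\|_{1}$, a minimizer is a multiple of one such atom. The Python sketch below illustrates this toy case only; it is not the setting of the paper, and the function names and data are assumptions made for the example.

```python
def sparse_l1_solution(a, y):
    """A minimizer of ||u||_1 subject to a . u = y for one scalar measurement.
    It is supported on a single coordinate where |a_i| is largest."""
    k = max(range(len(a)), key=lambda i: abs(a[i]))
    u = [0.0] * len(a)
    u[k] = y / a[k]
    return u

def l1_norm(u):
    return sum(abs(c) for c in u)

def measure(a, u):
    return sum(ai * ui for ai, ui in zip(a, u))

a = [1.0, -4.0, 2.0]   # hypothetical measurement vector
y = 2.0                # hypothetical observation
u = sparse_l1_solution(a, y)

# Two other feasible points, for comparison of the l1 norms.
others = [[2.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(u, l1_norm(u), [l1_norm(w) for w in others])
```

With one measurement the solution uses one atom; the representation results in [11, 15] bound the number of atoms in terms of the dimension of the data space.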

In view of the above discussion, in Section 4 we apply our characterization of the extremal points of the energy $J_{\alpha,\beta}$ at (5) to understand the structure of sparse solutions for inverse problems with such transport energy acting as regularizer. We mention that the analysis is carried out for the case $\alpha>0$, as the functional $J_{0,\beta}$, corresponding to the rescaled Benamou–Brenier energy, lacks compactness properties (see Remark 1). We verify that the assumptions needed to apply the representation theorems in [15] and [11] are satisfied by $J_{\alpha,\beta}$, and consequently we deduce the existence of a minimizer that is given by a finite linear combination of measures concentrated on absolutely continuous curves in $\overline{\Omega}$ (see Theorem 10). As a specific application of Theorem 10 we consider the setting introduced in [16], where the regularizer $J_{\alpha,\beta}$ is coupled with a fidelity term that penalizes the distance between the unknown measure $\rho_{t}$ computed at $t_{1},\ldots,t_{N}\in(0,1)$ and the observation at such times (see Section 4.2). This setting is relevant for applications such as variational reconstruction in undersampled dynamic MRI. Employing the previous results we are able to prove the existence of a sparse solution represented by a finite linear combination of measures concentrated on absolutely continuous curves in $\overline{\Omega}$ (see Corollary 12).

To conclude, we mention that characterizing the extremal points for a given regularizer has important consequences in devising algorithms able to compute a sparse solution. Notable examples have been proposed for the total variation regularizer in the space of measures [10, 17] using so-called generalized conditional gradient methods (or Frank–Wolfe-type algorithms [26]). Inspired by the previous methods, and building on the theoretical results obtained in the present paper, we plan to develop numerical algorithms to compute sparse solutions of dynamic inverse problems with the optimal transport energy $J_{\alpha,\beta}$ as a regularizer [12, 13], effectively providing a numerical counterpart to the theoretical framework established in [16]. Finally, we remark that similar results to the ones presented in this paper can be obtained for unbalanced optimal transport energies. This has been recently achieved in [14], by introducing a novel superposition principle for measure solutions to the inhomogeneous continuity equation.

2. Mathematical setting and preliminaries

In this section we give the basic notions about the continuity equation, the Benamou–Brenier energy, and its coercive version $J_{\alpha,\beta}$ anticipated in the introduction. We refer to [2, 6, 16] for a more detailed overview. For measure theoretical notions, we refer to the definitions in [1].

Given a metric space $Y$ we will denote by $\mathcal{M}(Y)$ (resp. $\mathcal{M}(Y;\mathbb{R}^{d})$ ) the space of bounded Borel measures (resp. bounded vector Borel measures) on $Y$ . Similarly, $\mathcal{M}^{+}(Y)$ and $\mathcal{P}(Y)$ denote the set of bounded positive Borel measures and Borel probability measures on $Y$ , respectively. Let $\Omega\subset\mathbb{R}^{d}$ be an open, bounded domain with $d\in\mathbb{N},d\geq 1$ . Set $X:=(0,1)\times\overline{\Omega}$ ,

$$\mathcal{M}:=\mathcal{M}(X)\times\mathcal{M}(X;\mathbb{R}^{d}),$$

and

$$\mathcal{D}:=\{(\rho,m)\in\mathcal{M}:\partial_{t}\rho+\operatorname{div}m=0\ \text{in }X\},$$

where the solutions of the continuity equation are intended in a distributional sense, that is,

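$$\int_{X}\partial_{t}\varphi\,d\rho+\int_{X}\nabla\varphi\cdot dm=0\quad\text{for all }\varphi\in C_{c}^{\infty}(X). \tag{8}$$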

We remark that the above weak formulation includes no-flux boundary conditions for the momentum $m$ on $\partial\Omega$. Also, no initial or final data are prescribed in (8). Moreover, by standard approximation arguments, we can consider in (8) test functions in $C^{1}_{c}(X)$ (see [2, Remark 8.1.1]).

We now introduce the Benamou–Brenier energy. For this purpose, define the convex, lower semicontinuous and one-homogeneous map $\Psi\colon\mathbb{R}\times\mathbb{R}^{d}\to[0,\infty]$ by setting

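$$\Psi(t,x):=\begin{cases}\dfrac{|x|^{2}}{2t} & \text{if }t>0,\\ 0 & \text{if }t=|x|=0,\\ +\infty & \text{otherwise.}\end{cases}$$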

The Benamou–Brenier energy $B:\mathcal{M}\to[0,\infty]$ is defined for every pair $(\rho,m)\in\mathcal{M}$ as

$$B(\rho,m):=\int_{X}\Psi\left(\frac{d\rho}{d\lambda},\frac{dm}{d\lambda}\right)d\lambda, \tag{9}$$

where $\lambda\in\mathcal{M}^{+}(X)$ is any measure such that $\rho,m\ll\lambda$. Since $\Psi$ is one-homogeneous, the above representation of $B$ does not depend on $\lambda$. For some fixed $\alpha\geq 0$, $\beta>0$, we consider the following functional

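$$J_{\alpha,\beta}(\rho,m):=\begin{cases}\beta B(\rho,m)+\alpha\lVert\rho\rVert_{\mathcal{M}(X)} & \text{if }(\rho,m)\in\mathcal{D},\\ +\infty & \text{otherwise,}\end{cases} \tag{10}$$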

where $\left\lVert\cdot\right\rVert_{\mathcal{M}(X)}$ denotes the total variation norm in $\mathcal{M}(X)$ .
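The independence of $B$ from the dominating measure $\lambda$ can be checked by hand on atomic measures. The Python sketch below implements $\Psi$ and evaluates $B$ for measures supported on two common atoms, once with unit weights for $\lambda$ and once with different positive weights. The atomic setting, the function names and the numbers are assumptions made for illustration.

```python
import math

def psi(t, x):
    """Psi(t, x) for a scalar t and a vector x given as a tuple."""
    sq = sum(c * c for c in x)
    if t > 0:
        return sq / (2.0 * t)
    if t == 0 and sq == 0:
        return 0.0
    return math.inf

def energy(atoms, lam_weights):
    """B(rho, m) when rho, m and lambda share the same finite set of atoms.
    atoms[i] = (rho_i, m_i) are the masses of rho and m at atom i,
    lam_weights[i] > 0 is the mass of lambda at atom i."""
    total = 0.0
    for (rho_i, m_i), lam_i in zip(atoms, lam_weights):
        dens_m = tuple(c / lam_i for c in m_i)
        total += psi(rho_i / lam_i, dens_m) * lam_i
    return total

atoms = [(0.5, (1.0, 0.0)), (0.25, (0.0, -2.0))]
b_unit = energy(atoms, [1.0, 1.0])
b_other = energy(atoms, [0.5, 4.0])
print(b_unit, b_other)  # the two values agree by one-homogeneity of Psi
```

Both evaluations reduce to $\sum_{i}|m_{i}|^{2}/(2\rho_{i})$, which is the atomic version of formula (3).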

Remark 1.

Note that in the definition of $J_{\alpha,\beta}$ we add the total variation of $\rho$ to the Benamou–Brenier energy. If $\alpha>0$ this choice enforces the balls of the energy $J_{\alpha,\beta}$ to be compact in the weak* topology of $\mathcal{M}$ (see Lemma 4). As a consequence, the functional $J_{\alpha,\beta}$ is a natural regularizer for dynamic inverse problems when the initial and final data are not prescribed [16]. We remark that, although in the case $\alpha=0$ the unit ball of the energy $J_{0,\beta}$ is not compact, we can still characterize its extremal points. However, in this case, due to the lack of coercivity, $J_{0,\beta}$ has limited use as a regularizer for dynamic inverse problems.

For a measure $\rho\in\mathcal{M}(X)$ , we say that $\rho$ disintegrates with respect to time if there exists a Borel family of measures $\{\rho_{t}\}_{t\in[0,1]}$ in $\mathcal{M}(\overline{\Omega})$ such that

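$$\int_{X}\varphi(t,x)\,d\rho(t,x)=\int_{0}^{1}\int_{\overline{\Omega}}\varphi(t,x)\,d\rho_{t}(x)\,dt\quad\text{for all }\varphi\in L^{1}_{\rho}(X).$$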

We denote such disintegration with the symbol $\rho=dt\otimes\rho_{t}$ . Further, we say that a curve of measures $t\in[0,1]\mapsto\rho_{t}\in\mathcal{M}(\overline{\Omega})$ is narrowly continuous if the map

$$t\mapsto\int_{\overline{\Omega}}\varphi(x)\,d\rho_{t}(x)$$

is continuous for each fixed $\varphi\in C(\overline{\Omega})$. The family of narrowly continuous curves will be denoted by $C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))$. We also introduce $C_{\rm w}([0,1];\mathcal{M}^{+}(\overline{\Omega}))$ as the family of narrowly continuous curves with values in the positive measures on $\overline{\Omega}$.

We now recall several results about $B$ , $J_{\alpha,\beta}$ and measure solutions of the continuity equation (8), which will be useful in the following analysis. For proofs of such results, we refer the interested reader to Propositions 2.2, 2.4, 2.6 and Lemmas 4.5, 4.6 in [16].

Lemma 2 (Properties of $B$ ).

The functional $B$ defined in (9) is convex, positively one-homogeneous and sequentially lower semicontinuous with respect to the weak* topology on $\mathcal{M}$. Moreover, it satisfies the following properties:

i) $B(\rho,m)\geq 0$ for all $(\rho,m)\in\mathcal{M}$,

ii) if $B(\rho,m)<+\infty$, then $\rho\geq 0$ and $m\ll\rho$, that is, there exists a measurable map $v\colon X\to\mathbb{R}^{d}$ such that $m=v\rho$,

iii) if $\rho\geq 0$ and $m=v\rho$ for some $v\colon X\to\mathbb{R}^{d}$ measurable, then

$$B(\rho,m)=\int_{X}\Psi(1,v)\,d\rho=\frac{1}{2}\int_{X}|v|^{2}\,d\rho. \tag{11}$$

Lemma 3 (Properties of the continuity equation).

Assume that $(\rho,m)\in\mathcal{M}$ satisfies (8) and that $\rho\in\mathcal{M}^{+}(X)$ . Then $\rho$ disintegrates with respect to time into $\rho=dt\otimes\rho_{t}$ , where $\rho_{t}\in\mathcal{M}^{+}(\overline{\Omega})$ for a.e. $t$ . Moreover $t\mapsto\rho_{t}(\overline{\Omega})$ is constant, with $\rho_{t}(\overline{\Omega})=\rho(X)$ for a.e. $t\in(0,1)$ . If in addition $B(\rho,m)<+\infty$ , that is,

$$\int_{0}^{1}\int_{\overline{\Omega}}|v|^{2}\,d\rho_{t}(x)\,dt<+\infty,$$

where $m=v\rho$ for some $v\colon X\to\mathbb{R}^{d}$ measurable, then $t\mapsto\rho_{t}$ belongs to $C_{\rm w}([0,1];\mathcal{M}^{+}(\overline{\Omega}))$ .

Lemma 4 (Properties of $J_{\alpha,\beta}$ ).

Let $\alpha\geq 0$, $\beta>0$. The functional $J_{\alpha,\beta}$ is non-negative, convex, positively one-homogeneous and sequentially lower semicontinuous with respect to weak* convergence on $\mathcal{M}$. Assume now $\alpha>0$. For $(\rho,m)\in\mathcal{M}$ such that $J_{\alpha,\beta}(\rho,m)<+\infty$ we have

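$$\alpha\lVert\rho\rVert_{\mathcal{M}(X)}\leq J_{\alpha,\beta}(\rho,m),\qquad\min(2\alpha,\beta)\lVert m\rVert_{\mathcal{M}(X;\mathbb{R}^{d})}\leq J_{\alpha,\beta}(\rho,m).$$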

Moreover, if $\{(\rho^{n},m^{n})\}_{n}$ is a sequence in $\mathcal{M}$ such that

$$\sup_{n}J_{\alpha,\beta}(\rho^{n},m^{n})<+\infty,$$

then $\rho^{n}=dt\otimes\rho_{t}^{n}$ for some $(t\mapsto\rho_{t}^{n})\in C_{\rm w}([0,1];\mathcal{M}^{+}(\overline{\Omega}))$ and there exists some $(\rho,m)\in\mathcal{D}$ with $\rho=dt\otimes\rho_{t}$ , $\rho_{t}\in C_{\rm w}([0,1];\mathcal{M}^{+}(\overline{\Omega}))$ such that, up to subsequences,

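$$\begin{cases}(\rho^{n},m^{n})\overset{*}{\rightharpoonup}(\rho,m) & \text{weakly* in }\mathcal{M},\\ \rho_{t}^{n}\overset{*}{\rightharpoonup}\rho_{t} & \text{weakly* in }\mathcal{M}(\overline{\Omega})\text{ for every }t\in[0,1].\end{cases}$$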
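The momentum bound $\min(2\alpha,\beta)\lVert m\rVert_{\mathcal{M}(X;\mathbb{R}^{d})}\leq J_{\alpha,\beta}(\rho,m)$ in Lemma 4 can be made plausible with an elementary computation. The following is a sketch under the assumptions $\rho\geq 0$ and $m=v\rho$ (which hold when the energy is finite, by Lemma 2), and it is not necessarily the argument used in [16]. It combines formula (11) with the inequality $|v|\leq\frac{1}{2}(|v|^{2}+1)$.

```latex
\begin{aligned}
\min(2\alpha,\beta)\,\lVert m\rVert_{\mathcal{M}(X;\mathbb{R}^{d})}
  &= \min(2\alpha,\beta)\int_{X}|v|\,d\rho \\
  &\leq \frac{\min(2\alpha,\beta)}{2}\int_{X}\bigl(|v|^{2}+1\bigr)\,d\rho \\
  &\leq \frac{\beta}{2}\int_{X}|v|^{2}\,d\rho+\alpha\,\rho(X) \\
  &= \beta B(\rho,m)+\alpha\lVert\rho\rVert_{\mathcal{M}(X)}
   = J_{\alpha,\beta}(\rho,m).
\end{aligned}
```

The third line uses $\min(2\alpha,\beta)/2\leq\beta/2$ on the first term and $\min(2\alpha,\beta)/2\leq\alpha$ on the second. For $\alpha=0$ the left-hand side vanishes, which matches the loss of coercivity discussed in Remark 1.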

3. Characterization of extremal points

The aim of this section is to characterize the extremal points of the unit ball of the functional $J_{\alpha,\beta}$ at (10) for all $\alpha\geq 0,\beta>0$ , namely, of the convex set

$$C_{\alpha,\beta}:=\{(\rho,m)\in\mathcal{M}:J_{\alpha,\beta}(\rho,m)\leq 1\}.$$

To this end, let us first introduce the following set.

Definition 5 (Characteristics).

For $\alpha\geq 0$ , $\beta>0$ define the set $\mathcal{C}_{\alpha,\beta}$ of all the pairs $(\rho,m)\in\mathcal{M}$ such that

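$$\rho=a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad m=\dot{\gamma}(t)\,a_{\gamma}\,dt\otimes\delta_{\gamma(t)},\quad a_{\gamma}:=\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)^{-1},$$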

where $\gamma\in{\rm AC}^{2}([0,1];\mathbb{R}^{d})$ satisfies $\gamma(t)\in\overline{\Omega}$ for each $t\in[0,1]$ and $a_{\gamma}<+\infty$ .

We recall that ${\rm AC}^{2}([0,1];\mathbb{R}^{d})$ denotes the space of absolutely continuous curves having a weak derivative in $L^{2}$. We point out that by definition $a_{\gamma}>0$ for all choices of $\alpha\geq 0$, $\beta>0$. Moreover, the condition $a_{\gamma}<+\infty$ is always satisfied if $\alpha>0$. When $\alpha=0$ we instead have $a_{\gamma}<+\infty$ if and only if $\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt>0$, that is, the set $\mathcal{C}_{0,\beta}$ does not contain constant curves.
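The role of the constant $a_{\gamma}$ can be illustrated numerically. The Python sketch below approximates $\int_{0}^{1}|\dot{\gamma}|^{2}\,dt$ by finite differences, computes $a_{\gamma}$, and evaluates $J_{\alpha,\beta}$ on $\rho=a\,dt\otimes\delta_{\gamma(t)}$, $m=\dot{\gamma}\rho$ through the closed form $a\,(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}|^{2}\,dt+\alpha)$, which follows from (11). The curves, the parameter values and all function names are assumptions made for the example, and $\overline{\Omega}$ is assumed to contain the sample curves.

```python
import math

def kinetic_integral(curve, n=1000):
    """Approximate int_0^1 |gamma'(t)|^2 dt with finite differences on n cells."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        p, q = curve(k * h), curve((k + 1) * h)
        total += sum((b - a) ** 2 for a, b in zip(p, q)) / h
    return total

def a_gamma(curve, alpha, beta, n=1000):
    """Constant a_gamma; math.inf signals that the denominator vanishes."""
    denom = 0.5 * beta * kinetic_integral(curve, n) + alpha
    return math.inf if denom == 0.0 else 1.0 / denom

def energy_J(curve, mass, alpha, beta, n=1000):
    """J for rho = mass * dt (x) delta_gamma and m = gamma' * rho, mass finite."""
    return mass * (0.5 * beta * kinetic_integral(curve, n) + alpha)

def line(t):
    return (t, 2.0 * t)      # straight line with |gamma'|^2 = 5

def constant(t):
    return (0.3, 0.3)        # constant curve with |gamma'|^2 = 0

a_line = a_gamma(line, alpha=1.0, beta=2.0)
print(energy_J(line, a_line, alpha=1.0, beta=2.0))   # close to 1 by construction
print(a_gamma(constant, alpha=0.0, beta=2.0))        # inf: excluded when alpha = 0
print(a_gamma(constant, alpha=1.0, beta=2.0))        # 1.0: allowed when alpha > 0
```

The normalization by $a_{\gamma}$ places each such pair on the boundary $\{J_{\alpha,\beta}=1\}$ of the unit ball, which is where nonzero extremal points must lie.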

For the extremal points of $C_{\alpha,\beta}$ we have the following characterization.

Theorem 6.

Let $\alpha\geq 0$ , $\beta>0$ be fixed. Then

$$\operatorname{Ext}(C_{\alpha,\beta})=\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}.$$

The proof of Theorem 6 is postponed to Section 3.2. In order to show the inclusion $\operatorname{Ext}(C_{\alpha,\beta})\subset\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}$ we will make use of a superposition principle for measure solutions of the continuity equation (8). This result is not new, and it is proved in [2, Ch 8.2] for the case $\Omega=\mathbb{R}^{d}$ . In Section 3.1 we show that it also holds for bounded closed domains.

3.1. The superposition principle

Before stating the superposition principle in $\overline{\Omega}$ , we introduce the following notation. Let

$$\Gamma:=\{\gamma\colon[0,1]\to\mathbb{R}^{d}:\gamma\text{ continuous}\}$$

be equipped with the supremum norm, i.e., $\left\lVert\gamma\right\rVert_{\infty}:=\max_{t\in[0,1]}|\gamma(t)|$ . For every fixed $t\in[0,1]$ let $e_{t}\colon\Gamma\to\mathbb{R}^{d}$ be the evaluation at $t$ , that is, $e_{t}(\gamma):=\gamma(t)$ . Notice that $e_{t}$ is continuous. For a measurable vector field $v\colon(0,1)\times\mathbb{R}^{d}\to\mathbb{R}^{d}$ , we define the following subset of $\Gamma$ consisting of ${\rm AC}^{2}$ curves solving the ODE (6) in the sense of Carathéodory:

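$$\Gamma_{v}(\mathbb{R}^{d}):=\{\gamma\in\Gamma:\gamma\in{\rm AC}^{2}([0,1];\mathbb{R}^{d}),\ \dot{\gamma}(t)=v(t,\gamma(t))\text{ for a.e. }t\in(0,1)\}.$$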

Moreover define the set of solutions to the ODE which live inside $\overline{\Omega}$ for all times:

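$$\Gamma_{v}(\overline{\Omega}):=\{\gamma\in\Gamma_{v}(\mathbb{R}^{d}):\gamma(t)\in\overline{\Omega}\text{ for all }t\in[0,1]\}.$$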

The superposition principle for probability solutions to (8) reads as follows.

Theorem 7.

Let $t\in[0,1]\mapsto\rho_{t}\in\mathcal{P}(\overline{\Omega})$ be a narrowly continuous solution of the continuity equation in the sense of (8), for some measurable $v\colon(0,1)\times\overline{\Omega}\to\mathbb{R}^{d}$ such that

$$\int_{0}^{1}\int_{\overline{\Omega}}|v(t,x)|^{2}\,d\rho_{t}(x)\,dt<+\infty. \tag{14}$$

Then there exists a probability measure $\sigma\in\mathcal{P}(\Gamma)$ concentrated on $\Gamma_{v}(\overline{\Omega})$ and such that $\rho_{t}=(e_{t})_{\#}\sigma$ for every $t\in[0,1]$ , that is,

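$$\int_{\overline{\Omega}}\varphi(x)\,d\rho_{t}(x)=\int_{\Gamma}\varphi(\gamma(t))\,d\sigma(\gamma)\quad\text{for every }\varphi\in C(\overline{\Omega}),\ t\in[0,1]. \tag{15}$$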
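The identity $\rho_{t}=(e_{t})_{\#}\sigma$ is easiest to read when $\sigma$ is a finite combination of Dirac masses on curves: integrating $\varphi$ against $\rho_{t}$ then amounts to a weighted sum of the values $\varphi(\gamma_{i}(t))$. The Python sketch below shows this in dimension one. The two curves, the weights and the function name are hypothetical choices, and the velocity field $v$ is not modeled.

```python
def integrate_pushforward(phi, curves, weights, t):
    """Integral of phi against (e_t)_# sigma, where
    sigma = sum_i weights[i] * delta_{curves[i]} is a discrete measure on curves."""
    return sum(w * phi(g(t)) for g, w in zip(curves, weights))

# Two hypothetical curves in [0, 1] and probability weights.
curves = [lambda t: 0.2 + 0.5 * t, lambda t: 0.9 - 0.4 * t]
weights = [0.25, 0.75]

# Total mass of rho_t and its second moment at time t = 0.5.
mass_at_half = integrate_pushforward(lambda x: 1.0, curves, weights, 0.5)
second_moment = integrate_pushforward(lambda x: x * x, curves, weights, 0.5)
print(mass_at_half, second_moment)
```

Here $\rho_{t}=0.25\,\delta_{\gamma_{1}(t)}+0.75\,\delta_{\gamma_{2}(t)}$ has unit mass for every $t$, in line with $\rho_{t}\in\mathcal{P}(\overline{\Omega})$.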

Proof.

Let $\bar{v}\colon(0,1)\times\mathbb{R}^{d}\rightarrow\mathbb{R}^{d}$ be the extension by zero of $v$ to the whole $\mathbb{R}^{d}$. Similarly, for each $t\in[0,1]$, let $\bar{\rho}_{t}\in\mathcal{P}(\mathbb{R}^{d})$ be the extension by zero of $\rho_{t}$ to $\mathbb{R}^{d}$. Note that the pair $(\bar{\rho},\bar{v}\,\bar{\rho})$ is a solution of the continuity equation in $(0,1)\times\mathbb{R}^{d}$ in the sense of (8). Moreover, $\bar{\rho}$ and $\bar{v}$ satisfy (14) in $(0,1)\times\mathbb{R}^{d}$. Therefore we can apply Theorem 8.2.1 in [2] and obtain a probability measure $\sigma\in\mathcal{P}(\Gamma)$ concentrated on $\Gamma_{\bar{v}}(\mathbb{R}^{d})$ and such that $\bar{\rho}_{t}=(e_{t})_{\#}\sigma$ for all $t\in[0,1]$, that is,

[TABLE]

We claim that $\sigma$ is concentrated on $\Gamma_{v}(\overline{\Omega})$ . In order to show that, partition $\Gamma_{\bar{v}}(\mathbb{R}^{d})$ into

[TABLE]

where

[TABLE]

Notice that, since $\overline{\Omega}^{c}$ is open and $\bar{v}\equiv 0$ in $\overline{\Omega}^{c}$, the curves in $A$ are constant, so that we can write

[TABLE]

From this, it follows that $A\subset e_{0}^{-1}(\overline{\Omega}^{c})$ . Moreover, (16) implies $\bar{\rho}_{0}(\overline{\Omega}^{c})=\sigma(e_{0}^{-1}(\overline{\Omega}^{c}))$ . Therefore, using that $\bar{\rho}_{t}$ is concentrated on $\overline{\Omega}$ , we conclude that $\sigma(A)=0$ , showing that $\sigma$ is concentrated on $\Gamma_{\bar{v}}(\overline{\Omega})$ . Finally, (16) implies (15) since $\bar{\rho}_{t}$ is supported in $\overline{\Omega}$ and it coincides with $\rho_{t}$ in $\overline{\Omega}$ . Also $\Gamma_{\bar{v}}(\overline{\Omega})=\Gamma_{v}(\overline{\Omega})$ by definition of $\bar{v}$ , thus concluding the proof. ∎

3.2. Proof of Theorem 6

Let $\alpha\geq 0$ , $\beta>0$ . We divide the proof into two parts.

Part 1: $\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}\subset\operatorname{Ext}(C_{\alpha,\beta})$ .

We start by showing that $\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}\subset C_{\alpha,\beta}$ . The fact that $(0,0)\in C_{\alpha,\beta}$ follows immediately, since $(0,0)$ solves the continuity equation and $J_{\alpha,\beta}(0,0)=0$ (by Lemma 2). Consider now $(\rho,m)\in\mathcal{C}_{\alpha,\beta}$ . Notice that $(\rho,m)\in\mathcal{C}_{\alpha,\beta}$ satisfies the continuity equation in the sense of (8): indeed for every $\varphi\in C^{1}_{c}((0,1)\times\overline{\Omega})$ we have

[TABLE]

since $\varphi$ is compactly supported in $(0,1)\times\overline{\Omega}$ . Moreover, thanks to the fact that $\rho\geq 0$ and $m=\dot{\gamma}\rho$ , we can invoke (11) to obtain

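$$J_{\alpha,\beta}(\rho,m)=a_{\gamma}\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)=1, \tag{18}$$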

proving that $(\rho,m)\in C_{\alpha,\beta}$ .
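The fact that a pair concentrated on a curve solves the continuity equation is an instance of the chain rule: for $\rho=a\,dt\otimes\delta_{\gamma(t)}$ and $m=\dot{\gamma}\rho$, the left-hand side of (8) equals $a\int_{0}^{1}\frac{d}{dt}\varphi(t,\gamma(t))\,dt$, which vanishes when $\varphi$ has compact support in time. The Python sketch below checks this numerically in dimension one for the test function $\varphi(t,x)=\psi(t)\,x$, with $\psi$ a smooth bump supported in $[0.1,0.9]$. The curve, the bump and the quadrature rule are illustrative assumptions.

```python
import math

def bump(t, a=0.1, b=0.9):
    """Smooth function of t supported in [a, b], a subset of (0, 1)."""
    if t <= a or t >= b:
        return 0.0
    return math.exp(-1.0 / ((t - a) * (b - t)))

def dbump(t, a=0.1, b=0.9):
    """Derivative of bump with respect to t."""
    if t <= a or t >= b:
        return 0.0
    s = (t - a) * (b - t)
    return bump(t, a, b) * (a + b - 2.0 * t) / (s * s)

def weak_residual(gamma, velocity, mass=1.0, n=2000):
    """Midpoint-rule value of the left-hand side of the weak continuity equation
    for rho = mass * dt (x) delta_gamma, m = velocity * rho and
    phi(t, x) = bump(t) * x, so that d_t phi = dbump(t) * x and d_x phi = bump(t)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        total += (dbump(t) * gamma(t) + bump(t) * velocity(t)) * h
    return mass * total

def gamma(t):
    return 0.5 + 0.3 * math.sin(2.0 * math.pi * t)

def dgamma(t):
    return 0.6 * math.pi * math.cos(2.0 * math.pi * t)

res_true = weak_residual(gamma, dgamma)             # momentum m = gamma' * rho
res_zero = weak_residual(gamma, lambda t: 0.0)      # wrong momentum m = 0
print(res_true, res_zero)
```

With the correct momentum the residual is zero up to quadrature error, while replacing the velocity by zero leaves a residual that does not vanish for this test function.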

We now want to show that any $(\rho,m)\in\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}$ is an extremal point for $C_{\alpha,\beta}$ . Hence assume that $(\rho^{1},m^{1}),(\rho^{2},m^{2})\in C_{\alpha,\beta}$ are such that

$$(\rho,m)=\lambda(\rho^{1},m^{1})+(1-\lambda)(\rho^{2},m^{2}) \tag{19}$$

for some $\lambda\in(0,1)$. We need to show that $(\rho,m)=(\rho^{1},m^{1})=(\rho^{2},m^{2})$. Fix $j\in\{1,2\}$. Since $(\rho^{j},m^{j})$ is such that $J_{\alpha,\beta}(\rho^{j},m^{j})\leq 1$, from $ii)$ in Lemma 2 we have that $\rho^{j}\geq 0$ and $m^{j}=v^{j}\,\rho^{j}$ for some Borel field $v^{j}\colon X\to\mathbb{R}^{d}$. In particular, if $(\rho,m)=(0,0)$, then (19) together with $\rho^{j}\geq 0$ forces $\rho^{j}=0$, and hence $m^{j}=v^{j}\rho^{j}=0$. This shows that $(0,0)$ is an extremal point of $C_{\alpha,\beta}$.

Let us now consider the case $(\rho,m)\in\mathcal{C}_{\alpha,\beta}$ . By (18) we have $J_{\alpha,\beta}(\rho,m)=1$ . From (19), convexity of $J_{\alpha,\beta}$ , and the fact that $J_{\alpha,\beta}(\rho^{j},m^{j})\leq 1$ , $\lambda\in(0,1)$ , we conclude

$$J_{\alpha,\beta}(\rho^{1},m^{1})=J_{\alpha,\beta}(\rho^{2},m^{2})=1. \tag{20}$$

Since $(\rho^{j},m^{j})$ solves the continuity equation, $\rho^{j}\geq 0$ and $J_{\alpha,\beta}(\rho^{j},m^{j})=1$ , from Lemma 3 we deduce that $\rho^{j}=dt\otimes\rho_{t}^{j}$ for some narrowly continuous curve $t\mapsto\rho^{j}_{t}\in\mathcal{M}^{+}(\overline{\Omega})$ , with $\rho^{j}_{t}(\overline{\Omega})$ constant in time. We define $a_{j}:=\rho^{j}_{0}(\overline{\Omega})$ and notice that $a_{j}>0$ : Indeed, $a_{j}=0$ would imply $\rho^{j}=0$ , yielding $J_{\alpha,\beta}(\rho^{j},m^{j})=J_{\alpha,\beta}(0,0)=0$ . This would contradict (20). Now, from condition (19) and uniqueness of the disintegration we deduce

[TABLE]

Since $a_{j}>0$ (and hence $\rho^{j}_{t}\neq 0$ ), the above equality implies that $\operatorname{supp}\rho_{t}^{j}=\{\gamma(t)\}$ , i.e.,

$$\rho^{j}_{t}=a_{j}\,\delta_{\gamma(t)}. \tag{22}$$

We now show that $v^{j}=\dot{\gamma}$ on $\operatorname{supp}\rho=\operatorname{graph}(\gamma):=\{(t,\gamma(t))\colon t\in(0,1)\}$ , that is

$$v^{j}(t,\gamma(t))=\dot{\gamma}(t)\quad\text{for a.e. }t\in(0,1).$$

By assumption, $\partial_{t}\rho^{j}+\operatorname*{div}m^{j}=0$ in the sense of (8). Therefore, recalling (22) and the fact that $a_{j}>0$ , we get that for each $\varphi\in C^{1}_{c}((0,1)\times\overline{\Omega})$ ,

[TABLE]

where the last equality follows from (17), since $a_{\gamma}>0$ . Let $\psi\in C^{1}_{c}((0,1))$ and define $\varphi(t,x):=x_{i}\psi(t)$ , where $x=(x_{1},\dots,x_{d})$ , so that $\varphi$ is a test function for (24). By plugging $\varphi$ into (24) we obtain

[TABLE]

where $v_{i}^{j}$ and $\dot{\gamma}_{i}$ are the $i$ -th component of $v^{j}$ and $\dot{\gamma}$ , respectively. This implies that $v^{j}(t,\gamma(t))=\dot{\gamma}(t)$ for a.e. $t\in(0,1)$ , that is, $v^{j}=\dot{\gamma}$ a.e. on $\operatorname{graph}(\gamma)$ . With this at hand, by means of (11) we can see that $J_{\alpha,\beta}(\rho^{j},m^{j})=a_{j}/a_{\gamma}$ . Since (20) holds, we obtain $a_{j}=a_{\gamma}$ , thus proving $(\rho,m)=(\rho^{j},m^{j})$ and hence extremality for $(\rho,m)$ in $C_{\alpha,\beta}$ .
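The identity $J_{\alpha,\beta}(\rho^{j},m^{j})=a_{j}/a_{\gamma}$ used in the last step can be spelled out as follows. This is a sketch of the computation, assuming as established above that $\rho^{j}=a_{j}\,dt\otimes\delta_{\gamma(t)}$ and $m^{j}=\dot{\gamma}\,\rho^{j}$, and using (11) together with the definition of $a_{\gamma}$.

```latex
\begin{aligned}
J_{\alpha,\beta}(\rho^{j},m^{j})
  &= \frac{\beta}{2}\int_{X}|v^{j}|^{2}\,d\rho^{j}
     +\alpha\,\lVert\rho^{j}\rVert_{\mathcal{M}(X)} \\
  &= \frac{\beta}{2}\,a_{j}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\,a_{j} \\
  &= a_{j}\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)
   = \frac{a_{j}}{a_{\gamma}}.
\end{aligned}
```

Since $J_{\alpha,\beta}(\rho^{j},m^{j})=1$, this gives $a_{j}=a_{\gamma}$, so $\rho^{j}=\rho$ and $m^{j}=\dot{\gamma}\rho^{j}=m$.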

Part 2: $\operatorname{Ext}(C_{\alpha,\beta})\subset\{(0,0)\}\cup\mathcal{C}_{\alpha,\beta}$ .

Let $(\rho,m)\in C_{\alpha,\beta}$ be an extremal point. In particular, $J_{\alpha,\beta}(\rho,m)\leq 1$ so that by Lemma 2 $ii)$ , we obtain $\rho\geq 0$ and $m=v\rho$ for some Borel field $v\colon X\to\mathbb{R}^{d}$ . Notice that by extremality of $(\rho,m)$ and one-homogeneity of $J_{\alpha,\beta}$ we immediately infer that either $J_{\alpha,\beta}(\rho,m)=0$ or $J_{\alpha,\beta}(\rho,m)=1$ . If $J_{\alpha,\beta}(\rho,m)=0$ , by decomposing $(\rho,m)$ as

$$(\rho,m)=\frac{1}{2}\,(2\rho,2m)+\frac{1}{2}\,(0,0)$$

and using the extremality of $(\rho,m)$ together with the one-homogeneity of $J_{\alpha,\beta}$ we deduce that $(\rho,m)=(0,0)$ . Thus, we consider the case

$$J_{\alpha,\beta}(\rho,m)=1. \tag{25}$$

Since by definition, $(\rho,m)$ solves the continuity equation in the sense of (8) and $J_{\alpha,\beta}(\rho,m)=1$ , we can apply Lemma 3 to obtain that $\rho=a\,dt\otimes\rho_{t}$ for some narrowly continuous curve $t\mapsto\rho_{t}\in\mathcal{P}(\overline{\Omega})$ , where $a:=\rho(X)>0$ .

Claim: $\operatorname{supp}\rho_{t}$ is a singleton for each $t\in[0,1]$ .

Proof of Claim: The hypotheses of Theorem 7 are satisfied, therefore there exists a measure $\sigma\in\mathcal{P}(\Gamma)$ concentrated on $\Gamma_{v}(\overline{\Omega})$ and such that $\rho_{t}=(e_{t})_{\#}\sigma$ for every $t\in[0,1]$ . Assume by contradiction that there exists a time $\hat{t}\in[0,1]$ such that $\operatorname{supp}\rho_{\hat{t}}$ is not a singleton. Therefore, we can find a Borel set $E\subset\overline{\Omega}$ such that

$$0<\rho_{\hat{t}}(E)<1. \tag{26}$$

Define the Borel set

$$A:=\{\gamma\in\Gamma\colon\gamma(\hat{t})\in E\}.$$

By the relation $\rho_{t}=(e_{t})_{\#}\sigma$ and definition of $A$ we obtain $\rho_{\hat{t}}(E)=\sigma(A)$ . Therefore, from (26)

$$0<\sigma(A)<1. \tag{27}$$

Define

[TABLE]

where $A^{c}:=\Gamma\smallsetminus A$ . Note that $\lambda_{1},\lambda_{2}$ are well defined (possibly being equal to $+\infty$ ) as the map

[TABLE]

is lower semicontinuous on $\Gamma$ , and hence measurable. Notice that

[TABLE]

because $\sigma$ is concentrated on $\Gamma_{v}(\overline{\Omega})$ . Since $v(t,\cdot)$ belongs to $L^{2}_{\rho_{t}}(\overline{\Omega};\mathbb{R}^{d})$ for a.e. $t\in(0,1)$ , by [9, Theorem 3.6.1] we obtain that the representation formula (15) holds for $\varphi(x):=v(t,x)$ and a.e. $t\in(0,1)$ , that is,

[TABLE]

Therefore, from (11), (25), (29) and (30) we deduce $\lambda_{1}+\lambda_{2}=J_{\alpha,\beta}(\rho,m)=1$ .

We now proceed with the proof of the claim separately for the cases $\alpha>0$ and $\alpha=0$ . Suppose first $\alpha>0$ . Notice that $\lambda_{1},\lambda_{2}>0$ thanks to (27) and the fact that $a>0$ . Decompose $(\rho,m)$ as

[TABLE]

where we defined

[TABLE]

for $j=1,2$, with $\sigma_{1}:=\sigma\,\llcorner\,A$ and $\sigma_{2}:=\sigma\,\llcorner\,A^{c}$. Notice that $\rho^{j}\in\mathcal{M}^{+}(X)$, since $\sigma$ is a positive measure concentrated on $\Gamma_{v}(\overline{\Omega})$, and $a,\lambda_{j}>0$. We now claim that $(\rho^{j},m^{j})\in C_{\alpha,\beta}$. First, we prove that $\partial_{t}\rho^{j}+\operatorname*{div}m^{j}=0$ in the sense of (8). Let $j=1$ and fix $\varphi\in C^{1}_{c}((0,1)\times\overline{\Omega})$. Since $v(t,\cdot)$ belongs to $L^{2}_{\rho_{t}}(\overline{\Omega};\mathbb{R}^{d})$ for a.e. $t\in(0,1)$, by [9, Theorem 3.6.1], (15) and the definition of $\sigma_{1}$, we get

[TABLE]

Now recall that $\sigma$ is concentrated on $\Gamma_{v}(\overline{\Omega})$ and that $\varphi$ is compactly supported in time, so that

[TABLE]

The calculation for $j=2$ is similar. Also, by definition of $(\rho^{j},m^{j})$ and of $\lambda_{j}$ , one can perform similar calculations to the ones in (29), (30), and prove that $J_{\alpha,\beta}(\rho^{j},m^{j})=1$ . Hence $(\rho^{j},m^{j})\in C_{\alpha,\beta}$ . We now claim that $(\rho^{1},m^{1})\neq(\rho^{2},m^{2})$ . Suppose by contradiction that $(\rho^{1},m^{1})=(\rho^{2},m^{2})$ . Then in particular $\rho^{1}=\rho^{2}$ , so that by (32) we get

[TABLE]

As $(\rho^{j},m^{j})$ are solutions of the continuity equation and $J_{\alpha,\beta}(\rho^{j},m^{j})=1$ , from Lemma 3 it follows that the maps $t\mapsto(e_{t})_{\#}\sigma_{j}$ are narrowly continuous. In particular, (35) holds for each $t\in[0,1]$ . However, by (27) and by definition of $A$ , $\sigma_{1}$ , $\sigma_{2}$ , we have

[TABLE]

which contradicts (35). Therefore $(\rho^{1},m^{1})\neq(\rho^{2},m^{2})$ , which shows that the decomposition (31) is non-trivial. This is a contradiction, since we are assuming that $(\rho,m)$ is an extremal point for $C_{\alpha,\beta}$ . Thus the claim follows.
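The splitting argument can be illustrated numerically in the simplest case where $\sigma$ charges only two curves. The sketch below is not part of the proof: it assumes that on a measure $a\,dt\otimes\delta_{\gamma(t)}$ with $m=\dot{\gamma}\rho$ the energy equals $a\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}|^{2}\,dt+\alpha\right)$, that $\lambda_{1},\lambda_{2}$ are the energies carried by each curve, and that curves are sampled uniformly in time. The names `kinetic_energy`, `inv_a` and `split_two_curves` are hypothetical.

```python
def kinetic_energy(curve):
    """Approximate int_0^1 |gamma'(t)|^2 dt by forward differences.

    `curve` is a list of points (lists of coordinates) sampled uniformly on [0, 1].
    """
    n = len(curve) - 1
    h = 1.0 / n
    total = 0.0
    for p, q in zip(curve[:-1], curve[1:]):
        speed_sq = sum(((qi - pi) / h) ** 2 for pi, qi in zip(p, q))
        total += speed_sq * h
    return total


def inv_a(curve, alpha, beta):
    """Return a_gamma^{-1} = (beta / 2) * int |gamma'|^2 dt + alpha."""
    return 0.5 * beta * kinetic_energy(curve) + alpha


def split_two_curves(curve1, curve2, s, alpha, beta):
    """Assumed setting: sigma = s * delta_{curve1} + (1 - s) * delta_{curve2}.

    Returns the total mass a normalising the energy to 1, together with the
    convex weights lam1, lam2 of the two single-curve pieces.
    """
    cost1 = inv_a(curve1, alpha, beta)
    cost2 = inv_a(curve2, alpha, beta)
    a = 1.0 / (s * cost1 + (1.0 - s) * cost2)  # enforces total energy 1
    lam1 = a * s * cost1
    lam2 = a * (1.0 - s) * cost2
    return a, lam1, lam2


# A unit-speed segment and a constant curve in the plane.
moving = [[k / 10, 0.0] for k in range(11)]
still = [[0.0, 1.0] for _ in range(11)]
a, lam1, lam2 = split_two_curves(moving, still, s=0.5, alpha=1.0, beta=2.0)
```

In this toy case the weights are strictly positive and sum to one, and $\lambda_{1}a_{\gamma_{1}}=a\,s$ recovers the mass carried by the first curve, so the measure is a non-trivial convex combination of two single-curve measures.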

Suppose now that $\alpha=0$ and define the set

[TABLE]

Notice that $Z$ is measurable, due to the measurability of the map $L$ at (28). We claim that $\sigma(Z)=0$. In order to prove that, let $Z^{c}:=\Gamma\smallsetminus Z$ and define the measures $\sigma_{Z}:=\sigma\,\llcorner\,Z$, $\sigma_{Z^{c}}:=\sigma\,\llcorner\,Z^{c}$, so that $\sigma=\sigma_{Z}+\sigma_{Z^{c}}$. Recalling that $\rho_{t}=(e_{t})_{\#}\sigma$ for all $t\in[0,1]$, we can decompose

[TABLE]

where

[TABLE]

Let $j=1,2$. Notice that $\rho^{j}\in\mathcal{M}^{+}(X)$ since $\sigma$ is a positive measure concentrated on $\Gamma_{v}(\overline{\Omega})$ and $a>0$. Following computations similar to (33)-(34), we infer that $(\rho^{j},m^{j})$ solves the continuity equation in the sense of (8). Moreover, by definition of $Z$ and the fact that $\sigma$ is concentrated on $\Gamma_{v}(\overline{\Omega})$ we obtain

[TABLE]

where we employed Fubini’s Theorem, which holds thanks to the measurability of the map $L$ and the identity (30), the latter implying boundedness of the last term in (37). By (37) and arguing as in (29)-(30), it is immediate to check that $J_{0,\beta}(\rho^{j},m^{j})=J_{0,\beta}(\rho,m)$ . Recalling (25) we then obtain $(\rho^{j},m^{j})\in C_{0,\beta}$ . As $(\rho,m)$ is an extremal point of $C_{0,\beta}$ , from (36) we deduce that $(\rho^{1},m^{1})=(\rho^{2},m^{2})$ and thus $dt\otimes(e_{t})_{\#}\sigma_{Z}=0$ . In particular, there exists $\hat{t}\in[0,1]$ such that $(e_{\hat{t}})_{\#}\sigma_{Z}=0$ . Hence for every $E\subset\Gamma$ measurable, by the positivity of $\sigma$ , we have $\sigma_{Z}(E)\leq(e_{\hat{t}})_{\#}\sigma_{Z}(e_{\hat{t}}(E))=0$ , implying that $\sigma_{Z}=0$ . By (27) and the definition of $\lambda_{1}$ , $\lambda_{2}$ , we conclude that $\lambda_{1},\lambda_{2}>0$ . With this property established, the claim that $\operatorname{supp}\rho_{t}$ is a singleton for each $t\in[0,1]$ follows by repeating the same arguments of the case $\alpha>0$ , employing the decomposition of $(\rho,m)$ as in (31).

We have shown that for each $t\in[0,1]$ , $\operatorname{supp}\rho_{t}$ is a singleton. We now conclude the proof of Theorem 6. Since $\rho_{t}\in\mathcal{P}(\overline{\Omega})$ , the latter implies the existence of a curve $\gamma\colon[0,1]\to\overline{\Omega}$ such that $\rho_{t}=\delta_{\gamma(t)}$ for each $t\in[0,1]$ . We will now prove that $\gamma\in{\rm AC}^{2}([0,1];\mathbb{R}^{d})$ . By narrow continuity of $t\mapsto\rho_{t}$ , we have that the map $t\mapsto\,\varphi(\gamma(t))$ is continuous for all $\varphi\in C(\overline{\Omega})$ . By testing against the coordinate functions $\varphi(x):=x_{i}$ , we obtain continuity for $\gamma$ . Consider now $\varphi(t,x):=\xi(t)\eta(x)$ with $\xi\in C^{\infty}_{c}((0,1))$ , $\eta\in C^{1}(\overline{\Omega})$ . Notice that the scalar map $t\mapsto\eta(\gamma(t))$ is continuous. Moreover, by testing the continuity equation $\partial_{t}\rho+\operatorname*{div}(v\rho)=0$ against $\varphi$ we get

[TABLE]

which implies that the distributional derivative of the map $t\mapsto\eta(\gamma(t))$ is given by

[TABLE]

We now remark that the above map belongs to $L^{2}((0,1))$ , since

[TABLE]

Therefore, $t\mapsto\eta(\gamma(t))$ belongs to ${\rm AC}^{2}([0,1])$ for every fixed $\eta\in C^{1}(\overline{\Omega})$ . By choosing $\eta(x):=x_{i}$ , $i=1,\dots,d$ , we conclude that $\gamma\in{\rm AC}^{2}([0,1];\mathbb{R}^{d})$ . Since $\partial_{t}\rho+\operatorname*{div}(v\rho)=0$ , we can repeat the same argument employed to prove (23), and infer

$$v(t,\gamma(t))=\dot{\gamma}(t)\quad\text{for a.e. } t\in(0,1). \tag{38}$$

From (38) and the fact that $\rho=a\,dt\otimes\delta_{\gamma(t)}$ , $m=v\rho$ , we then conclude $m=\dot{\gamma}\rho$ . As $a>0$ , we can apply (11) to compute

[TABLE]

Recalling that $J_{\alpha,\beta}(\rho,m)=1$ (see (25)), from (39) we conclude that $a=a_{\gamma}$ with $a_{\gamma}<+\infty$ . Therefore $(\rho,m)$ belongs to $\mathcal{C}_{\alpha,\beta}$ according to Definition 5, and the proof of Theorem 6 is concluded.
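The last step can be spelled out as a short worked computation. It is a sketch under one assumption that is not restated in this section: that on solutions of the continuity equation the energy has the form $J_{\alpha,\beta}(\rho,m)=\beta B(\rho,m)+\alpha\|\rho\|_{\mathcal{M}(X)}$, with $B$ the Benamou–Brenier energy. With $\rho=a\,dt\otimes\delta_{\gamma(t)}$ and $m=\dot{\gamma}\rho$:

```latex
\begin{align*}
B(\rho,m) &= \frac{1}{2}\int_{X}|\dot{\gamma}(t)|^{2}\,d\rho(t,x)
           = \frac{a}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt,
\qquad
\|\rho\|_{\mathcal{M}(X)} = a, \\
J_{\alpha,\beta}(\rho,m) &= a\left(\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}(t)|^{2}\,dt+\alpha\right)
           = \frac{a}{a_{\gamma}} .
\end{align*}
```

Under this assumption, the normalization $J_{\alpha,\beta}(\rho,m)=1$ is equivalent to $a=a_{\gamma}$, which is the identity used to place $(\rho,m)$ in $\mathcal{C}_{\alpha,\beta}$.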

4. Application to sparse representation for inverse problems with optimal transport regularization

In this section we deal with the problem of reconstructing a family of time-dependent Radon measures given a finite number of observations. To be more specific, let $H$ be a finite dimensional Hilbert space and $A:C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))\rightarrow H$ be a linear continuous operator, where continuity is understood in the following sense: given a sequence $(t\mapsto\rho^{n}_{t})$ in $C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))$ , we require that

[TABLE]

where, with a slight abuse of notation, we denote by $\rho^{n}$ both the curve $t\mapsto\rho_{t}^{n}$ and the measure $\rho^{n}:=dt\otimes\rho^{n}_{t}$.

For some given data $y\in H$ , we aim to reconstruct a solution $\rho\in C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))$ to the dynamic inverse problem

$$A\rho=y. \tag{41}$$

For $\alpha>0$ and $\beta>0$ we regularize the above inverse problem by means of the energy $J_{\alpha,\beta}$ defined in (10), following the approach in [16]. In practice, upon introducing the space

[TABLE]

we consider the Tikhonov functional $G:\widetilde{\mathcal{M}}\rightarrow\mathbb{R}\cup\{+\infty\}$ defined as

$$G(\rho,m):=J_{\alpha,\beta}(\rho,m)+F(A\rho), \tag{42}$$

where $F:H\rightarrow\mathbb{R}\cup\{+\infty\}$ is a given fidelity functional for the data $y$ , which is assumed to be convex, lower semicontinuous and bounded from below. Additionally, we assume that $G$ is proper. We then replace (41) by

$$\min_{(\rho,m)\in\widetilde{\mathcal{M}}}G(\rho,m). \tag{43}$$

Remark 8.

Two common choices for the fidelity term $F$ in the case $H=\mathbb{R}^{k}$ are, for example,

i) $F(x)=I_{\{y\}}(x)$ for a given $y\in\mathbb{R}^{k}$, which enforces the constraint $A\rho=y$,

ii) $F(x)=\frac{1}{2}\|x-y\|_{2}^{2}$, which recovers a classical $\ell^{2}$ penalization.
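The two fidelity terms of Remark 8 can be written down directly for $H=\mathbb{R}^{k}$. The following is a minimal sketch, with vectors represented as plain Python lists; the function names and the optional tolerance `tol` are illustrative assumptions, not notation from the paper.

```python
import math


def indicator_fidelity(x, y, tol=0.0):
    """Indicator I_{y}(x): 0 if x equals y (up to the assumed tolerance), +inf otherwise."""
    if len(x) != len(y):
        return math.inf
    same = all(abs(xi - yi) <= tol for xi, yi in zip(x, y))
    return 0.0 if same else math.inf


def squared_l2_fidelity(x, y):
    """Quadratic fidelity 0.5 * ||x - y||_2^2."""
    return 0.5 * sum((xi - yi) ** 2 for xi, yi in zip(x, y))
```

Both functions are convex, lower semicontinuous and bounded from below in `x`, matching the standing assumptions on $F$; the indicator takes the value $+\infty$, which is why $F$ is allowed to be extended-valued.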

Remark 9.

Under the above assumptions on $A$ and $F$ , problem (43) admits a solution. Indeed, since $G$ is proper, any minimizing sequence $\{(\rho^{n},m^{n})\}_{n}$ is such that $\{G(\rho^{n},m^{n})\}_{n}$ is bounded. As $F$ is bounded from below and $J_{\alpha,\beta}\geq 0$ , we deduce that $\{J_{\alpha,\beta}(\rho^{n},m^{n})\}_{n}$ is bounded. Therefore, Lemma 4 implies that $(\rho^{n},m^{n})$ converges (up to subsequences) to some $(\rho,m)\in\widetilde{\mathcal{M}}$ , in the sense of (13). By weak* lower semicontinuity of $J_{\alpha,\beta}$ in $\mathcal{M}$ (see Lemma 4) and by (40) together with the lower-semicontinuity of $F$ , we infer that $(\rho,m)$ solves (43).

It is well known that the presence of a finite-dimensional constraint in an inverse problem, such as (41), promotes sparsity in the reconstruction. This observation has recently been made rigorous in [15] and [11], where it is shown that the atoms of a sparse minimizer are extremal points of the ball of the regularizer. In Theorem 6, we provided a characterization of the extremal points of the ball of $J_{\alpha,\beta}$. Therefore, specializing the above-mentioned results to our setting yields the following characterization theorem for sparse minimizers of (43).

Theorem 10.

Let $\alpha,\beta>0$ . There exists a minimizer $(\hat{\rho},\hat{m})\in\widetilde{\mathcal{M}}$ of (43) that can be represented as

$$(\hat{\rho},\hat{m})=\sum_{i=1}^{p}c_{i}\,(\rho_{i},m_{i}), \tag{44}$$

where $p\leq\dim(H)$ , $c_{i}>0$ , $\sum_{i=1}^{p}c_{i}=J_{\alpha,\beta}(\hat{\rho},\hat{m})$ , and

$$\rho_{i}=a_{\gamma_{i}}\,dt\otimes\delta_{\gamma_{i}(t)},\qquad m_{i}=\dot{\gamma}_{i}\,\rho_{i},$$

where $\gamma_{i}\in{\rm AC}^{2}([0,1];\mathbb{R}^{d})$ with $\gamma_{i}(t)\in\overline{\Omega}$ for each $t\in[0,1]$, and $a_{\gamma_{i}}^{-1}:=\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}_{i}|^{2}\,dt+\alpha$.

In other words, the above theorem ensures the existence of a minimizer of (43) which is a finite linear combination of measures concentrated on the graphs of ${\rm AC}^{2}$ -trajectories contained in $\overline{\Omega}$ . In Section 4.1 we give a proof of Theorem 10, and we conclude the paper with Section 4.2, where we apply the sparsity result of Theorem 10 to dynamic inverse problems with optimal transport regularization, following the approach of [16].
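To make the weights $a_{\gamma_{i}}$ concrete, the sketch below evaluates $a_{\gamma}^{-1}=\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}|^{2}\,dt+\alpha$ for curves sampled uniformly in time, and the total mass $\sum_{i}c_{i}a_{\gamma_{i}}$ of a combination of such atoms. The finite-difference discretization and the names `kinetic_energy`, `atom_weight` and `total_mass` are assumptions made for illustration only.

```python
def kinetic_energy(curve):
    """Approximate int_0^1 |gamma'(t)|^2 dt by forward differences.

    `curve` is a list of points (lists of coordinates) sampled uniformly on [0, 1].
    """
    n = len(curve) - 1
    h = 1.0 / n
    total = 0.0
    for p, q in zip(curve[:-1], curve[1:]):
        speed_sq = sum(((qi - pi) / h) ** 2 for pi, qi in zip(p, q))
        total += speed_sq * h
    return total


def atom_weight(curve, alpha, beta):
    """Return a_gamma, the mass of the atom a_gamma dt (x) delta_{gamma(t)}."""
    return 1.0 / (0.5 * beta * kinetic_energy(curve) + alpha)


def total_mass(coeffs, curves, alpha, beta):
    """Mass at any fixed time of sum_i c_i * a_{gamma_i} * delta_{gamma_i(t)}."""
    return sum(c * atom_weight(g, alpha, beta) for c, g in zip(coeffs, curves))


# A unit-speed segment and a constant curve in the plane.
moving = [[k / 10, 0.0] for k in range(11)]
still = [[0.0, 1.0] for _ in range(11)]
mass = total_mass([2.0, 3.0], [moving, still], alpha=1.0, beta=2.0)
```

Faster curves receive smaller weights, so for a fixed coefficient $c_{i}$ an atom that travels further carries less mass.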

4.1. Proof of Theorem 10

As already mentioned, the proof is an immediate consequence of Theorem 6 and a particular case of [11, Corollary $2$ ] (see also [11, Theorem $1$ ]). Before proceeding with the proof, for the reader’s convenience, we recall Corollary $2$ from [11]. The definitions appearing in the statement below will be briefly recalled in the proof of Theorem 10.

Theorem 11 ([11]).

Let $\mathcal{U}$ be a locally convex space, $H$ be a finite-dimensional Hilbert space, $R:\mathcal{U}\rightarrow[-\infty,+\infty]$ , $F:H\rightarrow[-\infty,+\infty]$ be convex, and $A:\mathcal{U}\rightarrow H$ be linear. Consider the variational problem

$$\min_{u\in\mathcal{U}}\;R(u)+F(Au). \tag{45}$$

Suppose that the set of minimizers of (45), denoted by $S$ , is non-empty. Additionally, assume that there exists $\hat{u}\in\operatorname{Ext}(S)$ such that the set

$$\overline{C}:=\{u\in\mathcal{U}\colon R(u)\leq R(\hat{u})\} \tag{46}$$

is linearly closed, the lineality space of $\overline{C}$ is $\{(0,0)\}$ and $\inf_{u\in\mathcal{U}}R(u)<R(\hat{u})$ . Then, exactly one of the following conditions holds:

i) $\hat{u}$ is a convex combination of at most $\dim(H)$ extremal points of $\overline{C}$,

ii) $\hat{u}$ is a convex combination of at most $\dim(H)-1$ points, which are either extremal points of $\overline{C}$, or belong to an extreme ray of $\overline{C}$.

Proof of Theorem 10.

We just need to verify that we can apply Theorem 11 to the variational problem (43). To this end, we choose $\mathcal{U}=\widetilde{\mathcal{M}}$, $R=J_{\alpha,\beta}$, and $F$ and $A$ satisfying the assumptions stated above. Let $S$ be the set of solutions to (43).

First, notice that in Remark 9 we have already shown that the set of minimizers for (43) is non-empty, so that $S\neq\emptyset$ . Moreover $S$ is compact with respect to the weak* topology. Indeed, given a sequence $(\rho^{n},m^{n})$ in $S$ we can use Lemma 4 to extract a subsequence (not relabelled) such that $(\rho^{n},m^{n})\stackrel{{\scriptstyle*}}{{\rightharpoonup}}(\rho,m)$ in $\mathcal{M}$ and $\rho_{t}^{n}\stackrel{{\scriptstyle*}}{{\rightharpoonup}}\rho_{t}$ in $\mathcal{M}(\overline{\Omega})$ for every $t\in[0,1]$ . Using the sequential lower semicontinuity of $J_{\alpha,\beta}$ with respect to weak* convergence combined with the continuity of $A$ (according to (40)) and the lower semicontinuity of $F$ , we obtain $(\rho,m)\in S$ . We conclude that $S$ is sequentially weakly* compact and hence weakly* compact, thanks to the metrizability of the weak* convergence on bounded sets. Finally note that $S$ is convex thanks to the convexity of $F$ and $J_{\alpha,\beta}$ (Lemma 4). By Krein–Milman’s Theorem, we then infer the existence of a $(\hat{\rho},\hat{m})\in\operatorname{Ext}(S)$ .

The lineality space of $\overline{C}$ is defined as $\text{lin}(\overline{C})=\text{rec}(\overline{C})\cap(-\text{rec}(\overline{C}))$, where $\text{rec}(\overline{C})$ is the recession cone of $\overline{C}$, defined as the set of all $(\rho,m)\in\mathcal{U}$ such that $\overline{C}+\mathbb{R}_{+}(\rho,m)\subset\overline{C}$. Hence, from the coercivity of $J_{\alpha,\beta}$ in Lemma 4 it is immediate to conclude that $\text{lin}(\overline{C})=\{(0,0)\}$. Moreover, $\overline{C}$ is linearly closed if the intersection of $\overline{C}$ with every line is closed. It is easy to verify that, as $\overline{C}$ is weakly* closed (Remark 9), it is also linearly closed. Finally, the assumption $\inf_{(\rho,m)\in\widetilde{\mathcal{M}}}J_{\alpha,\beta}(\rho,m)<J_{\alpha,\beta}(\hat{\rho},\hat{m})$ is satisfied whenever $(\hat{\rho},\hat{m})\neq 0$, as in this case $J_{\alpha,\beta}(\hat{\rho},\hat{m})>0$, while $\inf_{(\rho,m)\in\widetilde{\mathcal{M}}}J_{\alpha,\beta}(\rho,m)=0$. Hence, the hypotheses of Theorem 11 for the functional (42) are verified. Notice also that $\overline{C}$ does not contain extreme rays. To prove this, we first recall that a ray of $\overline{C}$ is any set of the form $r_{p,v}=\{p+tv:t>0\}$ for $p,v\in\overline{C}$, $v\neq 0$. An extreme ray of $\overline{C}$ is a ray $r_{p,v}$ such that for every segment intersecting $r_{p,v}$, the whole segment is contained in $r_{p,v}$. Thanks to the coercivity of $J_{\alpha,\beta}$ in Lemma 4, it is immediate to see that $\overline{C}$ contains no rays and thus no extreme rays. Hence, from either of the conclusions $i),ii)$ in Theorem 11, we deduce that there exists a minimizer $(\hat{\rho},\hat{m})\in\widetilde{\mathcal{M}}$ of (43) that can be represented as

$$(\hat{\rho},\hat{m})=\sum_{i=1}^{p}c_{i}\,(\rho_{i},m_{i}), \tag{47}$$

where $(\rho_{i},m_{i})\in\operatorname{Ext}(C_{\alpha,\beta})$, $p\leq\dim(H)$, $c_{i}>0$ and $\sum_{i=1}^{p}c_{i}=J_{\alpha,\beta}(\hat{\rho},\hat{m})$. We remark that if $(\hat{\rho},\hat{m})=0$, the assumption $\inf_{(\rho,m)\in\widetilde{\mathcal{M}}}J_{\alpha,\beta}(\rho,m)<J_{\alpha,\beta}(\hat{\rho},\hat{m})$ in Theorem 11 is not satisfied, but the representation (47) holds trivially. Using the characterization of extremal points in Theorem 6 and (47), we obtain an explicit sparse representation for solutions of (43), and the proof is complete. ∎

4.2. Dynamic inverse problems

Theorem 10 provides a representation formula for sparse solutions of (43) that holds for every $A$ and $F$ satisfying the above-stated hypotheses. A relevant choice for $A$ and $F$ is proposed in [16] as a model for dynamic inverse problems: in particular, the authors apply their framework to variational reconstruction in undersampled dynamic MRI. In what follows we make an explicit choice of $F$ and $A$ in order to apply Theorem 10 to a special case of the framework in [16], namely the case of discrete time sampling, and finite-dimensionality of the data for each sampled time.

To be more specific, consider a discretization of the interval $[0,1]$ in $N$ points $t_{1}<t_{2}<\ldots<t_{N}$ and assume that we want to reconstruct an element of $C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))$ , by only making observations at the time instants $t_{1},\ldots,t_{N}$ . To this aim, let $H_{t_{i}}$ be a family of finite-dimensional Hilbert spaces and introduce the product space $\mathcal{H}:=\bigtimes_{i=1}^{N}H_{t_{i}}$ , normed by $\|y\|_{\mathcal{H}}^{2}:=\sum_{i=1}^{N}\|y_{i}\|^{2}_{H_{t_{i}}}$ . Let $A_{t_{i}}:\mathcal{M}(\overline{\Omega})\to H_{t_{i}}$ be linear operators, which are assumed to be weak* continuous for each $i=1,\ldots,N$ . For a given observation $(y_{t_{1}},\ldots,y_{t_{N}})\in\mathcal{H}$ , consider the problem of finding $\rho\in C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))$ such that

$$A_{t_{i}}\rho_{t_{i}}=y_{t_{i}}\quad\text{for each } i=1,\ldots,N.$$

Following [16], we regularize the above problem by

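$$\min_{(\rho,m)\in\widetilde{\mathcal{M}}}\;J_{\alpha,\beta}(\rho,m)+\frac{1}{2}\sum_{i=1}^{N}\|A_{t_{i}}\rho_{t_{i}}-y_{t_{i}}\|^{2}_{H_{t_{i}}}. \tag{48}$$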

In order to recast the above problem into the form (43), let $A:C_{\rm w}([0,1];\mathcal{M}(\overline{\Omega}))\rightarrow\mathcal{H}$ be the linear operator defined by

$$A\rho:=(A_{t_{1}}\rho_{t_{1}},\ldots,A_{t_{N}}\rho_{t_{N}}).$$

Notice that $A$ is continuous in the sense of (40), thanks to the assumptions on $A_{t_{i}}$ . We can then equivalently rewrite (48) as

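$$\min_{(\rho,m)\in\widetilde{\mathcal{M}}}\;J_{\alpha,\beta}(\rho,m)+\frac{1}{2}\|A\rho-y\|^{2}_{\mathcal{H}}. \tag{49}$$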

In this way, we recover a problem of the type of (43), where $F(x):=\frac{1}{2}\|x-y\|^{2}_{\mathcal{H}}$ . Notice that $F$ is convex, lower semicontinuous and bounded from below. Moreover, the functional in (49) is proper, since $J_{\alpha,\beta}(0,0)=0$ . Hence, we can apply Theorem 10 to conclude the following result.

Corollary 12.

Let $\alpha,\beta>0$ . There exists a minimizer $(\hat{\rho},\hat{m})\in\widetilde{\mathcal{M}}$ of (48) that can be represented as

$$(\hat{\rho},\hat{m})=\sum_{i=1}^{p}c_{i}\,(\rho_{i},m_{i}), \tag{50}$$

where $p\leq\dim(\mathcal{H})=\sum_{i=1}^{N}\dim(H_{t_{i}})$, $c_{i}>0$, $\sum_{i=1}^{p}c_{i}=J_{\alpha,\beta}(\hat{\rho},\hat{m})$, and

$$\rho_{i}=a_{\gamma_{i}}\,dt\otimes\delta_{\gamma_{i}(t)},\qquad m_{i}=\dot{\gamma}_{i}\,\rho_{i},$$

where $\gamma_{i}\in{\rm AC}^{2}([0,1];\mathbb{R}^{d})$ with $\gamma_{i}(t)\in\overline{\Omega}$ for each $t\in[0,1]$, and $a_{\gamma_{i}}^{-1}:=\frac{\beta}{2}\int_{0}^{1}|\dot{\gamma}_{i}|^{2}\,dt+\alpha$.
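For a candidate of the sparse form above, the data term of the time-sampled problem reduces to finite sums. The sketch below assumes $H_{t_{i}}=\mathbb{R}^{k}$ and that each $A_{t_{i}}$ integrates $k$ continuous functions against $\rho_{t_{i}}$, which is one standard way to obtain a weak* continuous linear map into $\mathbb{R}^{k}$. It also assumes the sampling times coincide with grid indices of the discretized curves. All names (`measure_at`, `apply_A_t`, `data_term`) are hypothetical.

```python
def measure_at(atoms, idx):
    """Return rho_t at grid index idx as a list of (weight, point) pairs.

    `atoms` is a list of (weight, curve) pairs, where the weight plays the
    role of c_i * a_{gamma_i} and each curve is a list of sampled points.
    """
    return [(w, curve[idx]) for w, curve in atoms]


def apply_A_t(test_functions, weighted_points):
    """Integrate each test function against a finite sum of Dirac masses."""
    return [sum(w * phi(x) for w, x in weighted_points) for phi in test_functions]


def data_term(atoms, sample_indices, test_functions, observations):
    """Evaluate 0.5 * sum_i ||A_{t_i} rho_{t_i} - y_{t_i}||^2 over the sampled times."""
    total = 0.0
    for idx, y in zip(sample_indices, observations):
        predicted = apply_A_t(test_functions, measure_at(atoms, idx))
        total += 0.5 * sum((p - yi) ** 2 for p, yi in zip(predicted, y))
    return total


# One atom of weight 1 moving from 0 to 1 on the real line, observed at t = 0 and t = 1
# through its total mass and its first moment.
atoms = [(1.0, [[0.0], [0.5], [1.0]])]
phis = [lambda x: 1.0, lambda x: x[0]]
exact = data_term(atoms, [0, 2], phis, [[1.0, 0.0], [1.0, 1.0]])
mismatch = data_term(atoms, [0, 2], phis, [[0.0, 0.0], [1.0, 1.0]])
```

Here $\dim(\mathcal{H})=2k$ with $k=2$ and $N=2$, so Corollary 12 guarantees a minimizer with at most four atoms in this toy configuration.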

Remark 13.

The upper bound $p\leq\sum_{i=1}^{N}\dim(H_{t_{i}})$ in the representation formula (50) might not be optimal. However, a careful analysis of the faces of the ball of the Benamou–Brenier energy, possibly under additional assumptions on the operator $A$ and fidelity term $F$, could be needed to substantiate such a conjecture. We leave this question open for future research.

Acknowledgements

We thank the referee for the useful suggestions provided, particularly for encouraging us to include the case $\alpha=0$ in the characterization of Theorem 6, which was previously missing. KB and SF gratefully acknowledge support by the Christian Doppler Research Association (CDG) and Austrian Science Fund (FWF) through the Partnership in Research project PIR-27 “Mathematical methods for motion-aware medical imaging” and project P 29192 “Regularization graphs for variational imaging”. MC is supported by the Royal Society (Newton International Fellowship NIF\R1\192048 Minimal partitions as a robustness boost for neural network classifiers). The Institute of Mathematics and Scientific Computing, to which KB, SF and FR are affiliated, is a member of NAWI Graz (http://www.nawigraz.at/en/). The authors KB, SF and FR are further members of/associated with BioTechMed Graz (https://biotechmedgraz.at/en/).

Bibliography

[1] L. Ambrosio, N. Fusco, and D. Pallara. Functions of Bounded Variation and Free Discontinuity Problems. Oxford Science Publications, 2000.
[2] L. Ambrosio, N. Gigli, and G. Savaré. Gradient Flows: In Metric Spaces and in the Space of Probability Measures. Birkhäuser Basel, 2005.
[3] L. Ambrosio. Transport equation and Cauchy problem for BV vector fields. Inventiones mathematicae, 158(2):227–260, 2004.
[4] L. Ambrosio, N. Gigli, and G. Savaré. Calculus and heat flow in metric measure spaces and applications to spaces with Ricci bounds from below. Inventiones mathematicae, 195(2):289–391, 2014.
[5] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein generative adversarial networks. In Proceedings of the 34th International Conference on Machine Learning, volume 70, pages 214–223, 2017.
[6] J.-D. Benamou and Y. Brenier. A computational fluid mechanics solution to the Monge–Kantorovich mass transfer problem. Numerische Mathematik, 84(3):375–393, 2000.
[7] P. Bernard. Young measures, superposition and transport. Indiana University Mathematics Journal, 57(1):247–275, 2008.
[8] M. Bernot, V. Caselles, and J.-M. Morel. Optimal transportation networks. Lecture Notes in Mathematics. Springer-Verlag, Berlin, 2009.
