Unnormalized Optimal Transport

Wilfrid Gangbo; Wuchen Li; Stanley Osher; Michael Puthawala

arXiv:1902.03367·math.OC·October 23, 2019·J. Comput. Phys.

TL;DR

This paper extends optimal transport theory to handle unnormalized and unequal masses, introducing new equations and duality formulas that enable efficient computation of distances between unnormalized densities.

Contribution

It develops a novel extension of the Monge-Kantorovich problem for unnormalized masses, including new equations and duality formulas, with efficient solution methods.

Findings

01. Introduces a new Monge-Ampere type equation.

02. Develops a Kantorovich duality formula for unnormalized masses.

03. Provides an efficient computational approach using primal-dual algorithms.

Abstract

We propose an extension of the computational fluid mechanics approach to the Monge-Kantorovich mass transfer problem, which was developed by Benamou-Brenier. Our extension allows optimal transfer of unnormalized and unequal masses. We obtain a one-parameter family of simple modifications of the formulation in [4]. This leads us to a new Monge-Ampere type equation and a new Kantorovich duality formula. These can be solved efficiently by, for example, the Chambolle-Pock primal-dual algorithm. This solution to the extended mass transfer problem gives us a simple metric for computing the distance between two unnormalized densities. The L1 version of this metric was shown in [23] (which is a precursor of our work here) to have desirable properties.

Tables

Table 1. Numerical parameters for our experiments. For the one-dimensional experiments, $n_{y}$ has no value.

Discretization		Optimization
Parameter	Value	Parameter	Value
$n_{t}$	15	Iterations	200,000
$n_{x}$	35	$\tau_{1}$	$10^{-3}$
$n_{y}$	35	$\tau_{2}$	$10^{-1}$
		$\alpha$	100

Equations

\mathcal{P}(\Omega)=\Big\{\mu\in L^{1}(\Omega)\colon\mu(x)\geq 0,~\int_{\Omega}\mu(x)dx=1\Big\}.

\mathcal{M}(\Omega)=\Big\{\mu\in L^{1}(\Omega)\colon\mu(x)\geq 0\Big\}.

UW_{p}(\mu_{0},\mu_{1})^{p}=\inf_{v,\mu,f}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|^{p}\mu(t,x)dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|^{p}dt\cdot|\Omega|

\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~\mu(0,x)=\mu_{0}(x),~\mu(1,x)=\mu_{1}(x).

\begin{split}\textrm{UW}_{1}(\mu_{0},\mu_{1})=&\inf_{v(t,x),f(t)}\Big{\{}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|\mu(t,x)dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|dt\cdot|\Omega|\colon\\ &\hskip 34.14322pt\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~{}\mu(0,x)=\mu_{0}(x),~{}\mu(1,x)=\mu_{1}(x)\Big{\}}.\end{split}

m(x)=\int_{0}^{1}v(t,x)\mu(t,x)dt,

\int_{0}^{1}\int_{\Omega}\|v(t,x)\|\mu(t,x)dxdt\geq\int_{\Omega}\Big\|\int_{0}^{1}v(t,x)\mu(t,x)dt\Big\|dx=\int_{\Omega}\|m(x)\|dx.

\begin{split}&\Big{\{}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|\mu(t,x)dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|dt\cdot|\Omega|\colon\\ &\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~{}\mu(0,x)=\mu_{0}(x),~{}\mu(1,x)=\mu_{1}(x)\Big{\}}\\ \geq&\Big{\{}\int_{\Omega}\|m(x)\|dx+\frac{1}{\alpha}\int_{0}^{1}|f(t)|dt\cdot|\Omega|\colon\mu_{1}(x)-\mu_{0}(x)+\int_{0}^{1}f(t)dt+\nabla\cdot m(x)=0\Big{\}}\\ \geq&\Big{\{}\int_{\Omega}\|m(x)\|dx+\frac{1}{\alpha}\Big{|}\int_{0}^{1}f(t)dt\Big{|}\cdot|\Omega|\colon\mu_{1}(x)-\mu_{0}(x)+\int_{0}^{1}f(t)dt+\nabla\cdot m(x)=0\Big{\}}.\end{split}

c=\frac{1}{|\Omega|}\Big{(}\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big{)}.

\begin{split}UW_{1}(\mu_{0},\mu_{1})=\inf_{m}\Big{\{}&\int_{\Omega}\|m(x)\|dx+\frac{1}{\alpha}\Big{|}\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big{|}\colon\\ &\mu_{1}(x)-\mu_{0}(x)+\frac{1}{|\Omega|}\Big{(}\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big{)}+\nabla\cdot m(x)=0\Big{\}}.\end{split}

\begin{split}UW_{1}(\mu_{0},\mu_{1})=&\int_{\Omega}\Big{|}\int^{x}_{0}\mu_{1}(y)dy-\int^{x}_{0}\mu_{0}(y)dy-x\int_{\Omega}(\mu_{1}(z)-\mu_{0}(z))dz\Big{|}dx\\ &+\frac{1}{\alpha}\Big{(}\Big{|}\int_{\Omega}\mu_{1}(z)dz-\int_{\Omega}\mu_{0}(z)dz\Big{|}\Big{)}.\end{split}

\left\{\begin{aligned} &\frac{m(x)}{\|m(x)\|}=\nabla\Phi(x),\quad\textrm{if $\|m(x)\|\neq 0$}\\ -&\nabla\cdot m(x)=\mu_{1}(x)-\mu_{0}(x)+\frac{1}{|\Omega|}\Big{(}\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big{)}.\end{aligned}\right.

v(t,x)=\nabla\Phi(t,x),\quad f(t)=\alpha\int_{\Omega}\Phi(t,x)dx,

\left\{\begin{aligned}&\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)\nabla\Phi(t,x))=\alpha\int_{\Omega}\Phi(t,x)dx\\&\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}\leq 0\\&\mu(0,x)=\mu_{0}(x),\quad\mu(1,x)=\mu_{1}(x).\end{aligned}\right.

\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}=0.

\alpha\int_{0}^{1}\int_{\Omega}\Phi(t,x)dxdt=\int_{\Omega}\mu_{1}(y)dy-\int_{\Omega}\mu_{0}(y)dy.

F(m,\mu)=\begin{cases}\frac{\|m\|^{2}}{\mu}&\textrm{if }\mu>0;\\0&\textrm{if }\mu=0,~m=0;\\+\infty&\textrm{otherwise}.\end{cases}

\begin{split}\textrm{UW}_{2}(\mu_{0},\mu_{1})^{2}=&\inf_{m,\mu,f}\Big{\{}\int_{0}^{1}\int_{\Omega}F(m(t,x),\mu(t,x))dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|^{2}dt\colon\\ &\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~{}\mu(0,x)=\mu_{0}(x),~{}\mu(1,x)=\mu_{1}(x)\Big{\}}.\end{split}

\begin{split}\mathcal{L}(m,\mu,\Phi)=&\int_{0}^{1}\int_{\Omega}\frac{\|m(t,x)\|^{2}}{2\mu(t,x)}+\Phi(t,x)\Big{(}\partial_{t}\mu(t,x)+\nabla\cdot m(t,x)-f(t)\Big{)}dxdt+\frac{1}{2\alpha}\int_{0}^{1}f(t)^{2}dt.\end{split}

\left\{\begin{aligned}&\frac{m(t,x)}{\mu(t,x)}=\nabla\Phi(t,x)\\&-\frac{\|m(t,x)\|^{2}}{2\mu(t,x)^{2}}-\partial_{t}\Phi(t,x)\leq 0\\&f(t)=\alpha\int_{\Omega}\Phi(t,x)dx.\end{aligned}\right.

\begin{split}\textrm{UW}_{2}(\mu_{0},\mu_{1})^{2}=&\inf_{M,f(t)}~{}\int_{\Omega}\|M(x)-x\|^{2}\mu_{0}(x)dx+\alpha\int_{0}^{1}f(t)^{2}dt\\ &+\int_{0}^{1}\int_{0}^{t}f(s)\int_{\Omega}\|M(x)-x\|^{2}\textrm{Det}\Big{(}s\nabla M(x)+(1-s)\mathbb{I}\Big{)}dsdtdx,\end{split}

\mu(1,M(x))\textrm{Det}(\nabla M(x))=\mu(0,x)+\int_{0}^{1}f(t)\textrm{Det}\Big{(}t\nabla M(x)+(1-t)\mathbb{I}\Big{)}dt.

\frac{d}{d t} X_{t} (x) = v (t, X_{t} (x)), X_{0} (x) = x .

\begin{split}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|^{2}\mu(t,x)dxdt=&\int_{0}^{1}\int_{\Omega}\|v(t,X_{t}(x))\|^{2}\mu(t,X_{t}(x))dX_{t}(x)dt\\ =&\int_{0}^{1}\int_{\Omega}\|\frac{d}{dt}X_{t}(x)\|^{2}\mu(t,X_{t}(x))\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}dxdt.\end{split}

\begin{split}\frac{d}{dt}J(t,x)=&\frac{d}{dt}\Big{\{}\mu(t,X_{t}(x))\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}\Big{\}}\\ =&\partial_{t}\mu(t,X_{t}(x))\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}+\nabla_{X}\mu(t,X_{t}(x))\frac{d}{dt}X_{t}(x)\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}\\ &+\mu(t,X_{t}(x))\partial_{t}\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}\\ =&\Big{\{}\partial_{t}\mu+\nabla\mu\cdot v+\nabla\cdot v\mu\Big{\}}(t,X_{t}(x))\textrm{Det}(\nabla X_{t}(x))\\ =&\Big{\{}\partial_{t}\mu+\nabla\cdot(\mu v)\Big{\}}(t,X_{t}(x))\textrm{Det}(\nabla X_{t}(x))\\ =&f(t)\textrm{Det}(\nabla X_{t}(x)),\end{split}

\partial_{t}\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}=\nabla\cdot v(t,X_{t}(x))\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)},

J (t) = J (0) + \int_{0}^{t} \frac{d}{d s} J (s) d s .

\mu(t,X_{t}(x))\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}=\mu(0,x)+\int_{0}^{t}f(s)\textrm{Det}\Big{(}\nabla X_{s}(x)\Big{)}ds.

\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}=0,

\frac{d}{d t} X_{t} (x) = v (t, X_{t} (x)) = M (x) - x,

\begin{split}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|^{2}\mu(t,x)dxdt=&\int_{0}^{1}\int_{\Omega}\Big\|\frac{d}{dt}X_{t}(x)\Big\|^{2}J(t)dxdt\\ =&\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\Big(J(0)+\int_{0}^{t}\frac{d}{ds}J(s)ds\Big)dxdt\\ =&\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}J(0)dxdt+\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\int_{0}^{t}\frac{d}{ds}J(s)dsdxdt\\ =&\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\mu(0,x)dxdt+\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\int_{0}^{t}f(s)\textrm{Det}(\nabla X_{s}(x))dsdxdt\\ =&\int_{\Omega}\|M(x)-x\|^{2}\mu(0,x)dx+\int_{0}^{1}\int_{0}^{t}\int_{\Omega}\|M(x)-x\|^{2}f(s)\textrm{Det}\Big((1-s)\mathbb{I}+s\nabla M(x)\Big)dsdxdt.\end{split}

Code & Models

Repositories

mputhawala/unnormalized-optimal-transport

Full text

Unnormalized Optimal Transport

Wilfrid Gangbo, Wuchen Li, Stanley Osher and Michael Puthawala

Mathematics department, University of California, Los Angeles

Abstract.

We propose an extension of the computational fluid mechanics approach to the Monge-Kantorovich mass transfer problem, which was developed by Benamou-Brenier in [4]. Our extension allows optimal transfer of unnormalized and unequal masses. We obtain a one-parameter family of simple modifications of the formulation in [4]. This leads us to a new Monge-Ampère type equation and a new Kantorovich duality formula. These can be solved efficiently by, for example, the Chambolle-Pock primal-dual algorithm [6]. This solution to the extended mass transfer problem gives us a simple metric for computing the distance between two unnormalized densities. The $L^{1}$ version of this metric was shown in [23] (which is a precursor of our work here) to have desirable properties.

Key words and phrases:

Optimal transport; Unnormalized density space; Unnormalized Monge-Ampère equation.

The research is supported by AFOSR MURI FA9550-18-1-0502.

1. Introduction

Optimal transport (OT) plays important roles in inverse problems [10, 27] and machine learning [1, 13, 19]. It provides a particular distance function, called the Wasserstein metric or Earth Mover’s distance, among histograms or density functions [4, 26]. In these traditional settings, it assumes that histograms or densities have the same total mass. In real applications, we face a situation where the total mass of each histogram is not equal. For example, when comparing two images, their intensities are not the same. This fact prevents us from applying the classical optimal transport.

In this paper, we formulate simple and natural extensions of optimal transport in unnormalized density space. In short, we add a spatially independent source function to the continuity equation and to the cost functional. There are two benefits of the current approach. On the one hand, the changes to the variational problem are simple. They define a robust $L^{p}$ Wasserstein metric in unnormalized density space and do not significantly change the computational complexity of the problem. The proposed model allows us to apply classical algorithms, such as the Chambolle-Pock primal-dual method [6], to solve it. On the other hand, the proposed problem is natural in that it uses the same key Hamilton-Jacobi equation as the original optimal transport problem. These properties allow us to identify new problems corresponding to the Monge problem and Monge-Ampère equation in unnormalized density space.

There have been various extensions of optimal transport for unnormalized or unbalanced densities [2, 3, 8, 5, 11, 12, 18, 21, 22, 24, 25]. In particular, [8, 9, 18] propose the Wasserstein-Fisher-Rao or Hellinger–Kantorovich metric. (In the literature, the Wasserstein-Fisher-Rao metric is called unbalanced OT; to distinguish our approach from theirs, we call it unnormalized OT.) In their studies, a spatially dependent source function is introduced, which is a ratio involving the density in the spatial domain. In addition, [7] and [20] study other spatially dependent source functions. Here we propose a spatially independent source function which keeps the key Hamilton-Jacobi equation as in the normalized case. This property allows us to design a simple algorithm and to derive a reasonably simple unnormalized Monge-Ampère equation.

The plan of this paper is as follows. In section 2, we propose and study the properties of the unnormalized dynamical optimal transport problem. The unnormalized Monge problem, Monge-Ampère equation and Kantorovich formulations are all derived. In section 3, we present the algorithms and numerical examples for this proposed metric.

2. Unnormalized optimal transport

In this section, we introduce unnormalized OT problems and show that the proposed unnormalized metric is well defined. We then derive minimization procedures for unnormalized optimal transport.

Denote $\Omega\subset\mathbb{R}^{d}$ as a bounded convex domain with area $|\Omega|$ . Denote the space of normalized densities by

$$\mathcal{P}(\Omega)=\Big\{\mu\in L^{1}(\Omega)\colon\mu(x)\geq 0,~\int_{\Omega}\mu(x)dx=1\Big\}.$$

Let the space of unnormalized densities be

$$\mathcal{M}(\Omega)=\Big\{\mu\in L^{1}(\Omega)\colon\mu(x)\geq 0\Big\}.$$

We note that $\mathcal{P}(\Omega)\subset\mathcal{M}(\Omega)$ . We next define the optimal transport cost between $\mu_{0},\mu_{1}\in\mathcal{M}(\Omega)$ .

Definition 1 (Unnormalized OT).

Define the $L^{p}$ unnormalized Wasserstein distance $UW_{p}\colon$ $\mathcal{M}(\Omega)\times\mathcal{M}(\Omega)\rightarrow\mathbb{R}$ by

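$$UW_{p}(\mu_{0},\mu_{1})^{p}=\inf_{v,\mu,f}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|^{p}\mu(t,x)dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|^{p}dt\cdot|\Omega|\tag{1a}$$

subject to the unnormalized continuity equation

$$\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),\quad\mu(0,x)=\mu_{0}(x),\quad\mu(1,x)=\mu_{1}(x).\tag{1b}$$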

Here $\|\cdot\|$ is the Euclidean norm, $\mu_{0}$ , $\mu_{1}\in\mathcal{M}(\Omega)$ , and the infimum is taken over all continuous unnormalized density functions $\mu\colon[0,1]\times\Omega\rightarrow\mathbb{R}$ , and Borel vector fields $v\colon[0,1]\times\Omega\rightarrow\mathbb{R}^{d}$ with zero flux condition $v(t,x)\cdot n(t,x)=0$ on $(0,1)\times\partial\Omega$ with $n(t,x)$ being the normal vector on the boundary of $\Omega$ , and Borel spatially independent source functions $f\colon[0,1]\rightarrow\mathbb{R}$ .

The newly proposed $L^{p}$ Wasserstein metric has an attractive physical interpretation. The above optimization problem can be viewed as a variational fluid dynamics problem in Eulerian coordinates. Definition 1 considers the motion, creation and removal of particles. During this process, the total mass changes dynamically in a uniform manner, controlled by the positive parameter $\alpha$ and a spatially independent function $f(t)$ . We remark that the spatial independence of the source function introduces an important natural property, which we will emphasize repeatedly: it leads to the same Hamilton-Jacobi equation as in classical optimal transport, which allows us to obtain a new Monge problem, Monge-Ampère equation and Kantorovich duality problem. In addition, this physical analogy follows approaches in [16]. More interestingly, we notice that problem (1) has essentially the same computational complexity as the classical dynamical optimal transport problem. We will present computational details in section 3.
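To make the objective in Definition 1 concrete, the following minimal Python sketch evaluates a Riemann-sum approximation of the cost for a given discrete path on a one-dimensional grid. The function name `uw_cost`, the array layout and the rectangle quadrature are assumptions made for illustration only; the sketch does not enforce the continuity equation or the boundary conditions.

```python
import numpy as np

def uw_cost(v, mu, f, alpha, dx, dt, p=2):
    """Riemann-sum value of the unnormalized OT objective on a 1D grid.

    v, mu : arrays of shape (n_t, n_x), velocity and density samples
    f     : array of shape (n_t,), spatially independent source samples
    The continuity-equation constraint is NOT checked here (assumed to hold).
    """
    v, mu, f = (np.asarray(a, dtype=float) for a in (v, mu, f))
    omega = mu.shape[1] * dx                       # |Omega| for a uniform grid
    transport = np.sum(np.abs(v) ** p * mu) * dx * dt
    source = np.sum(np.abs(f) ** p) * dt * omega / alpha
    return transport + source

# Pure creation of mass: no motion, constant source f = 0.5 on Omega = [0, 1].
n_t, n_x = 10, 20
cost = uw_cost(np.zeros((n_t, n_x)), np.ones((n_t, n_x)),
               np.full(n_t, 0.5), alpha=2.0, dx=1 / n_x, dt=1 / n_t, p=2)
```

In this example only the source term contributes, so the value is $\frac{1}{\alpha}\int_{0}^{1}f(t)^{2}dt\cdot|\Omega|$ with $f\equiv 0.5$ and $\alpha=2$ .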

2.1. $L^{1}$ unnormalized Wasserstein metric

We first study the $L^{1}$ unnormalized Wasserstein metric. When $p=1$ , the problem (1a) becomes:

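$$\begin{aligned}UW_{1}(\mu_{0},\mu_{1})=\inf_{v(t,x),f(t)}\Big\{&\int_{0}^{1}\int_{\Omega}\|v(t,x)\|\mu(t,x)dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|dt\cdot|\Omega|\colon\\&\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~\mu(0,x)=\mu_{0}(x),~\mu(1,x)=\mu_{1}(x)\Big\}.\end{aligned}$$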

Denote

$$m(x)=\int_{0}^{1}v(t,x)\mu(t,x)dt,$$

then by Jensen’s inequality, the minimizer is obtained by a time independent solution. In other words,

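$$\int_{0}^{1}\int_{\Omega}\|v(t,x)\|\mu(t,x)dxdt\geq\int_{\Omega}\Big\|\int_{0}^{1}v(t,x)\mu(t,x)dt\Big\|dx=\int_{\Omega}\|m(x)\|dx.$$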

By integrating the time variable in the constraint, we observe that

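$$\begin{aligned}&\Big\{\int_{0}^{1}\int_{\Omega}\|v(t,x)\|\mu(t,x)dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|dt\cdot|\Omega|\colon\\&\quad\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~\mu(0,x)=\mu_{0}(x),~\mu(1,x)=\mu_{1}(x)\Big\}\\\geq&\Big\{\int_{\Omega}\|m(x)\|dx+\frac{1}{\alpha}\int_{0}^{1}|f(t)|dt\cdot|\Omega|\colon\mu_{1}(x)-\mu_{0}(x)+\int_{0}^{1}f(t)dt+\nabla\cdot m(x)=0\Big\}\\\geq&\Big\{\int_{\Omega}\|m(x)\|dx+\frac{1}{\alpha}\Big|\int_{0}^{1}f(t)dt\Big|\cdot|\Omega|\colon\mu_{1}(x)-\mu_{0}(x)+\int_{0}^{1}f(t)dt+\nabla\cdot m(x)=0\Big\}.\end{aligned}$$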

Denote $c=\int_{0}^{1}f(t)dt$ . Integrating the continuity equation (1b) over both the time and spatial domains, it is clear that

$$c=\frac{1}{|\Omega|}\Big(\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big).$$

We can show that the minimizer path can be attained in the last inequality, by choosing $\mu(t,x)=(1-t)\mu_{0}(x)+t\mu_{1}(x)$ . Thus we derive the following proposition.

Proposition 2.

The $L^{1}$ unnormalized Wasserstein metric is given by

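$$\begin{aligned}UW_{1}(\mu_{0},\mu_{1})=\inf_{m}\Big\{&\int_{\Omega}\|m(x)\|dx+\frac{1}{\alpha}\Big|\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big|\colon\\&\mu_{1}(x)-\mu_{0}(x)+\frac{1}{|\Omega|}\Big(\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big)+\nabla\cdot m(x)=0\Big\}.\end{aligned}$$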

In addition, in one space dimension on the interval $\Omega=[0,1]$ , the $L^{1}$ unnormalized Wasserstein metric has the following explicit solution:

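$$\begin{aligned}UW_{1}(\mu_{0},\mu_{1})=&\int_{\Omega}\Big|\int^{x}_{0}\mu_{1}(y)dy-\int^{x}_{0}\mu_{0}(y)dy-x\int_{\Omega}(\mu_{1}(z)-\mu_{0}(z))dz\Big|dx\\&+\frac{1}{\alpha}\Big|\int_{\Omega}\mu_{1}(z)dz-\int_{\Omega}\mu_{0}(z)dz\Big|.\end{aligned}$$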

The formulation in proposition 2 has been proposed in [23] for inverse problems. It is one of the prime motivations for this paper. We also note the minimizer satisfies the following form [17]:

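$$\left\{\begin{aligned}&\frac{m(x)}{\|m(x)\|}=\nabla\Phi(x),\quad\textrm{if }\|m(x)\|\neq 0\\&-\nabla\cdot m(x)=\mu_{1}(x)-\mu_{0}(x)+\frac{1}{|\Omega|}\Big(\int_{\Omega}\mu_{0}(x)dx-\int_{\Omega}\mu_{1}(x)dx\Big).\end{aligned}\right.$$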
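The explicit one-dimensional expression in Proposition 2 (the $L^{1}$ norm of the difference of cumulative masses, corrected by a linear term, plus $\frac{1}{\alpha}$ times the absolute mass difference) can be evaluated directly with cumulative sums. The sketch below is a minimal illustration, assuming densities sampled on a uniform grid of $\Omega=[0,1]$ and a simple rectangle rule; the function name `uw1_1d` is hypothetical and not taken from the paper or its repository.

```python
import numpy as np

def uw1_1d(mu0, mu1, alpha):
    """Closed-form L1 unnormalized Wasserstein distance on Omega = [0, 1].

    mu0, mu1 : density samples on n uniform cells (assumed discretization)
    alpha    : positive parameter weighting the mass-difference penalty
    """
    mu0 = np.asarray(mu0, dtype=float)
    mu1 = np.asarray(mu1, dtype=float)
    n = mu0.size
    dx = 1.0 / n
    x = (np.arange(n) + 1) * dx           # right edge of each cell
    cum0 = np.cumsum(mu0) * dx            # int_0^x mu0(y) dy
    cum1 = np.cumsum(mu1) * dx            # int_0^x mu1(y) dy
    mass_diff = cum1[-1] - cum0[-1]       # int_Omega (mu1 - mu0) dz
    transport = np.sum(np.abs(cum1 - cum0 - x * mass_diff)) * dx
    return transport + abs(mass_diff) / alpha

# Uniform densities of mass 1 and 2: nothing has to move, only mass is created.
d_uniform = uw1_1d(np.ones(100), 2 * np.ones(100), alpha=2.0)
```

For two uniform densities the transport term vanishes and only the penalty $\frac{1}{\alpha}|\int_{\Omega}\mu_{1}dz-\int_{\Omega}\mu_{0}dz|$ remains; for balanced inputs the penalty vanishes and the formula reduces to the familiar difference of cumulative distribution functions.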

2.2. $L^{2}$ unnormalized Wasserstein metric

We next present the result when $p=2$ . Similar derivations can also be established for $p\in(1,\infty)$ . For simplicity of presentation, we now assume $|\Omega|=1$ .

Proposition 3.

The $L^{2}$ unnormalized Wasserstein metric (1) is a well-defined metric function in $\mathcal{M}(\Omega)$ . In addition, the minimizer $(v(t,x),\mu(t,x),f(t))$ for problem (1) satisfies

$$v(t,x)=\nabla\Phi(t,x),\quad f(t)=\alpha\int_{\Omega}\Phi(t,x)dx,$$

and

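$$\left\{\begin{aligned}&\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)\nabla\Phi(t,x))=\alpha\int_{\Omega}\Phi(t,x)dx\\&\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}\leq 0\\&\mu(0,x)=\mu_{0}(x),\quad\mu(1,x)=\mu_{1}(x).\end{aligned}\right.\tag{2}$$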

In particular, if $\mu(t,x)>0$ , then

$$\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}=0.\tag{3}$$

Remark: We note that equation (2) implies

$$\alpha\int_{0}^{1}\int_{\Omega}\Phi(t,x)dxdt=\int_{\Omega}\mu_{1}(y)dy-\int_{\Omega}\mu_{0}(y)dy.$$

This means that, unlike in classical OT, we solve not only for the unique $\nabla\Phi$ but also for the unique $\Phi$ .

Proof.

Denote $m(t,x)=\mu(t,x)v(t,x)$ and

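$$F(m,\mu)=\begin{cases}\frac{\|m\|^{2}}{\mu}&\textrm{if }\mu>0;\\0&\textrm{if }\mu=0,~m=0;\\+\infty&\textrm{otherwise},\end{cases}$$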

then variational problem (1) can be reformulated as

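$$\begin{aligned}UW_{2}(\mu_{0},\mu_{1})^{2}=\inf_{m,\mu,f}\Big\{&\int_{0}^{1}\int_{\Omega}F(m(t,x),\mu(t,x))dxdt+\frac{1}{\alpha}\int_{0}^{1}|f(t)|^{2}dt\colon\\&\partial_{t}\mu(t,x)+\nabla\cdot(\mu(t,x)v(t,x))=f(t),~\mu(0,x)=\mu_{0}(x),~\mu(1,x)=\mu_{1}(x)\Big\}.\end{aligned}\tag{4}$$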

It is clear that (4) is the reformulation of (1). We first prove that the variational problem (4) is well defined. In other words, there exists a feasible path for the dynamical constraint. We construct a feasible path $\mu_{t}$ connecting any $\mu_{0}$ , $\mu_{1}\in\mathcal{M}(\Omega)$ . The proof is divided into three steps.

Step 1. On $t\in[0,\frac{1}{3}]$ , we construct a feasible path connecting $\mu_{0}$ and a uniform measure with total mass $\int_{\Omega}\mu_{0}dx$ . In this case, the density path is a normalized (classical) OT path between two densities of equal mass. We set $f(t)=0$ for $t\in[0,\frac{1}{3}]$ ; such a path always exists.

Step 2. On $t\in[\frac{1}{3},\frac{2}{3}]$ , we construct a feasible path connecting a uniform measure with total mass $\int_{\Omega}\mu_{0}dx$ and a uniform measure with total mass $\int_{\Omega}\mu_{1}dx$ . In this case, we let the transport flux $m(t,x)=0$ , and choose $f(t)=3(\int_{\Omega}\mu_{1}(x)dx-\int_{\Omega}\mu_{0}(x)dx)$ .

Step 3. On $t\in[\frac{2}{3},1]$ , we construct a feasible path connecting a uniform measure with total mass $\int_{\Omega}\mu_{1}dx$ and $\mu_{1}$ . In this case, we set $f(t)=0$ . Following classical OT, we find a feasible path.

Combining steps 1, 2 and 3, the proposed path is feasible with finite cost functional. We next show that the problem has a minimizer. The constraint set is not empty, the cost functional $F(m,\mu)+\frac{1}{\alpha}f(t)^{2}$ is convex and lower semicontinuous, and the constraint is linear. So the variational problem (4) has a minimizer.
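The bookkeeping of the three-step path can be checked with a few lines of Python. This is only an illustrative sketch under the assumption $|\Omega|=1$ made in this subsection; the helper names (`source_schedule`, `total_mass`, `source_cost`) are hypothetical. It tracks the total mass along the path and the contribution of the source term to the cost; the transport cost of steps 1 and 3 is not computed.

```python
def source_schedule(t, mass0, mass1):
    """Piecewise-constant source f(t) of the three-step path (|Omega| = 1)."""
    return 3.0 * (mass1 - mass0) if 1 / 3 <= t < 2 / 3 else 0.0

def total_mass(t, mass0, mass1):
    """Total mass at time t: mass0 plus the integral of f over [0, t]."""
    active = min(max(t - 1 / 3, 0.0), 1 / 3)   # time spent inside step 2
    return mass0 + 3.0 * (mass1 - mass0) * active

def source_cost(mass0, mass1, alpha):
    """(1/alpha) * int_0^1 f(t)^2 dt = 3 * (mass1 - mass0)^2 / alpha."""
    return 3.0 * (mass1 - mass0) ** 2 / alpha

# Example with total masses 2 and 5: the mass is untouched in steps 1 and 3.
masses = [total_mass(t, 2.0, 5.0) for t in (0.0, 0.2, 0.5, 1.0)]
```

Since $f$ is bounded and steps 1 and 3 are classical OT paths, the total cost of this path is finite, which is all that feasibility requires.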

We next apply a Lagrange multiplier to find the minimizer. Denote $\Phi(t,x)$ as the multiplier with

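$$\mathcal{L}(m,\mu,\Phi)=\int_{0}^{1}\int_{\Omega}\frac{\|m(t,x)\|^{2}}{2\mu(t,x)}+\Phi(t,x)\Big(\partial_{t}\mu(t,x)+\nabla\cdot m(t,x)-f(t)\Big)dxdt+\frac{1}{2\alpha}\int_{0}^{1}f(t)^{2}dt.$$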

Assuming $\delta_{m}\mathcal{L}=0$ , $\delta_{\mu}\mathcal{L}\leq 0$ , $\delta_{f}\mathcal{L}=0$ , we derive the property of minimizer as follows:

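$$\left\{\begin{aligned}&\frac{m(t,x)}{\mu(t,x)}=\nabla\Phi(t,x)\\&-\frac{\|m(t,x)\|^{2}}{2\mu(t,x)^{2}}-\partial_{t}\Phi(t,x)\leq 0\\&f(t)=\alpha\int_{\Omega}\Phi(t,x)dx.\end{aligned}\right.$$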

Here if $\mu>0$ , we obtain $\delta_{\mu}\mathcal{L}=0$ , which gives equality in the second formula of the above system. Using the fact $\frac{m(t,x)}{\mu(t,x)}=\nabla\Phi(t,x)$ , we prove the result. In this case, the non-negativity, symmetry, triangle inequality of the metric follow directly from the definition. ∎

We next derive our new Monge problem for unnormalized OT. This approach uses the Lagrange coordinates arising in problem (1).

Proposition 4 (Unnormalized Monge problem).

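$$\begin{aligned}UW_{2}(\mu_{0},\mu_{1})^{2}=\inf_{M,f(t)}~&\int_{\Omega}\|M(x)-x\|^{2}\mu_{0}(x)dx+\alpha\int_{0}^{1}f(t)^{2}dt\\&+\int_{0}^{1}\int_{0}^{t}f(s)\int_{\Omega}\|M(x)-x\|^{2}\textrm{Det}\Big(s\nabla M(x)+(1-s)\mathbb{I}\Big)dsdtdx,\end{aligned}\tag{5a}$$

such that

$$\mu(1,M(x))\textrm{Det}(\nabla M(x))=\mu(0,x)+\int_{0}^{1}f(t)\textrm{Det}\Big(t\nabla M(x)+(1-t)\mathbb{I}\Big)dt.\tag{5b}$$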

Proof.

We now derive the Lagrange formulation of the unnormalized OT (1). Consider any mapping function $X_{t}(x)$ with vector field $v(t,X_{t}(x))$ , i.e.

$$\frac{d}{dt}X_{t}(x)=v(t,X_{t}(x)),\quad X_{0}(x)=x.$$

Then

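$$\begin{aligned}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|^{2}\mu(t,x)dxdt=&\int_{0}^{1}\int_{\Omega}\|v(t,X_{t}(x))\|^{2}\mu(t,X_{t}(x))dX_{t}(x)dt\\=&\int_{0}^{1}\int_{\Omega}\Big\|\frac{d}{dt}X_{t}(x)\Big\|^{2}\mu(t,X_{t}(x))\textrm{Det}\Big(\nabla X_{t}(x)\Big)dxdt.\end{aligned}\tag{6}$$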

We next derive the differential equation for $J(t,x):=\mu(t,X_{t}(x))\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}$ . Later on, we use the notation $J(t)=J(t,x)$ and $\frac{d}{dt}J(t)=\frac{\partial}{\partial t}J(t,x)$ . Since

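$$\begin{aligned}\frac{d}{dt}J(t,x)=&\frac{d}{dt}\Big\{\mu(t,X_{t}(x))\textrm{Det}\Big(\nabla X_{t}(x)\Big)\Big\}\\=&\partial_{t}\mu(t,X_{t}(x))\textrm{Det}\Big(\nabla X_{t}(x)\Big)+\nabla_{X}\mu(t,X_{t}(x))\cdot\frac{d}{dt}X_{t}(x)\textrm{Det}\Big(\nabla X_{t}(x)\Big)\\&+\mu(t,X_{t}(x))\partial_{t}\textrm{Det}\Big(\nabla X_{t}(x)\Big)\\=&\Big\{\partial_{t}\mu+\nabla\mu\cdot v+\mu\nabla\cdot v\Big\}(t,X_{t}(x))\textrm{Det}(\nabla X_{t}(x))\\=&\Big\{\partial_{t}\mu+\nabla\cdot(\mu v)\Big\}(t,X_{t}(x))\textrm{Det}(\nabla X_{t}(x))\\=&f(t)\textrm{Det}(\nabla X_{t}(x)),\end{aligned}$$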

where the third equality is derived by the Jacobi identity, i.e.

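$$\partial_{t}\textrm{Det}\Big(\nabla X_{t}(x)\Big)=\nabla\cdot v(t,X_{t}(x))\textrm{Det}\Big(\nabla X_{t}(x)\Big),$$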

and the last equality follows from our proposed continuity equation (1b) with spatially independent source function.

Notice

$$J(t)=J(0)+\int_{0}^{t}\frac{d}{ds}J(s)ds.$$

Since $X_{0}(x)=x$ and $\nabla X_{0}(x)=\mathbb{I}$ , then $J(0)=\mu(0,x)$ and

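$$\mu(t,X_{t}(x))\textrm{Det}\Big(\nabla X_{t}(x)\Big)=\mu(0,x)+\int_{0}^{t}f(s)\textrm{Det}\Big(\nabla X_{s}(x)\Big)ds.$$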

Since the minimizer in Eulerian coordinates satisfies the Hamilton-Jacobi equation in (3):

$$\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}=0,$$

and $\frac{d}{dt}X_{t}(x)=\nabla\Phi(t,X_{t}(x))$ , then we naturally have $\frac{d^{2}}{dt^{2}}X_{t}(x)=0$ . This implies

$$\frac{d}{dt}X_{t}(x)=v(t,X_{t}(x))=M(x)-x,$$

thus $X_{t}(x)=(1-t)x+tM(x)$ and $\textrm{Det}\Big{(}\nabla X_{t}(x)\Big{)}=\textrm{Det}\Big{(}(1-t)\mathbb{I}+t\nabla M(x)\Big{)}$ .

Substituting all the above relations into (6):

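$$\begin{aligned}\int_{0}^{1}\int_{\Omega}\|v(t,x)\|^{2}\mu(t,x)dxdt=&\int_{0}^{1}\int_{\Omega}\Big\|\frac{d}{dt}X_{t}(x)\Big\|^{2}J(t)dxdt\\=&\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\Big(J(0)+\int_{0}^{t}\frac{d}{ds}J(s)ds\Big)dxdt\\=&\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}J(0)dxdt+\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\int_{0}^{t}\frac{d}{ds}J(s)dsdxdt\\=&\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\mu(0,x)dxdt+\int_{0}^{1}\int_{\Omega}\|M(x)-x\|^{2}\int_{0}^{t}f(s)\textrm{Det}(\nabla X_{s}(x))dsdxdt\\=&\int_{\Omega}\|M(x)-x\|^{2}\mu(0,x)dx+\int_{0}^{1}\int_{0}^{t}\int_{\Omega}\|M(x)-x\|^{2}f(s)\textrm{Det}\Big((1-s)\mathbb{I}+s\nabla M(x)\Big)dsdxdt.\end{aligned}$$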

Thus we prove the results. ∎
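The Lagrangian identity used in this proof, $J(t)=\mu(0,x)+\int_{0}^{t}f(s)\textrm{Det}(\nabla X_{s}(x))ds$ with $X_{s}(x)=(1-s)x+sM(x)$ , is easy to evaluate along a single characteristic in one space dimension, where the determinant reduces to $(1-s)+sM^{\prime}(x)$ . The following sketch is an illustration only: the function name, the trapezoidal quadrature and the sample values are assumptions, not part of the paper.

```python
import numpy as np

def jacobian_weighted_density(t, mu0_x, dM_x, f, n_quad=2001):
    """J(t) = mu0(x) + int_0^t f(s) * ((1 - s) + s * M'(x)) ds in one dimension.

    mu0_x : value of the initial density at the starting point x
    dM_x  : derivative M'(x) of the (assumed known) optimal map at x
    f     : vectorized callable, the spatially independent source
    """
    s = np.linspace(0.0, t, n_quad)
    integrand = f(s) * ((1.0 - s) + s * dM_x)
    ds = s[1] - s[0]
    # Composite trapezoidal rule, written out to avoid version-specific helpers.
    integral = ds * (integrand.sum() - 0.5 * (integrand[0] + integrand[-1]))
    return mu0_x + integral

# Constant source f = 0.5 and a map that stretches space by a factor 3 at x.
J1 = jacobian_weighted_density(1.0, mu0_x=1.0, dM_x=3.0,
                               f=lambda s: 0.5 * np.ones_like(s))
density_at_target = J1 / 3.0   # mu(1, M(x)) = J(1) / Det(grad M(x))
```

With a constant source $c$ and $M^{\prime}(x)=a$ the integral is $c\,(t+(a-1)t^{2}/2)$ , which the quadrature reproduces because the integrand is linear in $s$ .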

We next find the relation between the spatially independent source function $f(t)$ and the mapping function $M(x)$ . For simplicity of presentation, we assume periodic boundary conditions on $\Omega$ .

Proposition 5 (Unnormalized Monge-Ampère equation).

The optimal mapping function $M(x)=\nabla\Psi(x)$ satisfies the following unnormalized Monge-Ampère equation

[TABLE]

Proof.

Let us rewrite the minimizer (2) into a time independent formulation. From the Hopf-Lax formula for the Hamilton-Jacobi equation,

[TABLE]

Thus $\nabla\Phi(0,x)+x-M(x)=0$ . We further denote $\Psi(x)=\Phi(0,x)+\frac{\|x\|^{2}}{2}$ , then $M(x)=\nabla\Psi(x)$ . From $X_{t}(x)=(1-t)x+tM(x)$ , then

[TABLE]

and

[TABLE]

From (2) and the above two formulas, then

[TABLE]

Substituting the formula for $f(t)$ and $M(x)=\nabla\Psi(x)$ into (5b), we derive the result. ∎

We now present the Kantorovich duality formulation of the problem (1).

Proposition 6 (Unnormalized Kantorovich formulation).

[TABLE]

Proof.

As in [14, 15], we derive the duality formula by integration by parts as follows. Notice the fact that

[TABLE]

We have shown that the minimizer over $m$ is obtained at $\frac{m}{\mu}=\nabla\Phi$ , and $f(t)=\alpha\int_{\Omega}\Phi(t,x)dx$ . The last equality holds because $\mu(t,x)\geq 0$ , thus $\partial_{t}\Phi(t,x)+\frac{1}{2}\|\nabla\Phi(t,x)\|^{2}\leq 0$ .

We next show that the primal-dual gap is zero. From proposition 3, the minimizer $(\mu,\Phi)$ satisfies (2). Thus

[TABLE]

This concludes the proof. ∎

3. The numerical method

In this section, we propose to apply a primal-dual algorithm to solve unnormalized OT numerically. We then provide several numerical examples to demonstrate the effectiveness of this procedure.

3.1. Algorithm

We present a primal-dual algorithm for problem (1). In particular, our method is based on its reformulation (4), named the minimal flux problem. Define the Lagrangian of (4):

[TABLE]

where $\Phi(t,x)$ is the Lagrange multiplier of the unnormalized continuity equation (1b).

Convex analysis shows that $(m^{*}(t,x),\mu^{*}(t,x),f^{*}(t))$ is a solution to (4) if and only if there is a $\Phi^{*}$ such that $(m^{*},\mu^{*},f^{*},\Phi^{*})$ is a saddle point of $\mathcal{L}(m,\mu,f,\Phi)$ . In other words, we can compute the minimization (4) by solving the following minimax problem

[TABLE]

It is clear that $\mathcal{L}$ is convex in $m$ , $\mu$ , $f$ and concave in $\Phi$ , and the interaction term is a linear operator. This property allows us to apply the Chambolle-Pock first order primal-dual algorithm [6], which gives the update as follows.

[TABLE]

where $\tau_{1}$ , $\tau_{2}$ are given step sizes for primal, dual variables. These steps can be interpreted as a gradient descent in the primal variable $(m,\mu,f)$ and a gradient ascent in the dual variable $\Phi$ .
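The structure of such a primal-dual iteration (a proximal descent step in the primal variable, an extrapolation, and an ascent step in the multiplier) can be seen on a toy problem. The sketch below applies a Chambolle-Pock type iteration to $\min_{x}\frac{1}{2}\|x-b\|^{2}$ subject to $Ax=c$ ; it is not the update (8) for unnormalized OT, and the problem, the function name and the step sizes are assumptions chosen only so that $\tau_{1}\tau_{2}\|A\|^{2}<1$ .

```python
import numpy as np

def chambolle_pock_toy(A, b, c, tau1, tau2, n_iter=500):
    """Primal-dual iteration for min_x 0.5*||x - b||^2 subject to A x = c.

    Saddle form: min_x max_phi 0.5*||x - b||^2 + phi^T (A x - c).
    Assumes tau1 * tau2 * ||A||_2^2 < 1.
    """
    x = np.zeros(A.shape[1])
    phi = np.zeros(A.shape[0])
    for _ in range(n_iter):
        # Primal step: proximal map of 0.5*||x - b||^2, available in closed form.
        v = x - tau1 * (A.T @ phi)
        x_new = (v + tau1 * b) / (1.0 + tau1)
        # Extrapolate the primal variable, then ascend in the multiplier.
        x_bar = 2.0 * x_new - x
        phi = phi + tau2 * (A @ x_bar - c)
        x = x_new
    return x, phi

# Project the origin onto the line x_1 + x_2 = 1.
A = np.array([[1.0, 1.0]])
x_opt, phi_opt = chambolle_pock_toy(A, b=np.zeros(2), c=np.array([1.0]),
                                    tau1=0.5, tau2=0.5)
```

Here the multiplier plays the role that $\Phi$ plays for the continuity equation: it is updated with the residual of the linear constraint, while the primal variable is updated through a proximal step on the convex cost.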

It turns out that the optimizations in the above update (8) have explicit formulas. The first line becomes

[TABLE]

The second line of (8) simplifies to

[TABLE]

The above problem has an analytical solution by solving a cubic equation. The third line of (8) gives

[TABLE]

The fourth line of (8) gives

[TABLE]

Combining all above formulas, we are now ready to state the algorithm.

[TABLE]
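As an illustration of how a density update can reduce to a cubic equation, suppose the $\mu$ subproblem at a single grid point has the proximal form $\min_{\mu>0}\frac{a}{2\mu}+\frac{(\mu-c)^{2}}{2\tau}$ with $a=\|m\|^{2}\geq 0$ . This form, the constants $a$ , $c$ , $\tau$ and the function name are assumptions made for this sketch; the exact subproblem is the second line of (8). Setting the derivative to zero gives $\mu^{3}-c\mu^{2}-\frac{\tau a}{2}=0$ , which has exactly one positive real root when $a>0$ .

```python
import numpy as np

def mu_prox_cubic(a, c, tau):
    """Minimize a/(2*mu) + (mu - c)^2/(2*tau) over mu > 0, with a = ||m||^2 >= 0.

    Stationarity gives the cubic mu^3 - c*mu^2 - tau*a/2 = 0; for a > 0 its
    coefficients change sign once, so it has exactly one positive real root.
    """
    if a == 0.0:
        return max(c, 0.0)                 # plain projection onto mu >= 0
    roots = np.roots([1.0, -c, 0.0, -0.5 * tau * a])
    real = roots[np.abs(roots.imag) < 1e-9].real
    return float(real[real > 0].max())

mu_a = mu_prox_cubic(2.0, 0.0, 1.0)        # cubic is mu^3 - 1 = 0
mu_b = mu_prox_cubic(1.0, 2.0, 0.5)        # cubic is mu^3 - 2 mu^2 - 0.25 = 0
```

A closed-form (Cardano) solution could replace the generic root finder; `np.roots` is used here only to keep the sketch short.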

3.2. Numerical Grid

To apply the algorithm, we first define our numerical grid. For simplicity we consider the case where the space of interest is $\Omega=[0,1]^{d}$ and time $\mathcal{T}=[0,1]$ . For the following explanation we consider the case $d=2$ ; the grid construction extends to any dimension in the obvious way. We use the same symbols for the continuous $\mu,m,\Phi,f$ and their respective discretized counterparts, as the difference between the two should be clear from context.

Let $n_{t}$ , $n_{x}$ and $n_{y}$ be given, and denote $\Delta t=\frac{1}{n_{t}}$ , $\Delta x=\frac{1}{n_{x}}$ , and $\Delta y=\frac{1}{n_{y}}$ . Using this notation we define the following sets:

[TABLE]

where $i=0,\dots,n_{x}-1$ , $j=0,\dots,n_{y}-1$ , and $k=0,\dots,n_{t}-1$ unless otherwise specified.

For the discretized problem we consider an $f_{(k)}$ that is constant along each $\mathcal{T}_{(k)}$ , and consider $\mu_{(k,i,j)}$ and $\Phi_{(k,i,j)}$ that are constant along each $\mathcal{T}_{(k)}\times\Omega_{(i,j)}$ . The vector $m_{(k,i,j)}$ has two components $m_{x,(k,i-1/2,j)}$ and $m_{y,(k,i,j-1/2)}$ , that are constant along $\mathcal{T}_{(k)}\times\Omega_{(i-1/2,j)}$ and $\mathcal{T}_{(k)}\times\Omega_{(i,j-1/2)}$ respectively. Numerically $m$ quantifies the movement of density between each of the $\Omega_{(i,j)}$ and its spatial neighbors (i.e. $\Omega_{(i-1,j)},\Omega_{(i,j-1)},\Omega_{(i+1,j)}$ , and $\Omega_{(i,j+1)}$ ) and so it is natural to define the components of $m$ not on $\Omega_{(i,j)}$ but rather on $\Omega_{(i-1/2,j)}$ , $\Omega_{(i+1/2,j)}$ , $\Omega_{(i,j-1/2)}$ and $\Omega_{(i,j+1/2)}$ .
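The reason for placing the flux components on cell faces is that the discrete divergence of a cell then only involves the fluxes through its own faces. The sketch below illustrates this for a single time slice. The array layout (one extra face per direction, so that both boundary faces are stored explicitly) and the function name are assumptions of this example and may differ from the indexing used in the paper.

```python
import numpy as np

def staggered_divergence(mx, my, dx, dy):
    """Cell-centered divergence of a face-centered flux on an n_x-by-n_y grid.

    mx has shape (n_x + 1, n_y): x-fluxes on the faces between x-neighbors.
    my has shape (n_x, n_y + 1): y-fluxes on the faces between y-neighbors.
    """
    return (mx[1:, :] - mx[:-1, :]) / dx + (my[:, 1:] - my[:, :-1]) / dy

n_x = n_y = 4
rng = np.random.default_rng(0)
mx = rng.standard_normal((n_x + 1, n_y))
my = rng.standard_normal((n_x, n_y + 1))
# Zero-flux boundary condition: no mass crosses the boundary faces.
mx[0, :] = mx[-1, :] = 0.0
my[:, 0] = my[:, -1] = 0.0
div = staggered_divergence(mx, my, 1 / n_x, 1 / n_y)
```

With zero flux through the boundary faces the divergence sums to zero over the grid, so in the discrete continuity equation the total mass can only change through the source $f_{(k)}$ .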

Using the above notation, we write the steps of the algorithm as:

[TABLE]

where

[TABLE]

Note that the unusual boundary conditions of $\partial_{t}\Phi$ arise from the need to satisfy

[TABLE]

3.3. Numerical Experiments

Now we present our numerical results. The first two experiments are in one dimension, and the rest are in two. The numerical parameters for our experiments are given in Table 1.

3.4. Experiment 1

Here we consider the problem where $\rho_{0}$ and $\rho_{1}$ are both one dimensional Gaussians of equal integral, $\Omega=[0,1]$ and

[TABLE]

where $\sigma_{0}=\frac{1}{3},\sigma_{1}=\frac{2}{3},\mu_{0}=\mu_{1}=0.1$ . We plot the results in Figure 1. In this case the input densities are balanced and so $W_{2}(\rho_{0},\rho_{1})$ and $UW_{2}(\rho_{0},\rho_{1})$ appear similar. Indeed $UW_{2}(\rho_{0},\rho_{1})=0.055$ and $W_{2}(\rho_{0},\rho_{1})=0.056$ .

Note that even in this simple case the behavior of $f(t)$ is nuanced. In this case, $\rho_{0}$ and $\rho_{1}$ are smooth, of equal integral and $W_{2}(\rho_{0},\rho_{1})$ is given by a simple analytical formula, and $f(t)$ is not identically zero. Integrating Equation 1b in space and time yields $|\Omega|\int_{[0,1]}f(t)dt=\int_{\Omega}\rho_{1}dx-\int_{\Omega}\rho_{0}dx$ , and so for balanced inputs $\int_{[0,1]}f(t)dt=0$ , but experiment 1 shows that $f\not\equiv 0$ .
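The mass-balance identity used in this argument follows from integrating (1b) over $\Omega$ and applying the zero flux condition of Definition 1. The short derivation below is a sketch added for clarity; the surface element $dS$ is notation introduced here and is not used elsewhere in the text.

```latex
\begin{aligned}
\frac{d}{dt}\int_{\Omega}\mu(t,x)\,dx
 &=\int_{\Omega}\Big(f(t)-\nabla\cdot\big(\mu(t,x)v(t,x)\big)\Big)\,dx\\
 &=|\Omega|\,f(t)-\int_{\partial\Omega}\mu(t,x)\,v(t,x)\cdot n(t,x)\,dS
  =|\Omega|\,f(t),\\
\int_{\Omega}\rho_{1}\,dx-\int_{\Omega}\rho_{0}\,dx
 &=|\Omega|\int_{0}^{1}f(t)\,dt .
\end{aligned}
```

For balanced inputs only the time integral of $f$ is forced to vanish. A source such as $f(t)=\sin(2\pi t)$ (a hypothetical example, not taken from the experiments) integrates to zero over $[0,1]$ without being identically zero, so the balanced case does not by itself imply $f\equiv 0$ .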

3.5. Experiment 2

Again consider $\Omega=[0,1]$ . In this experiment we analyse the asymptotic behavior of $UW_{2}(\rho_{0},\rho_{1})$ as a function of $\alpha$ , both as $\alpha\rightarrow 0$ and as $\alpha\rightarrow\infty$ . Here

[TABLE]

The balanced case refers to $UW_{2}(\rho_{0}^{\prime},\rho_{1})$ , and the unbalanced refers to $UW_{2}(\rho_{0},\rho_{1})$ . In both cases we compute the unnormalized Wasserstein distance. The results are given in Figure 2.

Figures 2(a) - 2(c) show that (at least numerically) $UW_{2}(\rho_{0},\rho_{1};\alpha),f(t,\alpha)$ and $\Phi(t,x;\alpha)$ converge both as $\alpha\rightarrow 0^{+}$ and as $\alpha\rightarrow\infty$ when $\int_{\Omega}\rho_{0}dx=\int_{\Omega}\rho_{1}dx$ . Further, it seems plausible that for balanced inputs $UW_{2}(\rho_{0},\rho_{1};\alpha)\rightarrow W_{2}(\rho_{0},\rho_{1})$ as $\alpha\rightarrow 0^{+}$ . For any $\alpha$ the $u,m$ and $\Phi$ from $W_{2}(\rho_{0},\rho_{1})$ along with $f(t)\equiv 0$ satisfy the constraint of Equation 1b. Formally sending $\alpha\rightarrow\infty$ causes $f(t)$ to tend to 0.

Figures 2(d) - 2(f) illustrate the asymptotic behavior of $UW_{2}(\rho_{0},\rho_{1};\alpha)$ w.r.t. $\alpha$ when the inputs are unbalanced. In that case we (numerically) see that as $\alpha\rightarrow 0$ , $f(t;\alpha)$ converges to a non-zero value, and both $UW_{2}(\rho_{0},\rho_{1};\alpha)$ and $\Phi(t,x;\alpha)$ diverge. This too is consistent with the formal argument that $UW_{2}(\rho_{0},\rho_{1};\alpha)\rightarrow W_{2}(\rho_{0},\rho_{1})$ as $\alpha\rightarrow 0^{+}$ .

In a predecessor of this work [4] the authors solve for $W_{2}(\rho_{1},\rho_{2})$ using Lagrange multipliers in a similar formulation to equations (1a), (1b). In their work the Lagrange multiplier $\Phi(t,x)$ is given up to an additive constant. If indeed $UW_{2}(\rho_{0},\rho_{1};\alpha)\rightarrow W_{2}(\rho_{0},\rho_{1})$ as $\alpha\rightarrow 0^{+}$ and $\Phi(t,x;\alpha)$ does converge then $\Phi(t,x;0^{+})$ is given uniquely (as a limit) and there is no issue of undetermined constants.

3.6. Experiment 3

Now consider the two dimensional problem where $\Omega=[0,1]^{2}$ . In this case

[TABLE]

where $C$ is a normalization constant such that $\int_{\Omega}N(x,y;\mu_{1},\mu_{2},\sigma_{1}^{2},\sigma_{2}^{2})dxdy=1$ . The results from our experiments are shown in Figure 3. Note that although the mass of $\rho_{0}$ is twice that of $\rho_{1}$ , the optimal $f(t)$ is not non-positive everywhere. Indeed from $t=0$ to $t\approx\frac{1}{4}$ , $f(t)$ is positive, before staying non-positive for the rest of the interval. This again illustrates that even in the case of Gaussian movement the behavior of $f(t)$ is nuanced, and violates naive intuition.

3.7. Experiment 4

Consider again the two dimensional problem, however this time we choose $\rho_{0}$ and $\rho_{1}$ to be the cats in [17]. Our results are summarized in Figure 4. This illustrates that our new method can be used as a general purpose OT solver for unbalanced inputs, and so can be used to interpolate between two functions.

3.8. $L^{1}$ unnormalized Wasserstein metric

In this subsection we present several numerical results for $UW_{1}$ in Figure 5. In [23] the authors develop the $UW_{1}$ metric (called $\operatornamewithlimits{struc}\left[\cdot\right]$ in that work) and show that it has the desirable property of being insensitive to noise and sensitive to the underlying structure. Numerically, $UW_{1}(\rho_{0},\rho_{1})$ is much easier to compute because the time dimension can be integrated out, so that $f$ is constant, and $\mu$ , $m$ and $\Phi$ have no time-varying component.

4. Discussion

In this paper, we propose and solve an unnormalized optimal transport problem. We show that the proposed distance is well defined, and we obtain the minimizer using the same key Hamilton-Jacobi equation (3). More importantly, computing the $L^{p}$ unnormalized Wasserstein metric has essentially the same computational complexity as the normalized one. In the future, we intend to study these related geometric properties and applications in inverse problems, machine learning and mean field games.

Bibliography

[1] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv:1701.07875 [cs, stat], 2017.
[2] J. Barrett and L. Prigozhin. Partial $L^{1}$ Monge–Kantorovich problem: Variational formulation and numerical approximation. Interfaces and Free Boundaries, pages 201–238, 2009.
[3] J.-D. Benamou. Numerical resolution of an “unbalanced” mass transport problem. ESAIM: Mathematical Modelling and Numerical Analysis, 37(5):851–868, 2003.
[4] J.-D. Benamou and Y. Brenier. A Computational Fluid Mechanics Solution to the Monge-Kantorovich Mass Transfer Problem. Numerische Mathematik, 84(3):375–393, 2000.
[5] L. Caffarelli and R. McCann. Free boundaries in optimal transport and Monge-Ampère obstacle problems. Annals of Mathematics, 171(2):673–730, 2010.
[6] A. Chambolle and T. Pock. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011.
[7] L. Chayes and H. K. Lei. Transport and equilibrium in non-conservative systems. Advances in Differential Equations, 23(1/2):1–64, 2018.
[8] L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. Unbalanced Optimal Transport: Geometry and Kantorovich Formulation. arXiv:1508.05216 [math], 2015.
