TL;DR
This paper extends optimal transport theory to handle unnormalized and unequal masses, introducing new equations and duality formulas that enable efficient computation of distances between unnormalized densities.
Contribution
It develops a novel extension of the Monge-Kantorovich problem for unnormalized masses, including new equations and duality formulas, with efficient solution methods.
Findings
Introduces a new Monge-Ampere type equation.
Develops a Kantorovich duality formula for unnormalized masses.
Provides an efficient computational approach using primal-dual algorithms.
Abstract
We propose an extension of the computational fluid mechanics approach to the Monge-Kantorovich mass transfer problem, which was developed by Benamou-Brenier. Our extension allows optimal transfer of unnormalized and unequal masses. We obtain a one-parameter family of simple modifications of the formulation in [4]. This leads us to a new Monge-Ampere type equation and a new Kantorovich duality formula. These can be solved efficiently by, for example, the Chambolle-Pock primal-dual algorithm. This solution to the extended mass transfer problem gives us a simple metric for computing the distance between two unnormalized densities. The L1 version of this metric was shown in [23] (which is a precursor of our work here) to have desirable properties.
Figure 26
Figure 27| Parameter | Value | Parameter | Value |
|---|---|---|---|
| Discretization | Optimization | ||
| 15 | Iterations | 200,000 | |
| 35 | |||
| 35 | |||
| 100 | |||
Unnormalized Optimal Transport
Wilfrid Gangbo, Wuchen Li, Stanley Osher, and Michael Puthawala
Mathematics department, University of California, Los Angeles
Abstract.
We propose an extension of the computational fluid mechanics approach to the Monge-Kantorovich mass transfer problem, which was developed by Benamou-Brenier in [4]. Our extension allows optimal transfer of unnormalized and unequal masses. We obtain a one-parameter family of simple modifications of the formulation in [4]. This leads us to a new Monge-Ampère type equation and a new Kantorovich duality formula. These can be solved efficiently by, for example, the Chambolle-Pock primal-dual algorithm [6]. This solution to the extended mass transfer problem gives us a simple metric for computing the distance between two unnormalized densities. The $L^1$ version of this metric was shown in [23] (which is a precursor of our work here) to have desirable properties.
Key words and phrases:
Optimal transport; Unnormalized density space; Unnormalized Monge-Ampère equation.
The research is supported by AFOSR MURI FA9550-18-1-0502.
1. Introduction
Optimal transport (OT) plays important roles in inverse problems [10, 27] and machine learning [1, 13, 19]. It provides a particular distance function, called the Wasserstein metric or Earth Mover’s distance, among histograms or density functions [4, 26]. In these traditional settings, it assumes that histograms or densities have the same total mass. In real applications, however, the total masses of the histograms are often unequal. For example, two images being compared generally have different total intensities. This prevents a direct application of classical optimal transport.
In this paper, we formulate simple and natural extensions of optimal transport in unnormalized density space. In short, we add a spatially independent source function into the continuity equation and the cost functional. There are two benefits of the current approach. On the one hand, the changes to the variational problem are simple. They define a robust Wasserstein metric in unnormalized density space and do not significantly change the computational complexity of the problem. The proposed model allows us to apply classical algorithms, such as the Chambolle-Pock primal-dual method [6], to solve it. On the other hand, the proposed problem is natural in that it uses the same key Hamilton-Jacobi equation as the original optimal transport problem. These properties allow us to identify new problems corresponding to the Monge problem and Monge-Ampère equation in unnormalized density space.
There have been various extensions of optimal transport for unnormalized or unbalanced densities [2, 3, 5, 8, 11, 12, 18, 21, 22, 24, 25]. In particular, [8, 9, 18] propose the Wasserstein-Fisher-Rao or Hellinger–Kantorovich metric. (In the literature, the Wasserstein-Fisher-Rao metric is called unbalanced OT; to distinguish our approach from theirs, we call it unnormalized OT.) In their studies, a spatially dependent source function is introduced, which is a ratio involving the density in the spatial domain. In addition, [7] and [20] study other spatially dependent source functions. Here we propose a spatially independent source function which keeps the key Hamilton-Jacobi equation as in the normalized case. This property allows us to design a simple algorithm and to derive a reasonably simple unnormalized Monge-Ampère equation.
The plan of this paper is as follows. In section 2, we propose and study the properties of the unnormalized dynamical optimal transport problem. The unnormalized Monge problem, Monge-Ampère equation and Kantorovich formulations are all derived. In section 3, we present the algorithms and numerical examples for this proposed metric.
2. Unnormalized optimal transport
In this section, we introduce unnormalized OT problems and show that the proposed unnormalized metric is well defined. We then derive minimization procedures for unnormalized optimal transport.
Denote as a bounded convex domain with area . Denote the space of normalized densities by
[TABLE]
Let the space of unnormalized densities be
[TABLE]
We note that . We next define the optimal transport cost between .
Definition 1 (Unnormalized OT).
Define the unnormalized Wasserstein distance by
[TABLE]
Here is the Euclidean norm, , , and the infimum is taken over all continuous unnormalized density functions , and Borel vector fields with zero flux condition on with being the normal vector on the boundary of , and Borel spatially independent source functions .
The newly proposed Wasserstein metric has an attractive physical interpretation. The above optimization problem can be viewed as a variational fluid dynamics problem in Eulerian coordinates. Definition 1 considers the motion, creation and removal of particles. During this process, the total mass changes dynamically in a uniform manner, controlled by the positive parameter and a spatially independent function . We remark that the spatial independence of the source function introduces a very important natural property, which we emphasize again: it uses the same Hamilton-Jacobi equation as in classical optimal transport, which allows us to obtain a new Monge problem, Monge-Ampère equation and Kantorovich duality problem. In addition, this physical analogy follows approaches in [16]. More interestingly, we notice that problem (1) has essentially the same computational complexity as the classical dynamical optimal transport problem. We will present computational details in section 3.
2.1. Unnormalized $L^1$ Wasserstein metric
We first study the unnormalized Wasserstein metric. When , the problem (1a) becomes:
[TABLE]
Denote
[TABLE]
then by Jensen’s inequality, the minimizer is obtained by a time independent solution. In other words,
[TABLE]
By integrating the time variable in the constraint, we observe that
[TABLE]
Denote , by integrating on both time and spatial domain for continuity equation (1b), it is clear that
[TABLE]
We can show that the minimizer path can be attained in the last inequality, by choosing . Thus we derive the following proposition.
Proposition 2.
The unnormalized Wasserstein metric is given by
[TABLE]
In addition, in one space dimension on the interval , the unnormalized Wasserstein metric has the following explicit solution:
[TABLE]
The formulation in proposition 2 has been proposed in [23] for inverse problems. It is one of the prime motivations for this paper. We also note the minimizer satisfies the following form [17]:
[TABLE]
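The one-dimensional case of Proposition 2 can be illustrated numerically. The sketch below is not the paper's formula, which did not survive in this copy. It assumes the time-integrated problem on the unit interval takes the form: minimize the integral of |m| plus |c|/alpha, subject to mu1 - mu0 + m' = c with zero flux m(0) = m(1) = 0, where c is a constant source. Under that assumption c equals the total mass difference and m follows by integrating the constraint. The function name `unnormalized_w1_1d`, the piecewise-constant discretization and the 1/alpha weighting are all illustrative choices.

```python
def unnormalized_w1_1d(mu0, mu1, alpha):
    """Sketch of an unnormalized L1 transport cost on [0, 1].

    Assumed model (not taken verbatim from the paper):
        minimize  int |m| dx + |c| / alpha
        subject to mu1 - mu0 + m' = c,  m(0) = m(1) = 0,
    with mu0, mu1 given as cell averages on a uniform grid.
    """
    n = len(mu0)
    h = 1.0 / n
    # Cumulative mass difference evaluated at the n + 1 cell faces.
    cum = [0.0]
    for a, b in zip(mu0, mu1):
        cum.append(cum[-1] + h * (b - a))
    c = cum[-1]  # constant source = total mass difference, since |Omega| = 1
    # Integrating the constraint from 0 gives m(x) = c * x - cum(x).
    flux = [c * (j * h) - cum[j] for j in range(n + 1)]
    # The flux vanishes at both boundary faces, so sum the interior faces only.
    transport = h * sum(abs(m) for m in flux[1:-1])
    return transport + abs(c) / alpha


# Balanced example: mass 1 moved from the left half to the right half.
left = [2.0] * 50 + [0.0] * 50
right = [0.0] * 50 + [2.0] * 50
balanced_cost = unnormalized_w1_1d(left, right, alpha=1.0)

# Unbalanced example: uniform densities of mass 1 and 2, so only the source acts.
source_only_cost = unnormalized_w1_1d([1.0] * 100, [2.0] * 100, alpha=2.0)
```

In the balanced example the source vanishes and the value reduces to the classical one-dimensional transport cost. In the second example no transport is needed and the whole cost comes from the source term.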
2.2. Unnormalized $L^2$ Wasserstein metric
We next present the result when . Similar derivations can also be established for . For simplicity of presentation, we now assume .
Proposition 3.
The unnormalized Wasserstein metric (1) is a well-defined metric function in . In addition, the minimizer for problem (1) satisfies
[TABLE]
and
[TABLE]
In particular, if , then
[TABLE]
Remark: We note that equation (2) implies
[TABLE]
This means that unlike the classical OT, we are not only solving for the unique , but also for the unique .
Proof.
Denote and
[TABLE]
then variational problem (1) can be reformulated as
[TABLE]
It is clear that (4) is the reformulation of (1). We first prove that the variational problem (4) is well defined. In other words, there exists a feasible path for the dynamical constraint. We construct a feasible path connecting any , . The proof is divided into three steps.
Step 1. Construct a density path , there exists a feasible path connecting and a uniform measure with total mass . In this case, the density path is a normalized (classical) OT between two densities. We set when , there always exists such a path.
Step 2. Construct a density path , there exists a feasible path connecting a uniform measure with total mass and a uniform measure with total mass . In this case, we let the transport flux , and choose .
Step 3. Construct a density path , there exists a feasible path connecting a uniform measure with total mass and . In this case, we set . Following the classical OT, we find a feasible path.
Combining steps 1, 2 and 3, the proposed path is feasible with finite cost functional. We next show that the problem has a minimizer. The constraint set is nonempty, the cost functional is convex and lower semicontinuous by classical arguments, and the constraint is linear. So the variational problem (2) has a minimizer.
We next apply a Lagrange multiplier to find the minimizer. Denote as the multiplier with
[TABLE]
Assuming , , , we derive the property of minimizer as follows:
[TABLE]
Here if , we obtain , which gives equality in the second formula of the above system. Using the fact , we prove the result. In this case, the non-negativity, symmetry, and triangle inequality of the metric follow directly from the definition. ∎
We next derive our new Monge problem for unnormalized OT. This approach uses the Lagrange coordinates arising in problem (1).
Proposition 4 (Unnormalized Monge problem).
[TABLE]
Proof.
We now derive the Lagrange formulation of the unnormalized OT (1). Consider any mapping function with vector field , i.e.
[TABLE]
Then
[TABLE]
We next derive the differential equation for $J(t,x) := \mu(t, X_t(x))\,\mathrm{Det}\big(\nabla X_t(x)\big)$. Later on, we use the notation and . Since
[TABLE]
where the third equality is derived by the Jacobi identity, i.e.
[TABLE]
and the last equality holds following our proposed continuity equation with spatially independent source function (1b).
Notice
[TABLE]
Since and , then and
[TABLE]
Since the minimizer in Eulerian coordinates satisfies the Hamilton-Jacobi equation in (3):
[TABLE]
and , then we naturally have . This implies
[TABLE]
thus and $\mathrm{Det}\big(\nabla X_t(x)\big) = \mathrm{Det}\big((1-t)\mathbb{I} + t\nabla M(x)\big)$.
Substituting all the above relations into (6):
[TABLE]
Thus we prove the results. ∎
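The determinant identity used near the end of the proof, Det(∇X_t) = Det((1-t)I + t∇M), can be checked by hand in a simple case. The sketch below assumes a hypothetical linear map M(x) = Bx in two dimensions, so that ∇M = B is constant. The helper name `det_along_path` and the matrix B are illustrative and do not come from the paper.

```python
def det_along_path(grad_map, t):
    """Determinant of (1 - t) * I + t * grad_map for a 2x2 Jacobian grad_map."""
    a = (1.0 - t) + t * grad_map[0][0]
    b = t * grad_map[0][1]
    c = t * grad_map[1][0]
    d = (1.0 - t) + t * grad_map[1][1]
    return a * d - b * c


# Hypothetical Jacobian of a linear map M(x) = B x (an assumption for illustration).
B = [[2.0, 0.0], [0.0, 3.0]]

det_start = det_along_path(B, 0.0)  # identity map at t = 0
det_mid = det_along_path(B, 0.5)    # halfway along the interpolation
det_end = det_along_path(B, 1.0)    # full map, equals Det(B)
```

For this diagonal B the determinant is (1 + t)(1 + 2t), which grows from 1 at t = 0 to Det(B) = 6 at t = 1.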
We next find the relation between the spatially independent source function and the mapping function . For simplicity of presentation, we assume periodic boundary conditions on .
Proposition 5 (Unnormalized Monge-Ampère equation).
The optimal mapping function satisfies the following unnormalized Monge-Ampère equation
[TABLE]
Proof.
Let us rewrite the minimizer (2) into a time independent formulation. From the Hopf-Lax formula for the Hamilton-Jacobi equation,
[TABLE]
Thus . We further denote , then . From , then
[TABLE]
and
[TABLE]
From (2) and the above two formulas, then
[TABLE]
Substituting formula and into (5b), we derive the result. ∎
We now present the Kantorovich duality formulation of the problem (1).
Proposition 6 (Unnormalized Kantorovich formulation).
[TABLE]
Proof.
As in [14, 15], we derive the duality formula by integration by parts as follows. Notice the fact that
[TABLE]
We have shown that the minimizer over is obtained at , and . The last equality holds because , thus .
We next show that the primal-dual gap is zero. From proposition 3, the minimizer satisfies (2). Thus
[TABLE]
This concludes the proof. ∎
3. The numerical method
In this section, we propose to apply a primal-dual algorithm to solve unnormalized OT numerically. We then provide several numerical examples to demonstrate the effectiveness of this procedure.
3.1. Algorithm
We present a primal-dual algorithm for problem (1). In particular, our method is based on its reformulation (4), named the minimal flux problem. Define the Lagrangian of (4):
[TABLE]
where is the Lagrange multiplier of the unnormalized continuity equation (1b).
Convex analysis shows that is a solution to (4) if and only if there is a such that is a saddle point of . In other words, we can compute minimization (4) by solving the following minimax problem
[TABLE]
It is clear that is convex in , , and concave in , and the interaction term is a linear operator. This property allows us to apply the Chambolle-Pock first order primal-dual algorithm [6], which gives the update as follows.
[TABLE]
where , are given step sizes for primal, dual variables. These steps can be interpreted as a gradient descent in the primal variable and a gradient ascent in the dual variable .
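The structure of such an update (a proximal descent step in the primal variable, an extrapolation, and an ascent step in the multiplier) can be seen on a much smaller problem. The sketch below is a toy stand-in, not the paper's update (8): it minimizes 0.5*||x||^2 subject to Ax = b through the saddle problem 0.5*||x||^2 + <phi, Ax - b>. The names `primal_dual`, `tau` and `sigma`, the matrix A and the vector b are all illustrative assumptions. The step sizes are chosen so that tau * sigma * ||A||^2 < 1, the usual condition for this scheme.

```python
def matvec(A, x):
    """Dense matrix-vector product A x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def rmatvec(A, y):
    """Dense transposed product A^T y."""
    return [sum(A[i][j] * y[i] for i in range(len(A))) for j in range(len(A[0]))]


def primal_dual(A, b, tau, sigma, iters):
    """First-order primal-dual iteration for min 0.5*||x||^2 s.t. A x = b."""
    x = [0.0] * len(A[0])
    phi = [0.0] * len(A)
    for _ in range(iters):
        # Primal step: closed-form minimizer of
        # 0.5*||x||^2 + <phi, A x> + ||x - x_k||^2 / (2 * tau).
        grad = rmatvec(A, phi)
        x_new = [(xi - tau * g) / (1.0 + tau) for xi, g in zip(x, grad)]
        # Extrapolation of the primal variable.
        x_bar = [2.0 * xn - xo for xn, xo in zip(x_new, x)]
        # Dual step: gradient ascent on the constraint residual.
        residual = matvec(A, x_bar)
        phi = [p + sigma * (r - bi) for p, r, bi in zip(phi, residual, b)]
        x = x_new
    return x, phi


A = [[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]]  # largest eigenvalue of A A^T is 3
b = [1.0, 2.0]
x_opt, phi_opt = primal_dual(A, b, tau=0.5, sigma=0.5, iters=500)
```

For this A and b the minimum-norm solution is x = (0, 1, 1), which can be checked by hand from x = A^T (A A^T)^{-1} b. In the transport problem the primal block would hold the flux, density and source, and the constraint would be the discretized continuity equation.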
It turns out that the optimizations in the above update (8) have explicit formulas. The first line becomes
[TABLE]
The second line of (8) simplifies to
[TABLE]
The above problem has an analytical solution by solving a cubic equation. The third line of (8) gives
[TABLE]
The fourth line of (8) gives
[TABLE]
Combining all above formulas, we are now ready to state the algorithm.
[TABLE]
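The cubic equation mentioned for the second line of (8) can be illustrated with a pointwise sketch. The exact objective did not survive in this copy, so the form below is an assumption: at each grid point, minimize q/mu + c*mu + (mu - mu_prev)^2 / (2*tau) over mu > 0, where q >= 0 stands for a squared-flux term and c for the coefficient multiplying mu. Setting the derivative to zero and multiplying by tau*mu^2 gives mu^3 + (tau*c - mu_prev)*mu^2 - tau*q = 0, which has exactly one positive root when q > 0. The function name `prox_density` and the use of bisection instead of a closed-form cubic formula are illustrative choices.

```python
def prox_density(q, c, mu_prev, tau, tol=1e-12):
    """Positive root of mu^3 + (tau*c - mu_prev)*mu^2 - tau*q = 0.

    Assumed pointwise objective (an illustration, not the paper's formula):
        q / mu + c * mu + (mu - mu_prev)**2 / (2 * tau),  with q >= 0.
    """
    a = tau * c - mu_prev
    b = tau * q
    if b <= 0.0:
        # Without the q / mu term the problem is a projected quadratic.
        return max(-a, 0.0)
    # The cubic is negative at 0 and non-negative at |a| + b**(1/3).
    lo, hi = 0.0, abs(a) + b ** (1.0 / 3.0)
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if mid ** 3 + a * mid ** 2 - b > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


root_a = prox_density(q=4.0, c=0.0, mu_prev=1.0, tau=1.0)  # mu^3 - mu^2 - 4 = 0
root_b = prox_density(q=2.0, c=2.0, mu_prev=1.0, tau=1.0)  # mu^3 + mu^2 - 2 = 0
root_c = prox_density(q=0.0, c=1.0, mu_prev=3.0, tau=1.0)  # no q / mu term
```

Bisection is used here only to keep the sketch short and robust. A closed-form cubic solver would serve the same purpose.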
3.2. Numerical Grid
To apply the algorithm, we first define our numerical grid. For simplicity we consider the case where the space of interest is and time . Further, for the following explanations we consider the problem when ; however, the grid construction extends to any dimension in the obvious way. We will use the same symbols to represent both the continuous quantities and their discretized counterparts, as the difference between the two should be clear from context.
Let , and be given, and denote , , and . Using this notation we define the following sets:
[TABLE]
where , , and unless otherwise specified.
For the discretized problem we consider a that is constant along each , and consider and that are constant along each . The vector has two components and , that are constant along and respectively. Numerically quantifies the movement of density between each of the and its spatial neighbors (i.e. , and ) and so it is natural to define the components of not on but rather on , , and .
Using the above notation, we write the steps of the algorithm as:
[TABLE]
where
[TABLE]
Note that the unusual boundary conditions of arise from the need to satisfy
[TABLE]
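The idea of storing the flux on cell faces rather than cell centers can be sketched in one space dimension. In the sketch below the flux lives on the n + 1 faces of n cells and the two boundary faces carry zero flux. The discrete divergence then sums to zero, so the flux term neither creates nor removes mass. The function name `divergence_1d` and the sample values are illustrative assumptions, not the paper's discretization.

```python
def divergence_1d(flux, dx):
    """Cell-centered divergence of a face-centered flux.

    flux has one entry per face, so len(flux) == n_cells + 1.
    """
    return [(flux[i + 1] - flux[i]) / dx for i in range(len(flux) - 1)]


n_cells = 5
dx = 1.0 / n_cells
# Zero-flux boundary: the first and last faces hold 0 (values are made up).
flux = [0.0, 0.3, -0.1, 0.4, 0.2, 0.0]

div = divergence_1d(flux, dx)
total_mass_change = dx * sum(div)  # telescopes to flux[-1] - flux[0]
```

Because the sum telescopes, only the source term can change the total mass on such a grid.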
3.3. Numerical Experiments
Now we present our numerical results. The first two experiments are in one dimension, and the rest are in two. The numerical parameters for our experiments are given in Table 1.
3.4. Experiment 1
Here we consider the problem where and are both one dimensional Gaussians of equal integral, and
[TABLE]
where . We plot the results in Figure 1. In this case the input densities are balanced and so and appear similar. Indeed and .
Note that even in this simple case the behavior of is nuanced. In this case, and are smooth, of equal integral and is given by a simple analytical formula, and is not identically zero. Integrating Equation 1b in space and time yields , and so for balanced inputs , but experiment 1 shows that .
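The mass balance behind this remark can be written out. The derivation below assumes that the continuity equation (1b) has the form $\partial_t \mu + \nabla\cdot(\mu v) = f(t)$ with zero flux $\mu v \cdot n = 0$ on $\partial\Omega$, as described in Definition 1. The symbols $\mu_0$, $\mu_1$ denote the initial and terminal densities.

```latex
\begin{align*}
\int_0^1\!\!\int_\Omega \partial_t \mu \,dx\,dt
  + \int_0^1\!\!\int_\Omega \nabla\cdot(\mu v)\,dx\,dt
  &= \int_0^1\!\!\int_\Omega f(t)\,dx\,dt, \\
\int_\Omega \mu_1\,dx - \int_\Omega \mu_0\,dx + 0
  &= |\Omega| \int_0^1 f(t)\,dt .
\end{align*}
```

The divergence term vanishes by the divergence theorem and the zero-flux condition. For inputs of equal mass this only forces the time integral of $f$ to vanish, which is compatible with an $f$ that changes sign in time.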
3.5. Experiment 2
Again consider , however in this experiment we analyse the asymptotic behavior of as a function of and and . Here
[TABLE]
The balanced case refers to , and the unbalanced refers to . In both cases we compute the unnormalized Wasserstein distance. The results are given in Figure 2.
Figures 2(a) - 2(c) show that (at least numerically) and converge as , when . Further, it seems plausible that for balanced inputs as . For any the and from along with satisfy the constraint of Equation 1b. Formally sending causes to 0.
Figures 2(d) - 2(f) illustrate the asymptotic behavior of w.r.t. when the inputs are unbalanced. In that case we (numerically) see that as , converges to a non-zero value, and both and diverge. This too is consistent with the formal argument that as .
In a predecessor of this work [4] the authors solve for using Lagrange multipliers in a similar formulation to equations (1a), (1b). In their work the Lagrange multiplier is given up to an additive constant. If indeed as and does converge then is given uniquely (as a limit) and there is no issue of undetermined constants.
3.6. Experiment 3
Now consider the two dimensional problem where . In this case
[TABLE]
where is a normalization constant such that . The results from our experiments are shown in Figure 3. Note that although the mass of is twice that of , the optimal is not non-positive. Indeed from to , is positive, before staying non-positive for the rest of the interval. This again illustrates that even in the case of Gaussian movement the behavior of is nuanced and defies naive intuition.
3.7. Experiment 4
Consider again the two dimensional problem, however this time we choose and to be the cats in [17]. Our results are summarized in Figure 4. This illustrates that our new method can be used as a general purpose OT solver for unbalanced inputs, and so can be used to interpolate between two functions.
3.8. Unnormalized $L^1$ Wasserstein metric
In this subsection, we also present several numerical results for in Figure 5. In [23] the authors develop the metric (called the in that work) and show that it has the desirable property that is insensitive to noise and sensitive to the underlying structure. Numerically is much easier to compute as the time dimension can be integrated out, so that is constant, and , and have no time-varying component.
4. Discussion
In this paper, we propose and solve an unnormalized optimal transport problem. We show that the proposed distance is well defined, and we obtain the minimizer using the same key Hamilton-Jacobi equation (3). More importantly, computing the unnormalized Wasserstein metric has essentially the same computational complexity as the normalized one. In the future, we intend to study these related geometric properties and applications in inverse problems, machine learning and mean field games.
References
- [1] M. Arjovsky, S. Chintala, and L. Bottou. Wasserstein GAN. arXiv:1701.07875 [cs, stat], 2017.
- [2] J. Barrett and L. Prigozhin. Partial L1 Monge–Kantorovich problem: Variational formulation and numerical approximation. Interfaces and Free Boundaries, pages 201–238, 2009.
- [3] J.-D. Benamou. Numerical resolution of an “unbalanced” mass transport problem. ESAIM: Mathematical Modelling and Numerical Analysis, 37(5):851–868, 2003.
- [4] J.-D. Benamou and Y. Brenier. A Computational Fluid Mechanics Solution to the Monge-Kantorovich Mass Transfer Problem. Numerische Mathematik, 84(3):375–393, 2000.
- [5] L. Caffarelli and R. McCann. Free boundaries in optimal transport and Monge-Ampère obstacle problems. Annals of Mathematics, 171(2):673–730, 2010.
- [6] A. Chambolle and T. Pock. A First-Order Primal-Dual Algorithm for Convex Problems with Applications to Imaging. Journal of Mathematical Imaging and Vision, 40(1):120–145, 2011.
- [7] L. Chayes and H. K. Lei. Transport and equilibrium in non-conservative systems. Advances in Differential Equations, 23(1/2):1–64, 2018.
- [8] L. Chizat, G. Peyré, B. Schmitzer, and F.-X. Vialard. Unbalanced Optimal Transport: Geometry and Kantorovich Formulation. arXiv:1508.05216 [math], 2015.
