A variational approach to regularity theory in optimal transportation
Michael Goldman (LJLL)

Abstract
This paper describes recent results obtained in collaboration with M. Huesmann and F. Otto on the regularity of optimal transport maps. The main result is a quantitative version of the well-known fact that the linearization of the Monge-Ampère equation around the identity is the Poisson equation. We present two applications of this result. The first one is a variational proof of the partial regularity theorem of Figalli and Kim and the second is the rigorous validation of some predictions made by Caracciolo et al. on the structure of the optimal transport maps in matching problems.
M. Goldman Université de Paris, CNRS, Sorbonne-Université, Laboratoire Jacques-Louis Lions (LJLL), F-75005 Paris, France, [email protected]
1 Introduction
Following Caffarelli’s groundbreaking papers [9, 8], the classical approach to regularity theory for solutions of the optimal transport problem goes through maximum principle arguments and the construction of barriers (see the review paper [13]). The aim of this note is to describe a recent alternative approach, more variational in nature and based on the fact that the linearization of the Monge-Ampère equation around the identity is the Poisson equation (see [27]). Our main achievement in this direction is a harmonic approximation result which says that if at a given scale the transport plan is close to the identity and if at the same scale both the starting and target measures are close (in the Wasserstein metric) to being constant, then on a slightly smaller scale, the transport plan is actually extremely close to a harmonic gradient field. As in De Giorgi’s approach to the regularity theory for minimal surfaces (see [24]), this allows us to transfer the good regularity properties of harmonic functions to the transport plan and obtain an “excess improvement by tilting” estimate. This may be used to propagate information from the macroscopic scale down to the microscopic scale through a Campanato iteration.
We give two applications of this result. The first one is a new proof of the partial regularity result of Figalli and Kim [15] (see also [14]). The second one is a validation up to the microscopic scale of the prediction by Caracciolo et al. [11] that for the optimal matching problem between a Poisson point process and the Lebesgue measure, the optimal transport plan is well approximated by the gradient of the solution to the corresponding Poisson equation with very high probability.
The plan of this note is the following. In Section 2 we recall some standard results on optimal transportation. The harmonic approximation theorem is stated together with a sketch of proof in Section 3. We then describe the application to the partial regularity result in Section 4 and to the optimal matching problem in Section 5.
2 The optimal transport problem
Optimal transportation is nowadays a very broad and active field. We give here only a very basic and short introduction to the topic and refer the reader to the monographs [25, 27] for many more details. For and two positive measures on with the optimal transport problem (in its Lagrangian formulation) is
[TABLE]
where for a coupling on , (respectively ) denotes the first (respectively the second) marginal of . Under very mild assumptions on and (for instance compact supports), an optimal transference plan exists (see [27]). The optimality conditions are as follows:
Theorem 2.1.
Let be a coupling between and .
- (i) (Knott-Smith) It is optimal if and only if there exists a convex and lower-semicontinuous function (also called the Kantorovich potential) such that .
- (ii) (Brenier) Moreover, if does not give mass to Lebesgue negligible sets, then there exists a unique , gradient of a convex function, with and . In this case we let be the optimal transport map.
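In one dimension, the gradient of a convex function is simply a nondecreasing map, so the optimality conditions above reduce to monotonicity of the matching. The following minimal sketch illustrates this for two small point clouds of equal size with the quadratic cost. The discrete setting, the function names and the brute-force comparison are illustrative assumptions, not part of the note.

```python
from itertools import permutations
import random


def quadratic_cost(xs, ys):
    """Total squared displacement when xs[i] is sent to ys[i]."""
    return sum((x - y) ** 2 for x, y in zip(xs, ys))


def monotone_matching(xs, ys):
    """Pair the k-th smallest source point with the k-th smallest target point."""
    return sorted(xs), sorted(ys)


def brute_force_cost(xs, ys):
    """Minimal quadratic cost over all permutations (only feasible for small n)."""
    return min(quadratic_cost(xs, perm) for perm in permutations(ys))


# Hypothetical test data: two clouds of 6 points on the unit interval.
rng = random.Random(0)
xs = [rng.uniform(0.0, 1.0) for _ in range(6)]
ys = [rng.uniform(0.0, 1.0) for _ in range(6)]

sorted_xs, sorted_ys = monotone_matching(xs, ys)
monotone_cost = quadratic_cost(sorted_xs, sorted_ys)
optimal_cost = brute_force_cost(xs, ys)
```

Here `monotone_cost` and `optimal_cost` agree up to rounding, which is the discrete one-dimensional counterpart of the statement that an optimal coupling is supported on the graph of a monotone map.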
Let us point out that assuming that is regular and that both and are smooth densities, the condition is nothing else than the Monge-Ampère equation
[TABLE]
In particular, if both and are close to (the same) constant density, then the Monge-Ampère equation linearizes to the Poisson equation (see [27, Ex. 4.1])
[TABLE]
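The linearization can be made explicit by a formal expansion. The sketch below uses notation that is assumed here for illustration: \mu and \lambda for the source and target densities, \psi for the convex potential, and the ansatz \psi(x) = |x|^2/2 + \varphi(x) with \varphi small. It is a heuristic computation, not a rigorous statement.

```latex
% Assumed notation (labels chosen for this sketch):
%   \mu, \lambda : smooth source/target densities, both close to 1,
%   \psi         : convex potential with \nabla\psi_\# \mu = \lambda,
%   \varphi      : small perturbation, \psi(x) = |x|^2/2 + \varphi(x).
\begin{align*}
\det \nabla^2 \psi &= \frac{\mu}{\lambda \circ \nabla \psi}
  && \text{(Monge-Amp\`ere)} \\
\det\bigl(\mathrm{Id} + \nabla^2 \varphi\bigr)
  &= 1 + \Delta \varphi + O\bigl(|\nabla^2 \varphi|^2\bigr)
  && \text{(expansion of the determinant)} \\
\frac{\mu}{\lambda \circ \nabla\psi}
  &= 1 + (\mu - 1) - (\lambda - 1) + \text{higher-order terms}
  && \text{(for } \mu, \lambda \text{ close to } 1\text{)} \\
\Delta \varphi &\approx \mu - \lambda
  && \text{(Poisson equation, to leading order)}
\end{align*}
```

Comparing the second and third lines and discarding the higher-order terms gives the Poisson equation in the last line.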
We will also use the Eulerian formulation of the optimal transport problem.
Theorem 2.2 (Benamou-Brenier).
There holds
[TABLE]
Moreover, if is an optimal transport plan for (2.1), then the density-flux pair defined for by its action on test functions as
[TABLE]
is a minimizer of (2.3).
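For orientation, the Benamou-Brenier formulation and the density-flux pair induced by a coupling are usually written as follows. The symbols (\rho_t, j_t) for the density-flux pair, \pi for the coupling, and \zeta, \xi for the test functions are assumptions made for this sketch; the exact normalization used in the note may differ.

```latex
% Assumed notation: (\rho_t, j_t) density-flux pair, \pi optimal coupling of \mu and \lambda,
% \zeta scalar test function, \xi vector-valued test function.
\begin{align*}
W^2(\mu,\lambda) &= \min_{(\rho,j)} \Bigl\{ \int_0^1\!\!\int \frac{1}{\rho}|j|^2 \;:\;
   \partial_t \rho + \nabla\cdot j = 0,\ \rho_0 = \mu,\ \rho_1 = \lambda \Bigr\}, \\
\int \zeta \, d\rho_t &= \int \zeta\bigl((1-t)x + t y\bigr)\, d\pi(x,y), \\
\int \xi \cdot dj_t &= \int \xi\bigl((1-t)x + t y\bigr)\cdot (y - x)\, d\pi(x,y).
\end{align*}
```

As a sanity check, for a single unit mass moving from x to y one has \rho_t = \delta_{(1-t)x+ty} and j_t = (y-x)\rho_t, so the Eulerian energy equals |y-x|^2, which is the Lagrangian transport cost.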
Let us introduce some further notation. If is a minimizer of (2.3), we define the density-flux pair obtained by integrating in time (for instance ). For and a positive measure on , we denote by , the Wasserstein distance between the restriction of to the ball and the corresponding constant density .
In order to obtain a local version of the equivalence between (2.1) and (2.3), we will need an bound on the displacement (see [20, 19]).
Lemma 2.3.
Let be a coupling between two measures and . Assume that is monotone [footnote i: meaning that for every and in , ] and that for some [footnote ii: we use the short-hand notation to indicate that there exists depending only on the dimension such that ; similarly, means that there exists a constant depending on the dimension such that ] , where
[TABLE]
and
[TABLE]
Then, for every
[TABLE]
3 The harmonic approximation theorem
We now state the harmonic approximation theorem. By scaling invariance, it is enough to state it at the unit scale . For , two positive measures and an optimal coupling between them, we define the “excess” energy as in (2.5) and the distance to the data as in (2.6).
Theorem 3.1.
([19, Th. 1.4]) For every , there exist and such that provided , there exists a radius such that if is a solution of ( denotes here the external normal to )
[TABLE]
where is the generic constant for which this equation is solvable, then
[TABLE]
The proof of Theorem 3.1 is actually performed at the Eulerian level. Thanks to Lemma 2.3, it is indeed enough to prove:
Theorem 3.2.
For every , there exist and such that provided , there exists a radius such that if solves (3.1), then
[TABLE]
To simplify the discussion a bit, we will assume from now on that , so that .
The proof of Theorem 3.2 is based on three ingredients. The first of them is the choice of a ’good’ radius . Indeed, as will become apparent in the discussion below, we need a control on various quantities and this seems to be possible only for generic radii. The second ingredient is an almost orthogonality property. The last one is the construction of a competitor for (2.3).
We define the measure on and then let . Before discussing the almost orthogonality property and the construction, let us point out that for our estimates we would need to control the Dirichlet energy by . Since by elliptic regularity,
[TABLE]
this is only possible if is controlled in (or at least in ). In order to solve this issue, (3.3) is first proven with instead of where solves
[TABLE]
where is a regularized version of in the sense that [footnote iii: for a measure , we denote by its positive/negative part]
[TABLE]
The density is obtained by projection on , using the fact that for ’good’ radii, thanks to (2.7), the number of particles crossing is controlled by . We will however forget here about this difficulty and assume that we may choose (and thus ). In particular, in view of (3.5), we will assume that we have the bound
[TABLE]
We may now state the almost-orthogonality property:
Lemma 3.3.
(Orthogonality) For every , there exist and such that if ,
[TABLE]
Sketch of proof.
Expanding the squares we have
[TABLE]
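The algebra behind such an expansion can be checked pointwise. The identity below is a sketch in assumed notation (\rho > 0 a density, j a flux, g a vector field playing the role of the harmonic gradient); it is an elementary identity and is not claimed to reproduce the displayed formula of the note term by term.

```latex
% Pointwise identity for a scalar \rho > 0 and vectors j, g (assumed notation).
\begin{align*}
\frac{1}{\rho}\,|j - \rho g|^2
  &= \frac{1}{\rho}|j|^2 - 2\, j\cdot g + \rho\,|g|^2 \\
  &= \frac{1}{\rho}|j|^2 - |g|^2
     + 2\, g\cdot (g - j) + (\rho - 1)\,|g|^2 .
\end{align*}
```

After integration, the first two terms give the difference of the two energies, while the last two terms are of the form of error terms: one pairs g with the difference g - j, the other is nonpositive as soon as \rho \le 1.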
Let us estimate the two error terms. Using integration by parts we have (assuming without loss of generality that )
[TABLE]
Forgetting higher order terms (and assuming that ), we have (recall that the Wasserstein distance is homogeneous to the norm)
[TABLE]
Regarding the second term, in the case when for some set , we may argue as in [20, Lem. 3.2] and obtain that by McCann’s displacement convexity, and thus . For generic measures the argument is more subtle and requires a combination of elliptic estimates for (a regularized version of) together with the bound
[TABLE]
which holds for ’good’ radii.
∎
As explained above, the last ingredient is the construction of a competitor:
Lemma 3.4.
For every , there exist and such that if , there exists a density-flux pair such that
[TABLE]
and
[TABLE]
Sketch of proof.
We may assume for simplicity that also in . Indeed, otherwise we can connect in the time interval , the measure (in ) to the constant density at a cost of order .
Let be a small parameter to be chosen later on. We make the construction separately in the bulk and in the boundary layer and set
[TABLE]
and require that , in , in and on , so that (3.9) is satisfied. The existence of an admissible pair satisfying the energy bound
[TABLE]
as long as is obtained arguing by duality, in the same spirit as [2, Lem.3.3] (see [20, Lem. 2.4]).
We may now estimate
[TABLE]
where we used that by elliptic regularity, . Choosing to be a large multiple of yields
[TABLE]
which concludes the proof of (3.10) since and . ∎
Proof of Theorem 3.2.
By (local) minimality of , we have so that combining (3.7) and (3.10) together gives the desired estimate (3.3). ∎
4 Application to partial regularity
We now turn to applications of Theorem 3.1 and start with a partial regularity result. Here we are interested in the behavior at small scales.
Let us first recall the main regularity result for optimal transport maps due to Caffarelli [8, 9].
Theorem 4.1.
If and have compact supports, are absolutely continuous with respect to the Lebesgue measure with densities bounded from above and below on their support and if is convex, then the optimal transport map from to is .
The hypothesis that is convex is not merely technical. Indeed, considering for instance the optimal transport map between one ball and two disjoint balls, it is easy to construct examples where the optimal transport map is discontinuous. However, building on the ideas of Caffarelli to prove Theorem 4.1, Figalli and Kim proved in [15] that even without the convexity assumption on , the singular set of cannot be too big (see also [14] for a generalization to arbitrary non-degenerate cost functions).
Theorem 4.2.
Let and be probability measures with compact supports, both absolutely continuous with respect to the Lebesgue measure with densities bounded from above and below on their support. Then, there exist open sets and with and such that the optimal transport map from to is a homeomorphism between and .
Let us point out that it is actually conjectured that the singular set is much smaller and has the same structure as the singular set of gradients of convex functions i.e. that it is -rectifiable (see [22] for a result in this direction).
A first application of Theorem 3.1 is a new proof of Theorem 4.2 (under the additional hypothesis that and are Hölder continuous). For the sake of simplicity, we will assume from now on that and for some bounded open sets (so that in particular with the notation of Section 3, ). As in [14], we derive Theorem 4.2 combining Alexandrov’s Theorem (see [27]), which states that is differentiable a.e., with an ε-regularity theorem.
Theorem 4.3.
([20, Th.1.2]) Let be the optimal transport map from to . For every , there exists such that if is such that and
[TABLE]
then .
By scaling invariance, we may assume that . As already alluded to, the proof goes through a Campanato iteration. Indeed, by Campanato’s characterization of spaces (see [10]), it is enough to prove that for every ,
[TABLE]
Defining
[TABLE]
this is in turn obtained by using iteratively the following proposition.
Proposition 4.4.
For every , there exist and such that if and , there exist a symmetric matrix with and a vector such that letting ,
[TABLE]
and is the optimal transport map between and .
Sketch of proof.
By scaling we may assume that .
Let be fixed. Applying Theorem 3.1, we find the existence of a function which is harmonic in (under our assumptions in (3.1)) and such that (since )
[TABLE]
[TABLE]
We then define and where . Since is harmonic and thus . Notice that if for some convex function (by Theorem 2.1), then with , which is also a convex function. Therefore is the optimal transport map between and . We may now estimate
[TABLE]
This concludes the proof since we chose and since . ∎
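To see why a one-step improvement of this type yields a power-law decay when iterated, consider the following scalar model. It is a simplified sketch: f(r) stands in for the excess at scale r, the exponents \alpha < \beta and the constant C are hypothetical, and the affine changes of variables performed at each step of the actual iteration are ignored.

```latex
% Scalar model (hypothetical): f(r) plays the role of the excess at scale r.
% Assumption: for some \theta \in (0,1), 0 < \alpha < \beta and C > 0,
%   f(\theta r) \le \theta^{2\beta} f(r) + C r^{2\alpha}  for all r \in (0,1].
\begin{align*}
f_k &:= f(\theta^k), \qquad
A := \max\Bigl\{ f_0,\ \frac{C}{\theta^{2\alpha} - \theta^{2\beta}} \Bigr\}, \\
f_{k+1} &\le \theta^{2\beta} f_k + C\,\theta^{2\alpha k}
  \le \bigl(A\,\theta^{2\beta} + C\bigr)\theta^{2\alpha k}
  \le A\,\theta^{2\alpha (k+1)},
\end{align*}
% so f(\theta^k) \le A\,\theta^{2\alpha k} for every k \ge 0 by induction.
```

The last inequality uses C \le A(\theta^{2\alpha} - \theta^{2\beta}), which holds by the choice of A since \theta^{2\alpha} > \theta^{2\beta}. A decay of this form at every scale is exactly what Campanato's characterization turns into Hölder regularity.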
5 Application to the optimal matching problem
We now present an application to the optimal matching problem. As opposed to the previous section, we are interested here in large scales.
Over the last thirty years, optimal matching problems have been the subject of intensive work. We refer for instance to the monograph [26]. One of the simplest examples is the problem of matching the empirical measure of a Poisson point process to the corresponding Lebesgue measure. More specifically, we consider for a Poisson point process on the torus i.e.
[TABLE]
with iid random variables uniformly distributed in and a random variable with Poisson distribution with parameter . The problem is to estimate the random variable
[TABLE]
where indicates the Wasserstein distance on the torus , and to understand the structure of the corresponding optimal transport plans. It is well-known since [1] that [footnote iv: we use the notation for the natural logarithm]
[TABLE]
and thus is a critical dimension. Recently, Caracciolo et al. used the ansatz that the optimal displacement should be well approximated by , where solves the Poisson equation (recall (2.2))
[TABLE]
to make numerous predictions about the optimal prefactor in (5.1) as well as the correlations (see [11, 12]). At the macroscopic scale, this ansatz has been partially rigorously justified by Ambrosio et al. (see [4, 3] and also [23] for a result about the fluctuations) in dimension . To state their result [footnote v: the results of [4, 3] are stated on the unit cube with a (deterministic) number of points ; however, their results may be easily transposed into our setting by scaling], let us introduce some notation. For , denote the heat kernel on by and let , so that solves
[TABLE]
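The random measure considered in this section can be simulated directly. The following minimal sketch samples a Poisson point process on a flat torus using only the Python standard library. It assumes unit intensity (so that the number of points has mean equal to the volume of the torus), and the function names and parameters are hypothetical choices made for illustration.

```python
import math
import random


def sample_poisson(mean, rng):
    """Sample a Poisson(mean) integer by counting unit-rate exponential arrivals in [0, mean]."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_poisson_point_process(side, dim, rng):
    """Unit-intensity Poisson point process on the torus of side length `side` (assumed intensity 1)."""
    n_points = sample_poisson(side ** dim, rng)
    return [tuple(rng.uniform(0.0, side) for _ in range(dim)) for _ in range(n_points)]


def torus_distance(p, q, side):
    """Euclidean distance on the flat torus, using the shortest wrap-around in each coordinate."""
    total = 0.0
    for a, b in zip(p, q):
        delta = abs(a - b) % side
        delta = min(delta, side - delta)
        total += delta * delta
    return math.sqrt(total)


# Hypothetical parameters: a two-dimensional torus of side length 10.
rng = random.Random(1)
side, dim = 10.0, 2
points = sample_poisson_point_process(side, dim, rng)
```

The helper `torus_distance` is the ground distance one would use when computing a matching cost on the torus; computing the optimal matching itself requires an assignment or linear programming solver and is left out of this sketch.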
Theorem 5.1.
Let , then
[TABLE]
Moreover, if is the optimal transport plan between and , then setting , for there holds
[TABLE]
Since by (5.3), the displacement is on average of the order of , (5.4) shows that indeed coincides with the displacement to leading order. This leaves open the description of the optimal transport plan at the microscopic scale. To state our main result, fix a smooth cut-off function (which plays a similar role as the heat kernel in Theorem 5.1)
[TABLE]
In [17], we prove the following result (see also [19, Th. 1.2] and [18, Th. 1.1]):
Theorem 5.2.
There exists a stationary random variable on with exponential moments such that if is such that , then
[TABLE]
Moreover, there exists such that defining the shift by , we have
[TABLE]
and
[TABLE]
With respect to (5.4), (5.7) proves that (circular) averages of coincide with the displacement up to an error which is of order one. Moreover, (5.5) shows that after averaging, the displacement is actually extremely close to averages of (notice that the error term improves as increases).
By stationarity, it is enough to prove Theorem 5.2 for . The proof is based on the following deterministic result (which is a small post-processing of [19, Th. 1.2]):
Theorem 5.3.
Let be a measure on . If for some ,
[TABLE]
then
[TABLE]
Moreover, there exists such that letting , we have
[TABLE]
Notice that (5.7) follows from (5.10) and the bound (2.7) of Lemma 2.3. In order to obtain Theorem 5.2, Theorem 5.3 is combined with a stochastic argument based on (5.3) and a concentration-of-measure argument which ensures that (5.8) is satisfied for the Poisson point process .
The main ingredient for the proof of Theorem 5.3 is a Campanato iteration scheme similar to the one leading to Theorem 4.3 (and mainly based on Theorem 3.1) which allows us to transfer the information that (5.10) holds at scale by (5.8) down to the microscopic scale . This is inspired by the approach developed by Armstrong and Smart in [6] (and further refined in [16], see also [5]) for quantitative stochastic homogenization. The main ideas of [6] are themselves rooted in previous works of Avellaneda and Lin (see [7]) on periodic homogenization. The outcome of the Campanato scheme may be stated as follows (see [19, Prop. 1.9]):
Proposition 5.4.
There exists a sequence of approximately geometric radii i.e. with , and such that defining recursively the couplings by and
[TABLE]
where solves
[TABLE]
with defined as in (2.4) with playing the role of , we have for ,
[TABLE]
and
[TABLE]
Let us point out that by invariance of the Lebesgue measure under translations, is the optimal transport plan between and the Lebesgue measure for every (this is the reason why we make the translation in the target space).
Letting and undoing the iterative definition of , we see that (5.11) directly leads to (5.10) with replaced by . The proof of (5.10) is concluded by the estimate (see [19, Prop. 1.10])
[TABLE]
This estimate is also crucial for the proof of (5.9). Let us point out that a naive estimate using (5.12) leads to
[TABLE]
which is suboptimal. In order to obtain a shift with the optimal estimate (5.6) it is therefore important to take into account cancellations and replace by .
Let us close this note by pointing out that in dimension , the optimal transport plans corresponding to a very closely related optimal matching problem have been used in [21] to construct, in the limit , a stationary and locally optimal coupling between the Poisson point process on and the Lebesgue measure. For , such a coupling is expected not to exist. However, using (5.7) and passing to the limit , it is possible to construct (at least in the sense of Young measures) a coupling between the Poisson point process on and the Lebesgue measure, which is locally optimal and has stationary increments (see [18, Th.1.2]).
Acknowledgements
This research has been partially supported by the ANR project SHAPO.
References
- [1] M. Ajtai, J. Komlós, and G. Tusnády, On optimal matchings, Combinatorica 4 (1984), 259–264.
- [2] G. Alberti, R. Choksi, and F. Otto, Uniform energy distribution for an isoperimetric problem with long-range interactions, J. Amer. Math. Soc. 22 (2009), no. 2, 569–605.
- [3] L. Ambrosio, F. Glaudo, and D. Trevisan, On the optimal map in the 2-dimensional random matching problem, 2019, p. 20.
- [4] L. Ambrosio, F. Stra, and D. Trevisan, A PDE approach to a 2-dimensional matching problem, Probab. Theory Related Fields 173 (2019), no. 1-2, 433–477.
- [5] S. Armstrong, T. Kuusi, and J.-C. Mourrat, Quantitative stochastic homogenization and large-scale regularity, arXiv e-prints (2017).
- [6] S. N. Armstrong and C. K. Smart, Quantitative stochastic homogenization of convex integral functionals, Ann. Sci. Éc. Norm. Supér. (4) 49 (2016), no. 2, 423–481.
- [7] M. Avellaneda and F.-H. Lin, Compactness methods in the theory of homogenization, Comm. Pure Appl. Math. 40 (1987), no. 6, 803–847.
- [8] L. A. Caffarelli, A localization property of viscosity solutions to the Monge-Ampère equation and their strict convexity, Ann. of Math. (2) 131 (1990), no. 1, 129–134.
