Variational characterization of the regularity of Monge-Brenier maps

Ali Suleyman Ustunel

arXiv:1704.00310·math.PR·August 27, 2024

TL;DR

This paper investigates the regularity of Monge-Brenier maps on Wiener spaces, demonstrating their Sobolev regularity through variational methods and large deviations theory, under specific measure assumptions.

Contribution

It introduces a variational approach to establish Sobolev regularity of Monge-Brenier maps in an abstract Wiener space setting, extending previous results.

Findings

01

Monge-Brenier maps are Sobolev regular under finite information hypothesis.

02

Variational methods can be used to analyze the regularity of optimal transport maps.

03

Results apply to both forward and backward Monge-Brenier maps.

Abstract

On an abstract Wiener space, assume that T is the solution of the quadratic Monge problem associated to the Wiener measure and a second one with a Radon-Nikodym derivative of exponential type. Under the finite information hypothesis, using a variational method, we prove that T minimizes a certain functional originating from the large deviations theory. Applying a variational method a la Euler, we obtain the Sobolev regularity of the backward Monge-Brenier map. A similar result also holds for the forward Monge-Brenier map.

Equations

$d\nu=\frac{1}{c}\,e^{-f}\,d\mu$

$\inf\left(\int_{W\times W}|x-y|_{H}^{2}\,d\beta(x,y):\ \beta\in\Sigma(\mu,\nu)\right)=d_{2}^{2}(\mu,\nu),$

$-\log\int_{W}e^{-f}d\mu=\inf\left(\int_{W}f\,d\gamma+H(\gamma|\mu):\ \gamma\in M_{1}(W)\right)$

$-\log\int e^{-f}d\mu=\inf\left(\int f\circ M\,d\mu+H(M\mu|\mu):\ M=I_{W}+\nabla a,\ a\in{\rm I\!D}_{2,1}\right).$

$-\log\int e^{-f}d\mu\geq\inf\left(\int f\circ(I_{W}+\xi)\,d\mu+H((I_{W}+\xi)\mu|\mu):\ \xi\in{\rm I\!D}_{2,0}(H)\right).$

$\inf\left(\int f\,d\gamma+H(\gamma|\mu):\ \gamma\in M_{1}(W)\right),$

$J_{f}^{\star}$

$d_{H}^{2}(\mu,\nu)$

$\delta((I_{H}+\nabla^{2}\varphi)^{-1}-I_{H})=\nabla\varphi+\nabla f\circ(I_{W}+\nabla\varphi),$

$P_{t}f(x)=\int_{W}f\left(e^{-t}x+\sqrt{1-e^{-2t}}\,y\right)\mu(dy),$

$\|\varphi\|_{p,k}=\|(I+{\mathcal{L}})^{k/2}\varphi\|_{L^{p}(\mu)}$

$h\to f(x+h)$

$f(x+sh+tk)\leq sf(x+h)+tf(x+k),$

$h\to\left(x\to f(x+h)+\frac{1}{2}|h|_{H}^{2}\right)$

$x_{n}\to\frac{1}{2}|x_{n}|_{H}^{2}+f(x_{n}+x_{n}^{\bot})$

$e^{-f_{n}}=E[P_{1/n}e^{-f}|V_{n}],$

$\Lambda={\textstyle{\det_{2}}}(I_{H}+\nabla^{2}\varphi)\exp\left(-{\mathcal{L}}\varphi-\frac{1}{2}|\nabla\varphi|_{H}^{2}\right)$

$H((I_{W}+\nabla\varphi)\mu|\mu)=E\left[\frac{1}{2}|\nabla\varphi|_{H}^{2}-\log{\textstyle{\det_{2}}}(I_{H}+\nabla^{2}\varphi)\right].$

$L_{t}=\frac{d\nu_{t}}{d\mu}=ce^{-f_{t}}.$

$H(T_{t,\varepsilon}\mu|\mu)=E\left[\frac{1}{2}|t\nabla\varphi+\varepsilon\xi|_{H}^{2}-\log{\textstyle{\det_{2}}}(I_{H}+t\nabla^{2}\varphi+\varepsilon\nabla\xi)\right].$

$J_{t}(t\nabla\varphi+\varepsilon\xi)=E\left[f_{t}\circ T_{t,\varepsilon}+\frac{1}{2}|t\nabla\varphi+\varepsilon\xi|_{H}^{2}-\log{\textstyle{\det_{2}}}(I_{H}+t\nabla^{2}\varphi+\varepsilon\nabla\xi)\right].$

$\frac{d}{d\varepsilon}J_{t}(t\nabla\varphi+\varepsilon\xi)\big|_{\varepsilon=0}$

$\Theta=\{\xi\in{\rm I\!D}_{2,1}(H):\ \|\nabla\xi\|_{2}\in L^{\infty}(\mu)\}$

$\nabla\varphi+\nabla f\circ(I_{W}+\nabla\varphi)-\delta\left[(I_{H}+\nabla^{2}\varphi)^{-1}-I_{H}\right]=0$

$t\nabla\varphi+\nabla f_{t}\circ(I_{W}+t\nabla\varphi)-\delta\left[(I_{H}+t\nabla^{2}\varphi)^{-1}-I_{H}\right]=0.$

${\rm trace}\,(K\nabla^{3}\varphi Ke\cdot K\nabla^{3}\varphi Ke)\geq 0.$

${\rm trace}\,(KAKA)$

$E\left[\|(I+\nabla^{2}\varphi)^{-1}-I\|_{2}^{2}\right]\leq 2E\left[|\nabla\varphi|_{H}^{2}\right]+2cE\left[|\nabla f|_{H}^{2}e^{-f}\right],$

$\int_{{\rm I\!R}^{d}}|\nabla f|^{2}e^{-f}d\beta<\infty.$

$F_{n}(x,y)=\varphi_{n}(x)+\psi_{n}(y)+\frac{1}{2}|x-y|^{2}=0\quad\gamma_{n}\text{-a.s.}$
Taxonomy

Topics: Geometry and complex manifolds · Geometric Analysis and Curvature Flows · Nonlinear Partial Differential Equations

Full text

Variational Characterization of the Regularity of Monge-Brenier Maps

A. S. Üstünel

Abstract: Let $(W,H,\mu)$ be an abstract Wiener space and assume that $T=I_{W}+\nabla\varphi$ is the solution of the Monge problem associated to the measures $d\mu$ and $d\nu=e^{-f}d\mu$. Under the finite information hypothesis, using a variational method, we prove that $\delta((I_{H}+\nabla^{2}\varphi)^{-1}-I_{H})=\nabla\varphi+\nabla f\circ T$, and this result implies the Sobolev regularity of the backward Monge-Brenier map. A similar result also holds for the forward Monge-Brenier map.

Keywords: Entropy, adapted perturbation of identity, Wiener measure, Monge and Monge-Kantorovich problems, Monge potential, Monge-Brenier map.

1. Introduction

Let $\nu$ be the probability measure defined by

$$d\nu=\frac{1}{c}\,e^{-f}\,d\mu$$

such that the relative entropy of $\nu$ w.r.t. the Wiener measure $\mu$, denoted by $H(\nu|\mu)$, is finite. Let $\Sigma(\mu,\nu)$ be the set of the probability measures on $(W\times W,{\mathcal{B}}(W\times W))$ whose first marginals are $\mu$ and whose second marginals are $\nu$. Consider the minimization problem, which also defines a strong Wasserstein distance between $\mu$ and $\nu$:

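$$\inf\left(\int_{W\times W}|x-y|_{H}^{2}\,d\beta(x,y):\ \beta\in\Sigma(\mu,\nu)\right)=d_{2}^{2}(\mu,\nu),$$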

where $|\cdot|_{H}$ denotes the Cameron-Martin norm. In the finite dimensional case this problem has been studied extensively for almost three centuries; we refer to the texts [14] and [20] for history and references, and also to [3] and [13].

In the infinite dimensional case the cost function is very singular, in the sense that the set on which it is finite has zero measure w.r.t. the product measure $\mu\times\nu$. The problem has nevertheless been solved in a series of papers ([8, 9, 10]), and the answer can be summarized as follows: there exists a $1$-convex function $\varphi:W\to{\rm I\!R}$ in the Gaussian Sobolev space ${\rm I\!D}_{2,1}$, called the Monge potential, such that the above infimum is attained at $\gamma=(I_{W}\times T)\mu$, i.e., the image of the measure $\mu$ under the map $I_{W}\times T$, where $T=I_{W}+\nabla\varphi$ is the corresponding Monge-Brenier map and $\nabla\varphi$ is the $L^{2}(\mu)$-extended derivative of $\varphi$ in the direction of the Cameron-Martin space. Moreover, there exists a dual Monge potential $\psi:W\to{\rm I\!R}$, which has an $L^{2}(\nu)$-extended derivative in the direction of the Cameron-Martin space, such that the map $S=I_{W}+\nabla\psi$ satisfies $(S\times I_{W})\nu=(I_{W}\times T)\mu=\gamma$; hence $T\circ S=I_{W}$ $\nu$-a.s. and $S\circ T=I_{W}$ $\mu$-a.s.

The next important issue in this subject is to show the Sobolev regularity of the Monge-Brenier maps, so that one can write the Jacobian functions associated to the corresponding transformations $T$ and $S$. In the finite dimensional case this problem has been treated by several authors (cf. [4] and the references given in [20]). In the infinite dimensional case there are also some results (cf. [2, 6, 11]) which generalize the results given in [9, 10]. These results are generally suitable extensions of the finite dimensional ones, which were developed especially by L. Caffarelli, though we have also given another method to calculate the Jacobian functions in infinite dimensions using the Itô calculus.

In this work we shall present a totally different method: we shall prove the Sobolev regularity of the Monge-Brenier maps using the calculus of variations. Let us begin by recalling a celebrated variational formula, which holds on any measurable space but which we formulate on a Wiener space for notational simplicity:

$$-\log\int_{W}e^{-f}d\mu=\inf\left(\int_{W}f\,d\gamma+H(\gamma|\mu):\ \gamma\in M_{1}(W)\right)$$

where $M_{1}(W)$ denotes the set of probability measures on $(W,{\mathcal{F}})$, ${\mathcal{F}}$ being the Borel sigma field of $W$, and $\nu,\,f,\,\mu$ are as described above. The infimum is attained at $\nu$ provided that $H(\nu|\mu)$ is finite, cf. [17]. On the other hand, we know from [8] that there exists some $1$-convex function $\varphi\in{\rm I\!D}_{2,1}$ such that $(I_{W}+\nabla\varphi)\mu=\nu$, where we use the same notation for the image of a point and of a measure under a measurable map (here the map in question is $I_{W}+\nabla\varphi$). Consequently the following identity holds true:

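$$-\log\int e^{-f}d\mu=\inf\left(\int f\circ M\,d\mu+H(M\mu|\mu):\ M=I_{W}+\nabla a,\ a\in{\rm I\!D}_{2,1}\right).$$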

Therefore

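$$-\log\int e^{-f}d\mu\geq\inf\left(\int f\circ(I_{W}+\xi)\,d\mu+H((I_{W}+\xi)\mu|\mu):\ \xi\in{\rm I\!D}_{2,0}(H)\right).\qquad(1.2)$$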

For this infimum to be finite we need that $H((I_{W}+\xi)\mu|\mu)<\infty$ , which implies $(I_{W}+\xi)\mu\ll\mu$ . Besides the right hand side of the inequality (1.2) is always greater than

$$\inf\left(\int f\,d\gamma+H(\gamma|\mu):\ \gamma\in M_{1}(W)\right),$$

therefore we have equality between all these expressions:

Theorem 1.

Assume that $H(\nu|\mu)<\infty$ , where $d\nu=(E[e^{-f}])^{-1}e^{-f}d\mu$ and $f$ is a measurable function. Then the infimum

[TABLE]

is attained at the vector field $\xi=\nabla\varphi$ , where $\varphi$ is the unique (up to an additive constant) Monge potential such that $(I_{W}+\nabla\varphi)\mu=\nu$ and that the $L^{2}(\mu,H)$ -norm of $\nabla\varphi$ is equal to the Wasserstein distance between $\nu$ and $\mu$ :

[TABLE]

where $\Sigma_{1}(\mu,\nu)$ denotes the set of probability measures on $W\times W$ , whose first marginals are $\mu$ and the second ones are $\nu$ .
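The lower bound behind the variational formula for $-\log\int e^{-f}d\mu$ can be checked by a short computation. The following is only a sketch of the standard argument, not a display taken from the text; it assumes $\gamma\ll\mu$ with $H(\gamma|\mu)<\infty$ and $f\in L^{1}(\gamma)$, and writes $c=\int_{W}e^{-f}d\mu$ so that $d\nu/d\mu=c^{-1}e^{-f}$.

```latex
% Sketch under the stated assumptions: \gamma \ll \mu, H(\gamma|\mu) < \infty, f \in L^1(\gamma).
% Here c = \int_W e^{-f} d\mu, hence \log(d\nu/d\mu) = -f - \log c.
\[
\begin{aligned}
H(\gamma|\nu)
  &= \int_W \log\frac{d\gamma}{d\nu}\,d\gamma
   = \int_W \log\frac{d\gamma}{d\mu}\,d\gamma + \int_W f\,d\gamma + \log c \\
  &= H(\gamma|\mu) + \int_W f\,d\gamma + \log c .
\end{aligned}
\]
% Relative entropy is nonnegative and vanishes only for \gamma = \nu, so
\[
\int_W f\,d\gamma + H(\gamma|\mu)
  = -\log c + H(\gamma|\nu)
  \ \ge\ -\log\int_W e^{-f}\,d\mu ,
\qquad \text{with equality iff } \gamma=\nu .
\]
```

Taking $\gamma=(I_{W}+\nabla\varphi)\mu=\nu$ gives equality, which is consistent with the statement that the infimum over vector fields is attained at $\xi=\nabla\varphi$.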

Note that if we could apply the variational principle above, namely, by taking the derivative of the functional $J_{f}$ at the minimizing vector field $\nabla\varphi$ in any admissible direction, we would obtain the following relation:

$$\delta((I_{H}+\nabla^{2}\varphi)^{-1}-I_{H})=\nabla\varphi+\nabla f\circ(I_{W}+\nabla\varphi),$$

where $\delta$ denotes the Gaussian divergence, i.e., the adjoint of the derivative $\nabla$ w.r.t. the Gaussian measure $\mu$, and this equation implies the Sobolev regularity of $\varphi$. A similar method can also be used for the dual Monge potential $\psi$. We shall carry out this program in the sequel, beginning with finite dimensions and passing to the infinite dimensional case by a limiting argument. Let us note that this method is also applicable in situations other than the Gaussian case, as one can already see in the case of the dual potential.
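A one dimensional Gaussian example may help to see what the relation between $\delta$, $\nabla\varphi$ and $\nabla f$ says. The sketch below is an illustration chosen here, not an example from the text: it assumes $W=H={\rm I\!R}$, $\mu=N(0,1)$ and $f(x)=\frac{a}{2}x^{2}$ with $a>-1$, and uses the one dimensional form $\delta\xi(x)=x\,\xi(x)-\xi'(x)$ of the Gaussian divergence.

```latex
% Illustrative assumptions: W = H = R, \mu = N(0,1), f(x) = a x^2 / 2 with a > -1.
% Then \nu = N(0,\sigma^2) with \sigma = (1+a)^{-1/2}, and the monotone map is T(x) = \sigma x.
\[
\varphi(x) = \tfrac{1}{2}(\sigma-1)x^{2}, \qquad
\nabla\varphi(x) = (\sigma-1)x, \qquad
(1+\nabla^{2}\varphi)^{-1}-1 = \tfrac{1}{\sigma}-1 .
\]
% Left-hand side: divergence of a constant field k is \delta k (x) = k x.
\[
\delta\bigl((1+\nabla^{2}\varphi)^{-1}-1\bigr)(x) = \Bigl(\tfrac{1}{\sigma}-1\Bigr)x .
\]
% Right-hand side: \nabla\varphi + \nabla f \circ T.
\[
\nabla\varphi(x) + \nabla f(\sigma x) = (\sigma-1)x + a\sigma x
  = \bigl(\sigma(1+a)-1\bigr)x = \Bigl(\tfrac{1}{\sigma}-1\Bigr)x ,
\]
% where the last step uses \sigma^{2}(1+a) = 1.
```

Both sides agree, and $\varphi(x)+\frac{1}{2}x^{2}=\frac{\sigma}{2}x^{2}$ is convex, so $\varphi$ is $1$-convex as required of a Monge potential.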

Let us summarize the following important observation: this work is devoted to the creation of a variational calculus by parametrizing the formula (1.1) with vector fields which are derivatives of scalar functionals. In another work, which has already appeared ([18]), we have parametrized the same formula with adapted vector fields to obtain totally different results, such as existence, uniqueness and non-existence results for stochastic differential equations whose drift coefficients depend on the past.

2. Preliminaries

Let $W$ be a separable Fréchet space equipped with a Gaussian measure $\mu$ of zero mean whose support is the whole space (the reader may assume that $W=C({\rm I\!R}_{+},{\rm I\!R}^{d})$, $d\geq 1$, or $W={\rm I\!R}^{{\rm I\!N}}$). The corresponding Cameron-Martin space is denoted by $H$. Recall that the injection $H\hookrightarrow W$ is compact and its adjoint is the natural injection $W^{\star}\hookrightarrow H^{\star}\subset L^{2}(\mu)$. The triple $(W,\mu,H)$ is called an abstract Wiener space. Recall that $W=H$ if and only if $W$ is finite dimensional. A subspace $F$ of $H$ is called regular if the corresponding orthogonal projection has a continuous extension to $W$, denoted again by the same letter. It is well-known that there exists an increasing sequence of regular subspaces $(F_{n},n\geq 1)$, called total, such that $\cup_{n}F_{n}$ is dense in $H$ and in $W$. Let $V_{n}$ be the $\sigma$-algebra generated by $\pi_{F_{n}}$; then for any $f\in L^{p}(\mu)$, the martingale sequence $(E[f|V_{n}],n\geq 1)$ converges to $f$ (strongly if $p<\infty$) in $L^{p}(\mu)$. Observe that the function $f_{n}=E[f|V_{n}]$ can be identified with a function on the finite dimensional abstract Wiener space $(F_{n},\mu_{n},F_{n})$, where $\mu_{n}=\pi_{n}\mu$.

Since the translations of $\mu$ with the elements of $H$ induce measures equivalent to $\mu$ , the Gâteaux derivative in $H$ direction of the random variables is a closable operator on $L^{p}(\mu)$ -spaces and this closure will be denoted by $\nabla$ cf., for example [15]. The corresponding Sobolev spaces (the equivalence classes) of the real random variables will be denoted as ${\rm I\!D}_{p,k}$ , where $k\in{\rm I\!N}$ is the order of differentiability and $p>1$ is the order of integrability. If the random variables are with values in some separable Hilbert space, say $\Phi$ , then we shall define similarly the corresponding Sobolev spaces and they are denoted as ${\rm I\!D}_{p,k}(\Phi)$ , $p>1,\,k\in{\rm I\!N}$ . Since $\nabla:{\rm I\!D}_{p,k}\to{\rm I\!D}_{p,k-1}(H)$ is a continuous and linear operator its adjoint is a well-defined operator which we represent by $\delta$ . In the case of classical Wiener space, i.e., when $W=C({\rm I\!R}_{+},{\rm I\!R}^{d})$ , then $\delta$ coincides with the Itô integral of the Lebesgue density of the adapted elements of ${\rm I\!D}_{p,k}(H)$ (cf.[15]).
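In finite dimensions the adjoint relation between $\nabla$ and $\delta$ is an integration by parts against the Gaussian density. The following sketch, added here for illustration, assumes $W=H={\rm I\!R}$, $\mu=N(0,1)$ and smooth $\alpha,\xi$ of polynomial growth, so that boundary terms vanish.

```latex
% Illustrative assumptions: W = H = R, \mu = N(0,1), \alpha and \xi smooth with polynomial growth.
\[
\begin{aligned}
E[\alpha'\,\xi]
  &= \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \alpha'(x)\,\xi(x)\,e^{-x^{2}/2}\,dx
   = -\frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \alpha(x)\,\bigl(\xi(x)\,e^{-x^{2}/2}\bigr)'\,dx \\
  &= \frac{1}{\sqrt{2\pi}}\int_{\mathbb{R}} \alpha(x)\,\bigl(x\,\xi(x)-\xi'(x)\bigr)\,e^{-x^{2}/2}\,dx
   = E[\alpha\,\delta\xi],
\end{aligned}
\]
% so that, in one dimension,
\[
\delta\xi(x) = x\,\xi(x) - \xi'(x) .
\]
```

On ${\rm I\!R}^{d}$ with the standard Gaussian measure the same computation gives $\delta\xi(x)=(x,\xi(x))-{\rm div}\,\xi(x)$.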

For any $t\geq 0$ and measurable $f:W\to{\rm I\!R}_{+}$, we denote by

$$P_{t}f(x)=\int_{W}f\left(e^{-t}x+\sqrt{1-e^{-2t}}\,y\right)\mu(dy),$$

it is well-known that $(P_{t},t\in{\rm I\!R}_{+})$ is a hypercontractive semigroup on $L^{p}(\mu),p>1$ , which is called the Ornstein-Uhlenbeck semigroup (cf.[15]). Its infinitesimal generator is denoted by $-{\mathcal{L}}$ and we call ${\mathcal{L}}$ the Ornstein-Uhlenbeck operator (sometimes called the number operator by the physicists). Due to the Meyer inequalities (cf., for instance [15]), the norms defined by

$$\|\varphi\|_{p,k}=\|(I+{\mathcal{L}})^{k/2}\varphi\|_{L^{p}(\mu)}\qquad(2.3)$$

are equivalent to the norms defined by the iterates of the Sobolev derivative $\nabla$. This observation permits us to identify the duals of the spaces ${\rm I\!D}_{p,k}(\Phi)$, $p>1,\,k\in{\rm I\!N}$, with ${\rm I\!D}_{q,-k}(\Phi^{\prime})$, where $q^{-1}=1-p^{-1}$ and the latter space is defined by replacing $k$ in (2.3) by $-k$; this gives us the distribution spaces on the Wiener space $W$ (in fact we can take as $k$ any real number). An easy calculation shows that, formally, $\delta\circ\nabla={\mathcal{L}}$, and this permits us to extend the divergence and the derivative operators to the distributions as linear, continuous operators. In fact $\delta:{\rm I\!D}_{q,k}(H\otimes\Phi)\to{\rm I\!D}_{q,k-1}(\Phi)$ and $\nabla:{\rm I\!D}_{q,k}(\Phi)\to{\rm I\!D}_{q,k-1}(H\otimes\Phi)$ continuously, for any $q>1$ and $k\in{\rm I\!R}$, where $H\otimes\Phi$ denotes the completed Hilbert-Schmidt tensor product (cf., for instance, [15]). The following assertion is useful: assume that $(Z_{n},n\geq 1)\subset{\rm I\!D}^{\prime}$ converges to $Z$ in ${\rm I\!D}^{\prime}$, and assume further that each $Z_{n}$ is a probability measure on $W$; then $Z$ is also a probability measure and $(Z_{n},n\geq 1)$ converges to $Z$ in the weak topology of measures. In particular, a lower bounded distribution (in the sense that there exists a constant $c\in{\rm I\!R}$ such that $Z+c$ is a positive distribution) is a (Radon) measure on $W$, cf. [15].
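The Ornstein-Uhlenbeck semigroup and its generator are easy to follow on low order Hermite polynomials. The sketch below is an illustration added here, assuming $W=H={\rm I\!R}$ and $\mu=N(0,1)$; it uses the Mehler form $P_{t}f(x)=\int f(e^{-t}x+\sqrt{1-e^{-2t}}\,y)\,\mu(dy)$ of the semigroup and the one dimensional expression ${\mathcal{L}}f=\delta\nabla f=xf'-f''$.

```latex
% Illustrative assumptions: W = H = R, \mu = N(0,1); y denotes a standard Gaussian variable.
% First and second Hermite polynomials: H_1(x) = x, H_2(x) = x^2 - 1.
\[
P_t H_1(x) = E\bigl[e^{-t}x + \sqrt{1-e^{-2t}}\,y\bigr] = e^{-t}x ,
\]
\[
P_t H_2(x) = E\bigl[(e^{-t}x + \sqrt{1-e^{-2t}}\,y)^{2}\bigr] - 1
           = e^{-2t}x^{2} + (1-e^{-2t}) - 1 = e^{-2t}(x^{2}-1) .
\]
% Generator: \mathcal{L} f = x f' - f''.
\[
\mathcal{L}H_1 = x = H_1, \qquad
\mathcal{L}H_2 = 2x^{2} - 2 = 2H_2, \qquad
\frac{d}{dt}P_tH_n\Big|_{t=0} = -\,n\,H_n = -\mathcal{L}H_n \quad (n=1,2).
\]
% Sobolev norm defined through (I+\mathcal{L})^{k/2}, for p = 2:
\[
\|H_1\|_{2,k} = \|(I+\mathcal{L})^{k/2}H_1\|_{L^{2}(\mu)} = 2^{k/2}\,\|H_1\|_{L^{2}(\mu)} = 2^{k/2}.
\]
```

The eigenvalue relation ${\mathcal{L}}H_{n}=nH_{n}$ is the reason ${\mathcal{L}}$ is also called the number operator.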

A measurable function $f:W\to{\rm I\!R}\cup\{\infty\}$ is called $H$ -convex (cf.[7]) if

$$h\to f(x+h)$$

is convex $\mu$ -almost surely, i.e., if for any $h,k\in H$ , $s,t\in[0,1],\,s+t=1$ , we have

$$f(x+sh+tk)\leq sf(x+h)+tf(x+k),$$

almost surely, where the negligible set on which this inequality fails may depend on the choice of $s,h$ and $k$. We can rephrase this property by saying that $h\to(x\to f(x+h))$ is an $L^{0}(\mu)$-valued convex function on $H$. $f$ is called $1$-convex if the map

$$h\to\left(x\to f(x+h)+\frac{1}{2}|h|_{H}^{2}\right)$$

is convex on the Cameron-Martin space $H$ with values in $L^{0}(\mu)$. Note that all these notions are compatible with the $\mu$-equivalence classes of random variables thanks to the Cameron-Martin theorem. It is proven in [7] that this definition is equivalent to the following condition: let $(\pi_{n},n\geq 1)$ be a sequence of regular, finite dimensional, orthogonal projections of $H$, increasing to the identity map $I_{H}$. Denote also by $\pi_{n}$ its continuous extension to $W$ and define $\pi_{n}^{\bot}=I_{W}-\pi_{n}$. For $x\in W$, let $x_{n}=\pi_{n}x$ and $x_{n}^{\bot}=\pi_{n}^{\bot}x$. Then $f$ is $1$-convex if and only if

$$x_{n}\to\frac{1}{2}|x_{n}|_{H}^{2}+f(x_{n}+x_{n}^{\bot})$$

is $\pi_{n}^{\bot}\mu$ -almost surely convex. We define similarly the notion of $H$ -concave and $H$ -log-concave functions. In particular, one can prove that, for any $H$ -log-concave function $f$ on $W$ , $P_{t}f$ and $E[f|V_{n}]$ are again $H$ -log-concave [7].
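A quadratic function on the line shows how $1$-convexity differs from $H$-convexity. This is an illustration added here, assuming $W=H={\rm I\!R}$ and $f(x)=-\frac{\lambda}{2}x^{2}$ with a real parameter $\lambda$ (a name introduced only for this example).

```latex
% Illustrative assumptions: W = H = R, f(x) = -\lambda x^2 / 2.
% H-convexity looks at h -> f(x+h); 1-convexity looks at h -> f(x+h) + h^2/2.
\[
\frac{d^{2}}{dh^{2}}\,f(x+h) = -\lambda ,
\qquad
\frac{d^{2}}{dh^{2}}\Bigl(f(x+h) + \tfrac{1}{2}h^{2}\Bigr) = 1-\lambda .
\]
% Hence
\[
f \text{ is } H\text{-convex} \iff \lambda \le 0 ,
\qquad
f \text{ is } 1\text{-convex} \iff \lambda \le 1 .
\]
```

So a $1$-convex function may be concave, as long as its concavity is dominated by the quadratic $\frac{1}{2}|h|_{H}^{2}$.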

3. Variational calculations

Assume for a while that $\varphi\in{\rm I\!D}_{2,1}$ is smooth; this can be achieved by replacing $f$ by its regularization defined as

$$e^{-f_{n}}=E[P_{1/n}e^{-f}|V_{n}],$$

where $(P_{t},t\geq 0)$ is the Ornstein-Uhlenbeck semigroup, $V_{n}$ is the sigma-algebra generated by $\{\delta e_{1},\ldots,\delta e_{n}\}$ and $(e_{n},n\geq 1)$ is a complete, orthonormal basis of $H$. Since $J_{f}^{\star}=J(\nabla\varphi)$, if we take the Gâteaux derivative of $J$ at $\nabla\varphi$, it should give zero. Let $L=(E[e^{-f}])^{-1}e^{-f}$ and denote by $\Lambda$ the Gaussian Jacobian of $I_{W}+\nabla\varphi$:

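$$\Lambda={\textstyle{\det_{2}}}(I_{H}+\nabla^{2}\varphi)\exp\left(-{\mathcal{L}}\varphi-\frac{1}{2}|\nabla\varphi|_{H}^{2}\right)$$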

where ${\mathcal{L}}$ is the Ornstein-Uhlenbeck operator ${\mathcal{L}}=\delta\circ\nabla$, ${\textstyle{\det_{2}}}$ denotes the modified Carleman-Fredholm determinant, and $\delta=\nabla^{\star}$, where the adjoint is taken w.r.t. the Wiener measure $\mu$, cf. [19]. It follows from the change of variables formula, cf. [19], that $L\circ(I_{W}+\nabla\varphi)\,\Lambda=1$, hence

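$$H((I_{W}+\nabla\varphi)\mu|\mu)=E\left[\frac{1}{2}|\nabla\varphi|_{H}^{2}-\log{\textstyle{\det_{2}}}(I_{H}+\nabla^{2}\varphi)\right].$$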
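The Jacobian and entropy formulas can be verified by hand for a linear map on the line. The sketch below is an illustration added here, assuming $W=H={\rm I\!R}$, $\mu=N(0,1)$ and $\varphi(x)=\frac{1}{2}(\sigma-1)x^{2}$ with $\sigma>0$, so that $I_{W}+\nabla\varphi$ is $x\mapsto\sigma x$ and its image measure is $N(0,\sigma^{2})$; in one dimension $\det_{2}(1+k)=(1+k)e^{-k}$ and ${\mathcal{L}}\varphi=x\varphi'-\varphi''$.

```latex
% Illustrative assumptions: W = H = R, \mu = N(0,1), \varphi(x) = (\sigma-1)x^2/2, \sigma > 0.
% One-dimensional facts: \det_2(1+k) = (1+k)e^{-k}, \mathcal{L}\varphi = x\varphi' - \varphi''.
\[
{\det}_2(1+\varphi'') = \sigma\,e^{-(\sigma-1)}, \qquad
\mathcal{L}\varphi(x) = (\sigma-1)x^{2} - (\sigma-1), \qquad
\tfrac{1}{2}|\varphi'(x)|^{2} = \tfrac{1}{2}(\sigma-1)^{2}x^{2}.
\]
% Gaussian Jacobian:
\[
\Lambda(x) = \sigma\,\exp\Bigl(-(\sigma-1)x^{2} - \tfrac{1}{2}(\sigma-1)^{2}x^{2}\Bigr)
           = \sigma\,\exp\Bigl(-\tfrac{1}{2}(\sigma^{2}-1)x^{2}\Bigr).
\]
% Density of N(0,\sigma^2) w.r.t. \mu, evaluated at \sigma x:
\[
L(\sigma x) = \frac{1}{\sigma}\exp\Bigl(\tfrac{1}{2}(\sigma^{2}-1)x^{2}\Bigr),
\qquad\text{so}\qquad L(\sigma x)\,\Lambda(x) = 1 .
\]
% Entropy formula versus the classical Gaussian relative entropy:
\[
E\Bigl[\tfrac{1}{2}|\varphi'|^{2} - \log{\det}_2(1+\varphi'')\Bigr]
 = \tfrac{1}{2}(\sigma-1)^{2} - \log\sigma + (\sigma-1)
 = \tfrac{1}{2}(\sigma^{2}-1) - \log\sigma
 = H\bigl(N(0,\sigma^{2})\,\big|\,N(0,1)\bigr).
\]
```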

In particular, thanks to the $1$ -convexity of $\varphi$ , if we replace $\varphi$ by $t\varphi$ , for small $t\in[0,1]$ , the shift $T_{t}=I_{W}+t\nabla\varphi$ becomes strongly monotone and it is the solution of the Monge transportation problem for the measure $\nu_{t}=T_{t}\mu$ (i.e., the image of $\mu$ under $T_{t}$ ). Let $f_{t}$ be defined as

$$L_{t}=\frac{d\nu_{t}}{d\mu}=ce^{-f_{t}}.$$

If $\xi\in{\rm I\!D}_{2,1}(H)$ is such that $\nabla\xi$ has small $L^{\infty}$-norm as a Hilbert-Schmidt operator, then $T_{t,\varepsilon}=I_{W}+t\nabla\varphi+\varepsilon\xi$ is a strongly monotone shift for small $t,\varepsilon>0$, hence it is almost surely invertible (cf. [19], Corollary 6.4.2). Note moreover that the shift $I_{W}+t\nabla\varphi$ is the unique solution of another Monge problem, namely the one which corresponds to the measure $ce^{-f_{t}}d\mu$. Here the multiplication by a small $t$ gives us a sufficiently large set of directions in which to calculate the Gâteaux derivative while preserving the $1$-convexity of the corresponding Monge potential, namely $t\varphi$. Using again the change of variables formula for $T_{t,\varepsilon}$, we get

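$$H(T_{t,\varepsilon}\mu|\mu)=E\left[\frac{1}{2}|t\nabla\varphi+\varepsilon\xi|_{H}^{2}-\log{\textstyle{\det_{2}}}(I_{H}+t\nabla^{2}\varphi+\varepsilon\nabla\xi)\right].$$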

Therefore

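$$J_{t}(t\nabla\varphi+\varepsilon\xi)=E\left[f_{t}\circ T_{t,\varepsilon}+\frac{1}{2}|t\nabla\varphi+\varepsilon\xi|_{H}^{2}-\log{\textstyle{\det_{2}}}(I_{H}+t\nabla^{2}\varphi+\varepsilon\nabla\xi)\right].$$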

Since $t\nabla\varphi$ minimizes the functional $J_{t}$ among all the absolutely continuous shifts, we should have

[TABLE]

for any $\xi\in{\rm I\!D}_{2,1}(H)$ with $\|\nabla\xi\|_{2}\in L^{\infty}(\mu)$ . Since the set of vector fields

$$\Theta=\{\xi\in{\rm I\!D}_{2,1}(H):\ \|\nabla\xi\|_{2}\in L^{\infty}(\mu)\}$$

is dense in any $L^{p}(\mu,H)$ , we have proved the following

Theorem 2.

In the finite dimensional smooth case, the Monge potential $\varphi$ satisfies the following relation

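$$\nabla\varphi+\nabla f\circ(I_{W}+\nabla\varphi)-\delta\left[(I_{H}+\nabla^{2}\varphi)^{-1}-I_{H}\right]=0$$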

almost surely, where $\delta$ denotes the Gaussian divergence w.r.t. $\mu$ , i.e., the adjoint of $\nabla$ w.r.t. $\mu$ .

**Proof:** In the equation (3) we have a term with a trace; we interpret it as a scalar product on the Hilbert-Schmidt operators on the Cameron-Martin space, and the claim follows, for the case $t\varphi$, from the definition of $\delta$ as a mapping from Hilbert-Schmidt operator valued functionals to vector fields under this scalar product. Hence we have the identity

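$$t\nabla\varphi+\nabla f_{t}\circ(I_{W}+t\nabla\varphi)-\delta\left[(I_{H}+t\nabla^{2}\varphi)^{-1}-I_{H}\right]=0.$$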

Since we have $\Lambda_{t}ce^{-f_{t}\circ T_{t}}=1$ a.s., where $\Lambda_{t}={\textstyle{\det_{2}}}(I_{H}+t\nabla^{2}\varphi)\exp\left(-t{\mathcal{L}}\varphi-\frac{1}{2}\>|t\nabla\varphi|_{H}^{2}\right)$ and $T_{t}=I_{W}+t\nabla\varphi$ , $\lim_{t\to 1}\nabla f_{t}\circ T_{t}=\nabla f\circ T$ in probability, where $T=T_{1}=I_{W}+\nabla\varphi$ . The justification of the other terms being trivial, the proof is completed.
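The passage from the vanishing first variation of $J_{t}$ to the relation of Theorem 2 can be sketched as follows. This is a reconstruction under the smoothness assumptions of this section, not the author's displayed computation; the shorthand $A_{t}=t\nabla^{2}\varphi$ and $K_{t}=(I_{H}+A_{t})^{-1}$ is introduced here for illustration, and differentiation under the expectation is assumed to be justified.

```latex
% Sketch; finite dimensional smooth case, \xi smooth with \|\nabla\xi\|_2 bounded.
% Shorthand (introduced here): A_t = t\nabla^2\varphi, K_t = (I_H + A_t)^{-1}, T_t = I_W + t\nabla\varphi.
% Uses \det_2(I+B) = \det(I+B)\,e^{-\mathrm{trace}\,B}.
\[
\frac{d}{d\varepsilon}\log{\det}_2\bigl(I_H + A_t + \varepsilon\nabla\xi\bigr)\Big|_{\varepsilon=0}
 = \mathrm{trace}\bigl(K_t\,\nabla\xi\bigr) - \mathrm{trace}\bigl(\nabla\xi\bigr)
 = \mathrm{trace}\bigl((K_t - I_H)\,\nabla\xi\bigr).
\]
% First variation of J_t at the minimizer t\nabla\varphi:
\[
0 = \frac{d}{d\varepsilon}J_t(t\nabla\varphi+\varepsilon\xi)\Big|_{\varepsilon=0}
  = E\Bigl[\bigl(\nabla f_t\circ T_t + t\nabla\varphi,\ \xi\bigr)_H
      - \mathrm{trace}\bigl((K_t - I_H)\,\nabla\xi\bigr)\Bigr].
\]
% K_t - I_H is symmetric, so the trace is a Hilbert-Schmidt scalar product,
% and \delta is the adjoint of \nabla w.r.t. \mu:
\[
E\bigl[\mathrm{trace}\bigl((K_t - I_H)\,\nabla\xi\bigr)\bigr]
 = E\bigl[\bigl(\delta(K_t - I_H),\ \xi\bigr)_H\bigr].
\]
```

Since $\xi$ ranges over a dense set of vector fields, the two displays combine to $t\nabla\varphi+\nabla f_{t}\circ T_{t}-\delta[K_{t}-I_{H}]=0$, which is the identity used in the proof.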

Lemma 1.

Let $K=(I_{H}+\nabla^{2}\varphi)^{-1}$ and let $e$ be any fixed element of the Cameron-Martin space $H$ , then we have almost surely

$${\rm trace}\,(K\nabla^{3}\varphi Ke\cdot K\nabla^{3}\varphi Ke)\geq 0.$$

**Proof:** Let $A=\nabla^{3}\varphi Ke$; this is a symmetric operator. As $K$ is a positive operator, we can write

[TABLE]

trivially, where $\|\cdot\|_{2}$ denotes the Hilbert-Schmidt norm.
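The positivity claimed in Lemma 1 rests on a standard trace identity. The display below is a sketch of that step as it is usually written, assuming only that $K$ is positive (so that $K^{1/2}$ exists) and that $A$ is symmetric; it is offered as a plausible reading of the argument rather than as the author's exact formula.

```latex
% Sketch; assumes K positive (K^{1/2} exists) and A symmetric.
% Cyclicity of the trace moves one factor K^{1/2} to the other end.
\[
\mathrm{trace}(KAKA)
 = \mathrm{trace}\bigl(K^{1/2}AK^{1/2}\cdot K^{1/2}AK^{1/2}\bigr)
 = \bigl\|K^{1/2}AK^{1/2}\bigr\|_{2}^{2} \ \ge\ 0 .
\]
% The last equality holds because B = K^{1/2}AK^{1/2} is symmetric,
% so trace(B^2) = trace(B^* B) = \|B\|_2^2.
```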

Proposition 1.

Let $\varphi$ be the Monge potential of Monge-Kantorovich problem with the target measure $ce^{-f}$ . Assume that $\varphi$ is smooth as explained above, then we have

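$$E\left[\|(I+\nabla^{2}\varphi)^{-1}-I\|_{2}^{2}\right]\leq 2E\left[|\nabla\varphi|_{H}^{2}\right]+2cE\left[|\nabla f|_{H}^{2}e^{-f}\right],$$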

where $\|\cdot\|_{2}$ denotes the Hilbert-Schmidt norm on $H\otimes H$ .

**Proof:** The proof follows from the calculation of the second moment of the norm of a vector-valued divergence of Theorem 2 combined with the result of Lemma 1.
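In a one dimensional Gaussian example the bound of Proposition 1 reduces to an elementary inequality. The sketch below is an illustration added here, assuming $W=H={\rm I\!R}$, $\mu=N(0,1)$, $f(x)=\frac{a}{2}x^{2}$ with $a>-1$, $\sigma=(1+a)^{-1/2}$ and $d\nu=ce^{-f}d\mu=N(0,\sigma^{2})$, so that $\nabla\varphi(x)=(\sigma-1)x$; the letters $u,v$ are shorthand introduced only for this check.

```latex
% Illustrative assumptions: W = H = R, \mu = N(0,1), f(x) = a x^2/2, \sigma = (1+a)^{-1/2}.
% Then \nabla\varphi(x) = (\sigma-1)x and c\,e^{-f}d\mu = N(0,\sigma^2).
\[
E\bigl[\,|(1+\nabla^{2}\varphi)^{-1}-1|^{2}\bigr] = \Bigl(\tfrac{1}{\sigma}-1\Bigr)^{2},
\qquad
E\bigl[|\nabla\varphi|^{2}\bigr] = (\sigma-1)^{2},
\]
\[
c\,E\bigl[|\nabla f|^{2}e^{-f}\bigr] = E_{\nu}\bigl[a^{2}y^{2}\bigr] = a^{2}\sigma^{2}
 = \Bigl(\tfrac{1}{\sigma}-\sigma\Bigr)^{2}
 \quad\text{since } a\sigma = \tfrac{1}{\sigma}-\sigma .
\]
% With u = \sigma - 1 and v = 1/\sigma - \sigma we have u + v = 1/\sigma - 1, so the bound reads
\[
(u+v)^{2} \ \le\ 2u^{2} + 2v^{2}.
\]
% Numerical instance: \sigma = 1/2 (a = 3) gives 1 \le 2\cdot\tfrac{1}{4} + 2\cdot\tfrac{9}{4} = 5.
```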

The next two lemmas give useful stability results for the forward and backward potentials in the finite dimensional situation when the target measures are approximated by more regular measures. There are some results in the literature (cf. [5, 20]), but they are of limited applicability for what we need in the sequel:

Lemma 2.

Let $\beta$ be the standard Gaussian measure on ${\rm I\!R}^{d}$ , $f\in{\rm I\!D}_{2,1}$ s.t.

$$\int_{{\rm I\!R}^{d}}|\nabla f|^{2}e^{-f}d\beta<\infty.$$

Let $(\varphi,\psi)$ be the Monge potentials associated to the Monge-Kantorovich problem $\Sigma(\beta,\nu)$, where $d\nu=ce^{-f}d\beta$. Define $f_{n}$ by $Q_{1/n}e^{-f}=e^{-f_{n}}$, where $(Q_{t},\,t\geq 0)$ denotes the Ornstein-Uhlenbeck semigroup on ${\rm I\!R}^{d}$. Let $(\varphi_{n},\psi_{n})$ be the Monge potentials corresponding to the Monge-Kantorovich problem $\Sigma(\beta,\nu_{n})$, where $d\nu_{n}=ce^{-f_{n}}d\beta$. Then $(\varphi_{n},n\geq 1)$ converges to $\varphi$ in ${\rm I\!D}_{2,1}$, $(Q_{1/n}\psi_{n},n\geq 1)$ converges to $\psi$ in $L^{1}(\nu)$ and $(Q_{1/n}\nabla\psi_{n},n\geq 1)$ converges to $\nabla\psi$ in $L^{2}(\nu,{\rm I\!R}^{d})$.

**Proof:** Let $\gamma_{n},\gamma$ be the unique solutions of the Monge-Kantorovich problems for $(\beta,\nu_{n})$ and $(\beta,\nu)$ respectively. From Brenier's theorem (cf. [3])

$$F_{n}(x,y)=\varphi_{n}(x)+\psi_{n}(y)+\frac{1}{2}|x-y|^{2}=0\quad\gamma_{n}\text{-a.s.}$$

and $F_{n}(x,y)\geq 0$ for any $(x,y)\in{\rm I\!R}^{d}\times{\rm I\!R}^{d}$ . Similarly

[TABLE]

and $F(x,y)\geq 0$ for any $(x,y)\in{\rm I\!R}^{d}\times{\rm I\!R}^{d}$ . As $\nu_{n}\to\nu$ weakly and since $(Q_{1/n}(|\cdot|^{2})(x),\,n\geq 1)$ is exponentially integrable w.r.t. $\beta$ uniformly in $n\geq 1$ , it is easy to deduce, using the Young inequality, that

[TABLE]

and this implies that (cf.[1], Lemma 8.3)

[TABLE]

where $d_{2}$ denotes the second order Wasserstein distance on the probability measures on ${\rm I\!R}^{d}$ . These relations imply that $(\varphi_{n},n\geq 1)$ is bounded in $L^{2}(\gamma)$ . Moreover, we have

[TABLE]

By the boundedness of $(\varphi_{n},n\geq 1)$ in $L^{2}(\beta)$ there exists $a^{\prime}\in L^{2}(\beta)$ such that $(\varphi_{n},n\geq 1)$ converges weakly to $a^{\prime}$ (up to a subsequence) in $L^{2}(\beta)$, hence also in $L^{2}(\gamma)$. In the sequel we replace $\varphi_{n}$ by $\varphi_{n}-E_{\beta}[\varphi_{n}]$ and $Q_{1/n}\psi_{n}$ by $Q_{1/n}\psi_{n}-E_{\beta}[\varphi_{n}]$ to avoid ambiguities about the constants. We have

[TABLE]

for any $(x,y)\in{\rm I\!R}^{d}\times{\rm I\!R}^{d}$

Moreover

[TABLE]

Hence $(\varphi_{n}(x)+Q_{1/n}\psi_{n}(y)+\frac{1}{2}\>|x-y|^{2},\,n\geq 1)$ converges to [math] in $L^{1}(\gamma)$ , consequently $(Q_{1/n}\psi_{n},\,n\geq 1)$ is also uniformly integrable in $L^{1}(\gamma)$ , therefore there exists some $b^{\prime}\in L^{1}(\nu)$ which is a weak adherent point of $(Q_{1/n}\psi_{n},\,n\geq 1)$ . Therefore

[TABLE]

$\gamma$ -a.s. Let $(\varphi_{n}^{\prime},n\geq 1)$ and $(Q_{1/n}\psi_{n}^{\prime},n\geq 1)$ be the convex combinations of the sequences $(\varphi_{n})$ and $(Q_{1/n}\psi_{n})$ respectively, which converge strongly in $L^{2}(\gamma)$ and $L^{1}(\gamma)$ respectively. Let $a(x)=\limsup_{n}\varphi_{n}^{\prime}(x)$ and $b(y)=\limsup Q_{1/n}\psi_{n}^{\prime}(y)$ . We have then

[TABLE]

for all $(x,y)\in{\rm I\!R}^{d}\times{\rm I\!R}^{d}$ and

[TABLE]

$\gamma$ -almost surely. By the uniqueness of the solution of the Monge-Kantorovitch problem we should have $a=\varphi$ and $b=\psi$ $\gamma$ -a.s. Assume now that $\tilde{a}$ is another weak cluster point of $(\varphi_{n},n\geq 1)$ , then $\nabla\tilde{a}(x)=y-x$ $\gamma$ -a.s., hence $a=\tilde{a}=\varphi$ $\gamma$ -a.s. Hence $(\varphi_{n},n\geq 1)$ converges to $\varphi$ in ${\rm I\!D}_{2,1}$ . Similarly $(Q_{1/n}\psi_{n},n\geq 1)$ converges to $\psi$ in $L^{1}(\nu)$ , moreover $\nabla$ is closable on $L^{p}(\nu),\,p\geq 1$ and $\lim_{n}E_{\nu}[|\nabla Q_{1/n}\psi_{n}|^{2}]=E_{\nu}[|\nabla\psi|_{H}^{2}]$ and this completes the proof.

Lemma 3.

Let $\beta$ be the standard Gaussian measure on ${\rm I\!R}^{d}$ , $L\in L^{1}(\beta)$ be a probability density such that

[TABLE]

Let $(\varphi,\psi)$ be the Monge potentials associated to the Monge-Kantorovich problem $\Sigma(\beta,\nu)$, where $d\nu=Ld\beta$. Define $L_{n}=c_{n}L\theta_{n}(L)$ as another density, where $\theta_{n}\in C_{K}^{\infty}({\rm I\!R}^{d})$ approximates the constant $1$. Let $(\varphi_{n},\psi_{n})$ be the Monge potentials corresponding to the Monge-Kantorovich problem with quadratic cost over $\Sigma(\beta,\nu_{n})$. Then $(\varphi_{n},n\geq 1)$ converges to $\varphi$ in ${\rm I\!D}_{2,1}$ and $(\psi_{n},n\geq 1)$ converges to $\psi$ in $L^{1}(\nu)$.

**Proof:** The proof is similar to the proof of Lemma 2. Let $\gamma$ and $\gamma_{n}$ be the transport plans corresponding to the Monge-Kantorovich problems for $(\beta,\nu)$ and $(\beta,\nu_{n})$ respectively. As in Lemma 2, we have

[TABLE]

and $F_{n}(x,y)\geq 0$ for any $(x,y)\in{\rm I\!R}^{d}\times{\rm I\!R}^{d}$ . Since

[TABLE]

we have

[TABLE]

By positivity, we deduce that $(\varphi_{n},n\geq 1)$ is bounded in ${\rm I\!D}_{2,1}$ and that $(c_{n}\psi_{n}\theta_{n}(L),n\geq 1)$ is uniformly integrable in $L^{1}(\nu)$. Hence there are some weak adherence points of $(\varphi_{n},n\geq 1)$ in $L^{2}(\beta)$, denoted by $a^{\prime}$, and of $(c_{n}\psi_{n}\theta_{n}(L),n\geq 1)$ in $L^{1}(\nu)$, denoted by $b^{\prime}$, such that

[TABLE]

$\gamma$ -a.s. By taking convex combinations, we can assume these convergences to be in the strong sense. Let us define $a$ and $b$ as $a(x)=\lim\sup_{n}\,co(\varphi_{n})$ and $b(y)=\limsup_{n}\,co(c_{n}\psi_{n}\theta_{n}(L))$ , where $co$ denotes convex combinations, we obtain

[TABLE]

for all $x,y\in{\rm I\!R}^{d}$ and that

[TABLE]

$\gamma$-a.s. By uniqueness, we should have $a=\varphi$ and $b=\psi$ $\gamma$-a.s. Since this construction holds for any infinite subsequence of $(\varphi_{n})$ and $(c_{n}\theta_{n}(L)\psi_{n})$, these sequences have unique weak accumulation points, i.e., they converge weakly in ${\rm I\!D}_{2,1}$ and in $L^{1}(\nu)$ respectively. Moreover we know that $\lim_{n}E_{\beta}[|\nabla\varphi_{n}|^{2}]=E_{\beta}[|\nabla\varphi|^{2}]$, hence $(\varphi_{n})$ converges strongly in ${\rm I\!D}_{2,1}$ and hence $(c_{n}\theta_{n}(L)\psi_{n})$ converges strongly in $L^{1}(\nu)$.

After these preparations, we can prove our first regularity result:

Theorem 3.

Assume that $T=I_{W}+\nabla\varphi$ is the Monge-Brenier map which is the solution of the Monge problem with the quadratic cost $c(x,y)=|x-y|_{H}^{2}$, where $|\cdot|_{H}$ is the norm of the Cameron-Martin space, and with the target measure $d\nu=c^{-1}e^{-f}d\mu$ with $f\in{\rm I\!D}_{p,1}$ for some $p>1$, and let $S=I_{W}+\nabla\psi$ be the inverse map which maps $\nu$ to $\mu$. Assume that

[TABLE]

Then we have $\nabla^{2}\psi\in L^{2}(\nu,H\otimes_{2}H)$ , where $H\otimes_{2}H$ denotes the space of Hilbert-Schmidt operators on $H$ . In other words $\psi$ belongs to the $L^{2}(\nu)$ -domain of $\nabla$ and $\nabla\psi$ belongs to $L^{2}(\nu)$ -domain of $\nabla$ again. Besides we have the following control:

[TABLE]

**Proof:** Note that the relation (3.7) implies the closability of the gradient operator $\nabla$ in $L^{2}(\nu)$, hence $\nabla\psi$ and $\nabla^{2}\psi$ are well-defined. For the proof, let $f_{n,\varepsilon}$ be defined as before by the relation $e^{-f_{n,\varepsilon}}=E[P_{\varepsilon}e^{-f}|V_{n}]$. Let us denote by $\varphi_{n,\varepsilon}$ and by $\psi_{n,\varepsilon}$ the forward and backward Monge potentials corresponding to the measure $d\nu_{n,\varepsilon}=e^{-f_{n,\varepsilon}}d\mu$. From Proposition 1, we have

[TABLE]

Since $T_{n,\varepsilon}=I_{W}+\nabla\varphi_{n,\varepsilon}$ and $S_{n,\varepsilon}=I_{W}+\nabla\psi_{n,\varepsilon}$ are inverse to each other, we have

[TABLE]

Substituting this identity in (3.9), we obtain

[TABLE]

Let $(\varphi_{n},\psi_{n})$ be the Monge potentials corresponding to the transportation of $d\mu$ to the measure $d\nu_{n}=E[e^{-f}|V_{n}]d\mu$ . It follows from Lemma 2 that $(\nabla P_{\varepsilon}\psi_{n,\varepsilon},\varepsilon>0,n\geq 1)$ converges to $\nabla\psi$ in $L^{1}(\nu,H)$ and that $(\nabla\varphi_{n,\varepsilon},\varepsilon>0,n\geq 1)$ converges to $\nabla\varphi$ in $L^{2}(\mu,H)$ as $\varepsilon\to 0$ and as $n\to\infty$ . Hence we have

[TABLE]

Moreover, taking weak limits first as $\varepsilon\to 0$ then as $n\to\infty$ we obtain, due to the weak lower semi-continuity of the norms, that

[TABLE]

Let us now show the regularity of the forward Monge potential $\varphi$. Assume first that we have reduced the problem to the case where everything is smooth, using the approximation results proven above. Let $\nu$ be the measure defined by $d\nu=e^{-f}d\mu$. Then the following relation holds true:

[TABLE]

It is important to remark that in the equation 3.11, the infimum is taken over the set of probability measures and in the equation 3.12, the infimum is taken over the perturbations of identity of the form $U=I_{W}+u$ when $u$ runs in the set of the gradients of $1$ -convex functions, cf. [9]. Moreover, denoting $\frac{dU\mu}{d\nu}$ by $l_{U}$ , we have

[TABLE]

where $\Lambda_{u}$ is the Gaussian Jacobian associated to $U=I_{W}+u$ . Therefore $\log l_{U}\circ U=f\circ u-\log\Lambda_{u}-f$ and we get

[TABLE]

Consequently

[TABLE]

We know that the above infimum is attained at $S=T^{-1}=I_{W}+\nabla\psi$ , hence we should have

[TABLE]

for any smooth $\xi:W\to H$ such that $\|\nabla\xi\|_{2}\in L^{\infty}(\mu)$. A calculation similar to the one performed before implies that

[TABLE]

for any $\xi$ as above. Consequently we have

Theorem 4.

The dual Monge potential satisfies the relation

[TABLE]

where $\delta_{\nu}$ denotes the adjoint of $\nabla$ w.r.t. the measure $\nu$ .

We need a couple of technical results:

Lemma 4.

Let $\xi:W\to H$ be a smooth vector field, then the following results hold true:

(1) $\delta_{\nu}\xi=\delta\xi+(\nabla f,\xi)_{H}$.

(2) For any $h\in H$,

[TABLE]

(3) For any $h\in H$ and smooth $\alpha:W\to{\rm I\!R}$,

[TABLE]
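
Item (1) of Lemma 4 can be checked by a short integration by parts. The following is a sketch only, assuming that $f$ is smooth enough for the manipulations, that $a$ is a smooth cylindrical test function (a symbol introduced here for illustration), and omitting the normalizing constant of $\nu$ as in $d\nu=e^{-f}d\mu$. It uses the product rule $\delta(\alpha\xi)=\alpha\,\delta\xi-(\nabla\alpha,\xi)_{H}$ with $\alpha=e^{-f}$:

$$
\begin{aligned}
E_{\nu}[(\nabla a,\xi)_{H}]&=E[(\nabla a,e^{-f}\xi)_{H}]=E[a\,\delta(e^{-f}\xi)]\\
&=E\big[a\,\big(e^{-f}\delta\xi-(\nabla e^{-f},\xi)_{H}\big)\big]\\
&=E\big[a\,e^{-f}\big(\delta\xi+(\nabla f,\xi)_{H}\big)\big]=E_{\nu}\big[a\,\big(\delta\xi+(\nabla f,\xi)_{H}\big)\big].
\end{aligned}
$$

Since $a$ is arbitrary, this identifies $\delta_{\nu}\xi$ with $\delta\xi+(\nabla f,\xi)_{H}$.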

Lemma 5.

For any smooth $\xi:W\to H$ , we have

[TABLE]

**Proof:** By the definition of $\delta_{\nu}$, we have

[TABLE]

Besides, $(\xi,\delta\otimes\nabla\xi)=\delta\nabla_{\xi}\xi+{\,\,\rm trace\,\,}(\nabla\xi\cdot\nabla\xi)$ (cf. [19]). Hence

[TABLE]

We also have

[TABLE]

Substituting this expression in the above calculation gives

[TABLE]

Theorem 5.

Assume that $f\in L^{p}(\mu)$ for some $p>1$ , satisfying $E[|\nabla f|_{H}^{2}e^{-f}]<\infty$ . Assume moreover that it is $(1-\varepsilon)$ -convex for some $\varepsilon>0$ , in the sense that the mapping

[TABLE]

is a convex map from the Cameron-Martin space $H$ to $L^{0}(\mu)$ (i.e., the equivalence class of real-valued Wiener functionals under the topology of convergence in probability). Then the forward Monge potential $\varphi$ belongs to the Gaussian Sobolev space ${\rm I\!D}_{2,2}$ .
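
As an illustration of the hypotheses, here is a hedged one-dimensional example. It assumes that the displayed mapping is $h\mapsto f(w+h)+\frac{1-\varepsilon}{2}|h|_{H}^{2}$, in line with the notion of convexity of [7]; the element $h_{0}\in H$ with $|h_{0}|_{H}=1$, the choice of $f$ and the restriction $\varepsilon\in(0,1)$ are introduced here for illustration only. Take $f=-\frac{1-\varepsilon}{2}(\delta h_{0})^{2}$. Since $\delta h_{0}(w+h)=\delta h_{0}(w)+(h_{0},h)_{H}$, we get

$$
f(w+h)+\frac{1-\varepsilon}{2}|h|_{H}^{2}=\frac{1-\varepsilon}{2}\Big(|h|_{H}^{2}-\big(\delta h_{0}(w)+(h_{0},h)_{H}\big)^{2}\Big),
$$

which is a quadratic function of $h$ with Hessian $(1-\varepsilon)(I_{H}-h_{0}\otimes h_{0})\geq 0$, hence convex. Moreover

$$
E[e^{-f}]=\int_{{\rm I\!R}}e^{-\varepsilon x^{2}/2}\,\frac{dx}{\sqrt{2\pi}}=\varepsilon^{-1/2},\qquad E[|\nabla f|_{H}^{2}e^{-f}]=(1-\varepsilon)^{2}\int_{{\rm I\!R}}x^{2}e^{-\varepsilon x^{2}/2}\,\frac{dx}{\sqrt{2\pi}}<\infty .
$$

After normalization, $\delta h_{0}$ is a centered Gaussian of variance $1/\varepsilon$ under $\nu$, while the directions orthogonal to $h_{0}$ are unchanged. The map $T(w)=w+(\varepsilon^{-1/2}-1)\,\delta h_{0}(w)\,h_{0}$ then sends $\mu$ to $\nu$, and $T=I_{W}+\nabla\varphi$ with $\varphi=\frac{1}{2}(\varepsilon^{-1/2}-1)(\delta h_{0})^{2}$, which belongs to ${\rm I\!D}_{2,2}$, in agreement with the conclusion of the theorem.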

**Proof:** Let $f_{n}$, $n\geq 1$, be defined by $e^{-f_{n}}=P_{1/n}E[e^{-f}|V_{n}]$. Since $f_{n}$ is a smooth, $1$-convex function, the corresponding forward potential $\varphi_{n}$ is also smooth by the classical finite dimensional results (cf. [4], [20]). Let $d\nu_{n}=e^{-f_{n}}d\mu$; then we have, from Theorem 4,

[TABLE]

From Lemma 5, denoting $(I_{H}+\nabla^{2}\psi_{n})^{-1}$ by $M_{n}$, we get

[TABLE]

Since the second term on the right-hand side of the second line is positive (as we have already observed), we obtain

[TABLE]

Hence

[TABLE]

but $E_{\nu_{n}}[|\nabla\psi_{n}|_{H}^{2}]=E[|\nabla\varphi_{n}|_{H}^{2}]$ and

[TABLE]
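
The identity $E_{\nu_{n}}[|\nabla\psi_{n}|_{H}^{2}]=E[|\nabla\varphi_{n}|_{H}^{2}]$ stated above can be checked directly. The following is a sketch, writing $T_{n}=I_{W}+\nabla\varphi_{n}$ and $S_{n}=I_{W}+\nabla\psi_{n}$ by analogy with the notation $T_{n,\varepsilon}$, $S_{n,\varepsilon}$ used earlier (an assumption about notation). Since $S_{n}\circ T_{n}=I_{W}$ $\mu$-almost surely and $T_{n}\mu=\nu_{n}$,

$$
\nabla\varphi_{n}+\nabla\psi_{n}\circ T_{n}=0,\qquad\text{hence}\qquad E_{\nu_{n}}[|\nabla\psi_{n}|_{H}^{2}]=E[|\nabla\psi_{n}\circ T_{n}|_{H}^{2}]=E[|\nabla\varphi_{n}|_{H}^{2}]=d_{2}^{2}(\mu,\nu_{n}),
$$

where the last equality expresses the optimality of $T_{n}$ for the quadratic cost.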

We also have

[TABLE]

Consequently we get

[TABLE]

and the claim follows by taking the limit on the r.h.s. and the limit inferior on the l.h.s., even with an explicit bound:

[TABLE]

The next corollary, which concerns the regularity of the dual potential $\psi$, follows from Theorem 3 and Lemma 5:

Corollary 1.

Assume that $E[\|\nabla^{2}f\|^{2}_{\infty}e^{-f}]<\infty$ , where $\|\cdot\|_{\infty}$ denotes the operator norm on $H$ . Then $(\delta_{\nu}\circ\nabla)\psi={\mathcal{L}}_{\nu}\psi$ belongs to $L^{2}(\nu)$ .
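
To make the operator in the corollary concrete, it can be expanded with item (1) of Lemma 4 applied to $\xi=\nabla\psi$. This is a formal sketch that assumes enough smoothness of $f$ and $\psi$; the test functions $a$, $b$ are symbols introduced here for illustration:

$$
{\mathcal{L}}_{\nu}\psi=\delta_{\nu}\nabla\psi=\delta\nabla\psi+(\nabla f,\nabla\psi)_{H},\qquad E_{\nu}[a\,{\mathcal{L}}_{\nu}b]=E_{\nu}[(\nabla a,\nabla b)_{H}].
$$

In other words, ${\mathcal{L}}_{\nu}$ is the Ornstein-Uhlenbeck type operator $\delta\circ\nabla$ perturbed by the first order term $(\nabla f,\nabla\,\cdot\,)_{H}$, and it is symmetric with respect to $\nu$.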

We can get rid of the hypothesis of the second order differentiability of $f$ by increasing the degree of integrability of $|\nabla f|_{H}$ . We show first

Lemma 6.

Assume that $E_{\nu}[|\nabla f|_{H}^{4}]<\infty$ , then

[TABLE]

where $c_{4}$ is a universal constant.

**Proof:** We can work without loss of generality on the classical Wiener space, which carries the canonical Brownian motion $(W_{t},t\in[0,1])$. Let $d_{4}$ be the fourth order Wasserstein distance, i.e.,

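$$
d_{4}^{4}(\mu,\nu)=\inf\Big(\int_{W\times W}|x-y|_{H}^{4}\,d\beta(x,y):\ \beta\in\Sigma(\mu,\nu)\Big).
$$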

Let $L=ce^{-f}$. Using the Itô representation theorem, we can represent it as $L=\exp(-\int_{0}^{1}(\dot{v}_{s},dW_{s})-\frac{1}{2}\int_{0}^{1}|\dot{v}_{s}|^{2}ds)$, where $(\dot{v}_{t},t\in[0,1])$ is an element of $L^{2}(\mu,L^{2}([0,1],{\rm I\!R}^{d}))$ such that $w\to\dot{v}_{t}(w)$ is ${\mathcal{F}}_{t}$-measurable for almost all $t\in[0,1]$. Moreover, from the Clark formula, we have

[TABLE]

where $D_{t}f$ is the Lebesgue density of $\nabla f$. Define $V:W\to W$ as $V(w)=w+\int_{0}^{\cdot}\dot{v}_{s}ds$. The Girsanov theorem implies that the measure $(V\times I_{W})(Ld\mu)$ belongs to $\Sigma(\mu,\nu)$, hence

[TABLE]

where $c_{4}$ is a universal constant coming from the Burkholder-Davis-Gundy inequality. By the cyclic monotonicity of the $H$ -sections of $\psi$ , we know that $(I\times(I_{W}+\nabla\varphi))(\mu)=((I_{W}+\nabla\psi)\times I_{W})(\nu)$ is the unique solution of the Monge-Kantorovitch problem of fourth degree, hence $E_{\nu}[|\nabla\psi|_{H}^{4}]=d_{4}^{4}(\mu,\nu)$ .
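
The coupling step of this proof can be spelled out as follows. This is a sketch of the first inequality only; the passage from the last expression to $c_{4}E_{\nu}[|\nabla f|_{H}^{4}]$ via the Clark formula and the Burkholder-Davis-Gundy inequality is as in the text. Since $(V\times I_{W})(Ld\mu)\in\Sigma(\mu,\nu)$ and $V(w)-w=\int_{0}^{\cdot}\dot{v}_{s}ds\in H$,

$$
d_{4}^{4}(\mu,\nu)\leq\int_{W\times W}|x-y|_{H}^{4}\,d\big((V\times I_{W})(Ld\mu)\big)(x,y)=E\big[L\,|V(w)-w|_{H}^{4}\big]=E_{\nu}\Big[\Big(\int_{0}^{1}|\dot{v}_{s}|^{2}ds\Big)^{2}\Big].
$$

Combined with $E_{\nu}[|\nabla\psi|_{H}^{4}]=d_{4}^{4}(\mu,\nu)$, this bounds the fourth moment of $\nabla\psi$ under $\nu$ by that of the drift $\dot{v}$.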

Theorem 6.

Assume that $E_{\nu}[|\nabla f|^{4}]=E[e^{-f}|\nabla f|^{4}]<\infty$ , then $E_{\nu}[|{\mathcal{L}}_{\nu}\psi|^{2}]<\infty$ .

**Proof:** Let $(f_{n},n\geq 1)$ be the smooth approximations of $f$ defined by $e^{-f_{n}}=P_{1/n}E[e^{-f}|V_{n}]$ and let $(\psi_{n},n\geq 1)$ be the corresponding dual Monge potentials converging to $\psi$ in $L^{1}(\nu)$ such that $\lim_{n}E_{\nu}[|\nabla\psi_{n}-\nabla\psi|^{2}]=0$. We have

[TABLE]

where $(\cdot,\cdot)_{2}$ denotes the Hilbert-Schmidt inner product. Applying the Hölder inequality again to the remaining terms, and using the formula given in Lemma 5 together with Lemma 6, we get

[TABLE]

Since $(P_{1/n}{\mathcal{L}}_{\nu_{n}}\psi_{n},n\geq 1)$ converges to ${\mathcal{L}}_{\nu}\psi$ on the smooth cylindrical functions, the inequality implies that this sequence is bounded in $L^{2}(\nu)$ . Using the inequality that we have already proven:

[TABLE]

and using the weak lower semi-continuity, we obtain

[TABLE]

Bibliography


[1] P. J. Bickel and D. A. Freedman: “Some asymptotic theory for the bootstrap”. Ann. Statist. 9(6), 1196–1217 (1981).
[2] V. I. Bogachev and A. V. Kolesnikov: “Sobolev regularity for the Monge-Ampère equation in the Wiener space”. Kyoto J. Math. 53, no. 4, 713–738 (2013).
[3] Y. Brenier: “Polar factorization and monotone rearrangement of vector valued functions”. Comm. Pure Appl. Math. 44, 375–417 (1991).
[4] L. A. Caffarelli: “The regularity of mappings with a convex potential”. J. Am. Math. Soc. 5, 99–104 (1992).
[5] V. Chernozhukov: “Uniform convergence of transport maps”. arXiv:1412.8434v1.
[6] S. Fang and V. Nolot: “Sobolev estimates for optimal transport maps on Gaussian spaces”. J. Funct. Anal. 266, no. 8, 5045–5084 (2014).
[7] D. Feyel and A. S. Üstünel: “The notion of convexity and concavity on Wiener space”. J. Funct. Anal. 176, 400–428 (2000).
[8] D. Feyel and A. S. Üstünel: “Transport of measures on Wiener space and the Girsanov theorem”. Comptes Rendus Mathématiques 334, Issue 1, 1025–1028 (2002).
