Regularity of Schrödinger's functional equation in the weak topology and moment measures

Toshio Mikami

arXiv:1812.10908·math.PR·March 31, 2020

TL;DR

This paper investigates the continuity and measurability of solutions to Schrödinger's functional equation under weak topology, and applies these results to construct convex functions with prescribed moment measures via stochastic optimal transportation.

Contribution

It extends previous work by analyzing the problem under weak topology and introduces a method to construct convex functions with specific moment measures using zero noise limits.

Findings

01

Established continuity and measurability of solutions in weak topology.

02

Constructed convex functions with given moment measures.

03

Linked Schrödinger's equation solutions to stochastic optimal transportation.

Abstract

We study the continuity and the measurability of the solution to Schrödinger's functional equation, with respect to space, kernel and marginals, provided the space of all Borel probability measures is endowed with the weak topology. This is a continuation of our previous result where the space of all Borel probability measures was endowed with the strong topology. As an application, we construct a convex function of which the moment measure is a given probability measure, by the zero noise limit of a class of stochastic optimal transportation problems.

Equations

$$\begin{cases}\mu_{1}(dx)=\nu_{1}(dx)\int_{S}q(x,y)\nu_{2}(dy),\\ \mu_{2}(dy)=\nu_{2}(dy)\int_{S}q(x,y)\nu_{1}(dx).\end{cases}$$

$$\nu_{1}(K_{m(\mu_{1},\mu_{2})})=\nu_{2}(K_{m(\mu_{1},\mu_{2})}),$$

$$m(\mu_{1},\mu_{2}):=\min\{m\geq 1\mid\mu_{1}(K_{m})\mu_{2}(K_{m})>0\},\quad C(\mu_{1},\mu_{2}):=\biggl(\frac{\nu_{2}(K_{m(\mu_{1},\mu_{2})})}{\nu_{1}(K_{m(\mu_{1},\mu_{2})})}\biggr)^{1/2}.$$

$$\mu(dxdy):=\nu_{1}(dx)q(x,y)\nu_{2}(dy),$$

$$u_{i}(x_{i}):=\log\biggl(\int_{S}q(x_{1},x_{2})\nu_{j}(dx_{j})\biggr),\quad i,j=1,2,\ i\neq j.$$

$$\mu(dxdy)=q(x,y)\exp(-u_{1}(x)-u_{2}(y))\mu_{1}(dx)\mu_{2}(dy).$$

$$\mu_{i}(dx_{i})=\exp(-u_{i}(x_{i}))\mu_{i}(dx_{i})\int_{S}q(x_{1},x_{2})\exp(-u_{j}(x_{j}))\mu_{j}(dx_{j}).$$

$$\nu_{i}(dx)=\nu_{i}(dx;q,\mu_{1},\mu_{2}),\quad u_{i}(x)=u_{i}(x;q,\mu_{1},\mu_{2}),\quad i=1,2.$$

$$\nu_{i}(dx;\cdot,\cdot,\cdot):C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow{\cal M}(S),$$

$$\{u_{i}(x;\cdot,\cdot,\cdot)\}_{x\in S}:C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow C(S)$$

$$\int_{S}f(x)\nu_{i}(dx;\cdot,\cdot,\cdot):C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow\mathbb{R},\quad f\in C_{0}(S)$$

$$u_{i}:S\times C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow\mathbb{R}\cup\{\infty\}.$$

$$\nu_{i}(dx_{i};q,\mu_{1},\mu_{2})=\frac{\nu_{i}(dx_{i};q,\mu_{1},\mu_{2})\,\nu_{j}(K_{m(\mu_{1},\mu_{2})};q,\mu_{1},\mu_{2})}{\bigl(\nu_{1}(K_{m(\mu_{1},\mu_{2})};q,\mu_{1},\mu_{2})\,\nu_{2}(K_{m(\mu_{1},\mu_{2})};q,\mu_{1},\mu_{2})\bigr)^{1/2}}.$$

$$\mu(dx):=(Du)_{\#}(\exp(-u(x))dx).$$

$$W_{2}(\mu_{1},\mu_{2}):=\biggl(\inf\biggl\{\int_{\mathbb{R}^{d}\times\mathbb{R}^{d}}|y-x|^{2}m(dxdy)\biggm|m(dx\times\mathbb{R}^{d})=\mu_{1}(dx),\ m(\mathbb{R}^{d}\times dy)=\mu_{2}(dy)\biggr\}\biggr)^{1/2}.$$

$$dX^{\varepsilon,\gamma}(t)=\gamma(t)dt+\sqrt{\varepsilon}\,dW(t).$$

$$V_{\varepsilon}(P_{0},P_{1}):=\inf\biggl\{E\biggl[\int_{0}^{1}\frac{1}{2\varepsilon}|\gamma(t)|^{2}dt\biggr]\biggm|PX^{\varepsilon,\gamma}(t)^{-1}=P_{t},\ t=0,1\biggr\},$$

$${\cal S}(P):=\begin{cases}\int_{\mathbb{R}^{d}}p(x)\log p(x)dx,&\text{if }p(x):=\frac{P(dx)}{dx}\text{ exists},\\ \infty,&\text{otherwise.}\end{cases}$$

$\Psi_{\varepsilon,r}(P_{1}):=$

$$B_{r}:=\{x\in\mathbb{R}^{d}:|x|\leq r\}.$$

$V_{\varepsilon}(P_{0},P_{1})=$

$\frac{\partial\varphi(t,x)}{\partial t}+\frac{\varepsilon}{2}\triangle_{x}\varphi(t,x)+\frac{\varepsilon}{2}|D_{x}\varphi(t,x)|^{2}$

$\varphi(1,x)$

$g_{\varepsilon}(t,z):=$

$g_{\varepsilon}(t)(x,y):=$

$dX^{\varepsilon}(t)=$

$PX^{\varepsilon}(t)^{-1}=$

$P(X^{\varepsilon}(0),X^{\varepsilon}(1))^{-1}(dxdy)=$

$f_{o}(y)-\varphi(0,x;f_{o})=$

$V_{\varepsilon}(P_{0},P_{1})=$

Taxonomy

Topics: Stochastic processes and financial applications · Advanced mathematical theories · Advanced Mathematical Modeling in Engineering

Full text

Regularity of Schrödinger’s functional equation in the weak topology and moment measures

2010 Mathematics Subject Classification: Primary 60G30; Secondary 93E20

Toshio Mikami

This work was supported by JSPS KAKENHI Grant Numbers JP26400136 and JP16H03948.

1 Introduction

E. Schrödinger considered the following problem in order to find the statistical properties of a particle on a finite time interval. Suppose that there exist $N\geq 2$ particles in a set $A:=\{a_{1},\cdots,a_{n_{0}}\}\subset{\bf R}^{3}$ and that each particle moves independently, with a given transition probability, to a set $B:=\{b_{1},\cdots,b_{n_{1}}\}\subset{\bf R}^{3}$, where $1\leq n_{0},n_{1}\leq N$. Find the maximal probability of such events, provided the numbers of particles at each point $a_{i}$, $b_{j}$ are fixed (see section 7 in [41] and also [40]). Though he did not succeed in finding the maximal probability, he obtained Euler’s equation for the variational problem above. The continuum limit is called Schrödinger’s functional equation (see [5, 9, 22, 24] for the solution of this problem). S. Bernstein [4] generalized Schrödinger’s idea and introduced the so-called Bernstein processes, which are also called reciprocal processes. The theory of stochastic differential equations for Schrödinger’s functional equation was given by B. Jamison [25]. The solution is Doob’s h-path process (see [15]) with two given end point marginals. Schrödinger’s problem is also related to the theory of large deviations, the optimal mass transportation problem, entropic estimates and functional inequalities (see, e.g. [1, 2, 11, 12, 16, 18, 23, 26, 27, 28, 30, 31, 35, 36, 37, 43] and the references therein).

We describe E. Schrödinger’s functional equation (see e.g. [24]) in the setting considered in this paper. Let $S$ be a $\sigma$ -compact metric space and $q\in C(S\times S;(0,\infty))$ . For Borel probability measures $\mu_{1},\mu_{2}$ on $S$ , find nonnegative $\sigma$ -finite Borel measures $\nu_{1},\nu_{2}$ on $S$ for which the following holds:

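$$\begin{cases}\mu_{1}(dx)=\nu_{1}(dx)\int_{S}q(x,y)\nu_{2}(dy),\\ \mu_{2}(dy)=\nu_{2}(dy)\int_{S}q(x,y)\nu_{1}(dx).\end{cases}\tag{1.1}$$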

It is known that there exists a solution $(\nu_{1},\nu_{2})$ of (1.1) (see [9, 24]). The pair $(\nu_{1},\nu_{2})$ is unique only up to a multiplicative constant, whereas the product measure $\nu_{1}\times\nu_{2}$ is unique. Indeed, for any $C>0$, $(C\nu_{1},C^{-1}\nu_{2})$ is also a solution of (1.1). By the uniqueness of the solution to (1.1), we mean that of the product measure $\nu_{1}\times\nu_{2}$. Let $\{K_{m}\}_{m\geq 1}$ be a nondecreasing sequence of compact subsets of $S$ such that $S=\cup_{m\geq 1}K_{m}$, where $K_{1}\equiv S$ when $S$ is compact. When we consider $\nu_{1}$ and $\nu_{2}$ separately, replacing $(\nu_{1},\nu_{2})$ by $(C(\mu_{1},\mu_{2})\nu_{1},C(\mu_{1},\mu_{2})^{-1}\nu_{2})$ if necessary, we assume that the following holds:

$$\nu_{1}(K_{m(\mu_{1},\mu_{2})})=\nu_{2}(K_{m(\mu_{1},\mu_{2})}),\tag{1.2}$$

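where

$$m(\mu_{1},\mu_{2}):=\min\{m\geq 1\mid\mu_{1}(K_{m})\mu_{2}(K_{m})>0\},\quad C(\mu_{1},\mu_{2}):=\biggl(\frac{\nu_{2}(K_{m(\mu_{1},\mu_{2})})}{\nu_{1}(K_{m(\mu_{1},\mu_{2})})}\biggr)^{1/2}.$$

$$\mu(dxdy):=\nu_{1}(dx)q(x,y)\nu_{2}(dy),\tag{1.3}$$

$$u_{i}(x_{i}):=\log\biggl(\int_{S}q(x_{1},x_{2})\nu_{j}(dx_{j})\biggr),\quad i,j=1,2,\ i\neq j.\tag{1.4}$$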

Then $\exp(u_{1}(x))$ and $\exp(u_{2}(x))$ are positive and

$$\mu(dxdy)=q(x,y)\exp(-u_{1}(x)-u_{2}(y))\mu_{1}(dx)\mu_{2}(dy).\tag{1.5}$$

(1.1) can be rewritten as follows: for $i,j=1,2$ , $i\neq j$ ,

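$$\mu_{i}(dx_{i})=\exp(-u_{i}(x_{i}))\mu_{i}(dx_{i})\int_{S}q(x_{1},x_{2})\exp(-u_{j}(x_{j}))\mu_{j}(dx_{j}).\tag{1.6}$$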

In particular, Schrödinger’s problem (1.1) is equivalent to finding functions $u_{1}$ and $u_{2}$ for which (1.6) holds. The pair $(u_{1},u_{2})$ is unique only up to an additive constant, whereas the sum $u_{1}(x_{1})+u_{2}(x_{2})$ is unique.
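On a finite space the system (1.1) becomes a pair of vector equations, which makes the roles of $\nu_{i}$, $u_{i}$ and the normalization (1.2) concrete. The following is a minimal sketch, not the construction used in this paper: it assumes a three-point space with strictly positive weights and solves the system by alternating rescaling (often called Sinkhorn iteration or iterative proportional fitting). The names `solve_schroedinger_system` and `normalize` are hypothetical.

```python
import math

def solve_schroedinger_system(q, mu1, mu2, iters=500):
    """Alternating rescaling for nu1, nu2 on a finite space (illustrative only).

    q   : matrix of positive kernel values q(x_k, y_l)
    mu1 : positive weights of the first marginal, summing to 1
    mu2 : positive weights of the second marginal, summing to 1
    """
    n, m = len(mu1), len(mu2)
    nu1, nu2 = [1.0] * n, [1.0] * m
    for _ in range(iters):
        nu1 = [mu1[k] / sum(q[k][l] * nu2[l] for l in range(m)) for k in range(n)]
        nu2 = [mu2[l] / sum(q[k][l] * nu1[k] for k in range(n)) for l in range(m)]
    return nu1, nu2

def normalize(nu1, nu2):
    """Replace (nu1, nu2) by (C nu1, nu2 / C) so that both have the same total mass."""
    c = math.sqrt(sum(nu2) / sum(nu1))
    return [c * v for v in nu1], [v / c for v in nu2]

# Assumed toy data: three points on the line and a Gaussian-type kernel.
xs = [0.0, 0.5, 1.0]
q = [[math.exp(-(x - y) ** 2) for y in xs] for x in xs]
mu1 = [0.2, 0.3, 0.5]
mu2 = [0.6, 0.1, 0.3]
nu1, nu2 = normalize(*solve_schroedinger_system(q, mu1, mu2))

# Potentials: log of the kernel integrated against the other measure.
u1 = [math.log(sum(q[k][l] * nu2[l] for l in range(3))) for k in range(3)]
u2 = [math.log(sum(q[k][l] * nu1[k] for k in range(3))) for l in range(3)]
# Coupling nu1(dx) q(x, y) nu2(dy); its marginals should be mu1 and mu2.
coupling = [[nu1[k] * q[k][l] * nu2[l] for l in range(3)] for k in range(3)]
```

In this discrete setting the rescaling $(C\nu_{1},C^{-1}\nu_{2})$ leaves `coupling` unchanged and shifts `u1` and `u2` by opposite constants, which is the non-uniqueness described above.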

Let ${\cal M}(S)$ and ${\cal P}(S)$ denote the space of all Radon measures and that of all Borel probability measures on $S$ , respectively, where a Radon measure means a locally finite and inner regular Borel measure. It is easy to see that $\nu_{1}$ and $\nu_{2}$ are functionals of $\mu_{1}$ , $\mu_{2}$ and $q$ :

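$$\nu_{i}(dx)=\nu_{i}(dx;q,\mu_{1},\mu_{2}),\quad u_{i}(x)=u_{i}(x;q,\mu_{1},\mu_{2}),\quad i=1,2.\tag{1.7}$$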

In [33], we considered the case where ${\cal P}(S)$ is endowed with the strong topology and showed that if $S$ is compact, then the following is continuous:

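$$\nu_{i}(dx;\cdot,\cdot,\cdot):C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow{\cal M}(S),$$

$$\{u_{i}(x;\cdot,\cdot,\cdot)\}_{x\in S}:C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow C(S)$$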

and $u_{i}\in C(S\times C(S\times S)\times{\cal P}(S)\times{\cal P}(S))$ . Here ${\cal M}(S)$ is endowed with the strong topology and $C(S\times S)$ and $C(S)$ are endowed with the topology induced by the uniform convergence on $S\times S$ and $S$ , respectively. We also showed that if $S$ is $\sigma$ -compact, then the following is Borel measurable:

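$$\int_{S}f(x)\nu_{i}(dx;\cdot,\cdot,\cdot):C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow\mathbb{R},\quad f\in C_{0}(S),$$

$$u_{i}:S\times C(S\times S)\times{\cal P}(S)\times{\cal P}(S)\longrightarrow\mathbb{R}\cup\{\infty\}.$$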

As an application of this measurability result, we showed that the coefficients of the mean field PDE system for the h-path process with given two end point marginals are measurable functions of space, time and marginal.

Remark 1.1

(1.2) was assumed in [33] and implies that for $i,j=1,2$ ,

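$$\nu_{i}(dx_{i};q,\mu_{1},\mu_{2})=\frac{\nu_{i}(dx_{i};q,\mu_{1},\mu_{2})\,\nu_{j}(K_{m(\mu_{1},\mu_{2})};q,\mu_{1},\mu_{2})}{\bigl(\nu_{1}(K_{m(\mu_{1},\mu_{2})};q,\mu_{1},\mu_{2})\,\nu_{2}(K_{m(\mu_{1},\mu_{2})};q,\mu_{1},\mu_{2})\bigr)^{1/2}}.$$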

In particular, the measurability of $(q,\mu_{1},\mu_{2})\mapsto\nu_{1}(dx_{1};q,\mu_{1},\mu_{2})\nu_{2}(dx_{2};q,\mu_{1},\mu_{2})$ implies that of $(q,\mu_{1},\mu_{2})\mapsto\nu_{i}(dx_{i};q,\mu_{1},\mu_{2})$ .

In this paper we consider the case where ${\cal P}(S)$ is endowed with the weak topology and show the continuity and measurability results on $\nu_{i}$ and $u_{i}$ (see Theorem 2.1 and Corollaries 2.1-2.3 in section 2). Our continuity result in the weak topology is useful when one considers the existence of a minimizer of a variational problem (see [10] for the continuity result on optimal transport). Indeed, it is not easy to show that a minimizing sequence is compact in the strong topology. As an application (see Theorem 2.2 in section 2), we give a stochastic optimal transportation approach to moment measures (see [13, 39]). The definition of a moment measure of a convex function is the following.

Definition 1.1

Given a convex function $u:\mathbb{R}^{d}\longrightarrow\mathbb{R}\cup\{\infty\}$ , the following is called the moment measure of $u$ :

$$\mu(dx):=(Du)_{\#}(\exp(-u(x))dx).$$
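Definition 1.1 can be checked by hand in dimension one. As an illustration chosen here (it does not appear in the paper), take $u(x)=\log(\pi\cosh x)$, which is convex with $\int_{\mathbb{R}}\exp(-u(x))dx=1$ and $Du(x)=\tanh x$. A direct change of variables suggests that its moment measure is the arcsine law on $(-1,1)$, with distribution function $1/2+\arcsin(t)/\pi$. The sketch below compares that closed form with a midpoint-rule evaluation of the push-forward; the function names and the truncation of the real line to $[-30,30]$ are assumptions of the sketch.

```python
import math

def u(x):
    # Convex function with exp(-u(x)) = 1 / (pi * cosh x), a probability density on R.
    return math.log(math.pi * math.cosh(x))

def du(x):
    # Derivative of u.
    return math.tanh(x)

def moment_measure_cdf(t, lo=-30.0, hi=30.0, n=100_000):
    """Mass that the push-forward of exp(-u) dx under Du gives to (-inf, t]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h          # midpoint of the k-th cell
        if du(x) <= t:
            total += math.exp(-u(x)) * h
    return total

def arcsine_cdf(t):
    # Distribution function of the arcsine law on (-1, 1).
    return 0.5 + math.asin(t) / math.pi

if __name__ == "__main__":
    for t in (-0.9, 0.0, 0.5):
        print(t, moment_measure_cdf(t), arcsine_cdf(t))
```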

Remark 1.2

If $\mu$ is a moment measure of a convex function $u:\mathbb{R}^{d}\longrightarrow\mathbb{R}\cup\{\infty\}$ , then $\exp(-u(x))dx\delta_{Du(x)}(dy)$ is the unique minimizer of the $2$ -Wasserstein distance $W_{2}(\exp(-u(x))dx,\mu(dx))$ , provided $\int_{{\mathbb{R}}^{d}}\exp(-u(x))dx=1$ and $W_{2}(\exp(-u(x))dx,\mu(dx))$ is finite (see [7, 8, 43]). Here $\delta_{x}(dy)$ denotes the delta measure on $\{x\}$ and for $\mu_{1},\mu_{2}\in{\cal P}(\mathbb{R}^{d})$ ,

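$$W_{2}(\mu_{1},\mu_{2}):=\biggl(\inf\biggl\{\int_{\mathbb{R}^{d}\times\mathbb{R}^{d}}|y-x|^{2}m(dxdy)\biggm|m(dx\times\mathbb{R}^{d})=\mu_{1}(dx),\ m(\mathbb{R}^{d}\times dy)=\mu_{2}(dy)\biggr\}\biggr)^{1/2}.$$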
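For intuition about $W_{2}$, the one-dimensional case with finitely many equally weighted atoms is the easiest to compute: there the monotone (sorted) pairing is an optimal coupling. The following sketch covers only that special case, which is an assumption of the example and not the setting of Remark 1.2; `w2_empirical_1d` is a hypothetical name.

```python
import math

def w2_empirical_1d(xs, ys):
    """W_2 between (1/n) sum_k delta_{x_k} and (1/n) sum_k delta_{y_k} on the real line."""
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equally many atoms in both measures")
    # Pair the k-th smallest atom of one measure with the k-th smallest of the other.
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

if __name__ == "__main__":
    print(w2_empirical_1d([0.0, 1.0], [1.0, 2.0]))
```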

We describe an application of our continuity result more precisely. Let $\varepsilon>0$ and let $W(t)$ and $\gamma(t)=\gamma(t;\omega)$ denote a $d$ -dimensional Brownian motion and a progressively measurable $\mathbb{R}^{d}$ -valued stochastic process on a filtered probability space, respectively. Consider the following SDE in a weak sense (see e.g. [20]):

$$dX^{\varepsilon,\gamma}(t)=\gamma(t)dt+\sqrt{\varepsilon}\,dW(t).$$

For $P_{0},P_{1}\in{\cal P}(\mathbb{R}^{d})$ ,

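$$V_{\varepsilon}(P_{0},P_{1}):=\inf\biggl\{E\biggl[\int_{0}^{1}\frac{1}{2\varepsilon}|\gamma(t)|^{2}dt\biggr]\biggm|PX^{\varepsilon,\gamma}(t)^{-1}=P_{t},\ t=0,1\biggr\},$$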

where $V_{\varepsilon}(P_{0},P_{1}):=\infty$ if the set over which the infimum is taken is empty (see [1, 2, 16, 18, 19, 27] for related problems on large deviations). For $P\in{\cal P}(\mathbb{R}^{d})$ ,

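$${\cal S}(P):=\begin{cases}\int_{\mathbb{R}^{d}}p(x)\log p(x)dx,&\text{if }p(x):=\frac{P(dx)}{dx}\text{ exists},\\ \infty,&\text{otherwise.}\end{cases}$$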
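Two standard values of ${\cal S}$ are useful as sanity checks: for the centered Gaussian with variance $\sigma^{2}$ on the line one has ${\cal S}=-\frac{1}{2}\log(2\pi e\sigma^{2})$, and for the uniform distribution on $[-r,r]$ one has ${\cal S}=-\log(2r)$. The sketch below approximates $\int p\log p\,dx$ by the midpoint rule in dimension one; the name `entropy_functional` and the truncated integration range are assumptions of the example.

```python
import math

def entropy_functional(p, lo, hi, n=20_000):
    """Midpoint-rule approximation of the integral of p(x) log p(x) over [lo, hi]."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        val = p(lo + (k + 0.5) * h)
        if val > 0.0:                    # convention: 0 log 0 = 0
            total += val * math.log(val) * h
    return total

def gaussian_density(sigma):
    return lambda x: math.exp(-x * x / (2.0 * sigma ** 2)) / math.sqrt(2.0 * math.pi * sigma ** 2)

def uniform_density(r):
    return lambda x: 1.0 / (2.0 * r) if abs(x) <= r else 0.0

if __name__ == "__main__":
    print(entropy_functional(gaussian_density(2.0), -40.0, 40.0))
    print(entropy_functional(uniform_density(1.5), -2.0, 2.0))
```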

For $\varepsilon,r>0$ , $P_{1}\in{\cal P}(\mathbb{R}^{d})$ ,

[TABLE]

where

$$B_{r}:=\{x\in\mathbb{R}^{d}:|x|\leq r\}.$$
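The controlled dynamics and the running cost that enter $V_{\varepsilon}$ can be simulated directly. The following is a minimal sketch under stated assumptions: $d=1$, a Markov feedback drift, noise coefficient $\sqrt{\varepsilon}$ (consistent with the h-path process for $\sqrt{\varepsilon}W(t)$ discussed later in this section), and an Euler-Maruyama time discretization. A fixed drift only gives an upper bound for the infimum, and only for the marginals that the drift happens to produce. All function names are hypothetical.

```python
import math
import random

def simulate_controlled_path(x0, drift, eps, steps, rng):
    """Euler-Maruyama for dX = drift(t, X) dt + sqrt(eps) dW on [0, 1].

    Returns X(1) and the running cost, the time integral of |drift|^2 / (2 eps).
    """
    dt = 1.0 / steps
    x, cost = x0, 0.0
    for k in range(steps):
        g = drift(k * dt, x)
        cost += g * g / (2.0 * eps) * dt
        x += g * dt + math.sqrt(eps * dt) * rng.gauss(0.0, 1.0)
    return x, cost

def estimate_cost(x0, drift, eps, steps=200, paths=2000, seed=0):
    """Monte Carlo averages of X(1) and of the running cost over independent paths."""
    rng = random.Random(seed)
    results = [simulate_controlled_path(x0, drift, eps, steps, rng) for _ in range(paths)]
    mean_x1 = sum(x for x, _ in results) / paths
    mean_cost = sum(c for _, c in results) / paths
    return mean_x1, mean_cost

if __name__ == "__main__":
    # Constant drift 1: the cost is exactly 1 / (2 eps) and X(1) is Gaussian around x0 + 1.
    print(estimate_cost(0.0, lambda t, x: 1.0, eps=0.25))
```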

By our weak continuity result of $(q,\mu_{1},\mu_{2})\mapsto\mu(dxdy;q,\mu_{1},\mu_{2})$ , we can easily prove the existence of a minimizer $P_{0,r,\varepsilon}$ of $\Psi_{\varepsilon,r}(P_{1})$ from the lower semicontinuities of a relative entropy and of ${\cal S}$ with respect to the weak topology (see (1.20) and also Lemmas 3.4 and 3.5 in section 3). We show that a subsequence of $\{p_{0,r,\varepsilon}(x)dx\}_{\varepsilon>0}$ weakly converges, as $\varepsilon\to 0$ , to a Borel probability measure $p_{0}(x)dx$ such that $-\log p_{0}(x)$ is convex and $P_{1}$ is a moment measure of $-\log p_{0}(x)$ . This is formally implied by the representation of $P_{0,r,\varepsilon}$ and the SDE for the minimizer of $V_{\varepsilon}(P_{0,r,\varepsilon},P_{1})$ (see (2.11) and (1.17)). We also show that $p_{0,r,\varepsilon}(x)$ has a subsequence which uniformly converges, as $\varepsilon\to 0$ , to $p_{0}(x)$ , provided $P_{1}$ is compactly supported.

$\Psi_{\varepsilon,r}(P_{1})$ formally converges, as $\varepsilon\to 0$, to the functional considered in [39], where the infimum is taken over ${\cal P}(\mathbb{R}^{d})$ instead of ${\cal P}(B_{r})$. Our approach makes the proof easier than that of [39] since ${\cal P}(B_{r})$ is compact in the weak topology. Regrettably, it cannot be applied if ${\cal P}(B_{r})$ is replaced by ${\cal P}(\mathbb{R}^{d})$.

In the proof of the representation of $P_{0,r,\varepsilon}$ in (2.11), we also make use of properties of the solution to Schrödinger’s functional equation and the duality theorem for $V_{\varepsilon}(P_{0},P_{1})$ :

[TABLE]

Here the supremum is taken over all classical solutions $\varphi(t,x;f)$ to the following Hamilton-Jacobi-Bellman PDE:

[TABLE]

(see [30, 31, 34, 42] and the references therein).

[TABLE]

It is known that for any $P_{0},P_{1}\in{\cal P}({\bf R}^{d})$ for which $P_{1}(dy)\ll dy$ , there exists the unique weak solution to the following two end points problem of SDE (see [25] and also [33, 34]):

[TABLE]

$X^{\varepsilon}(t)$ is called the h-path process for $\sqrt{\varepsilon}W(t)$ on $[0,1]$ with initial and terminal distribution $P_{0}$ and $P_{1}$ , respectively. The following is also known:

[TABLE]

Suppose that $V_{\varepsilon}(P_{0},P_{1})$ is finite (see Remark 2.2 in section 2 for a sufficient condition). Then $X^{\varepsilon}$ in (1.17) is the unique minimizer of $V_{\varepsilon}(P_{0},P_{1})$ (see [14, 21], [26]-[38], [42], [44] and the references therein). Besides, there exists $f_{o}\in L^{1}(P_{1})$ which is unique up to a constant such that the following holds (see [30, 31, 33, 34, 42] and the references therein and also (1.5)):

[TABLE]

In particular, the following holds:

[TABLE]

Here $H$ denotes the relative entropy of two measures: for $m,n\in{\cal P}(S\times S)$ ,

[TABLE]
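On a finite set the relative entropy reduces to a finite sum, which makes its two basic properties easy to see: it is nonnegative, and it is infinite when the first measure is not absolutely continuous with respect to the second. The sketch below assumes the standard definition $H(m|n)=\int\log(dm/dn)\,dm$ when $m\ll n$ and $\infty$ otherwise; `relative_entropy` is a hypothetical name.

```python
import math

def relative_entropy(m, n):
    """H(m | n) for probability vectors m and n on the same finite set."""
    total = 0.0
    for mk, nk in zip(m, n):
        if mk == 0.0:
            continue                 # convention: 0 log 0 = 0
        if nk == 0.0:
            return math.inf          # m is not absolutely continuous with respect to n
        total += mk * math.log(mk / nk)
    return total

if __name__ == "__main__":
    print(relative_entropy([0.5, 0.5], [0.25, 0.75]))
```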

Remark 1.3

If $V_{\varepsilon}(P_{0},P_{1})$ is finite, then $P_{1}(dy)\ll dy$ . Indeed, $V_{\varepsilon}(P_{0},P_{1})$ is the relative entropy of $P(X^{\varepsilon})^{-1}$ with respect to $P_{0}\ast P(\sqrt{\varepsilon}W)^{-1}$ on $C([0,1];\mathbb{R}^{d})$ and

[TABLE]

Here $\ast$ denotes the convolution of two measures.

In section 2 we state our main results. We prove them in section 4, using lemmas given in section 3.

2 Main result

In this section we state our main results. We first describe assumptions precisely.

(A1) $S$ is a complete $\sigma$ -compact metric space.

(A1)’ $S$ is a compact metric space.

(A2) $q\in C(S\times S;(0,\infty))$ .

We remark that ${\cal P}(S)$ is endowed with the weak topology and $C(S\times S)$ is endowed with the topology induced by the uniform convergence on every compact subset of $S\times S$.

Under (A1), let $\{\varphi_{m}\}_{m\geq 1}$ be a nondecreasing sequence of functions in $C_{0}(S;[0,1])$ such that the following holds:

[TABLE]

(see (1.2)). If $S=\mathbb{R}^{d}$ , then $K_{m}:=B_{m}$ and we assume that $\varphi_{m}\in C_{0}(B_{m+1};[0,1])$ . For $i\neq j$ , $i,j=1,2$ ,

[TABLE]

provided the right hand side is well defined (see (1.7) and also (1.4)).

[TABLE]

The following is the continuity result on $\nu_{1}\times\nu_{2}$ , $\mu$ and $u_{i|m}$ .

Theorem 2.1

Suppose that (A1) and (A2) hold and that $q_{n}\in C(S\times S;(0,\infty))$ , $\mu_{i},\mu_{i,n}\in{\cal P}(S)$ , $n\geq 1$ , $i=1,2$ and

[TABLE]

Then for any $f\in C_{0}(S\times S)$ ,

[TABLE]

In particular,

[TABLE]

For any $\{x_{i,n}\}_{n\geq 1}\subset S$ which converges, as $n\to\infty$ , to $x_{i}\in S$ , $i=1,2$ and for sufficiently large $m\geq 1$ ,

[TABLE]
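A small numerical experiment can illustrate the kind of continuity stated in Theorem 2.1. In the sketch below the marginals are finitely supported and their atoms are shifted by $1/n$, so that they converge weakly but not in total variation, while the kernel is multiplied by $1+1/n$. The integral of a fixed continuous function against $\nu_{1}\times\nu_{2}$ is then compared with its value at the limit. This is only an illustration under these assumptions (finite supports, a Gaussian-type kernel, alternating rescaling as the solver); it is not the proof technique of the paper, and all names are hypothetical.

```python
import math

def solve_system(q, mu1, mu2, iters=500):
    """Alternating rescaling for the discrete Schroedinger system (illustrative only)."""
    cols = list(zip(*q))
    nu1, nu2 = [1.0] * len(mu1), [1.0] * len(mu2)
    for _ in range(iters):
        nu1 = [a / sum(v * b for v, b in zip(row, nu2)) for a, row in zip(mu1, q)]
        nu2 = [b / sum(v * a for v, a in zip(col, nu1)) for b, col in zip(mu2, cols)]
    return nu1, nu2

def product_integral(f, kernel, xs, w1, ys, w2):
    """Integral of f against nu1 x nu2 when mu1 has atoms xs (weights w1), mu2 atoms ys (weights w2)."""
    q = [[kernel(x, y) for y in ys] for x in xs]
    nu1, nu2 = solve_system(q, w1, w2)
    return sum(f(x, y) * a * b for x, a in zip(xs, nu1) for y, b in zip(ys, nu2))

def f(x, y):
    return math.cos(x + 2.0 * y)

def kernel(x, y):
    return math.exp(-(x - y) ** 2)

xs, w1 = [0.0, 0.5, 1.0], [0.2, 0.3, 0.5]
ys, w2 = [0.2, 0.9], [0.6, 0.4]
limit = product_integral(f, kernel, xs, w1, ys, w2)

errors = []
for n in (10, 100, 10_000):
    kernel_n = lambda x, y, n=n: (1.0 + 1.0 / n) * kernel(x, y)  # q_n -> q locally uniformly
    xs_n = [x + 1.0 / n for x in xs]                              # atoms move towards xs
    ys_n = [y - 1.0 / n for y in ys]                              # atoms move towards ys
    errors.append(abs(product_integral(f, kernel_n, xs_n, w1, ys_n, w2) - limit))
```

The list `errors` records the discrepancy for increasing $n$; in this toy setting it is expected to shrink as the perturbation vanishes.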

Since $(\mu_{1},\mu_{2})\mapsto m(\mu_{1},\mu_{2})$ is measurable, Theorem 2.1 implies the following.

Corollary 2.1

Suppose that (A1) and (A2) hold. Then the following are Borel measurable: for $i=1,2$ ,

[TABLE]

If $S$ is compact, then $\nu_{1}(S)=\nu_{2}(S)$ (see (1.2)). This implies, from Theorem 2.1, the following of which the proof is omitted.

Corollary 2.2

Suppose that (A1)’ and the assumption of Theorem 2.1 except (A1) hold. Then the following holds: for $i=1,2$ ,

[TABLE]

and for any $\{x_{n}\}_{n\geq 1}\subset S$ which converges, as $n\to\infty$ , to $x\in S$ ,

[TABLE]

A uniformly bounded sequence of convex functions on a convex neighborhood $N_{A}$ of a convex subset $A$ of $\mathbb{R}^{d}$ is compact in $C(A)$ , provided $dist(A,N_{A}^{c})$ is positive (see e.g., [3], section 3.3). We describe an additional assumption and state a stronger result than above, provided $S\subset\mathbb{R}^{d}$ .

(A3. $r$ ) There exists $C_{r}>0$ for which $x\mapsto C_{r}|x|^{2}+\log q(x,y)$ and $y\mapsto C_{r}|y|^{2}+\log q(x,y)$ are convex on $B_{r}$ for any $y\in B_{r}$ and any $x\in B_{r}$ , respectively.

Remark 2.1

If $\log q(x,y)$ has bounded second order partial derivatives on $B_{r}$ , then (A3. $r$ ) holds.

[TABLE]

The following is a stronger convergence result than Corollary 2.2.

Corollary 2.3

Let $r>0$ . Suppose that (A3. $r$ ) and the assumptions of Corollary 2.2 with $S=B_{r}$ hold. Then for any $r^{\prime}<r$ ,

[TABLE]

As an application of our regularity result, we show that there exists a convex function of which the moment measure is a given probability measure.

Theorem 2.2

For any $P_{1}(dx)=p_{1}(x)dx\in{\cal P}_{2}(\mathbb{R}^{d})$ for which ${\cal S}(P_{1})$ is finite, there exists a minimizer of $\Psi_{\varepsilon,r}(P_{1})$ . For any minimizer $P_{0,r,\varepsilon}(dx)=p_{0,r,\varepsilon}(x)dx$ of $\Psi_{\varepsilon,r}(P_{1})$ ,

[TABLE]

where $C_{\varepsilon}$ is a normalizing constant. Besides, there exists a subsequence of $p_{0,r,\varepsilon}(x)dx$ which weakly converges, as $\varepsilon\to 0$ , to a probability measure $p_{0}(x)dx$ such that $p_{1}(x)dx$ is a moment measure of $-\log p_{0}$ . Suppose, in addition, that $P_{1}$ is compactly supported. Then there exists a subsequence of $p_{0,r,\varepsilon}(x)$ which uniformly converges, as $\varepsilon\to 0$ , to a probability density function $p_{0}(x)$ such that $p_{1}(x)dx$ is a moment measure of $-\log p_{0}$ .

Remark 2.2

If $P_{0},P_{1}(dx)=p_{1}(x)dx\in{\cal P}_{2}(\mathbb{R}^{d})$ and ${\cal S}(P_{1})$ is finite, then $V_{\varepsilon}(P_{0},P_{1})$ is finite. Indeed, from the last equality of (1.20),

[TABLE]

since the relative entropy is nonnegative.

3 Lemmas

In this section we state and prove lemmas. When it is not confusing, we omit the dependence of $u_{i},\nu_{i}$ on $q,\mu_{1},\mu_{2}$.

3.1 Lemmas for the proof of Theorem 2.1 and Corollary 2.3

The following lemma will be used in the proof of Theorem 2.1.

Lemma 3.1

Suppose that (A1) and (A2) hold. Then for any $\mu_{1},\mu_{2}\in{\cal P}(S)$ , $\mu$ defined by (1.3),

[TABLE]

(Proof) The proof is done by the following (see (1.3)):

[TABLE]

For $r>0$ and $q\in C(B_{r}\times B_{r};(0,\infty))$ ,

[TABLE]

Lemmas 3.2 and 3.3 will be used to prove Corollary 2.3.

Lemma 3.2

([5], p. 194) Let $r>0$. Suppose that (A2) with $S=B_{r}$ holds. Then, for any $\mu_{1},\mu_{2}\in{\cal P}(B_{r})$, the following holds (see (1.4) for notation):

[TABLE]

By the method of proving the convexity of a log moment generating function, we obtain the following.

Lemma 3.3

Let $C$ and $\nu\in{\cal M}(\mathbb{R}^{d})$ be a convex subset of $\mathbb{R}^{d}$ and a nonnegative Radon measure, respectively. Suppose that $C\ni x\mapsto f(x,y)$ is convex, $\nu(dy)$-a.e. Then $C\ni x\mapsto\log\int_{\mathbb{R}^{d}}\exp(f(x,y))\nu(dy)$ is convex.

(Proof) For $x,y\in C$ and $\lambda\in(0,1)$ , by Hölder’s inequality,

[TABLE]
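The statement of Lemma 3.3 is easy to probe numerically when $\nu$ is a finite sum of point masses. The sketch below evaluates $x\mapsto\log\sum_{k}\exp(f(x,y_{k}))\nu_{k}$ and measures the gap in the convexity inequality, which should be nonnegative. The choice $f(x,y)=|x-y|$, the atoms, the weights and the function names are all assumptions made for the example.

```python
import math

def log_integral_exp(f, x, ys, weights):
    """log of sum_k exp(f(x, y_k)) * weights[k], a discrete analogue of the lemma."""
    vals = [f(x, y) for y in ys]
    top = max(vals)                      # subtract the maximum for numerical stability
    return top + math.log(sum(w * math.exp(v - top) for v, w in zip(vals, weights)))

def convexity_gap(g, a, b, lam):
    """lam * g(a) + (1 - lam) * g(b) - g(lam * a + (1 - lam) * b); nonnegative if g is convex."""
    return lam * g(a) + (1.0 - lam) * g(b) - g(lam * a + (1.0 - lam) * b)

ys = [-1.0, 0.3, 2.0]
weights = [0.5, 1.5, 0.25]               # a nonnegative finite measure, not normalized

def g(x):
    return log_integral_exp(lambda a, b: abs(a - b), x, ys, weights)

grid = [-3.0 + 0.5 * k for k in range(13)]
smallest_gap = min(convexity_gap(g, a, b, lam)
                   for a in grid for b in grid for lam in (0.25, 0.5, 0.9))
```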

3.2 Lemmas for the proof of Theorem 2.2

In this subsection, we prove lemmas for the proof of Theorem 2.2. Lemma 3.3 will be also used in the proof of Theorem 2.2.

[TABLE]

The lower semicontinuity of a relative entropy and the continuity result in Theorem 2.1 imply the following.

Lemma 3.4

Suppose that Theorem 2.1 holds. Then for any $r,\varepsilon>0$ , the following is lower-semicontinuous on $B_{{\cal P}_{2}(\mathbb{R}^{d}),r}\times B_{{\cal P}_{2}(\mathbb{R}^{d}),r}$ (see (1.4), (1.7) and (1.16) for notation):

[TABLE]

(Proof) From (1.20),

[TABLE]

(see (1.18) and (2.2) for notation). Since $(m,n)\mapsto H(m(dxdy)|n(dxdy))$ is lower semicontinuous (see [17], Lemma 1.4.3), the proof is over from Theorem 2.1. $\Box$

The following lemma can be proved by the lower semicontinuity of a relative entropy.

Lemma 3.5

For any $r>0$ , ${\cal S}$ is lower-semicontinuous on $B_{{\cal P}_{2}(\mathbb{R}^{d}),r}$ in the weak topology.

(Proof)

[TABLE]

The proof is done by the following:

[TABLE]

(see e.g. [17], Lemma 1.4.3). $\Box$

Lemma 3.6

Suppose that (A1) and (A2) hold. Then for any $\mu_{1},\mu_{2}\in{\cal P}(S)$ , $\mu$ defined by (1.3) and sufficiently large $m\geq 1$ , $m\mapsto u_{i|m}$ is nondecreasing, $i=1,2$ and the following holds:

[TABLE]

(Proof) The proof is done by the following (see (1.3) and (2.1)):

[TABLE]

provided the right hand side is positive. $\Box$

For $i=1,2,m\geq 1,\varepsilon>0,x\in\mathbb{R}^{d},$

[TABLE]

In the following lemma, the boundedness of the set $B_{r}$ plays a crucial role.

Lemma 3.7

For any $\varepsilon,r>0$ and $P_{1}(dx)=p_{1}(x)dx\in{\cal P}(\mathbb{R}^{d})$ ,

[TABLE]

Suppose that $P_{0,r,\varepsilon}$ in (2.11) is a minimizer of $\Psi_{\varepsilon,r}(P_{1})$ . Then for $y_{0}:=\int_{\mathbb{R}^{d}}xP_{1}(dx)$ ,

[TABLE]

In particular, for any sequence $\{\varepsilon_{n}\}_{n\geq 1}$ which converges to $0$ as $n\to\infty$, the set $\{x\in B_{r}|\liminf_{n\to\infty}(\overline{u}_{1,\varepsilon_{n}}(x)+\overline{u}_{2,\varepsilon_{n}}(y_{0}))<\infty\}$ has a positive Lebesgue measure, provided $P_{1}\in{\cal P}_{2}(\mathbb{R}^{d})$ and ${\cal S}(P_{1})$ is finite.

(Proof) Let $p_{uni,r}$ denote the probability density function of the uniform distribution on $B_{r}$ . Then the following implies (3.11):

[TABLE]

We prove (3.12). We only have to consider the case where ${\cal S}(P_{1})$ is finite and $P_{1}\in{\cal P}_{2}(\mathbb{R}^{d})$ . From (1.20) and (2.11), by Jensen’s inequality,

[TABLE]

Indeed, one can show that $\overline{u}_{2,\varepsilon}$ is convex from Lemma 3.3 and that $\overline{u}_{2,\varepsilon}$ is finite and continuous on $\mathbb{R}^{d}$ since $\nu_{1}(dx;g_{\varepsilon}(1),P_{0,r,\varepsilon},P_{1})$ is a finite measure on $B_{r}$ . The last part of this lemma can be shown by Fatou’s lemma from (3.8) in Lemma 3.6 and from the following: for $m>r$ ,

[TABLE]

since $\nu_{1}(dx;g_{\varepsilon}(1),P_{0,r,\varepsilon},P_{1})$ is supported on $B_{r}$ . $\Box$

For a convex function $f:\mathbb{R}^{d}\longrightarrow\mathbb{R}\cup\{\infty\}$, the $0$-sublevel set $f^{-1}((-\infty,0])$ is convex. Roughly speaking, the following lemma can be proved from the fact that a uniformly bounded sequence of convex functions defined on the same open set is compact in the sup norm on any compact subset of the open set (see section 3.3 in [3]).

Lemma 3.8

(i) For a convex set $C\subset\mathbb{R}^{d}$ , $dist(x,C)$ is a convex function. (ii) For a bounded sequence of convex sets $\{C_{n}\subset\mathbb{R}^{d}\}_{n\geq 1}$ , there exists a closed convex set $C_{\infty}$ and a subsequence $\{C_{n_{k}}\}_{k\geq 1}$ of $\{C_{n}\}_{n\geq 1}$ such that $\{dist(x,C_{n_{k}})\}_{k\geq 1}$ converges, as $k\to\infty$ , to $dist(x,C_{\infty})$ uniformly on every compact subset of $\mathbb{R}^{d}$ . (iii) For any $\gamma>0$ , the following holds: for sufficiently large $k\geq 1$ ,

[TABLE]

where $U_{\gamma}(y):=\{x\in\mathbb{R}^{d}:|x-y|<\gamma\}$ .

(Proof) (i) For $x_{1},x_{2}\in\mathbb{R}^{d}$ , $\lambda\in(0,1)$ , $y_{1},y_{2}\in C$ , since $\lambda y_{1}+(1-\lambda)y_{2}\in C$ ,

[TABLE]

Taking the infimum over all $y_{1},y_{2}\in C$ , the proof is done.
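Part (i) can be checked numerically for a concrete convex set. The sketch below takes $C=[-1,1]^{2}\subset\mathbb{R}^{2}$, for which the distance has a closed form, and counts violations of the convexity inequality over random pairs of points. The choice of $C$, the sampling range and the function names are assumptions of the example.

```python
import math
import random

def dist_to_box(x, half_width=1.0):
    """Euclidean distance from x in R^2 to the square [-half_width, half_width]^2."""
    return math.hypot(*(max(abs(c) - half_width, 0.0) for c in x))

def convexity_violations(trials=1000, seed=1, tol=1e-12):
    """Count random triples (a, b, lam) violating the convexity inequality for dist_to_box."""
    rng = random.Random(seed)
    bad = 0
    for _ in range(trials):
        a = (rng.uniform(-4.0, 4.0), rng.uniform(-4.0, 4.0))
        b = (rng.uniform(-4.0, 4.0), rng.uniform(-4.0, 4.0))
        lam = rng.random()
        mid = tuple(lam * p + (1.0 - lam) * q for p, q in zip(a, b))
        if dist_to_box(mid) > lam * dist_to_box(a) + (1.0 - lam) * dist_to_box(b) + tol:
            bad += 1
    return bad

if __name__ == "__main__":
    print(convexity_violations())
```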

(ii) Since $\{C_{n}\}_{n\geq 1}$ is bounded, $\{dist(x,C_{n})\}_{n\geq 1}$ is also locally bounded, which implies that there exists a convex function $h(x)$ and a subsequence $\{dist(x,C_{n_{k}})\}_{k\geq 1}$ such that

[TABLE]

uniformly on every compact subset of $\mathbb{R}^{d}$ (see, e.g., [3], section 3.3).

[TABLE]

Then it is easy to see that the set $C_{\infty}$ is a closed convex set and $h(x)=dist(x,C_{\infty})$ .

(iii) We only have to consider the case where $U_{-\gamma}(C_{\infty})\neq\emptyset$ . From (ii), for sufficiently large $k\geq 1$ ,

[TABLE]

where

[TABLE]

For $x\in U_{-\gamma}(C_{\infty})$ , if $x\notin C_{n_{k}}$ , then the following which contradicts (3.16) holds: for $\tilde{\gamma}<\gamma$ ,

[TABLE]

Indeed, since $C_{n_{k}}$ is convex, for $x\notin C_{n_{k}}$ , there exists $p\in\mathbb{R}^{d}$ such that

[TABLE]

4 Proof of main results

In this section we prove our main results.

(Proof of Theorem 2.1) We first prove (2.5). For the sake of simplicity,

[TABLE]

Since $\{\mu_{1,n}(dx)=\mu_{n}(dx\times S),\mu_{2,n}(dy)=\mu_{n}(S\times dy)\}_{n\geq 1}$ is convergent, $\{\mu_{n}\}_{n\geq 1}$ is tight. Indeed, for any Borel sets $A,B\subset S$,

[TABLE]

and a convergent sequence of probability measures on a complete separable metric space is tight by Prohorov’s Theorem (see, e.g., [6]). Here notice that a $\sigma$ -compact metric space is separable. By Prohorov’s theorem, take a weakly convergent subsequence $\{\mu_{n_{k}}\}_{k\geq 1}$ and denote the limit by $\mu$ . Then it is easy to see that the following holds:

[TABLE]

From (A2) and (2.3)-(2.4), the following holds: for any $f\in C_{0}(S\times S)$ ,

[TABLE]

Indeed,

[TABLE]

The rest of the proof of (2.5) is divided into the following (4.3)-(4.4) which will be proved later.

There exists a subsequence $\{\overline{n}_{k}\}\subset\{n_{k}\}$ and finite measures $\overline{\nu}_{1,m}$ , $\overline{\nu}_{2,m}\in{\cal M}(supp(\varphi_{m}))$ such that for sufficiently large $m\geq 1$ and any $f\in C_{0}(S\times S)$ ,

[TABLE]

From (4.3), for sufficiently large $m\geq 1$ and any Borel sets $A_{1},A_{2}\subset S$ ,

[TABLE]

(4.4) implies that $q(x,y)^{-1}\mu(dxdy)$ is a product measure which satisfies (1.1). (4.2) and the uniqueness of the solution to (1.1) imply that (2.5) is true.

We prove (4.3)-(4.4) to complete the proof of (2.5). (4.3) can be proved by the diagonal method, since $\{\mu_{n}\}_{n\geq 1}$ is tight and since for sufficiently large $m\geq 1$,

[TABLE]

has a convergent subsequence from (3.1) in Lemma 3.1 by Prohorov’s Theorem and since any weak limit is a product measure. We prove (4.4). From (4.2) and (4.3), for sufficiently large $\tilde{m}\geq 1$ ,

[TABLE]

From (4.6), for $\tilde{m}\geq m$ , setting $A_{i}=K_{m}$ ,

[TABLE]

Substitute (4.7) into (4.6) and let $\tilde{m}\to\infty$. Then we obtain (4.4). (2.7) can be shown from (2.5) by the following: from (2.1),

[TABLE]

provided the right hand side is positive. $\Box$

For a compact set $K\subset S$ , ${\cal P}(S)\ni\nu\mapsto\nu(K)$ is upper semicontinuous in the weak topology and is hence measurable. Corollary 2.1 can be proved in the same way as in [33] and we omit the proof.

As we mentioned in section 2, we omit the proof of Corollary 2.2. Corollary 2.2 and Lemmas 3.2 and 3.3 immediately imply Corollary 2.3 (see [3], section 3.3) and we omit the proof. Indeed, if any subsequence of a sequence of pointwise convergent continuous functions has a uniformly convergent subsequence, then it is uniformly convergent.

Before we prove Theorem 2.2, we briefly describe the idea of the proof. Theorem 2.1 and Lemmas 3.4 - 3.5 imply the lower semicontinuity of the functional that we minimize in $\Psi_{\varepsilon,r}(P_{1})$ . (3.11) in Lemma 3.7 implies the finiteness of $\Psi_{\varepsilon,r}(P_{1})$ . In particular, the existence of a minimizer $p_{0,r,\varepsilon}$ of $\Psi_{\varepsilon,r}(P_{1})$ is obtained. (2.11) can be proved by the Duality Theorem (1.14) for $V_{\varepsilon}(P_{0,r,\varepsilon},P_{1})$ and by the fact that the relative entropy of two probability measures is nonnegative and is equal to zero if and only if two probability measures are the same. The characterization of the limit $p_{0}$ of $p_{0,r,\varepsilon}$ , as $\varepsilon\to 0$ , can be inferred from the following. Roughly speaking, from [28],

[TABLE]

(see (1.9) and (1.11) for notation). Besides, there exists a convex function $u:\mathbb{R}^{d}\longrightarrow\mathbb{R}\cup\{\infty\}$ such that for the minimizer $X^{\varepsilon}$ of $V_{\varepsilon}(P_{0,r,\varepsilon},P_{1})$ , as $\varepsilon\to 0$ ,

[TABLE]

In particular,

[TABLE]

(Proof of Theorem 2.2) Since ${\cal P}(B_{r})$ is tight, by Prohorov’s Theorem (see, e.g., [6]), Lemmas 3.4-3.5 and (3.11) in Lemma 3.7 imply the existence of a minimizer $P_{0,r,\varepsilon}(dx)=p_{0,r,\varepsilon}(x)dx$ of $\Psi_{\varepsilon,r}(P_{1})$ . By (1.14),

[TABLE]

Let $f_{0,r,\varepsilon}$ denote $f_{o}$ in (1.19) with $P_{0}=P_{0,r,\varepsilon}$ . Then

[TABLE]

(see (1.19), (1.5) and Remark 2.2). Indeed, for $p(x)dx\in{\cal P}(B_{r})$ ,

[TABLE]

since

[TABLE]

and by Jensen’s inequality,

[TABLE]

(2.11) holds since $\varphi(0,x;f_{0,r,\varepsilon})-u_{1}(x;g_{\varepsilon}(1),P_{0,r,\varepsilon},P_{1})$ is a constant $C$ (see (1.19)) and since, for $p(x)dx\in{\cal P}(B_{r})$ ,

[TABLE]

Here

[TABLE]

and the equality holds if and only if

[TABLE]

We prove the second part of Theorem 2.2. For $i=1,2$ , $m\geq 1$ and $x\in\mathbb{R}^{d}$ ,

[TABLE]

(see (2.1) for notation). Since ${\cal P}(B_{r})$ is compact, $\{P_{0,r,\varepsilon}\}_{\varepsilon>0}$ and $\{\mu_{\varepsilon}\}_{\varepsilon>0}$ have weakly convergent subsequences by Prohorov’s theorem in the same way as in the proof of Theorem 2.1 (see [6]). Let $P_{0}$ and $\mu$ denote the weak limit along the same subsequence, as $\varepsilon\to 0$, of $P_{0,r,\varepsilon}$ and $\mu_{\varepsilon}$, respectively. For sufficiently large $m\geq 1$, by the diagonal method, $\overline{u}_{1|m,\varepsilon}(x)+\overline{u}_{2|m,\varepsilon}(y)$ has a subsequence which is uniformly convergent, as $\varepsilon\to 0$, on every compact subset of $\mathbb{R}^{d}\times\mathbb{R}^{d}$ (see (3.10) for notation). Indeed, for sufficiently large $m\geq 1$ and small $\varepsilon>0$, $\overline{u}_{i|m,\varepsilon}$, $i=1,2$ are convex from Lemma 3.3, and $\overline{u}_{1|m,\varepsilon}(x)+\overline{u}_{2|m,\varepsilon}(y)$ is uniformly bounded on every compact subset of $\mathbb{R}^{d}\times\mathbb{R}^{d}$, from (3.8) in Lemma 3.6:

[TABLE]
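The passage from local uniform boundedness of convex functions to a locally uniformly convergent subsequence rests on a standard Lipschitz estimate, sketched here for convenience. The symbols $f$, $M$, $\rho$ and $z_{0}$ are generic and are introduced only for this remark; they are not the paper's notation.

```latex
% Assumption (generic): f is convex on the open ball B(z_0, 2\rho)
% in \mathbb{R}^d \times \mathbb{R}^d and |f| \le M on that ball.
\[
  |f(z) - f(z')| \;\le\; \frac{2M}{\rho}\,|z - z'| ,
  \qquad z, z' \in B(z_0, \rho).
\]
% Sketch of the reason: for z \ne z', put w := z' + \rho (z' - z)/|z' - z| \in B(z_0, 2\rho).
% Then z' = (1 - t) z + t w with t = |z' - z| / (\rho + |z' - z|), and convexity gives
% f(z') - f(z) \le t\,(f(w) - f(z)) \le 2M |z' - z| / \rho ; exchange z and z'.
%
% Consequence: a family of convex functions that is uniformly bounded on every
% compact set is equi-Lipschitz on every compact set, so the Arzel\`a--Ascoli
% theorem and a diagonal argument over an exhausting sequence of compact sets
% yield a subsequence converging uniformly on every compact subset.
```

The estimate is applied to the sum $\overline{u}_{1|m,\varepsilon}(x)+\overline{u}_{2|m,\varepsilon}(y)$, which is jointly convex in $(x,y)$ when each summand is convex.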

Let $\{\varepsilon_{n}\}_{n\geq 1}$ denote a sequence which converges to $0$ as $n\to\infty$ and along which the above sequences are all convergent.

[TABLE]

There exists the limit

[TABLE]

Indeed, $m\mapsto u_{m}$ is nondecreasing since

[TABLE]

By the last statement of Lemma 3.7, there exists $x_{0}\in B_{r}$ such that $u(x_{0},y_{0})<\infty$, since

[TABLE]

To complete the proof of Theorem 2.2, we show that the following holds:

[TABLE]

where $D$ is a convex subset of $B_{r}$ and $C$ is a normalizing constant. Notice that $u(x,y)$ is convex and is differentiable a.e. on its domain.
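The almost everywhere differentiability used here is the usual consequence of Rademacher's theorem. A brief sketch follows, in which $f$ and $\mathrm{dom}\,f$ are generic notation assumed for this remark only:

```latex
% Generic setting: f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\} convex,
% \mathrm{dom}\, f := \{ z : f(z) < \infty \}.
\[
  f \text{ convex}
  \;\Longrightarrow\;
  f \text{ locally Lipschitz on } \mathrm{Int}(\mathrm{dom}\, f)
  \;\Longrightarrow\;
  f \text{ differentiable Lebesgue-a.e. on } \mathrm{Int}(\mathrm{dom}\, f).
\]
% The second implication is Rademacher's theorem. Since the boundary of the convex
% set dom f is Lebesgue-null, differentiability holds a.e. on dom f itself.
```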

$\underline{\hbox{Proof of (4.16)}}$ The following implies that (4.16) holds: for sufficiently large $m\geq r$,

[TABLE]

Indeed, from (4.14) and (4.19), for sufficiently large $m>r$ ,

[TABLE]

$(x,y)\in\mathbb{R}^{d}\times Int(supp(\varphi_{m}))(\subset\mathbb{R}^{d}\times Int(supp(\varphi_{m^{\prime}})))$, $\mu$-a.s. To prove (4.19), we first prove that the following holds: for sufficiently large $m\geq r$,

[TABLE]

For $i\neq j$ , $i,j=1,2$ ,

[TABLE]

Then for $\delta>0$ and $(x,y)\in supp(P_{0})\times(supp(P_{1})\cap Int(supp(\varphi_{m})))$ ,

[TABLE]

(see (4.15)). Indeed, for $m>r$, $\mu_{1|m,\varepsilon}(dx)$ is supported on $B_{r}$ since $P_{0,r,\varepsilon}\in{\cal P}(B_{r})$ and

[TABLE]

(4.22) implies (4.20) since

[TABLE]

Next we prove that the following holds: for sufficiently large $m\geq 1$ ,

[TABLE]

Then $A_{m,\delta,k}$ is open since $u_{m}$ is convex and finite (see (4.12)-(4.13)), and hence continuous. The following implies that (4.23) is true: from (4.13), for sufficiently large $m\geq 1$,

[TABLE]

$\underline{\hbox{Proof of (4.17)}}$ For $(x,y)\in supp(P_{0})\times supp(P_{1})$,

[TABLE]

Indeed, from (4.14) and (4.20), for sufficiently large $m>r$ such that $y\in Int(supp(\varphi_{m}))$,

[TABLE]

(4.14) and the following imply (4.25): from (4.13),

[TABLE]

$u(x,y_{0})$ and $u(x_{0},y)$ are finite for $(x,y)\in A$ from (4.12), since from (4.14) and the equality in (4.25),

[TABLE]

For a set $B\subset\mathbb{R}^{d}$ and a function $f:B\longrightarrow\mathbb{R}$ ,

[TABLE]

Then, from (4.25), for $x\in supp(P_{0})$ ,

[TABLE]

Here $(u|_{supp(P_{1})})(x_{0},y)$ denotes the restriction of $u(x_{0},y)$ to $supp(P_{1})$, and the equality holds if $(x,y_{x})\in A$ for some $y_{x}\in supp(P_{1})$, in which case $x\in\partial_{y}con\hbox{ }(u|_{supp(P_{1})})(x_{0},y_{x})$, where for a function $f:\mathbb{R}^{d}\longrightarrow\mathbb{R}\cup\{\infty\}$,

[TABLE]

In particular, $x\in\partial_{y}con\hbox{ }(u|_{supp(P_{1})})(x_{0},y)$, $\mu$-a.s., from (4.16). Moreover, $x=D_{y}u(x_{0},y)$, $\mu$-a.s., since

[TABLE]

and since $P_{1}(dx)$ has a probability density function. In the same way, one can show that $y=D_{x}u(x,y_{0})$, $\mu$-a.s.

$\underline{\hbox{Proof of (4.18)}}$

[TABLE]

(see (3.10) for notation). Then, from Lemma 3.7,

[TABLE]

Indeed,

[TABLE]

For $\delta>0$ ,

[TABLE]

Then, from Lemma 3.8, there exist a convergent subsequence $\{\psi_{\delta,R,\varepsilon_{n_{k}}}(x)\}_{k\geq 1}$ in $C(B_{r})$ and a closed convex set $D_{R,0}\subset B_{r}$ such that

[TABLE]

Then we prove that the following holds: for a closed set $B\subset B_{r}$ ,

[TABLE]

(4.29) follows from (4.30)-(4.31) below, which will be proved later.

[TABLE]

Notice that, from (4.13)-(4.14) and Lemma 3.7, the following holds:

[TABLE]

We prove (4.30). For sufficiently large $m\geq 1$ ,

[TABLE]

(see (3.10) for notation). Let $\psi_{\delta}$ denote the function $\psi_{\delta,R,0}$ with $D_{R,0}$ replaced by $D$. Then for $m>r$, from (4.13) and (4.28),

[TABLE]

since $R\mapsto D_{R,\varepsilon}$ is nondecreasing.

We prove (4.31).

[TABLE]

Then

[TABLE]

From (4.32), we only have to prove that the following holds:

[TABLE]

since $\psi_{\delta,R,\varepsilon}^{-1}((0,1])=U_{\delta}(D_{R,\varepsilon})$. For any $\gamma>0$, sufficiently large $m\geq m_{0}\geq 1$ and $k$, from Lemma 3.8 and (4.13),

[TABLE]

Here, from (3.10) and (4.21) (see also (1.1)),

[TABLE]

(4.12) and (4.13) complete the proof of (4.34).

If $P_{1}$ is compactly supported, then $u_{i|m,\varepsilon}=u_{i|m^{\prime},\varepsilon}$ and $u(x,y)=u_{m^{\prime}}(x,y)$ for $m^{\prime}\geq m$, provided $B_{r}\cup supp(P_{1})\subset B_{m}$. Hence (4.11)-(4.13) imply that the last statement of Theorem 2.2 holds. $\Box$
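As an illustrative aside that is not part of the proof, the finite-state analogue of Schrödinger's functional equation, $\mu_{1}(i)=\nu_{1}(i)\sum_{j}q(i,j)\nu_{2}(j)$ and $\mu_{2}(j)=\nu_{2}(j)\sum_{i}q(i,j)\nu_{1}(i)$, can be explored numerically by alternately enforcing the two marginal equations. The sketch below assumes a three-point state space, a Gaussian kernel with noise parameter `eps`, and a fixed iteration count. The function names `heat_kernel`, `solve_schrodinger_system` and `coupling` are hypothetical and do not appear in the paper.

```python
import math


def heat_kernel(xs, ys, eps):
    """Assumed kernel q(x, y) = exp(-|x - y|^2 / (2 * eps)) on two 1-D grids."""
    return [[math.exp(-((x - y) ** 2) / (2.0 * eps)) for y in ys] for x in xs]


def solve_schrodinger_system(q, mu1, mu2, n_iter=500):
    """Alternately enforce the two marginal equations (iterative proportional fitting).

    Returns positive weights nu1, nu2 such that, up to the iteration error,
        mu1[i] = nu1[i] * sum_j q[i][j] * nu2[j],
        mu2[j] = nu2[j] * sum_i q[i][j] * nu1[i].
    """
    n, m = len(mu1), len(mu2)
    nu1 = [1.0] * n
    nu2 = [1.0] * m
    for _ in range(n_iter):
        # Enforce the first marginal equation for the current nu2.
        nu1 = [mu1[i] / sum(q[i][j] * nu2[j] for j in range(m)) for i in range(n)]
        # Enforce the second marginal equation for the updated nu1.
        nu2 = [mu2[j] / sum(q[i][j] * nu1[i] for i in range(n)) for j in range(m)]
    return nu1, nu2


def coupling(q, nu1, nu2):
    """Joint law mu(i, j) = nu1[i] * q[i][j] * nu2[j]."""
    return [[nu1[i] * q[i][j] * nu2[j] for j in range(len(nu2))]
            for i in range(len(nu1))]


# Toy data (assumed for illustration only).
xs = [0.0, 1.0, 2.0]
ys = [0.5, 1.5, 2.5]
mu1 = [0.2, 0.3, 0.5]
mu2 = [0.2, 0.3, 0.5]

q = heat_kernel(xs, ys, eps=0.5)
nu1, nu2 = solve_schrodinger_system(q, mu1, mu2)
mu = coupling(q, nu1, nu2)

# Marginals of the resulting coupling, to be compared with mu1 and mu2.
row_sums = [sum(row) for row in mu]
col_sums = [sum(mu[i][j] for i in range(len(xs))) for j in range(len(ys))]
```

In this toy setting, decreasing `eps` is expected to concentrate the coupling near a deterministic map, loosely mirroring the zero noise limit studied above. The fixed iteration count may then be insufficient, and no convergence rate is claimed.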

Acknowledgement: This work was supported by JSPS KAKENHI Grant Numbers JP26400136 and JP16H03948. We would also like to thank an anonymous referee for useful suggestions.

Bibliography


[1] S. Adams, N. Dirr, M. A. Peletier, J. Zimmer, From a large-deviations principle to the Wasserstein gradient flow: a new micro-macro passage, Commun. in Math. Phys. 307, (2011), 791–815.
[2] J. Backhoff, G. Conforti, I. Gentil, C. Léonard, The mean field Schrödinger problem: ergodic behavior, entropy estimates and functional inequalities, arXiv:1905.02393v1.
[3] I. J. Bakelman, Convex Analysis and Nonlinear Geometric Elliptic Equations, Springer-Verlag, 1994.
[4] S. Bernstein, Sur les liaisons entre les grandeurs aléatoires, Verh. des intern. Mathematikerkongr. Zurich 1932, Band 1, (1932), 288–309.
[5] A. Beurling, An Automorphism of Product Measures, Ann. of Math. 72, (1960), 189–200.
[6] P. Billingsley, Convergence of Probability Measures, Wiley-Interscience, 1999.
[7] Y. Brenier, Décomposition polaire et réarrangement monotone des champs de vecteurs, C. R. Acad. Sci. Paris Série I, 305, no. 19, (1987), 805–808.
[8] Y. Brenier, Polar factorization and monotone rearrangement of vector-valued functions, Comm. Pure Appl. Math., 44, no. 4, (1991), 375–417.