An optimal transport problem with backward martingale constraints motivated by insider trading

Dmitry Kramkov; Yan Xu

arXiv:1906.03309·math.PR·September 13, 2022

TL;DR

This paper investigates a single-period optimal transport problem with a backward martingale constraint and a covariance-type cost, providing conditions for optimality, uniqueness, and representation, inspired by insider trading models.

Contribution

It introduces a novel optimal transport framework with backward martingale constraints and characterizes optimal plans via maximal monotone sets, extending classical models.

Findings

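01. The $x$-marginal of an optimal plan is supported on a maximal monotone set.

02. Sharp regularity conditions are obtained for the uniqueness of an optimal plan and for its representation by a map.

03. The problem is connected to insider trading models of Kyle type, in the variant of Rochet and Vila (1994).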

Abstract

We study a single-period optimal transport problem on $\mathbb{R}^{2}$ with a covariance-type cost function $c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2})$ and a backward martingale constraint. We show that a transport plan $\gamma$ is optimal if and only if there is a maximal monotone set $G$ that supports the $x$-marginal of $\gamma$ and such that $c(x,y)=\min_{z\in G}c(z,y)$ for every $(x,y)$ in the support of $\gamma$. We obtain sharp regularity conditions for the uniqueness of an optimal plan and for its representation in terms of a map. Our study is motivated by a variant of the classical Kyle model of insider trading from Rochet and Vila (1994).

Equations

$$\text{minimize}\quad\mathbb{E}\left(c(X,Y)\right)\quad\text{over}\quad X\in\mathcal{X}(Y),$$

$$\text{minimize}\quad\int c(x,y)\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu),$$

$$\text{maximize}\quad\int x_{1}x_{2}\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu),$$

$$\int c(x,y)\,d\gamma=\int y_{1}y_{2}\,d\nu-\int x_{1}x_{2}\,d\mu.$$

$$c(x,y)=\phi_{G}(y)\triangleq\inf_{z\in G}c(z,y),\quad(x,y)\in\operatorname{supp}\gamma.$$

$$H=\left\{z\in\mathbb{R}^{2}:\;c(y,z)=\phi_{G}(y)\right\}=\left\{z\in\mathbb{R}^{2}:\;z_{2}=y_{2}+\frac{\phi_{G}(y)}{z_{1}-y_{1}}\right\};$$

$$\text{maximize}\quad\int\phi_{G}\,d\nu\quad\text{over}\quad G\in\mathfrak{M},$$

$$\mu(f|g)=\left(\mu(f_{1}|g_{1},\dots,g_{n}),\dots,\mu(f_{m}|g_{1},\dots,g_{n})\right).$$

$$\Gamma(\nu)\triangleq\left\{\gamma\in\mathcal{P}_{2}(\mathbb{R}^{2}\times\mathbb{R}^{2}):\;\gamma(\mathbb{R}^{2},dy)=\nu(dy)\text{ and }\gamma(y|x)=x\right\}.$$

$$\text{minimize}\quad\int c(x,y)\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu)$$

$$c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2}),\quad x,y\in\mathbb{R}^{2}.$$

$$\text{maximize}\quad\int x_{1}x_{2}\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu).$$

$$\int c(x,y)\,d\gamma=\int y_{1}y_{2}\,d\nu-\int x_{1}x_{2}\,d\gamma,$$

$$c(r,s)=(r_{1}-s_{1})(r_{2}-s_{2})\geq 0,\quad r,s\in G.$$

$$\phi_{G}(y)\triangleq\inf_{x\in G}c(x,y)=\inf_{x\in G}(x_{1}-y_{1})(x_{2}-y_{2}),\quad y\in\mathbb{R}^{2}.$$

$$(1-t)c(x^{0},y^{0})+tc(x^{1},y^{1})\leq t(1-t)c(y^{0},y^{1}),\quad t\in[0,1].$$

$$c(x,y)\leq\phi_{G}(y),\quad(x,y)\in\operatorname{supp}\gamma.$$

$$c(x,y)=\phi_{G}(y)=\min_{z\in G}c(z,y),\quad(x,y)\in\operatorname{supp}\gamma.$$

$$H^{0}=\left\{z\in\mathbb{R}^{2}:\;c(z,y^{0})=c(x^{0},y^{0}),\;z_{1}>y_{1}^{0}\right\},$$

$$H^{1}=\left\{z\in\mathbb{R}^{2}:\;c(z,y^{1})=c(x^{1},y^{1}),\;z_{1}<y_{1}^{1}\right\},$$

$$W_{2}(\mu,\nu)\triangleq\inf_{\pi\in\Pi(\mu,\nu)}\int\left\lvert x-y\right\rvert^{2}d\pi.$$

$$\text{maximize}\quad\int x_{1}x_{2}\,d\pi\quad\text{over}\quad\pi\in\Pi(\mu_{1},\mu_{2}),$$

$$W_{2}(\mu_{1},\mu_{2})=\int\left\lvert x_{1}-x_{2}\right\rvert^{2}d\mu.$$

$$f(x_{1})=\inf\left\{x_{2}\in\mathbb{R}:\;(x_{1},x_{2})\in G\right\},\quad x_{1}\in P_{1},$$

$$\mu(B)=\mu_{1}\left\{t\in\mathbb{R}:\;(t,f(t))\in B\right\},\quad B\in\mathcal{B}(\mathbb{R}^{2}).$$

$$\text{minimize}\quad\int c(x,y)\,d\pi\quad\text{over}\quad\pi\in\Pi(\mu,\nu).$$

$$\int\phi\,d\nu=\int c(x,y)\,d\gamma,$$

$$\phi^{c}(x)\triangleq\inf_{y\in\mathbb{R}^{2}}\left(c(x,y)-\phi(y)\right),\quad x\in\mathbb{R}^{2},$$

$$\int\phi^{c}\,d\mu=0.$$

$$\int c(x,y)\,d\gamma=\int\phi\,d\nu+\int\phi^{c}\,d\mu\leq\int c(x,y)\,d\pi,\quad\pi\in\Pi(\mu,\nu),$$


Full text

An optimal transport problem with backward martingale constraints motivated by insider trading

Dmitry Kramkov, Carnegie Mellon University, Department of Mathematical Sciences, 5000 Forbes Avenue, Pittsburgh, PA, 15213-3890, USA.

Yan Xu, Carnegie Mellon University, Department of Mathematical Sciences, 5000 Forbes Avenue, Pittsburgh, PA, 15213-3890, USA.

Keywords: martingale optimal transport, Kyle equilibrium.

AMS Subject Classification (2010): 60G42, 91B24, 91B52.

1 Introduction

Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space and $Y=(Y_{1},Y_{2})$ be a $2$ -dimensional random variable with finite second moment: $Y\in\mathcal{L}^{2}(\Omega,\mathcal{F},\mathbb{P})$ . Our goal is to

$$\text{minimize}\quad\mathbb{E}\left(c(X,Y)\right)\quad\text{over}\quad X\in\mathcal{X}(Y),\tag{1}$$

for the cost function $c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2})$ , $x,y\in\mathbb{R}^{2}$ , and the domain $\mathcal{X}(Y)$ that consists of $Y$ -measurable random variables $X=(X_{1},X_{2})$ such that $(X,Y)$ is a martingale: $\mathbb{E}\left(\left.Y\right\lvert{X}\right)=X$ . A relaxation of the $Y$ -measurability constraint on $X$ leads to the optimal transport problem:

$$\text{minimize}\quad\int c(x,y)\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu),\tag{2}$$

where $\nu=\operatorname{{Law}}(Y)$ and $\Gamma(\nu)$ is the family of probability measures $\gamma=\gamma(dx,dy)$ on $\mathbb{R}^{2}\times\mathbb{R}^{2}$ that have $\nu$ as their $y$ -marginal: $\gamma(\mathbb{R}^{2},dy)=\nu(dy)$ , and make a martingale out of the canonical process: $\gamma(y|x)=x$ . In view of the martingale constraint, problem (2) admits an equivalent formulation:

$$\text{maximize}\quad\int x_{1}x_{2}\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu),$$

and thus, has a natural connection to the classical Fréchet-Hoeffding inequality and the Wasserstein $2$ -distance.

Problem (2) exhibits a backward structure in the sense that the initial marginal $\mu(dx)=\gamma(dx,\mathbb{R}^{2})$ is part of the solution. In this regard, it differs from the “standard” single-period martingale transport problem in Beiglböck and Juillet (2016), Beiglböck et al. (2017), Henry-Labordère and Touzi (2016), and Ghoussoub et al. (2019), among others, where both the initial and terminal marginals are fixed. We point out that for our cost function $c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2})$ , the standard problem is trivial, as every martingale measure $\gamma=\gamma(dx,dy)$ with given marginals $\mu=\mu(dx)$ and $\nu=\nu(dy)$ produces the same average cost:

$$\int c(x,y)\,d\gamma=\int y_{1}y_{2}\,d\nu-\int x_{1}x_{2}\,d\mu.$$

Our work is motivated by the classical Kyle (1985) equilibrium with insider from financial economics. More precisely, we consider the model from Rochet and Vila (1994), where the insider observes both the terminal value $V$ of the risky asset and the order flow $U$ of the noise traders; see Section 6 for details. Setting $Y=(U,V)$ we establish in Theorem 6.3 the equivalence between the existence of equilibrium and that of an optimal map $X$ for (1) such that $\gamma=\operatorname{{Law}}{(X,Y)}$ is an optimal plan for (2). Moreover, the components of $X=(R,S)$ are naturally identified as equilibrium’s total order $R$ and price $S$ . To the best of our knowledge, the connection between the Kyle equilibrium and a martingale optimal transport is new.

The main results of the paper are Theorems 2.2 and 4.6. In Theorem 2.2 we prove the existence of an optimal plan for (2) and characterize its support. We show that $\gamma\in\Gamma(\nu)$ is optimal if and only if there is a maximal monotone set $G$ in $\mathbb{R}^{2}$ that supports the $x$ -marginal of $\gamma$ and such that

$$c(x,y)=\phi_{G}(y)\triangleq\inf_{z\in G}c(z,y),\quad(x,y)\in\operatorname{supp}\gamma.\tag{3}$$

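Geometrically, the support of $\gamma$ has the hyperbolic tangent property: it connects each $y\not\in G$ to those points $x\in G$ that are touched by the hyperbola

$$H=\left\{z\in\mathbb{R}^{2}:\;c(y,z)=\phi_{G}(y)\right\}=\left\{z\in\mathbb{R}^{2}:\;z_{2}=y_{2}+\frac{\phi_{G}(y)}{z_{1}-y_{1}}\right\};$$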

see Figure 1. Surprisingly, as a consequence of (3), the optimal plan $\gamma$ possesses properties of solutions to classical unconstrained problems. By Corollary 2.3, the $x$ -marginal of $\gamma$ is a Fréchet-Hoeffding coupling between its first and second coordinates, while, by Corollary 2.5, $\gamma$ is a classical optimal coupling between its $x$ - and $y$ -marginals.

In Theorem 3.2 we show that the set $G$ from (3) is a solution of the dual problem:

$$\text{maximize}\quad\int\phi_{G}\,d\nu\quad\text{over}\quad G\in\mathfrak{M},\tag{4}$$

where $\mathfrak{M}$ is the family of all maximal monotone sets in $\mathbb{R}^{2}$ , and that primal and dual problems (2) and (4) have identical values. The dual problem appears in (Rochet and Vila, 1994, Eq. (2.3)), where $G$ stands for the graph of a pricing rule. When $\nu=\operatorname{{Law}}(Y)$ has a Gaussian or, more generally, elliptically contoured distribution, $G$ becomes a line with strictly positive slope; see Example 5.1.

In Theorem 4.1, we show that optimal map and plan problems (1) and (2) have identical values, provided that $\nu$ is atomless. The result is similar to that of Pratelli (2007) for the classical unconstrained case. The existence of an optimal map $X$ for (1) that induces an optimal plan $\gamma=\operatorname{{Law}}(X,Y)$ for (2) is obtained in Theorem 4.5 under the condition that $\nu$ gives zero mass to the graphs of strictly decreasing Lipschitz functions. This assumption is weaker than the standard regularity condition of the Brenier theorem, see (Ambrosio and Gigli, 2013, Theorem 1.26), that requires $\nu$ to assign zero mass to rotations of the graphs of Lipschitz functions. Our second main result, Theorem 4.6, establishes the uniqueness of solutions to (1) and (2) if, in addition, the (one-dimensional) distribution functions of $Y_{1}$ and $Y_{2}$ are continuous. Examples 5.2 and 5.3 show that the conditions of Theorems 4.1 and 4.5 are sharp.

Being applied to the model of Rochet and Vila (1994), Theorems 4.5 and 4.6 yield sufficient conditions for the existence and uniqueness of equilibria, which are stated in Theorem 6.7. These assumptions generalize those in Rochet and Vila (1994), where $Y=(U,V)$ is required to have a continuous compactly supported density in $\mathbb{R}^{2}$ . Rochet and Vila (1994) work with dual problem (4) and rely on the properties of the space of closed graph correspondences endowed with the Hausdorff topology.

Finally, Appendix A contains a density result for the Wasserstein spaces, for which we could not find a ready reference, while Appendix B collects the properties of the function $\phi_{G}$ from (3).

2 A backward martingale optimal transport problem

We denote by $\mathcal{P}_{2}(\mathbb{R}^{d})$ the family of Borel probability measures with finite second moments and by $\mathcal{B}(\mathbb{R}^{d})$ the Borel $\sigma$ -algebra on $\mathbb{R}^{d}$ . For a Borel probability measure $\mu$ on $\mathbb{R}^{d}$ , a $\mu$ -integrable $m$ -dimensional Borel function $f=(f_{1},\dots,f_{m})$ , and an $n$ -dimensional Borel function $g=(g_{1},\dots,g_{n})$ , the notation $\mu(f|g)$ stands for the $m$ -dimensional vector of conditional expectations of $f_{i}$ given $g$ under $\mu$ :

$$\mu(f|g)=\left(\mu(f_{1}|g_{1},\dots,g_{n}),\dots,\mu(f_{m}|g_{1},\dots,g_{n})\right).$$

Similarly, $\int fd\mu=(\int f_{1}d\mu,\dots,\int f_{m}d\mu)$ . We write a point in $\mathbb{R}^{4}=\mathbb{R}^{2}\times\mathbb{R}^{2}$ as $(x,y)$ , where $x=(x_{1},x_{2})$ and $y=(y_{1},y_{2})$ belong to $\mathbb{R}^{2}$ , and think about $x$ and $y$ as the initial and terminal values of the canonical two-dimensional process.

Let $\nu=\nu(dy)\in\mathcal{P}_{2}(\mathbb{R}^{2})$ . We denote by $\Gamma(\nu)$ the family of probability measures $\gamma=\gamma(dx,dy)\in\mathcal{P}_{2}(\mathbb{R}^{2}\times\mathbb{R}^{2})$ that have $\nu$ as their $y$ -marginal and make a martingale out of the canonical process:

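$$\Gamma(\nu)\triangleq\left\{\gamma\in\mathcal{P}_{2}(\mathbb{R}^{2}\times\mathbb{R}^{2}):\;\gamma(\mathbb{R}^{2},dy)=\nu(dy)\text{ and }\gamma(y|x)=x\right\}.$$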
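For finitely supported measures, the two constraints that define $\Gamma(\nu)$ (the $y$-marginal equals $\nu$ and the conditional mean of $y$ given $x$ equals $x$) can be checked mechanically. The following is a minimal sketch under assumptions that are not part of the paper: plans are stored as dictionaries `{(x, y): weight}`, the helper names `y_marginal` and `is_martingale` are hypothetical, and the four-point measure `nu` is chosen only for illustration.

```python
from collections import defaultdict


def y_marginal(plan):
    """Return the y-marginal of a discrete plan given as {(x, y): weight}."""
    nu = defaultdict(float)
    for (_, y), w in plan.items():
        nu[y] += w
    return dict(nu)


def is_martingale(plan, tol=1e-12):
    """Check gamma(y | x) = x: the conditional mean of y given x equals x."""
    mass = defaultdict(float)
    first_moment = defaultdict(lambda: [0.0, 0.0])
    for (x, y), w in plan.items():
        mass[x] += w
        first_moment[x][0] += w * y[0]
        first_moment[x][1] += w * y[1]
    return all(
        abs(first_moment[x][i] / mass[x] - x[i]) <= tol
        for x in mass
        for i in range(2)
    )


# Illustrative terminal law nu: uniform on the corners of a square (assumption).
nu = {(-1.0, -1.0): 0.25, (-1.0, 1.0): 0.25, (1.0, -1.0): 0.25, (1.0, 1.0): 0.25}

# Hypothetical plan: each x is the average of the two corners sharing its first coordinate.
plan = {
    ((-1.0, 0.0), (-1.0, -1.0)): 0.25,
    ((-1.0, 0.0), (-1.0, 1.0)): 0.25,
    ((1.0, 0.0), (1.0, -1.0)): 0.25,
    ((1.0, 0.0), (1.0, 1.0)): 0.25,
}

in_gamma = y_marginal(plan) == nu and is_martingale(plan)
print(in_gamma)
```

The check says nothing about optimality; it only verifies membership in the feasible set for this toy plan.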

Our goal is to

$$\text{minimize}\quad\int c(x,y)\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu)\tag{5}$$

for the covariance-type cost function

$$c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2}),\quad x,y\in\mathbb{R}^{2}.$$

Problem (5) belongs to the class of optimal transport problems with backward martingale constraints, in the sense that the initial $x$-marginal is part of the solution. As we shall see in Section 6, such a problem naturally arises in the study of the Kyle-type equilibrium with insider.

Remark 2.1.

Problem (5) admits several equivalent formulations. For instance, it has the same solutions as the problem in which we

$$\text{maximize}\quad\int x_{1}x_{2}\,d\gamma\quad\text{over}\quad\gamma\in\Gamma(\nu).\tag{6}$$

The justification comes from the identity

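$$\int c(x,y)\,d\gamma=\int\left(x_{1}x_{2}-x_{1}y_{2}-y_{1}x_{2}+y_{1}y_{2}\right)d\gamma=\int y_{1}y_{2}\,d\nu-\int x_{1}x_{2}\,d\gamma,$$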

where the second equality holds by the martingale property of $\gamma\in\Gamma(\nu)$ .
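The identity of Remark 2.1 can be verified numerically on a small example. The sketch below uses a hypothetical four-atom martingale plan (the atoms, weights, and variable names are assumptions made for illustration) and compares the average cost with the difference of the two mixed moments.

```python
def cost(x, y):
    """Covariance-type cost c(x, y) = (x1 - y1)(x2 - y2)."""
    return (x[0] - y[0]) * (x[1] - y[1])


# Hypothetical martingale plan {(x, y): weight}: each x is the mean of its two targets y.
plan = {
    ((0.0, 0.0), (-1.0, -2.0)): 0.25,
    ((0.0, 0.0), (1.0, 2.0)): 0.25,
    ((2.0, 1.0), (1.0, 1.0)): 0.25,
    ((2.0, 1.0), (3.0, 1.0)): 0.25,
}

total_cost = sum(w * cost(x, y) for (x, y), w in plan.items())
mean_y1y2 = sum(w * y[0] * y[1] for (_, y), w in plan.items())
mean_x1x2 = sum(w * x[0] * x[1] for (x, _), w in plan.items())

# Left and right sides of the identity in Remark 2.1.
print(total_cost, mean_y1y2 - mean_x1x2)
```

Since the term $\int y_{1}y_{2}\,d\nu$ depends only on $\nu$, minimizing the cost over $\Gamma(\nu)$ amounts to maximizing $\int x_{1}x_{2}\,d\gamma$, which is the reformulation stated in the remark.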

For a Borel probability measure $\gamma$ on $\mathbb{R}^{d}$ we denote by $\operatorname{supp}{\gamma}$ its support, that is, the smallest closed set with full measure. We recall that a set $G\subset\mathbb{R}^{2}$ is monotone if

$$c(r,s)=(r_{1}-s_{1})(r_{2}-s_{2})\geq 0,\quad r,s\in G.$$

A monotone set $G$ is maximal if it is not a proper (or strict) subset of a monotone set. We denote by $\mathfrak{M}$ the family of maximal monotone sets in $\mathbb{R}^{2}$ . It is well-known that $G\in\mathfrak{M}$ if and only if $G$ is the graph of the subdifferential of a proper closed convex function on $\mathbb{R}$ .
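For a finite set of points, monotonicity reduces to a sign check of $c(r,s)$ over all pairs. The sketch below illustrates this; the function name `is_monotone` and the sample point sets are assumptions made for the example. Maximality is a property of the whole set and cannot be tested on a finite sample, so it is not attempted here.

```python
from itertools import combinations


def cost(r, s):
    """Covariance-type cost c(r, s) = (r1 - s1)(r2 - s2)."""
    return (r[0] - s[0]) * (r[1] - s[1])


def is_monotone(points):
    """A finite set is monotone if c(r, s) >= 0 for every pair of its points."""
    return all(cost(r, s) >= 0 for r, s in combinations(points, 2))


# Points on the graph of an increasing function.
increasing = [(t, t**3) for t in (-2, -1, 0, 1, 2)]
# Adding a point on a vertical segment at x1 = 0 keeps the set monotone.
with_vertical_step = increasing + [(0, 0.5)]
# Two points on a decreasing line violate the condition.
decreasing = [(0, 1), (1, 0)]

print(is_monotone(increasing), is_monotone(with_vertical_step), is_monotone(decreasing))
```

The second sample reflects the fact that maximal monotone sets, being graphs of subdifferentials, may contain vertical and horizontal segments.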

For $G\in\mathfrak{M}$ we define a function

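$$\phi_{G}(y)\triangleq\inf_{x\in G}c(x,y)=\inf_{x\in G}(x_{1}-y_{1})(x_{2}-y_{2}),\quad y\in\mathbb{R}^{2}.$$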

Such functions $\phi_{G}$ will play a key role in our study. Their properties are collected in Appendix B. In particular, Lemma B.1 states that $\phi_{G}$ takes values in $[-\infty,0]$ and $G=\left\{{x\in\mathbb{R}^{2}}:\;\phi_{G}(x)=0\right\}$ .
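As an illustration of the function $\phi_{G}$, take $G$ to be the line $\{(t,kt):\;t\in\mathbb{R}\}$ with slope $k>0$. Minimizing the quadratic $t\mapsto(t-y_{1})(kt-y_{2})$ gives the closed form $\phi_{G}(y)=-(y_{2}-ky_{1})^{2}/(4k)$; this formula is our own elementary computation and is not quoted from the paper. The sketch below compares it with a grid minimization; the function names, the grid bounds, and the test points are assumptions.

```python
def cost(x, y):
    """Covariance-type cost c(x, y) = (x1 - y1)(x2 - y2)."""
    return (x[0] - y[0]) * (x[1] - y[1])


def phi_line_numeric(y, k, lo=-10.0, hi=10.0, n=20001):
    """Approximate inf over the line G = {(t, k t)} of c(., y) on a uniform grid in t."""
    step = (hi - lo) / (n - 1)
    return min(cost((lo + i * step, k * (lo + i * step)), y) for i in range(n))


def phi_line_closed(y, k):
    """Closed form -(y2 - k y1)^2 / (4 k), derived for this example only."""
    return -((y[1] - k * y[0]) ** 2) / (4.0 * k)


k = 2.0
y_off_line = (1.0, -2.0)   # not on G, so phi_G is strictly negative here
y_on_line = (3.0, 6.0)     # on G, so phi_G vanishes here

print(phi_line_numeric(y_off_line, k), phi_line_closed(y_off_line, k))
print(phi_line_closed(y_on_line, k))
```

In this example the values are nonpositive and vanish exactly on the line, in agreement with the properties attributed to Lemma B.1.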

The main results of the paper are Theorems 2.2 and 4.6. Theorem 2.2 establishes the existence of an optimal plan $\gamma$ for (5) and shows the structure of its support. Theorem 4.6 contains a uniqueness result.

Theorem 2.2.

Let $\nu\in\mathcal{P}_{2}(\mathbb{R}^{2})$ . An optimal plan for (5) exists. For a probability measure $\gamma\in\Gamma(\nu)$ the following conditions are equivalent:

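(a) $\gamma$ is an optimal plan for (5).

(b) If points $(x^{0},y^{0})$ and $(x^{1},y^{1})$ belong to $\operatorname{supp}{\gamma}$, then

$$(1-t)c(x^{0},y^{0})+tc(x^{1},y^{1})\leq t(1-t)c(y^{0},y^{1}),\quad t\in[0,1].\tag{7}$$

(c) There is $G\in\mathfrak{M}$ such that

$$c(x,y)\leq\phi_{G}(y),\quad(x,y)\in\operatorname{supp}\gamma.\tag{8}$$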

Moreover, if $G$ is a maximal monotone set satisfying (8) and $\mu$ is the $x$ -marginal of $\gamma$ , then $G$ contains $\operatorname{supp}{\mu}$ and

$$c(x,y)=\phi_{G}(y)=\min_{z\in G}c(z,y),\quad(x,y)\in\operatorname{supp}\gamma.$$

Figure 1 illustrates the properties of the support of an optimal plan $\gamma$ stated in Theorem 2.2. Let $(x^{0},y^{0})$ and $(x^{1},y^{1})$ belong to $\operatorname{supp}{\gamma}$ and be such that $x^{0}\not=x^{1}$ and the points $y^{0}$ and $y^{1}$ lie, respectively, strictly above and strictly below the maximal monotone set $G$ from item (c). As Lemma 2.11 shows, item (b) means that the hyperbolas

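$$\begin{aligned}H^{0}&=\left\{z\in\mathbb{R}^{2}:\;c(z,y^{0})=c(x^{0},y^{0}),\;z_{1}>y_{1}^{0}\right\},\\ H^{1}&=\left\{z\in\mathbb{R}^{2}:\;c(z,y^{1})=c(x^{1},y^{1}),\;z_{1}<y_{1}^{1}\right\},\end{aligned}$$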

do not cross. The geometric interpretation of item (c) is that these hyperbolas are tangent to $G$ .
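Condition (b) of Theorem 2.2 can be tested numerically for a given pair of points. The sketch below evaluates the inequality $(1-t)c(x^{0},y^{0})+tc(x^{1},y^{1})\leq t(1-t)c(y^{0},y^{1})$ on a grid of $t\in[0,1]$. The set $G$ is taken to be the diagonal $\{x_{1}=x_{2}\}$, for which the minimizer of $c(\cdot,y)$ is $((y_{1}+y_{2})/2,(y_{1}+y_{2})/2)$; the diagonal, the function names, and the sample points are assumptions made for illustration.

```python
def cost(x, y):
    """Covariance-type cost c(x, y) = (x1 - y1)(x2 - y2)."""
    return (x[0] - y[0]) * (x[1] - y[1])


def satisfies_condition_b(pair0, pair1, steps=1000, tol=1e-12):
    """Check (1-t) c(x0,y0) + t c(x1,y1) <= t (1-t) c(y0,y1) on a grid of t in [0, 1]."""
    (x0, y0), (x1, y1) = pair0, pair1
    c0, c1, cy = cost(x0, y0), cost(x1, y1), cost(y0, y1)
    return all(
        (1 - t) * c0 + t * c1 <= t * (1 - t) * cy + tol
        for t in (i / steps for i in range(steps + 1))
    )


def minimizer_on_diagonal(y):
    """Minimizer of c(., y) over the diagonal G = {x1 = x2} (illustrative choice of G)."""
    m = 0.5 * (y[0] + y[1])
    return (m, m)


# y0 lies strictly above the diagonal and y1 strictly below it.
y0, y1 = (0.0, 2.0), (3.0, 1.0)
tangent_pairs_ok = satisfies_condition_b(
    (minimizer_on_diagonal(y0), y0), (minimizer_on_diagonal(y1), y1)
)

# Keeping two points of a decreasing line in place (x = y) violates the inequality.
identity_pairs_ok = satisfies_condition_b(
    ((0.0, 1.0), (0.0, 1.0)), ((1.0, 0.0), (1.0, 0.0))
)

print(tangent_pairs_ok, identity_pairs_ok)
```

The second case shows why the identity coupling is not optimal when the terminal points are negatively ordered: both costs vanish, while the right-hand side is strictly negative for $t\in(0,1)$.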

Before proceeding with the proof of Theorem 2.2, we establish rather surprising connections between an optimal martingale plan for (5) and the solutions of classical unconstrained optimal transport problems. If $\mu$ and $\nu$ are Borel probability measures on $\mathbb{R}^{d}$ , then $\Pi(\mu,\nu)$ denotes the family of all couplings of $\mu$ and $\nu$ , that is, the family of Borel probability measures $\pi$ on $\mathbb{R}^{d}\times\mathbb{R}^{d}=\left\{{(x,y)}:\;x,y\in\mathbb{R}^{d}\right\}$ with $x$ -marginal $\mu$ and $y$ -marginal $\nu$ . For $\mu,\nu\in\mathcal{P}_{2}(\mathbb{R}^{d})$ , the Wasserstein 2-metric is given by

$$W_{2}(\mu,\nu)\triangleq\inf_{\pi\in\Pi(\mu,\nu)}\int\left\lvert x-y\right\rvert^{2}d\pi.$$

Corollary 2.3.

Let $\nu\in\mathcal{P}_{2}(\mathbb{R}^{2})$ , $\mu$ be the $x$ -marginal of an optimal plan $\gamma$ for (5), and $\mu_{i}$ be the $x_{i}$ -marginal of $\mu$ , $i=1,2$ . Then $\mu$ is a solution of the optimal transport problem:

$$\text{maximize}\quad\int x_{1}x_{2}\,d\pi\quad\text{over}\quad\pi\in\Pi(\mu_{1},\mu_{2}),\tag{9}$$

or, equivalently,

$$W_{2}(\mu_{1},\mu_{2})=\int\left\lvert x_{1}-x_{2}\right\rvert^{2}d\mu.\tag{10}$$

Proof.

It is well known that problems (9) and (10) have the same solutions and that an element of $\Pi(\mu_{1},\mu_{2})$ is such a solution if and only if its support is contained in a cyclically monotone set. By Theorem 2.2, there is a monotone set $G$ that contains the support of $\mu$. Since every monotone set in $\mathbb{R}^{2}$ is also cyclically monotone, the result follows. ∎

Remark 2.4.
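The Fréchet-Hoeffding property used in this proof can be illustrated on empirical measures with equally weighted atoms. For such measures the maximum of $\int x_{1}x_{2}\,d\pi$ over couplings is attained at a permutation pairing, so a brute-force search over permutations suffices. The sketch below compares that search with the sorted (comonotone) pairing; the sample values and function names are assumptions, and the common factor $1/n$ is omitted.

```python
from itertools import permutations


def best_pairing_value(a, b):
    """Maximum of sum_i a[i] * b[p[i]] over all permutations p (brute force)."""
    return max(
        sum(ai * b[j] for ai, j in zip(a, p))
        for p in permutations(range(len(b)))
    )


def comonotone_value(a, b):
    """Value of the pairing that matches the sorted atoms of a with the sorted atoms of b."""
    return sum(ai * bi for ai, bi in zip(sorted(a), sorted(b)))


# Hypothetical atoms of the two one-dimensional marginals.
a = [0.3, -1.2, 2.0, 0.7, -0.4]
b = [1.5, 0.1, -0.6, 2.2, -1.0]

print(best_pairing_value(a, b), comonotone_value(a, b))
```

The sorted pairing is supported on a monotone set of points, which is the finite analogue of the support condition invoked in the proof.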

Let $G$ be a maximal monotone set from Theorem 2.2 and $P_{1}$ be its projection on $x_{1}$ -coordinate. If $\mu_{1}$ is atomless, then the increasing function

$$f(x_{1})=\inf\left\{x_{2}\in\mathbb{R}:\;(x_{1},x_{2})\in G\right\},\quad x_{1}\in P_{1},$$

taking values in $\mathbb{R}\cup\left\{{-\infty}\right\}$ , defines an optimal map solution to (10):

$$\mu(B)=\mu_{1}\left\{t\in\mathbb{R}:\;(t,f(t))\in B\right\},\quad B\in\mathcal{B}(\mathbb{R}^{2}).$$

The function $f$ is a pricing rule in a version of the Kyle equilibrium with insider studied in Section 6.

Corollary 2.5.

Let $\nu\in\mathcal{P}_{2}(\mathbb{R}^{2})$ , $\gamma$ be an optimal plan for (5), and $\mu$ be the $x$ -marginal of $\gamma$ . Then $\gamma$ is a solution of the optimal transport problem:

$$\text{minimize}\quad\int c(x,y)\,d\pi\quad\text{over}\quad\pi\in\Pi(\mu,\nu).\tag{11}$$

Proof.

By Theorem 2.2, there is $G\in\mathfrak{M}$ such that

$$\int\phi\,d\nu=\int c(x,y)\,d\gamma,$$

where $\phi=\phi_{G}$ . Lemma B.1 shows that the $c$ -conjugate function

$$\phi^{c}(x)\triangleq\inf_{y\in\mathbb{R}^{2}}\left(c(x,y)-\phi(y)\right),\quad x\in\mathbb{R}^{2},$$

takes values in $[-\infty,0]$ and $G=\left\{{x\in\mathbb{R}^{2}}:\;\phi^{c}(x)=0\right\}$ . By Theorem 2.2, $\operatorname{supp}{\mu}\subset G$ and thus,

$$\int\phi^{c}\,d\mu=0.$$

Since $\phi^{c}(x)+\phi(y)\leq c(x,y)$ , we deduce that

$$\int c(x,y)\,d\gamma=\int\phi\,d\nu+\int\phi^{c}\,d\mu\leq\int c(x,y)\,d\pi,\quad\pi\in\Pi(\mu,\nu),$$

and the optimality of $\gamma$ for (11) follows. ∎

Remark 2.6.

We point out that the assertions of the corollaries are not sufficient for the optimality of $\gamma\in\Gamma(\nu)$ . Indeed, let $\gamma$ be the simplest martingale measure, whose $x$ -marginal $\mu$ is the Dirac measure concentrated at the mean $\int yd\nu\in\mathbb{R}^{2}$ . In this case, the families $\Pi(\mu_{1},\mu_{2})$ and $\Pi(\mu,\nu)$ are singletons and hence, $\mu$ and $\gamma$ are trivial solutions to (10) and (11). An elementary analysis of (8) shows that such $\gamma$ is optimal for (5) if and only if the support of $\nu$ belongs to a line with negative or infinite slope.

The rest of the section is devoted to the proof of Theorem 2.2, which we divide into lemmas. We start with the existence part and recall some basic facts on the Wasserstein distance $W_{2}$ ; see (Ambrosio and Gigli, 2013, Theorem 2.7 and Proposition 2.4). If $(\mu_{n})$ and $\mu$ are in $\mathcal{P}_{2}(\mathbb{R}^{d})$ , then $W_{2}(\mu_{n},\mu)\to 0$ if and only if $\int f(x)d\mu_{n}\to\int f(x)d\mu$ for every continuous function $f=f(x)$ on $\mathbb{R}^{d}$ with quadratic growth:

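$$\left\lvert f(x)\right\rvert\leq K\left(1+\left\lvert x\right\rvert^{2}\right),\quad x\in\mathbb{R}^{d},$$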

where $K=K(f)>0$ is a constant. A set $A\subset\mathcal{P}_{2}(\mathbb{R}^{d})$ is pre-compact under $W_{2}$ if and only if

[TABLE]

Lemma 2.7.

The family $\Gamma(\nu)$ is a convex compact set in $\mathcal{P}_{2}(\mathbb{R}^{2}\times\mathbb{R}^{2})$ under the Wasserstein metric $W_{2}$ .

Proof.

The martingale property $\gamma(y|x)=x$ of $\gamma\in\Gamma(\nu)$ is equivalent to the identity

$$\int f(x)(y-x)\,d\gamma=0$$

for every bounded and continuous function $f=f(x)$ on $\mathbb{R}^{2}$ . The convexity and the closedness of $\Gamma(\nu)$ under $W_{2}$ readily follow. It only remains to be shown that $\Gamma(\nu)$ is pre-compact under $W_{2}$ or, equivalently, that

[TABLE]

For $\gamma\in\Gamma(\nu)$ we have that

[TABLE]

and then that

[TABLE]

Finally, we obtain that

[TABLE]

and the result follows. ∎

Lemma 2.8.

An optimal plan $\gamma$ for (5) exists.

Proof.

Let $(\gamma_{n})$ be a sequence in $\Gamma(\nu)$ such that

$$\lim_{n\to\infty}\int c(x,y)\,d\gamma_{n}=\inf_{\zeta\in\Gamma(\nu)}\int c(x,y)\,d\zeta.$$

By Lemma 2.7, $\Gamma(\nu)$ is compact under $W_{2}$ . Hence, there is a subsequence $(\gamma_{n_{k}})\subset(\gamma_{n})$ that converges to $\gamma\in\Gamma(\nu)$ under $W_{2}$ . Since the cost function $c=c(x,y)$ is continuous and has quadratic growth, we deduce that

[TABLE]

Thus, $\gamma$ is an optimal plan. ∎

The implication (a) $\implies$ (b) of Theorem 2.2 is proved in Lemma 2.10 and relies on the following first-order optimality condition.

Lemma 2.9.

Let $\gamma$ be an optimal plan for (5). Then

[TABLE]

for every $\eta\in\mathcal{P}_{2}(\mathbb{R}^{2}\times\mathbb{R}^{2})$ such that $\operatorname{supp}\eta\subset\operatorname{supp}\gamma$ .

Proof.

We first establish (12) for a Borel probability measure $\eta$ on $\mathbb{R}^{2}\times\mathbb{R}^{2}$ that has a bounded density with respect to $\gamma$ :

[TABLE]

We choose a non-atom $q\in\mathbb{R}^{2}$ of $\mu(dx)=\gamma(dx,\mathbb{R}^{2})$ and define the probability measure

[TABLE]

where $\delta_{q}$ is the Dirac measure concentrated at $q$ . For sufficiently small $\varepsilon>0$ the probability measure

[TABLE]

is well-defined and has the same $y$ -marginal $\nu$ as $\gamma$ . We define the conditional expectation $\widetilde{X}(x)=\widetilde{\gamma}(y|x)$ and observe that the law of $(\widetilde{X},y)$ under $\widetilde{\gamma}$ belongs to $\Gamma(\nu)$ . The optimality of $\gamma$ for (5) or, equivalently, for (6) implies that

[TABLE]

Standard computations based on Bayes formula show that

[TABLE]

where $U(x)=\gamma(V|x)$ , $R(x)=\gamma(Vy|x)$ , and $m=\int yd\eta$ . Since $\left\lvert V\right\rvert\leq K$ for some constant $K>0$ , we deduce that $\left\lvert U\right\rvert\leq K$ and $\left\lvert R\right\rvert\leq K\gamma(\left\lvert y\right\rvert|x)$ . It follows that

[TABLE]

In view of (13), the first-order term is negative. It can be written as

[TABLE]

and the result follows.

In the general case, where $\eta\in\mathcal{P}_{2}(\mathbb{R}^{2}\times\mathbb{R}^{2})$ and $\operatorname{supp}{\eta}\subset\operatorname{supp}{\gamma}$ , we use the approximation result from Appendix A. By Theorem A.1, there are Borel probability measures $(\eta_{n})$ on $\mathbb{R}^{2}\times\mathbb{R}^{2}$ that have bounded densities with respect to $\gamma$ and converge to $\eta$ under $W_{2}$ . By what we have already proved,

[TABLE]

Since the integrands are continuous functions with quadratic growth, we can pass to the limit as $n\to\infty$ and obtain (12). ∎

Lemma 2.10.

Let $\gamma$ be an optimal plan for (5). Then condition (b) of Theorem 2.2 holds.

Proof.

Lemma 2.9 yields inequality (12) for the probability measure

[TABLE]

where $t\in[0,1]$ , $(x^{i},y^{i})\in\operatorname{supp}{\gamma}$ , and $\delta_{(x^{i},y^{i})}$ is the Dirac measure concentrated at $(x^{i},y^{i})$ , $i=0,1$ . Elementary computations show that for such $\eta$ (12) becomes (7). ∎

The equivalence of assertions (b) and (c) of Theorem 2.2 is a special case of Lemma 2.12, whose proof relies on the following geometric interpretation of (7). Figure 1 visualizes the arguments.

Lemma 2.11.

Let $x^{i}$ and $y^{i}$ , $i=0,1$ , be points in $\mathbb{R}^{2}$ such that $y^{0}_{1}<y^{1}_{1}$ . Then (7) holds if and only if for all $a_{i}<c(x^{i},y^{i})$ , $i=0,1$ , the graphs of the hyperbolas

[TABLE]

do not intersect:

[TABLE]

Proof.

The result follows from the identity:

[TABLE]

where $s\in(y^{0}_{1},y^{1}_{1})$ and $t=(s-y^{0}_{1})/(y^{1}_{1}-y^{0}_{1})$ . ∎

Lemma 2.12.

For a set $A\subset\mathbb{R}^{2}\times\mathbb{R}^{2}$ the following conditions are equivalent:

(i) If points $(x^{0},y^{0})$ and $(x^{1},y^{1})$ belong to $A$, then (7) holds.

(ii) There is $G\in\mathfrak{M}$ such that

$$c(x,y)\leq\phi_{G}(y),\quad(x,y)\in A.$$

Proof.

We observe first that under either (i) or (ii),

$$c(x,y)\leq 0,\quad(x,y)\in A.$$

Indeed, under (i) this inequality follows from (7), while under (ii) it holds because $\phi_{G}\leq 0$ . We define the open sets

[TABLE]

where, for $(x,y)\in A$ ,

[TABLE]

The boundaries of $B^{i}$ , $i=0,1$ , are, respectively, upper and lower envelopes of the graphs of increasing hyperbolas and thus, are maximal monotone sets.

By Lemma 2.11, item (i) holds if and only if the sets $B^{0}$ and $B^{1}$ are disjoint:

$$B^{0}\cap B^{1}=\emptyset.\tag{14}$$

On the other hand, item (ii) holds if and only if the closed set

[TABLE]

contains a maximal monotone set $G$ . As

[TABLE]

every such set $G$ separates $B^{0}$ and $B^{1}$ . Hence, its existence yields (14) and then (i). Conversely, if the sets $B^{0}$ and $B^{1}$ are disjoint, then their boundaries belong to $C$ . As the boundaries are maximal monotone sets, we obtain (ii). ∎

The remaining assertions of the theorem follow from Lemma 2.14. A key role is played by inequality (15).

Lemma 2.13.

Let $\gamma\in\Gamma(\nu)$ and $\mu$ be its $x$ -marginal. For every $G\in\mathfrak{M}$ we have that

[TABLE]

and then that

[TABLE]

Proof.

We only need to prove the inequality with conditional expectations. From Lemma B.1 we know that $\phi_{G}\leq 0$ . As $x=\gamma(y|x)$ , we deduce that for every $r\in\mathbb{R}^{2}$ :

[TABLE]

and, taking the infimum over a countable dense set of $r\in G$, we obtain the result. ∎

The following lemma completes the proof of the theorem.

Lemma 2.14.

Let $\gamma\in\Gamma(\nu)$ , $\mu$ be its $x$ -marginal, and $G\in\mathfrak{M}$ be such that

[TABLE]

Then in (16) we actually have the equality, $\gamma$ is an optimal plan for (5), the set $G$ contains $\operatorname{supp}{\mu}$ , and

[TABLE]

Proof.

We shall write $\phi$ for $\phi_{G}$ . From (15) and (16) we deduce that $\gamma$ is a solution to (5), that in (16) we have an equality, and that

[TABLE]

Lemma B.1 states that $\phi\leq 0$ and $G=\left\{{x\in\mathbb{R}^{2}}:\;\phi(x)=0\right\}$ . It follows that $\mu(G)=1$ . Being a closed set, $G$ contains $\operatorname{supp}{\mu}$ . In particular, if $(x,y)\in\operatorname{supp}{\gamma}$ , then $x\in G$ . It follows that

[TABLE]

Accounting for (16), we deduce that

[TABLE]

Hence, for every $(x,y)\in\operatorname{supp}{\gamma}$ we can find a sequence $\left\{{(x_{n},y_{n})}\right\}\subset\operatorname{supp}{\gamma}$ that converges to $(x,y)$ and such that $c(x_{n},y_{n})=\phi(y_{n})$, $n\geq 1$. Being a pointwise infimum of continuous functions, the function $\phi$ is upper semi-continuous. It follows that

[TABLE]

and we obtain (17). ∎

3 Dual problem

In view of Theorem 2.2 and Lemma 2.13, a natural dual problem to (5) is to

$$\text{maximize}\quad\int\phi_{G}\,d\nu\quad\text{over}\quad G\in\mathfrak{M}.\tag{18}$$

Such a problem appears in Rochet and Vila (1994) in connection with their study of a Kyle-type equilibrium with insider; see Section 6. They use a direct method based on the properties of the space of closed graph correspondences and assume that $\nu$ has a compactly supported density.
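To get a feel for the dual objective $\int\phi_{G}\,d\nu$, one can restrict it to lines $G_{k}=\{x_{2}=kx_{1}\}$ with slope $k>0$. This restriction is our own simplification and not the method of the paper. For such a line, an elementary minimization gives $\phi_{G_{k}}(y)=-(y_{2}-ky_{1})^{2}/(4k)$, so for a centered $\nu$ the objective depends only on the standard deviations $s_{1},s_{2}$ and the correlation $\rho$ of the two coordinates. The sketch below maximizes the restricted objective over a grid of slopes; all names and parameter values are assumptions.

```python
def dual_value_line(k, s1, s2, rho):
    """Integral of phi_{G_k} against a centered nu, with G_k = {x2 = k x1}.

    Uses phi_{G_k}(y) = -(y2 - k y1)^2 / (4 k), a formula derived for this sketch,
    and the second moments of nu: std devs s1, s2 and correlation rho.
    """
    second_moment = s2**2 - 2.0 * k * rho * s1 * s2 + (k * s1) ** 2  # E (Y2 - k Y1)^2
    return -second_moment / (4.0 * k)


s1, s2, rho = 2.0, 0.5, 0.3  # hypothetical second-moment parameters

# Grid search over slopes in (0, 5].
slopes = [0.001 * i for i in range(1, 5001)]
k_best = max(slopes, key=lambda k: dual_value_line(k, s1, s2, rho))

# Calculus on the restricted objective gives the slope s2 / s1 and the value below.
k_closed = s2 / s1
value_closed = -0.5 * s1 * s2 * (1.0 - rho)

print(k_best, k_closed, dual_value_line(k_closed, s1, s2, rho), value_closed)
```

The computation only identifies the best line through the origin for the given second moments. Whether a line is optimal among all maximal monotone sets depends on $\nu$; the paper states this for elliptically contoured distributions in Example 5.1, which is not reproduced here.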

We recall that $G=\left\{{x\in\mathbb{R}^{2}}:\;\phi_{G}(x)=0\right\}$ , $G\in\mathfrak{M}$ , and thus the family $\mathfrak{M}$ of all maximal monotone sets in $\mathbb{R}^{2}$ is in one-to-one correspondence with the family of functions

[TABLE]

Hence, (18) is equivalent to the problem, where we

[TABLE]

A technical inconvenience of the set $\Phi$ is the absence of convexity. It turns out that the set of functions dominated by the elements of $\Phi$ is not only convex, but also admits a self-contained description related to item (b) of Theorem 2.2.

Lemma 3.1.

Let $\phi:\;\mathbb{R}^{2}\rightarrow[-\infty,0]$ be a Borel function. Then $\phi\leq\phi_{G}$ for some $G\in\mathfrak{M}$ if and only if

[TABLE]

The set of such functions $\phi$ is convex.

Proof.

The result follows directly from Lemma 2.12, where we take

[TABLE]

Clearly, the family of functions $\phi$ satisfying (19) is convex. ∎

Theorem 3.2.

Let $\nu\in\mathcal{P}_{2}(\mathbb{R}^{2})$ . We have that

[TABLE]

where the lower and upper bounds are attained at respective solutions to (5) and (18). A probability measure $\gamma\in\Gamma(\nu)$ and a maximal monotone set $G$ are such solutions if and only if

[TABLE]

In this case, $G$ contains the support of the $x$ -marginal of $\gamma$ . Moreover, $\phi_{G}$ and $G$ are uniquely defined on $\operatorname{supp}{\nu}$ , that is,

[TABLE]

for any other solution $\widetilde{G}$ to (18). In particular, $\phi_{G}$ and $G$ are unique if $\operatorname{supp}{\nu}=\mathbb{R}^{2}$ .

Proof.

With the exception of the uniqueness part, all assertions follow directly from Theorem 2.2 and Lemmas 2.13 and 2.14.

Let $\gamma$ be a solution to (5), $G$ and $\widetilde{G}$ be solutions to (18) and denote $\phi=\phi_{G}$ and $\widetilde{\phi}=\phi_{\widetilde{G}}$ . From (20) we deduce that the functions $\phi$ and $\widetilde{\phi}$ coincide on $P_{y}$ , the projection of $\operatorname{supp}{\gamma}$ on $y$ -coordinates. Since every $y\in\operatorname{supp}{\nu}$ is the limit of a sequence $(y_{n})\subset P_{y}$ , Lemma B.3 yields that

[TABLE]

We have proved the uniqueness of $\phi_{G}$ on $\operatorname{supp}{\nu}$ . The uniqueness of $G$ on $\operatorname{supp}{\nu}$ holds as $G=\left\{{x\in\mathbb{R}^{2}}:\;\phi_{G}(x)=0\right\}$ . ∎

4 Optimal maps

For simplicity of notation, we slightly modify the setup. We start with a 2-dimensional random variable $Y=(Y_{1},Y_{2})$ having a finite second moment: $Y\in\mathcal{L}^{2}=\mathcal{L}^{2}(\Omega,\mathcal{F},\mathbb{P})$. As usual, we identify random variables that differ only on a set of measure zero. Our goal is to

$$\text{minimize}\quad\mathbb{E}\left(c(X,Y)\right)\quad\text{over}\quad X\in\mathcal{X}(Y),\tag{21}$$

for the same cost function $c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2})$ and the domain

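$$\mathcal{X}(Y)\triangleq\left\{X\in\mathcal{L}^{2}:\;X\text{ is }Y\text{-measurable and }\mathbb{E}\left(\left.Y\right\lvert X\right)=X\right\}.$$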

We denote $\nu=\operatorname{{Law}}(Y)$ and observe that $\operatorname{{Law}}(X,Y)\in\Gamma(\nu)$ for every $X\in\mathcal{X}(Y)$ . Thus, optimal plan problem (5) may be viewed as a Kantorovich-type relaxation of optimal map problem (21). In general,

$$\min_{\gamma\in\Gamma(\nu)}\int c(x,y)\,d\gamma\leq\inf_{X\in\mathcal{X}(Y)}\mathbb{E}\left(c(X,Y)\right),\tag{22}$$

and the inequality may be strict, and an optimal map may fail to exist, as Examples 5.2 and 5.3 show.

The main results of this section are Theorems 4.1, 4.5, and 4.6. Theorem 4.1 yields the equality in (22) provided that $\nu=\operatorname{{Law}}(Y)$ is atomless. Theorem 4.5 shows the existence of an optimal map if $\nu$ is $\mathcal{D}$-regular in the sense of Definition 4.4. Theorem 4.6 establishes the uniqueness of the optimal plan and map if, in addition, every component $Y_{i}$ has a continuous distribution function. The last two theorems play a key role in the study of equilibrium in Section 6.

We shall use the notation from Appendix B related to the function $\phi=\phi_{G}$, where $G\in\mathfrak{M}$. In particular, $D^{c}=(D^{c}_{1},D^{c}_{2})$ stands for the differential operator associated with the cost function $c=c(x,y)$:

[TABLE]

where $\operatorname{dom}{\nabla\phi}$ is the set of points where $\phi$ is differentiable. We denote by $E^{G}=E_{1}^{G}\cup E_{2}^{G}$ the union of the vertical and horizontal line segments of $G$ :

[TABLE]

Clearly, the sets $(\mathcal{T}^{G}_{i})$ are countable. Finally, we define

[TABLE]

The following result is similar to that of Pratelli (2007) obtained for the classical unconstrained optimal transport problem.

Theorem 4.1.

Let $Y=(Y_{1},Y_{2})\in\mathcal{L}^{2}$ and suppose that $\nu=\operatorname{{Law}}(Y)$ is atomless. Then plan and map problems (5) and (21) have identical values:

[TABLE]

The proof of the theorem relies on some lemmas.

Lemma 4.2.

Let $\nu\in\mathcal{P}_{2}(\mathbb{R}^{2})$ , $\gamma\in\Gamma(\nu)$ be an optimal plan for (5), and $G\in\mathfrak{M}$ be a maximizer for (18). If $\nu(G)>0$ , then the probability measure

[TABLE]

has the martingale property: $\eta(y|x)=x$ .

Proof.

We write $\phi$ for $\phi_{G}$ . We shall show that $\eta(y_{1}|x)=x_{1}$ , that is, that

[TABLE]

for every bounded Borel function $f$ on $\mathbb{R}^{2}$ . The martingale property for the second coordinate has a similar proof.

Let $E_{2}=E_{2}^{G}=\cup_{t\in\mathcal{T}_{2}}E_{2}(t)$ be the union of the horizontal line segments of $G$ . If $(x,y)\in\operatorname{supp}{\gamma}$ , then Theorem 3.2 yields that $x\in G$ and $c(x,y)=\phi(y)$ . If, in addition, $y\in G\setminus E_{2}$ , then $c(x,y)=\phi(y)=0$ and, subsequently, $x_{1}=y_{1}$ . Hence, (24) holds if

[TABLE]

Hereafter, we fix $t\in\mathcal{T}_{2}$ . Let $(x,y)\in\operatorname{supp}{\gamma}$ . If $y\in E_{2}(t)$ , then $c(x,y)=\phi(y)=0$ and thus, $x\in E_{2}(t)$ . Conversely, if $x\in\operatorname*{ri}{E_{2}(t)}$ , the relative interior of $E_{2}(t)$ , then Lemma B.4 yields that $y\in G$ and then, as $c(x,y)=\phi(y)=0$ , that $y\in E_{2}(t)$ . Hence,

[TABLE]

where $a(t)$ and $b(t)$ are the boundary points of $E_{2}(t)$ such that $a_{1}(t)<b_{1}(t)$ . Accounting for the martingale property of $\gamma$ , we obtain that

[TABLE]

Let $y\in\mathbb{R}^{2}$ be such that $\phi(y)=c(b(t),y)$ . If $y\not\in G$ , then Lemma B.4 yields that $b_{1}(t)>y_{1}$ . If $y\in G\setminus E_{2}(t)$ , then $c(b(t),y)=\phi(y)=0$ and thus, $b_{1}(t)=y_{1}$ . Finally, if $y\in E_{2}(t)$ , then $b_{1}(t)\geq y_{1}$ . It follows that

[TABLE]

where at the last step we used the martingale property of $\gamma$ . The case of the left boundary $a(t)$ is similar. We have proved (25). ∎

Lemma 4.3.

Let $G\in\mathfrak{M}$ and $X=(X_{1},X_{2})$ and $Y=(Y_{1},Y_{2})$ be random variables such that $X$ takes values in $G$ , $X1_{\left\{{Y\in G}\right\}}=Y1_{\left\{{Y\in G}\right\}}$ , and $c(X,Y)=\phi_{G}(Y)$ . If the law of $Y$ is atomless, then for every $\epsilon>0$ there is a random variable $Z=Z(\epsilon)$ such that $\operatorname{{Law}}(Z)=\operatorname{{Law}}(Y)$ , $\left\lvert Z-Y\right\rvert\leq\epsilon$ , and $X$ is $Z$ -measurable.

Proof.

We fix $\epsilon>0$ and denote $\phi=\phi_{G}$ , $\operatorname{Arg}=\operatorname{Arg}_{G}$ , and

[TABLE]

Theorems B.6 and B.12 show that $D=\cup_{n}D_{n}$ , where $D_{n}$ is either a point or the graph of a strictly decreasing function. Of course, we can choose the sets $(D_{n})$ so that

[TABLE]

For every $n\geq 1$ we shall construct a two-dimensional random variable $Z^{n}=(Z^{n}_{1},Z^{n}_{2})$ and a Borel function $f^{n}:\;D_{n}\rightarrow G$ such that

[TABLE]

Given the sequence of such pairs $(Z^{n},f^{n})$ , $n\geq 1$ , we define

[TABLE]

where $F_{n}=D_{n}\setminus\cup_{k<n}D_{k}$ . We have that $\operatorname{{Law}}(Z)=\operatorname{{Law}}(Y)=\nu$ and $\left\lvert Z-Y\right\rvert\leq\epsilon$ . Moreover, in view of Theorem B.6, $X=f(Z)$ . Hence, (26) is all we need to obtain.

Using the conditional probabilities with respect to events $\left\{{Y\in D_{n}}\right\}$ , we can reduce the general case to the situation where

[TABLE]

for some strictly decreasing function $h=h(t)$ . Since $\nu$ is atomless, every component $Y_{i}$ has a continuous distribution function $a_{i}(t)=\mathbb{P}\left(Y_{i}\leq t\right)$ , $i=1,2$ . It follows that $U=a_{1}(Y_{1})$ has the uniform distribution on $[0,1]$ and $Y_{1}=a_{1}^{-1}(U)$ , where $a_{1}^{-1}$ is the pseudo-inverse function to $a_{1}$ :

[TABLE]

In particular, $Y=(Y_{1},h(Y_{1}))$ is $U$ -measurable.

Lemma B.16 yields Borel functions $g_{i}:\;D\rightarrow G$ , $i=1,2$ , such that either $X=g_{1}(Y)$ or $X=g_{2}(Y)$ . As the functions

[TABLE]

are continuous and increasing, the random variable

[TABLE]

has the uniform distribution on $[0,1]$ . Clearly, $U$ and the indicators $(1_{\left\{{X=g_{i}(Y)}\right\}})$ are $V$ -measurable. It follows that $Y$ and $X$ are also $V$ -measurable. Setting

[TABLE]

we obtain that $Z=(Z_{1},Z_{2})$ has the same law as $Y$ , that $V=a_{1}(Z_{1})$ , and that $X$ is $Z$ -measurable. ∎
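The quantile-transform step used in this proof can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the pseudo-inverse to be $a^{-1}(u)=\inf\{t:\;a(t)\geq u\}$ (the convention in the paper's display may differ) and uses a hypothetical continuous distribution function `cdf` that is flat on $[1,2]$. The names `cdf` and `pseudo_inverse` are illustrative.

```python
def cdf(t):
    """Hypothetical continuous distribution function, flat on [1, 2]."""
    if t <= 0.0:
        return 0.0
    if t <= 1.0:
        return 0.5 * t
    if t <= 2.0:
        return 0.5
    if t <= 3.0:
        return 0.5 + 0.5 * (t - 2.0)
    return 1.0


def pseudo_inverse(u, lo=0.0, hi=3.0, iterations=60):
    """Approximate inf{t : cdf(t) >= u} for 0 < u <= 1 by bisection.

    Invariant of the loop: cdf(lo) < u <= cdf(hi).
    """
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if cdf(mid) >= u:
            hi = mid
        else:
            lo = mid
    return hi


# cdf(pseudo_inverse(u)) recovers u because cdf is continuous.
round_trip_errors = [abs(cdf(pseudo_inverse(k / 10.0)) - k / 10.0)
                     for k in range(1, 11)]

# pseudo_inverse(cdf(t)) recovers t except on the flat stretch (1, 2],
# which carries no probability mass under this distribution function.
recovered_inside_flat = pseudo_inverse(cdf(1.5))
recovered_outside_flat = pseudo_inverse(cdf(2.5))
```

The round trip fails only on the flat stretch, a set of probability zero, which is consistent with the almost sure identity $Y_{1}=a_{1}^{-1}(U)$ .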

Proof of Theorem 4.1.

Let $\gamma$ be an optimal plan for (5). By extending, if necessary, the underlying probability space we can assume that $\gamma=\operatorname{{Law}}(X,Y)$ for some random variable $X$ . As $\gamma(y|x)=x$ , we have that $X=\mathbb{E}\left(\left.Y\right\lvert{X}\right)$ . Theorem 2.2 yields $G\in\mathfrak{M}$ such that $X\in G$ and $c(X,Y)=\phi_{G}(Y)$ .

We denote $\widetilde{X}=X1_{\left\{{Y\not\in G}\right\}}+Y1_{\left\{{Y\in G}\right\}}$ and observe that $\widetilde{\gamma}=\operatorname{{Law}}(\widetilde{X},Y)$ is another optimal plan. Indeed, by Lemma 4.2,

[TABLE]

and therefore, for a bounded Borel function $g=g(x)$ on $\mathbb{R}^{2}$ ,

[TABLE]

It follows that $\mathbb{E}\left(\left.Y\right\lvert{\widetilde{X}}\right)=\widetilde{X}$ and thus, $\widetilde{\gamma}\in\Gamma(\nu)$ . By the construction of $\widetilde{X}$ , we have that $c(X,Y)=c(\widetilde{X},Y)$ and the optimality of $\widetilde{\gamma}$ follows. This fact allows us to assume from the start that $X1_{\left\{{Y\in G}\right\}}=Y1_{\left\{{Y\in G}\right\}}$ . Then, $X$ and $Y$ satisfy the assumptions of Lemma 4.3.

Let $\epsilon>0$ and $Z=Z(\epsilon)$ be the random variable yielded by Lemma 4.3. As $X$ is $Z$ -measurable, the conditional expectation $V\triangleq\mathbb{E}\left(\left.Z\right\lvert{X}\right)$ is also $Z$ -measurable. Thus, there is a Borel function $f:\;\mathbb{R}^{2}\rightarrow\mathbb{R}^{2}$ such that $V=f(Z)$ . Since $Y$ and $Z$ have identical laws, $U\triangleq f(Y)=\mathbb{E}\left(\left.Y\right\lvert{U}\right)$ . As

[TABLE]

we deduce that

[TABLE]

The result follows, because $\epsilon$ is any positive number. ∎

Let $\mathcal{D}$ be the family of graphs of strictly decreasing functions $f=f(t)$ defined on closed intervals of $\mathbb{R}$ such that both $f$ and its inverse $f^{-1}$ are Lipschitz functions:

$$\frac{1}{K}(t-s)\leq f(s)-f(t)\leq K(t-s),\quad s<t,$$

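for some constant $K=K(f)>0$ . To shorten the statements, we allow for the degenerate case where the domain of $f$ is a single point. Thus, every singleton $\left\{{x}\right\}$ , $x\in\mathbb{R}^{2}$ , belongs to $\mathcal{D}$ ; with a slight abuse of notation, $\mathbb{R}^{2}\subset\mathcal{D}$ .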

Definition 4.4.

A Borel probability measure $\mu$ on $\mathbb{R}^{2}$ is $\mathcal{D}$ -regular if $\mu(D)=0$ for every $D\in\mathcal{D}$ .

The following theorem establishes the existence of optimal maps that induce optimal plans.

Theorem 4.5.

Let $Y=(Y_{1},Y_{2})\in\mathcal{L}^{2}$ and suppose that $\nu=\operatorname{{Law}}(Y)$ is $\mathcal{D}$ -regular. Let $G\in\mathfrak{M}$ be a maximizer for (18) and denote $\phi=\phi_{G}$ and $E=E^{G}$ . Then

[TABLE]

is an optimal map for (21), $\gamma=\operatorname{{Law}}(X,Y)$ is an optimal plan for (5), and the law of $X$ is $\mathcal{D}$ -regular. Moreover, if $\widetilde{X}$ is an optimal map and $\widetilde{\gamma}$ is an optimal plan, then

[TABLE]

Proof.

Let $\widetilde{\gamma}\in\Gamma(\nu)$ be an optimal plan. From Theorem 3.2 we deduce that if $(x,y)\in\operatorname{supp}{\widetilde{\gamma}}$ , then $y\in\operatorname{dom}{\operatorname{Arg}}=\operatorname{dom}{\operatorname{Arg}_{G}}$ . In particular, $\nu(\operatorname{dom}{\operatorname{Arg}})=1$ . By Theorem B.12, the exception set $\operatorname{dom}{\operatorname{Arg}}\setminus\operatorname{dom}{\nabla\phi}$ belongs to the union of $E=E^{G}$ and of a countable family of sets from $\mathcal{D}$ . Since $\nu=\operatorname{{Law}}(Y)$ is $\mathcal{D}$ -regular, we have that

[TABLE]

It follows that the random variable $X=(X_{1},X_{2})$ is well-defined.

Theorem B.6 shows that if $y\in\operatorname{dom}{\nabla\phi}\setminus E$ , then $D^{c}\phi(y)$ is the only element of $\operatorname{Arg}(y)$ . It follows that $X\in G$ and $\phi(Y)=c(X,Y)$ . In view of Theorem 3.2, $\gamma\triangleq\operatorname{{Law}}(X,Y)$ is an optimal plan if it has the martingale property: $\gamma(y|x)=x$ .

If $(x,y)\in\operatorname{supp}{\widetilde{\gamma}}$ and $y\in\operatorname{dom}{\nabla\phi}\setminus E$ , then Theorems 3.2 and B.6 yield that $x=D^{c}\phi(y)$ . Since $\gamma$ and $\widetilde{\gamma}$ have common $y$ -marginal $\nu$ satisfying (29), they coincide outside of $\mathbb{R}^{2}\times E$ , that is, (28) holds.

Let $f=f(x)$ be a bounded Borel function on $\mathbb{R}^{2}$ . As $(Y-X)1_{\left\{{Y\in G}\right\}}=0$ , we deduce that

[TABLE]

where the last two equalities follow from the martingale property of $\widetilde{\gamma}$ and Lemma 4.2, respectively. Thus, $\gamma(y|x)=x$ . We have proved that $\gamma$ is an optimal plan and, in particular, that $X$ is an optimal map. The uniqueness property (27) for optimal maps follows directly from the corresponding property (28) for optimal plans.

It only remains to be shown that $\mu\triangleq\operatorname{{Law}}(X)$ is $\mathcal{D}$ -regular. As $\operatorname{supp}{\mu}\subset G$ and the intersection of $G$ with any set from $\mathcal{D}$ is at most a point, $\mu$ is $\mathcal{D}$ -regular if and only if it is atomless. Assume, to the contrary, that $\mu(\left\{{r}\right\})>0$ for some $r\in G$ and define the Borel probability measure

[TABLE]

Being $\mathcal{D}$ -regular, the measure $\nu$ is atomless. Hence,

[TABLE]

From the optimality of $\gamma$ we deduce that

[TABLE]

and then that $\eta(D(r)\setminus G)=1>0$ . The martingale property of $\gamma$ yields that $\int yd\eta=r$ . The last two properties of $\eta$ and the fact that $\phi<0$ outside of $G$ imply the existence of $y^{0},y^{1}\in D(r)\setminus G$ such that

[TABLE]

By Lemma B.5, $D(r)$ belongs to the graph of a strictly decreasing linear function and thus, belongs to $\mathcal{D}$ . As $\nu$ is $\mathcal{D}$ -regular, we arrive at a contradiction: $\mu(\left\{{r}\right\})=\gamma(\left\{{r}\right\}\times D(r))\leq\nu(D(r))=0$ . ∎

We now state the main uniqueness result of the paper, which can be viewed as an adaptation of the classical Brenier theorem to our setting. We point out that our regularity assumption on $\nu$ is weaker than the standard condition of the Brenier theorem, which requires $\nu$ to assign zero mass to rotations of the graphs of Lipschitz functions.

Theorem 4.6.

Let $Y=(Y_{1},Y_{2})\in\mathcal{L}^{2}$ and suppose that $\nu=\operatorname{{Law}}(Y)$ is $\mathcal{D}$ -regular and the (one-dimensional) laws of $Y_{1}$ and $Y_{2}$ are atomless. Let $G\in\mathfrak{M}$ be a maximizer for (18) and denote $\phi=\phi_{G}$ . Then $X=D^{c}\phi(Y)$ or, in more detail,

[TABLE]

is the unique optimal map for (21) and the law of $(X,Y)$ is the unique optimal plan for (5). Moreover, the law of $X$ is $\mathcal{D}$ -regular and the laws of $X_{1}$ and $X_{2}$ are atomless.

Proof.

We omit $G$ from the notations (23) related to its vertical and horizontal line segments. As the law of $Y_{i}$ is atomless and the set $\mathcal{T}_{i}$ is countable, we deduce that

[TABLE]

Except for the continuity of the distribution functions of $X_{1}$ and $X_{2}$ , all assertions follow directly from Theorem 4.5.

We shall prove that the law of $X_{2}$ is atomless. If $t\not\in\mathcal{T}_{2}$ , then the set ${E}_{2}(t)$ is a singleton: $E_{2}(t)=\left\{{z}\right\}$ . By Theorem 4.5, the law of $X$ is $\mathcal{D}$ -regular and, in particular, atomless. It follows that $\mathbb{P}\left(X_{2}=t\right)=\mathbb{P}\left(X=z\right)=0$ .

Let $t\in\mathcal{T}_{2}$ . Lemma B.4 shows that if $x\in\operatorname*{ri}{E_{2}(t)}$ and $\phi(y)=c(x,y)$ , then $c(x,y)=0$ and subsequently, $y\in E_{2}(t)$ . As $X\in G$ , $\phi(Y)=c(X,Y)$ , and the law of $X$ is atomless, we obtain that

[TABLE]

where the last step holds by the continuity of the law of $Y_{2}$ . ∎

5 Examples

Example 5.1 (Linear optimal map).

Let $Y=(Y_{1},Y_{2})$ be a random variable in $\mathcal{L}^{2}$ such that $\mathbb{E}\left(Y_{i}\right)=0$ , $\mathbb{E}\left(Y_{i}^{2}\right)=\sigma_{i}^{2}>0$ , and

[TABLE]

The latter property holds if the distribution of $Y$ is Gaussian or, more generally, elliptically contoured.

We denote $\lambda=\frac{\sigma_{2}}{\sigma_{1}}>0$ and define $G=\left\{{x\in\mathbb{R}^{2}}:\;x_{2}=\lambda x_{1}\right\}$ and

[TABLE]

Elementary computations show that $\mathbb{E}\left(\left.Y\right\lvert{X}\right)=X=(X_{1},X_{2})$ and

[TABLE]

Being the graph of an increasing linear function, $G\in\mathfrak{M}$ . Setting $\nu=\operatorname{{Law}}(Y)$ , we deduce from Theorem 3.2 that $G$ and $\gamma=\operatorname{{Law}}(X,Y)$ are respective solutions to (18) and (5). Moreover, as $X$ is the only element of $G$ such that $\phi_{G}(Y)=c(X,Y)$ , the characteristic property (20) yields that $\gamma$ is the unique optimal plan. In particular, $X$ is the unique optimal map for (21).
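The elementary computations behind this example can be checked numerically. The sketch below is an illustration, not the paper's code. Minimizing $t\mapsto c((t,\lambda t),y)$ over the line $G$ gives, by a first-order condition, the candidate map $X_{1}=\frac{1}{2}(Y_{1}+Y_{2}/\lambda)$ , $X_{2}=\lambda X_{1}$ and the value $\phi_{G}(y)=-(y_{2}-\lambda y_{1})^{2}/(4\lambda)$ . These closed forms are derived for this sketch and are assumed to agree with the formulas of the example; the function names and grid parameters are hypothetical.

```python
def c(x, y):
    """Covariance-type cost c(x, y) = (x1 - y1)(x2 - y2)."""
    return (x[0] - y[0]) * (x[1] - y[1])


def candidate_map(y, lam):
    """Stationary point of t -> c((t, lam * t), y) on the line x2 = lam * x1."""
    x1 = 0.5 * (y[0] + y[1] / lam)
    return (x1, lam * x1)


def phi_closed_form(y, lam):
    """Closed form -(y2 - lam * y1)^2 / (4 * lam), derived for this sketch."""
    return -((y[1] - lam * y[0]) ** 2) / (4.0 * lam)


def phi_grid(y, lam, half_width=10.0, steps=20001):
    """Brute-force minimum of c(x, y) over grid points x of the line."""
    best = float("inf")
    for k in range(steps):
        t = -half_width + 2.0 * half_width * k / (steps - 1)
        best = min(best, c((t, lam * t), y))
    return best


samples = [(1.0, 2.0), (-0.5, 0.3), (2.0, -1.0)]
slopes = [0.5, 2.0]

# Largest disagreement between the closed form and the grid search.
max_gap_grid = max(abs(phi_closed_form(y, lam) - phi_grid(y, lam))
                   for y in samples for lam in slopes)
# Largest disagreement between the closed form and the cost at the candidate map.
max_gap_map = max(abs(phi_closed_form(y, lam) - c(candidate_map(y, lam), y))
                  for y in samples for lam in slopes)
```

Both gaps should be negligible. The grid search is only an approximation of the infimum over $G$ , so agreement up to the grid resolution is all that can be expected.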

Example 5.2 (Optimal map may not yield optimal plan).

Let $Y$ be a random variable taking the values $y^{0}=(-1,1)$ , $y^{1}=(0,-1)$ , and $y^{2}=(1,0)$ , each with probability $\frac{1}{3}$ . Direct computations show that the points

[TABLE]

belong to the set $G=\left\{{x\in\mathbb{R}^{2}}:\;c(x,y^{0})=-\frac{8}{9},\;x_{1}>y_{1}^{0}\right\}$ , that

[TABLE]

and that the probability measure

[TABLE]

belongs to $\Gamma(\nu)$ , where $\nu=\operatorname{{Law}}(Y)$ . Being the graph of an increasing hyperbola, $G\in\mathfrak{M}$ . By Theorem 2.2, $\gamma$ is an optimal plan for (5). The value of this problem is

[TABLE]

On the other hand, let $X\in\mathcal{X}(Y)$ , that is, $X$ is $Y$ -measurable and $X=\mathbb{E}\left(\left.Y\right\lvert{X}\right)$ . We write $x^{i}=X(y^{i})$ , $i=0,1,2$ . If all $(x^{i})$ are distinct, then $X=Y$ and $c(X,Y)=0$ . If they are the same point, then $X=\mathbb{E}\left(Y\right)=0$ and

[TABLE]

Finally, if precisely two of the elements of $(x^{i})$ coincide: $x^{k}=x^{l}\not=x^{m}$ , where $(k,l,m)$ is a permutation of $(0,1,2)$ , then $x^{k}=x^{l}=\frac{1}{2}(y^{k}+y^{l})$ , $x^{m}=y^{m}$ , and

[TABLE]

As $c(y^{0},y^{1})=c(y^{0},y^{2})=-2$ and $c(y^{1},y^{2})=1$ , the value of the optimal map problem (21) is $-\frac{1}{3}$ , which is strictly greater than $-\frac{4}{9}$ , the value of the optimal plan problem (5).
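The two values in this example can be verified in exact arithmetic. The sketch below enumerates every $Y$ -measurable martingale map (one per partition of the three atoms, each block being sent to its barycenter) and compares the best of them with a candidate plan. The plan is reconstructed here from the constraints stated in the example (support points on $G$ , martingale property, $y$ -marginal $\nu$ ) and is an assumption of the sketch: it splits the atom $y^{0}$ equally between the points $(-\frac{1}{3},-\frac{1}{3})$ and $(\frac{1}{3},\frac{1}{3})$ . All names are illustrative.

```python
from fractions import Fraction as F
from itertools import combinations


def c(x, y):
    """Covariance-type cost c(x, y) = (x1 - y1)(x2 - y2)."""
    return (x[0] - y[0]) * (x[1] - y[1])


# Atoms of Y from Example 5.2, each carrying mass 1/3.
atoms = [(F(-1), F(1)), (F(0), F(-1)), (F(1), F(0))]
mass = F(1, 3)


def map_value(blocks):
    """E c(X, Y) when X sends every atom to the barycenter of its block."""
    total = F(0)
    for block in blocks:
        bary = tuple(sum(atoms[i][k] for i in block) / len(block)
                     for k in range(2))
        total += sum(mass * c(bary, atoms[i]) for i in block)
    return total


# Partitions of {0, 1, 2}: all separate, all merged, or exactly one pair merged.
partitions = [[[0], [1], [2]], [[0, 1, 2]]]
for k, l in combinations(range(3), 2):
    partitions.append([[k, l], [3 - k - l]])
best_map_value = min(map_value(p) for p in partitions)

# Candidate plan as triples (x, y, weight); an assumption of this sketch.
x_low, x_high = (F(-1, 3), F(-1, 3)), (F(1, 3), F(1, 3))
plan = [(x_low, atoms[0], F(1, 6)), (x_high, atoms[0], F(1, 6)),
        (x_low, atoms[1], F(1, 3)), (x_high, atoms[2], F(1, 3))]
plan_value = sum(w * c(x, y) for x, y, w in plan)


def is_martingale(triples):
    """Check that the conditional mean of y given x equals x."""
    sums = {}
    for x, y, w in triples:
        tot, s1, s2 = sums.get(x, (F(0), F(0), F(0)))
        sums[x] = (tot + w, s1 + w * y[0], s2 + w * y[1])
    return all((s1 / tot, s2 / tot) == x for x, (tot, s1, s2) in sums.items())


plan_is_martingale = is_martingale(plan)
on_hyperbola = all(c(x, atoms[0]) == F(-8, 9) for x in (x_low, x_high))
```

The plan is admissible and costs strictly less than every admissible map, so the relaxation gap in (22) is strict for this $\nu$ .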

Example 5.3 (Optimal map may not exist).

Let $U$ and $V$ be independent symmetric random variables in $\mathcal{L}^{2}$ with $U$ having a continuous distribution function and $V$ taking values in $\left\{{-1,1}\right\}$ . We define a 2-dimensional random variable

[TABLE]

The components $Y_{1}$ and $Y_{2}$ have continuous distribution functions and, in particular, $\nu=\operatorname{{Law}}(Y)$ is atomless. By Theorem 4.1, the plan and map problems (5) and (21) have identical values. We shall prove that there is a unique optimal plan, which is not induced by a ( $Y$ -measurable martingale) map, and hence, shall show that an optimal map does not exist.

To this end, we define a 2-dimensional random variable

[TABLE]

We observe that $X$ takes values in the set

[TABLE]

consisting of two upward-sloping lines and thus belonging to $\mathfrak{M}$ . Direct computations show that $\mathbb{E}\left(\left.Y\right\lvert{X}\right)=X$ and

[TABLE]

By Theorem 2.2, the law of $(X,Y)$ is an optimal plan and $G$ is a dual maximizer. We shall proceed to show that this is the only optimal plan and that it is not induced by a map from $\mathcal{X}(Y)$ .

From the construction of $Y$ we deduce the equality of the sets:

[TABLE]

It follows that

[TABLE]

Hence, $X$ is not $Y$ -measurable.

Let $\gamma\in\Gamma(\nu)$ be an optimal plan and $\mu$ be its $x$ -marginal. By Theorem 3.2, $\operatorname{supp}{\mu}\subset G$ and

[TABLE]

The random variable $Y$ takes values in $F=F_{1}\cup F_{2}$ , where

[TABLE]

Elementary computations show that for $x\in G$ the set of $y\in F$ such that $c(x,y)=\phi_{G}(y)$ consists of two points $g(x)$ and $f(x)$ such that

[TABLE]

For $x\not=0$ , $x\in G$ , the three points $\left\{{g(x),x,f(x)}\right\}$ are distinct and

[TABLE]

On the other hand, by the martingale property of $\gamma$ and the fact that $\nu(\left\{{0}\right\})=0$ , we have that

[TABLE]

and therefore, the conditional probabilities

[TABLE]

For a bounded Borel function $h=h(x,y)$ on $\mathbb{R}^{2}\times\mathbb{R}^{2}$ we then obtain that

[TABLE]

Hence, $\gamma$ is unique if and only if $\mu$ is unique. We observe now that the map $f:\;G\rightarrow F_{2}$ is one-to-one. Thus, for a Borel set $B\subset\mathbb{R}^{2}$ ,

[TABLE]

and the uniqueness of $\mu$ follows.

6 Equilibrium with insider

We consider a single-period financial market with a bank account paying zero interest and a stock. The stock value at maturity $t=1$ is represented by a random variable $V$ . The stock price $S$ at initial time $t=0$ is the result of the interaction between noise traders, an insider, and market makers, where

1. The noise traders place an order for $U$ stocks; $U$ is a random variable.

2. The insider knows the value of both $U$ and $V$ and places an order for $Q$ stocks. The trading strategy $Q$ is a $(U,V)$ -measurable random variable.

3. The market makers observe only the total order $R=Q+U$ . They quote the price $S=f(R)$ according to a pricing rule $f=f(r)$ , which is a Borel function $f:\;\mathbb{R}\rightarrow\overline{\mathbb{R}}\triangleq\mathbb{R}\cup\left\{{-\infty}\right\}\cup\left\{{\infty}\right\}$ .

Definition 6.1.

An equilibrium $(Q,f)$ is defined by a trading strategy $Q$ and a pricing rule $f=f(r)$ such that

1. Given the total order $R=Q+U$ , the price $S=f(R)$ is efficient in the sense that

[TABLE]

2. Given the pricing rule $f=f(r)$ , the order $Q$ maximizes insider’s profit:

[TABLE]

with the convention $0\times\infty=0$ .

Remark 6.2.

Up to minor technical differences, our notion of equilibrium coincides with the one in Rochet and Vila (1994). It differs from the classical equilibrium from Kyle (1985) in the ability of the insider to observe noise traders’ order flow $U$ . In the model of Kyle (1985), the insider maximizes $\mathbb{E}\left(\left.Q(V-f(Q+U))\right\lvert{V}\right)$ over all $V$ -measurable random variables $Q$ .
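To make the insider's problem concrete, the sketch below works it out for a linear pricing rule $f(r)=\lambda r$ with $\lambda>0$ . The linear rule is an assumption chosen for illustration, not a result taken from the text. For fixed $(u,v)$ the profit $q\,(v-f(q+u))$ is then a concave quadratic in $q$ , maximized at $q^{*}=(v-\lambda u)/(2\lambda)$ with value $(v-\lambda u)^{2}/(4\lambda)$ . With $r=q+u$ the maximal profit equals $-\inf_{r}(u-r)(v-f(r))$ , the quantity that appears in the lemmas of this section. Function names and the grid are hypothetical.

```python
def profit(q, u, v, pricing_rule):
    """Insider's profit q * (v - f(q + u)) for an order of q stocks."""
    return q * (v - pricing_rule(q + u))


def best_order_linear(u, v, slope):
    """Maximizer of the profit when f(r) = slope * r with slope > 0."""
    return (v - slope * u) / (2.0 * slope)


def max_profit_linear(u, v, slope):
    """Maximal profit (v - slope * u)^2 / (4 * slope) under the linear rule."""
    return (v - slope * u) ** 2 / (4.0 * slope)


# Hypothetical numbers: noise order u, stock value v, slope of the pricing rule.
u, v, slope = 0.4, 1.5, 0.8


def linear_rule(r):
    """Assumed pricing rule f(r) = slope * r."""
    return slope * r


q_star = best_order_linear(u, v, slope)
profit_at_q_star = profit(q_star, u, v, linear_rule)

# Brute-force maximum over a grid of orders q in [-10, 10] with step 0.001.
grid_best = max(profit(k / 1000.0, u, v, linear_rule)
                for k in range(-10000, 10001))
```

The grid maximum can only approach the closed-form value from below, which gives a simple consistency check on the first-order condition.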

The following result links the existence of equilibrium with the existence of an optimal map for (21) that induces an optimal plan for (5).

Theorem 6.3.

Let $Y=(U,V)\in\mathcal{L}^{2}$ and denote $\nu=\operatorname{{Law}}(Y)$ . An equilibrium $(Q,f)$ exists if and only if there is an optimal map $X$ for (21) such that the law of $(X,Y)$ is an optimal plan for (5). Insider’s profit is unique and given by

[TABLE]

where $G\in\mathfrak{M}$ is a maximizer for (18).

Moreover, there exist an equilibrium $(Q,f)$ and an optimal map $X=(R,S)$ such that the pricing rule $f:\;\mathbb{R}\rightarrow\overline{\mathbb{R}}$ is an increasing function, the total order $Q+U=R$ , and the price $f(Q+U)=S$ .

We divide the proof of the theorem into lemmas.

Lemma 6.4.

Let $Y=(U,V)\in\mathcal{L}^{2}$ , $\nu=\operatorname{{Law}}(Y)$ , and $X=(R,S)$ be an optimal map for (21) such that $S$ is $R$ -measurable and the law of $(X,Y)$ is an optimal plan for (5). Then there is an increasing function $f:\;\mathbb{R}\rightarrow\overline{\mathbb{R}}$ such that $S=f(R)$ and $(Q,f)$ is an equilibrium with $Q=R-U$ .

Proof.

By construction, $S=\mathbb{E}\left(\left.V\right\lvert{R}\right)$ . Hence, we only need to verify the profit maximization condition for the order $Q=R-U$ . Theorem 2.2 yields $G\in\mathfrak{M}$ such that $(R,S)\in G$ and

[TABLE]

Let $P_{1}$ be the projection of $G$ on the first coordinate (the $r$ -coordinate). Clearly, $P_{1}$ is an interval. As $S$ is $R$ -measurable, there is an increasing function $f=f(r)$ on $P_{1}$ such that $S=f(R)$ and $(r,f(r))\in G$ for $r\in P_{1}$ . By construction,

[TABLE]

We now extend $f$ to an increasing function from $\mathbb{R}$ to $\overline{\mathbb{R}}$ by setting its values to $-\infty$ on the left and to $+\infty$ on the right of $P_{1}$ . As $\phi_{G}(U,V)>-\infty$ , Lemma B.2 yields that $U$ takes values in the closure of $P_{1}$ . Under the standing convention: $0\times\infty=0$ , we obtain that

[TABLE]

Hence, $(Q,f)$ is an equilibrium with $Q=R-U$ . ∎

Lemma 6.5.

Let $f:\;\mathbb{R}\rightarrow\overline{\mathbb{R}}$ be a Borel function and

$$\phi(u,v)\triangleq\inf_{r\in\mathbb{R}}(u-r)(v-f(r)),\quad(u,v)\in\mathbb{R}^{2},$$

with the convention: $0\times\infty=0$ . Then there is $G\in\mathfrak{M}$ such that $\phi\leq\phi_{G}$ .

Proof.

Given $y^{0},y^{1}\in\mathbb{R}^{2}$ and $t\in[0,1]$ , we denote $r=y^{0}_{1}+t(y^{1}_{1}-y^{0}_{1})$ and deduce that

[TABLE]

where in the middle we used the negative parts to account for the possibility that $\left\lvert f(r)\right\rvert=\infty$ . The result now follows from Lemma 3.1. ∎

Lemma 6.6.

Let $Y=(U,V)\in\mathcal{L}^{2}$ , $\nu=\operatorname{{Law}}(Y)$ , and $(Q,f)$ be an equilibrium with the total order $R=Q+U$ and the price $S=f(R)$ . Then $X=(\widetilde{R},S)$ with $\widetilde{R}=\mathbb{E}\left(\left.U\right\lvert{R}\right)$ is an optimal map for (21), the law of $(X,Y)$ is an optimal plan for (5), and

[TABLE]

Proof.

From the definition of the equilibrium we obtain that

[TABLE]

where $\phi(u,v)=\inf_{r\in\mathbb{R}}(u-r)(v-f(r))$ , $(u,v)\in\mathbb{R}^{2}$ . We claim that

[TABLE]

As the integrability properties of $R$ are unknown, we use a localization argument. For $n\geq 1$ from the martingale properties $\mathbb{E}\left(\left.V\right\lvert{R}\right)=S$ and $\mathbb{E}\left(\left.U\right\lvert{R}\right)=\widetilde{R}$ we deduce that

[TABLE]

Taking the limit as $n\to\infty$ , we obtain (31) by the dominated convergence theorem.

Lemma 6.5 yields $G\in\mathfrak{M}$ such that $\phi\leq\phi_{G}$ . From Lemma 2.13 we deduce that

[TABLE]

In view of (31) and since $\widetilde{\gamma}=\operatorname{{Law}}(\widetilde{R},S,U,V)$ belongs to $\Gamma(\nu)$ , we obtain that

[TABLE]

It follows that $\widetilde{\gamma}$ is an optimal plan, $(\widetilde{R},S)$ is an optimal map, and $\phi(U,V)=\phi_{G}(U,V)$ . Finally, Theorem 3.2 yields that

[TABLE]

and we obtain (30). ∎

Proof of Theorem 6.3.

If $X=(R,S)$ is an optimal map, then $\widetilde{X}=(R,\widetilde{S})$ with $\widetilde{S}=\mathbb{E}\left(\left.V\right\lvert{R}\right)=\mathbb{E}\left(\left.S\right\lvert{R}\right)$ is an optimal map as well and $\widetilde{S}$ is $R$ -measurable. By Theorem 3.2,

[TABLE]

for every maximizer $G\in\mathfrak{M}$ to (18). In particular, $c(X,Y)$ is the same random variable for every optimal map $X$ . After these observations, the proof follows from Lemmas 6.4 and 6.6. ∎

We now state sufficient conditions for the existence and uniqueness of equilibrium. Theorem 6.7 generalizes a result from Rochet and Vila (1994), where the distribution of $(U,V)$ has a compact support and a continuous density.

Theorem 6.7.

Let $Y=(U,V)\in\mathcal{L}^{2}$ and suppose that the law of ${Y}$ is $\mathcal{D}$ -regular. Then an equilibrium $(Q,f)$ exists.

If, in addition, the laws of $U$ and $V$ are atomless, then insider’s order $Q$ , the total order $R=Q+U$ , and the price $S=f(R)$ are unique. Moreover, $X=(R,S)$ is the unique optimal map for (21) and $\gamma=\operatorname{{Law}}(X,Y)$ is the unique optimal plan for (5).

For the proof we need a lemma.

Lemma 6.8.

Let $f:\;\mathbb{R}\rightarrow\overline{\mathbb{R}}$ be a Borel function and

$$\phi(u,v)\triangleq\inf_{r\in\mathbb{R}}(u-r)(v-f(r)),\quad(u,v)\in\mathbb{R}^{2},$$

with the convention: $0\times\infty=0$ . There is a countable set $A\subset\mathbb{R}$ such that if $u,v\notin A$ and $\phi(u,v)=0$ , then

[TABLE]

Proof.

If $\phi(u,v)=\inf_{r\in\mathbb{R}}(u-r)(v-f(r))=0$ , then

[TABLE]

Clearly,

[TABLE]

Thus, if the increasing functions $g$ and $h$ are continuous and strictly increasing at $u$ , then $f^{-1}(v)=\left\{{u}\right\}$ . To conclude the proof we observe that, for an increasing function, the set of arguments where it is discontinuous and the set of values where it is not strictly increasing are both countable. ∎

Proof of Theorem 6.7.

If $\nu=\operatorname{{Law}}(Y)$ is $\mathcal{D}$ -regular, then Theorem 4.5 yields an optimal map $X$ such that the law of $(X,Y)$ is an optimal plan. By Theorem 6.3, there is an equilibrium $(Q,f)$ .

If the laws of $U=Y_{1}$ and $V=Y_{2}$ are atomless, then, by Theorem 4.6, the optimal map $X=(X_{1},X_{2})$ is unique. Lemma 6.6 shows that $S=X_{2}$ is the unique equilibrium price: $S=f(R)$ , where $R=Q+U$ .

Let $\phi$ be the function defined in Lemma 6.8 and $G\in\mathfrak{M}$ be a maximizer for (18). From the definition of the equilibrium and Theorem 6.3 we deduce that

[TABLE]

If $\phi(U,V)<0$ , then the total order $R$ is clearly unique. If $\phi(U,V)=0$ , then $R=U$ by Lemma 6.8 and the continuity of the distributions of $U$ and $V$ . Thus, the total order $R$ and insider’s order $Q=R-U$ are unique. By Theorem 6.3, the uniqueness of $R$ and $S$ implies that $(R,S)$ is an optimal map. Hence $X_{1}=R$ . ∎

Appendix A Closure of probability measures with bounded densities in $\mathcal{W}_{p}(\mathbb{R}^{d})$

Let $p\geq 1$ . We denote by $\mathcal{W}_{p}(\mathbb{R}^{d})$ the space of Borel probability measures on $\mathbb{R}^{d}$ with finite $p$ -th moments equipped with the Wasserstein metric:

[TABLE]

where $\Pi(\mu,\nu)$ is the family of Borel probability measures $\gamma$ on $\mathbb{R}^{d}\times\mathbb{R}^{d}=\left\{{(x,y)}:\;x,y\in\mathbb{R}^{d}\right\}$ with $x$ -marginal $\mu$ and $y$ -marginal $\nu$ . We recall that $\mathcal{W}_{p}(\mathbb{R}^{d})$ is a complete separable metric space and that $\mu_{n}\rightarrow\mu$ in $\mathcal{W}_{p}(\mathbb{R}^{d})$ if and only if $\int f(x)d\mu_{n}\rightarrow\int f(x)d\mu$ for every continuous function $f$ with polynomial $p$ -th growth:

[TABLE]
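For intuition about the metric, in dimension $d=1$ the Wasserstein distance between two empirical measures with the same number of equally weighted atoms can be computed by sorting, because the monotone coupling is optimal for the cost $|x-y|^{p}$ with $p\geq 1$ . The sketch below assumes the standard definition $W_{p}(\mu,\nu)=\left(\inf_{\gamma\in\Pi(\mu,\nu)}\int|x-y|^{p}d\gamma\right)^{1/p}$ ; the function name `wasserstein_1d` is hypothetical.

```python
def wasserstein_1d(xs, ys, p=2.0):
    """W_p distance between the empirical measures of two equal-size samples.

    Valid for one-dimensional samples with equal weights and p >= 1:
    pairing the sorted samples is an optimal coupling.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("samples must be non-empty and of equal size")
    if p < 1.0:
        raise ValueError("p must be at least 1")
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(a - b) ** p for a, b in pairs) / len(xs)
    return cost ** (1.0 / p)


# A translation by 1 moves every atom by 1, so the distance is 1 for any p.
shifted = wasserstein_1d([0.0, 1.0, 2.0], [3.0, 1.0, 2.0], p=3.0)
# Two atoms at 0 transported to atoms at 1 and 3: sqrt((1 + 9) / 2).
spread = wasserstein_1d([0.0, 0.0], [3.0, 1.0], p=2.0)
```

The order in which the atoms are listed does not matter, since both samples are sorted before pairing.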

Let $\nu\in\mathcal{W}_{p}(\mathbb{R}^{d})$ and $\mathcal{Q}_{\infty}(\nu)$ be the family of Borel probability measures on $\mathbb{R}^{d}$ that have bounded densities with respect to $\nu$ :

[TABLE]

Clearly, $\mathcal{Q}_{\infty}(\nu)\subset\mathcal{W}_{p}(\mathbb{R}^{d})$ . The following result, used in the proof of our main Theorem 2.2, describes the closure of $\mathcal{Q}_{\infty}(\nu)$ under $W_{p}$ .

Theorem A.1.

Let $p\geq 1$ and $\nu\in\mathcal{W}_{p}(\mathbb{R}^{d})$ . Then the closure of $\mathcal{Q}_{\infty}(\nu)$ in $\mathcal{W}_{p}(\mathbb{R}^{d})$ has the form:

[TABLE]

Proof.

If $\mu_{n}\to\mu$ in $\mathcal{W}_{p}(\mathbb{R}^{d})$ , then $\mu_{n}\to\mu$ weakly and thus,

[TABLE]

for every closed set $C$ . In particular, if $(\mu_{n})\subset\mathcal{S}_{p}(\nu)$ , then

[TABLE]

and hence, $\mu\in\mathcal{S}_{p}(\nu)$ . It follows that $\mathcal{S}_{p}(\nu)$ is closed in $\mathcal{W}_{p}(\mathbb{R}^{d})$ . Clearly, $\mathcal{S}_{p}(\nu)$ is convex.

If $\operatorname{supp}{\nu}$ is compact, then restricted to $\mathcal{S}_{p}(\nu)$ the convergence under $W_{p}$ is equivalent to the weak convergence and thus, the family of probability measures in $\mathcal{S}_{p}(\nu)$ with finite support is dense. Being a closed convex set, $\mathcal{S}_{p}(\nu)$ is then the closure of $\mathcal{Q}_{\infty}(\nu)$ in $\mathcal{W}_{p}(\mathbb{R}^{d})$ if and only if every Dirac measure $\delta_{y}=\delta_{y}(dx)$ concentrated at $y\in\operatorname{supp}{\nu}$ is the weak limit of a sequence $(\mu_{n})\subset\mathcal{Q}_{\infty}(\nu)$ . The sequence $(\mu_{n})$ with

[TABLE]

where $B_{r}(y)$ is the ball of radius $r>0$ centered at $y$ , has the required properties.
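The approximation of a Dirac measure by measures with bounded densities can be made concrete in a simple case. The sketch below assumes that $\mu_{n}$ is the normalized restriction of $\nu$ to the ball $B_{1/n}(y)$ (our reading of the construction, whose density with respect to $\nu$ is bounded by $1/\nu(B_{1/n}(y))$ ), and takes the hypothetical choice of $\nu$ uniform on $[0,1]$ . Since the only coupling with a Dirac measure is the product, $W_{p}(\mu_{n},\delta_{y})^{p}=\int|x-y|^{p}d\mu_{n}$ , which the code evaluates by a midpoint rule.

```python
def distance_to_dirac(y, radius, p, cells=20000):
    """W_p distance between delta_y and the uniform law on [0, 1]
    conditioned on the ball of the given radius around y."""
    lo, hi = max(0.0, y - radius), min(1.0, y + radius)
    h = (hi - lo) / cells
    # Midpoint rule for the p-th absolute moment around y under mu_n.
    moment = sum(abs(lo + (k + 0.5) * h - y) ** p for k in range(cells))
    moment *= h / (hi - lo)
    return moment ** (1.0 / p)


y, p = 0.3, 2.0
radii = [1.0 / n for n in (4, 8, 16, 32)]
distances = [distance_to_dirac(y, r, p) for r in radii]
# For a ball contained in [0, 1] and p = 2 the exact value is radius / sqrt(3).
exact = [r / 3.0 ** 0.5 for r in radii]
```

The distances shrink proportionally to the radius, while the density bound $1/\nu(B_{1/n}(y))=n/2$ grows but stays finite for each $n$ .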

If $\operatorname{supp}{\nu}$ is not compact, we approximate $\mu\in\mathcal{S}_{p}(\nu)$ by the sequence $(\mu_{n})$ given by

[TABLE]

for some $y\in\operatorname{supp}{\mu}$ . We have that $(\mu_{n})\subset\mathcal{S}_{p}(\nu)$ and $\mu_{n}\to\mu$ under $W_{p}$ . By what we have already proved, each $\mu_{n}$ belongs to the closure of $\mathcal{Q}_{\infty}(\nu_{n})$ in $\mathcal{W}_{p}(\mathbb{R}^{d})$ , where

[TABLE]

As $\mathcal{Q}_{\infty}(\nu_{n})\subset\mathcal{Q}_{\infty}(\nu)$ , $n\geq 1$ , we deduce that the sequence $(\mu_{n})$ belongs to the closure of $\mathcal{Q}_{\infty}(\nu)$ in $\mathcal{W}_{p}(\mathbb{R}^{d})$ . The same property holds for its $W_{p}$ -limit $\mu$ and the result follows. ∎

Appendix B Properties of the function $\phi_{G}$

Let $G$ be a maximal monotone set: $G\in\mathfrak{M}$ . In this appendix, we collect the properties of the function

$$\phi_{G}(y)\triangleq\inf_{x\in G}c(x,y),\quad y\in\mathbb{R}^{2},$$

used throughout the paper.

Lemma B.1.

The function $\phi=\phi_{G}$ and its $c$ -conjugate

[TABLE]

take values in $[-\infty,0]$ , $\phi^{c}\leq\phi$ , and

[TABLE]

Proof.

Let $y\in G$ . As $G\in\mathfrak{M}$ , we have that $c(x,y)\geq 0$ , $x\in G$ , and thus, $\phi(y)=0$ . Conversely, if $y\not\in G$ , then the maximal monotone set $G$ crosses the interior of either the upper-left or the lower-right quadrants relative to $y$ . If $z\in G$ belongs to such intersection, then

[TABLE]

We have shown that $\phi<0$ on $\mathbb{R}^{2}\setminus G$ and $\phi=0$ on $G$ . It follows that

[TABLE]

If $\phi^{c}(x)=0$ , then $\phi(x)=0$ and thus, $x\in G$ . Conversely, if $x\in G$ , then $c(x,y)-\phi(y)\geq 0$ , $y\in\mathbb{R}^{2}$ , and therefore, $\phi^{c}(x)=0$ . ∎
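The sign pattern established in this proof ( $\phi=0$ on $G$ and $\phi<0$ off $G$ ) is easy to observe numerically. The sketch below uses a hypothetical maximal monotone set, the graph of $t\mapsto t^{3}$ , and replaces the infimum over $G$ by a minimum over a finite grid, which is an approximation. The names `g`, `GRID`, and `phi` are illustrative.

```python
def c(x, y):
    """Covariance-type cost c(x, y) = (x1 - y1)(x2 - y2)."""
    return (x[0] - y[0]) * (x[1] - y[1])


def g(t):
    """Hypothetical increasing function whose graph plays the role of G."""
    return t ** 3


# Grid of first coordinates in [-3, 3] with step 0.001.
GRID = [k / 1000.0 for k in range(-3000, 3001)]


def phi(y):
    """Grid approximation of phi_G(y), the infimum of c(x, y) over x in G."""
    return min(c((t, g(t)), y) for t in GRID)


# Points on the graph (their first coordinates belong to the grid).
on_graph = [(t, g(t)) for t in (-1.0, 0.0, 0.5, 2.0)]
# Points off the graph: above it, below it, and above it again.
off_graph = [(0.5, 1.0), (1.0, -1.0), (-2.0, 0.0)]

values_on = [phi(y) for y in on_graph]
values_off = [phi(y) for y in off_graph]
```

On the graph every term $c(x,y)$ is a product of two differences of the same sign, so the minimum is zero and is attained at $x=y$ . Off the graph, the part of $G$ lying in the upper-left or lower-right quadrant relative to $y$ produces strictly negative values.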

We associate with $\phi$ the closed convex function

[TABLE]

Clearly, $\phi$ and $\psi$ have the same domain:

[TABLE]

For a convex set $A\subset\mathbb{R}^{d}$ we denote by $\operatorname*{cl}{A}$ , $\operatorname*{int}{A}$ , $\operatorname*{ri}{A}$ , and $\operatorname{\partial}{A}=\operatorname*{cl}{A}\setminus\operatorname*{ri}{A}$ its respective closure, interior, relative interior and relative boundary.

Lemma B.2.

The domain of $\phi$ is convex. If $G$ is either a horizontal or a vertical line, then $\operatorname{dom}{\phi}=G$ . Otherwise, $\operatorname{dom}{\phi}$ has a non-empty interior:

[TABLE]

where $P_{i}$ is the projection of $G$ on $x_{i}$ -coordinate, $i=1,2$ . If $y\in\operatorname{\partial}{\operatorname{dom}{\phi}}\cap\operatorname{dom}{\phi}$ , then the relative interiors of the horizontal and vertical parts of $\operatorname{\partial}{\operatorname{dom}{\phi}}$ containing $y$ also belong to $\operatorname{dom}{\phi}$ .

Proof.

Being convex, the function $\psi=\psi_{G}$ has convex domain. As $\operatorname{dom}{\phi}=\operatorname{dom}{\psi}$ , the domain of $\phi$ is also convex.

We observe that $P_{i}$ is either a point or an interval. If $P_{1}=\left\{{a_{1}}\right\}$ , then $G$ is a vertical line: $G=\left\{{x\in\mathbb{R}^{2}}:\;x_{1}=a_{1}\right\}$ . For $y\not\in G$ we have that $\left\lvert y_{1}-a_{1}\right\rvert>0$ and

[TABLE]

Thus, $\operatorname{dom}\phi=G$ . The case where $P_{2}$ is a point and thus, $G$ is a horizontal line is identical.

We assume now that $\operatorname*{int}{P_{i}}=(a_{i},b_{i})$ , where $-\infty\leq a_{i}<b_{i}\leq\infty$ . If $y=(y_{1},y_{2})\in(a_{1},b_{1})\times(a_{2},b_{2})$ , then the set $C\triangleq\left\{{x\in G}:\;c(x,y)\leq 0\right\}$ is bounded and therefore,

[TABLE]

Conversely, suppose that $y$ does not belong to the closure of $P_{1}\times P_{2}$ , say $y_{1}<a_{1}$ ; other cases are covered similarly. Then $a_{2}=-\infty$ and hence,

[TABLE]

We have proved (32).

For the last assertion of the lemma, we assume that $a_{1}>-\infty$ and take $y=(a_{1},y_{2})$ and $z=(a_{1},z_{2})$ with $z_{2}<b_{2}$ . Given that $\phi(y)>-\infty$ , we have to show that $\phi(z)>-\infty$ . Indeed, otherwise there is a sequence $(x^{n})\subset G$ such that

[TABLE]

Since $z_{2}<b_{2}$ , the sequence $(x_{1}^{n})$ is bounded and $x_{2}^{n}\rightarrow a_{2}=-\infty$ . It follows that

[TABLE]

and we obtain a contradiction. ∎

The closed convex function $\psi=\psi_{G}$ is lower semi-continuous on $\mathbb{R}^{2}$ and is continuous on the interior of its domain. The following result shows that $\phi$ and $\psi$ are continuous relative to their full domains.

Lemma B.3.

If $(y^{n})\subset\operatorname{dom}{\psi}=\operatorname{dom}{\phi}$ and $y^{n}\to y$ , then $\psi(y^{n})\to\psi(y)$ and $\phi(y^{n})\to\phi(y)$ .

Proof.

It is sufficient to consider the case of the function $\psi$ and take $y\in\operatorname{\partial}{\operatorname{dom}{\psi}}$ . If $\psi(y)=\infty$ , then the result holds by the lower semi-continuity:

[TABLE]

Thus, we assume that $\psi(y)<\infty$ or, equivalently, that $y\in\operatorname{\partial}{\operatorname{dom}{\psi}}\cap\operatorname{dom}{\psi}$ . By Lemma B.2, the relative interiors of the horizontal and vertical parts of $\partial\operatorname{dom}{\psi}$ containing $y$ belong to $\operatorname{dom}{\psi}$ . Hence, there is a closed triangle in $\operatorname{dom}{\psi}$ that contains $(y^{n})_{n\geq n_{0}}$ , for sufficiently large $n_{0}$ . Being convex, the function $\psi$ is continuous on this triangle and the result follows. ∎

We define a multi-valued function

[TABLE]

taking values in the closed (possibly empty) subsets of $G$ , and denote

[TABLE]

Let $E^{G}_{i}=\cup_{t\in\mathcal{T}^{G}_{i}}E^{G}_{i}(t)$ , $i=1,2$ , be the union of vertical and horizontal line segments of $G$ ; see (23). As the set $G$ is fixed, we write simply

[TABLE]

The following lemma shows that for $y\in\operatorname{dom}{\operatorname{Arg}}\setminus G$ the set $\operatorname{Arg}(y)$ can intersect $E_{i}(t)$ only at $\operatorname{\partial}{E_{i}(t)}$ . We denote $\left\langle x,y\right\rangle\triangleq\sum_{i=1}^{2}x_{i}y_{i}$ , the scalar product of $x,y\in\mathbb{R}^{2}$ .

Lemma B.4.

Let $i\in\left\{{1,2}\right\}$ and $t\in\mathcal{T}_{i}$ . If $y\in\operatorname{dom}{\operatorname{Arg}}\setminus G$ and $x\in E_{i}(t)\cap\operatorname{Arg}(y)$ , then $x$ belongs to the boundary of ${E_{i}(t)}$ and

[TABLE]

Proof.

Without loss of generality we can assume that $i=2$ and that $y$ stays above $G$ . Then the increasing hyperbola

[TABLE]

contains $x$ and lies below $G$ , which is only possible if $x$ is the right boundary of the horizontal line segment $E_{2}(t)$ . In this case,

[TABLE]

and the result follows. ∎

For $x,y\in\mathbb{R}^{2}$ we denote by $L(x,y)$ the line segment connecting $x$ and $y$ :

$$L(x,y)\triangleq\left\{tx+(1-t)y:\;t\in[0,1]\right\}.$$

Lemma B.5.

Let $y^{0}$ and $y^{1}$ belong to $\operatorname{dom}{\operatorname{Arg}}\setminus G$ and stay above and below $G$ , respectively. If $x\in\operatorname{Arg}(y^{0})\cap\operatorname{Arg}(y^{1})$ , then $x$ belongs to the line segment $L(y^{0},y^{1})$ connecting $y^{0}$ and $y^{1}$ .

Proof.

The conditions of the lemma imply that the increasing hyperbolas

[TABLE]

contain $x$ and stay below and above $G$ , respectively. Hence, they have identical tangent lines at $x$ . Elementary computations show that the slope of the tangent line is given by

[TABLE]

and the result follows. ∎
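The elementary computation behind the tangent slope can be sketched as follows. This is an illustrative derivation that uses only the cost $c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2})$ ; the constant $k$ and the parametrization by $z_{1}$ are our own notation, introduced for this sketch. It also assumes that $\phi$ is strictly negative off $G$ , so that $k\triangleq c(x,y)=\phi(y)<0$ for $y\in\{y^{0},y^{1}\}$ . The branch of the hyperbola $\{z:\;c(z,y)=k\}$ through $x$ is then the graph of $z_{2}=y_{2}+k/(z_{1}-y_{1})$ , and

$$\frac{dz_{2}}{dz_{1}}\Big|_{z=x}=-\frac{k}{(x_{1}-y_{1})^{2}}=-\frac{x_{2}-y_{2}}{x_{1}-y_{1}}>0.$$

Equating these slopes for $y=y^{0}$ and $y=y^{1}$ gives

$$(x_{1}-y^{0}_{1})(x_{2}-y^{1}_{2})-(x_{2}-y^{0}_{2})(x_{1}-y^{1}_{1})=0,$$

that is, the vectors $x-y^{0}$ and $x-y^{1}$ are parallel, so $x$ , $y^{0}$ and $y^{1}$ are collinear. As $y^{0}$ and $y^{1}$ lie on opposite sides of $G$ , the point $x\in G$ sits between them on this line.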

For $y\in\operatorname*{int}{\operatorname{dom}{\phi}}$ the derivative $\nabla\phi(y)$ is defined in the classical sense. For $y\in\operatorname{\partial}{\operatorname{dom}{\phi}}$ the derivative $\nabla\phi(y)$ exists if it is the limit: $\nabla\phi(y^{n})\to\nabla\phi(y)$ , for every sequence $(y^{n})\subset\operatorname*{int}{\operatorname{dom}{\phi}}\cap\operatorname{dom}{\nabla\phi}$ that converges to $y$ . We write

[TABLE]

By $D^{c}\triangleq(D^{c}_{1},D^{c}_{2})$ we denote the differential operator associated with the cost function $c=c(x,y)$ :

[TABLE]

Finally, let $E=E^{G}=E^{G}_{1}\cup E^{G}_{2}$ be the union of the vertical and horizontal line segments of $G$ and denote

[TABLE]

We observe that

[TABLE]

Theorem B.6.

We have that

[TABLE]

Conversely, the set difference $\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}\setminus\operatorname{dom}{\nabla\phi}$ has at most two points and these points belong to different linear parts of $\operatorname{\partial}{\operatorname{dom}{\phi}}$ . If $y\in\operatorname{dom}{\nabla\phi}\cap\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}$ , then $D^{c}\phi(y)$ is the only element of $\operatorname{Arg}(y)$ and

[TABLE]

We divide the proof of the theorem into lemmas. We write $x\leq y$ if $x_{i}\leq y_{i}$ , $i=1,2$ . If $x,y\in G$ and $x\leq y$ , then $G(x,y)$ denotes the segment of $G$ bounded by $x$ and $y$ :

$$G(x,y)\triangleq\left\{z\in G:\;x\leq z\leq y\right\}.$$

Lemma B.7.

Let $y\in\operatorname*{int}{\operatorname{dom}{\phi}}$ . Then $y\in\operatorname{dom}{\nabla\phi}$ if and only if $y\in\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}$ . In this case, $D^{c}\phi(y)$ is the only element of $\operatorname{Arg}(y)$ .

Proof.

From the structure of $\operatorname*{int}{\operatorname{dom}{\phi}}$ in Lemma B.2 we deduce the existence of $x^{0},x^{1}\in G$ such that $x^{0}\leq x^{1}$ , $y\in\operatorname*{int}{R(x^{0},x^{1})}$ , and $R(x^{0},x^{1})\subset\operatorname*{int}{\operatorname{dom}{\phi}}$ , where $R(x^{0},x^{1})$ is the rectangle with the diagonal $L(x^{0},x^{1})$ . Every $x\in G$ such that $c(x,y)\leq 0$ belongs to $G(x^{0},x^{1})=R(x^{0},x^{1})\cap G$ . Hence,

[TABLE]

As $G(x^{0},x^{1})$ is compact, $\operatorname{Arg}(y)$ is non-empty. If $x\in\operatorname{Arg}(y)$ , then

[TABLE]

It follows that $(x_{2},x_{1})$ belongs to $\partial\psi(y)$ , the subdifferential of $\psi=\psi_{G}$ at $y$ . Differentiability of $\phi$ (equivalently, of $\psi$ ) at $y$ then implies that $\operatorname{Arg}(y)$ is a singleton and

[TABLE]

Conversely, let $x$ be the only element of $\operatorname{Arg}(y)$ and $\widetilde{x}=(\widetilde{x}_{1},\widetilde{x}_{2})\in\mathbb{R}^{2}$ be such that $(\widetilde{x}_{2},\widetilde{x}_{1})\in\partial\psi(y)$ . We have to show that $x=\widetilde{x}$ . We take a unit vector $e=(e_{1},e_{2})$ in $\mathbb{R}^{2}$ and define a sequence $(y^{n})$ in $\mathbb{R}^{2}$ such that

[TABLE]

Let $n_{0}$ be an index such that $y^{n}\in\operatorname*{int}{R(x^{0},x^{1})}$ , $n\geq n_{0}$ . By the first part of the proof, for $n\geq n_{0}$ the set $\operatorname{Arg}(y^{n})$ is non-empty and belongs to the compact $G(x^{0},x^{1})$ . Moreover, if $x^{n}\in\operatorname{Arg}(y^{n})$ then $(x^{n}_{2},x^{n}_{1})\in\partial\psi(y^{n})$ . It follows that

[TABLE]

and then that $\left\langle x^{n},e\right\rangle\geq\left\langle\widetilde{x},e\right\rangle$ . As $x$ is the only element of $\operatorname{Arg}(y)$ and

[TABLE]

every convergent subsequence of $(x^{n})$ goes to $x$ and then $x^{n}\to x$ . Hence, $\left\langle x,e\right\rangle\geq\left\langle\widetilde{x},e\right\rangle$ and, as $e$ is an arbitrary unit vector in $\mathbb{R}^{2}$ , we obtain that $x=\widetilde{x}$ . ∎
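The subgradient step in the proof of Lemma B.7 can be spelled out under one assumption that is not visible in this part of the appendix: that $\psi_{G}(y)=y_{1}y_{2}-\phi_{G}(y)=\sup_{x\in G}\left(x_{1}y_{2}+x_{2}y_{1}-x_{1}x_{2}\right)$ . With this reading, for $x\in\operatorname{Arg}(y)$ and any $z\in\mathbb{R}^{2}$ ,

$$\psi(z)\geq x_{1}z_{2}+x_{2}z_{1}-x_{1}x_{2}=\underbrace{x_{1}y_{2}+x_{2}y_{1}-x_{1}x_{2}}_{=\,y_{1}y_{2}-c(x,y)\,=\,\psi(y)}+x_{2}(z_{1}-y_{1})+x_{1}(z_{2}-y_{2}),$$

which is exactly the statement $(x_{2},x_{1})\in\partial\psi(y)$ . As a sanity check on a hypothetical example of our own choosing, take the diagonal $G=\{(t,t):\;t\in\mathbb{R}\}$ :

$$c((t,t),y)=(t-y_{1})(t-y_{2}),\qquad \operatorname{Arg}(y)=\left\{\left(\tfrac{y_{1}+y_{2}}{2},\tfrac{y_{1}+y_{2}}{2}\right)\right\},\qquad \phi(y)=-\tfrac{(y_{1}-y_{2})^{2}}{4},\qquad \psi(y)=\tfrac{(y_{1}+y_{2})^{2}}{4}.$$

Here $\nabla\psi(y)=\left(\tfrac{y_{1}+y_{2}}{2},\tfrac{y_{1}+y_{2}}{2}\right)=(x_{2},x_{1})$ for the unique $x\in\operatorname{Arg}(y)$ , in agreement with the lemma.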

Lemma B.8.

Let $y\in\operatorname{dom}{\operatorname{Arg}}\setminus G$ and $x\in\operatorname{Arg}(y)$ . Then

[TABLE]

and $D^{c}\phi(z)=x$ , $z\in\operatorname*{ri}{L(x,y)}$ . The slope of the line segment $L(x,y)$ is negative and has the form:

[TABLE]

Proof.

We fix $t\in(0,1)$ and denote $y(t)=ty+(1-t)x$ . From the description of $\operatorname*{int}{\operatorname{dom}{\phi}}$ in Lemma B.2 we deduce that $y(t)\in\operatorname*{int}{\operatorname{dom}{\phi}}$ . Without loss of generality we can assume that $y_{1}<x_{1}$ . Then the hyperbola

[TABLE]

contains $x$ and stays below $G$ , while the hyperbola

[TABLE]

contains $x$ and stays below $H$ . It follows that $x$ is the only element of $\operatorname{Arg}(y(t))$ . Lemma B.7 yields that $D^{c}\phi(y(t))=x$ . The last part of the lemma follows directly from the definition of $D^{c}\phi$ and the fact that $\phi(y(t))=c(x,y(t))=c(D^{c}\phi(y(t)),y(t))$ . ∎
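Two identities used implicitly in Lemma B.8 can be made explicit. The sketch below relies only on the cost $c(x,y)=(x_{1}-y_{1})(x_{2}-y_{2})$ and on $\phi<0$ off $G$ ; the expression in terms of partial derivatives further assumes that $D^{c}\phi(y)$ is the point $x$ solving the first-order condition $\nabla\phi(y)=\nabla_{y}c(x,y)$ , which is our reading of the definition and is not confirmed by this excerpt. Since $x-y(t)=t(x-y)$ ,

$$c(x,y(t))=t^{2}c(x,y),\qquad\text{so that}\qquad \phi(y(t))=t^{2}\phi(y)\quad\text{once } x\in\operatorname{Arg}(y(t)).$$

For the slope of $L(x,y)$ we have

$$\frac{y_{2}-x_{2}}{y_{1}-x_{1}}=\frac{c(x,y)}{(x_{1}-y_{1})^{2}}=\frac{\phi(y)}{(x_{1}-y_{1})^{2}}<0,$$

and, under the assumed first-order condition $\frac{\partial\phi}{\partial y_{1}}(y)=y_{2}-x_{2}$ , $\frac{\partial\phi}{\partial y_{2}}(y)=y_{1}-x_{1}$ , the same slope equals $\frac{\partial\phi}{\partial y_{1}}(y)\big/\frac{\partial\phi}{\partial y_{2}}(y)$ . These identities do not by themselves show that $x$ minimizes $c(\cdot,y(t))$ over $G$ ; that is the role of the hyperbola comparison in the proof.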

The following corollary of Lemma B.8 will also be used in the proof of Theorem B.12.

Lemma B.9.

Let $y^{0}$ and $y^{1}$ be distinct points in $\operatorname{dom}{\operatorname{Arg}}\setminus G$ and $x^{i}\in\operatorname{Arg}(y^{i})$ , $i=0,1$ . Then either $x^{0}=x^{1}$ or the line segments $L(x^{0},y^{0})$ and $L(x^{1},y^{1})$ do not intersect.

Proof.

If $L(x^{0},y^{0})$ and $L(x^{1},y^{1})$ have a common interior point $z$ , then Lemma B.8 yields that $x^{0}=D^{c}\phi(z)=x^{1}$ . ∎

Lemma B.10.

Let $y\in\operatorname{dom}{\nabla\phi}\setminus G$ . Then $D^{c}\phi(y)$ is the only element of $\operatorname{Arg}(y)$ .

Proof.

In view of Lemma B.7 we can further assume that $y\in\operatorname{\partial}{\operatorname{dom}{\phi}}$ . Let $(y^{n})$ be a sequence in $\operatorname*{int}{\operatorname{dom}{\phi}}\cap\operatorname{dom}{\nabla\phi}$ that converges to $y$ . By Lemma B.7, $D^{c}\phi(y^{n})$ is the only element of $\operatorname{Arg}(y^{n})$ . From the construction of $\nabla\phi$ on $\operatorname{\partial}{\operatorname{dom}{\phi}}$ and Lemma B.3 we deduce that

[TABLE]

Hence, $D^{c}\phi(y)\in\operatorname{Arg}(y)$ . On the other hand, if $x\in\operatorname{Arg}(y)$ , then Lemma B.8 allows us to choose the sequence $(y^{n})$ so that $D^{c}\phi(y^{n})=x$ . Hence, $x=D^{c}\phi(y)$ . ∎

Lemma B.11.

The set difference $\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}\setminus\operatorname{dom}{\nabla\phi}$ has at most two points and these points belong to different linear parts of $\operatorname{\partial}{\operatorname{dom}{\phi}}$ .

Proof.

From Lemma B.2 we deduce that $\operatorname*{int}{\operatorname{dom}{\phi}}=(a_{1},b_{1})\times(a_{2},b_{2})$ , where $-\infty\leq a_{i}<b_{i}\leq\infty$ and $(a_{i},b_{i})$ is the interior of the projection of $G$ on the $x_{i}$ -coordinate. Without loss of generality we can assume that $a_{1}>-\infty$ . Let $y^{0}$ and $y^{1}$ be such that $y^{0}_{1}=y^{1}_{1}=a_{1}$ , $y^{0}_{2}<y^{1}_{2}<b_{2}$ and $y^{0}\in\operatorname{dom}{\operatorname{Arg}}$ , $y^{1}\in\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}$ . We are going to show that $y^{1}\in\operatorname{dom}{\nabla\phi}$ . By doing so, we shall prove that the interior of each linear part of $\operatorname{\partial}{\operatorname{dom}{\phi}}$ has at most one element of $\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}\setminus\operatorname{dom}{\nabla\phi}$ .

Let $(z^{n})$ be a sequence in $\operatorname*{int}{\operatorname{dom}{\phi}}\cap\operatorname{dom}{\nabla\phi}$ that converges to $y^{1}$ . Then $\sup_{n}z^{n}_{2}<b_{2}$ and there is $w\in G$ such that $\sup_{n}z^{n}_{2}\leq w_{2}<b_{2}$ . In view of Lemma B.7, $u^{n}=D^{c}\phi(z^{n})$ is the only element of $\operatorname{Arg}(z^{n})$ . If $x^{0}\in\operatorname{Arg}(y^{0})$ , then $y^{1}$ stays strictly above the line segment $L(x^{0},y^{0})$ and, as $z^{n}\to y^{1}$ , we can assume that the same property holds for $(z^{n})$ . By Lemmas B.8 and B.10, the line segment $L(z^{n},u^{n})$ has negative slope and can intersect $L(y^{0},x^{0})$ only at $x^{0}$ . It follows that $u^{n}$ belongs to the compact set $G(x^{0},w)$ . Continuity of $\phi=\phi_{G}$ from Lemma B.3 yields that any convergent subsequence of $(u^{n})$ goes to the unique $x^{1}\in\operatorname{Arg}(y^{1})$ . Hence, $y^{1}\in\operatorname{dom}{\nabla\phi}$ and $x^{1}=D^{c}\phi(y^{1})$ , by the definition of $\nabla\phi$ on $\operatorname{\partial}{\operatorname{dom}{\phi}}$ .

Similar arguments show that if the “corner” point $\widehat{y}=(a_{1},b_{2})\in\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}$ and there are $z^{0},z^{1}\in\operatorname{dom}{\operatorname{Arg}}$ that belong to the interiors of different linear parts of $\operatorname{\partial}{\operatorname{dom}{\phi}}$ , then $\widehat{y}\in\operatorname{dom}{\nabla\phi}$ . ∎

Proof of Theorem B.6.

From Lemmas B.7 and B.10 we deduce that

[TABLE]

Lemma B.2 shows that the boundary of $\operatorname{dom}{\phi}$ is contained in the union of two lines and that each of these lines is either vertical or horizontal. It follows that

[TABLE]

and we obtain (34). Lemma B.11 states the structure of $\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}\setminus\operatorname{dom}{\nabla\phi}$ . Let $y\in\operatorname{dom}{\nabla\phi}\cap\operatorname{\widehat{\operatorname{dom}}}{\operatorname{Arg}}$ . Accounting for (33) we deduce that

[TABLE]

Lemmas B.7 and B.10 now yield that $D^{c}\phi(y)$ is the only element of $\operatorname{Arg}(y)$ . Finally, identity (35) holds by the definition of $D^{c}\phi$ . ∎

We recall that $\mathcal{D}$ denotes the family of graphs of strictly decreasing functions $h=h(t)$ defined on closed intervals of $\mathbb{R}$ such that $h$ and its inverse $h^{-1}$ are Lipschitz functions. We allow for a degenerate case where the domain of $h$ is just a point. Thus, $\mathbb{R}^{2}\subset\mathcal{D}$ .

Theorem B.12.

The exception set

[TABLE]

where $D$ is a countable union of sets in $\mathcal{D}$ and $E=E^{G}$ is the union of horizontal and vertical line segments of $G$ .

We divide the proof into lemmas. For $y\in\operatorname{dom}{\operatorname{Arg}}\setminus G$ and the points $r\leq s$ in $\operatorname{Arg}(y)$ , we denote by $\Delta(y,r,s)$ the closed curved triangle bounded by the line segments $L(r,y)$ , $L(y,s)$ , and the segment $G(r,s)$ of $G$ ; see Figure 2. If $r=s$ , then $\Delta(y,r,s)=L(r,y)=L(s,y)$ ; otherwise $\operatorname*{int}{\Delta(y,r,s)}\not=\emptyset$ .

Lemma B.13.

Let $y^{0},y^{1}$ be distinct points in $\operatorname{dom}{\operatorname{Arg}}\setminus G$ , let $r^{i}\leq s^{i}$ be in $\operatorname{Arg}(y^{i})$ , and denote $\Delta^{i}\triangleq\Delta(y^{i},r^{i},s^{i})$ , $i=0,1$ .

(a) If $y^{0}\in\Delta^{1}$ , then $\Delta^{0}\subset\Delta^{1}$ .

(b) If $y^{0}\not\in\Delta^{1}$ and $y^{1}\not\in\Delta^{0}$ , then the intersection of ${\Delta^{0}}$ and ${\Delta^{1}}$ is at most one point, which is then either $r^{1}=s^{0}$ or $s^{1}=r^{0}$ .

Proof.

If either (a) or (b) fails to hold, then there are line segments $L^{i}\in\left\{{L(r^{i},y^{i}),L(s^{i},y^{i})}\right\}$ , $i=0,1$ , that intersect only at an interior point. We obtain a contradiction with Lemma B.9. ∎

Lemma B.13 (a) yields a partial order relation on $\operatorname{dom}{\operatorname{Arg}}\setminus G$ : $y^{0}\prec y^{1}$ if $y^{0}\in\Delta(y^{1},r^{1},s^{1})$ for some $r^{1}\leq s^{1}$ in $\operatorname{Arg}(y^{1})$ .

Lemma B.14.

If $y^{0},y^{1}$ belong to $\operatorname{dom}{\operatorname{Arg}}\setminus G$ and $y^{0}\prec y^{1}$ , then

[TABLE]

that is, $D(y^{0},y^{1})$ is the graph of a strictly decreasing function $h=h(t)$ on $[y^{0}_{1},y^{1}_{1}]$ such that $h$ and its inverse $h^{-1}$ are Lipschitz functions.

Proof.

We illustrate the proof in Figure 2. Without loss of generality we can assume that $y^{0}$ and $y^{1}$ are distinct points that stay above $G$ . Let $r^{i}\leq s^{i}$ be in $\operatorname{Arg}(y^{i})$ . We have that $r^{1}\leq r^{0}\leq s^{0}\leq s^{1}$ . If $y^{0}$ belongs to the line segment $L(y^{1},s^{1})$ , then Lemma B.8 yields that

[TABLE]

and then that $D=L(y^{0},y^{1})$ . The same lemma shows that the line segment $L(y^{0},y^{1})$ has a negative slope and thus belongs to $\mathcal{D}$ . The case where $y^{0}\in L(y^{1},r^{1})$ is identical.

Hereafter, we assume that $y^{0}\not\in L(y^{1},s^{1})\cup L(y^{1},r^{1})$ or, equivalently, that $y^{0}\in\operatorname*{int}{\Delta(y^{1},r^{1},s^{1})}$ . Being a chord of the concave hyperbola

[TABLE]

which touches $G$ from below, the line segment $L(r^{1},s^{1})$ stays below $G$ . It follows that $y^{0}$ belongs to the interior of the triangle with vertices $\left\{{r^{1},y^{1},s^{1}}\right\}$ . Hence, there are unique $z^{1}\in\operatorname*{ri}{L(y^{1},s^{1})}$ and $z^{0}\in\operatorname*{ri}{L(r^{1},y^{1})}$ such that the line segments $L(r^{1},z^{1})$ and $L(s^{1},z^{0})$ intersect at $y^{0}$ .

We observe that the convex polygon $P$ with the vertices $\left\{{z^{1},y^{1},z^{0},y^{0}}\right\}$ contains every $y\in\operatorname{dom}{\operatorname{Arg}}\setminus G$ such that $y^{0}\prec y\prec y^{1}$ and thus contains $D$ . Being convex, $\psi$ is bounded on $P$ . Hence, $\phi$ is bounded on $P$ as well. Moreover, as $P$ stays away from $G=\left\{{x\in\operatorname{dom}{\phi}}:\;\phi(x)=0\right\}$ , the same boundedness property holds for $1/\phi$ . If $y\in P\cap\operatorname{dom}{\nabla\phi}$ , then Lemmas B.9 and B.10 show that $D^{c}\phi(y)$ belongs to the union of $G(r^{1},r^{0})$ and $G(s^{0},s^{1})$ . In particular, $D^{c}\phi$ and then also $\nabla\phi$ are bounded on $P\cap\operatorname{dom}{\nabla\phi}$ . From Lemma B.8 we deduce the existence of negative constants $a$ and $b$ such that

[TABLE]

Let $y,z\in D$ be distinct. Lemma B.13 yields that either $y\prec z$ or $z\prec y$ . Assuming that $z\prec y$ we deduce the existence of $r,s\in\operatorname{Arg}(y)$ such that $r\leq s$ and $z\in\Delta(y,r,s)$ . The slope of $L(y,z)$ is then bounded from below by the slope of $L(y,r)$ and from above by the slope of $L(y,s)$ , and thus is bounded in between by $a$ and $b$ :

[TABLE]

Hence, the set $D$ has the required Lipschitz properties.
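The passage from the slope bounds to the Lipschitz property can be sketched as follows, assuming the negative constants are ordered as $a\leq b<0$ (the ordering is not visible in this excerpt). For distinct $y,z\in D$ ,

$$a\leq\frac{z_{2}-y_{2}}{z_{1}-y_{1}}\leq b<0\quad\Longrightarrow\quad \left|z_{2}-y_{2}\right|\leq\left|a\right|\left|z_{1}-y_{1}\right|\quad\text{and}\quad\left|z_{1}-y_{1}\right|\leq\frac{1}{\left|b\right|}\left|z_{2}-y_{2}\right|.$$

In particular, distinct points of $D$ have distinct first coordinates and distinct second coordinates, so $D$ is the graph of a function $h$ with $h(y_{1})=y_{2}$ . The negative sign of the slope makes $h$ strictly decreasing, the first inequality says that $h$ is Lipschitz with constant $\left|a\right|$ , and the second that $h^{-1}$ is Lipschitz with constant $1/\left|b\right|$ .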

It remains to be shown that the set $D$ is connected or, equivalently, that for every pair of distinct points $w^{0}\prec w^{1}$ in $D$ there is $w\in D$ , which is different from $w^{0}$ and $w^{1}$ and such that $w^{0}\prec w\prec w^{1}$ . Without loss of generality we can take $w^{0}=y^{0}$ and $w^{1}=y^{1}$ . We shall find the required $w$ in $L(z^{0},z^{1})$ .

Let $z(t)=(1-t)z^{0}+tz^{1}$ , $t\in[0,1]$ . From the non-intersection property of Lemma B.9 and the continuity of $\phi$ on its domain, we deduce that

1. If $t\in(0,1)$ and $\operatorname{Arg}(z(t))\cap G(r^{1},r^{0})\not=\emptyset$ , then $\operatorname{Arg}(z(s))\subset G(r^{1},r^{0})$ , $0\leq s<t$ .

2. If $(t_{n})\subset[0,1]$ is such that $t_{n}\to t$ and $\operatorname{Arg}(z(t_{n}))\cap G(r^{1},r^{0})\not=\emptyset$ , $n\geq 1$ , then $\operatorname{Arg}(z(t))\cap G(r^{1},r^{0})\not=\emptyset$ .

Similar properties (with obvious modifications in the first item) hold when $G(r^{1},r^{0})$ is replaced with $G(s^{0},s^{1})$ . These properties readily yield a unique $t^{*}\in(0,1)$ such that $\operatorname{Arg}(z(t^{*}))$ intersects both $G(r^{1},r^{0})$ and $G(s^{0},s^{1})$ . Clearly, $w=z(t^{*})$ is different from both $y^{0}$ and $y^{1}$ and $y^{0}\prec w\prec y^{1}$ , thanks to Lemma B.13. ∎

Lemma B.15.

The set

[TABLE]

if not empty, is a countable union of sets in $\mathcal{D}$ . More precisely,

[TABLE]

for some $u^{n}\leq v^{n}$ in $D$ , $n\geq 1$ .

Proof.

Clearly, $D=\cup_{n\geq 1}D(\frac{1}{n})$ , where

[TABLE]

Let $\epsilon>0$ . We denote by $\widehat{D}(\epsilon)$ the set of minimal elements of $D(\epsilon)$ with respect to the order relation $\prec$ . In other words, $\widehat{y}\in\widehat{D}(\epsilon)$ if any $y\in D(\epsilon)$ such that $y\prec\widehat{y}$ coincides with $\widehat{y}$ . From Lemma B.13 we deduce that $\widehat{D}(\epsilon)$ is countable. Let $y\in D(\epsilon)$ . If $y$ is not a minimal element, then there is $y^{\prime}\in D(\epsilon)$ such that $y^{\prime}\prec y$ , $y\not=y^{\prime}$ . Being contained in $\Delta(y,u,v)$ for some $u\leq v$ in $\operatorname{Arg}{(y)}$ , the set $\left\{{z\in D(\epsilon)}:\;z\prec y^{\prime}\right\}$ is bounded. By the continuity of $\phi=\phi_{G}$ , this set is closed and hence, contains some $\widehat{y}\in\widehat{D}(\epsilon)$ . It follows that

[TABLE]

Finally, for $y\in D$ , Lemmas B.13 and B.14 show that $\left\{{z\in D}:\;y\prec z\right\}$ is the graph of a strictly decreasing function $h$ such that $h$ and $h^{-1}$ are locally Lipschitz. The result readily follows. ∎

Proof of Theorem B.12.

By Theorem B.6 representation (36) holds if we add to the set $D$ given by (37) at most 2 points. Lemma B.15 yields the result. ∎

Lemma B.16.

Let $D$ be given by (37) and

[TABLE]

Then $S$ is countable and there are Borel functions $g_{i}:\;D\rightarrow G$ , $i=1,2$ , such that

[TABLE]

Proof.

In view of Lemma B.15, it is sufficient to prove the result for the sets $D^{\prime}=D(u,v)$ and $S^{\prime}=S\cap D^{\prime}$ , where $u\prec v$ in $D$ . Let $r\leq s$ be distinct elements of $\operatorname{Arg}(u)$ . The functions

[TABLE]

map $D^{\prime}$ to $G$ and are monotone with respect to the order relations $\prec$ on $D^{\prime}$ and $\leq$ on $G$ . Thus, their respective sets $(R_{i})$ of discontinuities are countable. From Lemma B.13 we deduce that $S^{\prime}\subset R_{1}\cup R_{2}$ and from the continuity of $\phi$ that $g_{i}(y)\in\operatorname{Arg}(y)$ , $y\in D^{\prime}$ . The proof readily follows. ∎

Bibliography


1. Ambrosio and Gigli (2013). Luigi Ambrosio and Nicola Gigli. A user’s guide to optimal transport. In Modelling and optimisation of flows on networks, volume 2062 of Lecture Notes in Math., pages 1–155. Springer, Heidelberg, 2013. doi: 10.1007/978-3-642-32160-3_1. URL https://doi.org/10.1007/978-3-642-32160-3_1.
2. Beiglböck and Juillet (2016). Mathias Beiglböck and Nicolas Juillet. On a problem of optimal transport under marginal martingale constraints. Ann. Probab., 44(1):42–106, 2016. ISSN 0091-1798. doi: 10.1214/14-AOP966. URL https://doi.org/10.1214/14-AOP966.
3. Beiglböck et al. (2017). Mathias Beiglböck, Marcel Nutz, and Nizar Touzi. Complete duality for martingale optimal transport on the line. Ann. Probab., 45(5):3038–3074, 2017. ISSN 0091-1798. doi: 10.1214/16-AOP1131. URL https://doi.org/10.1214/16-AOP1131.
4. Ghoussoub et al. (2019). Nassif Ghoussoub, Young-Heon Kim, and Tongseok Lim. Structure of optimal martingale transport plans in general dimensions. Ann. Probab., 47(1):109–164, 2019. ISSN 0091-1798. doi: 10.1214/18-AOP1258. URL https://doi.org/10.1214/18-AOP1258.
5. Henry-Labordère and Touzi (2016). Pierre Henry-Labordère and Nizar Touzi. An explicit martingale version of the one-dimensional Brenier theorem. Finance Stoch., 20(3):635–668, 2016. ISSN 0949-2984. doi: 10.1007/s00780-016-0299-x. URL https://doi.org/10.1007/s00780-016-0299-x.
6. Kyle (1985). Albert S. Kyle. Continuous auctions and insider trading. Econometrica, 53:1315–1335, 1985.
7. Pratelli (2007). Aldo Pratelli. On the equality between Monge’s infimum and Kantorovich’s minimum in optimal mass transportation. Ann. Inst. H. Poincaré Probab. Statist., 43(1):1–13, 2007. ISSN 0246-0203. doi: 10.1016/j.anihpb.2005.12.001. URL https://doi.org/10.1016/j.anihpb.2005.12.001.
8. Rochet and Vila (1994). Jean-Charles Rochet and Jean-Luc Vila. Insider Trading without Normality. The Review of Economic Studies, 61(1):131–152, 1994. ISSN 0034-6527. doi: 10.2307/2297880. URL https://doi.org/10.2307/2297880.

