Entropic approximation of $\infty$-optimal transport problems
Guillaume Carlier, Camilla Brizzi, Luigi De Pascale

TL;DR
This paper introduces an entropic approximation method for supremal cost optimal transport problems, proving convergence to $\infty$-cyclically monotone plans and demonstrating numerical results with Sinkhorn's algorithm.
Contribution
It develops a novel entropic penalization approach for $\infty$-optimal transport problems, establishing $\Gamma$-convergence and plan selection properties.
Findings
Proves $\Gamma$-convergence of the entropic approximation.
Shows the method selects $\infty$-cyclically monotone plans.
Provides numerical illustrations using Sinkhorn's algorithm.
Abstract
We propose an entropic approximation approach for optimal transportation problems with a supremal cost. We establish $\Gamma$-convergence for suitably chosen parameters for the entropic penalization and that this procedure selects $\infty$-cyclically monotone plans at the limit. We also present some numerical illustrations performed with Sinkhorn's algorithm.
Camilla Brizzi, Dipartimento di Matematica e Informatica, Università di Firenze, Viale Morgagni 67/a, 50134 Firenze, Italy
Guillaume Carlier, CEREMADE, UMR CNRS 7534, Université Paris Dauphine, PSL, Pl. de Lattre de Tassigny, 75775 Paris Cedex 16, France and INRIA-Paris, MOKAPLAN
Luigi De Pascale, Dipartimento di Matematica e Informatica, Università di Firenze, Viale Morgagni 67/a, 50134 Firenze, Italy
Keywords: $\infty$-optimal transport, $\infty$-cyclical monotonicity, entropic approximation.
MS Classification: 49Q22, 65K10.
1 Introduction
The usual Monge-Kantorovich optimal transport problem consists, given a transportation cost and distributions of sources and targets, in finding a transport plan that makes the average transport cost minimal. It has attracted a considerable amount of attention in the last three decades, as can be seen from the textbooks of Villani [17], [18] and Santambrogio [15]. In optimal transport problems with a supremal cost (also called $\infty$-optimal transport), one rather looks for transport plans which minimize the essential supremum of the cost. Whereas the usual Monge-Kantorovich problem is a linear programming problem, $\infty$-optimal transport leads to non-convex optimization (even though the supremal cost has convex sublevel sets), which, to a large extent, explains why there are fewer theoretical results and numerical methods (with the notable exception of the recent combinatorial approach of Bansil and Kitagawa [1]) to address it. As in the Calculus of Variations with a supremal functional, $\infty$-optimal transport may admit many minimizers, and selecting special ones which satisfy tractable optimality conditions is an important issue, which was studied first by Champion, De Pascale and Juutinen in [6]. In contrast with the classical Monge-Kantorovich problem, where restrictions of optimal plans remain optimal between their marginals, this might be false for $\infty$-optimal transport. This is why the authors of [6] introduced the notion of restrictable optimal plans and showed that such restrictable solutions are characterized by a remarkable property of $\infty$-cyclical monotonicity of their support. This was the starting point for the existence of optimal maps for $\infty$-optimal transport under various conditions on the cost and the marginals, see [6], [10], [3].
Among numerical methods for optimal transport (see Cuturi and Peyré [14], Benamou [2], Mérigot and Thibert [12]), the entropic penalization approach and the Sinkhorn algorithm have gained a lot of popularity since Cuturi's paper [7]. Entropic optimal transport, which has connections with large deviations and the so-called Schrödinger bridge problem (see Léonard [11]), has also stimulated an intensive stream of recent theoretical research, see the lecture notes of Nutz [13] and the references therein. A recent breakthrough in the field is the work of Bernton, Ghosal and Nutz [8], where a large deviations principle related to cyclical monotonicity is established for entropic optimal plans.
The goal of the present paper is to investigate, theoretically and numerically, whether the entropic approximation strategy can be used for $\infty$-optimal transport as well. We will in particular see how the results of [8] can be used to show that this approximation selects at the limit the distinguished restrictable $\infty$-cyclically monotone minimizers introduced in [6].
The article is organized as follows. The setting is introduced in Section 2. Section 3 is devoted to $\Gamma$-convergence towards the supremal cost functional. In Section 4, we study how the entropic penalization selects $\infty$-cyclically monotone plans in the limit. In Section 5, we give some quantitative convergence estimates and a large deviations upper bound in the spirit of [8]. Finally, we present some numerical illustrations in Section 6.
2 Assumptions and notations
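In the sequel, we will always assume that the transportation cost $c$ is continuous and nonnegative, $c\in C(\mathbb{R}^d\times\mathbb{R}^d,\mathbb{R}_+)$, and that the fixed marginals of the problem, $\mu$ and $\nu$, are two Borel probability measures on $\mathbb{R}^d$, $\mu,\nu\in\mathcal{P}(\mathbb{R}^d)$, with compact support. Let $\Pi(\mu,\nu)$ be the set of transport plans between $\mu$ and $\nu$, i.e. the set of Borel probability measures on $\mathbb{R}^d\times\mathbb{R}^d$ having $\mu$ and $\nu$ as marginals. More precisely, a Borel probability measure $\gamma$ on $\mathbb{R}^d\times\mathbb{R}^d$ belongs to $\Pi(\mu,\nu)$ when
$$\gamma(A\times\mathbb{R}^d)=\mu(A),\qquad \gamma(\mathbb{R}^d\times A)=\nu(A)$$
for every Borel subset $A$ of $\mathbb{R}^d$. Note that every $\gamma$ in $\Pi(\mu,\nu)$ has its support in $\operatorname{spt}\mu\times\operatorname{spt}\nu$ and that $c$ is uniformly continuous on $\operatorname{spt}\mu\times\operatorname{spt}\nu$. We are interested in the following $\infty$-optimal transport problem (see [6]):
$$\inf_{\gamma\in\Pi(\mu,\nu)}\|c\|_{L^\infty(\gamma)},\qquad \|c\|_{L^\infty(\gamma)}:=\gamma\text{-}\operatorname{ess\,sup}\,c. \tag{$\infty$-OT}$$
In contrast with classical optimal transport where one minimizes an integral cost,
$$\inf_{\gamma\in\Pi(\mu,\nu)}\int_{\mathbb{R}^d\times\mathbb{R}^d}c\,\mathrm{d}\gamma, \tag{OT}$$
($\infty$-OT) is a non-convex and presumably harder problem.
Due to the success of entropic approximation of optimal transport with regularization parameter $\varepsilon>0$
$$\inf_{\gamma\in\Pi(\mu,\nu)}\int_{\mathbb{R}^d\times\mathbb{R}^d}c\,\mathrm{d}\gamma+\varepsilon H(\gamma|\mu\otimes\nu), \tag{$\varepsilon$-EOT}$$
recalled in the introduction, it seems natural to introduce, for $\varepsilon>0$ and exponent $p\ge1$, the following functional
$$J_{p,\varepsilon}(\gamma):=\Big(\int_{\mathbb{R}^d\times\mathbb{R}^d}c^p\,\mathrm{d}\gamma+\varepsilon H(\gamma|\mu\otimes\nu)\Big)^{1/p},\qquad \gamma\in\Pi(\mu,\nu),$$
where $H$ stands for relative entropy:
$$H(\gamma|\mu\otimes\nu):=\begin{cases}\displaystyle\int_{\mathbb{R}^d\times\mathbb{R}^d}\log\Big(\frac{\mathrm{d}\gamma}{\mathrm{d}(\mu\otimes\nu)}\Big)\,\mathrm{d}\gamma & \text{if } \gamma\ll\mu\otimes\nu,\\ +\infty & \text{otherwise.}\end{cases}$$
Note that due to strict convexity of the entropy, for every $\varepsilon>0$ and $p\ge1$, $J_{p,\varepsilon}$ admits a unique minimizer. We now denote by $J_\infty$ the supremal functional
$$J_\infty(\gamma):=\|c\|_{L^\infty(\gamma)},\qquad \gamma\in\Pi(\mu,\nu).$$
Finally, let us set
$$J_p(\gamma):=J_{p,0}(\gamma)=\Big(\int_{\mathbb{R}^d\times\mathbb{R}^d}c^p\,\mathrm{d}\gamma\Big)^{1/p},\qquad \gamma\in\Pi(\mu,\nu).$$
Since $H(\gamma|\mu\otimes\nu)\ge0$, we have $J_{p,\varepsilon}\ge J_p$ with an equality exactly when $\gamma=\mu\otimes\nu$, but also $J_p\le J_\infty$. So, roughly speaking, both approximations play in opposite directions: adding the entropic term is an approximation from above, but approximating $J_\infty$ by $J_p$ is an approximation from below.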
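The two opposite approximation effects described above can be checked numerically on a small discrete example. The following is a minimal sketch, assuming the functional has the form $J_{p,\varepsilon}(\gamma)=\big(\sum_{i,j}c_{ij}^p\gamma_{ij}+\varepsilon H(\gamma|\mu\otimes\nu)\big)^{1/p}$ for a plan given as a matrix; the function names and the $2\times2$ data are hypothetical and chosen only for illustration.

```python
import numpy as np

def relative_entropy(gamma, mu, nu):
    """H(gamma | mu x nu) = sum_ij gamma_ij log(gamma_ij / (mu_i nu_j)), with 0 log 0 = 0."""
    ref = np.outer(mu, nu)
    mask = gamma > 0
    return float(np.sum(gamma[mask] * np.log(gamma[mask] / ref[mask])))

def J_p_eps(gamma, cost, mu, nu, p, eps):
    """Assumed form of the penalized functional: (sum c^p gamma + eps * H)^(1/p)."""
    return float((np.sum(cost**p * gamma) + eps * relative_entropy(gamma, mu, nu)) ** (1.0 / p))

def J_p(gamma, cost, p):
    """L^p norm of the cost with respect to the plan (no entropy)."""
    return float(np.sum(cost**p * gamma) ** (1.0 / p))

def J_inf(gamma, cost):
    """Supremal functional: largest cost on the support of the plan."""
    return float(cost[gamma > 0].max())

# Hypothetical toy data: two sources, two targets, uniform weights.
mu = np.array([0.5, 0.5])
nu = np.array([0.5, 0.5])
cost = np.array([[1.0, 2.0], [2.0, 1.5]])
diag = np.diag(mu)        # plan sending x_i to y_i
indep = np.outer(mu, nu)  # product plan, zero relative entropy

for p in (2, 10, 50, 200):
    print(p, J_p(diag, cost, p), J_p_eps(diag, cost, mu, nu, p, 1.0), J_inf(diag, cost))
```

On this example the printed values of $J_p$ stay below both $J_{p,\varepsilon}$ and $J_\infty$, and $J_p$ approaches $J_\infty$ as $p$ grows.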
We also observe that letting and is not enough to ensure that minimizers of converge to a minimizer of (i.e. a solution of -OT). Indeed, if and the minimizer of satisfies
[TABLE]
hence converges (actually strongly by Pinsker’s inequality, see e.g. Lemma 2.5 in [16]) to which in general is not a minimizer of . On the one hand, this suggests that -convergence of the regularizations above to require conditions relating to . On the other hand, in the previous example, we see that the range of compared to the size of the entropic penalization is crucial. But the solutions of the -optimal transport problem are invariant when one replaces by an increasing function of , in particular one can replace by (say) so that will typically dominate the entropic term and one can expect -convergence as for a fixed (or even large) value of (see the next section for more details).
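The invariance invoked in the last sentence can be written out explicitly. The following short derivation is a sketch in which $\varphi$ is a hypothetical continuous, strictly increasing function on $[0,+\infty)$, introduced only for illustration, $\Pi(\mu,\nu)$ denotes the set of transport plans and $\|c\|_{L^\infty(\gamma)}$ the $\gamma$-essential supremum of the cost $c$.

```latex
% Essential suprema commute with continuous nondecreasing maps:
\[
  \|\varphi\circ c\|_{L^\infty(\gamma)}
  = \varphi\bigl(\|c\|_{L^\infty(\gamma)}\bigr)
  \qquad\text{for every } \gamma\in\Pi(\mu,\nu).
\]
% Since \varphi is strictly increasing, the minimizers are unchanged:
\[
  \operatorname*{argmin}_{\gamma\in\Pi(\mu,\nu)} \|\varphi\circ c\|_{L^\infty(\gamma)}
  = \operatorname*{argmin}_{\gamma\in\Pi(\mu,\nu)} \|c\|_{L^\infty(\gamma)} .
\]
```

For instance, $\varphi(t)=1+t$ produces a cost bounded from below by $1$ without changing the set of optimal plans.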
3 $\Gamma$-convergence
Our first result concerns the $\Gamma$-convergence of $J_{p,\varepsilon}$ to $J_\infty$:
Theorem 3.1.
Under the general assumptions of Section 2 we have:
* -converges (for the weak star topology of ) to as provided as ,* 2. 2.
if, in addition, with , then -converges to as provided
[TABLE]
In particular, in this case, and -converge to as .
Proof.
- Let converge weakly star to . By nonnegativity of , we have
[TABLE]
Hence, for fixed , since for , we have
[TABLE]
taking the supremum with respect to thus yields the desired -liminf inequality
[TABLE]
Let us now prove the -limsup inequality. For any we consider , the block approximation of at scale defined by (3.3) below, whose convergence to is guaranteed by the first inequality in (3.4). By concavity, we first have for ,
[TABLE]
Denoting by a modulus of continuity of on , thanks to the first inequality in (3.4), we have
[TABLE]
being the diameter of the cubes of the approximation. Moreover, by the second inequality in (3.4), we have
[TABLE]
so if we define as the block approximation of at scale (say), we obtain
[TABLE]
since we have assumed that as .
- Let us now assume that , the proof of the -liminf inequality for is exactly as above. For and the block approximation of at scale , we have
[TABLE]
so that, as soon as (3.1) holds, one has
[TABLE]
∎
Remark 3.2.
Notice that in case for some , -convergence of to is guaranteed even for fastly increasing like with . On the contrary, in the general case, the condition requires to choose values of way too small to be used in practice for numerical computations. This suggests in practice to rescale the cost so that it is bounded from below by .
Remark 3.3.
We observe that in (3.2) it is sufficient that , therefore the conclusion of case 2. in Theorem 3.1 remains valid under the weaker assumption that .
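For the $\Gamma$-limsup inequality, we have used the block approximation introduced in [4], which is defined as follows:
Definition 3.4.
Let $\gamma\in\Pi(\mu,\nu)$. For $\delta>0$ and $k\in\mathbb{Z}^d$, we denote by $Q_k^\delta$ the cube $\delta\,(k+[0,1)^d)$. The block approximation $\gamma^\delta$ of $\gamma$ at scale $\delta$ is then defined by
$$\gamma^\delta:=\sum_{k,l\in\mathbb{Z}^d}\gamma(Q_k^\delta\times Q_l^\delta)\,\mu_k^\delta\otimes\nu_l^\delta, \tag{3.3}$$
where $\mu_k^\delta$ and $\nu_l^\delta$ are defined by
$$\mu_k^\delta(A):=\begin{cases}\dfrac{\mu(A\cap Q_k^\delta)}{\mu(Q_k^\delta)} & \text{if } \mu(Q_k^\delta)>0,\\[1ex] 0 & \text{otherwise,}\end{cases}\qquad \nu_l^\delta(A):=\begin{cases}\dfrac{\nu(A\cap Q_l^\delta)}{\nu(Q_l^\delta)} & \text{if } \nu(Q_l^\delta)>0,\\[1ex] 0 & \text{otherwise,}\end{cases}$$
for every Borel subset $A$ of $\mathbb{R}^d$.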
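For discrete marginals the block approximation reduces to a few matrix operations. The sketch below assumes half-open cubes $\delta\,(k+[0,1)^d)$ and strictly positive weights; the function name `block_approximation` and the one-dimensional test data are hypothetical. It spreads the mass $\gamma(Q_k\times Q_l)$ of each pair of cubes proportionally to the restricted marginals, which preserves both marginals.

```python
import numpy as np

def block_approximation(gamma, x, y, delta):
    """Block approximation of a discrete plan at scale delta.

    gamma: (N, M) plan, x: (N, d) source points, y: (M, d) target points.
    Assumes every point carries strictly positive mass.
    """
    mu, nu = gamma.sum(axis=1), gamma.sum(axis=0)
    # Label each point by the cube delta * (k + [0, 1)^d) containing it.
    _, kx = np.unique(np.floor(x / delta).astype(int), axis=0, return_inverse=True)
    _, ky = np.unique(np.floor(y / delta).astype(int), axis=0, return_inverse=True)
    kx, ky = kx.ravel(), ky.ravel()
    # One-hot cube membership matrices.
    Px = np.eye(kx.max() + 1)[kx]   # (N, number of source cubes)
    Py = np.eye(ky.max() + 1)[ky]   # (M, number of target cubes)
    block_mass = Px.T @ gamma @ Py  # gamma(Q_k x Q_l)
    mu_cube, nu_cube = Px.T @ mu, Py.T @ nu
    # Normalized restrictions mu_i / mu(Q_k) and nu_j / nu(Q_l).
    wx = mu / mu_cube[kx]
    wy = nu / nu_cube[ky]
    return block_mass[kx][:, ky] * np.outer(wx, wy)

# Hypothetical 1D example: the identity plan between four points.
x = np.array([[0.1], [0.4], [0.6], [0.9]])
y = x.copy()
gamma = np.eye(4) / 4
gamma_delta = block_approximation(gamma, x, y, delta=0.5)
print(gamma_delta)
```

In this example the two cubes $[0,0.5)$ and $[0.5,1)$ each contain two points, so the identity plan is replaced by a plan that is uniform inside each diagonal block.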
For the sake of completeness, we give a short proof of the properties of the block approximation that we have used in the proof of Theorem 3.1 (see [4] and [5] for related results):
Lemma 3.5.
Let and be the block approximation of at scale , then and
[TABLE]
where is a constant depending only on (actually on its diameter).
Proof.
The fact that is easy to check by construction (see [4]). Now observe that by (3.3) the density of with respect to is
[TABLE]
Therefore
[TABLE]
where the inequality is due to the fact that , while the last equality is obtained summing over . If is such that is contained in a cube of side , the number of cubes with positive -measure is not greater than . Therefore, applying Jensen’s inequality to the concave function , we have
[TABLE]
which proves the second inequality in (3.4).
By construction , for any . Let be the set of pairs of indices such that and set , for any . We define
[TABLE]
where and . By construction , thus
[TABLE]
∎
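4 Selection of plans with $\infty$-cyclically monotone support
As shown in [6] and [10], restrictable minimizers of $J_\infty$ are supported on $\infty$-cyclically monotone sets; such sets are defined as follows:
Definition 4.1.
A set $\Gamma\subset\mathbb{R}^d\times\mathbb{R}^d$ is said to be $\infty$-cyclically monotone if we have that
$$\max_{1\le i\le k}c(x_i,y_i)\le\max_{1\le i\le k}c(x_i,y_{i+1})$$
for all $k\in\mathbb{N}$ and $(x_1,y_1),\dots,(x_k,y_k)\in\Gamma$, where $y_{k+1}:=y_1$. A transport plan $\gamma$ is said to be $\infty$-cyclically monotone if $\operatorname{spt}\gamma$ is an $\infty$-cyclically monotone set.
Since every permutation can be obtained as a composition of cycles on disjoint sets and trivial cycles on fixed points, one can see that $\infty$-cyclical monotonicity of a set $\Gamma$ is equivalent to the fact that for every $k$, every $(x_1,y_1),\dots,(x_k,y_k)\in\Gamma$ and every $\sigma\in\mathfrak{S}_k$ (where $\mathfrak{S}_k$ is the permutation group of $\{1,\dots,k\}$), one has
$$\max_{1\le i\le k}c(x_i,y_i)\le\max_{1\le i\le k}c(x_i,y_{\sigma(i)}).$$
Usually, in the literature, the previous definition is called $\infty$-$c$-cyclical monotonicity; to keep notations simple, we have omitted the dependence on the cost $c$. Let us remark that $\infty$-cyclical monotonicity is invariant by replacing $c$ by a strictly increasing transformation of $c$ (like $c^p$ with $p\ge1$), contrarily to the usual notion of $c$-cyclical monotonicity. We recall that a nonempty subset $\Gamma$ of $\mathbb{R}^d\times\mathbb{R}^d$ is called $c$-cyclically monotone when for every $k$, every $(x_1,y_1),\dots,(x_k,y_k)\in\Gamma$ and every permutation $\sigma\in\mathfrak{S}_k$, one has
$$\sum_{i=1}^k c(x_i,y_i)\le\sum_{i=1}^k c(x_i,y_{\sigma(i)}). \tag{4.1}$$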
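For a finite set of pairs, the permutation form of $\infty$-cyclical monotonicity can be tested by brute force. The following sketch is only meant to make the definition concrete: the function name and the one-dimensional examples are hypothetical, and the enumeration of all subsets and permutations is exponential, so it is usable for very small sets only.

```python
from itertools import combinations, permutations

def is_inf_cyclically_monotone(pairs, cost, tol=1e-12):
    """Check max_i c(x_i, y_i) <= max_i c(x_i, y_sigma(i)) for every subset
    of the given pairs and every permutation sigma of that subset."""
    for k in range(2, len(pairs) + 1):
        for subset in combinations(pairs, k):
            lhs = max(cost(x, y) for x, y in subset)
            for sigma in permutations(range(k)):
                rhs = max(cost(subset[i][0], subset[sigma[i]][1]) for i in range(k))
                if lhs > rhs + tol:
                    return False
    return True

def abs_cost(x, y):
    return abs(x - y)

def squared_cost(x, y):
    return abs(x - y) ** 2

monotone_pairs = [(0.0, 1.0), (1.0, 2.0)]  # swapping targets raises the max cost
crossing_pairs = [(0.0, 2.0), (1.0, 1.0)]  # swapping targets lowers the max cost

print(is_inf_cyclically_monotone(monotone_pairs, abs_cost))  # True
print(is_inf_cyclically_monotone(crossing_pairs, abs_cost))  # False
```

Replacing `abs_cost` by `squared_cost` gives the same answers, which illustrates the invariance under strictly increasing transformations of the cost. Note also that `crossing_pairs` satisfies the sum inequality (4.1) for the cost $|x-y|$ (both sums equal $2$) while failing the max inequality.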
Our goal in this section is to investigate the convergence of the entropic approximation to $\infty$-cyclically monotone plans. We shall make use of the analysis of the landmark recent article [8]. Let us first recall the notion of $(c,\varepsilon)$-cyclical invariance introduced in [8]:
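Definition 4.2.
Let $c:\mathbb{R}^d\times\mathbb{R}^d\to\mathbb{R}_+$ be a measurable function. A coupling $\gamma\in\Pi(\mu,\nu)$ is called $(c,\varepsilon)$-cyclically invariant if $\gamma\sim\mu\otimes\nu$ and its density admits a representative $\frac{\mathrm{d}\gamma}{\mathrm{d}(\mu\otimes\nu)}:\mathbb{R}^d\times\mathbb{R}^d\to(0,\infty)$ such that
$$\prod_{i=1}^k\frac{\mathrm{d}\gamma}{\mathrm{d}(\mu\otimes\nu)}(x_i,y_i)=\exp\Big(-\frac{1}{\varepsilon}\Big[\sum_{i=1}^k c(x_i,y_i)-\sum_{i=1}^k c(x_i,y_{i+1})\Big]\Big)\prod_{i=1}^k\frac{\mathrm{d}\gamma}{\mathrm{d}(\mu\otimes\nu)}(x_i,y_{i+1})$$
for all $k\in\mathbb{N}$ and $(x_i,y_i)_{i=1}^k\subset\mathbb{R}^d\times\mathbb{R}^d$, where $y_{k+1}:=y_1$.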
In [8] (Proposition 2.2), it is shown that whenever ($\varepsilon$-EOT) is finite, the (unique) solution of ($\varepsilon$-EOT) is characterized by being $(c,\varepsilon)$-cyclically invariant. The next lemma, which is a part of Lemma 3.1 in [8], provides an estimate for $(c,\varepsilon)$-cyclically invariant couplings, which will be useful for our purpose. For the reader's convenience, we also provide the proof here.
Lemma 4.3.
Let and be -cyclical invariant. For every fixed , , and , let be the set defined by
[TABLE]
where . Let be Borel. Then satisfies
[TABLE]
Proof.
By Definition 4.2 of -cyclical invariance, for a.e. we have that
[TABLE]
In one defines the set , by integrating over with respect to we obtain
[TABLE]
∎
The fact that the entropic approximation procedure selects $\infty$-cyclically monotone plans is then ensured by the following:
Theorem 4.4.
Under the general assumptions of Section 2, further assume that everywhere, and let be the minimizer of . Then, any weak star cluster point as of the family is -cyclically monotone, provided
* as ,* 2. 2.
* if, in addition, with .*
Proof.
Up to extracting a subsequence, let us assume that weakly star converges to . We proceed by contradiction assuming that there exist and a finite sequence of points contained in , such that
[TABLE]
By the continuity of the cost function and by the uniform convergence of to , as , we deduce that for every there exists an open neighborhood of and , such that
[TABLE]
for every (again with the convention that ) and . We now observe that
[TABLE]
where the last inequality follows from the convexity of , with . Since there exists some such that on each , , hence, for every and
[TABLE]
We thus have , where is defined as in (4.2) with replaced by . Applying Lemma 4.3, we thus get:
[TABLE]
so that if as , for large enough one has , which yields
[TABLE]
On the other hand, since the points belong to , we have that , which yields the desired contradiction. This shows the first assertion. Now, if with , we can replace by in (4.5) and the same conclusion will be reached as soon as , proving the second assertion.
∎
Remark 4.5.
Despite what we observed in Remark 3.3 regarding Theorem 3.1, in the proof of the second assertion of Theorem 4.4, it does not seem that the condition for every can be weakened to . Note also that the condition is stronger than condition (3.1) that guarantees -convergence when .
5 Some estimates on the speed of convergence
Our aim in this Section is to give some error estimates for where
[TABLE]
where (i.e. for the sake of simplicity we take as entropic penalization parameter).
5.1 Upper bounds
Proposition 5.1 (Upper bounds on the speed of convergence).
Let , with and let us assume that for some . Then we have
[TABLE]
Proof.
Let be a minimizer of and be the block approximation of at scale , as defined in (3.3). We observe that, by construction and by the Hölder condition on , denoting by the semi-norm of , we first have
[TABLE]
Then
[TABLE]
where the last inequality follows from Lemma 3.5. For , choosing , (5.2) becomes (setting )
[TABLE]
then, we observe that for large , one has
[TABLE]
Therefore, for large enough,
[TABLE]
for some and .
Now if , we choose in (5.2) which gives
[TABLE]
which ends the proof. ∎
5.2 Upper and lower bounds in the discrete case
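Let us now consider the discrete case where there exist $N,M\in\mathbb{N}$ and points $x_1,\dots,x_N$, $y_1,\dots,y_M$ in $\mathbb{R}^d$ such that
$$\mu=\sum_{i=1}^N\mu_i\delta_{x_i},\qquad \nu=\sum_{j=1}^M\nu_j\delta_{y_j}, \tag{5.3}$$
with (strictly, without loss of generality) positive weights $\mu_i$ and $\nu_j$ summing to $1$. To shorten notations let us set $c_{ij}:=c(x_i,y_j)$. In this setting, transport plans will simply be denoted as $N\times M$ matrices with entries $\gamma_{ij}$. We also recall that in the discrete setting $\Pi(\mu,\nu)$ is a convex polytope and the constraint $\gamma\in\Pi(\mu,\nu)$ is equivalent to
$$\gamma_{ij}\ge0,\qquad \sum_{j=1}^M\gamma_{ij}=\mu_i,\qquad \sum_{i=1}^N\gamma_{ij}=\nu_j,\qquad i=1,\dots,N,\ j=1,\dots,M.$$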
In the discrete setting transport plans have a finite entropy with respect to , with the (crude) bound
[TABLE]
for every . So if with , taking a minimizer of , we obtain
[TABLE]
which gives (in a straightforward way, i.e. without using block approximation) an exponentially decaying upper bound for for and an algebraic upper bound if . The fact that therefore ensures that is bounded from above. It turns out, that in the discrete setting, this condition also guarantees that we also have an algebraically decaying lower bound for the error. To see this, we first need the following:
Lemma 5.2.
Let and be discrete measures i.e. of the form (5.3) and define
[TABLE]
and for every ,
[TABLE]
then there is some such that , for every .
Proof.
Since is the minimum of over , one can write as the set of transport plans for which
[TABLE]
or equivalently
[TABLE]
In other words, is the facet of where the linear form (which is nonnegative on ) achieves its minimum and it is therefore a convex polytope, whose extreme points belong to the (finite) set of extreme points of . Let us then denote by with a finite index set the set of extreme points of . Thanks to Minkowski’s theorem, we can write any as
[TABLE]
for some weights summing to . In particular we may pick with (with denoting the cardinality of ). Then we have
[TABLE]
where the strict positivity of then follows from the fact that is finite and for every . ∎
We are now ready to prove the announced lower bound.
Proposition 5.3 (Lower bound on the speed of convergence, discrete case).
Assume that and are discrete measures i.e. of the form (5.3) and that , then is bounded from below. Hence
[TABLE]
Proof.
Let us argue by contradiction and assume that is unbounded from below, then there is a sequence as such that
[TABLE]
Letting be the minimizer of , passing to a subsequence if necessary, we may assume that converges to some which belongs to (as defined in Lemma 5.2) since . In particular, there exists such that
[TABLE]
where is the lower bound from Lemma 5.2. Since converges to we have, for large enough , , hence, using the fact that and again the nonnegativity of the entropy
[TABLE]
which is the desired contradiction to (5.4).
∎
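The finiteness of the entropy of discrete plans, used at the beginning of this subsection, can be illustrated numerically. The sketch below uses one possible crude bound, which is an assumption since the exact constant of the bound above is not reproduced here: because $\gamma_{ij}\le1$, one has $H(\gamma|\mu\otimes\nu)\le-\log\min_{i,j}\mu_i\nu_j$. The weights and plans are hypothetical.

```python
import numpy as np

def relative_entropy(gamma, mu, nu):
    """H(gamma | mu x nu) for a discrete plan, with the convention 0 log 0 = 0."""
    ref = np.outer(mu, nu)
    mask = gamma > 0
    return float(np.sum(gamma[mask] * np.log(gamma[mask] / ref[mask])))

# Hypothetical weights.
mu = np.array([0.2, 0.3, 0.5])
nu = np.array([0.6, 0.4])

# Assumed crude bound: gamma_ij <= 1 implies H <= -log(min_ij mu_i nu_j).
crude_bound = float(-np.log(np.min(np.outer(mu, nu))))

# Two plans with marginals mu and nu: the product plan and a sparse plan.
product_plan = np.outer(mu, nu)
sparse_plan = np.array([[0.2, 0.0], [0.3, 0.0], [0.1, 0.4]])

for plan in (product_plan, sparse_plan):
    print(relative_entropy(plan, mu, nu), "<=", crude_bound)
```

The product plan has zero entropy, while the sparse plan has a strictly positive entropy that remains below the crude bound.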
5.3 A large deviations upper bound
In this (somehow independent) paragraph, our goal is to discuss a (partial) extension of the large deviations results of [8] to the -optimal transport framework. Considering the Monge-Kantorovich problem (OT) it is well-known (see [9], [15]) that the optimality for (OT) of a plan is characterized by a property of -cyclical monotonicity of its support , where -cyclical monotonicity is defined by (4.1). To analyze fine convergence properties of the entropic approximation of (OT), defined by (-EOT), assuming convergence (taking a subsequence if necessary) as , of the minimizer of (-EOT) to some and denoting by the -cyclically monotone set , the authors of [8] introduced
[TABLE]
with . They proved that is a good rate function for the family of optimal entropic plans, in the sense that it obeys, under very general conditions, the large deviations principle
[TABLE]
for every compact and every open included in . Denoting by the minimizer of , the results of [8] (using instead of ) of course apply to the convergence of as for a fixed exponent . For optimal transport, it makes more sense to rather consider the situation where is fixed and tends to . More precisely, we know from Theorem 4.4, that if , is fixed, the family weakly star converges (again possibly after an extraction) to some as , is -cyclically monotone. In addition to the general assumptions of Section 2, we shall further assume throughout this paragraph that
- •
,
- •
being fixed, the sequence of minimizers weakly star converges as to some , with (-cyclically monotone) support .
Let us define for every
[TABLE]
where . Also define
[TABLE]
where and . In our supremal optimal transport setting, we cannot really expect that is a good rate function for ; indeed, is unchanged when replacing with a strictly increasing function of , while the same does not hold for the function . However it can be interesting to have a better understanding of the function , which still provides an upper bound for the family (see Proposition 5.6).
Lemma 5.4.
Let and be defined as above, then
- •
* and are related by ,*
- •
* and are lower semicontinuous, , on ,*
- •
* and coincide on .*
Proof.
The fact that is obvious as well as the fact that on .
We now prove the converse inequality. Fix now , , in and . We can then partition into the (possibly empty) set of fixed-points of and disjoint (empty if is the identity) orbits on each of which is a cycle, this means that for , we may denote as and as with the convention . We now observe that
[TABLE]
where the max with respect to is taken on indices for which is nonempty. To shorten notations, for such a let us set
[TABLE]
Of course if is nonempty, , now if and is nonempty
[TABLE]
So, if , and if , then , hence by the definition of and the fact that is -cyclically monotone. In other words, we can bound from above each by . Taking suprema with respect to , in and , we thus get . Moreover, since on , on
Lower semi continuity of and follows from the continuity of . Finally assume that and , since is compact and , there exists such that . Taking , as a competitor in the definition of we see that hence . The same argument shows that and coincide on . ∎
Lemma 5.5.
Let us fix . Suppose that for some , , and , we have
[TABLE]
Then there exist , and such that
[TABLE]
where is the minimizer of .
Proof.
Of course if , one can just take so we may assume that . Reasoning as in the proof of Theorem 4.4 (recall that we have assumed ), we know that there exist and such that
[TABLE]
for every and . Then so, thanks to Lemma 4.3,
[TABLE]
Moreover , for all , for some since then
[TABLE]
for all (possibly replacing with a larger one). ∎
Proposition 5.6.
Under the assumptions of this paragraph, for any compact set , one has
[TABLE]
Proof.
First note that since is supported on ,
[TABLE]
and there is nothing to prove if is disjoint from . Therefore we can assume that . It then follows from Lemma 5.4 that
[TABLE]
Now let and . By definition of there exist and , such that (setting as usual and )
[TABLE]
Note that the truncation is used to handle the case where . By Lemma 5.5 we know that there exist such that
[TABLE]
Then
[TABLE]
and, by compactness of ,
[TABLE]
which, letting , yields the desired upper bound. ∎
6 Numerical results
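In this section, we present several numerical examples, with the aim of illustrating the discussions and theoretical analysis of the previous sections. We shall consider discrete marginals; let $\mu=\sum_{i=1}^N\mu_i\delta_{x_i}$ and $\nu=\sum_{j=1}^M\nu_j\delta_{y_j}$. With a slight abuse of notation, we will denote by $\mu$ and $\nu$ both the measures and the vectors of weights $(\mu_i)_i$ and $(\nu_j)_j$, and $\gamma$ will denote both the transport plan and the matrix $(\gamma_{ij})_{i,j}$. For fixed $p$ and $\varepsilon$, in this discrete setting, the minimization of $J_{p,\varepsilon}$ reads
$$\min_{\gamma\in\Pi(\mu,\nu)}\Big(\sum_{i,j}c_{ij}^p\,\gamma_{ij}+\varepsilon\sum_{i,j}\gamma_{ij}\log\Big(\frac{\gamma_{ij}}{\mu_i\nu_j}\Big)\Big)^{1/p},\qquad c_{ij}:=c(x_i,y_j).$$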
Raising the above cost to the power , which does not change the minimizer, leads to a standard entropic transport problem. For such problems, we used in all our examples Sinkhorn’s algorithm (see for instance Chapter 4 in [14]) to find a good approximation (with error smaller than ) of the solution.
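The reduction to a standard entropic problem can be sketched as follows. This is a minimal illustration of Sinkhorn's iterations applied to the cost $c^p$ with reference measure $\mu\otimes\nu$, not the authors' implementation; the function name, the stopping rule on the first marginal, the tolerance and the test data are assumptions.

```python
import numpy as np

def sinkhorn(cost, mu, nu, p, eps, tol=1e-9, max_iter=100_000):
    """Sinkhorn iterations for min <c^p, gamma> + eps * H(gamma | mu x nu).

    The plan is sought in the form gamma_ij = mu_i u_i K_ij v_j nu_j.
    """
    K = np.exp(-cost**p / eps)              # Gibbs kernel
    u, v = np.ones_like(mu), np.ones_like(nu)
    for _ in range(max_iter):
        u = 1.0 / (K @ (nu * v))            # enforce the first marginal
        v = 1.0 / (K.T @ (mu * u))          # enforce the second marginal
        gamma = (mu * u)[:, None] * K * (nu * v)[None, :]
        # After the v-update the second marginal is exact; test the first one.
        if np.abs(gamma.sum(axis=1) - mu).sum() < tol:
            break
    return gamma

# Hypothetical 1D data with cost |x - y| and uniform weights.
x = np.array([0.0, 0.5, 1.0])
y = np.array([0.1, 0.6, 0.9])
cost = np.abs(x[:, None] - y[None, :])
mu = nu = np.full(3, 1.0 / 3.0)
gamma = sinkhorn(cost, mu, nu, p=2, eps=0.05)
print(np.round(gamma, 4))
```

With these moderate values of $p$ and $\varepsilon$ the kernel entries stay well within floating-point range; for large $p$ the kernel may underflow, in which case a log-domain variant is preferable.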
If , in light of Theorem 3.1, we expect the output of the Sinkhorn algorithm to be, for suitable and , also a good approximation of an optimal plan for the discretized - optimal transport problem
[TABLE]
Furthermore, if , thanks to Theorem 4.4, we expect to find a plan close to an -cyclically monotone one.
Remark 6.1.
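As the set of transport plans $\Pi(\mu,\nu)$ is a convex polytope, for any $\gamma\in\Pi(\mu,\nu)$ there exists a finite set of indices $I$ such that $\gamma=\sum_{i\in I}\lambda_i\gamma^i$, with $\lambda_i>0$, $\sum_{i\in I}\lambda_i=1$ and each $\gamma^i$ an extreme point of $\Pi(\mu,\nu)$. If $N=M$ and $\mu_i=\nu_j=\frac{1}{N}$, the set $N\,\Pi(\mu,\nu)$ is the set of the so-called bi-stochastic matrices, whose extreme points, by Birkhoff's theorem, form the set of permutation matrices. We observe that, by definition of $J_\infty$, $J_\infty(\gamma)=\max_{i\in I}J_\infty(\gamma^i)$ and thus the minimum of $J_\infty$ is attained at some permutation matrix. Therefore, if $N=M$ and $\mu_i=\nu_j=\frac{1}{N}$,
$$\min_{\gamma\in\Pi(\mu,\nu)}J_\infty(\gamma)=\min_{\sigma}\max_{1\le i\le N}c(x_i,y_{\sigma(i)}),$$
where the minimum on the right-hand side is taken over all permutations $\sigma$ of $\{1,\dots,N\}$.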
This can be in principle used to compute exactly. However this is not particularly useful in practice; regarding for instance the example on bottom of Figure 4, even if the size of and is the same, in order to calculate the exact value of we should be able to perform evaluations, which is infeasible in practice!
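The exhaustive search described in this remark can be sketched in a few lines for uniform marginals on the same number of points. The function name and the sample points below are hypothetical; the loop visits all $N!$ permutations, which is exactly why the approach is limited to very small $N$.

```python
from itertools import permutations
from math import dist

def min_sup_cost_uniform(xs, ys, cost):
    """Return min over permutations sigma of max_i cost(x_i, y_sigma(i)),
    together with a minimizing permutation. Requires len(xs) == len(ys)."""
    n = len(xs)
    best_value, best_sigma = float("inf"), None
    for sigma in permutations(range(n)):
        value = max(cost(xs[i], ys[sigma[i]]) for i in range(n))
        if value < best_value:
            best_value, best_sigma = value, sigma
    return best_value, best_sigma

# Hypothetical planar example with the Euclidean distance as cost.
xs = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
ys = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
value, sigma = min_sup_cost_uniform(xs, ys, dist)
print(value, sigma)  # 1.0 (0, 1, 2)
```

Here the vertical matching is optimal because any other permutation moves at least one point horizontally by at least $1$, giving a distance of at least $\sqrt{2}$.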
All the examples in this section, will be in dimension , will be represented by blue points, by red points and the plan will be represented by arrows: the black ones indicate that a blue point is sent to a red point with high probability, while the gray ones indicate that a blue point is sent to a red point with lower probability (but still not negligible).
In the first example, as shown by Figure 1, we consider , for , which is uniformly concentrated on the blue points
[TABLE]
and on the red points
[TABLE]
Note that with this choice of and , everywhere and therefore, thanks to Theorem 3.1 and Theorem 4.4, -convergence and convergence of the outputs towards -cm plans still hold choosing . We observe that for , every transport plan is optimal. Indeed, by the orthogonality of the two supports, any plan is concentrated on a cyclically monotone set (see (4.1)) and, as recalled in Section 5.3 (see for instance [9, 15]), this is a sufficient optimality condition. Here, since we look for a plan which minimizes the regularized problem which involves the entropy, the Sinkhorn algorithm selects the most diffuse one, as evidenced by the picture on the upper left of Figure 1. The other three pictures in Figure 1 show that convergence towards an -cm plan is really fast and it occurs already for .
Regarding the accuracy, Figure 2 shows that for and the distance between the first marginal of the output and the distance between the second marginal of and is of the order of after only iterations.
We have also considered the same example (see Figure 3) with the cost function . In this case the convergence is still fast and the error is small after few iterations (of order after about iterations).
Remark 6.2.
When , on the one hand, we don’t need to be small and we can even take it large as grows (by case 2. in Theorem 3.1 we can even choose for instance ). On the other hand, we can encounter some difficulties when computing the Gibbs kernel : if is large it can happen that, for some , making impossible to perform the division in the iterations of the primal version of the Sinkhorn algorithm. Fortunately, this problem can be overcome using the Log-Domain version (see for instance Section 4.4 in [14]), as we did in the following example, represented by Figure 4.
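The underflow issue mentioned in this remark, and its log-domain remedy, can be illustrated on a tiny example. The sketch below is a generic log-domain Sinkhorn written with a hand-rolled stable `logsumexp`; it is not the authors' code, and the function names, iteration count and data are assumptions. With the cost values $1.5$ and $2$ and $p=20$, the kernel entries $\exp(-1.5^{20})$ and $\exp(-2^{20})$ are exactly zero in double precision, so the primal iterations would divide by zero.

```python
import numpy as np

def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along the given axis."""
    m = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(m, axis=axis) + np.log(np.sum(np.exp(a - m), axis=axis))

def sinkhorn_log(cost, mu, nu, p, eps, n_iter=200):
    """Log-domain Sinkhorn for min <c^p, gamma> + eps * H(gamma | mu x nu).

    f and g are the logarithms of the scalings u and v; the Gibbs kernel is
    never exponentiated on its own.
    """
    log_K = -cost**p / eps
    log_mu, log_nu = np.log(mu), np.log(nu)
    f, g = np.zeros_like(mu), np.zeros_like(nu)
    for _ in range(n_iter):
        f = -logsumexp(log_K + (g + log_nu)[None, :], axis=1)
        g = -logsumexp(log_K + (f + log_mu)[:, None], axis=0)
    log_gamma = (f + log_mu)[:, None] + log_K + (g + log_nu)[None, :]
    return np.exp(log_gamma)

# Hypothetical data for which the primal kernel underflows.
mu = np.array([0.5, 0.5])
nu = np.array([0.5, 0.5])
cost = np.array([[1.0, 2.0], [2.0, 1.5]])
gamma = sinkhorn_log(cost, mu, nu, p=20, eps=1.0)
print(gamma)
```

On this example the output is, up to rounding, the diagonal plan with mass $1/2$ on each of the pairs $(x_1,y_1)$ and $(x_2,y_2)$.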
Figure 4, which shows a comparison among three different examples, considered for on the left and for on the right and . The two pictures on top in Figure 4 show the representation by arrows of the output when is uniformly concentrated on points which discretize the unitary square and is uniformly concentrated on the points and . This is a discretization of the case uniform on the square , where (see also Example 2.2 in [6]) every is optimal for the problem
[TABLE]
Indeed
[TABLE]
for every Since every plan is optimal, when is smaller, as shown in the picture on the left, the role of the entropy is more important and the algorithm selects the most diffuse plan. While increasing the value of the entropy becomes more and more negligible and output becomes sparser: already for (on the right) the output is a good approximation of the -cyclically monotone plan, which in this case is unique (see Theorem 5.6 in [6]). A small variation, represented by the two figures in the middle, is to consider which is not uniformly concentrated on the points and . Here we have taken . Finally, on the bottom, we have implemented the case in which also is the discretization of an absolutely continuous measure. Here approximates the square and the rectangle and both measures are supported on points. As previously, one can notice that for the entropy plays an important role and the algorithm selects the most diffuse plan, while, already for the plan is considerably sparser.
We are now interested in the asymptotic behavior of and we want to numerically represent the upper and lower bounds on the speed of convergence of towards proved in Proposition 5.1 and Proposition 5.3. In order to apply Proposition 5.1 and Proposition 5.3 it is enough to assume a lower bound on and not a pointwise one on .
Figure 5 provides an example of the asymptotic behavior of and of the speed of convergence in the case of and as the ones represented in the two pictures on top of Figure 4. In light of what we have just remarked, we have re-scaled the cost in order to have . For the image on top of Figure 5 shows in blue how changes varying , while is constant and is represented by the orange line. On bottom of Figure 5 we have represented in blue , in green the upper bound and in orange the lower bound , where by Proposition 5.1 (indeed in this case is Lipschitz so ) and have been estimated by a linear regression method (by least squares).
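A least-squares estimate of this kind can be sketched as follows. The parametric form $C\,p^{-\beta}$ fitted below is an assumption motivated by the algebraic bounds discussed in Section 5; the exact model used for Figure 5 is not reproduced here, and the data are synthetic, for illustration only.

```python
import numpy as np

def fit_power_law(p_values, errors):
    """Least-squares fit of log(error) = log(C) - beta * log(p).

    Returns the pair (C, beta).
    """
    slope, intercept = np.polyfit(np.log(p_values), np.log(errors), deg=1)
    return float(np.exp(intercept)), float(-slope)

# Synthetic data (not taken from the paper's experiments): error = 2 / p.
p_values = np.array([10.0, 20.0, 40.0, 80.0, 160.0])
errors = 2.0 / p_values
C, beta = fit_power_law(p_values, errors)
print(C, beta)
```

Fitting in log-log coordinates turns the power law into a straight line, so an ordinary linear regression recovers both the constant and the exponent.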
Finally, an example in which it is possible (even if it is really slow!) to compute exactly (see Remark 6.1) is represented in Figure 6. Here is concentrated on points, given by
[TABLE]
and is concentrated on equidistant points of the segment starting from the point to the point of the line . We have computed for the cost applying Remark 6.1, and we have obtained that and that the points which are at the minimal-maximal distance are and , connected by the purple segment in the picture. Regarding the speed of convergence we rescaled the cost in order to decrease further . As shown in Figure 7, is calculated varying in the interval , with . We observe that in this case, as shown in the picture on top, is initially smaller than , then it increases becoming greater and finally it starts decreasing converging to .
Acknowledgments: G.C. acknowledges the support of the Lagrange Mathematics and Computing Research Center. The research of C.B. and L.D.P. is partially financed by the “Fondi di ricerca di ateneo, ex 60%” of the University of Firenze and is part of the project “Alcuni problemi di trasporto ottimo ed applicazioni” of the GNAMPA-INDAM. C.B. and G.C. also acknowledge the support of the French Agence Nationale de la Recherche through the project MAGA (ANR-16-CE40-0014).
- [1] Mohit Bansil and Jun Kitagawa. $\mathcal{W}_\infty$-transport with discrete target as a combinatorial matching problem. Arch. Math. (Basel), 117(2):189–202, 2021.
- [2] Jean-David Benamou. Optimal transportation, modelling and numerical simulation. Acta Numer., 30:249–325, 2021.
- [3] Camilla Brizzi, Luigi De Pascale, and Anna Kausamo. $l^{\infty}$-optimal transport for a class of quasiconvex cost functions, 2021.
- [4] Guillaume Carlier, Vincent Duval, Gabriel Peyré, and Bernhard Schmitzer. Convergence of entropic schemes for optimal transport and gradient flows. SIAM J. Math. Anal., 49(2):1385–1418, 2017.
- [5] Guillaume Carlier, Paul Pegon, and Luca Tamanini. Convergence rate of general entropic optimal transport costs, 2022.
- [6] Thierry Champion, Luigi De Pascale, and Petri Juutinen. The $\infty$-Wasserstein distance: local solutions and existence of optimal transport maps. SIAM J. Math. Anal., 40(1):1–20, 2008.
- [7] Marco Cuturi. Sinkhorn distances: Lightspeed computation of optimal transport. Advances in Neural Information Processing Systems, 26, 2013.
- [8] Espen Bernton, Promit Ghosal, and Marcel Nutz. Entropic optimal transport: Geometry and large deviations. arXiv:2102.04397, 2021.
