Gaussian approximation for penalized Wasserstein barycenters
Nazar Buzun

TL;DR
This paper establishes a Gaussian approximation for regularized Wasserstein barycenters in Fourier basis, providing explicit convergence rates depending on sample size and parameter dimension.
Contribution
It introduces a Gaussian approximation for Fourier-based Wasserstein barycenters and derives finite-sample convergence rates with explicit dependence on key parameters.
Findings
Random Fourier parameters of barycenters converge to a Gaussian distribution.
Explicit convergence rates depend on the number of measures ($n$) and parameter dimension ($p$).
Theoretical results facilitate understanding of statistical properties of Wasserstein barycenters.
Abstract
In this work we consider regularized Wasserstein barycenters (averages in Wasserstein distance) in the Fourier basis. We prove that the random Fourier parameters of the barycenter converge in distribution to a Gaussian random vector. The convergence rate is derived in the finite-sample case with explicit dependence on the number of measures ($n$) and the dimension of the parameters ($p$).
Affiliation
Skolkovo Institute of Science and Technology, Moscow.
Key words and phrases:
barycenters, Wasserstein distance, Gaussian approximation, multivariate central limit theorem, high-dimensional statistics, convex analysis.
2000 Mathematics Subject Classification:
62H10.
1. Introduction
The Monge-Kantorovich distance, also called the Wasserstein distance, is a distance between measures. It represents the cost of transporting one measure into another:
[TABLE]
where the constraint on the transport plan means that it has the two given measures as its marginal distributions. We focus on the regularized distance over the underlying probability space
[TABLE]
where the regularization term is a relatively small addition that improves the differentiability properties of the distance: without it we can bound only the first derivative, whereas with it we can bound the second derivative as well. The notion of a mean with respect to the Wasserstein distance is called the barycenter, and it is the main object of this research. Consider a set of $n$ random measures. By definition, in the regularized case the empirical and reference barycenters are
[TABLE]
and
[TABLE]
Barycenters generalize the center of mass. The barycenter of a set of uniform measures extracts the common “shape” of these measures. If the measures are sampled from some distribution, then their barycenter can be treated as an empirical approximation of the distribution mean. A simple example is a set of circles with given means (centers) and radii.
[TABLE]
[TABLE]
[TABLE]
Their barycenter is also a circle, with its own mean and radius. We refer to the papers [BinWas, PenWas, bc_coords_proj] for an overview of barycenters and related studies.
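The circles example can be simulated directly. The following is a minimal sketch, assuming that the barycenter of the circles is the circle whose center and radius are the sample averages of the individual centers and radii (an assumption here, since the formulas did not survive in this copy). The laws of the centers and radii, and the names `circle_barycenter` and `scaled_deviations`, are hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def circle_barycenter(centers, radii):
    """Averaged center and radius (ASSUMED form of the barycenter of circles)."""
    return centers.mean(axis=0), radii.mean()

def scaled_deviations(n, repeats=2000):
    """sqrt(n) * (averaged radius - true mean radius) over many repetitions."""
    out = np.empty(repeats)
    for k in range(repeats):
        centers = rng.normal(size=(n, 2))        # hypothetical law of the centers
        radii = rng.uniform(1.0, 2.0, size=n)    # hypothetical law of the radii, mean 1.5
        _, r_bar = circle_barycenter(centers, radii)
        out[k] = np.sqrt(n) * (r_bar - 1.5)
    return out

dev = scaled_deviations(n=200)
# Uniform(1, 2) has variance 1/12, so the CLT predicts a standard
# deviation close to sqrt(1/12) for the rescaled radius deviation.
print(dev.std(), np.sqrt(1 / 12))
```

Under these assumptions the rescaled deviations should look approximately Gaussian with the variance of a single radius, which is the classical Central Limit Theorem behaviour.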
The barycenter in the previous example does not have an explicit formula, but we will show below that it admits a Gaussian approximation. It is well known that the center of mass in norm converges to a Gaussian random vector. As for the barycenter, it is also expected to have some Gaussian properties. For example, if the measures are themselves Gaussian, one-dimensional, or a set of circles, then the Gaussian approximation of the barycenter is proven in the papers [BinWas, g_was_clt]. In the case of circles, the mean and radius converge to Gaussian variables as sums of independent observations, by the Central Limit Theorem. In the one-dimensional case, denoting the distribution functions by
[TABLE]
one gets
[TABLE]
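For the one-dimensional case, a minimal sketch follows. It assumes the standard quantile-averaging identity for the $W_2$ barycenter on the line (the barycenter's quantile function is the average of the individual quantile functions); whether this matches the displayed formula, which did not survive in this copy, is an assumption. The function name `quantile_barycenter` and the Gaussian input measures are hypothetical.

```python
import numpy as np

def quantile_barycenter(samples, grid_size=99):
    """Average the empirical quantile functions on a common grid of levels."""
    levels = np.linspace(0.01, 0.99, grid_size)
    quantiles = np.stack([np.quantile(s, levels) for s in samples])
    return levels, quantiles.mean(axis=0)

rng = np.random.default_rng(1)
# Hypothetical input: n = 50 measures N(m_i, 1) with random shifts m_i,
# each represented by 5000 sample points.
shifts = rng.normal(size=50)
samples = [m + rng.normal(size=5000) for m in shifts]

levels, bary_q = quantile_barycenter(samples)
# bary_q[49] is the barycenter's median (level 0.5); it should be close
# to the average shift, i.e. to a mean of independent observations.
print(bary_q[49], shifts.mean())
```

Because each quantile of the barycenter is an average of independent quantities, the classical Central Limit Theorem applies level by level.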
In the case of Gaussian measures with zero mean and variances
[TABLE]
and, for some non-random matrix (see [LimitWasG]), the corresponding barycenter variance is
[TABLE]
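For intuition about the Gaussian case, here is a hedged sketch of the well-known fixed-point iteration for the covariance of the $W_2$ barycenter of centered Gaussian measures. This is a standard technique from the literature and is not claimed to be the formula displayed above, which did not survive in this copy. The names `spd_sqrt` and `gaussian_barycenter_cov` are hypothetical.

```python
import numpy as np

def spd_sqrt(a):
    """Symmetric square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gaussian_barycenter_cov(covs, iters=100):
    """Fixed-point iteration for the covariance S of the W2 barycenter of N(0, S_i):
    S <- S^{-1/2} ( mean_i (S^{1/2} S_i S^{1/2})^{1/2} )^2 S^{-1/2}."""
    s = np.mean(covs, axis=0)                    # any positive definite starting point
    for _ in range(iters):
        root = spd_sqrt(s)
        inv_root = np.linalg.inv(root)
        mid = np.mean([spd_sqrt(root @ c @ root) for c in covs], axis=0)
        s = inv_root @ mid @ mid @ inv_root
    return s

# For commuting (here diagonal) covariances, the square root of the result
# is the average of the square roots of the inputs.
covs = [np.diag([1.0, 4.0]), np.diag([9.0, 16.0])]
bary_cov = gaussian_barycenter_cov(covs)
print(bary_cov)
```

In the commuting case the square root of the barycenter covariance is a plain average of independent matrices, which makes the link to a mean of independent random variables explicit.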
In both of the last two examples one deals with a mean of independent random variables. Multiplied by the factor $\sqrt{n}$, such means converge to a Gaussian variable (or, by Donsker’s theorem, to a Gaussian process in the case of distribution functions). In the general case it is very difficult to establish such convergence, because the barycenter has no explicit expression and is an infinite-dimensional object. To handle this difficulty we approximate the barycenter by a sum of independent variables, using a projection onto the Fourier basis together with some recent results from statistical learning theory. Fourier analysis provides a suitable representation of the Wasserstein distance, which has already been studied in the literature [FouWas]. Denote a range of size $p$ of the barycenter's Fourier coefficients by
[TABLE]
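To make the finite-dimensional parametrization concrete, the sketch below computes the first $p$ Fourier coefficients of a density on $[0, 1)$ sampled on a uniform grid. The choice of the complex exponential basis, the normalization, and the name `fourier_coefficients` are assumptions made for illustration; the paper's own basis and indexing are not recoverable from this copy.

```python
import numpy as np

def fourier_coefficients(density, p):
    """First p complex Fourier coefficients c_k = int f(x) exp(-2*pi*i*k*x) dx
    of a density sampled on a uniform grid of [0, 1), via a Riemann sum."""
    m = len(density)
    return np.fft.fft(density)[:p] / m

# Hypothetical density on [0, 1): f(x) = 1 + 0.5 * cos(2*pi*x).
x = np.arange(512) / 512
density = 1.0 + 0.5 * np.cos(2 * np.pi * x)

theta = fourier_coefficients(density, p=4)
# Expected up to rounding: c_0 = 1 (total mass), c_1 = 0.25, c_2 = c_3 = 0.
print(np.round(theta, 6))
```

Truncating to $p$ coefficients turns the infinite-dimensional barycenter into a vector of dimension $p$, which is the object the Gaussian approximation is stated for.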
Our first result states that, for some non-random matrix, independent random vectors, and a suitable constant,
[TABLE]
Furthermore, we show that for some Gaussian vector
[TABLE]
and :
[TABLE]
Statistical Application: The last statement allows us to construct a confidence region for the parameter and to describe the distribution inside the region. In addition, the validity of the bootstrap procedure [LimitWas] follows from our proof. A bootstrap sample of the parameter is close in quantiles to the corresponding random variable, which also relates to the construction of the confidence region. In [bc_coords_proj] the authors demonstrate an application of barycentric coordinates that makes it possible to infer the missing geometry of an input mesh from a set of 3D models. The recent article [brule] shows empirically that barycenters may be helpful as a loss function in the unsupervised face landmark detection task.
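The role of bootstrap quantiles in forming a confidence region can be illustrated on a plain mean of independent vectors. This is a generic nonparametric bootstrap sketch, not the procedure of the cited work; the data, the dimension, and the name `bootstrap_radius` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(3)

def bootstrap_radius(xi, level=0.95, draws=2000):
    """Bootstrap quantile of sqrt(n) * ||resampled mean - sample mean||."""
    n = xi.shape[0]
    center = xi.mean(axis=0)
    stats = np.empty(draws)
    for b in range(draws):
        idx = rng.integers(0, n, size=n)         # resample rows with replacement
        stats[b] = np.sqrt(n) * np.linalg.norm(xi[idx].mean(axis=0) - center)
    return np.quantile(stats, level)

# Hypothetical data: n = 400 independent vectors of dimension p = 3.
xi = rng.normal(size=(400, 3))
radius = bootstrap_radius(xi)

# Resulting confidence region (a ball around the sample mean):
# {theta : sqrt(n) * ||theta - xi.mean(axis=0)|| <= radius}
print(radius)
```

For standard Gaussian data in dimension 3 the radius should be near the square root of the 0.95 quantile of a chi-squared law with 3 degrees of freedom, which is what a Gaussian approximation of the rescaled mean predicts.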
The structure of this paper is as follows. The main theorems are in Section 2. Section 3 deals with independent parametric models and describes how one can approximate parameter deviations by a sum of independent random vectors. In Section 4 we explore the barycenter model, compute derivatives of the Wasserstein distance using the infimal convolution of support functions, and check the assumptions required in Section 3. Section 5 contains some useful properties of support functions. The final part, the Gaussian approximation of the parameter, is completed in Section 6, where we prove closeness in distribution to a Gaussian vector.
2. The main result
Definition 2.1 (W-dual).
For two random variables with given densities, define the Wasserstein distance in dual form as
[TABLE]
where the constraint means that the function is 1-Lipschitz.
Note that if the coupling is a joint distribution with the two given marginals, then this definition is equivalent to the original one (1.1), which follows from the Kantorovich-Rubinstein duality [KRdual].
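The duality can be checked numerically on the real line. The sketch below is an illustration under simplifying assumptions: the measures are empirical with equal sample sizes, so that the primal value is given by the sorted (monotone) matching, and the dual objective is evaluated only at two hand-picked 1-Lipschitz functions. The names `w1_primal` and `w1_dual_value` are hypothetical.

```python
import numpy as np

def w1_primal(x, y):
    """W1 between two equal-size empirical measures on the line (sorted matching)."""
    return np.mean(np.abs(np.sort(x) - np.sort(y)))

def w1_dual_value(x, y, f):
    """Dual objective E f(X) - E f(Y) for a candidate 1-Lipschitz function f."""
    return np.mean(f(x)) - np.mean(f(y))

rng = np.random.default_rng(2)
x = rng.normal(size=1000)
y = rng.normal(size=1000) - 0.7

primal = w1_primal(x, y)
dual_abs = w1_dual_value(x, y, np.abs)           # 1-Lipschitz, generally suboptimal
dual_id = w1_dual_value(x, y, lambda t: t)       # 1-Lipschitz, good for a pure shift

# Every 1-Lipschitz f gives a lower bound on the primal value; the
# supremum over all such f attains it.
print(primal, dual_abs, dual_id)
```

Any single test function only certifies a lower bound; equality in the definition requires the supremum over the whole 1-Lipschitz class.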
