A bound on the 2-Wasserstein distance between linear combinations of independent random variables
Benjamin Arras, Ehsan Azmoodeh, Guillaume Poly, Yvik Swan

Abstract
We provide a bound on a natural distance between finitely and infinitely supported elements of the unit sphere of , the space of real-valued sequences with finite norm. We use this bound to estimate the 2-Wasserstein distance between random variables which can be represented as linear combinations of independent random variables. Our results are expressed in terms of a discrepancy measure which is related to Nourdin and Peccati’s Malliavin-Stein method. The main area of application of our results is towards the computation of quantitative rates of convergence towards elements of the second Wiener chaos. After particularizing our bounds to this setting and comparing them with the available literature on the subject (particularly the Malliavin-Stein method for variance-gamma random variables), we illustrate their versatility by tackling three examples: chi-squared approximation for second order U-statistics, asymptotics for sequences of quadratic forms and the behavior of the generalized Rosenblatt process at extreme critical exponent.
Keywords: Second Wiener chaos, variance-gamma distribution, 2-Wasserstein distance, Malliavin Calculus, Stein discrepancy
MSC 2010: 60F05, 60G50, 60G15, 60H07
1 Introduction
In this paper, we provide bounds on the Wasserstein-2 distance (see Definition 1.1) between random variables and a target which satisfy the following assumption.
Assumption: There exist non-zero and pairwise distinct real numbers as well as sequences such that for all and
[TABLE]
where is a sequence of i.i.d. random variables with mean 0, variance 1, finite moments of orders and non-zero th cumulant for all .
In light of the coupling imposed by our Assumption it seems intuitively evident that ought to be governed solely by the convergence rate of the approximating sequence of coefficients towards . The main difficulty is to identify the correct norm for this convergence and, following on [2], we consider the quantity
[TABLE]
The main theoretical contribution of the paper is Theorem 2.4, where we prove, in essence, that under technical conditions on the limiting coefficients we have the bound
[TABLE]
where is a constant depending only on .
We comment briefly on the general strategy we adopt in order to obtain a bound such as (1.3). Due to the structure imposed by our Assumption on the random variables we consider, it is natural to bound the 2-Wasserstein metric by a quantity based on re-indexing couplings. This leads us to consider a tailor-made norm (see (2.1)) in a purely Hilbertian context. Then, based on the careful analysis of minimization problems associated with , we are able to identify bounding quantities which depend polynomially on the coordinates of the sequences we want to compare (see Theorems 2.1 and 2.3). Recasting these quantities in the probabilistic context we are interested in, we are able to link them to the cumulants of the random variables and and finally to obtain our main result.
The most important application of a bound such as (1.3) is that it provides quantitative rates of convergence towards elements of the second Wiener chaos. Indeed it is a classical result that all such random variables can be written as a linear combination of centered chi-squared random variables, i.e. satisfy (1.1) for and i.i.d. standard normal random variables. In Section 2.3 we particularize our general bounds to this setting and obtain the first rates of convergence in Wasserstein-2 distance of sequences of elements belonging to the second Wiener chaos, thereby complementing recent contributions [25, 2] (see also [17] whose results are posterior to a first version of this paper). Moreover, in Section 2.4, we obtain a general lower bound on the Wasserstein-2 distance between elements in the second Wiener chaos using the quantity . The rate exponent for this lower bound is , leaving open the question of optimality of our bounds. We also provide an example where this lower bound can be refined (tightening the gap towards optimality). More importantly, these results emphasize the fact that the quantity is the right one to study quantitative convergence results in 2-Wasserstein distance on the second Wiener chaos. Since the intersection between the second chaos and the class of variance-gamma distributed random variables is not empty, it is also relevant to detail our bounds in these cases. We perform this in Section 2.5; this also permits direct comparison with [8], where a similar setting was tackled by entirely different means.
Finally, in Section 3, we apply our bounds to three illustrative and relevant examples. First we consider chi-squared approximation for second order U-statistics. We obtain among other results the bound
[TABLE]
for a second order U-statistic which has a degeneracy of order 1 (see Section 3.1).
Next we consider the problem of obtaining quantitative asymptotic results for sequences of quadratic forms. Letting and we deduce a general bound for $\mathrm{W}_{2}\big(\tilde{Q}_{n}(Z)-\mathbb{E}[\tilde{Q}_{n}(Z)],\tilde{Q}_{\infty}\big)$. In particular, for specific instances of the real-valued symmetric matrix $\big(\tilde{a}_{i,j}(n)\big)$, we obtain explicit rates of convergence:
[TABLE]
where (see Section 3.2, Corollary 3.2). Moreover, combining Corollary 3.2 and an approximation rate in Kolmogorov distance (Corollary 3.3) we are able to derive a quantitative universality type result for quadratic forms defined by:
[TABLE]
with a sequence of i.i.d. random variables centered with unit variance and finite fourth moment (see Theorem 3.1).
Finally, inspired by [3], we consider the generalized Rosenblatt process at extreme critical exponent. Letting
[TABLE]
with and and
[TABLE]
we prove that
[TABLE]
(see Lemma 3.2).
In order to understand the significance of our general bounds and also to contextualize the crucial quantity , it is necessary at this stage to make a short digression into Malliavin-Stein (a.k.a. Nourdin-Peccati) analysis. Let be standard Gaussian and consider a sequence of normalized random variables with sufficiently regular density with respect to the Lebesgue measure. The Stein kernel of is the random variable uniquely defined through the probabilistic integration by parts formula
[TABLE]
which is supposed to hold for all smooth test functions . The classical Stein identity, according to which for all smooth , implies in particular that the standard Gaussian distribution has a Stein kernel which is constant and equal to 1. Hence
[TABLE]
necessarily captures some aspect of non-Gaussianity of . As it turns out this quantity – called the Stein (kernel) discrepancy – plays a crucial role in Gaussian analysis. In particular, it has long been known that measures non-Gaussianity quite precisely. First, see e.g. [31, Lesson VI] or [4, 16, 5], it is equal to zero if and only if (equality in distribution). Second, Stein’s method also implies that metrizes convergence in distribution, i.e.
[TABLE]
for any class of sufficiently regular test functions and a finite constant depending only on ; see [22, Chapter 3] or [19] for more detail. The breakthrough from [23] is the discovery that is the linchpin of the entire theory of “fourth moment theorems” ensuing from the seminal paper [26]. More precisely, Nourdin and Peccati were the first to realize that the integration by parts formula for Malliavin calculus could be used to prove
[TABLE]
whenever is an element of the th Wiener chaos. Combining (1.7) and (1.6) thus provides quantitative fourth moment theorems for chaotic random variables in integral probability metrics including Total Variation, Kolmogorov and Wasserstein-1. We refer to [23] and the monograph [22] for a detailed account; see also [24] for an optimal-order bound (without a square root), and [18] for a general abstract version.
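As a numerical sanity check on the integration by parts formalism described above, the following Python sketch verifies the classical Gaussian Stein identity $\mathbb{E}[Z f(Z)] = \mathbb{E}[f'(Z)]$ by quadrature, and then checks the analogous kernel identity for a non-Gaussian law. The uniform law on $[-\sqrt{3},\sqrt{3}]$, its kernel $\tau(x) = (3 - x^2)/2$, the test function `tanh` and all function names are illustrative assumptions that are not taken from the paper.

```python
import math

def trapezoid(g, lo, hi, steps=20000):
    """Composite trapezoidal rule for the integral of g over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (g(lo) + g(hi))
    for i in range(1, steps):
        total += g(lo + i * h)
    return total * h

# Illustrative smooth test function and its derivative.
f = math.tanh
def f_prime(x):
    return 1.0 - math.tanh(x) ** 2

def gauss_density(x):
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

# Gaussian case: the Stein kernel is identically 1, so E[Z f(Z)] = E[f'(Z)].
gauss_lhs = trapezoid(lambda x: x * f(x) * gauss_density(x), -10.0, 10.0)
gauss_rhs = trapezoid(lambda x: f_prime(x) * gauss_density(x), -10.0, 10.0)

# Assumed non-Gaussian example: X uniform on [-a, a] with a = sqrt(3)
# (mean 0, variance 1), whose Stein kernel is tau(x) = (a^2 - x^2) / 2.
a = math.sqrt(3.0)
def unif_kernel(x):
    return (a * a - x * x) / 2.0

unif_lhs = trapezoid(lambda x: x * f(x) / (2.0 * a), -a, a)
unif_rhs = trapezoid(lambda x: unif_kernel(x) * f_prime(x) / (2.0 * a), -a, a)

# Mean squared deviation of the kernel from the Gaussian value 1.
unif_discrepancy_sq = trapezoid(lambda x: (unif_kernel(x) - 1.0) ** 2 / (2.0 * a), -a, a)
```

In this sketch both identities hold up to quadrature error, while the mean squared deviation of the uniform kernel from 1 equals $1/5$, a nonzero value reflecting the non-Gaussianity of the uniform law.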
Stein kernels are not inherently Gaussian objects and are well identified and tractable for a wide family of target distributions, see e.g. [31, Lesson VI]. It is therefore not unreasonable to study, for having kernel and satisfying general assumptions, the kernel discrepancy in order to reap the corresponding estimates from (1.6). This plan was already carried out in [23] for a centered gamma random variable and pursued in [7] and [32] for targets which were invariant distributions of diffusions. Many useful target distributions do not, however, bear a tractable Stein kernel and in this case the kernel discrepancy no longer captures relevant information on the discrepancy between and . There is, for instance, an enlightening discussion on this issue in [8, pp 8-9] about the “correct” identity for the Laplace distribution which turns out to be
[TABLE]
for smooth . Identities involving second (or higher) order derivatives of the test functions lead to considering higher order versions of the Stein kernel, namely defined through and defined through where both identities are expected to hold for all smooth test functions (higher order gamma’s are defined iteratively). Applying the intuition from Nourdin-Peccati analysis for Gaussian convergence then leads to a version of (1.6) of the form
[TABLE]
where the constants depend only on and provide a comparison of the with the coefficients of the derivatives appearing in the second order identities (e.g. (1.8) in the case of a Laplace target). Good bounds on the constants are crucial for (1.9) to be of use; such bounds require being able to solve specific (second order) differential equations (called Stein equations) and providing uniform bounds on these solutions and their derivatives. This is exactly the plan carried out in [8] for variance-gamma distributed random variables, and their approach rests on the preliminary work of [9], which provides unified bounds on the solutions to the variance-gamma Stein equations.
Aside from the variance-gamma case discussed in [12, 9], there are several other recent references where versions of (1.4) and (1.8) are proposed for complicated probability distributions such as the Kummer- distribution [27], or the distribution of products of independent random variables [11, 10, 13]. The common trait of all these is that the resulting identities all involve second or higher order derivatives of the test functions. In [1] – which is essentially based on the first part of a previous version of this work – we use Fourier analysis to obtain identities for random variables of the form (1.1) when is a sequence of gamma distributed random variables. The resulting identities involve as many derivatives of the test functions as there are different coefficients in the decomposition (1.1). Applying the intuition outlined in the previous paragraph leads to the realization that the quantity that shall play the role of a Stein discrepancy in the context of random variables of the form (1.1) is exactly defined in (1.2). We are therefore, in principle, in a position to use a bound such as (1.6) or (1.9) to obtain rates of convergence in integral probability metrics . The problem with this roadmap for as general a family as that described by our Assumption is that the corresponding constants are elusive save on a case-by-case basis for specific choices of . This means in particular that Nourdin and Peccati’s version of Stein’s method shall not provide relevant bounds, at least at the present state of our knowledge on the constants , in one sweep for such a large family as that concerned by our assumption (1.1).
In this paper we propose to only keep the relevant quantity whose importance to the problem was identified thanks to the Nourdin-Peccati intuition, but then bypass the difficulties inherited from the Stein methodology entirely. To this end we choose to study the problem of providing bounds in terms of an important and natural distance which is moreover better adapted to our Assumption: the Wasserstein-2 distance which we now define.
Definition 1.1**.**
*Fix . The Wasserstein metric is defined by*
[TABLE]
where the infimum is taken over all joint distributions of the random variables and with respective marginals and , and stands for the Euclidean norm on .
Relevant information about Wasserstein distances can be found, e.g. in [34]. We conclude this introduction by noting that, as is well-known, convergence with respect to is equivalent to the usual weak convergence of measures plus convergence of the first th moments. Also, a direct application of Hölder's inequality implies that if then . Finally, we mention that the 2-Wasserstein distance does not belong to the family of integral probability metrics (recall (1.6) for a definition).
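To make the definition concrete on the real line, here is a minimal Python sketch. It relies on the standard fact that, for two empirical measures with the same number of atoms, the optimal coupling for the cost $|x-y|^p$ with $p \geq 1$ pairs the sorted samples. The function name, sample sizes and Gaussian parameters are illustrative assumptions.

```python
import random

def empirical_wasserstein(xs, ys, p=2.0):
    """W_p between two empirical measures on the real line with equal sample sizes.

    Uses the monotone (sorted) coupling, which is optimal in dimension one
    for the cost |x - y|^p with p >= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("equal sample sizes are assumed in this sketch")
    pairs = zip(sorted(xs), sorted(ys))
    cost = sum(abs(x - y) ** p for x, y in pairs) / len(xs)
    return cost ** (1.0 / p)

rng = random.Random(0)
sample_x = [rng.gauss(0.0, 1.0) for _ in range(2000)]
sample_y = [rng.gauss(0.5, 1.5) for _ in range(2000)]

w1 = empirical_wasserstein(sample_x, sample_y, p=1.0)
w2 = empirical_wasserstein(sample_x, sample_y, p=2.0)
w4 = empirical_wasserstein(sample_x, sample_y, p=4.0)
```

The three values are ordered as `w1 <= w2 <= w4`, which illustrates the monotonicity in the order of the distance obtained from Hölder's inequality.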
2 Wasserstein-2 distance between linear combinations
2.1 A general result on Hilbert spaces
We denote by the space of real valued sequences such that . It is a Hilbert space endowed with the natural inner product and induced Euclidean norm . We aim to measure distances between elements of the unit sphere of where is a finitely supported sequence and is arbitrary. Denoting the set of permutations of , we introduce the following distance between and :
[TABLE]
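For intuition, the following Python sketch evaluates a permutation-minimized distance on finite sequences. It assumes, for illustration only, that the distance has the form $\min_{\sigma} \|x - y_{\sigma}\|_2$, where $y_\sigma$ denotes $y$ with its coordinates permuted by $\sigma$; the function names and the example vectors are hypothetical. For finite real sequences the rearrangement inequality implies that pairing the sorted coordinates attains the minimum, which the brute-force search confirms.

```python
import itertools
import math

def permutation_distance(x, y):
    """Brute-force min over permutations sigma of ||x - y_sigma||_2 (small inputs only)."""
    if len(x) != len(y):
        raise ValueError("pad the shorter sequence with zeros first")
    best = min(
        sum((a - b) ** 2 for a, b in zip(x, perm))
        for perm in itertools.permutations(y)
    )
    return math.sqrt(best)

def sorted_matching_distance(x, y):
    """Same quantity computed by pairing sorted coordinates (rearrangement inequality)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(sorted(x), sorted(y))))

# Hypothetical example: y is finitely supported on the unit sphere,
# x is a normalized perturbation with its coordinates in a different order.
y = [1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0), 0.0, 0.0, 0.0]
raw_x = [0.1, -0.7, 0.05, 0.7, 0.0]
norm_x = math.sqrt(sum(v * v for v in raw_x))
x = [v / norm_x for v in raw_x]

d_brute = permutation_distance(x, y)
d_sorted = sorted_matching_distance(x, y)
d_identity = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
```

Here `d_identity` is the plain Euclidean distance without re-indexing; it can be much larger than the permutation-minimized value, which is why re-indexing matters when comparing coefficient sequences.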
Now we define the polynomial . Then, we have the following Theorem.
Theorem 2.1**.**
Suppose that are rationally independent. Then there exists a constant which only depends on such that for any in the unit sphere of we get
[TABLE]
Proof.
We first notice that
[TABLE]
As a result, for any real number , at least one of the following inequalities is true.
[TABLE]
Although several of the aforementioned inequalities can hold simultaneously, one may always associate to any integer some index in such that holds for . Hence, one may build a partition of such that
[TABLE]
Note that for we have . Indeed, if one assumes, for example, that , then one necessarily has that (which is a contradiction). This entails the following bound
[TABLE]
For any integer , we set if for and we set when . Using the triangle inequality and (2.3) we may infer that
[TABLE]
We need to introduce the following quantity
[TABLE]
Since we do not let in the above minimization, and owing to the assumption of rational independence of , it follows that . Relying on the bound (2.4), one has the following implication
[TABLE]
for being any permutation of satisfying
[TABLE]
Finally, it holds
[TABLE]
which implies that (given the trivial bound )
[TABLE]
The proof is then achieved with the constant ∎
Let us now deal with the case when are no longer rationally independent. In this situation, one might write for several choices of vectors . We must introduce the set of all these choices, namely:
[TABLE]
Besides, for any we define the following element of the unit sphere of :
[TABLE]
We then have the following Theorem.
Theorem 2.2**.**
There exists a constant only depending on such that for any in the unit sphere of we get:
[TABLE]
Proof.
We proceed as in the proof of Theorem 2.1, from its beginning until the bound (2.5). The only difference is that we must now consider
[TABLE]
Similarly, since we removed from the above minimization problem, it follows that . Relying on the bound (2.4), one has the following implication
[TABLE]
for being any permutation of satisfying
[TABLE]
Finally, it holds
[TABLE]
which can also be written
[TABLE]
The proof is then achieved with the constant ∎
In the above situation, the quantity is no longer sufficient to ensure the uniqueness of the limit for convergence in the metric . There may be several accumulation points, and some additional information is then required. Set
[TABLE]
We have the following Theorem.
Theorem 2.3**.**
There exists a constant which only depends on such that, for any with , we get
[TABLE]
Proof.
Relying on Theorem 2.2, it holds that
[TABLE]
Note that it is not assumed that the real numbers are pairwise distinct. We can extract a subsequence with by removing the possible repetitions. For any , let us also denote by the number of repetitions of among the sequence and by the number of repetitions in the sequence . Thus, we have
[TABLE]
Suppose that . By the triangle inequality we get, for all ,
[TABLE]
Finally, set the Vandermonde matrix associated to the pairwise distinct real numbers and . The above inequality reads as
[TABLE]
Now, we set
[TABLE]
since is invertible, we must have . Therefore,
[TABLE]
In the latter situation we also get , and of course the desired bound
[TABLE]
The proof is now achieved with . ∎
2.2 A probabilistic interpretation
Consider an i.i.d. sequence of random variables admitting moments of orders and which satisfies . We shall further assume that all cumulants of orders are not zero. We set
[TABLE]
for and two sequences of real numbers. We also assume that:
[TABLE]
Using standard properties of cumulants one has for any :
[TABLE]
Lemma 2.1**.**
For any we have
[TABLE]
where the coefficients are the coefficients of the polynomial
[TABLE]
From a probabilistic point of view, Theorems 2.1 and 2.3 take the following form:
Theorem 2.4**.**
If the real numbers are rationally independent then
[TABLE]
if they are not, one instead gets
[TABLE]
where the constant depends only, in both cases, on the target .
Proof.
The proof is a direct consequence of Theorems 2.1 and 2.3. Indeed, set and . By definition of the 2-Wasserstein distance, we get . As before, we set . Finally, recalling that , the result follows. ∎
Remark 2.1**.**
An important question concerning the sharpness of the estimate (2.14) was raised by referees on a previous version of this paper. We first notice that for some appropriate constant and for all , one gets and for all , . Hence, we may deduce that
[TABLE]
and the result follows since one gets, for appropriate constants that
[TABLE]
Unfortunately, at present, we are unable to say whether the distance is equivalent to the 2-Wasserstein distance. Nonetheless, in the context of the second Wiener chaos, we provide a general lower bound on the 2-Wasserstein distance in Section 2.4 as well as a simple example which refines this lower bound.
2.3 Specializing to the second Wiener chaos
In this section, we apply our main results in the setting where the approximating sequence consists of elements of the second Wiener chaos of the isonormal process over a separable Hilbert space . We refer the reader to [22, Chapter 2] for a detailed discussion on this topic. Recall that the elements in the second Wiener chaos are random variables having the general form , with . Notice that, if , where is such that , then using the multiplication formula one has (equality in distribution), where . To any kernel , we associate the following Hilbert-Schmidt operator
[TABLE]
We also write and , respectively, to indicate the (not necessarily distinct) eigenvalues of and the corresponding eigenvectors. We recall that is defined by:
[TABLE]
where is a collection of i.i.d. standard normal random variables. The next proposition gathers some relevant properties of the elements of the second Wiener chaos associated to .
Proposition 2.1** (See Section 2.7.4 in [22] and Lemma 3.1 in [2]).**
Let , , be a generic element of the second Wiener chaos of , and write for the set of the eigenvalues of the associated Hilbert-Schmidt operator .
1. The following equality holds: $F=\sum_{k\geq 1}\alpha_{f,k}\big(N^{2}_{k}-1\big)$, where is a sequence of i.i.d. random variables that are elements of the isonormal process , and the series converges in and almost surely.
2. For any ,
[TABLE]
3. For polynomial as in we have . In particular .
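The cumulant structure of such elements can be checked numerically. The Python sketch below uses the classical fact that, for a finite sum $F=\sum_k \alpha_k (N_k^2-1)$ with i.i.d. standard normal $N_k$, the cumulants are $\kappa_r(F) = 2^{r-1}(r-1)!\sum_k \alpha_k^r$ for $r \geq 2$. It then evaluates $\sum_k Q(\alpha_k)$ both directly and through the cumulants, under the assumption (made for illustration only) that the polynomial has the form $Q(x) = \big(x\prod_i (x-a_i)\big)^2$ with $a_i$ the target coefficients. All function names and numerical values are hypothetical.

```python
import math

def chaos_cumulant(alphas, r):
    """r-th cumulant (r >= 2) of F = sum_k alpha_k (N_k^2 - 1), N_k i.i.d. N(0, 1)."""
    return 2 ** (r - 1) * math.factorial(r - 1) * sum(a ** r for a in alphas)

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, lowest degree first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def squared_root_polynomial(targets):
    """Coefficients of Q(x) = (x * prod_i (x - t_i))^2; this form is an assumption."""
    p = [0.0, 1.0]
    for t in targets:
        p = poly_mul(p, [-t, 1.0])
    return poly_mul(p, p)

def discrepancy_direct(alphas, targets):
    """sum_k Q(alpha_k), evaluated coefficient by coefficient."""
    q = squared_root_polynomial(targets)
    return sum(sum(c * a ** r for r, c in enumerate(q)) for a in alphas)

def discrepancy_from_cumulants(alphas, targets):
    """Same quantity written as a linear combination of normalized cumulants."""
    q = squared_root_polynomial(targets)
    return sum(
        c * chaos_cumulant(alphas, r) / (2 ** (r - 1) * math.factorial(r - 1))
        for r, c in enumerate(q)
        if r >= 2  # Q has no constant or linear term
    )

targets = [0.5, -0.5]            # hypothetical target coefficients
perturbed = [0.45, -0.5, 0.2]    # hypothetical approximating coefficients

delta_direct = discrepancy_direct(perturbed, targets)
delta_cumulants = discrepancy_from_cumulants(perturbed, targets)
delta_at_target = discrepancy_direct(targets, targets)
```

Under these assumptions the two evaluations agree, the quantity vanishes when the coefficients coincide with the target ones, and it is strictly positive for the perturbed coefficients.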
The next corollary is a direct application of our main finding, namely Theorem 2.4, and provides quantitative bounds for the main results in [25, 2].
Corollary 2.1**.**
Assume that the normalized sequence $F_{n}=\sum_{k\geq 1}\alpha_{n,k}\big(N^{2}_{k}-1\big)$ belongs to the second Wiener chaos associated to the isonormal process , and that the target random variable is as in (2.11) with where is a sequence of i.i.d. random variables. Then there exists a constant depending only on the target random variable (and hence independent of ) such that
(a)
[TABLE]
(b) if moreover , then . This implies that the sole convergence is sufficient for convergence in distribution towards the target random variable .
Remark 2.2**.**
The upper bound in Corollary 2.1, part (a) requires the separate convergences of the first cumulants for the convergence in distribution towards the target random variable as soon as . This is consistent with a quantitative result in [8], see also Section 2.5 below. In fact, when and , then the target random variable (where are independent and equality holds in law) belongs to the class of Variance–Gamma distributions with parameters and . Then, [8, Corollary 5.10, part (a)] reads
[TABLE]
Therefore, for the convergence in distribution of the sequence towards the target random variable in addition to convergence one needs also the convergence of the third cumulant . Note that in this case we have .
Example 2.1**.**
The aim of this simple example is to show that the requirement of separate convergences of the first cumulants is essential in Theorem 2.4 as soon as . Assume that and . Consider the fixed sequence
[TABLE]
Then for all , in particular , and . However, it is easy to see that the sequence does not converge in distribution towards the target random variable , because . Note that in this example, we have .
2.4 A lower bound on the 2-Wasserstein distance in the second Wiener chaos
In this subsection, we detail how to upper bound the quantity with the 2-Wasserstein distance between and when and belong to the second Wiener chaos. First of all we recall some notations. The random variables and are defined by:
[TABLE]
where is a sequence of i.i.d. standard normal random variables, a collection of non-zero real numbers such that:
[TABLE]
Similarly, we have:
[TABLE]
From the previous assumptions, it is clear that . It is also standard that the characteristic functions of and are analytic in the strips of the complex plane defined respectively by and . In particular, by (2.19) and (2.20), the characteristic functions of and are analytic in the strip . Moreover, in this strip of regularity, they admit the following integral representations:
[TABLE]
where and are the probability laws of and respectively. First, we give two technical lemmas.
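As an illustration of these analyticity statements, the following Python sketch evaluates the classical closed form of the characteristic function of a finite sum $\sum_k \alpha_k (N_k^2 - 1)$, namely $\prod_k e^{-iz\alpha_k}(1-2iz\alpha_k)^{-1/2}$, and compares it with a Monte Carlo estimate. The coefficients, the seed, the sample size and the function names are illustrative assumptions; the principal branch of the square root is used, which is valid as long as $|\mathrm{Im}\, z| < 1/(2\max_k |\alpha_k|)$.

```python
import cmath
import random

def chaos_char_function(alphas, z):
    """E[exp(i z F)] for F = sum_k alpha_k (N_k^2 - 1), finite sum, z real or complex.

    Each factor exp(-i z a) / sqrt(1 - 2 i z a) is analytic when
    |Im z| < 1 / (2 |a|), since then Re(1 - 2 i z a) > 0.
    """
    value = 1.0 + 0.0j
    for a in alphas:
        value *= cmath.exp(-1j * z * a) / cmath.sqrt(1.0 - 2.0j * z * a)
    return value

def sample_chaos(alphas, rng):
    """Draw one realization of F = sum_k alpha_k (N_k^2 - 1)."""
    return sum(a * (rng.gauss(0.0, 1.0) ** 2 - 1.0) for a in alphas)

alphas = [0.5, -0.3, 0.2]  # illustrative coefficients
strip_half_width = 1.0 / (2.0 * max(abs(a) for a in alphas))

rng = random.Random(1)
draws = [sample_chaos(alphas, rng) for _ in range(20000)]

t = 0.8
monte_carlo = sum(cmath.exp(1j * t * x) for x in draws) / len(draws)
exact = chaos_char_function(alphas, t)

# The closed form can also be evaluated off the real axis, inside the strip.
inside_strip = chaos_char_function(alphas, 0.5j * strip_half_width)
```

The closed form and the simulation agree up to Monte Carlo error, and the formula remains finite at complex points inside the strip determined by the largest coefficient in absolute value.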
Lemma 2.2**.**
For any and such that :
[TABLE]
Proof.
The proof is standard. ∎
Lemma 2.3**.**
Let be a random variable belonging to the second Wiener chaos with unit variance. Then, we have:
[TABLE]
for all .
Proof.
Since is in the second Wiener chaos, we have by hypercontractivity, for any :
[TABLE]
Then, by Markov's inequality, we have, for :
[TABLE]
We choose and we obtain:
[TABLE]
∎
We are now ready to state the proposition linking the pointwise difference of the characteristic functions and of their derivatives with the 2-Wasserstein distance of and .
Proposition 2.2**.**
For any , there exists a strictly positive constant such that, for all and for all with , we have:
[TABLE]
Proof.
By optimal transportation on the real line (Brenier Theorem), there exists a map such that we have:
[TABLE]
Moreover, the push forward measure is equal to so that, we have also:
[TABLE]
for such that . Let and such that . We have:
[TABLE]
Moreover, by Lemma 2.2, we have the following upper bound:
[TABLE]
Using the Cauchy-Schwarz inequality, we obtain:
[TABLE]
Next, we need to prove that:
[TABLE]
By Lemma 2.3, we have that:
[TABLE]
as soon as . Since , (2.32) follows. To conclude the proof of the proposition, we need to bound similarly the pointwise difference of the derivatives of the characteristic functions. Since and are centered, we have:
[TABLE]
Then, we have:
[TABLE]
with:
[TABLE]
For the first term, using Lemma 2.2 and the Cauchy-Schwarz inequality, we have the following bound:
[TABLE]
Moreover, as previously, we have:
[TABLE]
for . For the second term, we have:
[TABLE]
Finally, we note that for :
[TABLE]
Taking
[TABLE]
we obtain:
[TABLE]
∎
In order to upper bound the quantity with the 2-Wasserstein distance, we are going to use complex analysis together with Proposition 2.2. First of all, recall the following inequality for the cumulants of and :
[TABLE]
and similarly for . Therefore the following series are convergent as soon as :
[TABLE]
We are now ready to link the quantity with a certain functional on the difference of the characteristic functions.
Proposition 2.3**.**
Let . There exists a strictly positive constant, , such that:
[TABLE]
Proof.
Let us fix . First of all, it is not difficult to see that we have the following identity as soon as :
[TABLE]
By orthogonality, we have the following identity:
[TABLE]
Then, we obtain the following lower bound:
[TABLE]
for some . This concludes the proof of the proposition. ∎
We are now ready to state the main result of this subsection.
Proposition 2.4**.**
For any , there exists a strictly positive constant such that for all , we have:
[TABLE]
Proof.
First of all, we note that for any such that , we have:
[TABLE]
Moreover, it is clear that the function is bounded away from $0$ on the disk centered at the origin and with radius . Regarding the function , we have the following uniform bound (with on the disk centered at the origin and with radius ):
[TABLE]
Therefore, it is clear that:
[TABLE]
for some strictly positive constants and (independent of ). Thus, using Proposition 2.2, we obtain:
[TABLE]
Then, using Proposition 2.3 concludes the proof of the proposition. ∎
Remark 2.3**.**
- Combining Proposition 2.4 together with part (b) of Corollary 2.1, we obtain the fact that the convergence of to $0$ is equivalent to the convergence of to $0$ when . This complements the results contained in [25, 2] (see in particular Theorem of [2]). Moreover, recall that convergence of to $0$ is equivalent to convergence in distribution and convergence of the second moments. Therefore, when and and have unit variances, convergence in distribution of towards is equivalent to convergence of to $0$.
- This justifies why we choose to study quantitative convergence results with respect to the 2-Wasserstein distance instead of other probability metrics such as Kolmogorov distance or 1-Wasserstein distance.
In the sequel, we provide a simple example for which it is possible to refine the previous lower bound. Let be a sequence of positive real numbers strictly less than which converges to [math] when tends to infinity and such that:
[TABLE]
Then, we consider the following random variables:
[TABLE]
We note that:
[TABLE]
First of all, let us find an asymptotic equivalent for . By definition, we have:
[TABLE]
Since and , we obtain:
[TABLE]
Then, one can prove that:
[TABLE]
In order to find a comparable lower bound for the Wasserstein-2 distance, we need the following technical lemma.
Lemma 2.4**.**
We denote by and the characteristic functions of and respectively. We have the following inequality:
[TABLE]
Proof.
Let be as in the proof of Proposition 2.2 (given by Brenier theorem). We have:
[TABLE]
where we have used Cauchy-Schwarz inequality in the last inequality and the definition of . This concludes the proof of the lemma. ∎
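The mechanism of this proof can be checked on empirical measures. The Python sketch below assumes, for illustration, an inequality of the type $|\varphi_X(t)-\varphi_Y(t)| \leq |t|\, \mathrm{W}_2(X,Y)$, obtained from $|e^{ia}-e^{ib}| \leq |a-b|$ followed by the Cauchy-Schwarz inequality; whether this matches the exact statement of Lemma 2.4 is an assumption. For two samples of equal size the bound holds exactly with the empirical characteristic functions and the empirical 2-Wasserstein distance given by the sorted coupling. The coefficients, the seed and the function names are hypothetical.

```python
import cmath
import math
import random

def empirical_w2(xs, ys):
    """Empirical 2-Wasserstein distance on the real line via the sorted coupling."""
    pairs = zip(sorted(xs), sorted(ys))
    return math.sqrt(sum((x - y) ** 2 for x, y in pairs) / len(xs))

def empirical_char_function(xs, t):
    """Empirical characteristic function t -> mean of exp(i t x)."""
    return sum(cmath.exp(1j * t * x) for x in xs) / len(xs)

def sample_chaos(alphas, rng):
    """Draw one realization of sum_k alpha_k (N_k^2 - 1)."""
    return sum(a * (rng.gauss(0.0, 1.0) ** 2 - 1.0) for a in alphas)

rng = random.Random(2)
xs = [sample_chaos([0.6, -0.4], rng) for _ in range(1000)]  # hypothetical approximant
ys = [sample_chaos([0.5, -0.5], rng) for _ in range(1000)]  # hypothetical target

w2 = empirical_w2(xs, ys)
gaps = {
    t: abs(empirical_char_function(xs, t) - empirical_char_function(ys, t))
    for t in (0.25, 0.5, 1.0, 2.0, 4.0)
}
bound_holds = all(gap <= abs(t) * w2 + 1e-12 for t, gap in gaps.items())
```

Read from right to left, such an inequality turns any lower bound on the difference of characteristic functions at a well-chosen point into a lower bound on the 2-Wasserstein distance.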
Therefore, we have the following lower bound.
Lemma 2.5**.**
There exists a strictly positive constant such that we have, for large enough:
[TABLE]
Proof.
By straightforward computations, we have the following formula for and :
[TABLE]
Therefore, we have, for all :
[TABLE]
Now we select . We obtain:
[TABLE]
The previous lower bound then implies:
[TABLE]
But it is clear that there exists a strictly positive constant (independent of ) such that:
[TABLE]
Then, we obtain that:
[TABLE]
Using Lemma 2.4 concludes the proof of the lemma. ∎
Remark 2.4**.**
Modifying the proof of Lemma 2.5 by choosing for some produces lower bounds with different rates of convergence to $0$. Indeed, one can check that the exponent of the resulting lower bound (denoted by ) is defined in the following way:
[TABLE]
Thus, corresponds to the scale which most reduces the gap between the lower and the upper scaling exponents.
2.5 Comparison with the Malliavin–Stein method for the variance-Gamma
We recall that the target distributions of interest lying in the second Wiener chaos take the form
[TABLE]
where , are i.i.d. random variables, and the coefficients are non-zero and distinct. We stress that in representation cannot be infinity. The aim of this section is to study the connections between the class of our target distributions given as , and the so-called variance-gamma class of probability distributions, and to compare our quantitative bound in Corollary 2.1 with the bounds recently obtained in [8] using the Malliavin–Stein method. First, we recall some basic facts that we need on the variance-gamma probability distributions. For detailed information, we refer the reader to [9, 12] and references therein. The random variable is said to have a variance-gamma probability distribution with parameters if and only if its probability density function is given by
[TABLE]
where , and is a modified Bessel function of the second kind, and we write . Also, it is known that for (see for example relation in [12])
[TABLE]
Lemma 2.6**.**
(a) Let be independent, and take two arbitrary . Then
[TABLE]
(b) Let . Then the target random variable as so that (or similarly when ) cannot belong to the variance-gamma class.
(c) Let . Then the target random variable as cannot belong to the variance-gamma class.
Proof.
(a) Set
[TABLE]
Then the claim follows directly from part (v) in [9, Proposition 3.8]. (b,c) These also follow directly using a straightforward comparison between the characteristic function of and the one of the variance-gamma random variable (see, for example, [20, page ]). ∎
Next, we want to compare our bound in Corollary 2.1 with the bound in [8] obtained using the Malliavin–Stein method. A good starting point for such a comparison is the right hand side of equation in [8, Theorem 4.1]. This is because the bound in [8, Corollary 5.10, part (a)] is obtained from the right hand side of equation in [8, Theorem 4.1] by norms of contraction operators. By virtue of Lemma 2.6, in order for as in to belong to the variance-gamma class, it is necessary to have and . Letting in the right hand side of equation in [8, Theorem 4.1], and taking into account that , and , for an element in the second Wiener chaos associated to the underlying isonormal process , we arrive at
[TABLE]
The last inequality is derived from the Cauchy-Schwarz inequality together with [2, Lemma 3.1] where we used the fact that belongs to the second Wiener chaos.
3 Applications
3.1 An example from U-statistics
Under some degeneracy conditions, it is possible to observe the appearance of limiting distributions of the form $\sum_{k\geq 1}\alpha_{\infty,k}\big(N^{2}_{k}-1\big)$ in the context of U-statistics. In this example, we restrict our attention to second order U-statistics. We refer the reader to [29, Chapter Section ] or to [15, Chapter Corollary ] for full generality. Let be a sequence of i.i.d. standard normal random variables supported by the isonormal Gaussian process , where is the Wiener-Itô integral of order and is an orthonormal basis of . Let be a real number. We consider the following second order U-statistic which has a degeneracy of order 1:
[TABLE]
A direct application of Theorem in [29] allows one to obtain:
[TABLE]
Using Corollary 2.1, we have the following result:
Corollary 3.1.
For any , we have:
[TABLE]
Namely, for large enough:
[TABLE]
Proof.
By Corollary 2.1, we have:
[TABLE]
But,
[TABLE]
In order to obtain an explicit rate of convergence, we have to compute the cumulants of order and of the random variable . Since is in the second order Wiener chaos, we can apply the following formula:
[TABLE]
where there are copies of in . We note that:
[TABLE]
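As a concrete check of the cumulant formula for second-chaos elements, consider the diagonal case $F=\sum_{k}\alpha_{k}\big(N_{k}^{2}-1\big)$, for which the cumulants reduce to the classical identity $\kappa_{r}(F)=2^{r-1}(r-1)!\sum_{k}\alpha_{k}^{r}$ for $r\geq 2$. The following Python sketch evaluates this expression. The function name `chaos_cumulant` and the coefficients are hypothetical and chosen only for illustration; this is a minimal example under the diagonal assumption, not a reproduction of the computation carried out in the proof.

```python
from math import factorial


def chaos_cumulant(alphas, r):
    """r-th cumulant (r >= 2) of F = sum_k alpha_k * (N_k**2 - 1), N_k i.i.d. N(0, 1).

    Uses the classical identity kappa_r(F) = 2**(r-1) * (r-1)! * sum_k alpha_k**r.
    """
    if r < 2:
        raise ValueError("the identity is used here for r >= 2 only")
    return 2 ** (r - 1) * factorial(r - 1) * sum(a ** r for a in alphas)


# Hypothetical coefficients, chosen only for illustration.
alphas = [0.5, -0.25, 0.125]
for r in (2, 3, 4):
    print(r, chaos_cumulant(alphas, r))
```

With a single coefficient equal to 1, the function returns the cumulants of a centered chi-squared variable with one degree of freedom, which gives a quick sanity check.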
Let us compute the third and the fourth cumulants of . By formula (3.1), we have:
[TABLE]
with,
[TABLE]
By standard computations, we have:
[TABLE]
We denote by , , and the four associated double sums. The scalar product of with gives:
[TABLE]
The three other terms contribute in a similar way. Thus, we have:
[TABLE]
Similar computations for the fourth cumulants of lead to the following formula:
[TABLE]
Using the facts that , and , we obtain:
[TABLE]
The result then follows. ∎
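To get a numerical feel for bounds of this type, one can estimate the 2-Wasserstein distance from samples. On the real line, the optimal coupling between two empirical measures with the same number of atoms pairs the order statistics, which leads to the sketch below. The helper name `empirical_w2` is ours and does not appear in the paper. Such an estimate carries Monte Carlo error, so it can only complement the analytic bound, not replace it.

```python
from math import sqrt


def empirical_w2(xs, ys):
    """2-Wasserstein distance between two empirical measures on the real line.

    Both samples must have the same (positive) size; the optimal coupling
    then matches the sorted values one by one.
    """
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("samples must be non-empty and of equal size")
    paired = zip(sorted(xs), sorted(ys))
    return sqrt(sum((x - y) ** 2 for x, y in paired) / len(xs))


# Two tiny hypothetical samples: the second is the first shifted by 1.
print(empirical_w2([0.0, 1.0], [2.0, 1.0]))
```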
3.2 Application to some quadratic forms
In this example, we are interested in the asymptotic distributions of sequences of some specific quadratic forms. More precisely, we consider the following sequence of random variables:
[TABLE]
where $A_{n}=\big(a_{i,j}(n)\big)$ is a real-valued symmetric matrix and an i.i.d. sequence of standard normal random variables. A full description of the limiting distributions for this type of sequence is contained in [30]. In particular, it is possible to observe the appearance of limiting distributions of the form $\sum_{k\geq 1}\alpha_{\infty,k}\big(N^{2}_{k}-1\big)$. Sufficient conditions for such an appearance have been introduced in [33]. Let be distinct non-zero real numbers. We make the following assumptions:
- Let be a sequence of real numbers such that:
[TABLE]
- For each , we assume that:
[TABLE]
- Finally, we assume that:
[TABLE]
In order to fit the assumptions of Corollary 2.1, we renormalize the quadratic form . We denote by the quadratic form associated with the matrix defined by:
[TABLE]
In particular, we have:
- for each ,
[TABLE]
- and,
[TABLE]
By Theorem of [33], we have the following result:
[TABLE]
If we assume that the is a sequence of standard normal random variables supported by a Gaussian isonormal process, we have the following representation:
[TABLE]
with . Applying Corollary 2.1, we will obtain an explicit rate of convergence for the previous limit theorem in 2-Wasserstein distance. For this purpose we need to compute the cumulants of order of for . Using the fact that the ’s are i.i.d. standard normal, we have:
[TABLE]
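For a symmetric matrix $A$ and a standard Gaussian vector $X$, the cumulants of order $r\geq 2$ of the quadratic form $X^{T}AX$ are given by the classical identity $\kappa_{r}=2^{r-1}(r-1)!\,\mathrm{Tr}(A^{r})$; centering the form only changes the first cumulant. The sketch below evaluates this trace expression in pure Python. The helper names and the $2\times 2$ matrix are hypothetical, and the normalization of the matrix used in the text is not reproduced here.

```python
from math import factorial


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace_of_power(a, r):
    """Tr(A**r) for a square matrix A and an integer r >= 1."""
    power = a
    for _ in range(r - 1):
        power = mat_mul(power, a)
    return sum(power[i][i] for i in range(len(a)))


def quadratic_form_cumulant(a, r):
    """r-th cumulant (r >= 2) of X^T A X, with A symmetric and X ~ N(0, I)."""
    if r < 2:
        raise ValueError("the identity is used here for r >= 2 only")
    return 2 ** (r - 1) * factorial(r - 1) * trace_of_power(a, r)


# Hypothetical symmetric matrix with eigenvalues +1/2 and -1/2.
A = [[0.0, 0.5], [0.5, 0.0]]
print([quadratic_form_cumulant(A, r) for r in (2, 3, 4)])
```

Since $\mathrm{Tr}(A^{r})$ is the sum of the $r$-th powers of the eigenvalues of $A$, the same values can be obtained from the spectrum, which is how the quadratic form connects to the representation as a linear combination of centered chi-squared variables.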
Combining the previous formula together with Corollary 2.1, we obtain the following bound on the 2-Wasserstein distance between and :
[TABLE]
Thanks to this bound, we can obtain explicit rates of convergence for some more specific examples. In the sequel, we denote by $\mathcal{C}^{\alpha}\big([0,1]\big)$ the space of Hölder continuous real-valued functions of order on . We have the following result.
Corollary 3.2.
Let be distinct orthonormal functions of such that $e_{m}\in\mathcal{C}^{\alpha}\big([0,1]\big)$ for some . Let be the square integrable kernel defined by
[TABLE]
and let be the matrix defined by:
[TABLE]
Then, we have, for large enough:
[TABLE]
Proof.
First of all, choosing , we note that the assumptions of the non-central limit theorem are verified so that the corresponding quadratic form converges in law towards $\sum_{m=1}^{q}\tilde{\lambda}_{m}\big(Z_{m}^{2}-1\big)$. Let us work out the bound (3.3) in order to obtain an explicit rate of convergence. By standard computations, we have for all :
[TABLE]
where means that we have excluded the hyper diagonal . Thus, we have:
[TABLE]
Note that the second term tends to [math] as tends to since we have excluded the hyper diagonal and that:
[TABLE]
Since $e_{m}\in\mathcal{C}^{\alpha}\big([0,1]\big)$, we have the following asymptotic for every :
[TABLE]
Similarly, using the fact that $e_{m}\in\mathcal{C}^{\alpha}\big([0,1]\big)$, it is straightforward to see that the second term is . Now, we note that:
[TABLE]
since for ,
[TABLE]
The result then follows. ∎
Remark 3.1.
Theorem of [33] is actually more general than the particular instance we have displayed since it holds for quadratic forms defined by:
[TABLE]
where is an i.i.d. sequence of centered random variables such that and . Furthermore, since the work of Rotar’ [28], it is known that exhibits the same asymptotic behavior as and explicit rates of approximation have been obtained in Kolmogorov distance (see e.g. [14] and more generally [21] Theorems and ).
We end this subsection with a universality result as announced in the previous remark. We assume that is an i.i.d. sequence of centered random variables such that and . First of all, as a direct application of Theorem of [21], we obtain an explicit bound of approximation between and in Kolmogorov distance.
Corollary 3.3.
Under the previous assumptions, there exists such that:
[TABLE]
Proof.
Since the sequence is an i.i.d. sequence of centered random variables with unit variance and finite moment, we have in particular that . Moreover, we have:
[TABLE]
with,
[TABLE]
It is clear that . Moreover, we have, for any :
[TABLE]
where we have used the fact that the are bounded on . Now, we note that:
[TABLE]
Thus, we have, for some :
[TABLE]
Applying directly Theorem of [21], we obtain:
[TABLE]
∎
In order to obtain a rate for the Kolmogorov distance, we combine the previous corollary with Corollary 3.2 and with the fact that the Kolmogorov distance admits the following bound when the density of the target law is bounded (see e.g. Theorem of [6]):
[TABLE]
admits a bounded density as soon as is large enough. In this regard, we have the following lemma.
Lemma 3.1.
Let . Let be a random variable such that:
[TABLE]
with non-zero real numbers. Then, has a bounded density.
Proof.
The proof is standard, so we only sketch it. The characteristic function of is given by the following formula:
[TABLE]
We introduce . Then,
[TABLE]
Since , we deduce from the previous inequality that is in . Thus, we can apply the Fourier inversion formula to obtain the following bound:
[TABLE]
This concludes the proof of the lemma. ∎
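The Fourier inversion step can be illustrated numerically. For $F=\sum_{k=1}^{q}\alpha_{k}\big(N_{k}^{2}-1\big)$, the modulus of the characteristic function is $\prod_{k=1}^{q}\big(1+4\alpha_{k}^{2}t^{2}\big)^{-1/4}$, which decays like $|t|^{-q/2}$. The sketch below assumes $q\geq 3$, which is what makes this modulus integrable, and approximates the resulting bound $\frac{1}{2\pi}\int_{\mathbb{R}}|\phi_{F}(t)|\,dt$ on the supremum of the density. The function names, the quadrature scheme and the coefficients are our own choices for illustration.

```python
from math import cos, pi, tan


def char_fn_modulus(alphas, t):
    """|phi_F(t)| for F = sum_k alpha_k * (N_k**2 - 1), N_k i.i.d. N(0, 1)."""
    out = 1.0
    for a in alphas:
        out *= (1.0 + 4.0 * a * a * t * t) ** (-0.25)
    return out


def density_sup_bound(alphas, n_nodes=200_000):
    """Approximate (1 / (2 pi)) * integral over R of |phi_F(t)| dt.

    The substitution t = tan(theta) maps R onto (-pi/2, pi/2); a midpoint
    rule avoids evaluating the integrand at the two endpoints.
    """
    h = pi / n_nodes
    total = 0.0
    for i in range(n_nodes):
        theta = -pi / 2 + (i + 0.5) * h
        total += char_fn_modulus(alphas, tan(theta)) / cos(theta) ** 2
    return total * h / (2.0 * pi)


# Hypothetical coefficients with q = 4, for which |phi_F(t)| = 1 / (1 + t**2).
print(density_sup_bound([0.5, 0.5, 0.5, 0.5]))
```

For these coefficients the transformed integrand is identically 1, so the bound is close to $1/2$. This is consistent with the exact density of $F$, a shifted Gamma density with shape 2 whose maximum is $e^{-1}$. For $q=3$ the transformed integrand has an integrable singularity at the endpoints and the midpoint rule converges more slowly.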
Therefore, we have the following result:
Theorem 3.1.
Under the previous assumptions, we have:
[TABLE]
Remark 3.2.
We note that inequality (3.3) can be combined with Theorem of [21] to obtain a general bound in Kolmogorov distance for . We introduce the following quantity:
[TABLE]
Then,
[TABLE]
3.3 The generalized Rosenblatt process at extreme critical exponent
We conclude this section with a more ambitious example, providing rates of convergence in a recent result given by [3, Theorem 2.4]. Let be the random variable defined by:
[TABLE]
with and . By Proposition 3.1 of [3], we have the following formula for the cumulants of :
[TABLE]
where,
[TABLE]
and,
[TABLE]
Let and be the random variable defined by:
[TABLE]
with independent standard normal random variables and and defined by:
[TABLE]
For simplicity, we assume that and . Then [3, Theorem 2.4] implies that as tends to :
[TABLE]
Note that, in this case, automatically tends to as well. To prove the previous result, the authors of [3] prove the following convergence result:
[TABLE]
Now, using Corollary 2.1, Lemma 2.1 and applying Lemma 3.2, we can present the following quantitative bound for convergence , namely as tends to :
[TABLE]
where is some strictly positive constant depending only on . In order to apply Corollary 2.1 to obtain an explicit rate for convergence , we need to know at which speed $\kappa_{m}\big(Z_{\gamma_{1},\gamma_{2}}\big)$ converges towards $\kappa_{m}\big(Y_{\rho}\big)$. For this purpose, we have the following lemma:
Lemma 3.2.
Under the above assumptions, for any , we have, as tends to :
[TABLE]
Proof.
First of all, we note that, as tends to :
[TABLE]
where is the Euler constant, is the digamma function and some strictly positive constant depending only on . Note that $-3+2\gamma+2\psi\big(1/2\big)<0$. Moreover, we have:
[TABLE]
Note that $-\gamma-\psi\big(\frac{1}{2}\big)>0$. The diverging terms in are and . At and fixed, the only possible values are:
[TABLE]
Moreover, we have, for fixed:
[TABLE]
Developing the product in the right hand side of (3.11), we obtain:
[TABLE]
This leads to the following asymptotic for the cumulants of ,
[TABLE]
where we have used computations similar to those in the proof of Theorem of [3] for the last equality. ∎
Acknowledgments
BA’s research was supported by a Welcome Grant from the Université de Liège. YS gratefully acknowledges support by the Fonds de la Recherche Scientifique - FNRS under Grant MIS F.4539.16.
References
- [1] Arras, B., Azmoodeh, E., Poly, G. and Swan, Y. (2017) A Fourier approach to Stein characterizations. In preparation.
- [2] Azmoodeh, E., Peccati, G. and Poly, G. (2014) Convergence towards linear combinations of chi-squared random variables: a Malliavin-based approach. Séminaire de Probabilités XLVII (Special volume in memory of Marc Yor), pp.339-367.
- [3] Bai, S. and Taqqu, M. (2015) Behavior of the generalized Rosenblatt process at extreme critical exponent values. Ann. Probab. 45(2), pp.1278–1324, 2017.
- [4] Borovkov, A.A. and Utev, S.A. (1984) On an inequality and a related characterization of the normal distribution. Theory Probab. Appl. 28(2), pp.219-228.
- [5] Cacoullos, T. and Papathanasiou, V. (1989) Characterizations of distributions by variance bounds. Statistics & Probability Letters 7(5), pp.351-356.
- [6] Chen, L.H.Y., Goldstein, L. and Shao, Q.M. (2010) Normal approximation by Stein’s method. Springer Science & Business Media.
- [7] Eden, R. and Viquez, J. (2015) Nourdin-Peccati analysis on Wiener and Wiener-Poisson space for general distributions. Stoch. Proc. Appl. 125(1), pp.182-216.
- [8] Eichelsbacher, P. and Thäle, C. (2015) Malliavin-Stein method for variance-gamma approximation on Wiener space. Electron. J. Probab. 20(123), pp.1-28.
