Optimal Gamma Approximation on Wiener Space
Ehsan Azmoodeh, Peter Eichelsbacher, Lukas Knichel

TL;DR
This paper establishes an optimal rate of convergence for Gamma approximation on Wiener space using a novel operator approach to Stein's method, extending previous cumulant-based characterizations.
Contribution
It introduces a new operator theory approach to Stein's method for Gamma approximation, achieving optimal convergence rates in the $d_2$-distance.
Findings
Derived an optimal convergence rate in $d_2$-distance for Gamma approximation.
Extended cumulant-based characterization to include rate of convergence.
Applied the method to quadratic forms as illustrative examples.
E. Azmoodeh Ruhr University Bochum, Faculty of Mathematics, IB 2/101, 44780 Bochum, Germany. E-mail: [email protected]
P. Eichelsbacher and L. Knichel Ruhr University Bochum, Faculty of Mathematics, IB 2/115, 44780 Bochum, Germany. E-mail: [email protected]. Ruhr University Bochum, Faculty of Mathematics, IB 2/95, 44780 Bochum, Germany. E-mail: [email protected]. Lukas Knichel has been supported by the German Research Foundation (DFG) via Research Training Group RTG 2131 "High dimensional phenomena in probability – fluctuations and discontinuity".
Abstract
In [NP09a], Nourdin and Peccati established a neat characterization of Gamma approximation on a fixed Wiener chaos in terms of convergence of only the third and fourth cumulants. In this paper, we provide an optimal rate of convergence in the $d_2$-distance in terms of the maximum of the third and fourth cumulants, analogous to the result for normal approximation in [NP15]. In order to achieve our goal, we introduce a novel operator theory approach to Stein’s method. The recent development in Stein’s method for the Gamma distribution of Döbler and Peccati ([DP18]) plays a pivotal role in our analysis. Several examples in the context of quadratic forms are considered to illustrate our optimal bound.
Keywords: Gamma approximation, Wiener chaos, Cumulants/Moments, Weak convergence, Malliavin Calculus, Berry–Esseen bounds, Stein’s method, Wasserstein distances, Quadratic form
MSC 2010: 60F05, 60G50, 60H07
1 Introduction and Main Result
Let $X=\{X(h):h\in\mathfrak{H}\}$ be an isonormal Gaussian process over a separable Hilbert space $\mathfrak{H}$ on a suitable probability space $(\Omega,\mathscr{F},\mathbb{P})$. In the landmark article [NP05] Nualart and Peccati discovered an astonishing central limit theorem (CLT), known nowadays as the fourth moment theorem, for a sequence of normalized random variables inside a fixed Wiener chaos associated to $X$. It states that the convergence in distribution towards a standard Gaussian distribution is equivalent to the sole requirement that the fourth moments converge to $3$. Their findings created a fertile line of research, culminating a few years later in the popular article [NP09b], which introduced the so-called Malliavin-Stein approach, an elegant combination of two probabilistic techniques, namely Stein’s method [Ste72, CGS11] and Malliavin calculus [Nua06, NN18], in order to quantify the probability distance between a square integrable Wiener functional and a normal distribution. The reader may consult the excellent monograph [NP12a], as well as the constantly updated web resource https://sites.google.com/site/malliavinstein/home for a huge amount of applications and generalizations of the aforementioned results. Our study is mainly inspired by the following discovery (item (b) of the forthcoming theorem), which presents an optimal version of the fourth moment theorem. For every real-valued random variable $F$ the quantity $\kappa_r(F)$ stands for the $r$th cumulant of $F$, see Section 2.3.
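Theorem 1.1 ((Optimal) fourth moment theorem [NP05, NP09b, NP15]).
Fix $q\geq 2$. Let $(F_n)_{n\geq 1}$ be a sequence of random variables in the $q$th Wiener chaos associated to the underlying isonormal Gaussian process such that $\mathbb{E}[F_n^2]=1$ for every $n\geq 1$. Then
(a) $F_n\to N\sim\mathscr{N}(0,1)$ in distribution if and only if $\mathbb{E}[F_n^4]\to 3$. Also, the following quantitative estimate is in order: for every $n\geq 1$,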
[TABLE]
(b) Under the assumptions of item (a) there exist two constants $C_1>0$ and $C_2>0$ (independent of $n$) such that the following optimal rate of convergence in total variation distance holds:
[TABLE]
Fix a parameter $\nu>0$. In this paper, the target distribution of interest is the so-called centered Gamma distribution denoted by $G(\nu)$. This means that $G(\nu)\overset{\text{law}}{=}2G_{\nu/2}-\nu$, where $G_{\nu/2}$ is a standard Gamma random variable with density $x\mapsto\Gamma(\nu/2)^{-1}x^{\nu/2-1}e^{-x}\mathbb{1}_{(0,\infty)}(x)$. Here $\Gamma$ denotes the Euler Gamma function. The centered Gamma distribution frequently appears as a natural limiting distribution in the context of the fourth moment theorems in several studies, see for example [ACP14, AMMP16, APP15, KT18, KT12, AS17, ET15, Led12, NR14, NP12b, AMPS17, EV15]. Our principal goal is to provide an optimal rate (analogous to that of item (b) of Theorem 1.1) for the Gamma approximation on a fixed Wiener chaos. The next result states, in its most up-to-date form, the outcome of the successive improvements made over the years in [NP09a, NP09b, NPR10, DP18].
Theorem 1.2.
Let $\nu>0$. Fix an even number $q\geq 2$ (see [NP09a, Remark 1.3, item 3] when $q$ is odd). Assume $F$ is a random element in the $q$th Wiener chaos such that $\mathbb{E}[F^2]=2\nu$. Then there exists a constant $C>0$ (which may depend on $\nu$ and $q$) such that
[TABLE]
where
[TABLE]
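Here $d_1$ stands for the so-called $1$-Wasserstein metric (see below for the definition). As a consequence, for a sequence $(F_n)_{n\geq 1}$ of random variables in the $q$th Wiener chaos such that $\mathbb{E}[F_n^2]=2\nu$ for every $n\geq 1$, the following remarkable equivalence of asymptotic statements is in order:
(a) $F_n\to G(\nu)$ in distribution.
(b) $\kappa_3(F_n)\to\kappa_3(G(\nu))$, and $\kappa_4(F_n)\to\kappa_4(G(\nu))$.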
The exact shape of the constant can be found in the aforementioned references. Note that unlike the case of normal approximation. We also recall the following natural generalization of the $1$-Wasserstein metric that we will make use of throughout the paper. Let and be two real-valued random variables. For , define
[TABLE]
where the class of the test functions is . Here, denotes the smallest Lipschitz constant of , see (17). A significant and also very challenging question, which we will deal with in this paper, is whether one can either provide an optimal rate or improve the rate (2) available in Theorem 1.2. For a general sequence and a suitable probability metric (often we assume that the topology induced by metric is stronger than convergence in distribution), following [NP12a, Definition 9.2.1], we say that a numerical sequence of strictly positive real numbers, decreasing to $0$, yields an optimal rate with respect to the metric , if there exist two constants and (independent of ) such that
[TABLE]
Our main result is the following non-asymptotic optimal Gamma approximation within the second Wiener chaos that improves upon the rate (2) by a square power.
Theorem 1.3 (Non-asymptotic optimal Gamma approximation).
Let , and . Assume that is a random variable in the second Wiener chaos associated with , such that . Then there exist two constants (possibly depending on the parameter ) such that
[TABLE]
where the quantity is given by (3).
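To make the quantity in (3) concrete, here is a small numerical sketch. It assumes, in line with the final display in the proof of Proposition 4.10, that (3) is the maximum of $|\kappa_3(F)-\kappa_3(G(\nu))|$ and $|\kappa_4(F)-\kappa_4(G(\nu))|$, and it relies on the second-chaos cumulant formula $\kappa_r(F)=2^{r-1}(r-1)!\sum_j\lambda_j^r$ and on $\kappa_r(G(\nu))=2^{r-1}(r-1)!\,\nu$ from Section 2.4. The function names and the two-eigenvalue family `example_eigs` are illustrative choices and do not come from the paper.

```python
from math import factorial, sqrt

def second_chaos_cumulant(eigs, r):
    """kappa_r of F = sum_j eigs[j] * (N_j^2 - 1), for r >= 2."""
    return 2 ** (r - 1) * factorial(r - 1) * sum(lam ** r for lam in eigs)

def gamma_cumulant(nu, r):
    """kappa_r of the centered Gamma law G(nu), for r >= 2."""
    return 2 ** (r - 1) * factorial(r - 1) * nu

def cumulant_gap(eigs, nu):
    """Assumed form of (3): max of the third and fourth cumulant gaps."""
    return max(abs(second_chaos_cumulant(eigs, r) - gamma_cumulant(nu, r))
               for r in (3, 4))

def example_eigs(eps):
    # Hypothetical family with lam1^2 + lam2^2 = 2, so E[F^2] = 2 * nu for nu = 2.
    return [sqrt(1 + eps), sqrt(1 - eps)]

for eps in (0.1, 0.01, 0.001):
    print(eps, cumulant_gap(example_eigs(eps), nu=2))
```

For this family the maximum is attained by the fourth-cumulant gap, which equals $96\,\varepsilon^2$, so the quantity decays quadratically in $\varepsilon$ while $\mathbb{E}[F^2]=2\nu$ stays fixed.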
Remark 1.4.
(a) A significant feature of the optimal rate (4), unlike the one in item (b) of Theorem 1.1 in the normal approximation case, is that it is non-asymptotic and a priori does not assume the law of the chaotic random variable $F$ to be close to that of $G(\nu)$.
(b) For the upper bound, the starting point is an adaptation of the technique developed in [NP15]. However, in order to achieve the optimal upper bound we introduce a novel technique within Stein’s method to split test functions, relying on tools from operator theory. This is the topic of Section 3.
(c) Our methodology to obtain the optimal lower bound is based on complex analysis and differs from that in [NP15]. To the best of our knowledge, this method is new.
(d) Theorem 1.3 has to be seen as a full generalization of the main findings of [AEK18], where we assumed some additional technical conditions.
The outline of our paper is as follows: In Section 2, we give a brief introduction to Malliavin calculus on the Wiener space and specify the notation used in the paper. Section 3 gathers the essential ingredients of Stein’s method for the centered Gamma distribution, developed recently in [DP18]. Section 4 contains the main theoretical findings of this paper: an upper bound for the distance between a general element living in a finite sum of Wiener chaoses and the target distribution in terms of iterated Gamma operators, as well as the optimal Gamma approximation rate. The end of this section is devoted to applications of our main findings. Lastly, we close the paper with an appendix focusing on the newly introduced Gamma operators.
2 Preliminaries: Gaussian Analysis and Malliavin Calculus
In this section, we provide a brief introduction to Malliavin calculus and define some of the operators used in this framework. For more details, see for example the textbooks [NP12a, Nua06, NN18].
2.1 Isonormal Gaussian Processes and Wiener Chaos
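Let $\mathfrak{H}$ be a real separable Hilbert space with inner product $\langle\cdot,\cdot\rangle_{\mathfrak{H}}$, and $X=\{X(h):h\in\mathfrak{H}\}$ be an isonormal Gaussian process, defined on some probability space $(\Omega,\mathscr{F},\mathbb{P})$. This means that $X$ is a family of centered, jointly Gaussian random variables with covariance structure $\mathbb{E}[X(g)X(h)]=\langle g,h\rangle_{\mathfrak{H}}$. We assume that $\mathscr{F}$ is the $\sigma$-algebra generated by $X$. For an integer $p\geq 1$, we will write $\mathfrak{H}^{\otimes p}$ or $\mathfrak{H}^{\odot p}$ to denote the $p$-th tensor product of $\mathfrak{H}$, or its symmetric $p$-th tensor product, respectively. If $H_p$ is the $p$-th Hermite polynomial, then the closed linear subspace of $L^2(\Omega)$ generated by the family $\{H_p(X(h)):h\in\mathfrak{H},\ \|h\|_{\mathfrak{H}}=1\}$ is called the $p$-th Wiener chaos of $X$ and will be denoted by $\mathscr{H}_p$. For $f\in\mathfrak{H}^{\odot p}$, let $I_p(f)$ be the $p$-th multiple Wiener-Itô integral of $f$ (see [NP12a, Definition 2.7.1]). An important observation is that for any $f\in\mathfrak{H}$ with $\|f\|_{\mathfrak{H}}=1$ we have that $H_p(X(f))=I_p(f^{\otimes p})$. As a consequence $I_p$ provides an isometry from $\mathfrak{H}^{\odot p}$ (equipped with the norm $\sqrt{p!}\,\|\cdot\|_{\mathfrak{H}^{\otimes p}}$) onto the $p$-th Wiener chaos $\mathscr{H}_p$ of $X$. It is a well-known fact, called the Wiener-Itô chaotic decomposition, that any element $F\in L^2(\Omega)$ admits the expansion
$$F=\sum_{p=0}^{\infty}I_p(f_p), \tag{5}$$
where $f_0=\mathbb{E}[F]$ and the $f_p\in\mathfrak{H}^{\odot p}$, $p\geq 1$, are uniquely determined. An important result is the following isometry property of multiple integrals. Let $f\in\mathfrak{H}^{\odot p}$ and $g\in\mathfrak{H}^{\odot q}$, where $1\leq q\leq p$. Then
$$\mathbb{E}\big[I_p(f)I_q(g)\big]=\begin{cases}p!\,\langle f,g\rangle_{\mathfrak{H}^{\otimes p}} & \text{if } p=q,\\ 0 & \text{otherwise.}\end{cases} \tag{6}$$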
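The isometry property can be checked by hand in its simplest instance. Assuming (6) is the standard relation $\mathbb{E}[I_p(f)I_q(g)]=p!\,\langle f,g\rangle\,\mathbb{1}_{\{p=q\}}$, and using $H_p(X(h))=I_p(h^{\otimes p})$ for $\|h\|_{\mathfrak{H}}=1$, it reduces to $\mathbb{E}[H_p(N)H_q(N)]=p!\,\mathbb{1}_{\{p=q\}}$ for a standard normal $N$. The following sketch verifies this exactly with integer arithmetic; the helper names are illustrative only.

```python
from math import factorial

def hermite(n):
    """Coefficients (lowest degree first) of the probabilists' Hermite polynomial H_n."""
    prev, cur = [1], [0, 1]  # H_0 = 1, H_1 = x
    if n == 0:
        return prev
    for k in range(1, n):
        # Recurrence H_{k+1}(x) = x * H_k(x) - k * H_{k-1}(x)
        nxt = [0] + cur
        for i, c in enumerate(prev):
            nxt[i] -= k * c
        prev, cur = cur, nxt
    return cur

def gaussian_moment(k):
    """E[N^k] for a standard normal N: 0 for odd k, (k-1)!! for even k."""
    if k % 2:
        return 0
    out = 1
    for j in range(1, k, 2):
        out *= j
    return out

def inner(p, q):
    """E[H_p(N) H_q(N)], computed exactly from the polynomial coefficients."""
    hp, hq = hermite(p), hermite(q)
    return sum(a * b * gaussian_moment(i + j)
               for i, a in enumerate(hp) for j, b in enumerate(hq))

for p in range(5):
    print([inner(p, q) for q in range(5)])
```

The printed matrix is diagonal with entries $p!$, which is the one-dimensional shadow of the orthogonality of different chaoses.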
2.2 The Malliavin Operators
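We denote by $\mathcal{S}$ the set of smooth random variables, i.e. all random variables of the form $F=g(X(\varphi_1),\ldots,X(\varphi_n))$, where $n\geq 1$, $\varphi_1,\ldots,\varphi_n\in\mathfrak{H}$, and $g:\mathbb{R}^n\to\mathbb{R}$ is a $C^{\infty}$-function, whose partial derivatives have at most polynomial growth. For these random variables, we define the Malliavin derivative of $F$ with respect to $X$ as the $\mathfrak{H}$-valued random element $DF\in L^2(\Omega,\mathfrak{H})$ defined as
$$DF=\sum_{i=1}^{n}\frac{\partial g}{\partial x_i}\big(X(\varphi_1),\ldots,X(\varphi_n)\big)\,\varphi_i.$$
The set $\mathcal{S}$ is dense in $L^2(\Omega)$ and using a closure argument, we can extend the domain of $D$ to $\mathbb{D}^{1,2}$, which is the closure of $\mathcal{S}$ in $L^2(\Omega)$ with respect to the norm $\|F\|_{\mathbb{D}^{1,2}}^2:=\mathbb{E}[F^2]+\mathbb{E}\big[\|DF\|_{\mathfrak{H}}^2\big]$. See [NP12a] for a more general definition of higher order Malliavin derivatives and spaces $\mathbb{D}^{p,q}$. The Malliavin derivative satisfies the following chain rule. If $\varphi:\mathbb{R}^n\to\mathbb{R}$ is a continuously differentiable function with bounded partial derivatives and $F=(F_1,\ldots,F_n)$ is a vector of elements of $\mathbb{D}^{1,q}$ for some $q\geq 1$, then $\varphi(F)\in\mathbb{D}^{1,q}$ and
$$D\varphi(F)=\sum_{i=1}^{n}\frac{\partial\varphi}{\partial x_i}(F)\,DF_i. \tag{7}$$
Note that the conditions on $\varphi$ are not optimal and can be weakened. For $F\in L^2(\Omega)$, with chaotic expansion as in (5), we define the pseudo-inverse of the infinitesimal generator of the Ornstein-Uhlenbeck semigroup as
$$L^{-1}F=-\sum_{p=1}^{\infty}\frac{1}{p}\,I_p(f_p).$$
The following integration by parts formula is one of the main ingredients in proving the main theorem of Section 4.1. Let $F,G\in\mathbb{D}^{1,2}$. Then
$$\mathbb{E}[FG]=\mathbb{E}[F]\,\mathbb{E}[G]+\mathbb{E}\big[\langle DG,-DL^{-1}F\rangle_{\mathfrak{H}}\big]. \tag{8}$$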
2.3 Gamma Operators and Cumulants
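Let $F$ be a random variable with characteristic function $\varphi_F(t)=\mathbb{E}\big[e^{itF}\big]$. We define its $r$-th cumulant, denoted by $\kappa_r(F)$, as
$$\kappa_r(F)=\frac{1}{i^r}\,\frac{\partial^r}{\partial t^r}\log\varphi_F(t)\Big|_{t=0}.$$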
Let be a random variable with a finite chaos expansion. We define the operators , via and
[TABLE]
This is the Gamma operator used in the proof of the main theorem in [NP15], although it is defined differently there. Note that there is also an alternative definition, which can be found in most other papers in this framework, see for example Definition 8.4.1 in [NP12a] or Definition 3.6 in [BBNP12]. For the sake of completeness, we also mention the classical Gamma operators, which we also call alternative Gamma operators, which we shall denote by . These are defined via
[TABLE]
The classical Gamma operators are related to the cumulants of by the following identity from [NP10]: For all , we have
[TABLE]
If , this does not hold anymore for our new Gamma operators. Instead, in our next result, we will list some useful relations between the classical and the new Gamma operators.
Proposition 2.1.
Let $F$ be a centered random variable admitting a finite chaos expansion. Then
- (a)
$\Gamma_1(F)=\Gamma_{alt,1}(F)$,
- (b)
$\mathbb{E}\big[\Gamma_j(F)\big]=\mathbb{E}\big[\Gamma_{alt,j}(F)\big]=\frac{1}{j!}\kappa_{j+1}(F)$ for $j=1,2$,
- (c)
$\mathbb{E}\big[\Gamma_3(F)\big]=2\,\mathbb{E}\big[\Gamma_{alt,3}(F)\big]-\operatorname{Var}\big(\Gamma_1(F)\big)=\frac{1}{3}\kappa_4(F)-\operatorname{Var}\big(\Gamma_1(F)\big)$,
- (d)
When $F=I_2(f)$, for some $f\in\mathfrak{H}^{\odot 2}$, is an element of the second Wiener chaos, then
[TABLE]
The proofs of these statements can be found in the appendix along with an explicit representation of the Gamma operators in terms of contractions.
2.4 Useful facts on Second Wiener Chaos
Let $F=I_2(f)$, for some $f\in\mathfrak{H}^{\odot 2}$, be a generic element in the second Wiener chaos. It is a classical result (see [NP12a, section 2.7.4]) that this kind of random variable can be analyzed through the associated Hilbert-Schmidt operator $A_f:\mathfrak{H}\to\mathfrak{H}$ that maps $g\mapsto f\otimes_1 g$. Denote by $\{\lambda_{f,j}\}_{j\geq 1}$ the set of eigenvalues of $A_f$. We also introduce the following sequence of auxiliary kernels $\big\{f\otimes_1^{(p)}f:p\geqslant 1\big\}\subset\mathfrak{H}^{\odot 2}$, defined recursively as $f\otimes_1^{(1)}f=f$, and, for $p\geq 2$, $f\otimes_1^{(p)}f=\big(f\otimes_1^{(p-1)}f\big)\otimes_1 f$.
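Proposition 2.2 (see e.g. [NP12a, p. 43]).
1. The random element $F$ admits the representation
$$F=\sum_{j\geq 1}\lambda_{f,j}\big(N_j^2-1\big), \tag{11}$$
where the $\lambda_{f,j}$ are the eigenvalues of $A_f$, the $(N_j)_{j\geq 1}$ are i.i.d. $\mathscr{N}(0,1)$ and the series converges in $L^2(\Omega)$ and almost surely.
2. For every $r\geq 2$,
$$\kappa_r(F)=2^{r-1}(r-1)!\sum_{j\geq 1}\lambda_{f,j}^{\,r}=2^{r-1}(r-1)!\,\operatorname{Tr}\big(A_f^r\big), \tag{12}$$
where $\operatorname{Tr}(A_f^r)$ stands for the trace of the $r$th power of the operator $A_f$.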
It is known that when $\nu$ is an integer, $G(\nu)$ is a centered chi-squared random variable with $\nu$ degrees of freedom, and (11) shows that $G(\nu)$ is itself an element of the second Wiener chaos, where $\nu$-many of the eigenvalues are $1$ and the remaining ones are $0$. Hence, in this case, we deduce from (12) that $\kappa_r(G(\nu))=2^{r-1}(r-1)!\,\nu$. Perhaps not surprisingly, this is also the case when $\nu$ is any positive real number.
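In finite dimension the trace formula is easy to experiment with. The sketch below assumes that (12) reads $\kappa_r(F)=2^{r-1}(r-1)!\,\operatorname{Tr}(A_f^r)$ and identifies the kernel $f$ with a symmetric matrix $a$, so that $F=\sum_{i,j}a_{ij}(N_iN_j-\delta_{ij})$. The matrices and function names are illustrative choices, not taken from the paper.

```python
from math import factorial

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace_power(a, r):
    """Trace of the r-th power of the square matrix a, for r >= 1."""
    power = a
    for _ in range(r - 1):
        power = mat_mul(power, a)
    return sum(power[i][i] for i in range(len(a)))

def second_chaos_cumulant(a, r):
    """kappa_r of F = sum_{i,j} a[i][j] * (N_i N_j - delta_ij), for r >= 2."""
    return 2 ** (r - 1) * factorial(r - 1) * trace_power(a, r)

identity3 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]  # F = chi^2_3 - 3, i.e. G(3)
product = [[0, 1], [1, 0]]                     # F = 2 * N_1 * N_2

print([second_chaos_cumulant(identity3, r) for r in (2, 3, 4)])
print([second_chaos_cumulant(product, r) for r in (2, 3, 4)])
```

The identity matrix reproduces the centered chi-squared cumulants $2\nu$, $8\nu$, $48\nu$ with $\nu=3$, while the symmetric law of $2N_1N_2$ has a vanishing third cumulant.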
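Lemma 2.3.
Let $\nu>0$ and $r\geq 2$. Then
$$\kappa_r\big(G(\nu)\big)=2^{r-1}(r-1)!\,\nu. \tag{13}$$
Proof.
Since the cumulant generating function of a Gamma random variable is well-known, we can easily compute that of $G(\nu)$ to be $K(t):=\log\mathbb{E}\big[e^{tG(\nu)}\big]=-\nu t-\frac{\nu}{2}\log(1-2t)$ for $t<\frac{1}{2}$. By simple induction over $r\geq 2$, we obtain
$$K^{(r)}(t)=2^{r-1}(r-1)!\,\nu\,(1-2t)^{-r}.$$
The result now follows by letting $t=0$. ∎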
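The statement of Lemma 2.3 can be sanity-checked with exact rational arithmetic for a non-integer parameter. The sketch assumes that $G(\nu)$ has the law of $2\gamma-\nu$ with $\gamma$ a unit-scale Gamma variable of shape $\nu/2$, and that the lemma asserts $\kappa_r(G(\nu))=2^{r-1}(r-1)!\,\nu$. It builds raw moments from the rising factorial and converts them with the standard moment-to-cumulant recursion; all names are illustrative.

```python
from fractions import Fraction
from math import comb, factorial

def centered_gamma_moments(nu, n_max):
    """Raw moments E[G^k], k = 0..n_max, of G = 2 * Gamma(nu/2) - nu."""
    nu = Fraction(nu)
    a = nu / 2
    gamma_moments = [Fraction(1)]
    for j in range(n_max):
        # E[Gamma^(j+1)] = E[Gamma^j] * (a + j)  (rising factorial)
        gamma_moments.append(gamma_moments[-1] * (a + j))
    return [sum(comb(k, j) * 2 ** j * gamma_moments[j] * (-nu) ** (k - j)
                for j in range(k + 1))
            for k in range(n_max + 1)]

def cumulants_from_moments(m):
    """kappa_1..kappa_n via kappa_n = m_n - sum_{k<n} C(n-1, k-1) kappa_k m_{n-k}."""
    kappa = [None]  # padding so that the list index equals the cumulant order
    for n in range(1, len(m)):
        kappa.append(m[n] - sum(comb(n - 1, k - 1) * kappa[k] * m[n - k]
                                for k in range(1, n)))
    return kappa[1:]

nu = Fraction(5, 2)
kappa = cumulants_from_moments(centered_gamma_moments(nu, 6))
for r in range(2, 7):
    print(r, kappa[r - 1], 2 ** (r - 1) * factorial(r - 1) * nu)
```

Working with `Fraction` avoids floating-point noise, so the two printed columns agree exactly.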
Lemma 2.4.
Let for some , and denote by the corresponding Hilbert-Schmidt operator with eigenvalues . Then for every ,
[TABLE]
Proof.
From [APP15] equation (24), which follows by induction on , we have the representation
[TABLE]
Using the isometry property (6), we obtain
[TABLE]
The result now follows with (12). ∎
3 Stein’s Method for the centered Gamma distribution
Let be distributed according to a Gamma distribution with shape parameter . It means that random variable admits the density
[TABLE]
Consider the centered Gamma random variable $G(\nu)$. Stein’s method for $G(\nu)$ was first studied in [Luk94] and later refined in [Pic04]. It is well known (see e.g. [DP18, equation (24)]) that the Stein equation for the centered Gamma random variable $G(\nu)$ associated to the test function $h$ is given by the following first order ODE with polynomial coefficients
$$2(x+\nu)f'(x)-xf(x)=h(x)-\mathbb{E}\big[h(G(\nu))\big], \tag{16}$$
where $h$ is measurable and $\mathbb{E}\big|h(G(\nu))\big|<\infty$. The following result is taken from [DP18, Theorem 2.3] and plays a crucial role in our analysis. For the reader’s convenience we restate it here. We also need the following convention that for every function $h:\mathbb{R}\to\mathbb{R}$ the quantity $\|h'\|_{\infty}$ stands for the smallest Lipschitz constant, i.e.
$$\|h'\|_{\infty}:=\sup_{x\neq y}\frac{|h(x)-h(y)|}{|x-y|}. \tag{17}$$
It is worth pointing out that $\|h'\|_{\infty}$ coincides with the uniform norm of the derivative of $h$ whenever $h$ is differentiable.
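A quick way to see the Stein equation at work is through moments. Assuming (16) has the form $2(x+\nu)f'(x)-xf(x)=h(x)-\mathbb{E}[h(G(\nu))]$ as in [DP18], the left-hand side has zero mean under $G(\nu)$; choosing $f(x)=x^k$ then gives the recursion $\mathbb{E}[G^{k+1}]=2k\big(\mathbb{E}[G^{k}]+\nu\,\mathbb{E}[G^{k-1}]\big)$. The sketch below (with an illustrative function name) generates moments from this recursion and compares them with the values implied by the cumulants $2\nu$, $8\nu$, $48\nu$.

```python
from fractions import Fraction

def moments_from_stein(nu, n_max):
    """Moments E[G^k], k = 0..n_max, of G(nu) generated by the Stein recursion."""
    nu = Fraction(nu)
    m = [Fraction(1), Fraction(0)]  # E[G^0] = 1 and E[G] = 0 (take f = 1)
    for k in range(1, n_max):
        # f(x) = x^k in E[2 (G + nu) f'(G) - G f(G)] = 0
        m.append(2 * k * (m[k] + nu * m[k - 1]))
    return m[: n_max + 1]

nu = Fraction(5, 2)
m = moments_from_stein(nu, 4)
print(m[2], 2 * nu)                      # variance
print(m[3], 8 * nu)                      # third moment equals third cumulant
print(m[4], 48 * nu + 3 * (2 * nu) ** 2) # fourth moment = kappa_4 + 3 * kappa_2^2
```

Only the values of $f$ on the support of $G(\nu)$ matter here, so the check does not depend on how the equation is extended to the whole real line.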
Theorem 3.1 ([DP18, Theorem 2.3]).
(a) Let $h$ be a Lipschitz-continuous function on the whole real line $\mathbb{R}$. Then there exists a unique bounded Lipschitz-continuous solution $f_h$ to the equation (16) on the whole real line satisfying the bounds
[TABLE]
where the constant .
(b) Suppose that the function $h$ is continuously differentiable on $\mathbb{R}$ such that both $h$ and $h'$ are Lipschitz-continuous. Then there is a continuously differentiable solution $f_h$ of equation (16) on $\mathbb{R}$ whose derivative is Lipschitz-continuous, and moreover
[TABLE]
3.1 Explicit Formula for the Solution of the Stein Equation
This section is entirely based on [DP18]. It is known that a Stein equation for the distribution is given by
[TABLE]
where is a measurable test function with . Döbler and Peccati [DP18, p. 3406] showed that if , then there exists a unique Lipschitz-continuous function on solving (18), given by
[TABLE]
where for , $f_{h}^{-}(x)=\frac{1}{xq_{l}(x)}\int_{0}^{x}\big(h(t)-\mathbb{E}[h(X_{r})]\big)q_{l}(t)\,dt$ and . Also $f_{h}^{+}(x)=\frac{1}{xp_{r}(x)}\int_{0}^{x}\big(h(t)-\mathbb{E}[h(X_{r})]\big)p_{r}(t)\,dt$ for . Furthermore, one can extend and continuously by setting . Now, for a given test function , set . Following [DP18, p. 3399], if is the solution of (18) (with ), where is replaced by , then solves (16). Therefore, the unique bounded solution of the Stein equation (16) admits the following explicit representation
[TABLE]
where is the density of the centered Gamma distribution given by
[TABLE]
and $\hat{q}(x):=\frac{1}{2}\,q_{l}\big(\frac{x+\nu}{2}\big)=-\,2^{-\frac{\nu}{2}}\big(-(x+\nu)\big)^{\frac{\nu}{2}-1}\,e^{-\frac{x+\nu}{2}}$. Also note that
[TABLE]
The following lemma will be used in the proof of Proposition 3.7. Using a simple adaptation, a similar statement also holds for the solution corresponding to the Stein equation (16) of the centered Gamma distribution .
Lemma 3.2.
Let with cumulative distribution function , and be a Lipschitz-continuous function. Then there exist two non-negative bounded functions on , and on such that as , and the following estimates are in order:
- (a)
for it holds that $\big|f'_{h}(x)\big|\leq 2\|h'\|_{\infty}U^{+}(x)$,
- (b)
for it holds that $\big|f'_{h}(x)\big|\leq 2\|h'\|_{\infty}U^{-}(x)$.
Proof.
Let . Consider
[TABLE]
It is known that both estimates in parts (a) and (b) take place with instead of (see [Döb15, Corollary 3.15. Part (b)], and [DP18, relation (35), page 4304]). Moreover, for , the function satisfies
[TABLE]
Also, it is straightforward to check that as , the function is decreasing to $0$. (It is also true that for [DP18, see the top of page 3403]). Part (b) is similar. ∎
3.2 An Operator Theory Approach
Let . Define
[TABLE]
Lemma 3.3.
Let . For every given , define . Then is a norm on the real vector space , and furthermore the pair is a Banach space, the so-called Lipschitz-space.
Proof.
It is straightforward to see that the pair is a normed space. Furthermore, it is a classical fact that it is a Banach space, see for example [Wea99, Proposition 6.1.2]. ∎
Lemma 3.4.
Consider the mapping such that for every , the action is defined as the unique bounded solution to the centered Gamma Stein equation (16), which is guaranteed to exist by Theorem 3.1 item (a). Then , and is a bounded linear operator from the Banach space to itself.
Proof.
Let . Then a direct application of Theorem 3.1 item (a) yields that . To show linearity of , take , and . Then using the Gamma Stein equation (16), together with the fact that is the unique bounded solution to the latter, we infer that . For the boundedness of we apply Theorem 3.1 part (a) to obtain
[TABLE]
Hence ∎
Proposition 3.5.
Consider the bounded linear operator $S$ defined as in Lemma 3.4. Then the following statements are in order.
- (a)
The operator $S$ does not admit any non-zero eigenvalue, i.e. if for some non-zero constant , then necessarily .
- (b)
For every non-zero scalar , the operator is a one to one map, where stands for the identity operator.
Proof.
(a) Assume to the contrary that there exists a non-zero scalar such that
[TABLE]
We claim that . Otherwise introduce the auxiliary test function . Then, obviously, , and moreover by virtue of relation (21), we have . Furthermore, we have , because . Therefore, the function satisfies the first order non-homogeneous ODE
[TABLE]
Then the general solutions of the ODE (22) on the interval are given by
[TABLE]
where . Now, if , then as , we have
[TABLE]
This implies that as , which is a contradiction to the fact that must be a bounded function. When , i.e. as , we obtain, for some finite constant , that
[TABLE]
which is either an infinite number or a finite number depending on whether is a negative integer or not. Therefore, in any case, we have obtained that as , which is a contradiction. Hence always . This implies that by using (20). On the other hand, satisfies the first order ode (16), and therefore
[TABLE]
The general solutions of the ordinary differential equation (24) on the interval are given by
[TABLE]
for some constant . If , then this is a contradiction to the fact that is a bounded function over the whole real line. Hence it must hold that . Similarly, the general solutions of the ordinary differential equation (24) on the interval are given by
[TABLE]
where is a general constant. Now if , we infer that is unbounded on the domain , which leads to a contradiction. Therefore , and as a direct consequence we get .
(b) Assume that is a non-zero scalar. Then the mapping is a linear operator. Hence, is a one to one map if and only if , and the latter follows at once from part (a). ∎
Lemma 3.6.
Let be a sequence of -Lipschitz continuous functions for every : i.e. for all , and every ,
[TABLE]
Assume further that pointwise as tends to infinity. Then is also an -Lipschitz function and uniformly.
Proof.
It is elementary. ∎
Proposition 3.7.
The bounded linear operator defined as in Lemma 3.4 is a compact operator.
Proof.
Let denote the unit ball of the Banach space . We need to show that the image of the unit ball is a precompact set in , or equivalently, that every sequence has a convergent subsequence in the topology of the Banach space . We divide the rest of the proof in three steps.
Step (1): First we show that there exists a subsequence such that pointwise for some . Moreover , and pointwise. Note that is a bounded subset of . It is well known (see for example [Wea99, Chapter 2] or [Wea18, Theorem 2.4, and Proposition 2.1] as well as the survey [God15]) that the Banach space is a predual space, i.e. there exists a (unique) Banach space , the so called Arens-Eells space, such that . On the other hand, the Banach-Alaoglu theorem implies that the unit ball is weak-∗ compact. Moreover, is a separable Banach space, so the Arens-Eells Banach space is, too [God15]. Hence the weak-∗topology on is metrizable. Therefore, weak-∗ compact is the same as weak-∗ sequentially compact on the unit ball . It follows that the sequence contains a subsequence that converges in the weak-∗ topology to an element . Without loss of generality, we assume that the subsequence is given by the sequence itself. Hence there exists an element such that in the -topology. Furthermore, the weak-∗ topology on the bounded subsets of coincides with the topology of pointwise convergence, see [Wea18, Proposition 2.1]. As a consequence, pointwise (here one should not expect that weakly; otherwise this implies that the unit ball is weakly sequentially compact, and therefore the Banach space is reflexive which is a contradiction). An application of the Lebesgue dominated convergence theorem implies that pointwise. Taking into account these observations together with the fact that for every we have
[TABLE]
there exists a function such that pointwise. On the other hand, for every we have that
[TABLE]
Recall that . Hence, the function satisfies the Gamma Stein equation
[TABLE]
Hence , and also pointwise.
Step (2): In this step, we show that is a family of functions having the equivanishing at infinity property, i.e. for every given $\varepsilon>0$, there exists a compact interval such that $\big|f(x)\big|<\varepsilon$ for all and for all . To do this, we use the explicit integral representation (19). Note that since , we have for all . When , then (recall that is the density of ):
[TABLE]
Now if , then and thus
[TABLE]
When , set . We have
[TABLE]
where is a polynomial of degree . Since we always have , it follows that . When , again using (19) of the explicit representation of the solution function , we get
[TABLE]
Hence, the case can now be discussed similarly. Note that the upper bounds for that we found do not depend on the choice of the test function . Therefore, we have shown that, in addition to , the collection is a family of functions that are equivanishing at infinity.
Step (3): Next we show that as ,
[TABLE]
By Step (2), for a given , there exists a compact interval such that
[TABLE]
On the other hand, the family consists of -Lipschitz-continuous functions (see part (a), Theorem 3.1), and by step (1) converges pointwise to on the compact interval . Hence, Lemma 3.6 yields that
[TABLE]
Finally relations (28) and (29) readily imply that uniformly on the real line. Now, we are left to show that . To this end, first note that for every , and every it holds that . Hence, the family consists of -Lipschitz continuous functions. On the other hand, Lemma 3.2 yields that the family is equivanishing at infinity. The result now follows.
∎
Theorem 3.8.
Let be a non-zero scalar. Then for every there exists a unique solution to the functional equation
[TABLE]
Proof.
This is a direct application of Propositions 3.5, 3.7, and the classical Fredholm alternative Theorem [Meg98, 3.4.24, page 329]. ∎
For , let denote the ball of radius .
Proposition 3.9.
Let , and be a non-zero scalar. Then there exists a universal constant (may depend on , , and ) such that for every the unique solution of the functional equation (30) satisfies .
Proof.
From Proposition 3.5 and Theorem 3.8, the linear bounded operator is a bijective map. Hence the result follows at once using the inverse mapping Theorem [Meg98, 1.6.6 Corollary]. ∎
4 Optimal Gamma Approximation
4.1 A General Stein-Malliavin Upper Bound
In the following, we present a general Malliavin-Stein upper bound that constitutes the cornerstone to achieve our final optimal goal. We start with the following useful result. Sometimes, we will use centered versions of the Gamma-operators, i.e.
[TABLE]
Proposition 4.1.
Let be a centered random variable admitting a finite chaos expansion with . Let . Then there exists a constant (only depending on ), such that
[TABLE]
where we recall that $\mathcal{B}_{1,1}:=\big\{h:\mathbb{R}\to\mathbb{R}\ \text{Lipschitz-continuous}\,:\,\|h\|\leq 1,\,\|h'\|_{\infty}\leq 1\big\}$.
Proof.
Consider the centered Gamma Stein equation (16). Let be an arbitrary test function (note that ). Then by using the Malliavin integration by parts formula (8), we get
[TABLE]
Now the claim follows at once by a direct application of Theorem 3.1. ∎
To simplify computations, we continue with the following useful Lemmas.
Lemma 4.2.
Let be a Lipschitz-continuous function, where and are bounded by a constant only depending on . Consider the solution of the Gamma Stein equation (16) associated to the test functions . Assume that is a centered random variable with variance . Then for any :
[TABLE]
Proof.
First note that . Thus
[TABLE]
Now, we use the integration-by-parts formula (8) in combination with the chain rule (7) to obtain
[TABLE]
and similarly
[TABLE]
Hence, putting everything together, the result follows. ∎
Lemma 4.3.
Let be a Lipschitz-continuous function, where and are bounded by a constant only depending on . Assume that and stand for the solutions of the Gamma Stein equation (16) associated to the test functions and respectively. Let be a centered random variable with variance . Then the following identities take place.
- (a)
[TABLE]
- (b)
[TABLE]
Proof.
We apply Lemma 4.2 twice to obtain
[TABLE]
Note that we cannot translate directly into the fourth cumulant, but instead by Proposition 2.1 part (c), we have . The variance term can be written as
[TABLE]
Putting everything together, the claim follows. ∎
Remark 4.4.
We point out that for both linear cumulant combinations appearing in the right hand sides of parts (a) and (b) in Lemma 4.3 it holds that
[TABLE]
Now, we are ready to state the main result of this section.
Theorem 4.5.
Let be a centered random variable admitting a finite chaos expansion with . Let . Then there exists a constant (only depending on ), such that
[TABLE]
Proof.
Using Proposition 4.1, Theorem 3.8 with , and Proposition 3.9 we obtain that
[TABLE]
where stands for a general constant depending only on the parameter . Now, we apply Lemma 4.3 item (b) on $\mathbb{E}\big[h(F)\big(\overline{\Gamma}_{1}(F)-2F\big)\big]$, and item (a) on $\mathbb{E}\big[S(h)(F)\big(\overline{\Gamma}_{1}(F)-2F\big)\big]$. Then, putting everything together, the result follows by applying the Cauchy-Schwarz inequality and Theorem 3.1, as well as using the fact that , and , see (13). ∎
Remark 4.6.
The splitting technique implemented in the proof of Theorem 4.5 by using operator theory is vital to obtain an optimal upper bound. In fact, not doing it, instead of estimate (4.5), the best estimate one can achieve (under the assumption in Theorem 4.5) is a similar bound as (4.5) with the quantity instead of
[TABLE]
On the other hand, it is not difficult to see that for a sequence in the second Wiener chaos with a finite number of non-zero spectral coefficients such that for every , as it holds that
[TABLE]
resulting in a suboptimal rate. See also illustrating Example 4.13 for further clarifications.
4.2 The Upper Bound: Second Wiener Chaos
In the present section, in order to handle the variance quantities of the Gamma operators appearing in the right hand side of estimate (4.5) in terms of cumulants, we consider the case of second Wiener chaos. In this setting, the connection is apparent thanks to Lemma 2.4.
Proposition 4.7.
Let , and be in the second Wiener chaos such that . Then, for every , with constant , we have
[TABLE]
In particular, by choosing , we obtain
[TABLE]
Proof.
We prove the first estimate in (33); the second estimate can be proven by iteration using similar arguments. Let . Denote by the associated Hilbert-Schmidt operator. As in the proof of Lemma 2.4, we can write
[TABLE]
where in the third step, we have used the trace inequality for non-negative operators , see [Liu07]. ∎
Remark 4.8.
The estimates in (33) can also be deduced from representation (14) together with the classical estimate in [BBNP12, Lemma 4.2].
Proposition 4.9.
Let , and in the second Wiener chaos such that . Assume . Then there exists a general constant (possibly depending on the parameters and ) such that
[TABLE]
In particular, by choosing , we obtain the crucial estimate
[TABLE]
Proof.
For the first estimate, using representation (14) we can write
[TABLE]
where we have used the classical estimate in [BBNP12, Lemma 4.2]. The second estimate is a direct application of [Dra16, Corollary 1] with combined with for every , see the proof of Lemma 2.4. ∎
4.3 The Lower Bound: Second Wiener Chaos
Proposition 4.10.
Let , and be in the second Wiener chaos such that . Then there exists a general constant (possibly depending on the parameter ) such that
[TABLE]
where the quantity is given by (3).
Proof.
Fix a real number whose range of values will be determined later on. Taking into account the second moment assumption, it is a classical result (see [Luk70, Chapter ]) that the characteristic functions and are analytic inside the strip . Moreover, in the strip of regularity , they follow the integral representations
[TABLE]
where and stand for the probability measures of and respectively. Recall that all elements in the second Wiener chaos have exponential moments, see [NP12a, Proposition , item (iii)]. Denote by the domain
[TABLE]
Then for any , together with a Fubini’s argument, we have that
[TABLE]
Hence for every . Let such that the disk with the origin as center and radius is contained in the domain (note that depends only on , since is a free parameter. For example, one can choose ). Now for any , and using the fact that
[TABLE]
one can readily conclude that the function is bounded away from $0$ on the disk . Also, for any ,
[TABLE]
Therefore, for any ,
[TABLE]
Hence the function is also bounded away from $0$ on the disk . Moreover, relation (36) implies that the following power series (in a complex variable) converge to analytic functions as soon as ;
[TABLE]
Thus we come to the conclusion that the functions and are analytic on the disk . Moreover, there exists a constant such that for every . This implies that on the disk there exist two analytic functions and such that
[TABLE]
i.e. and , for . In fact, the functions and are given by the power series (37). Since the derivative of the analytic branch of the complex logarithm is (see [Con95, Corollary ]), one can infer that for some constant whose value may differ from line to line and for every , we have
[TABLE]
Now, using Cauchy’s estimate for the coefficients of analytic functions, for any , we obtain that
[TABLE]
Therefore, $\max\big\{\lvert\kappa_{3}(F)-\kappa_{3}(G(\nu))\rvert,\ \lvert\kappa_{4}(F)-\kappa_{4}(G(\nu))\rvert\big\}\leq_{C}d_{2}(F,G(\nu))$.
∎
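The last step of the proof can be sketched as a worked inequality. The notation is hypothetical: $\psi_F$ and $\psi_G$ stand for the two analytic logarithms of the characteristic functions on a disk of radius $R$ around the origin, and $0<\rho<R$. The sketch also assumes the usual expansion of the logarithm of a characteristic function with cumulants as coefficients.

```latex
% Sketch under stated assumptions; \psi_F, \psi_G, R and \rho are illustrative names.
\[
  \psi_F(z)-\psi_G(z)
  = \sum_{r\ge 1}\frac{i^{r}}{r!}\,\bigl(\kappa_r(F)-\kappa_r(G(\nu))\bigr)\,z^{r},
  \qquad |z|<R ,
\]
% Cauchy's estimate applied to the r-th Taylor coefficient on the circle |z| = \rho:
\[
  \bigl|\kappa_r(F)-\kappa_r(G(\nu))\bigr|
  \le \frac{r!}{\rho^{r}}\,\sup_{|z|=\rho}\bigl|\psi_F(z)-\psi_G(z)\bigr| .
\]
```

Taking $r=3$ and $r=4$, a bound of the supremum by a constant multiple of $d_{2}(F,G(\nu))$ yields the claimed lower estimate, with a constant depending on $\rho$.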
4.4 Main Result: Non-Asymptotic Optimal Gamma Approximation
Now we are ready to present a non-asymptotic optimal Gamma approximation in full generality on the second Wiener chaos in terms of the maximum of the third and fourth cumulants. The following result is the counterpart of the same phenomenon in the case of normal approximation, see [NP15, Theorem 1.2] or Theorem 1.1 item (b).
Theorem 4.11.
Let , and . Assume that belongs to the second Wiener chaos such that . Then there exist two general constants (possibly depending on the parameter ) such that
[TABLE]
Recall that
[TABLE]
Proof.
For the upper bound, combine Theorem 4.5 with estimate (34) of Proposition 4.7, estimate (35) of Proposition 4.9, and Lemma 2.4 with . The lower bound follows directly from Proposition 4.10. ∎
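The cumulant quantity controlling the rate in Theorem 4.11 is easy to evaluate numerically. The following is a minimal sketch, not the authors' code. It assumes that $F$ is written in spectral form $F=\sum_i \lambda_i (N_i^2-1)$ with independent standard normal $N_i$, and that $G(\nu)$ denotes the centered Gamma law $2\gamma(\nu/2)-\nu$, where $\gamma(\nu/2)$ is a Gamma variable with shape $\nu/2$, so that $\kappa_r(G(\nu))=2^{r-1}(r-1)!\,\nu$. All function names are illustrative.

```python
from math import factorial


def chaos_cumulant(eigenvalues, r):
    """r-th cumulant (r >= 2) of F = sum_i lam_i * (N_i**2 - 1), N_i i.i.d. N(0, 1)."""
    if r < 2:
        raise ValueError("the formula is valid for r >= 2 only")
    return 2 ** (r - 1) * factorial(r - 1) * sum(lam ** r for lam in eigenvalues)


def gamma_cumulant(nu, r):
    """r-th cumulant (r >= 2) of the centered Gamma law G(nu) (assumed convention)."""
    if r < 2:
        raise ValueError("the formula is valid for r >= 2 only")
    return 2 ** (r - 1) * factorial(r - 1) * nu


def cumulant_gap(eigenvalues, nu):
    """max(|k3(F) - k3(G(nu))|, |k4(F) - k4(G(nu))|) for a second-chaos element F."""
    return max(
        abs(chaos_cumulant(eigenvalues, r) - gamma_cumulant(nu, r))
        for r in (3, 4)
    )
```

For instance, `cumulant_gap([1.0] * 3, 3)` vanishes, since in that case $F$ has the centered chi-square law with three degrees of freedom.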
Remark 4.12.
In this remark we briefly comment on the natural question of generalizing the optimal rate (38) to higher order Wiener chaoses. In addition to the complete lack of any non-artificial example of a sequence of random variables in a fixed Wiener chaos of order converging towards the distribution, our investigations imply that such an extension would come at the cost of very complicated computations involving norms of contraction operators to verify estimate (35) (possibly with a different constant). Furthermore, our method for achieving the optimal lower bound, which relies on complex analysis, can no longer be used in higher order chaoses, and hence new ideas are required.
4.5 Examples
We start with the following naive example that illustrates the essential role of our operator theory technique to achieve the optimal rate. It is worth mentioning that all the rates achieved in the forthcoming examples are better (by a square power) than those that can be obtained by the Malliavin-Stein bound [NP09b, Theorem 1.5]. In the following, when and are two non-negative real number sequences, we write if , for some constant .
Example 4.13.
Let be independent. Consider the sequence
[TABLE]
First note that for every . Also, using Proposition 2.2 item , and relation (13), simple computations yield that . Similarly . Therefore, our main Theorem 4.11 implies
[TABLE]
The following important remarks are in order. (a) This example represents a typical scenario, in which, in order to obtain the optimal upper bound, one needs to join together two Gamma quantities and . In fact, it is not difficult, using Lemma 2.4, to see that
[TABLE]
Compare this with Remark 4.6. (b) It is classical that the density function of the random variable admits the following explicit representation in terms of confluent hypergeometric functions,
[TABLE]
Also recall that the density of the target is given by . Using rather long and tedious computations, one can show that the optimal estimate (39) continues to hold in the stronger distance of total variation, namely that
[TABLE]
Example 4.14.
(U-statistics) In this example, we consider a second order U-statistic with degeneracy order inspired by [AAPS17, section 3.1]. The reader may consult the excellent textbook [Ser80] for a general asymptotic theory of $U$-statistics. Let be an orthonormal basis of and for set . Consider
[TABLE]
Then as with parameter . Furthermore to fix the variance to , define
[TABLE]
We consider the associated Hilbert-Schmidt operator . Using the fact that we can explicitly compute the non-zero eigenvalues of . They are
[TABLE]
Therefore, as , gathering Proposition 2.2 item , relation (40) and Theorem 4.11 we get that
[TABLE]
In the next example we consider the important problem of the asymptotic behavior of the least squares estimators in the autoregressive models in the nearly non-stationary regime, where the target distribution shows up. For more details on this fascinating subject, we refer the reader to [CW87, CW88, Whi58, Rao78, BC13, LLQM11] and references therein when the noise is a martingale difference, and [BC07] when the innovation process exhibits long-range dependence. We also refer to [GT05, Proposition 2] for a study of optimal rates in a general context of quadratic forms.
Example 4.15.
(Least squares estimator in a nearly non-stationary model) Let . Let . We consider the first order autoregressive process , where , for all and is a white noise, i.e. a sequence of i.i.d. random variables. It is classical that the least squares estimator of the unknown parameter , based on discrete observations , is given by
[TABLE]
Define
[TABLE]
Then [CW87, Theorem 1] implies that as :
[TABLE]
where is a standard Brownian motion. In particular when , we observe that (equality in law), and hence we obtain that . Now, apply Example 4.14 to deduce that $d_{2}\big(W_{n},G(1)\big)\approx_{C}\frac{1}{n}$.
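The least squares estimator of Example 4.15 can be explored by simulation. The sketch below rests on several assumptions that are ours: the estimator has the classical form $\hat\alpha_n=\sum_{t} X_{t-1}X_t/\sum_{t} X_{t-1}^2$, the process starts at $X_0=0$, the noise is standard Gaussian, and the nearly non-stationary parameter is taken to be $\alpha_n=1-1/n$ purely for illustration. The function names are hypothetical.

```python
import random


def simulate_ar1(alpha, n, rng):
    """Return [X_0, ..., X_n] with X_0 = 0 and X_t = alpha * X_{t-1} + eps_t."""
    xs = [0.0]
    for _ in range(n):
        # eps_t is standard Gaussian noise (an assumption of this sketch)
        xs.append(alpha * xs[-1] + rng.gauss(0.0, 1.0))
    return xs


def least_squares_ar1(xs):
    """Classical least squares estimate of the autoregressive parameter."""
    num = sum(prev * cur for prev, cur in zip(xs, xs[1:]))
    den = sum(prev * prev for prev in xs[:-1])
    return num / den


rng = random.Random(0)          # fixed seed for reproducibility
n = 2000
alpha_n = 1.0 - 1.0 / n         # hypothetical nearly non-stationary choice
xs = simulate_ar1(alpha_n, n, rng)
alpha_hat = least_squares_ar1(xs)
```

Repeating the simulation over many seeds gives an empirical picture of the fluctuations of the estimator around $\alpha_n$ in this regime.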
Example 4.16.
(Least squares estimator in model) In this example, we consider the second order autoregressive model:
[TABLE]
where is a white noise, and . Further, assume that the roots of the associated characteristic polynomial are and , and lie on the unit disk. Under this condition it is easy to see that and . The least squares estimator of the parameter for is given by
[TABLE]
In [CW88], the asymptotic behavior of has been derived where
[TABLE]
Following [CW88, Corollary 3.3.8], as , one can deduce that
[TABLE]
Note that the sequence belongs to the second Wiener chaos. An interesting feature of the previous limit theorem is that although the sequence does depend on the parameter in the model, the target distribution is independent of . On the other hand, relation (41) together with the assumption yields that
[TABLE]
Therefore,
[TABLE]
By elementary combinatorics, we have for any function that . Using this, and evaluating the sums of sine functions (which are just geometric sums after writing them in terms of complex exponentials), we get
[TABLE]
Note that $\big|\kappa_{2}(W^{\theta}_{n})-4\big|\approx_{C}1/n$ as . Now we scale so that it has variance equal to for every . Set , and let . Using (12), and after some tedious computations, we get that
[TABLE]
Using that as , we see that (note that ), and furthermore,
[TABLE]
Similar computations yield that . Therefore, Theorem 4.11 can be applied to deduce that .
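The evaluation of sine sums mentioned in Example 4.16 rests on a standard geometric-sum identity. As an illustration only (the specific sums appearing in the example may carry additional weights), the basic case reads as follows for a real number $x$ that is not a multiple of $2\pi$.

```latex
% Illustration of the geometric-sum argument; x is a generic real parameter.
\[
  \sum_{k=1}^{n} e^{ikx}
  = e^{ix}\,\frac{e^{inx}-1}{e^{ix}-1}
  = e^{i(n+1)x/2}\,\frac{\sin(nx/2)}{\sin(x/2)},
\]
% Taking imaginary parts on both sides:
\[
  \sum_{k=1}^{n} \sin(kx)
  = \frac{\sin(nx/2)\,\sin\bigl((n+1)x/2\bigr)}{\sin(x/2)} .
\]
```

In particular such sums stay bounded in $n$ for fixed $x$, which is what makes the resulting cumulant asymptotics tractable.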
Example 4.17.
(Quadratic forms [dWV73] and [AAPS17, section 3.2]) In this example, we consider a general quadratic form in independent standard normal random variables
[TABLE]
where is an symmetric matrix, and is a sequence of i.i.d. standard normal random variables. Let be an integer. Now, we make the following assumptions:
- (a)
The second moment assumption: .
- (b)
There exists a sequence of real numbers such that as :
[TABLE]
- (c)
For every , as it holds that: .
Now a direct application of [dWV73, Theorem 2] implies that . Note that for every relying on condition (a). Moreover, one can write , where stands for an orthonormal basis of , and for , as before, we set . Therefore our main Theorem 4.11 entails that
[TABLE]
Depending on the particular choice of the matrix in the original quadratic form , we can provide explicit rates (in terms of suitable powers of ) in the asymptotic relation (43). For example, following [dWV73, remark after Theorem 2] and [AAPS17, Corollary 3.2], assume that is a sequence of distinct orthonormal functions in such that for some . Here denotes the space of all Hölder continuous functions with Hölder exponent . Consider the square integrable kernel defined as
[TABLE]
Finally, for and we set
[TABLE]
Now consider the sequence associated to the symmetric matrix belonging to the second Wiener chaos. Then, it is straightforward to check that conditions (a)-(b)-(c) are satisfied with . On the other hand, it has been shown in [AAPS17, Corollary 3.2] that:
[TABLE]
Putting together the asymptotic estimates (43) and (44), we obtain the optimal rate . Also, the example presented on page in [NP09b] can be treated in this framework, resulting in an improved optimal rate of .
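For a concrete matrix, the cumulants entering the rate of Example 4.17 can be computed without diagonalization. The sketch below assumes the centered quadratic form $Q=\sum_{i,j} a_{ij}\,(N_iN_j-\mathbb{E}[N_iN_j])$ with a symmetric matrix $A=(a_{ij})$, and uses the classical identity $\kappa_r(Q)=2^{r-1}(r-1)!\,\operatorname{Tr}(A^r)$ for $r\ge 2$. The helper names are illustrative and the matrix product is deliberately naive.

```python
from math import factorial


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [
        [sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
        for i in range(n)
    ]


def trace_of_power(a, r):
    """Tr(A^r) for an integer r >= 1."""
    power = a
    for _ in range(r - 1):
        power = mat_mul(power, a)
    return sum(power[i][i] for i in range(len(a)))


def quadratic_form_cumulant(a, r):
    """r-th cumulant (r >= 2) of the centered Gaussian quadratic form with symmetric matrix a."""
    if r < 2:
        raise ValueError("the formula is valid for r >= 2 only")
    return 2 ** (r - 1) * factorial(r - 1) * trace_of_power(a, r)


# Sanity check on Q = N_1 * N_2, whose matrix has 1/2 off the diagonal.
product_matrix = [[0.0, 0.5], [0.5, 0.0]]
k2 = quadratic_form_cumulant(product_matrix, 2)
k3 = quadratic_form_cumulant(product_matrix, 3)
k4 = quadratic_form_cumulant(product_matrix, 4)
```

The trace formulation avoids an eigenvalue computation, which is convenient when only the third and fourth cumulants are needed.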
5 Appendix
The following lemma provides an explicit representation of the new Gamma operators used in this paper in terms of contractions. Recall that these are not the same as e.g. in [NP10], but rather the new ones introduced in (9).
Lemma 5.1.
For , let , for some , be an element of the -th Wiener chaos. Then
[TABLE]
where the constants are recursively defined via , and for ,
[TABLE]
Proof.
It follows by induction on and similar lines of arguments as in [NP10, Proof of Theorem 5.1].
∎
Proof of Proposition 2.1.
Part (a) is clear from the definition. Part (b) for is also trivial. For , we use the fact that , as well as the integration by parts formula (8), to get
[TABLE]
For part (c), consider
[TABLE]
For part (d), we consider the representation of given in equation (5.25) of [NP10]. The representation is exactly the same as for (Lemma 5.1), except for the recursive formula of the constants . For they are given by , and for ,
[TABLE]
Comparing this with our formula (46), we see that only the first factor is different, namely instead of . But now for , the indicator dictates that . Hence . Therefore, the two notions of Gamma operators coincide when . ∎
Acknowledgments
The authors would like to thank Simon Campese for pointing out a mistake in the proof of Theorem 4.5.
References
- [AAPS17] B. Arras, E. Azmoodeh, G. Poly, and Y. Swan. A bound on the 2-Wasserstein distance between linear combinations of independent random variables. 2017, arXiv:1704.01376v2. To appear in Stochastic Processes and their Applications.
- [ACP14] E. Azmoodeh, S. Campese, and G. Poly. Fourth Moment Theorems for Markov diffusion generators. J. Funct. Anal., 266(4):2341–2359, 2014.
- [AEK18] E. Azmoodeh, P. Eichelsbacher, and L. Knichel. On the Rate of Convergence to a Gamma Distribution on Wiener Space. 2018, arXiv:1806.03878v2.
- [AMMP16] E. Azmoodeh, D. Malicet, G. Mijoule, and G. Poly. Generalization of the Nualart-Peccati criterion. Ann. Probab., 44(2):924–954, 2016.
- [AMPS17] B. Arras, G. Mijoule, G. Poly, and Y. Swan. A new approach to the Stein-Tikhomirov method: with applications to the second Wiener chaos and Dickman convergence. 2017, arXiv:1605.06819v2.
- [APP15] E. Azmoodeh, G. Peccati, and G. Poly. Convergence towards linear combinations of chi-squared random variables: a Malliavin-based approach. In In memoriam Marc Yor - Séminaire de Probabilités XLVII, volume 2137 of Lecture Notes in Math., pages 339–367. Springer, Cham, 2015.
- [AS17] B. Arras and Y. Swan. A stroll along the gamma. Stochastic Process. Appl., 127(11):3661–3688, 2017.
- [BBNP12] H. Biermé, A. Bonami, I. Nourdin, and G. Peccati. Optimal Berry-Esseen rates on the Wiener space: the barrier of third and fourth cumulants. ALEA Lat. Am. J. Probab. Math. Stat., 9(2):473–500, 2012.
