The Breuer-Major Theorem in total variation: improved rates under minimal regularity
Ivan Nourdin, David Nualart, Giovanni Peccati

Ivan Nourdin, Université du Luxembourg, Unité de Recherche en Mathématiques, Maison du Nombre, 6 avenue de la Fonte, L-4364 Esch-sur-Alzette, Grand Duché du Luxembourg
David Nualart, Department of Mathematics, University of Kansas, 405 Snow Hall, Lawrence, Kansas, 66045, USA
Giovanni Peccati, Université du Luxembourg, Unité de Recherche en Mathématiques, Maison du Nombre, 6 avenue de la Fonte, L-4364 Esch-sur-Alzette, Grand Duché du Luxembourg
Abstract.
In this paper we prove an estimate for the total variation distance, in the framework of the Breuer-Major theorem, using the Malliavin-Stein method, assuming the underlying function $g$ to be once weakly differentiable with $g$ and $g'$ having finite moments of order four with respect to the standard Gaussian density. This result is proved by a combination of Gebelein’s inequality and some novel estimates involving Malliavin operators.
Keywords: Breuer-Major theorem; Integration by Parts; Rate of Convergence; Malliavin-Stein approach.
I. Nourdin was supported by the FNR grant APOGee at Luxembourg University, D. Nualart by the NSF grant DMS 1811181, and G. Peccati by the FNR grant FoRGES (R-AGR-3376-10) at Luxembourg University.
1. Introduction
1.1. Overview and main findings
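Let $X = \{X_k : k \geq 0\}$ be a real-valued centered stationary Gaussian sequence with unit variance, that we assume to be defined on an appropriate probability space $(\Omega, \mathcal{F}, \mathbb{P})$. For $k \in \mathbb{Z}$, set $\rho(k) = \mathbb{E}[X_0 X_k]$ if $k \geq 0$, and $\rho(k) = \rho(-k)$ if $k < 0$. Denoting by $\gamma$ the standard Gaussian measure on the real line, we say that a function $g \in L^2(\mathbb{R}, \gamma)$ has Hermite rank $d \geq 1$ if
$$g(x) = \sum_{q=d}^{\infty} c_q H_q(x), \tag{1.1}$$
where $c_d \neq 0$, $H_q$ is the $q$th Hermite polynomial (to be formally defined in Section 2.1), and the series converges in $L^2(\mathbb{R}, \gamma)$. The forthcoming Theorem 1.1, known as the Breuer-Major Theorem (see [3], as well as [27]), establishes a sufficient condition for the sequence
$$F_n := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} g(X_i), \quad n \geq 1, \tag{1.2}$$
to verify a Central Limit Theorem (CLT).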
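The notion of Hermite rank can be explored numerically. The following is a minimal sketch, assuming the probabilists' normalisation in which $\mathbb{E}[H_p(N) H_q(N)] = q!\,\delta_{pq}$, so that $c_q = \mathbb{E}[g(N) H_q(N)]/q!$. The function names `hermite_coefficients` and `hermite_rank` are hypothetical helpers introduced here for illustration, and the expectations are approximated by Gauss-Hermite quadrature, which is exact only for polynomial integrands of moderate degree.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite_coefficients(g, max_degree=8, quad_points=40):
    """Approximate c_q = E[g(N) H_q(N)] / q! for q = 0, ..., max_degree."""
    nodes, weights = hermegauss(quad_points)
    weights = weights / math.sqrt(2.0 * math.pi)  # normalise to the N(0,1) law
    values = g(nodes)
    coeffs = []
    for q in range(max_degree + 1):
        basis = np.zeros(q + 1)
        basis[q] = 1.0
        h_q = hermeval(nodes, basis)  # probabilists' Hermite polynomial H_q
        coeffs.append(float(np.sum(weights * values * h_q)) / math.factorial(q))
    return np.array(coeffs)


def hermite_rank(coeffs, tol=1e-8):
    """Smallest q >= 1 whose coefficient is non-zero (up to tol), else None."""
    for q in range(1, len(coeffs)):
        if abs(coeffs[q]) > tol:
            return q
    return None


if __name__ == "__main__":
    # x^2 - 1 is H_2 itself; x^3 equals H_3 + 3 H_1.
    for name, g in [("x^2 - 1", lambda x: x**2 - 1.0), ("x^3", lambda x: x**3)]:
        c = hermite_coefficients(g)
        print(name, "rank:", hermite_rank(c), "coefficients:", np.round(c[:4], 6))
```

For non-polynomial $g$ (for instance a centered absolute value), the quadrature is only approximate, and the tolerance `tol` then becomes a modelling choice rather than a rounding safeguard.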
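Remark on notation. From now on, we write $N(\mu, \sigma^2)$ to indicate a generic Gaussian random variable with mean $\mu$ and variance $\sigma^2$. We also put $N(\mu, 0) = \mu$, and $N$ denotes a standard normal Gaussian variable, that is, $N = N(0,1)$. The symbol $\Rightarrow$ denotes convergence in distribution of random elements. Given two real-valued random variables $F, G$, the total variation distance between the distributions of $F$ and $G$ is defined as
$$d_{TV}(F, G) = \sup_{B \in \mathcal{B}(\mathbb{R})} \big| \mathbb{P}(F \in B) - \mathbb{P}(G \in B) \big|, \tag{1.3}$$
where the supremum runs over the class $\mathcal{B}(\mathbb{R})$ of all Borel subsets of $\mathbb{R}$. Depending on notational convenience, given a numerical sequence $\{a(k) : k \in \mathbb{Z}\}$, we will often write $\sum_{k} a(k)$ to indicate the full sum $\sum_{k \in \mathbb{Z}} a(k)$, whenever it is well-defined. Finally, given a random variable $F$, we use the notation $\|F\|_p = (\mathbb{E}|F|^p)^{1/p}$, for every $p \geq 1$.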
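Theorem 1.1 (Breuer-Major).
Let $g \in L^2(\mathbb{R}, \gamma)$ have Hermite rank $d \geq 1$, and assume moreover that
$$\sum_{k \in \mathbb{Z}} |\rho(k)|^d < \infty. \tag{1.4}$$
Then, as $n \to \infty$,
$$F_n \Rightarrow N(0, \sigma^2), \tag{1.5}$$
where
$$\sigma^2 := \sum_{q=d}^{\infty} q!\, c_q^2 \sum_{k \in \mathbb{Z}} \rho(k)^q \in [0, \infty). \tag{1.6}$$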
Theorem 1.1 is one of the staples of modern Gaussian analysis, with far-reaching applications ranging from stochastic geometry to mathematical statistics and information theory — see e.g. [7, 12, 25, 28] for a general discussion, as well as [1, 4, 5, 10, 12, 13, 14, 16] for a sample of recent extensions and ramifications.
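The statement of the Breuer-Major Theorem can be illustrated by simulation. The sketch below is only an illustration under stated assumptions: it takes $g = H_2$ (that is, $g(x) = x^2 - 1$, of Hermite rank 2) and a stationary AR(1) Gaussian sequence with unit variance, for which $\rho(k) = \phi^{|k|}$; neither choice comes from the paper. In this case the classical limiting variance is $2 \sum_{k \in \mathbb{Z}} \rho(k)^2 = 2(1+\phi^2)/(1-\phi^2)$. The names `simulate_breuer_major` and `limiting_variance` are hypothetical.

```python
import numpy as np


def simulate_breuer_major(n=500, reps=4000, phi=0.5, seed=0):
    """Return `reps` samples of F_n = n^{-1/2} * sum_{i=1}^n H_2(X_i), X an AR(1) sequence."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(reps)          # X_0 ~ N(0,1), one value per replication
    innov_scale = np.sqrt(1.0 - phi**2)    # keeps Var(X_i) = 1 for every i
    total = np.zeros(reps)
    for _ in range(n):
        x = phi * x + innov_scale * rng.standard_normal(reps)
        total += x**2 - 1.0                # H_2(x) = x^2 - 1
    return total / np.sqrt(n)


def limiting_variance(phi):
    """2! * sum over k in Z of rho(k)^2, with rho(k) = phi^{|k|}."""
    return 2.0 * (1.0 + phi**2) / (1.0 - phi**2)


if __name__ == "__main__":
    samples = simulate_breuer_major()
    print("empirical variance:", samples.var())
    print("limiting variance: ", limiting_variance(0.5))
```

The empirical variance should approach the limiting one as `n` and `reps` grow; the simulation says nothing about the rate of convergence in total variation, which is the actual subject of the paper.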
Using the fact that the limiting random variable has a density, it is straightforward to deduce from the second Dini’s theorem that the convergence (1.5) always takes place in the sense of the Kolmogorov distance, that is: with the notation ,
[TABLE]
On the other hand, determining whether (1.5) takes place in the sense of the total variation distance (1.3) is a more delicate matter, for which no exhaustive criterion is currently known. The difficulty of such an issue is demonstrated by considering the following two facts, corresponding to choices of the function $g$ in the Breuer-Major Theorem yielding contrasting behaviours with respect to $d_{TV}$:
- (a) according to the main results of [18], if $g$ in Theorem 1.1 is a polynomial, then necessarily $d_{TV}(F_n, N(0, \sigma^2)) \to 0$, as $n \to \infty$;
- (b) if $g$ takes values in a discrete set, then (trivially) $d_{TV}(F_n, N(0, \sigma^2)) = 1$ for every $n$.
The aim of the present paper is to deduce new explicit bounds on the total variation distance
[TABLE]
and a standard normal random variable $N$, in the case where $g$ has Hermite rank . We will see that our estimates imply minimal regularity conditions on $g$, in order for the limiting relation (or, equivalently, ) to take place. Moreover, under comparable regularity assumptions on $g$, the rates of convergence provided by our bounds are better than or commensurate with the best estimates to date, obtained in [9, 17, 23]. The main tool exploited in our analysis is a non-trivial combination of Gebelein’s inequality (recalled in Section 2.4 below, and already used in [17]), and some novel estimates involving Malliavin operators; see e.g. the forthcoming Lemma 2.2.
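Our main findings are contained in the following statement, in which we use the notation $\mathbb{D}^{k,p}(\mathbb{R}, \gamma)$, $k \geq 1$, $p \geq 1$, to denote the Sobolev space given by the closure of the class of polynomial mappings $g : \mathbb{R} \to \mathbb{R}$ with respect to the norm
$$\|g\|_{k,p} := \Big( \sum_{i=0}^{k} \int_{\mathbb{R}} |g^{(i)}(x)|^p \, \gamma(dx) \Big)^{1/p},$$
where $g^{(i)}$ denotes the $i$th derivative of $g$ as a function of $x$.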
The following is the main result of this paper.
Theorem 1.2.
Assume that has Hermite rank and belongs to . Suppose that (1.4) holds and that defined by (1.6) is strictly positive. Let be the random variable defined in (1.7). Then, there exists a constant independent of such that
[TABLE]
Note that the right-hand side of (1.8) (as well as those of the forthcoming bounds (1.10) and (1.11)) converges to zero, as , by virtue of Lemma 3.2.
1.2. Comparison with existing results
We will now compare Theorem 1.2 with three relevant papers in the recent literature. Such a comparison exploits the log-convexity of norms, see e.g. [26, Lemma 1.11.5]:
[TABLE]
- (1)
In [17], the following two facts are proved: (1a) if and has Hermite rank equal to 1, then there exists an absolute constant such that , and (1b) if and is even, then
[TABLE]
In view of the usual CLT, the estimate at Point (1a) cannot be improved. On the other hand, since an even function has Hermite rank equal to 2, the estimate at (1.10) can be meaningfully compared with our Theorem 1.2. A direct use of (1.9) shows that, if (that is, is absolutely summable), then the right-hand sides of (1.8) and (1.10) are both bounded by a multiple of , while (1.8) is systematically smaller than (1.10) when .
- (2)
Given , we define , that is, is the element of obtained by taking the absolute value of the coefficients appearing in the Hermite expansion of . In [9], the following results are proved: (2a) the bound
[TABLE]
holds whenever and has Hermite rank 2, and (2b) one has the estimate
[TABLE]
if and has Hermite rank 2. The estimate at Point (2a) is the same as the one appearing in our bound (1.8), but is obtained under the strictly stronger assumption that . On the other hand, one can use the results of [13] to show that a multiple of the sequence also constitutes a lower bound for in the case .
- (3)
In [23], the following is proved: (3a) if and has Hermite rank 1, then , (3b) if , and has Hermite rank , then the bound (1.8) holds true, and (3c) if and has Hermite rank 2, then
[TABLE]
As observed at Point (2), the upper bound (1.12) cannot be improved.
We would like to emphasize that, unlike in previous works, the bound (1.8) for functions of Hermite rank is obtained here assuming only that is once weakly differentiable. In particular this bound holds for for any .
1.3. Plan
The paper is organized as follows. Section 2 contains some preliminaries on the Malliavin calculus associated with a Gaussian family of random variables and on the Malliavin-Stein method for estimating the total variation distance. We also include in this section two basic inequalities that play an important role in the proofs: a version of the Brascamp-Lieb inequality and Gebelein’s inequality. Section 3 is devoted to the proof of Theorem 1.2.
2. Preliminaries
In this section, we briefly recall some elements of the Malliavin calculus of variations associated with a Gaussian family of random variables. We refer the reader to [12, 19, 20] for a detailed account of this topic. We will also recall a crucial estimate for the total variation distance proved using the Malliavin-Stein approach, and prove two inequalities which will be used in the proof of Theorem 1.2.
2.1. Malliavin calculus
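Let $\mathfrak{H}$ be a real separable Hilbert space; in order to simplify our discussion, we will assume for the rest of the paper that $\mathfrak{H} = L^2(A, \mathcal{A}, \mu)$, where $(A, \mathcal{A}, \mu)$ is a $\sigma$-finite measure space such that $\mu$ has no atoms. For any integer $q \geq 1$, we use the symbols $\mathfrak{H}^{\otimes q}$ and $\mathfrak{H}^{\odot q}$ to denote the $q$-th tensor product and the $q$-th symmetric tensor product of $\mathfrak{H}$, respectively. We now let $W = \{W(h) : h \in \mathfrak{H}\}$ denote an isonormal Gaussian process over the Hilbert space $\mathfrak{H}$. This means that $W$ is a centered Gaussian family of random variables defined on $(\Omega, \mathcal{F}, \mathbb{P})$, with covariance
$$\mathbb{E}[W(h) W(g)] = \langle h, g \rangle_{\mathfrak{H}}, \quad h, g \in \mathfrak{H}.$$
Without loss of generality, we can assume that $\mathcal{F}$ is generated by $W$.
We denote by $\mathcal{H}_q$ the closed linear subspace of $L^2(\Omega)$ generated by the random variables $\{H_q(W(h)) : h \in \mathfrak{H}, \|h\|_{\mathfrak{H}} = 1\}$, where $H_q$ is the $q$-th Hermite polynomial defined by
$$H_q(x) = (-1)^q e^{x^2/2} \frac{d^q}{dx^q} e^{-x^2/2}, \quad q \geq 1,$$
and $H_0(x) = 1$. The space $\mathcal{H}_q$ is the Wiener chaos of order $q$ associated with $W$. The $q$-th multiple integral of $h^{\otimes q} \in \mathfrak{H}^{\odot q}$ is defined by the identity $I_q(h^{\otimes q}) = H_q(W(h))$ for any $h \in \mathfrak{H}$ with $\|h\|_{\mathfrak{H}} = 1$. The map $I_q$ provides a linear isometry between $\mathfrak{H}^{\odot q}$ (equipped with the norm $\sqrt{q!}\, \|\cdot\|_{\mathfrak{H}^{\otimes q}}$) and $\mathcal{H}_q$ (equipped with the $L^2(\Omega)$ norm). By convention, $\mathcal{H}_0 = \mathbb{R}$ and $I_0(x) = x$.
The space $L^2(\Omega)$ can be decomposed into the infinite orthogonal sum of the spaces $\mathcal{H}_q$. Namely, for any square integrable random variable $F \in L^2(\Omega)$, we have the following expansion,
$$F = \sum_{q=0}^{\infty} I_q(f_q), \tag{2.1}$$
where $f_0 = \mathbb{E}[F]$, and $f_q \in \mathfrak{H}^{\odot q}$ are uniquely determined by $F$. The representation (2.1) is known as the Wiener chaos expansion of $F$.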
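For a smooth and cylindrical random variable $F = f(W(h_1), \dots, W(h_n))$, with $h_i \in \mathfrak{H}$ and $f \in C_b^{\infty}(\mathbb{R}^n)$ (meaning that $f$ and its partial derivatives are bounded), we define its Malliavin derivative as the $\mathfrak{H}$-valued random variable given by
$$DF = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(W(h_1), \dots, W(h_n))\, h_i.$$
By iteration, we can also define the $k$-th derivative $D^k F$, which is an element in the space $L^2(\Omega; \mathfrak{H}^{\otimes k})$. For any real $p \geq 1$ and any integer $k \geq 1$, the Sobolev space $\mathbb{D}^{k,p}$ is defined as the closure of the space of smooth and cylindrical random variables with respect to the norm $\|\cdot\|_{k,p}$ defined by
$$\|F\|_{k,p}^p = \mathbb{E}(|F|^p) + \sum_{i=1}^{k} \mathbb{E}\big( \|D^i F\|_{\mathfrak{H}^{\otimes i}}^p \big).$$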
Notice that if is an element in the first Wiener chaos with , then (using the notation introduced before Theorem 1.2) if and only if .
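We define the divergence operator $\delta$ as the adjoint of the derivative operator $D$. Namely, an element $u \in L^2(\Omega; \mathfrak{H})$ belongs to the domain of $\delta$, denoted by $\mathrm{Dom}\, \delta$, if there is a constant $c_u > 0$ depending on $u$ and satisfying
$$|\mathbb{E} \langle DF, u \rangle_{\mathfrak{H}}| \leq c_u \|F\|_{L^2(\Omega)}$$
for any $F \in \mathbb{D}^{1,2}$. If $u \in \mathrm{Dom}\, \delta$, the random variable $\delta(u)$ is defined by the duality relationship
$$\mathbb{E}(F \delta(u)) = \mathbb{E} \langle DF, u \rangle_{\mathfrak{H}}, \tag{2.2}$$
which is valid for all $F \in \mathbb{D}^{1,2}$. In a similar way, for each integer $k \geq 2$, we define the iterated divergence operator $\delta^k$ through the duality relationship
$$\mathbb{E}(F \delta^k(u)) = \mathbb{E} \langle D^k F, u \rangle_{\mathfrak{H}^{\otimes k}}, \tag{2.3}$$
valid for any $F \in \mathbb{D}^{k,2}$, where $u \in \mathrm{Dom}\, \delta^k \subset L^2(\Omega; \mathfrak{H}^{\otimes k})$.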
Let be the standard Gaussian measure on . The Hermite polynomials form a complete orthonormal system in and any function admits an orthogonal expansion of the form (1.1). If has Hermite rank , for any integer , we define the operator by
[TABLE]
To simplify the notation we will write .
Suppose that is a random variable in the first Wiener chaos of of the form , where has norm one. Then one can check that has the representation
[TABLE]
Moreover, if for some and , then ; in particular, for some constant only depending on , one has that
[TABLE]
We refer to [23] for the proof of these results.
The family of operators is defined for random variables of the form (2.1) via the relation , and is called the Ornstein-Uhlenbeck semigroup associated with . The operator is defined as , and can be shown to be the infinitesimal generator of . The domain of is and the following Meyer inequality holds (see [19, Theorem 1.5.1]): for any , there exists a constant such that, for any ,
[TABLE]
We also define the operator , which is the inverse of , as follows: for every of the form (2.1), we set .
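The action of the Ornstein-Uhlenbeck semigroup on the chaoses can be checked numerically in the simplest setting. The sketch below assumes the one-dimensional case, where a random variable is a function $f(N)$ of a single standard Gaussian, the chaos of order $q$ is spanned by $H_q(N)$, and Mehler's formula reads $P_t f(x) = \mathbb{E}[f(e^{-t} x + \sqrt{1 - e^{-2t}}\, N)]$. In that setting one expects $P_t H_q = e^{-qt} H_q$. The helper names `ou_semigroup` and `hermite` are hypothetical, and the expectation is computed by Gauss-Hermite quadrature.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss, hermeval


def hermite(q, x):
    """Probabilists' Hermite polynomial H_q evaluated at x."""
    coeffs = np.zeros(q + 1)
    coeffs[q] = 1.0
    return hermeval(x, coeffs)


def ou_semigroup(f, t, x, quad_points=40):
    """Mehler's formula on the real line: E[f(e^{-t} x + sqrt(1 - e^{-2t}) N)]."""
    nodes, weights = hermegauss(quad_points)
    weights = weights / math.sqrt(2.0 * math.pi)  # normalise to the N(0,1) law
    a = math.exp(-t)
    b = math.sqrt(1.0 - a * a)
    x = np.atleast_1d(np.asarray(x, dtype=float))
    shifted = a * x[:, None] + b * nodes[None, :]  # one row per evaluation point
    return (f(shifted) * weights[None, :]).sum(axis=1)


if __name__ == "__main__":
    grid = np.linspace(-2.0, 2.0, 5)
    t, q = 0.7, 3
    lhs = ou_semigroup(lambda y: hermite(q, y), t, grid)
    rhs = math.exp(-q * t) * hermite(q, grid)
    print(np.max(np.abs(lhs - rhs)))  # should be at rounding level
```

The same eigenvalue relation is what makes $L^{-1}$ act on the chaos of order $q \geq 1$ as multiplication by $-1/q$.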
Remark 2.1.
Fix an integer , and consider a generic element of the class . Then, in view of the fact that by our initial assumption, it is a standard fact that admits a (parametrized) chaotic expansion of the form
[TABLE]
where the (–almost everywhere uniquely defined) kernels are square-integrable and symmetric in the first variables, and
[TABLE]
Using such a representation one can canonically define as the element of given by
[TABLE]
In what follows, given and , the symbol stands for the symmetrization of , that is
[TABLE]
where the sum runs over the group of all permutations of . Note that, for every ,
[TABLE]
by the triangle inequality. Also, one has trivially that .
We will make repeated use of the following lemma, focussing on the boundedness of .
Lemma 2.2.
Let be such that .
- (1) Suppose that and . Then belongs to the domain of viewed as an -valued operator, and
[TABLE]
- (2) Suppose that and . Then belongs to the domain of viewed as an -valued operator, and
[TABLE]
Proof.
The proof is subdivided into several steps.
- (i) First of all we observe that, by a direct application of the multiplier theorem (see [19, Theorem 1.4.2]), the operator is bounded from to itself. Moreover, one can suitably modify the proof of such a result to show that, for every , is also bounded as an operator from to itself (see Remark 2.1).
- (ii) Let be the operator defined by for . Again by a direct application of the multiplier theorem (see [19, Theorem 1.4.2]), the operator is bounded from to itself. On the other hand, one has (according to [12, Prop. 2.9.3]) as well as the existence of such that, for any ,
[TABLE]
(according to [21, Prop. 5.1.5]; the statement given there contains the factor instead of , but an inspection of its proof actually provides the estimate stated in (2.11)); these two facts plus the Minkowski inequality imply that the operator is bounded from to . As a conclusion, using that , we obtain that is bounded from to .
- (iii) Since and , we have that and . We can therefore write
[TABLE]
On the one hand (see point (ii) above):
[TABLE]
On the other hand (see point (i) above):
[TABLE]
This completes the proof of (2.9).
- (iv) We now suppose that and . We can write
[TABLE]
where the involved symmetrization is defined in Remark 2.1. Let be the operator defined by for . By a direct application of the multiplier theorem (see [19, Theorem 1.4.2]), the operator is bounded from to itself. Thus, using on one hand that and on the other hand that is bounded from to itself (by (2.7)), we obtain that is bounded from to . As a consequence
[TABLE]
On the other hand (see points (i) and (ii) above, as well as (2.8)):
[TABLE]
This completes the proof of (2.10).
∎
2.2. Stein’s method
We refer to [6] for a complete discussion of this topic. Let be a Borel function such that and let . The ordinary differential equation
[TABLE]
is called the Stein’s equation associated with . The function
[TABLE]
is the unique solution to the Stein’s equation satisfying . Moreover, if is bounded by , then satisfies and . We refer to [12] and the references therein for a complete proof of these results.
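The solution of Stein's equation can be made concrete for indicator test functions, which are the ones relevant for the Kolmogorov and total variation distances. The sketch below assumes the usual form of Stein's equation, $f'(x) - x f(x) = h(x) - \mathbb{E}[h(N)]$, and the usual representation $f_h(x) = e^{x^2/2} \int_{-\infty}^{x} (h(y) - \mathbb{E}[h(N)]) e^{-y^2/2}\, dy$. For $h = \mathbf{1}_{(-\infty, z]}$ this integral has a closed form in terms of the standard normal distribution function $\Phi$. The function names are hypothetical, and the derivative is approximated by a central difference, so the residual is only meaningful away from the jump at $x = z$.

```python
import math


def std_normal_cdf(x):
    """Phi(x), computed through erfc for accuracy in the tails."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))


def stein_solution_indicator(x, z):
    """f_h(x) for h = 1_{(-inf, z]}, in closed form."""
    prefactor = math.sqrt(2.0 * math.pi) * math.exp(0.5 * x * x)
    if x <= z:
        # integral of (1 - Phi(z)) against the Gaussian weight up to x
        return prefactor * std_normal_cdf(x) * std_normal_cdf(-z)
    # use that the full integral vanishes, and integrate from x to +infinity
    return prefactor * std_normal_cdf(z) * std_normal_cdf(-x)


def stein_residual(x, z, eps=1e-5):
    """f'(x) - x f(x) - (h(x) - E[h(N)]), with f' from a central difference."""
    derivative = (stein_solution_indicator(x + eps, z)
                  - stein_solution_indicator(x - eps, z)) / (2.0 * eps)
    h = 1.0 if x <= z else 0.0
    return derivative - x * stein_solution_indicator(x, z) - (h - std_normal_cdf(z))


if __name__ == "__main__":
    z = 0.3
    for x in (-2.0, -0.5, 1.0, 2.5):
        print(x, stein_solution_indicator(x, z), stein_residual(x, z))
```

For these indicator functions the solution is known to stay between $0$ and $\sqrt{2\pi}/4$, which is consistent with the uniform bounds on $f_h$ recalled in the text.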
We recall the total variation distance between the laws of two random variables defined in (1.3). Substituting by in Stein’s equation (2.12) and using the estimate for lead to the fundamental estimate
[TABLE]
In the framework of an isonormal Gaussian process , we can use Stein’s equation to estimate the total variation distance between a random variable and . A basic result is given in the next proposition (see [21, 12]), which is an easy consequence of (2.13) and the duality relationship (2.2).
Proposition 2.1.
Assume that , and . Then,
[TABLE]
2.3. Brascamp-Lieb inequality
In this subsection we recall some inequalities proved in [23] (see Lemmas 6.6 and 6.7 therein), which can be deduced from the Brascamp-Lieb inequality (see [2]) or just using Hölder’s and Young’s convolution inequalities.
Lemma 2.3.
Fix an integer . Let be a non-negative function on the integers and set . Then, for any vector whose components are or , we have
[TABLE]
Lemma 2.4.
Fix an integer and assume . We have
[TABLE]
where and is a fixed vector whose components are [math], or and it has at least two nonzero components.
2.4. Gebelein’s inequality
In the proof of Theorem 1.2, we will need the following Gaussian inequality.
Lemma 2.5.
Let be an isonormal Gaussian process over some real separable Hilbert space , and let , be two Hilbert subspaces of . Define and , respectively, to be the restriction of to and . Now consider two measurable mappings , , and assume that each is centered and , , with . Then,
[TABLE]
where
[TABLE]
Lemma 2.5 follows from the forthcoming Proposition 2.2, and can be shown by adopting almost verbatim the strategy of proof of [29, Theorem 3.4] – details are left to the reader.
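For orientation, the classical $L^2$ form of Gebelein's inequality can be checked numerically. This is the textbook statement for a pair of standard Gaussian random variables $(X, Y)$ with correlation $\theta$, namely $|\mathbb{E}[f(X) g(Y)]| \leq |\theta|\, \|f(X)\|_2 \|g(Y)\|_2$ for centered $f(X)$ and $g(Y)$; it is not the more general version stated in Lemma 2.5. The sketch below uses tensorised Gauss-Hermite quadrature with the representation $Y = \theta X + \sqrt{1 - \theta^2}\, Z$, where $Z$ is independent of $X$. The test functions and helper names are assumptions chosen for illustration.

```python
import math
import numpy as np
from numpy.polynomial.hermite_e import hermegauss


def gaussian_pair_expectation(f, g, theta, quad_points=40):
    """E[f(X) g(Y)] for standard Gaussians X, Y with correlation theta."""
    nodes, weights = hermegauss(quad_points)
    weights = weights / math.sqrt(2.0 * math.pi)  # normalise to the N(0,1) law
    x = nodes[:, None]
    z = nodes[None, :]
    y = theta * x + math.sqrt(1.0 - theta**2) * z  # Y = theta X + sqrt(1 - theta^2) Z
    return float(np.sum(weights[:, None] * weights[None, :] * f(x) * g(y)))


def l2_norm(f, quad_points=40):
    """sqrt(E[f(N)^2]) for a standard Gaussian N."""
    nodes, weights = hermegauss(quad_points)
    weights = weights / math.sqrt(2.0 * math.pi)
    return math.sqrt(float(np.sum(weights * f(nodes) ** 2)))


if __name__ == "__main__":
    f = lambda x: x**3 - 2.0 * x               # H_3 + H_1, centered
    g = lambda y: y**3 - 3.0 * y + y**2 - 1.0  # H_3 + H_2, centered
    for theta in (-0.7, 0.3, 0.9):
        lhs = abs(gaussian_pair_expectation(f, g, theta))
        rhs = abs(theta) * l2_norm(f) * l2_norm(g)
        print(theta, lhs, rhs, lhs <= rhs)
```

With these polynomial choices only the common $H_3$ components interact, so the exact value of the expectation is $3!\,\theta^3$, which makes the gain over a plain Cauchy-Schwarz bound visible for small $|\theta|$.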
Proposition 2.2.
Let , two independent isonormal Gaussian processes over some real separable Hilbert space . Consider two measurable mappings , , and assume that each is centered and , , with . Then, for any ,
[TABLE]
for some constant depending uniquely on .
Proof of Proposition 2.2.
Without loss of generality, we can assume that . Using Mehler’s formula (see e.g. [19, formula (1.67), p. 55]) together with the properties of conditional expectations, we infer that
[TABLE]
where is the Ornstein-Uhlenbeck semigroup introduced above. The conclusion now follows from a standard application of the Cauchy-Schwarz inequality, as well as from the following estimate: for every and every ,
[TABLE]
for some constant uniquely depending on , which follows from a direct application of [19, Lemma 1.4.1], as well as from the fact that is centered by assumption. ∎
3. Proof of Theorem 1.2
We are now ready for the proof of Theorem 1.2. In what follows, we use the letter to indicate a constant that may depend on the norm of , but which is always independent of . Its exact value is immaterial and may vary from one line to another. The main difficulty of the proof is to show the forthcoming inequality (3.4).
Step 1: Preparing the proof. We shall use the Malliavin-Stein approach. In order to be in a position to do so, consider a centered stationary Gaussian family of random variables with unit variance and covariance for . We put for . Suppose that is a Hilbert space and let be a family of such that for each . In this situation, if is an isonormal Gaussian process, then the sequence has the same law as and we can assume, without any loss of generality, that .
Consider the sequence introduced in (1.2), where has Hermite rank and let . Under condition (1.4), it is well known that as , where has been defined in (1.6). Set . Notice that implies that is bounded below for large enough. Taking into account (2.5), we have the representation , where
[TABLE]
and is the shifted function introduced in (2.4). As a consequence of Proposition 2.1, we have the estimate
[TABLE]
for an absolute constant . We now observe that there exists a sequence such that in the topology. For such a sequence of functions it is easily checked that, as
[TABLE]
Moreover, denoting by the quantity obtained from by replacing with one has that, as and for each fixed ,
[TABLE]
This follows from the fact that for each , the sequences and converge in , as tends to infinity, to and , respectively, due to the convergences (3.3).
The rest of the proof will then consist in showing that, for every function ,
[TABLE]
for constants that only depend on the norm of and on the norm of (recall that, by (2.6), ).
Step 2: Bounding **. We have
[TABLE]
We can write
[TABLE]
We will make use of the following estimate
[TABLE]
which can be justified as follows. First, by the isometry formula one has . Then, one can write and then apply Poincaré formula to the first term in the right-hand side to obtain (3.5). We will now proceed with the estimation of each member of the right-hand side of (3.5).
Step 3: Estimating **. We first note that for any , as is immediately seen by expanding into chaos. We then have
[TABLE]
Notice that we have three covariance factors. We need two additional factors that will be produced by the representation as a divergence of and . That is, we can write
[TABLE]
and
[TABLE]
We claim that the expectations and are bounded. Indeed, using the expansion of in Hermite polynomials, we have
[TABLE]
which is finite because the last quantity is precisely .
The term can be handled in the same way. As a consequence,
[TABLE]
Step 4: Estimating **. We have
[TABLE]
where is the operator defined by for . Since is bounded in for all (see [19, Theorem 1.4.2]), we obtain
[TABLE]
We have
[TABLE]
Therefore,
[TABLE]
We split the analysis on the different values of .
Case . We have
[TABLE]
We can write
[TABLE]
As a consequence,
[TABLE]
This quantity is uniformly bounded by a constant times , due to Lemma 2.2 (1) applied to and and taking into account that and
[TABLE]
Therefore,
[TABLE]
where the last inequality follows from Lemma 2.3.
Case . We have
[TABLE]
We know that is centered and belongs to . Moreover,
[TABLE]
belongs to . Indeed, using Hölder inequality, we can write
[TABLE]
Therefore, by Gebelein’s inequality (see Lemma 2.5), we deduce
[TABLE]
Therefore,
[TABLE]
For , we have
[TABLE]
The terms and are similar. For , we have
[TABLE]
where we have applied Lemma 2.4 in the last inequality.
Case . We have
[TABLE]
We know that is centered and
[TABLE]
Moreover, the random variable belongs to . Indeed, its -norm can be estimated as follows
[TABLE]
By Lemma 2.2 (2) applied to and , we have
[TABLE]
Then Meyer inequalities (see (2.7)) imply that
[TABLE]
We can write
[TABLE]
Then, a further application of Lemma 2.2 (2) to and , yields
[TABLE]
Thus, from (3.7), (3.8) and (3.9) we deduce
[TABLE]
Therefore, by Gebelein’s inequality (see Lemma 2.5), and the bounds (3.6) and (3.10), we obtain
[TABLE]
As a consequence,
[TABLE]
For , we have
[TABLE]
The terms and are similar. For , we have
[TABLE]
where we have applied Lemma 2.4 in the last inequality.
Step 5: end of the proof. From Step 1, it suffices to show that
[TABLE]
By Step 2, we have . In Step 3, it is shown that . Finally, it is shown in Step 4 that . The proof of Theorem 1.2 is thus complete.
∎
Remark 3.1.
We can show that both bounds in (1.8) are not comparable. In the particular case as , with , we obtain:
[TABLE]
Appendix
The following elementary result is used in the Introduction.
Lemma 3.2.
Let , and let and be such that
[TABLE]
Then,
[TABLE]
Proof.
Write , and let . A straightforward application of Hölder inequality yields that, for some finite constant independent of ,
[TABLE]
where in the second inequality we have used (3.11). Now fix and observe that, since , there exists an integer such that
[TABLE]
Setting in (3.12) and letting , we eventually conclude that for every , and the conclusion follows. ∎
References
- [1] H. Biermé, A. Bonami, I. Nourdin and G. Peccati (2012). Optimal Berry-Esseen rates on the Wiener space: the barrier of third and fourth cumulants. ALEA 9, no. 2, pp. 473-500.
- [2] H. J. Brascamp and E. H. Lieb (1976). Best constants in Young’s inequality, its converse, and its generalization to more than three functions. Adv. Math. 20, pp. 151-173.
- [3] P. Breuer and P. Major (1983). Central limit theorems for non-linear functionals of Gaussian fields. J. Mult. Anal. 13, pp. 425-441.
- [4] S. Campese, I. Nourdin and D. Nualart (2019+). Continuous Breuer-Major Theorems: tightness and non-stationarity. Ann. Probab., to appear.
- [5] D. Chambers and E. Slud (1989). Central limit theorems for nonlinear functionals of stationary Gaussian processes. Probab. Theory Related Fields 80, no. 3, pp. 323-346.
- [6] L. H. Y. Chen, L. Goldstein and Q.-M. Shao (2011). Normal approximation by Stein’s method. Springer-Verlag, Berlin.
- [7] P. Doukhan (2018). Stochastic Models for Time Series. Springer, 308 pages.
- [8] H. Gebelein (1941). Das statistische Problem der Korrelation als Variations- und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung. Z. Angew. Math. Mech. 21, pp. 364-379.
