Total variation distance between stochastic polynomials and invariance principles

Vlad Bally; Lucia Caramellino

arXiv:1705.05194·math.PR·December 3, 2019


TL;DR

This paper develops bounds for the total variation distance between stochastic polynomials, leading to an invariance principle that generalizes existing results and applies to U-statistics and quadratic forms.

Contribution

It introduces a general method to estimate total variation distances between stochastic polynomials, extending previous invariance principles and CLT applications.

Findings

01

Established bounds for total variation distance between stochastic polynomials

02

Derived an invariance principle generalizing known results

03

Applied results to U-statistics and quadratic forms

Abstract

The goal of this paper is to estimate the total variation distance between two general stochastic polynomials. As a consequence one obtains an invariance principle for such polynomials. This generalizes known results concerning the total variation distance between two multiple stochastic integrals on one hand, and invariance principles in Kolmogorov distance for multi-linear stochastic polynomials on the other hand. As an application we first discuss the asymptotic behavior of U-statistics associated to polynomial kernels. Moreover we also give an example of CLT associated to quadratic forms.

Equations

$$Q_{N,k_{\ast}}(c,X)$$

$$\Phi_{m}(c,X)$$

$$\mathbb{P}(X_{n}\in dx)=p\,\psi(x-x_{n})\,dx+(1-p)\,\nu_{n}(dx)$$

$$X_{n}\sim\chi_{n}V_{n}+(1-\chi_{n})U_{n}.$$

$$|c|_{m}$$

$$|c|_{m,N}$$

$$\delta_{\ast}(c)$$

$$d_{k}(F,G)=\sup\{|\mathbb{E}(f(F))-\mathbb{E}(f(G))|:\|f\|_{k,\infty}\leq 1\}.$$

$$d_{\mathrm{Kol}}(F,G)=\sup_{x\in\mathbb{R}}|\mathbb{P}(F\leq x)-\mathbb{P}(G\leq x)|.$$

$$d_{\mathrm{TV}}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))\leq\mathrm{Const}(c,d)\Big(d_{k}^{\frac{\theta}{2kk_{\ast}\overline{m}+1}}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))+e^{-|c|_{m}^{2}/C\delta_{\ast}^{2}(c)}+e^{-|d|_{m^{\prime}}^{2}/C\delta_{\ast}^{2}(d)}+|c|_{m+1,N}^{2\theta/(k_{\ast}\overline{m})}+|d|_{m^{\prime}+1,N}^{2\theta/(k_{\ast}\overline{m})}\Big),$$

$$d_{\mathrm{Kol}}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))\leq\mathrm{Const}(c,d)\Big(d_{k\vee 3}^{\theta/(2N(k\vee 3)+1)}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))+\delta_{\ast}^{\theta/(2(k\vee 3)N+1)}(c)+\delta_{\ast}^{\theta/(2(k\vee 3)N+1)}(d)\Big),$$

$$d_{\mathrm{TV}}(Q_{N,k_{\ast}}(c,X),S_{N}(c,G))\leq\mathrm{Const}(c)\Big(\delta_{\ast}^{\theta/(6k_{\ast}m+1)}(c)+e^{-|c|_{m}^{2}/C\delta_{\ast}^{2}(c)}+|c|_{m+1,N}^{2\theta/(k_{\ast}m)}\Big),$$

$$d_{\mathrm{Kol}}(S_{N}(c,Z),S_{N}(c,G))\leq C\times\delta_{\ast}^{1/(3N+1)}(c).$$

$$\theta(\mu)=\int_{\mathbb{R}^{N}}\psi(x_{1},\ldots,x_{N})\,d\mu(x_{1})\ldots d\mu(x_{N})$$

$$U_{n}^{\psi}=\frac{(n-N)!}{n!}\sum_{i_{1},\ldots,i_{N}=1}^{n}\delta(i_{1},\ldots,i_{N})\,\psi(X_{i_{1}},\ldots,X_{i_{N}}),$$

$$U_{n}^{\psi}=\frac{(n-N)!}{n!}\sum_{i_{1},\ldots,i_{N}=1}^{n}\sum_{k_{1},\ldots,k_{N}=1}^{k_{\ast}}\delta(i_{1},\ldots,i_{N})\,b(k_{1},\ldots,k_{N})\prod_{j=1}^{N}X_{i_{j}}^{k_{j}}.$$

$$U_{n}^{\psi}-\theta(\mu)=\sum_{m=1}^{N}\frac{(n-m)!}{n!}\sum_{i_{1},\ldots,i_{m}=1}^{n}\sum_{k_{1},\ldots,k_{m}=1}^{k_{\ast}}c((n_{1},k_{1}),\ldots,(n_{m},k_{m}))\prod_{j=1}^{m}\big(X_{i_{j}}^{k_{j}}-\mathbb{E}(X_{i_{j}}^{k_{j}})\big)$$

$$n^{m_{0}}(U_{n}^{\psi}-\theta(\mu))=C\times\Phi_{m_{0}}(a,X)+R_{n}$$

$$S_{n,p}=\varepsilon_{p}(n)\sum_{1\leq i<j\leq n}\frac{1}{|j-i|^{p}}X_{i}X_{j}$$

$$\|Z_{n,i}\|_{p}\leq M_{p}(Z).$$

$$x^{\alpha}=\prod_{i=1}^{m}x_{\alpha_{i}},$$

$$|c|_{\mathcal{U}}=\Big(\sum_{\alpha}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2},\quad|c|_{\mathcal{U},m}=\Big(\sum_{|\alpha|=m}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2},$$

$$\mathcal{N}_{\mathcal{U},q}(c,M)=\Big(\sum_{m=0}^{\infty}m^{q}M^{2m}|c|_{\mathcal{U},m}^{2}\Big)^{1/2}=\Big(\sum_{\alpha}|\alpha|^{q}M^{2|\alpha|}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2}$$

$$\delta_{\mathcal{U},\ast}(c)=\Big(\sup_{n}\sum_{\alpha}1_{\{n\in\alpha^{\prime}\}}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2}.$$

$$\Phi_{m}(c,Z)$$

$$S_{N}(c,Z)$$

$$\|M_{n}\|_{\mathcal{U},p}\leq b_{p}\Big(\mathbb{E}\Big(\Big(\sum_{k=1}^{n-1}|M_{k+1}-M_{k}|_{\mathcal{U}}^{2}\Big)^{p/2}\Big)\Big)^{1/p}\leq b_{p}\Big(\sum_{k=1}^{n-1}\|M_{k+1}-M_{k}\|_{\mathcal{U},p}^{2}\Big)^{1/2}$$

$$\Big\|\sum_{j=1}^{m_{\ast}}d_{j}\times Z_{n,j}\Big\|_{\mathcal{U},p}\leq\sqrt{m_{\ast}}\,M_{p}(Z)\Big(\sum_{j=1}^{m_{\ast}}|d_{j}|_{\mathcal{U}}^{2}\Big)^{1/2}.$$

$$\|\Phi_{N}(c,Z)\|_{\mathcal{U},p}\leq\overline{M}_{p}^{N}\,|c|_{\mathcal{U},N}$$

$$\|S_{N}(c,Z)-c(\emptyset)\|_{\mathcal{U},p}\leq\mathcal{N}_{\mathcal{U},0}(c,\overline{M}_{p}).$$

$$c^{n,j}(\alpha)=c(\alpha,(n,j))\,1_{\{\alpha_{N-1}^{\prime}<n\}}$$

$$\Phi_{N}(c,Z)=\sum_{n=N}^{\infty}\sum_{j=1}^{m_{\ast}}Z_{n,j}\sum_{|\alpha|=N-1}c(\alpha,(n,j))\,1_{\{\alpha_{N-1}^{\prime}<n\}}Z^{\alpha}=\sum_{n=N}^{\infty}\sum_{j=1}^{m_{\ast}}Z_{n,j}\,\Phi_{N-1}(c^{n,j},Z).$$

$$\|\Phi_{N}(c,Z)\|_{\mathcal{U},p}^{2}\leq b_{p}^{2}\sum_{n=N}^{\infty}\Big\|\sum_{j=1}^{m_{\ast}}Z_{n,j}\Phi_{N-1}(c^{n,j},Z)\Big\|_{\mathcal{U},p}^{2}\leq b_{p}^{2}M_{p}^{2}(Z)\,m_{\ast}\sum_{n=N}^{\infty}\sum_{j=1}^{m_{\ast}}\|\Phi_{N-1}(c^{n,j},Z)\|_{\mathcal{U},p}^{2}$$

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Full text

Total variation distance between stochastic polynomials and invariance principles

Vlad Bally
Université Paris-Est, LAMA (UMR CNRS, UPEMLV, UPEC), INRIA, F-77454 Marne-la-Vallée, France. Email: [email protected]

Lucia Caramellino
Dipartimento di Matematica and INDAM-GNAMPA, Università di Roma “Tor Vergata”, Via della Ricerca Scientifica 1, I-00133 Roma, Italy. Email: [email protected]

Abstract

The goal of this paper is to estimate the total variation distance between two general stochastic polynomials. As a consequence one obtains an invariance principle for such polynomials. This generalizes known results concerning the total variation distance between two multiple stochastic integrals on one hand, and invariance principles in Kolmogorov distance for multi-linear stochastic polynomials on the other hand. As an application we first discuss the asymptotic behavior of U-statistics associated to polynomial kernels. Moreover we also give an example of CLT associated to quadratic forms.

AMS 2010 Mathematics Subject Classification: 60F17, 60H07.

Keywords: Stochastic polynomials; Invariance principles; Quadratic Central Limit Theorem; U-statistics; Abstract Malliavin calculus.

1 Introduction
2 Notation, basic objects and preliminary results
3 Main results
3.1 Doeblin’s condition and splitting
3.2 Main results
3.3 Gaussian and Gamma approximation
4 Examples
4.1 U-statistics associated to polynomial kernels
4.2 A quadratic central limit theorem
5 Stochastic calculus of variation under the Doeblin’s condition
5.1 Abstract Malliavin calculus and Sobolev spaces
5.2 Regularization results
5.3 Estimates of the Sobolev norms
5.4 Estimates of the covariance matrix
5.5 Proof of Theorem 3.3
A An iterated Hoeffding’s inequality
B Norms

1 Introduction

This paper deals with stochastic polynomials of the following type: given a sequence $X=(X_{n})_{n\in{\mathbb{N}}}$ of independent random variables with finite moments of any order, and given $N\in{\mathbb{N}}$ and $k_{\ast}\in{\mathbb{N}},$ one considers

[TABLE]

The coefficients $c$ are symmetric and null on the diagonals (that is, if $n_{i}=n_{j}$ for $i\neq j)$ and only a finite number of them are non null, so the above sum is finite. Let us mention that here, for notation simplicity, we take $X_{n}\in{\mathbb{R}},$ but in the paper we work with $X_{n}=(X_{n,1},\ldots,X_{n,d_{\ast}})\in{\mathbb{R}}^{d_{\ast}}.$ Note also that we use the centred random variables $X_{n}^{k}-{\mathbb{E}}(X_{n}^{k})$ , $k=1,\ldots,k_{\ast}$ , but, if the polynomial is given in terms of $X_{n}^{k},$ we may always re-write it in terms of centred random variables.

Our goal is to estimate the total variation distance between the laws of two such polynomials and, moreover, to establish an invariance principle, that is, to estimate the error made by replacing $Z_{n}=(Z_{n,1},\ldots,Z_{n,k_{\ast}}):=(X_{n}-{\mathbb{E}}(X_{n}),\ldots,X_{n}^{k_{\ast}}-{\mathbb{E}}(X_{n}^{k_{\ast}}))$ with a centred Gaussian random variable $G_{n}=(G_{n,1},\ldots,G_{n,k_{\ast}})$ which has the same covariance matrix as $Z_{n}$ . Note that this Gaussian vector does not keep the structure given by the powers in the original vector $Z_{n}.$

Since the total variation distance concerns measurable functions, a “regularization effect” has to be at work. This leads us to make the following assumption (known as Doeblin’s condition): there exist $\varepsilon>0,r>0$ and $x_{n}\in{\mathbb{R}},n\in{\mathbb{N}},$ such that $\sup_{n}|x_{n}|<\infty$ and ${\mathbb{P}}(X_{n}\in dx)\geq\varepsilon dx$ on the ball $B_{r}(x_{n}).$ It is easy to see that this is equivalent to saying that

$$\mathbb{P}(X_{n}\in dx)=p\,\psi(x-x_{n})\,dx+(1-p)\,\nu_{n}(dx) \tag{1.3}$$

where $p\in(0,1],\psi$ is a $C^{\infty}$ probability density with support included in $B_{r}(0)$ and $\nu_{n}$ is a probability measure. Given the decomposition (1.3), one constructs three independent random variables $\chi_{n},V_{n},U_{n}$ with $V_{n}\sim\psi(x-x_{n})dx,U_{n}\sim\nu_{n}(dx)$ and $\chi_{n}$ Bernoulli with parameter $p$ and then employs the identity of laws

$$X_{n}\sim\chi_{n}V_{n}+(1-\chi_{n})U_{n}. \tag{1.4}$$

The density $\psi$ may be chosen (see (3.6)) in such a way that $\ln\psi$ has nice properties, and this allows one to build an abstract Malliavin type calculus based on $V_{n},n\in{\mathbb{N}}$ and to use this calculus in order to obtain the “regularization effect” which is needed. We have already used this argument in [1, 5, 3, 4]. Independently, Nourdin and Poly in [30] have used similar arguments in a similar problem: they take $\psi=(2r)^{-1}1_{B_{r}(0)}$ so $V_{n}$ has a uniform distribution, and they use a chaos type decomposition obtained in [6]. Note also that hypothesis (1.3) is in fact necessary: in his seminal paper [36] Prohorov proved that (1.3) is (essentially) necessary and sufficient in order to obtain convergence in total variation distance in the Central Limit Theorem (see [1] for details).

The decomposition (1.4) was introduced by Nummelin (see [22] and [20]) in order to produce atoms which allow one to use renewal theory for studying the convergence to equilibrium of Markov chains; this is why it is also known as “the Nummelin splitting method”. It has also been used by Poly in his PhD thesis [35] and, to our knowledge, this is the first place where the idea of using the regularization given by the noise $V_{n}$ appears.
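The splitting (1.4) can be made concrete with a small sampler. The sketch below is only an illustration under stated assumptions: the function name `sample_split`, the parameter values and the Gaussian remainder law are hypothetical, and the smooth part $V_{n}$ is taken uniform on $B_{r}(x_{n})$ (the choice attributed to Nourdin and Poly above) rather than the $C^{\infty}$ density $\psi$ used in the paper.

```python
import random

def sample_split(p, x_n, r, sample_nu, rng):
    """Draw (chi, X) with X = chi*V + (1 - chi)*U as in the splitting (1.4).

    Assumption: V is uniform on the ball B_r(x_n); the paper uses a smooth psi.
    sample_nu is a user-supplied sampler for the remainder law nu_n.
    """
    chi = 1 if rng.random() < p else 0      # Bernoulli(p) switch
    if chi == 1:
        return chi, x_n + rng.uniform(-r, r)  # regular part V_n
    return chi, sample_nu(rng)                # remainder part U_n

rng = random.Random(0)
# Hypothetical parameters: p = 0.3, x_n = 1.0, r = 0.5, nu_n = N(0, 4).
draws = [sample_split(0.3, 1.0, 0.5, lambda g: g.gauss(0.0, 2.0), rng)
         for _ in range(1000)]
smooth = [x for chi, x in draws if chi == 1]  # draws that carry the noise V_n
```

Only the draws with $\chi_{n}=1$ carry the regular noise $V_{n}$; these are the coordinates on which a Malliavin type calculus can act.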

In order to present our results we have to introduce some more notation. Given the coefficient $c$ in (1.2) we denote

[TABLE]

The quantity $\left|c\right|$ is essentially equivalent (up to a multiplicative factor) to the variance of $Q_{N,k_{\ast}}(c,X)$ and $\delta_{\ast}(c)$ is essentially equivalent to the “low influence factor” as it is defined and used in [21] (and we follow several ideas from this paper). These are the quantities that appear in the error estimates.

For $f\in C_{b}^{\infty}({\mathbb{R}}^{d})$ we denote by $\left\|f\right\|_{k,\infty}$ the supremum norm of $f$ and of its derivatives of order less than or equal to $k,$ and, for two random variables $F$ and $G,$ we define the distances

$$d_{k}(F,G)=\sup\{|\mathbb{E}(f(F))-\mathbb{E}(f(G))|:\|f\|_{k,\infty}\leq 1\}.$$

For $k=0,$ $d_{0}=d_{\mbox{\rm{\scriptsize{TV}}}}$ is the total variation distance, and, if $F\sim p_{F}(x)dx$ and $G\sim p_{G}(x)dx$ then $d_{\mbox{\rm{\scriptsize{TV}}}}(F,G)=\left\|p_{F}-p_{G}\right\|_{1}.$ $d_{1}$ is the Fortet-Mourier distance which metrizes the convergence in law. We also consider the Kolmogorov distance

$$d_{\mathrm{Kol}}(F,G)=\sup_{x\in\mathbb{R}}|\mathbb{P}(F\leq x)-\mathbb{P}(G\leq x)|. \tag{1.6}$$
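For laws with finitely many atoms both distances can be computed exactly, which makes the normalization explicit. The sketch below is an illustration only: the function names and the two example laws are hypothetical, and the discrete analogue of $\left\|p_{F}-p_{G}\right\|_{1}$ (a sum over atoms, with no factor $1/2$) is used for the total variation distance, in line with the convention $d_{0}=d_{\mathrm{TV}}$ stated above.

```python
def tv_distance(p, q):
    """Total variation distance normalized as sup{|E f(F) - E f(G)| : |f| <= 1},
    i.e. the l1 distance between the two probability mass functions."""
    support = set(p) | set(q)
    return sum(abs(p.get(x, 0.0) - q.get(x, 0.0)) for x in support)

def kolmogorov_distance(p, q):
    """sup_x |P(F <= x) - P(G <= x)|; for discrete laws it is attained at an atom."""
    cdf_p = cdf_q = 0.0
    worst = 0.0
    for x in sorted(set(p) | set(q)):
        cdf_p += p.get(x, 0.0)
        cdf_q += q.get(x, 0.0)
        worst = max(worst, abs(cdf_p - cdf_q))
    return worst

# Two hypothetical laws on {0, 1}.
p = {0: 0.5, 1: 0.5}
q = {0: 0.25, 1: 0.75}
tv = tv_distance(p, q)
kol = kolmogorov_distance(p, q)
```

With this normalization the Kolmogorov distance never exceeds the total variation distance, since indicators of half-lines are among the test functions bounded by one.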

We are now able to give our first result, Theorem 3.3, concerning the distance between two polynomials $Q_{N,k_{\ast}}(c,X)$ and $Q_{N,k_{\ast}}(d,Y)$ . Assume that $X$ and $Y$ satisfy the Doeblin’s condition (see (1.3)) and moreover assume that the non degeneracy condition $|c|_{m}>0,|d|_{m^{\prime}}>0$ holds for some $m,m^{\prime}\leq N$ and denote $\overline{m}=m\vee m^{\prime}$ . Then we prove (see (3.17)) that for every $k\in{\mathbb{N}}$ and $\theta\in(\frac{1}{(1+k)^{2}},1),$

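$$d_{\mathrm{TV}}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))\leq\mathrm{Const}(c,d)\Big(d_{k}^{\frac{\theta}{2kk_{\ast}\overline{m}+1}}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))+e^{-|c|_{m}^{2}/C\delta_{\ast}^{2}(c)}+e^{-|d|_{m^{\prime}}^{2}/C\delta_{\ast}^{2}(d)}+|c|_{m+1,N}^{2\theta/(k_{\ast}\overline{m})}+|d|_{m^{\prime}+1,N}^{2\theta/(k_{\ast}\overline{m})}\Big), \tag{1.7}$$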

where $\mathrm{Const}(c,d)$ denotes a quantity which depends on the coefficients $c$ and $d$ in an explicit way (see (3.17)). If $m=N$ then $\left|c\right|_{m+1,N}=0$ so this term no longer appears. Theorem 3.3 is the main result of our paper.

In Theorem 3.7 we give a variant of this result in Kolmogorov distance: we prove (see (3.21)) that

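$$d_{\mathrm{Kol}}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))\leq\mathrm{Const}(c,d)\Big(d_{k\vee 3}^{\theta/(2N(k\vee 3)+1)}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))+\delta_{\ast}^{\theta/(2(k\vee 3)N+1)}(c)+\delta_{\ast}^{\theta/(2(k\vee 3)N+1)}(d)\Big). \tag{1.8}$$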

$\mathrm{Const}(c,d)$ is again a positive quantity explicitly depending on $c$ and $d$ (see (3.21)). The estimate (1.8) holds for general laws for $X_{n}$ and $Y_{n}$ (without assuming Doeblin’s condition). However, now we have to assume that the covariance matrices of both $(X_{n}^{1},\ldots,X_{n}^{k_{\ast}})$ and $(Y_{n}^{1},\ldots,Y_{n}^{k_{\ast}})$ are invertible. The proof of (1.8) is a direct consequence of the results of Mossel et al. in [21].

In the case $k_{\ast}=1$ (multilinear stochastic polynomials) and if $X_{n}$ and $Y_{n}$ are Gaussian random variables, $\Phi_{N}(c,X)$ and $\Phi_{N}(d,Y)$ are multiple stochastic integrals. In this special case we may drop $e^{-1/C\delta_{\ast}^{2}(c)}$ and $e^{-1/C\delta_{\ast}^{2}(d)}$ in (1.7) (see Theorem 3.4). Estimates in total variation for such integrals have already been studied: the inequality (1.7) for multiple stochastic integrals (for $k_{\ast}=1$ ) was first announced in [10] with the power $\frac{1}{N}$ instead of $\frac{\theta}{2N+1}$ above, but the proof was only sketched. It was rigorously proved in [29] with power $\frac{1}{2N+1}$ and recently improved in [8], where the power $\frac{1}{N}\times(\ln N)^{d}$ is obtained. So (1.7) is a generalization of the above results on multiple stochastic integrals to general polynomials depending on a general noise. But, as the above discussion suggests, (1.7) is not the best possible estimate (the approach in [8] does not seem to work in our general framework, so for the moment we are not able to improve it).

A second result, given in Theorem 3.9, concerns the invariance principle. We consider a sequence of independent centred Gaussian random variables $G_{n}=(G_{n,1},\ldots,G_{n,k_{\ast}})\in{\mathbb{R}}^{k_{\ast}}$ and we assume that the covariance matrix of $G_{n}$ coincides with the covariance matrix of $Z_{n}=(Z_{n,1},\ldots,Z_{n,k_{\ast}})$ where $Z_{n,k}:=X_{n}^{k}-{\mathbb{E}}(X_{n}^{k}).$ We denote by $S_{N}(c,G)$ the polynomial $Q_{N,k_{\ast}}(c,X)$ in which $Z_{n}=(Z_{n,1},\ldots,Z_{n,k_{\ast}})$ is replaced by $G_{n}=(G_{n,1},\ldots,G_{n,k_{\ast}}).$ We stress that $S_{N}(c,G)$ is multi-linear with respect to $G_{n,i},i=1,\ldots,k_{\ast}$ in contrast to $Q_{N,k_{\ast}}(c,X)$ which is a general polynomial with respect to $X_{n}.$ In Theorem 3.9 we prove that, if $|c|_{m}>0,$ for some $m\leq N,$ then for every $\theta\in(\frac{1}{16},1)$ ,

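$$d_{\mathrm{TV}}(Q_{N,k_{\ast}}(c,X),S_{N}(c,G))\leq\mathrm{Const}(c)\Big(\delta_{\ast}^{\theta/(6k_{\ast}m+1)}(c)+e^{-|c|_{m}^{2}/C\delta_{\ast}^{2}(c)}+|c|_{m+1,N}^{2\theta/(k_{\ast}m)}\Big), \tag{1.9}$$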

$\mathrm{Const}(c)$ being explicitly dependent on $c$ (see (3.22)). A result going in the same direction was previously obtained by Nourdin and Poly in [30]. They take $k_{\ast}=1$ , so $Q_{N}(c,X)$ is a multi-linear polynomial, and they assume Doeblin’s condition for $X_{i}.$ Then they prove that, if $c_{n},n\in{\mathbb{N}}$ is a sequence of coefficients such that $\lim_{n}\delta_{\ast}(c_{n})=0,$ then $\lim_{n}d_{{\mbox{\rm{\scriptsize{TV}}}}}(Q_{N,k_{\ast}}(c_{n},X),S_{N}(c_{n},G))=0.$ The progress achieved in our paper is that we deal with general polynomials on the one hand, and we obtain an estimate of the error on the other hand.

A similar estimate with $d_{\mbox{\rm{\scriptsize{Kol}}}}$ instead of $d_{\mbox{\rm{\scriptsize{TV}}}}$ represents the main result in [21] (see Theorem 3.19 therein). Let us be more precise. In [21] one considers “orthonormal ensembles”, which are nothing else than multi-dimensional random variables $Z_{n}=(Z_{n,1},\ldots,Z_{n,k_{\ast}})$ such that ${\mathbb{E}}(Z_{n,i})=0$ and ${\mathbb{E}}(Z_{n,i}Z_{n,j})=\delta_{i,j}$ (the Kronecker delta). One denotes by $S_{N}(c,Z)$ the polynomial $Q_{N,k_{\ast}}(c,X)$ defined in (1.1) in which $X_{n}^{k}-{\mathbb{E}}(X_{n}^{k})$ is replaced by $Z_{n,k}.$ In [21] (Theorem 3.19 therein) it is proved that if $\left|c\right|=1,$ then

$$d_{\mathrm{Kol}}(S_{N}(c,Z),S_{N}(c,G))\leq C\times\delta_{\ast}^{1/(3N+1)}(c).$$

Note that this theorem does not require Doeblin’s condition. Note also that the orthonormality condition for $Z_{n,1},\ldots,Z_{n,k_{\ast}}$ is not more restrictive than saying that the covariance matrix $\mathrm{Cov}(Z_{n})$ of $Z_{n}$ is invertible and its smallest eigenvalue $\lambda_{n}$ satisfies $\lambda_{n}\geq\underline{\lambda}>0$ for every $n$ (see the proof of Theorem 2.3). So, by taking $Z_{n,k}:=X_{n}^{k}-{\mathbb{E}}(X_{n}^{k}),$ one obtains also (1.9) (under the above hypothesis on $\mathrm{Cov}(Z_{n})).$ The difference with respect to their result is just that we deal with convergence in total variation distance instead of Kolmogorov distance.

An important consequence of (1.9) is that it allows one to replace the study of the asymptotic behavior of a sequence $Q_{N,k_{\ast}}(c_{n},X),n\in{\mathbb{N}}$ of general stochastic polynomials by the study of $S_{N}(c_{n},G),n\in{\mathbb{N}},$ which are elements of a finite number of Wiener chaoses. Of course, the central example is the classical CLT, where $N=1$ and $k_{\ast}=1$ , so $S_{1}(c_{n},G)=\sum_{i=1}^{\infty}c_{n}(i)G_{i}$ is just a Gaussian random variable. But, starting with the proof of the “fourth moment theorem” by Nualart and Peccati [33] and Nourdin and Peccati [25], a lot of work has been done in order to characterize the convergence to normality of elements of a finite number of Wiener chaoses (see [23, 28, 32, 34] or [24] for an overview). Moreover, convergence to a $\chi_{2}$ distribution has been treated in [25]. We give the consequences of these results in Theorem 3.11 and Theorem 3.13.

Finally we give two more applications. The first one concerns U-statistics. The problem is the following: given a probability law $\mu,$ an integer $N\in{\mathbb{N}},$ and a symmetric kernel $\psi,$ one wants to estimate

$$\theta(\mu)=\int_{\mathbb{R}^{N}}\psi(x_{1},\ldots,x_{N})\,d\mu(x_{1})\ldots d\mu(x_{N})$$

on the basis of a sample $X_{1},\ldots,X_{n}$ of independent random variables of law $\mu.$ An unbiased estimator of $\theta(\mu)$ is constructed by

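$$U_{n}^{\psi}=\frac{(n-N)!}{n!}\sum_{i_{1},\ldots,i_{N}=1}^{n}\delta(i_{1},\ldots,i_{N})\,\psi(X_{i_{1}},\ldots,X_{i_{N}}),$$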

in which $\delta(i_{1},\ldots,i_{N})=0$ if any two indexes are equal, otherwise $\delta(i_{1},\ldots,i_{N})=1$ . In the case when $\psi$ is a polynomial this enters in our framework. This covers an important class of kernels: for example $\psi(x_{1},x_{2})=(x_{1}-x_{2})^{2}$ gives the estimator of the variance. But not all: for example $\psi(x_{1},\ldots,x_{N})=\max_{i=1,N}\left|x_{i}\right|$ is out of reach. Say that $\psi(x_{1},\ldots,x_{N})=\sum_{k_{1},\ldots,k_{N}=1}^{k_{\ast}}\delta(i_{1},\ldots,i_{N})b(k_{1},\ldots,k_{N})\prod_{j=1}^{N}x_{j}^{k_{j}}.$ Then

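$$U_{n}^{\psi}=\frac{(n-N)!}{n!}\sum_{i_{1},\ldots,i_{N}=1}^{n}\sum_{k_{1},\ldots,k_{N}=1}^{k_{\ast}}\delta(i_{1},\ldots,i_{N})\,b(k_{1},\ldots,k_{N})\prod_{j=1}^{N}X_{i_{j}}^{k_{j}}.$$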

This fits in (1.1) except that $X_{j}^{k_{j}}$ is not centred. It turns out that the procedure which consists in centering $X_{j}^{k_{j}}$ coincides, in this framework, with Hoeffding’s decomposition, which is a central tool in the theory of U-statistics. After doing this one obtains

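$$U_{n}^{\psi}-\theta(\mu)=\sum_{m=1}^{N}\frac{(n-m)!}{n!}\sum_{i_{1},\ldots,i_{m}=1}^{n}\sum_{k_{1},\ldots,k_{m}=1}^{k_{\ast}}c((n_{1},k_{1}),\ldots,(n_{m},k_{m}))\prod_{j=1}^{m}\big(X_{i_{j}}^{k_{j}}-\mathbb{E}(X_{i_{j}}^{k_{j}})\big)$$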

for some appropriate coefficients $c((n_{1},k_{1}),\ldots,(n_{m},k_{m}))$ , and we are back in our framework. In the theory of U-statistics one says that the kernel $\psi$ is degenerate at order $m_{0}$ if $\Phi_{m}=0$ for $m\leq m_{0}-1$ and $\Phi_{m_{0}}\neq 0.$ Then one writes

$$n^{m_{0}}(U_{n}^{\psi}-\theta(\mu))=C\times\Phi_{m_{0}}(a,X)+R_{n}$$

with $R_{n}\rightarrow 0.$ It follows that the asymptotic behavior of $n^{m_{0}}(U_{n}^{\psi}-\theta(\mu))$ is controlled by $\Phi_{m_{0}}(a,X).$ Using this decomposition, in Theorem 4.3 we characterize the limit of $n^{m_{0}}(U_{n}^{\psi}-\theta(\mu))$ as a linear combination of multiple stochastic integrals. The limit is considered both in Kolmogorov distance under general conditions and in total variation distance under Doeblin’s condition for $\mu$ . Let us mention that a number of results are already known concerning the convergence in Kolmogorov distance for U-statistics: they represent generalizations of the Berry–Esseen theorem (we refer to [19] and [18]). But the result in total variation distance, which generalizes Prohorov’s theorem for the CLT, seems to be new.
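The definition of $U_{n}^{\psi}$ can be checked on a tiny sample. The following sketch is illustrative only: the function name `u_statistic` and the data are hypothetical. It enumerates all ordered $N$-tuples of distinct indices (the role played by $\delta(i_{1},\ldots,i_{N})$) and applies the normalization $(n-N)!/n!$. For the kernel $\psi(x_{1},x_{2})=(x_{1}-x_{2})^{2}$ one has $\theta(\mu)={\mathbb{E}}((X_{1}-X_{2})^{2})=2\,\mathrm{Var}(X_{1})$, so the statistic equals twice the usual unbiased sample variance.

```python
from itertools import permutations
from math import factorial
import statistics

def u_statistic(xs, kernel, N):
    """U-statistic of order N: average of the kernel over all ordered
    N-tuples of pairwise distinct indices, normalized by (n - N)!/n!."""
    n = len(xs)
    total = sum(kernel(*(xs[i] for i in idx))
                for idx in permutations(range(n), N))
    return factorial(n - N) / factorial(n) * total

xs = [1.0, 2.0, 4.0, 7.0]                       # made-up sample
u = u_statistic(xs, lambda a, b: (a - b) ** 2, 2)
twice_sample_variance = 2 * statistics.variance(xs)
```

The brute-force enumeration costs $n!/(n-N)!$ kernel evaluations, so it is only meant for checking small cases.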

Another closely related subject is that of quadratic forms. Here also the asymptotic behavior in Kolmogorov distance is well understood (see de Jong [11, 12], Rotar’ et al. [13, 37] and Götze et al. [14]) but we have not found results concerning the convergence in total variation. We do not treat this subject in full generality but restrict ourselves to the following interesting example: for $p\in[0,\frac{1}{2}]$ we define

$$S_{n,p}=\varepsilon_{p}(n)\sum_{1\leq i<j\leq n}\frac{1}{|j-i|^{p}}X_{i}X_{j}$$

where $X_{i},i\in{\mathbb{N}}$ are independent identically distributed random variables with ${\mathbb{E}}(X_{i})=0$ and ${\mathbb{E}}(X_{i}^{2})=1,$ and where $\varepsilon_{p}(n)=n^{-(1-p)}$ for $p<\frac{1}{2}$ and $\varepsilon_{1/2}(n)=1/\sqrt{2n\ln n}.$ For $p<\frac{1}{2}$ we prove that $S_{n,p}\rightarrow\int_{0}^{1}\int_{0}^{t}(t-s)^{-p}dW_{s}dW_{t}$ and for $p=\frac{1}{2}$ one has $S_{n,p}\rightarrow\Delta$ with $\Delta$ a standard normal random variable. Thus, there is a change of regime at $p=\frac{1}{2}.$ As before, the convergence takes place in Kolmogorov distance for a general $X$ and in total variation distance under Doeblin’s condition.

The paper is organized as follows. In Section 2, we fix our setting and we give some preliminary results. Section 3 is devoted to our main results: we first precisely define Doeblin’s condition and the Nummelin splitting (Section 3.1); then we introduce our main result, Theorem 3.3, and its several consequences (Section 3.2); finally we analyze the Gaussian and Gamma approximation (Section 3.3). The main examples are developed in Section 4: in Section 4.1 we study the asymptotic behavior of U-statistics based on polynomial kernels and in Section 4.2 we study the convergence in the above quadratic CLT result. Finally, Section 5 contains the proof of our main Theorem 3.3, which is given in the last Section 5.5: in Section 5.1 we introduce the abstract Malliavin calculus, in Section 5.2 we state the regularization lemma we use in this paper, Section 5.3 is devoted to proper estimates of the Sobolev norms and Section 5.4 refers to the non-degeneracy result for the Malliavin covariance matrix. The paper concludes with two appendices: Appendix A studies an iterated Hoeffding’s inequality for martingales and Appendix B gives useful estimates for the Sobolev norms which are used in the Malliavin integration by parts formula.

Acknowledgments. We thank Cristina Butucea and Dan Timotin for useful discussions.

2 Notation, basic objects and preliminary results

In this section we introduce multi-linear stochastic polynomials based on a sequence of abstract independent random variables $Z_{n}=(Z_{n,1},\ldots,Z_{n,m_{\ast}})\in{\mathbb{R}}^{m_{\ast}},$ $n\in{\mathbb{N}}.$ In the next section, when dealing with general polynomials as in (1.1), we will take $Z_{n,k}=X_{n}^{k}-{\mathbb{E}}(X_{n}^{k}).$

$\square$ The basic noise. We assume that ${\mathbb{E}}(Z_{n,i})=0$ and that $Z_{n}$ has finite moments of any order: for every $p\geq 1$ there exists some $M_{p}(Z)\geq 1$ such that for every $n\in{\mathbb{N}}$ and $i\in[m_{\ast}]=\{1,\ldots,m_{\ast}\}$

$$\|Z_{n,i}\|_{p}\leq M_{p}(Z). \tag{2.1}$$

$\square$ Multi-indexes. We will use “double” multi-indexes $\alpha=(\alpha_{1},\ldots,\alpha_{m})$ with $\alpha_{i}=(\alpha_{i}^{\prime},\alpha_{i}^{\prime\prime})=(n_{i},j_{i})$ with $n_{i}\in{\mathbb{N}}$ and $j_{i}\in[m_{\ast}].$ We always assume that $n_{1}<\ldots<n_{m}.$ So we work with “ordered” multi-indexes. We also denote $\alpha^{\prime}=(\alpha_{1}^{\prime},\ldots,\alpha_{m}^{\prime})=(n_{1},\ldots,n_{m})$ , $\alpha^{\prime\prime}=(\alpha_{1}^{\prime\prime},\ldots,\alpha_{m}^{\prime\prime})=(j_{1},\ldots,j_{m})$ and $\left|\alpha\right|=m.$ The set of such multi-indexes is denoted by $\Gamma_{m}$ and we set $\Gamma=\cup_{m}\Gamma_{m}$ . We stress that we consider also the void multi-index $\alpha=\emptyset$ and in this case we put $\left|\alpha\right|=0.$ Moreover, for a sequence $x_{n}=(x_{n,1},\ldots,x_{n,m_{\ast}})\in{\mathbb{R}}^{m_{\ast}},n\in{\mathbb{N}}$ we denote

$$x^{\alpha}=\prod_{i=1}^{m}x_{\alpha_{i}},$$

with $x^{\alpha}=1$ if $\alpha=\emptyset$ .

$\square$ Coefficients. We consider a Hilbert space $\mathcal{U}$ with norm $\left|\cdot\right|_{\mathcal{U}}$ and, for a $\mathcal{U}$-valued random variable $X$ , we denote $\left\|X\right\|_{\mathcal{U},p}=({\mathbb{E}}(\left|X\right|_{\mathcal{U}}^{p}))^{1/p}.$ At first we have just $\mathcal{U}={\mathbb{R}}$ , but in Section 5, when considering stochastic derivatives, we have to use a general space $\mathcal{U}$ . We denote $\mathcal{C}(\mathcal{U})=\{c=(c(\alpha))_{\alpha\in\Gamma}:c(\alpha)\in\mathcal{U}\}$ . These are the coefficients we will use. We define

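$$|c|_{\mathcal{U}}=\Big(\sum_{\alpha}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2},\quad|c|_{\mathcal{U},m}=\Big(\sum_{|\alpha|=m}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2},\quad\mathcal{N}_{\mathcal{U},q}(c,M)=\Big(\sum_{m=0}^{\infty}m^{q}M^{2m}|c|_{\mathcal{U},m}^{2}\Big)^{1/2}=\Big(\sum_{\alpha}|\alpha|^{q}M^{2|\alpha|}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2}$$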

and

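$$\delta_{\mathcal{U},\ast}(c)=\Big(\sup_{n}\sum_{\alpha}1_{\{n\in\alpha^{\prime}\}}|c(\alpha)|_{\mathcal{U}}^{2}\Big)^{1/2}.$$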

The notation $n\in\alpha^{\prime}$ means that $\alpha_{j}^{\prime}=n$ for some $j\in[m].$ When ${\mathcal{U}}={\mathbb{R}}$ , we shall omit the subscript ${\mathcal{U}}$ , so we simply write $|c|$ , $|c|_{m}$ , $\mathcal{N}_{q}(c,M)$ and $\delta_{\ast}(c)$ . For several authors (see e.g. [21] or [27]), $\delta_{{\mathcal{U}},\ast}^{2}(c)$ is called the “influence” factor.
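To fix ideas, the quantities $|c|$, $|c|_{m}$ and $\delta_{\ast}(c)$ can be evaluated for a finite family of real coefficients. The sketch below is only an illustration for $\mathcal{U}={\mathbb{R}}$: the dictionary layout, the function names and the coefficient values are hypothetical. A multi-index is stored as a tuple of pairs $(n_{i},j_{i})$ with increasing $n_{i}$, and the influence of a site $n$ is the sum of $c(\alpha)^{2}$ over the multi-indexes with $n\in\alpha^{\prime}$.

```python
from math import sqrt

# Hypothetical coefficients c(alpha); keys are ordered double multi-indexes
# alpha = ((n_1, j_1), ..., (n_m, j_m)) with n_1 < ... < n_m, () is the void index.
c = {
    (): 1.0,
    ((1, 1),): 3.0,
    ((1, 1), (2, 1)): 4.0,
    ((2, 2), (3, 1)): 12.0,
}

def norm(c):
    """|c|: square root of the sum of c(alpha)^2 over all multi-indexes."""
    return sqrt(sum(v * v for v in c.values()))

def norm_m(c, m):
    """|c|_m: same sum restricted to multi-indexes of length m."""
    return sqrt(sum(v * v for a, v in c.items() if len(a) == m))

def influence(c):
    """delta_*(c): square root of the largest influence over the sites n."""
    sites = {n for a in c for (n, _) in a}
    return sqrt(max(sum(v * v for a, v in c.items()
                        if any(k == n for (k, _) in a))
                    for n in sites))
```

In this example site $2$ has the largest influence, $4^{2}+12^{2}=160$, so $\delta_{\ast}(c)=|c|_{2}=\sqrt{160}$: the second-order part is dominated by interactions through a single variable, which is the situation the low-influence assumption excludes.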

$\square$ Multi-linear polynomials. Given $c\in\mathcal{C}(\mathcal{U})$ we define

[TABLE]

In the sequel we use several times Burkholder’s inequality for Hilbert space valued martingales: if $M_{n}\in\mathcal{U},n\in{\mathbb{N}}$ is a martingale then for every $p\geq 2$ there exists $b_{p}\geq 1$ such that

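$$\|M_{n}\|_{\mathcal{U},p}\leq b_{p}\Big(\mathbb{E}\Big(\Big(\sum_{k=1}^{n-1}|M_{k+1}-M_{k}|_{\mathcal{U}}^{2}\Big)^{p/2}\Big)\Big)^{1/p}\leq b_{p}\Big(\sum_{k=1}^{n-1}\|M_{k+1}-M_{k}\|_{\mathcal{U},p}^{2}\Big)^{1/2} \tag{2.6}$$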

the second inequality being obtained by using the triangle inequality with respect to $\left\|\cdot\right\|_{\mathcal{U},p/2}.$

Moreover, as an immediate consequence of (2.1), for every $n\in{\mathbb{N}}$ and every $d_{j}\in\mathcal{U},j\in[m_{\ast}]$ we have

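$$\Big\|\sum_{j=1}^{m_{\ast}}d_{j}\times Z_{n,j}\Big\|_{\mathcal{U},p}\leq\sqrt{m_{\ast}}\,M_{p}(Z)\Big(\sum_{j=1}^{m_{\ast}}|d_{j}|_{\mathcal{U}}^{2}\Big)^{1/2}. \tag{2.7}$$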

Using these two inequalities we obtain

Lemma 2.1

Suppose that (2.1) holds and denote $\overline{M}_{p}=b_{p}M_{p}(Z)\sqrt{m_{\ast}}.$ Then

$$\|\Phi_{N}(c,Z)\|_{\mathcal{U},p}\leq\overline{M}_{p}^{N}\,|c|_{\mathcal{U},N} \tag{2.8}$$

and

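$$\|S_{N}(c,Z)-c(\emptyset)\|_{\mathcal{U},p}\leq\mathcal{N}_{\mathcal{U},0}(c,\overline{M}_{p}). \tag{2.9}$$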

Proof. We proceed by induction on $N.$ For $N=0$ we have $\Phi_{N}(c,Z)=c(\emptyset)$ so (2.8) is obvious. For $\alpha\in\Gamma$ with $\left|\alpha\right|=N-1$ we denote

$$c^{n,j}(\alpha)=c(\alpha,(n,j))\,1_{\{\alpha_{N-1}^{\prime}<n\}}$$

and we write

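$$\Phi_{N}(c,Z)=\sum_{n=N}^{\infty}\sum_{j=1}^{m_{\ast}}Z_{n,j}\sum_{|\alpha|=N-1}c(\alpha,(n,j))\,1_{\{\alpha_{N-1}^{\prime}<n\}}Z^{\alpha}=\sum_{n=N}^{\infty}\sum_{j=1}^{m_{\ast}}Z_{n,j}\,\Phi_{N-1}(c^{n,j},Z).$$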

Note that, if $n\geq N$ , $Z_{n,j}$ and $\Phi_{N-1}(c^{n,j},Z)$ are independent. So, using first (2.6) and then (2.7), we get

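$$\|\Phi_{N}(c,Z)\|_{\mathcal{U},p}^{2}\leq b_{p}^{2}\sum_{n=N}^{\infty}\Big\|\sum_{j=1}^{m_{\ast}}Z_{n,j}\Phi_{N-1}(c^{n,j},Z)\Big\|_{\mathcal{U},p}^{2}\leq b_{p}^{2}M_{p}^{2}(Z)\,m_{\ast}\sum_{n=N}^{\infty}\sum_{j=1}^{m_{\ast}}\|\Phi_{N-1}(c^{n,j},Z)\|_{\mathcal{U},p}^{2}$$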

and by the induction hypothesis,

[TABLE]

So (2.8) is proved.

We now prove (2.9) again by induction. The case $N=1$ follows from (2.8). For $N\geq 2$ , we have

[TABLE]

If $n\geq m$ , $Z_{n,j}$ and $\Phi_{m-1}(c^{n,j},Z)$ are independent, so $Z_{n,j}$ and $S_{N\wedge n-1}(c^{n,j},Z)$ are independent as well. Therefore we can apply (2.6) and (2.7) and we obtain

[TABLE]

and by the induction hypothesis,

[TABLE]

$\square$
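The moment estimate for $\Phi_{N}(c,Z)$ in Lemma 2.1 can be sanity-checked in the simplest case $p=2$. The sketch below is an illustration under explicit assumptions that are not taken from the paper: it takes $m_{\ast}=1$, independent Rademacher signs as noise (so $M_{2}(Z)=1$), hypothetical coefficients, and it reads $\Phi_{N}(c,Z)$ as the sum of $c(\alpha)Z^{\alpha}$ over $|\alpha|=N$, as suggested by the recursion used in the proof. Averaging over all sign vectors gives the second moment exactly, and by orthogonality of the monomials $Z^{\alpha}$ it equals $|c|_{N}^{2}$, which is consistent with the bound since $b_{2}\geq 1$.

```python
from itertools import combinations, product
from math import prod, sqrt

def phi(c, z, N):
    """Phi_N(c, z): sum of c(alpha) * z^alpha over multi-indexes of length N.
    Assumption: m_* = 1, so a multi-index is just a tuple n_1 < ... < n_N."""
    return sum(v * prod(z[n] for n in a) for a, v in c.items() if len(a) == N)

J, N = 4, 2
# Hypothetical coefficients on the pairs n_1 < n_2 taken from {0, 1, 2, 3}.
c = {a: 1.0 / (1 + a[0] + a[1]) for a in combinations(range(J), N)}

# Exact second moment: average over the 2^J equally likely sign vectors.
second_moment = sum(phi(c, z, N) ** 2
                    for z in product((-1, 1), repeat=J)) / 2 ** J
c_norm = sqrt(sum(v * v for v in c.values()))   # |c|_N
```

For $p>2$ the two sides no longer coincide and the constant $\overline{M}_{p}^{N}$ becomes essential; the exhaustive average is only feasible for small $J$.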

We now give the basic invariance principle. We take $\mathcal{U}={\mathbb{R}},$ and for $f\in C_{b}^{3}({\mathbb{R}}),$ we denote by $\left\|f\right\|_{3,\infty}$ the supremum norm of $f$ and its derivatives up to order three.

Theorem 2.2

Let $Z=(Z_{n})_{n\in{\mathbb{N}}},Z_{n}\in{\mathbb{R}}^{m_{\ast}}$ be a sequence of centred independent random variables which verify (2.1) and let $G=(G_{n})_{n\in{\mathbb{N}}},G_{n}\in{\mathbb{R}}^{m_{\ast}}$ be a sequence of independent centred Gaussian random variables such that ${\mathbb{E}}(G_{n,i}G_{n,j})={\mathbb{E}}(Z_{n,i}Z_{n,j}).$ Then, for every $f\in C_{b}^{3}({\mathbb{R}})$

[TABLE]

with

[TABLE]

in which $\overline{M}_{3}=b_{3}\sqrt{m_{\ast}}\,M_{3}(Z)\vee M_{3}(G).$

Proof. The proof is based on Lindeberg’s method (we follow the argument from [21]). We fix $J\geq N,$ we denote $\Gamma_{N}(J)=\cup_{m=0}^{N}\{\alpha\in\Gamma:\left|\alpha\right|=m,\alpha_{m}^{\prime}\leq J\}$ and we define $S_{N,J}(c,Z)=\sum_{\alpha\in\Gamma_{N}(J)}c(\alpha)Z^{\alpha}.$ For $j=1,\ldots,J+1$ we define the intermediate sequences $Z^{j}=(Z_{1},\ldots,Z_{j-1},G_{j},\ldots,G_{J})$ , with $Z^{1}=(G_{1},\ldots,G_{J})$ and $Z^{J+1}=(Z_{1},\ldots,Z_{J})$ , and we write

[TABLE]

We denote $\Gamma_{N}(j,J)=\{\alpha\in\Gamma_{N}(J):j\notin\alpha^{\prime}\}$ and, for $\beta\in\Gamma_{N}(j,J)$ with $\left|\beta\right|=m$ we define

[TABLE]

This means that, if $\beta$ does not contain $j,$ we insert $(j,i)$ in the convenient position. We put

[TABLE]

and then

[TABLE]

Moreover, with $f_{j}:{\mathbb{R}}^{m_{\ast}}\rightarrow{\mathbb{R}}$ defined by $f_{j}(x):=f(A_{j}+\sum_{i=1}^{m_{\ast}}x_{i}B_{j,i})$ we get

[TABLE]

We now use a Taylor expansion of order three around $0$ for both $f_{j}(Z_{j})$ and $f_{j}(G_{j})$ . Since $Z_{j}$ and $G_{j}$ are independent of $A_{j}$ and $B_{j,\cdot}$ and the first and second moments of $Z_{j,i}$ and $G_{j,i}$ coincide, the first and second order terms in the Taylor expansion cancel and we obtain

[TABLE]

We have

[TABLE]

The same is true for $|\partial_{i_{1}i_{2}i_{3}}^{3}f_{j}(\lambda G_{j})|$ , so (recall that $Z_{j}$ and $G_{j}$ are independent of $B_{j,\cdot})$

[TABLE]

Using (2.9),

[TABLE]

and this gives

[TABLE]

We sum over $j$ and we get

[TABLE]

$\square$
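The replacement scheme in the proof above can be illustrated numerically. The following is a minimal sketch, assuming a scalar setting ( $m_{\ast}=1$ ), a polynomial of degree two, Rademacher variables for $Z$ and $f=\tanh$ ; the helper names `poly`, `hybrid` and `lindeberg_increments` are hypothetical and not part of the paper. It builds the intermediate sequences $Z^{j}$ and checks that the one-variable increments telescope to $f(S(Z))-f(S(G))$ .

```python
import math
import random

def poly(c, z):
    """Multilinear polynomial of degree 2: sum over i < j of c[i][j] * z[i] * z[j]."""
    n = len(z)
    return sum(c[i][j] * z[i] * z[j] for i in range(n) for j in range(i + 1, n))

def hybrid(z, g, j):
    """Sequence (z_1, ..., z_{j-1}, g_j, ..., g_J) for j = 1, ..., J + 1 (1-based j)."""
    return z[: j - 1] + g[j - 1:]

def lindeberg_increments(f, c, z, g):
    """Increments f(S(Z^{j+1})) - f(S(Z^j)), j = 1, ..., J; each swaps one variable."""
    J = len(z)
    return [f(poly(c, hybrid(z, g, j + 1))) - f(poly(c, hybrid(z, g, j)))
            for j in range(1, J + 1)]

rng = random.Random(0)
J = 6
c = [[rng.uniform(-1, 1) / J for _ in range(J)] for _ in range(J)]
z = [rng.choice([-1.0, 1.0]) for _ in range(J)]  # centred, variance 1 (assumed law)
g = [rng.gauss(0.0, 1.0) for _ in range(J)]      # Gaussian sample with the same variance
increments = lindeberg_increments(math.tanh, c, z, g)
total = math.tanh(poly(c, z)) - math.tanh(poly(c, g))
```

Each increment changes a single coordinate, which is what allows the third order Taylor bound to be applied term by term before summing over $j$ .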

We recall now the main result from [21] concerning the invariance principle in Kolmogorov distance (defined in (1.6)).

Theorem 2.3

Let $Z=(Z_{n})_{n\in{\mathbb{N}}},Z_{n}\in{\mathbb{R}}^{m_{\ast}}$ be a sequence of centred independent random variables which verify (2.1) and let $\mathrm{Cov}(Z_{n})$ denote the covariance matrix of $Z_{n}.$ We assume that there exists $0<\underline{\lambda}\leq 1$ such that for every $n\in{\mathbb{N}}$

[TABLE]

Let $G=(G_{n})_{n\in{\mathbb{N}}},G_{n}\in{\mathbb{R}}^{m_{\ast}}$ be a sequence of independent centred Gaussian random variables such that $\mathrm{Cov}(Z_{n})=\mathrm{Cov}(G_{n}).$ Then

[TABLE]

with

[TABLE]

Proof. We denote $A_{n}=\mathrm{Cov}^{1/2}(Z_{n})$ and we define $\overline{Z}_{n}=A_{n}^{-1}\times Z_{n},$ so that $\overline{Z}_{n,1},\ldots,\overline{Z}_{n,m_{\ast}}$ are orthonormal. In the formalism in [21], $\overline{Z}_{n}$ is called an “orthonormal ensemble”. Then we define

[TABLE]

and we notice that, with this definition,

[TABLE]

Moreover one easily checks that

[TABLE]

Let us check that $\overline{Z}$ is hypercontractive in the sense of [21]. We notice that $M_{p}(\overline{Z})\leq\underline{\lambda}^{-m_{\ast}}M_{p}(Z)$ and we take $\eta^{-1}=b_{p}(b_{p}\underline{\lambda}^{-m_{\ast}}M_{p}(Z))^{N}.$ Then, for any coefficients $c\in\mathcal{C}({\mathbb{R}})$ we have (with $p=3)$

[TABLE]

and this means, in the formalism from [21] that $\overline{Z}$ is $(2,3,\eta)-$ hypercontractive. Now we are able to use Theorem 3.19 in [21] (which is written in terms of $\tau=\delta^{2}_{\ast}(c)$ ), and this yields (2.15). $\square$

3 Main results

3.1 Doeblin’s condition and splitting

We fix $d_{\ast}\in{\mathbb{N}}$ and $k_{\ast}\in{\mathbb{N}}$ , we denote $m_{\ast}=d_{\ast}\times k_{\ast},$ and we work with a sequence of independent random variables $X=(X_{n})_{n\in{\mathbb{N}}},$ $X_{n}=(X_{n,1},\ldots,X_{n,d_{\ast}})\in{\mathbb{R}}^{d_{\ast}}.$ We deal with general polynomials in the variables $X_{n,j}$ , that is, with linear combinations of monomials $\prod_{i=1}^{m}X_{n_{i},j_{i}}^{k_{i}},k_{i}\leq k_{\ast}.$ Because of the powers $k_{i}$ , this is no longer a multi-linear polynomial. In order to come back to multi-linear polynomials we define $Z_{n}(X)\in{\mathbb{R}}^{m_{\ast}}$ by

[TABLE]

With this definition, if $\alpha=((n_{1},l_{1}),\ldots,(n_{m},l_{m}))$ , with $n_{1}<\cdots<n_{m}$ and $l_{1},\ldots,l_{m}\in\{1,\ldots,m_{\ast}\}$ , then

[TABLE]

where $(k_{i},j_{i})=(k(l_{i}),j(l_{i}))$ , $i=1,\ldots,m$ , with

[TABLE]

in which the symbols $\lfloor x\rfloor$ and $\{x\}$ denote the integer and the fractional part of $x\geq 0$ respectively. We denote

[TABLE]

that is

[TABLE]

which agrees with (1.1)-(1.2) in dimension 1 ( $d_{\ast}=1$ ).
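To make the passage from powers to multi-linear polynomials concrete, here is a minimal sketch in dimension $d_{\ast}=1$ , where $Z_{n,k}(X)=X_{n}^{k}-{\mathbb{E}}(X_{n}^{k})$ . It assumes, purely for illustration, that $X_{n}$ is uniform on $[0,1]$ (so that ${\mathbb{E}}(X^{k})=1/(k+1)$ ); the function names are hypothetical. The monomial $x_{1}^{k_{1}}x_{2}^{k_{2}}$ is rewritten as a sum of terms that are multi-linear in the centred powers.

```python
def moment(k):
    """E(X^k) for X uniform on [0, 1] (illustrative choice of law)."""
    return 1.0 / (k + 1)

def centred_powers(x, k_star):
    """Z_k = x^k - E(X^k) for k = 1, ..., k_star; returned as a dict keyed by k."""
    return {k: x ** k - moment(k) for k in range(1, k_star + 1)}

def monomial_via_z(x1, x2, k1, k2, k_star):
    """Rewrite x1^k1 * x2^k2 as a multi-linear expression in the centred powers."""
    z1, z2 = centred_powers(x1, k_star), centred_powers(x2, k_star)
    m1, m2 = moment(k1), moment(k2)
    # (Z1 + m1) * (Z2 + m2): one term of order 2, two of order 1, one constant.
    return z1[k1] * z2[k2] + m2 * z1[k1] + m1 * z2[k2] + m1 * m2

value = monomial_via_z(0.3, 0.8, 2, 1, k_star=2)
```

The constant and first order terms produced by the centring are the reason why a polynomial of a given degree in $X$ contributes to several orders in $Z(X)$ .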

The crucial hypothesis in this section is that for every $n\in{\mathbb{N}},$ the law of $X_{n}$ is locally lower bounded by the Lebesgue measure - this is Doeblin’s condition. Let us be more precise.

Hypothesis ${\mathfrak{D}}(\varepsilon,r,R)$ . Let $\varepsilon>0$ , $r>0$ and $R>0$ be fixed. We say that $X=(X_{n})_{n\in{\mathbb{N}}}$ satisfies hypothesis ${\mathfrak{D}}(\varepsilon,r,R)$ if there exist $x_{n}\in{\mathbb{R}}^{d_{\ast}},n\in{\mathbb{N}}$ such that for every measurable set $A\subset B_{r}(x_{n})$

[TABLE]

$\lambda$ denoting the Lebesgue measure on ${\mathbb{R}}^{d_{\ast}}$ , and

[TABLE]

Note that there is no assumption about $X_{n}$ , $n\in{\mathbb{N}}$ , being identically distributed, but the fact that the parameters $\varepsilon$ , $r$ and $R$ are the same for every $n,$ represents a uniformity assumption. Note also that this property never holds for $Z_{n}(X).$ This is why we are obliged to work with $X_{n}$ only.

Hypothesis ${\mathfrak{M}}(\varepsilon,r,R)$ . We say that $X=(X_{n})_{n\in{\mathbb{N}}}$ satisfies hypothesis ${\mathfrak{M}}(\varepsilon,r,R)$ if ${\mathfrak{D}}(\varepsilon,r,R)$ holds and if for every $p\geq 1$ one has $\sup_{n\in{\mathbb{N}}}\left\|X_{n}\right\|_{p}<\infty$ .

Note that if Assumption ${\mathfrak{M}}(\varepsilon,r,R)$ holds then $Z_{n}(X)$ verifies (2.1).

The interesting point about random variables which verify ${\mathfrak{D}}(\varepsilon,r,R)$ is that one may use a splitting method in order to obtain a nice representation for $X_{n}$ (in law). We introduce the auxiliary functions $\theta_{r},\psi_{r}:{\mathbb{R}}\rightarrow{\mathbb{R}}_{+}$ defined by

[TABLE]

and we denote

[TABLE]

Let $V_{n},U_{n}\in{\mathbb{R}}^{d_{\ast}}$ and $\chi_{n}\in\{0,1\}$ be independent random variables with laws

[TABLE]

Note that the hypothesis ${\mathfrak{D}}(\varepsilon,r,R)$ ensures that ${\mathbb{P}}(X_{n}\in dx)-\varepsilon\prod_{k=1}^{d_{\ast}}\psi_{r}(\left|x_{k}-x_{n,k}\right|^{2})dx\geq 0,$ so that the law of $U_{n}$ is well defined. It is easy to check that $\chi_{n}V_{n}+(1-\chi_{n})U_{n}$ has the same law as $X_{n}$ . Since all our statements concern only the law of $X_{n}$ , from now on we assume that

[TABLE]
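The splitting can be sketched on a toy example. The code below assumes a one-dimensional $X$ uniform on $[0,1]$ and replaces the smooth function $\psi_{r}$ by the indicator density of $[0.25,0.75]$ , with $P(\chi=1)=0.5$ ; these choices and all names are assumptions made for illustration only, and the paper's construction uses a smooth $\psi_{r}$ instead. It checks that the remainder law is non-negative and that the mixture reproduces the law of $X$ .

```python
import random

P = 0.5  # mass of the regular part; plays the role of P(chi = 1)

def density_x(x):
    """Density of X: uniform on [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def density_v(x):
    """Density of V: uniform on [0.25, 0.75] (stand-in for the smooth bump)."""
    return 2.0 if 0.25 <= x <= 0.75 else 0.0

def density_u(x):
    """Density of U: the normalised remainder (law of X - P * law of V) / (1 - P)."""
    return (density_x(x) - P * density_v(x)) / (1.0 - P)

def sample_x(rng):
    """Draw chi * V + (1 - chi) * U with chi, V and U independent."""
    if rng.random() < P:                # chi = 1: use the regular part V
        return rng.uniform(0.25, 0.75)
    u = rng.uniform(0.0, 0.5)           # chi = 0: U is uniform on [0, 0.25) and [0.75, 1)
    return u if u < 0.25 else u + 0.5
```

Only the part $V$ carries a density that can be differentiated, which is why the later calculus is built on $V_{n}$ alone.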

Let us mention a nice property for the function $\psi_{r}$ : it is easy to check that for each $k\in{\mathbb{N}},p\geq 1$ there exists a universal constant $C_{k,p}\geq 1$ such that

[TABLE]

where $\theta_{r}^{(k)}$ denotes the derivative of order $k$ of $\theta_{r}.$

Actually, the uniformity property (3.5) has not been used so far. We see now that it gives a “non degeneracy” for the powers of the components of $V_{n}$ uniformly in $n\in{\mathbb{N}}$ . More precisely, we define the random vector $\widetilde{V}_{n}=Z_{n}(V)$ in ${\mathbb{R}}^{m_{*}}$ , that is

[TABLE]

where $k(l)$ and $j(l)$ are given in (3.2). Then, one has the following result.

Lemma 3.1

Let $R>0$ be such that (3.5) holds and let $\mathrm{Cov}(\widetilde{V}_{n})$ denote the covariance matrix of $\widetilde{V}_{n}$ . Then there exists $\lambda_{R}>0$ such that

[TABLE]

for every $\xi\in{\mathbb{R}}^{m_{\ast}}$ and $n\in{\mathbb{N}}$ .

Proof. For $y\in{\mathbb{R}}^{d_{\ast}}$ and $\xi\in{\mathbb{R}}^{m_{\ast}}$ we define

[TABLE]

If $I_{\xi}(y)=0$ then $\sum_{l=1}^{m_{\ast}}(x_{j(l)}^{k(l)}-e_{l}(y))\xi_{l}=0$ for $x$ in an open set, and this implies that $\xi=0.$ Since $\xi\mapsto I_{\xi}(y)$ is continuous, it follows that $\lambda(y)=\inf_{\left|\xi\right|=1}I_{\xi}(y)>0.$ And since $y\mapsto\lambda(y)$ is continuous it follows that one may find $\lambda_{R}>0$ such that $\inf_{\left|y\right|\leq R}\lambda(y)\geq\lambda_{R}$ . Now, we note that $e_{l}(x_{n})={\mathbb{E}}(V_{n,j(l)}^{k(l)})={\mathbb{E}}(\widetilde{V}_{n,l})$ and $I_{\xi}(x_{n})=\langle\mathrm{Cov}(\widetilde{V}_{n})\xi,\xi\rangle$ . Thus, if $|\xi|=1$ we get $\inf_{n}\langle\mathrm{Cov}(\widetilde{V}_{n})\xi,\xi\rangle=\inf_{n}\inf_{|\xi|=1}I_{\xi}(x_{n})\geq\lambda_{R}$ , and (3.12) follows. $\square$
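The uniform lower bound of Lemma 3.1 can be observed on a simple case. The sketch below assumes $d_{\ast}=1$ , $k_{\ast}=2$ and takes $V$ uniform on $[x-r,x+r]$ (an illustrative stand-in for the actual law of $V_{n}$ ); function names are hypothetical. It computes the covariance matrix of $(V,V^{2})$ exactly and its smallest eigenvalue over a grid of centres $|x|\leq R$ .

```python
import math

def uniform_moment(a, b, k):
    """E(V^k) for V uniform on [a, b]."""
    return (b ** (k + 1) - a ** (k + 1)) / ((k + 1) * (b - a))

def cov_powers(x, r):
    """Covariance matrix of (V, V^2) for V uniform on [x - r, x + r]."""
    m = [uniform_moment(x - r, x + r, k) for k in range(5)]  # m[k] = E(V^k)
    return [[m[2] - m[1] * m[1], m[3] - m[1] * m[2]],
            [m[3] - m[1] * m[2], m[4] - m[2] * m[2]]]

def smallest_eigenvalue(a):
    """Smallest eigenvalue of a symmetric 2 x 2 matrix."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return 0.5 * (tr - math.sqrt(max(tr * tr - 4.0 * det, 0.0)))

R, r = 2.0, 0.5
centres = [-R + i * (2 * R) / 40 for i in range(41)]
lambda_R = min(smallest_eigenvalue(cov_powers(x, r)) for x in centres)
```

In this toy case the determinant equals $4r^{6}/135$ for every centre $x$ , so the matrix never degenerates, while the smallest eigenvalue decreases as $|x|$ grows; this is why the bound $|x_{n}|\leq R$ is needed for uniformity.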

We conclude with an inequality which will be useful later on.

Lemma 3.2

Let $R>0$ be such that (3.5) holds and let $\lambda_{R}$ be given in Lemma 3.1. Then for every $d\in\mathcal{C}({\mathbb{R}})$ ,

[TABLE]

with $\widetilde{V}=Z(V)$ defined in (3.11).

Proof. We first fix an integer $m$ , $n_{1}<\cdots<n_{m}$ and we consider $d(l_{1},\ldots,l_{m})$ , $l_{i}\in[m_{\ast}]$ . We prove that

[TABLE]

We define the random variable

[TABLE]

We notice that $\widehat{d}(k),k\in[m_{\ast}]$ are independent of $\widetilde{V}_{n_{m},l},l\in[m_{\ast}]$ and that

[TABLE]

So,

[TABLE]

the above lower bound following from (3.12). By iteration, one gets (3.13).

Consider now the general case. We recall that, for any two multi-indexes $\alpha$ and $\overline{\alpha}$ , ${\mathbb{E}}(\widetilde{V}^{\alpha}\widetilde{V}^{\overline{\alpha}})\neq 0$ if and only if $\alpha^{\prime}=\overline{\alpha}^{\prime}$ . This gives

[TABLE]

where, for fixed $n_{1}<\ldots<n_{m}$ , we have set $d_{n_{1},\ldots,n_{m}}(l_{1},\ldots,l_{m})=d((n_{1},l_{1}),\ldots,(n_{m},l_{m}))$ . The statement now follows from (3.14). $\square$

3.2 Main results

Our goal is to estimate the total variation distance between two polynomials of type $Q_{N,k_{\ast}}(c,X)$ , which we write as in (3.3), that is

[TABLE]

where $Z(X)$ is defined in (3.1) and $\alpha=(\alpha^{\prime},\alpha^{\prime\prime})$ with $\alpha^{\prime\prime}_{i}\in[m_{*}]$ , $m_{*}=d_{*}k_{*}$ .

We will use the following quantities related to the coefficients $c.$ We work first with the Hilbert space $\mathcal{U}={\mathbb{R}}$ (so, we drop $\mathcal{U}$ from the notation) and we recall that $\left|c\right|=\left|c\right|_{\mathcal{U}}$ is defined in (2.2) and $\delta_{\ast}(c)=\delta_{{\mathcal{U}},\ast}(c)$ is defined in (2.3). Moreover, for $m\leq N,$ we define

[TABLE]

Finally we assume that $X$ verifies ${\mathfrak{D}}(\varepsilon,r,R)$ and we denote

[TABLE]

Notice that if $X$ and $Y$ satisfy ${\mathfrak{D}}(\varepsilon,r,R)$ respectively ${\mathfrak{D}}(\varepsilon^{\prime},r^{\prime},R^{\prime}),$ then they both satisfy ${\mathfrak{D}}(\varepsilon\wedge\varepsilon^{\prime},r\wedge r^{\prime},R\vee R^{\prime})$ so we may assume that $\varepsilon$ , $r$ and $R$ are the same.

For $k\in{\mathbb{N}}$ we define the distances

[TABLE]

Note that $d_{0}=d_{\mbox{\rm{\scriptsize{TV}}}}$ is the total variation distance and $d_{1}$ is the Fortet Mourier distance (which metrizes the convergence in law). We give now our first result:

Theorem 3.3

Suppose that $X$ and $Y$ verify Hypothesis ${\mathfrak{M}}(\varepsilon,r,R)$ (that is (2.1) and ${\mathfrak{D}}(\varepsilon,r,R)$ ) and let $c,d\in\mathcal{C}({\mathbb{R}})$ be two families of coefficients. We fix $k,k_{\ast}$ and $N$ and $m\leq N$ and $m^{\prime}\leq N$ such that $|c|_{m}>0$ and $|d|_{m^{\prime}}>0$ and we denote $\overline{m}=m\vee m^{\prime}.$ We also assume that

[TABLE]

Let $\theta\in((\frac{1}{1+k})^{2},1)$ . Then there exist $C>0$ and $a\in(\frac{1}{1+k},1]$ , which depend on the parameters $\varepsilon,r,R,k,k_{\ast},N,m,m^{\prime},\theta$ and the moment bounds $M_{p}(X)$ , $M_{p}(Y)$ for a suitable $p>1,$ but independent of the coefficients $c,d\in\mathcal{C}({\mathbb{R}})$ , such that

[TABLE]

$e_{m,N}(c)$ and $e_{m^{\prime},N}(d)$ being defined in (3.15).

In practical situations, one has $|c|^{2}_{m+1,N}=|d|^{2}_{m^{\prime}+1,N}=0$ or both $|c|^{2}_{m+1,N}$ and $|d|^{2}_{m^{\prime}+1,N}$ are very small, so $d_{k}$ in (3.16) is actually the $d_{k}$ -distance between $Q_{N,k_{\ast}}(c,X)$ and $Q_{N,k_{\ast}}(d,Y)$ .

The proof of Theorem 3.3 uses a Malliavin type calculus based on $V_{n},n\in{\mathbb{N}}$ which we present in Section 5, so we postpone it to Section 5.5. It represents the main effort in our paper.

As an immediate consequence, we give the following estimate of the total variation distance between two multiple stochastic integrals. We consider a $m_{\ast}$ dimensional Brownian motion $W=(W^{1},\ldots,W^{m_{\ast}}),$ we fix $\kappa=(k_{1},\ldots,k_{m})\in[m_{\ast}]^{m},$ and, for a symmetric kernel $f\in L^{2}[0,1]^{m},$ we denote

[TABLE]

Theorem 3.4

Let $f,g\in L^{2p}[0,1]^{m},p>1.$ Then, for every $k,m\in{\mathbb{N}}_{\ast}$ and $\theta\in((\frac{1}{1+k})^{2},1)$ there exist $C>0$ and $a\in(\frac{1}{1+k},1)$ (both depending on $\theta,m$ and $k$ ) such that

[TABLE]

Remark 3.5

In the case $k=1,$ the above result was first announced in [10] with the power $\frac{1}{m}$ instead of $\frac{\theta}{2m+1}$ above, but the proof was only sketched. It was rigorously proved in [29] with power $\frac{1}{2m+1}$ and recently improved in [8] where the power $\frac{1}{m}\times(\ln m)^{d}$ is obtained. So (3.18) is not the best possible estimate. This also indicates that the power in (3.17) is not optimal (but the approach in [8] does not seem to work in our general framework, so for the moment we are not able to improve it).

Remark 3.6

Theorem 3.4, with exactly the same proof, extends to general random variables which live in a finite sum of Wiener chaoses: let $F$ and $G$ be two random variables belonging to $\oplus_{m=0}^{N}\mathcal{W}_{m}$ where $\mathcal{W}_{m}$ is the chaos of order $m.$ We denote by $P_{m}$ the projection on $\mathcal{W}_{m}$ and we put $m(F)=\max\{m:P_{m}F\neq 0\}$ and $\alpha(F)=\|P_{m(F)}F\|_{2}^{-2/{m(F)}}.$ Then, with $N=m(F)\vee m(G),$

[TABLE]

where $a\in(\frac{1}{1+k},1)$ and $C>0$ depend on $\theta,k,N$ .

Proof of Theorem 3.4. Let $n\in{\mathbb{N}}.$ For $\alpha^{\prime}=(\alpha_{1}^{\prime},\ldots,\alpha_{m}^{\prime})\in[n-1]^{m}$ , we denote $I_{\alpha^{\prime}}=\prod_{j=1}^{m}[\frac{\alpha_{j}^{\prime}}{n},\frac{\alpha_{j}^{\prime}+1}{n})$ and we define

[TABLE]

Note that $f_{n}$ is the conditional expectation of $f$ with respect to the partition $I_{\alpha^{\prime}}$ and to the uniform law on $[0,1]^{m}.$ Take now $\alpha=(\alpha_{1},\ldots,\alpha_{n})$ with $\alpha_{i}=(\alpha_{i}^{\prime},\alpha_{i}^{\prime\prime})$ and $(\alpha_{1}^{\prime\prime},\ldots,\alpha_{m}^{\prime\prime})\in[m_{\ast}]^{m}$ . We denote

[TABLE]

so that

[TABLE]

We are now in the framework of Theorem 3.3 and we compare $\Phi_{m}(c_{n,f},G)$ and $\Phi_{m}(c_{n,g},G)$ . We take $k_{\ast}=1,d_{\ast}=m_{\ast}$ and $N=m=m^{\prime}.$ Then $\left|c_{n,f}\right|_{m+1,N}=\left|c_{n,g}\right|_{m+1,N}=0$ . Let us estimate the parameters associated to $c_{n,f}.$ By the convergence theorem for martingales $\left|c_{n,f}\right|_{m}^{2}=m!\left\|f_{n}\right\|_{2}^{2}\rightarrow m!\left\|f\right\|_{2}^{2}>0.$ We estimate now $\delta_{\ast}(c_{n,f})$ . By using Hölder’s inequality,

[TABLE]

so that $e_{m,m}(c_{n,f})\rightarrow 0$ and $e_{m,m}(c_{n,g})\rightarrow 0$ as $n\rightarrow\infty$ .

Now (3.17) gives, for $\theta<1,$ and $n,n^{\prime}\in{\mathbb{N}}$

[TABLE]

where $a\in(\frac{1}{1+k},1)$ . We take $n^{\prime}>n$ and we notice that $d_{k}(I_{\kappa}(f_{n}),I_{\kappa}(f_{n^{\prime}}))\leq\left\|f_{n}-f_{n^{\prime}}\right\|_{2}\rightarrow 0$ so that the above inequality gives $d_{\mbox{\rm{\scriptsize{TV}}}}(I_{\kappa}(f_{n}),I_{\kappa}(f_{n^{\prime}}))\rightarrow 0$ as $n,n^{\prime}\rightarrow\infty.$ It follows that the sequences $I_{\kappa}(f_{n})$ and $I_{\kappa}(g_{n}),n\in{\mathbb{N}}$ are Cauchy in $d_{\mbox{\rm{\scriptsize{TV}}}}$ and we may pass to the limit in (3.20) in order to obtain (3.18). $\square$
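The approximation of $f$ by the piecewise constant kernels $f_{n}$ used in this proof can be sketched as follows. The code assumes $m=2$ , a kernel given by its values on a fine $M\times M$ grid, and dyadic values of $n$ so that the partitions are nested; the kernel $\min(s,t)$ and the function names are illustrative choices, not taken from the paper.

```python
def block_average(f, n):
    """Average an M x M grid over the coarser n x n partition (n must divide M)."""
    M = len(f)
    b = M // n
    coarse = [[sum(f[i * b + u][j * b + v] for u in range(b) for v in range(b)) / (b * b)
               for j in range(n)] for i in range(n)]
    # Return f_n on the fine grid: constant on each cell of the coarse partition.
    return [[coarse[i // b][j // b] for j in range(M)] for i in range(M)]

def l2_norm(f):
    """L^2([0,1]^2) norm of a function that is constant on the M x M grid cells."""
    M = len(f)
    return (sum(v * v for row in f for v in row) / (M * M)) ** 0.5

M = 64
f = [[min(i, j) / M for j in range(M)] for i in range(M)]  # symmetric kernel
norms = [l2_norm(block_average(f, n)) for n in (1, 2, 4, 8, 16, 32, 64)]
```

Along nested partitions the norms $\|f_{n}\|_{2}$ increase to $\|f\|_{2}$ , which is the discrete counterpart of the martingale convergence invoked above.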

We now give the analogue of Theorem 3.3 in terms of the Kolmogorov distance. Here one no longer needs Doeblin’s condition or non-degeneracy conditions.

Theorem 3.7

Suppose that $X$ and $Y$ verify (2.1) and are such that $Z(X)$ and $Z(Y)$ both satisfy (2.14). Let $c,d\in\mathcal{C}({\mathbb{R}})$ be two families of coefficients such that $\left|c\right|_{N}>0$ and $\left|d\right|_{N}>0,$ with $\delta_{\ast}(c),\delta_{\ast}(d)\leq 1$ . Then, for every $k\in{\mathbb{N}}$ and $\theta\in((\frac{1}{1+k})^{2},1)$ there exist $C>0$ and $a\in(\frac{1}{1+k},1)$ such that

[TABLE]

where $C>0$ denotes a constant depending on $N$ , suitable moments of $X$ and $Y$ and on the lower bounds $\underline{\lambda}$ in (2.14) applied to $Z(X)$ and $Z(Y)$ .

Remark 3.8

Note that the estimate (3.21) is in terms of $\delta_{\ast}^{\theta/(2(k\vee 3)N+1)}(c)$ whereas in (3.17) it appears $e_{m,N}(c)=\exp(-C\times\frac{|c|_{m}^{2}}{\delta_{\ast}^{2}(c)})$ which is much smaller. But we need that $X_{n}$ and $Y_{n}$ satisfy Doeblin’s condition ${\mathfrak{D}}(\varepsilon,r,R).$

Proof. We consider the Gaussian random variables $G_{X}$ and $G_{Y}$ corresponding to $Z(X)$ and $Z(Y)$ respectively and we use Theorem 2.3 (see (2.15)) in order to obtain

[TABLE]

Using the same argument as in the proof of Theorem 2.3 we may assume that $G_{X}$ and $G_{Y}$ are standard Gaussian random variables so that $S_{N}(c,G_{X})$ and $S_{N}(d,G_{Y})$ are multiple stochastic integrals. By $d_{{\mbox{\rm{\scriptsize{Kol}}}}}\leq d_{{\mbox{\rm{\scriptsize{TV}}}}}$ , and by (3.19) first and then (2.12) (recall that $Q_{N,k_{\ast}}(c,X)=S_{N}(c,Z_{n}(X))$ ), we get

[TABLE]

$\square$
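The inequality $d_{{\mbox{\rm{\scriptsize{Kol}}}}}\leq d_{{\mbox{\rm{\scriptsize{TV}}}}}$ used in this proof can be checked on finite laws. The sketch below assumes two probability vectors on the ordered points $0,1,2,3$ and the normalisation of $d_{0}$ with test functions bounded by $1$ , under which the total variation distance of discrete laws is $\sum_{i}|p_{i}-q_{i}|$ ; the vectors and function names are illustrative.

```python
def tv_distance(p, q):
    """d_0(p, q) = sup over |f| <= 1 of |E_p f - E_q f|, i.e. sum of |p_i - q_i|."""
    return sum(abs(a - b) for a, b in zip(p, q))

def kolmogorov_distance(p, q):
    """sup over x of |P(F <= x) - P(G <= x)| for laws on the ordered points 0, 1, ..."""
    best = cum_p = cum_q = 0.0
    for a, b in zip(p, q):
        cum_p += a
        cum_q += b
        best = max(best, abs(cum_p - cum_q))
    return best

p = [0.1, 0.4, 0.3, 0.2]
q = [0.3, 0.2, 0.2, 0.3]
d_tv = tv_distance(p, q)
d_kol = kolmogorov_distance(p, q)
```

The Kolmogorov distance only tests half-lines, so it can be much smaller than the total variation distance; this is the price paid in Theorem 3.7 for dropping Doeblin's condition.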

We give now the invariance principle:

Theorem 3.9

Let $X=(X_{n})_{n\in{\mathbb{N}}}$ be a sequence of independent ${\mathbb{R}}^{d_{\ast}}$ valued random variables which verify Hypothesis ${\mathfrak{M}}(\varepsilon,r,R)$ and $G_{X}=(G_{n,X})_{n\in{\mathbb{N}}},G_{n,X}\in{\mathbb{R}}^{m_{\ast}}$ a sequence of independent and centred Gaussian random variables such that $\mathrm{Cov}(G_{n,X})=\mathrm{Cov}(Z_{n}(X)).$ Suppose that for some $m\leq N$ one has $|c|_{m}>0.$ Let $\theta\in(\frac{1}{16},1)$ . Then there exist $C>0$ and $a\in(\frac{1}{4},1]$ , which depend on the parameters $\varepsilon,r,R,k_{\ast},N,m,m^{\prime},\theta$ and the moment bounds $M_{p}(X)$ , $M_{p}(Y)$ for a suitable $p>1$ but independent of the coefficients $c\in\mathcal{C}({\mathbb{R}})$ , such that

[TABLE]

Proof. This is an immediate consequence of Theorem 3.3 and of Theorem 2.2. $\square$

In a number of concrete applications (see Theorem 4.3 for example), one takes $S_{N}(c,Z(X))=\sum_{n=m}^{N}\Phi_{n}(c,Z(X))$ and, asymptotically, $\Phi_{m}(c,Z(X))$ represents the principal term. With this in mind we give the following corollary:

Theorem 3.10

Let $c\in\mathcal{C}({\mathbb{R}})$ be such that $c(\alpha)=0$ for $\left|\alpha\right|\leq m-1$ and $\left|c\right|_{m}>0.$ Suppose $|c|_{m+1,N}\leq 1$ .

A. If $G=(G_{n})_{n\in{\mathbb{N}}}$ denote independent centred Gaussian random variables then, for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

B. Let $X$ satisfy ${\mathfrak{M}}(\varepsilon,r,R)$ and let $G=(G_{n})_{n\in{\mathbb{N}}},G_{n}\in{\mathbb{R}}^{m_{\ast}},$ be a sequence of independent and centred Gaussian random variables such that $\mathrm{Cov}(G_{n})=\mathrm{Cov}(Z_{n}(X))$ . Then for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

C. If $Z(X)$ satisfies (2.14) then for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

In the above estimates (3.23), (3.24) and (3.25), $C>0$ denotes a constant independent of the coefficients $c\in\mathcal{C}({\mathbb{R}})$ .

Proof. One has

[TABLE]

so (3.23) follows from Theorem 3.3 (see (3.17)). Using (3.23) and (3.22) we obtain (3.24). And (3.25) follows from (3.23) and (2.15). $\square$

3.3 Gaussian and Gamma approximation

Theorem 3.10 has the following interesting application: if one considers a sequence of coefficients $c_{n}\in\mathcal{C}({\mathbb{R}}),n\in{\mathbb{N}},$ the study of the asymptotic behavior of $Q_{N,k_{\ast}}(c_{n},X),n\in{\mathbb{N}}$ reduces to the study of the asymptotic behavior of $\Phi_{m}(c_{n},G),n\in{\mathbb{N}}$ , where $G=(G_{n})_{n\in{\mathbb{N}}},G_{n}\in{\mathbb{R}}^{m_{\ast}},$ is a sequence of independent and centred Gaussian random variables such that $\mathrm{Cov}(G_{n})=\mathrm{Cov}(Z_{n}(X))$ . Since $\Phi_{m}(c_{n},G)$ is (nearly) a multiple Wiener stochastic integral of order $m,$ this problem is already treated at least in two significant cases: the convergence to normality and the convergence to a Gamma distribution. In fact, the convergence to normality of the law of $\Phi_{m}(c_{n},G)$ is controlled by the Fourth Moment Theorem due to Nualart and Peccati [33] and Nourdin and Peccati [25]. And the convergence to a Gamma distribution (and in particular to a $\chi^{2}$ distribution) is treated in [25]. In order to give the consequences of these results in our framework we have to identify the link between the notation in our paper and in the above mentioned works. Note that the coefficients $c\in\mathcal{C}({\mathbb{R}})$ have been defined as $c(\alpha)$ with $\alpha=(\alpha_{1},\ldots\alpha_{m})$ , $\alpha_{i}=(\alpha_{i}^{\prime},\alpha_{i}^{\prime\prime})$ , with $\alpha^{\prime}$ on the simplex $\alpha_{1}^{\prime}<\ldots<\alpha_{m}^{\prime}.$ We extend them by symmetry on the whole $({\mathbb{N}}\times[m_{\ast}])^{m}$ and we denote by $c_{s}$ this extension (with the convention that $c_{s}(\alpha)$ is zero if $\alpha_{i}=\alpha_{j}$ for $i\neq j)$ . So we will have

[TABLE]

The second point is to write the sequence of multi-dimensional random variables $G_{n}=(G_{n,1},\ldots,G_{n,m_{\ast}})\in{\mathbb{R}}^{m_{\ast}},$ $n\in{\mathbb{N}}$ as a sequence of one-dimensional random variables $\overline{G}_{n}\in{\mathbb{R}},n\in{\mathbb{N}}$ and to re-index the coefficients in a corresponding way. But we have to note first that $G_{n,1},\ldots,G_{n,m_{\ast}}$ are not a priori independent, because $\mathrm{Cov}(G_{n})=\mathrm{Cov}(Z_{n}(X))$ is not the identity matrix. So we assume that $\mathrm{Cov}(Z_{n}(X))$ is invertible and we first use (2.17) in order to write

[TABLE]

with $\overline{c}$ defined in (2.16). Now $\overline{G}_{n,1},\ldots,\overline{G}_{n,m_{\ast}}$ are independent and we are ready to write them as a sequence. We define $I:{\mathbb{N}}\times[m_{\ast}]\rightarrow{\mathbb{N}}$ by $I(n,j)=n\times m_{\ast}+j$ . Denoting by $\lfloor x\rfloor$ and $\{x\}$ the integer part and the fractional part of $x$ respectively, the inverse function $J=I^{-1}:{\mathbb{N}}\rightarrow{\mathbb{N}}\times[m_{\ast}]$ is then defined as follows: $J(n)=(\lfloor n/m_{\ast}\rfloor,\{n/m_{\ast}\}m_{\ast})$ if $\{n/m_{\ast}\}>0$ and $J(n)=(\lfloor n/m_{\ast}\rfloor-1,m_{\ast})$ if $\{n/m_{\ast}\}=0$ . We extend this definition to multi-indexes: if $\beta=(n_{1},\ldots,n_{m})\in{\mathbb{N}}^{m}$ then $J(\beta)=(J(n_{1}),\ldots,J(n_{m}))\in({\mathbb{N}}\times[m_{\ast}])^{m}.$ And to coefficients: if $f:({\mathbb{N}}\times[m_{\ast}])^{m}\rightarrow{\mathbb{R}}$ we define $\widehat{f}:{\mathbb{N}}^{m}\rightarrow{\mathbb{R}}$ by $\widehat{f}(\beta)=f(J(\beta)).$ Moreover, we consider the sequence $\widehat{G}_{n}=\overline{G}_{J(n)},n\in{\mathbb{N}}.$ Then

[TABLE]

with the convention that now we work with the multi-index $\alpha\in{\mathbb{N}}^{m}.$ Note that $\Phi_{m}(\widehat{c}_{s},\widehat{G})$ is a multiple stochastic integral of order $m.$
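The re-indexing maps $I$ and $J$ are easy to get wrong by one, so a minimal sketch may help. It assumes that the first index $n$ starts at $0$ and that $j\in\{1,\ldots,m_{\ast}\}$ , and it uses integer division with remainder in place of the fractional part; the parameter name `m_star` and the helper `J_multi` are introduced here for illustration.

```python
def I(n, j, m_star):
    """Flatten the pair (n, j), j in {1, ..., m_star}, to the integer n * m_star + j."""
    return n * m_star + j

def J(k, m_star):
    """Inverse of I, written with integer arithmetic instead of fractional parts."""
    q, rem = divmod(k, m_star)  # rem / m_star is the fractional part of k / m_star
    return (q, rem) if rem > 0 else (q - 1, m_star)

def J_multi(beta, m_star):
    """Apply J componentwise to a multi-index beta = (n_1, ..., n_m)."""
    return tuple(J(k, m_star) for k in beta)
```

For instance, with $m_{\ast}=3$ the integers $4,5,6$ are sent to $(1,1),(1,2),(1,3)$ , the last one being the case where the fractional part vanishes.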

We introduce now the “contraction operators”. For $0\leq r\leq m$ and $\alpha,\beta\in\Gamma_{m-r}$ one denotes $\widehat{c}_{s}\otimes_{r}\widehat{c}_{s}(\alpha,\beta)=\sum_{\gamma\in\Gamma_{r}}\widehat{c}_{s}(\alpha,\gamma)\widehat{c}_{s}(\beta,\gamma)$ with the convention that for $r=0$ we put $\widehat{c}_{s}\otimes_{0}\widehat{c}_{s}(\alpha,\beta)=\widehat{c}_{s}(\alpha)\widehat{c}_{s}(\beta)$ and for $r=m,$ $\widehat{c}_{s}\otimes_{m}\widehat{c}_{s}=\sum_{\gamma\in\Gamma_{m}}\widehat{c}_{s}(\gamma)\widehat{c}_{s}(\gamma).$ Note that, even if $\widehat{c}_{s}$ is symmetric, $\widehat{c}_{s}\otimes_{r}\widehat{c}_{s}$ is not symmetric, so we introduce $\widehat{c}_{s}\widetilde{\otimes}_{r}\widehat{c}_{s}$ to be the symmetrization of $\widehat{c}_{s}\otimes_{r}\widehat{c}_{s}.$
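The contraction and its symmetrization can be sketched for kernels on a finite index set. The code below assumes that indices range over $\{0,\ldots,n-1\}$ and that the sum over $\gamma$ runs over all $r$-tuples of indices, with kernels stored as dictionaries keyed by index tuples; these conventions and the function names are assumptions made for illustration.

```python
from itertools import permutations, product

def contraction(c, m, r, n):
    """(c (x)_r c)(alpha, beta) = sum over gamma of c(alpha, gamma) * c(beta, gamma)."""
    out = {}
    for alpha in product(range(n), repeat=m - r):
        for beta in product(range(n), repeat=m - r):
            out[alpha + beta] = sum(c.get(alpha + gamma, 0.0) * c.get(beta + gamma, 0.0)
                                    for gamma in product(range(n), repeat=r))
    return out

def symmetrize(k, order):
    """Average a kernel of the given order over all permutations of its arguments."""
    perms = list(permutations(range(order)))
    return {idx: sum(k[tuple(idx[s] for s in sigma)] for sigma in perms) / len(perms)
            for idx in k}

# Symmetric kernel of order 2 on {0, 1, 2}, vanishing on the diagonal.
c = {(0, 1): 1.0, (1, 0): 1.0, (1, 2): 2.0, (2, 1): 2.0}
c0 = contraction(c, 2, 0, 3)   # tensor product, a kernel of order 4
c1 = contraction(c, 2, 1, 3)   # kernel of order 2
c2 = contraction(c, 2, 2, 3)   # scalar, stored under the empty tuple
c0_sym = symmetrize(c0, 4)
```

Here `c0[(0, 1, 1, 2)]` and `c0[(0, 2, 1, 1)]` differ although the two index tuples are permutations of each other, which illustrates why the symmetrized contraction has to be introduced.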

We introduce now

[TABLE]

It is known (see [25]) that $\kappa_{4,m}(\widehat{c}_{s})$ is equal to the fourth cumulant of $\Phi_{m}(\widehat{c}_{s},\widehat{G})$ and moreover, it is proved in [25] that, if $\mathcal{N}$ is a standard normal random variable, then

[TABLE]

Using this and Theorem 3.10 we immediately obtain

Theorem 3.11

Let $\mathcal{N}$ be a standard normal random variable.

A. If $X$ satisfies ${\mathfrak{M}}(\varepsilon,r,R)$ and, for every $n\in{\mathbb{N}},$ $\mathrm{Cov}(Z_{n}(X))$ is invertible, then for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

B. If $Z(X)$ satisfies (2.1) and (2.14) then for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

In the above estimates (3.27) and (3.28), $C>0$ denotes a constant independent of the coefficients $c\in\mathcal{C}({\mathbb{R}})$ .

Remark 3.12

This is a generalization of the “fourth moment theorem” to stochastic polynomials. However there is a difference because the influence factor $\delta_{\ast}(c)$ appears in (3.27). One may ask if it is possible to control the distance between stochastic polynomials and the normal distribution in terms of $\kappa_{4,m}(\widehat{c}_{s})$ only. An affirmative answer has recently been given in the following more particular framework: assume that $d_{\ast}=k_{\ast}=1$ so that $\Phi_{m}(c,X)$ is a multi-linear polynomial. Assume also that the random variables $X_{n},n\in{\mathbb{N}}$ are identically distributed. Then, if ${\mathbb{E}}(X_{1}^{4})\geq 3,$ the convergence to normality is controlled by $\kappa_{4,m}(\widehat{c}_{s})$ only (see Theorem 2.3 in [26]).

We discuss now the convergence to a Gamma distribution. For $\nu\geq 1$ we consider $F(\nu)$ a centred Gamma distribution of parameter $\nu$ : $F(\nu)=2G(\nu/2)-\nu$ where $G(\nu/2)$ has a Gamma law with parameter $\nu/2$ (that is, with density $g_{\nu/2}(x)\varpropto x^{\nu/2-1}e^{-x}1_{x>0}$ ). If $\nu$ is integer then $F(\nu)$ is a centred chi-square distribution with $\nu$ degrees of freedom. We introduce

[TABLE]

with $\theta_{m}=\frac{1}{4}(m/2)!\left(\begin{array}[]{c}m\\ m/2\end{array}\right).$ Combining Theorem 3.11 and Proposition 3.13 from [25] one obtains

[TABLE]

If $\nu$ is an integer then $F(\nu)$ has a centred $\chi^{2}(\nu)$ distribution, so it may be represented as a polynomial of degree two in Gaussian random variables. Then, using Theorem 5.9 in [8] one obtains

[TABLE]

Then, using Theorem 3.10 we obtain

Theorem 3.13

Let $\mathcal{X}_{\nu}$ be a random variable with a centred $\chi^{2}$ distribution with $\nu$ degrees of freedom.

A. If $X$ satisfies ${\mathfrak{M}}(\varepsilon,r,R)$ and, for every $n\in{\mathbb{N}},$ $\mathrm{Cov}(Z_{n}(X))$ is invertible, then for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

B. If $Z(X)$ satisfies (2.1) and (2.14) then for every $\theta\in(\frac{1}{4},1)$ there exists $a\in(\frac{1}{2},1]$ such that

[TABLE]

In the above estimates (3.30) and (3.31), $C>0$ denotes a constant independent of the coefficients $c\in\mathcal{C}({\mathbb{R}})$ .

4 Examples

4.1 U-statistics associated to polynomial kernels

Let us first briefly recall how U-statistics arise. One considers a class of distributions $\mathcal{M}$ and aims to estimate a functional $\theta(\mu)$ with $\mu\in\mathcal{M}.$ In order to do it one has at hand a sequence of independent random variables $X_{1},\ldots,X_{n}$ with law $\mu\in\mathcal{M},$ but does not know which law it is. The goal is to construct an unbiased estimator, that is a sequence of functions $f_{n}:{\mathbb{R}}^{n}\rightarrow{\mathbb{R}},$ such that the estimator $U_{n}=f_{n}(X_{1},\ldots,X_{n})$ converges to $\theta(\mu)$ and moreover ${\mathbb{E}}(U_{n})=\theta(\mu)$ for every $\mu\in\mathcal{M}.$ This means that the estimator is unbiased - and this is the origin of the name U-statistics. In 1948 Halmos [15] asked whether such an unbiased estimator exists and whether it is unique. It turns out that the necessary and sufficient condition in order to be able to construct such an estimator is that $\theta(\mu)$ has the following particular form: there exists $N\in{\mathbb{N}}$ and a measurable function $\psi:{\mathbb{R}}^{N}\rightarrow{\mathbb{R}}$ such that

[TABLE]

In this case one may construct the symmetric unbiased estimator $f_{n}$ (and if $\mathcal{M}$ is sufficiently large, this estimator is unique in the class of the symmetric estimators) in the following way:

[TABLE]

where the sum $\sum_{(n,N)}$ is taken over all the subsets $\{i_{1},\ldots,i_{N}\}\subset\{1,\ldots,n\}$ such that $i_{k}\neq i_{p}$ for $k\neq p$ . It is clear that $\psi$ may be taken to be symmetric (if not, one takes its symmetrization and this changes nothing).

When $\psi(x_{1},\ldots,x_{N})$ is a polynomial, this fits in our framework and our results apply, but, for example $\psi(x_{1},\ldots,x_{N})=\max\{\left|x_{1}\right|,\ldots,\left|x_{N}\right|\},$ is out of reach. We will treat first two standard examples.

Example 1. (Variance estimator) We denote $m_{X}={\mathbb{E}}(X),v_{X}={\mathbb{E}}((X-{\mathbb{E}}(X))^{2})$ and $q_{X}=\mathrm{Var}(2m_{X}X-X^{2})$ . We take $\psi(x_{1},x_{2})=\frac{1}{2}(x_{1}-x_{2})^{2}$ so that

[TABLE]

In order to come back in our framework we write

[TABLE]

It follows that

[TABLE]

thus

[TABLE]

In our notation, we have

[TABLE]

where $c_{n}(\alpha)=0$ if $|\alpha|\neq 1,2$ and

[TABLE]

The quantities which appear in our convergence theorem are

[TABLE]

Our invariance principle (Theorem 3.9) says that $Q_{2,2}(c_{n},X)$ is asymptotically equivalent in total variation distance with

[TABLE]

where $G_{j}=(G_{1,j},G_{2,j})$ are Gaussian random variables with the same mean and covariance as $(X-{\mathbb{E}}(X),X^{2}-{\mathbb{E}}(X^{2})).$ Then $-2m_{X}G_{1}+G_{2}$ is a centred Gaussian random variable with variance $q_{X}=\mathrm{Var}(2m_{X}X-X^{2})$ so, if ${\mathfrak{M}}(\varepsilon,r,R)$ holds, then Theorem 3.9 and Theorem 3.10 yield

[TABLE]

for every $\theta<1$ , with $\Delta$ a standard normal random variable.
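As a sanity check on Example 1, the sketch below computes the U-statistic of the kernel $\psi(x_{1},x_{2})=\frac{1}{2}(x_{1}-x_{2})^{2}$ and compares it with the usual unbiased variance estimator. It assumes the standard normalisation by the number of $N$-element subsets, which we take to match (4.2); the sample values and function names are illustrative.

```python
from itertools import combinations
from math import comb

def u_statistic(psi, xs, N):
    """Average of a symmetric kernel psi over all N-element subsets of the sample."""
    return sum(psi(*sub) for sub in combinations(xs, N)) / comb(len(xs), N)

def variance_kernel(x1, x2):
    """Kernel of Example 1: half the squared difference."""
    return 0.5 * (x1 - x2) ** 2

def unbiased_variance(xs):
    """Sample variance with the 1 / (n - 1) normalisation."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** 2 for x in xs) / (n - 1)

sample = [2.0, -1.0, 0.5, 3.5, 1.0]
u_n = u_statistic(variance_kernel, sample, 2)
```

The two quantities coincide for every sample, which is the algebraic identity behind the decomposition of $U_{n}$ into a linear and a quadratic part in the centred variables.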

Remark 4.1

Another way to do things, used in U-statistics theory, is the following. One employs the two dimensional CLT in order to prove that the term normalized with $1/\sqrt{n}$ converges in law to $\sqrt{q_{X}}\Delta$ and then one notes that the remaining term is smaller, so it may be ignored.

Example 2. We look at the U-statistic associated to $\psi(x_{1},x_{2})=x_{1}x_{2}.$ We set $m_{X}={\mathbb{E}}(X)$ and $v_{X}=\mathrm{Var}(X)$ . Here $\psi$ is not invariant with respect to translations and we have two different limits according to whether $m_{X}$ is null or not. We write

[TABLE]

so that

[TABLE]

Case 1: $m_{X}\neq 0$ . Then

[TABLE]

with $c_{n}(\alpha)=0$ if $|\alpha|\neq 1,2$ and

[TABLE]

One has

[TABLE]

Using Theorem 3.9 and Theorem 3.10, the asymptotic behavior of $\sqrt{n}(U_{n}-m_{X}^{2})$ is equivalent to the behavior of

[TABLE]

with $\Delta$ standard normal.

Case 2: $m_{X}=0$ . Then

[TABLE]

where $c_{n}(\alpha)=0$ if $|\alpha|\neq 2$ and

[TABLE]

Here,

[TABLE]

Using the invariance principle (Theorem 3.9) this is close to $\frac{v_{X}}{n-1}\sum_{i_{1}\neq i_{2}}G_{i_{1}}G_{i_{2}}$ with $G_{i},i\in{\mathbb{N}}$ independent standard normal random variables. We define $D_{n}=[0,1]^{2}\smallsetminus\cup_{i=0}^{n-1}[\frac{i}{n},\frac{i+1}{n})^{2}$ and $f_{n}(s_{1},s_{2})=\frac{nv_{X}}{n-1}1_{D_{n}}(s_{1},s_{2}).$ Then the law of $\frac{v_{X}}{n-1}\sum_{i_{1}\neq i_{2}}G_{i_{1}}G_{i_{2}}$ coincides with the law of the double Itô integral $I_{2}(f_{n}).$ Setting $f\equiv v_{X}$ , we recall that the law of $I_{2}(f)$ coincides with the law of $v_{X}(\Delta^{2}-1)$ where $\Delta$ is standard normal. Then, using Theorem 3.9 (with $k_{\ast}=1,N=m=2)$ and Theorem 3.4 (with $k=1,m=2)$ one obtains, for every $\theta<1,$

[TABLE]

An alternative way to solve the problem is to write

[TABLE]

and to use the CLT in order to replace $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}X_{i}$ with $\sqrt{v_{X}}\Delta$ and to say that by the law of large numbers the last term goes to $v_{X}$ . This gives the convergence in law of $nU_{n}^{\psi}$ to $v_{X}(\Delta^{2}-1).$
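The alternative argument rests on a purely algebraic rewriting of $nU_{n}^{\psi}$ , which can be checked numerically. The sketch below assumes the normalisation $U_{n}^{\psi}=\frac{1}{n(n-1)}\sum_{i_{1}\neq i_{2}}X_{i_{1}}X_{i_{2}}$ , consistent with the factor $\frac{1}{n-1}$ appearing above; function names are hypothetical.

```python
def n_times_u(xs):
    """n * U_n for the kernel psi(x1, x2) = x1 * x2, summing over ordered pairs i1 != i2."""
    n = len(xs)
    total = sum(xs[i] * xs[j] for i in range(n) for j in range(n) if i != j)
    return total / (n - 1)

def block_form(xs):
    """The same quantity written with a CLT block and a law-of-large-numbers block."""
    n = len(xs)
    clt_block = sum(xs) / n ** 0.5          # (1 / sqrt(n)) * sum of X_i
    lln_block = sum(x * x for x in xs) / n  # (1 / n) * sum of X_i^2
    return n / (n - 1) * (clt_block ** 2 - lln_block)

xs = [0.4, -1.2, 0.7, 2.1, -0.3, -1.7]
```

Both functions return the same value for any sample, so the limit $v_{X}(\Delta^{2}-1)$ is obtained by replacing the first block with $\sqrt{v_{X}}\Delta$ and the second with $v_{X}$ .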

Remark 4.2

The above two examples suggest the following rough comparison of the strategies employed in the U-statistics theory on one hand and in our paper on the other hand. In the U-statistics theory one tries to make blocks of terms such that in the end $U_{n}^{\psi}$ appears as a continuous function of blocks of the form $\frac{1}{\sqrt{n}}\sum_{i=1}^{n}Y_{i}$ or $\frac{1}{n}\sum_{i=1}^{n}Y_{i}^{2}$ and then use the CLT, respectively the law of large numbers, in order to replace them, asymptotically, by a Gaussian random variable respectively by a constant. Alternatively, in our paper one begins by using the invariance principle in order to change $X_{i}-{\mathbb{E}}(X_{i})$ and $X_{i}^{2}-{\mathbb{E}}(X_{i}^{2})$ by Gaussian random variables $G_{i,1}$ and $G_{i,2}.$ And then one solves the problem of the asymptotic behavior in the framework of Wiener chaoses.

Let us go on and look at general polynomials. We fix $k_{\ast},N\in{\mathbb{N}}$ , we denote $\mathcal{K}_{N}=\{0,1,\ldots,k_{\ast}\}^{N},$ and we define

[TABLE]

with symmetric coefficients $a(\kappa)$ which are null on the diagonals. So $\psi$ is a general symmetric polynomial of order $k_{\ast}$ in the variables $x_{1},\ldots,x_{N}.$ We associate to $\psi$ the U-statistic $U_{n}^{\psi}$ defined in (4.2):

[TABLE]

The above quantity is linked with the stochastic polynomials defined in the previous sections in the following way. One takes $d_{\ast}=1$ and $m_{\ast}=k_{\ast}$ and constructs coefficients $c_{n}$ such that $U_{n}^{\psi}=Q_{N,k_{\ast}}(c_{n},X)=S_{N}(c_{n},Z(X))$ with $Z(X)$ associated to $X$ in (3.1): $Z_{i,k}(X)=X_{i}^{k}-{\mathbb{E}}(X_{i}^{k}),k=1,\ldots,k_{\ast}.$ The problem is that $Z_{i,k}(X)$ is centred whereas $X_{i}^{k},$ which appears in (4.4), is not. It turns out that the operation which consists in centering $X_{i}^{k}$ in (4.4) is exactly the Hoeffding decomposition, introduced by Hoeffding in [16, 17], which plays a crucial role in the theory of U-statistics. Let us recall it. For $1\leq j\leq N,$ one defines the kernels

[TABLE]

Then Hoeffding’s decomposition is the following:

[TABLE]

where $U_{n}^{h_{j}}$ is the U-statistic associated to $h_{j}$ in the first equality from (4.4) (with $N$ replaced by $j)$ . See for example Theorem 1 in Section 1.6 in [19] for the proof of (4.5).
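To make the centering step concrete, here is a minimal sketch for the simplest kernel $\psi(x_{1},x_{2})=x_{1}x_{2}$ (so $N=2$). It checks the elementary identity $x_{1}x_{2}=m_{1}^{2}+m_{1}(x_{1}-m_{1})+m_{1}(x_{2}-m_{1})+(x_{1}-m_{1})(x_{2}-m_{1})$ at the level of U-statistics. The function names are illustrative, and the normalization of the kernels $h_{j}$ in (4.5) is not reproduced here: only the splitting into a constant, a linear term and a fully centred term is shown.

```python
def u_statistic(kernel, xs):
    """Average of kernel(x_i, x_j) over ordered pairs with i != j."""
    n = len(xs)
    total = sum(kernel(xs[i], xs[j])
                for i in range(n) for j in range(n) if i != j)
    return total / (n * (n - 1))


def hoeffding_split(xs, m1):
    """Split the U-statistic of x1*x2 into constant, linear and centred parts."""
    n = len(xs)
    constant = m1 ** 2
    linear = 2 * m1 * sum(x - m1 for x in xs) / n
    centred = u_statistic(lambda x, y: (x - m1) * (y - m1), xs)
    return constant, linear, centred


xs = [1.5, 0.2, -0.7, 2.1]
m1 = 0.4  # stands in for E(X); the identity is algebraic and holds for any value
parts = hoeffding_split(xs, m1)
direct = u_statistic(lambda x, y: x * y, xs)
print(parts, direct)
```

When $m_{1}=0$ the constant and linear parts vanish and only the fully centred part remains, which is the degenerate situation considered below.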

We denote $m_{k}={\mathbb{E}}(X^{k})$ and we compute

[TABLE]

so we obtain

[TABLE]

We conclude that

[TABLE]

In the theory of U-statistics one says that $U_{n}^{\psi}$ is degenerated at order $m\in[N]$ if $h_{j}=0$ for $j\leq m-1$ and $h_{m}\neq 0,$ which amounts to

[TABLE]

We assume that (4.6) holds and we write

[TABLE]

with

[TABLE]

By (4.6), the U-statistic $U_{n}^{\psi}$ is degenerated at order $m\in[N]$ if and only if

[TABLE]

which is the same non-degeneracy condition we are interested in.

We recall that $X_{i}\sim\mu$ and that in (2.14) we have introduced the covariance matrix $\mathrm{Cov}(Z(X))=\mathrm{Cov}(\mu)$ , that is

[TABLE]
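For $d_{\ast}=1$ the entries of this matrix are determined by the moments $m_{k}={\mathbb{E}}(X^{k})$, since $\mathrm{Cov}(X^{i},X^{j})=m_{i+j}-m_{i}m_{j}$. The sketch below is an illustration under that assumption (one-dimensional $X$, with $Z_{i,k}(X)=X_{i}^{k}-{\mathbb{E}}(X_{i}^{k})$ as recalled above); the helper name is hypothetical.

```python
def covariance_from_moments(moments, k_star):
    """Covariance matrix of (X - m_1, ..., X^k_star - m_k_star).

    moments[k] must hold E(X^k) for k = 0, ..., 2 * k_star.
    """
    return [[moments[i + j] - moments[i] * moments[j]
             for j in range(1, k_star + 1)]
            for i in range(1, k_star + 1)]


# Moments E(X^k), k = 0..4, of a standard normal random variable.
gaussian_moments = [1, 0, 1, 0, 3]
cov = covariance_from_moments(gaussian_moments, 2)
print(cov)
```

In this Gaussian example the matrix is diagonal, so the two components of the associated Brownian motion would be independent, with variances $1$ and $2$ per unit time.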

We consider a correlated Brownian motion $W=(W^{1},\ldots,W^{m})$ with $\left\langle W^{i},W^{j}\right\rangle_{t}=C^{i,j}(\mu)t,$ we define the multiple stochastic integrals

[TABLE]

and we denote

[TABLE]

Theorem 4.3

A. If $X$ verifies ${\mathfrak{M}}(\varepsilon,r,R)$ and (4.6) holds then for every $\theta\in(\frac{1}{4},1)$

[TABLE]

B. Suppose that $X$ has finite moments of any order and that $\mathrm{Cov}(Z(X))=\mathrm{Cov}(\mu)\geq\underline{\lambda}>0.$ If (4.6) holds then, for every $\theta\in(\frac{1}{4},1)$

[TABLE]

Proof. In order to use Theorem 3.10 we estimate

[TABLE]

Finally we study the influence factor:

[TABLE]

Then (3.24) gives

[TABLE]

And by employing (3.25) one has

[TABLE]

$\square$

4.2 A quadratic central limit theorem

For $p\in(0,\frac{1}{2}]$ , we consider the quadratic form

[TABLE]

where $Z_{i},i\in{\mathbb{N}}$ are centred independent random variables which have finite moments of any order. The aim of this section is to prove that if $p<\frac{1}{2}$ then $S_{n,p}(Z)$ converges to a double stochastic integral while for $p=\frac{1}{2}$ the limit is a standard Gaussian random variable. In our notation, we have $d_{*}=1$ , $k_{*}=1$ , $N=2$ and

[TABLE]

where $c_{n,p}(\alpha)=0$ for $|\alpha|\neq 2$ and if $|\alpha|=2$ ,

[TABLE]

Theorem 4.4

Let $Z_{i},i\in{\mathbb{N}}$ be a sequence of independent and centred random variables, with ${\mathbb{E}}(Z_{i}^{2})=1$ and which have finite moments of any order.

A. Let $p<\frac{1}{2}$ . We denote $\psi_{p}(s,t)=\left|s-t\right|^{-p}$ and $I_{2}(\psi_{p})=\int_{0}^{1}\int_{0}^{1}\psi_{p}(s,t)dW_{s}dW_{t}$ , $W$ being a Brownian motion. Then for every $\theta\in(\frac{1}{4},1)$ there exist $n_{\ast}$ and $C$ such that for $n\geq n_{\ast}$

[TABLE]

Suppose moreover that ${\mathfrak{D}}(\varepsilon,r,R)$ holds. Then for every $\theta\in(\frac{1}{4},1)$ there exist $n_{\ast}$ and $C$ such that for $n\geq n_{\ast}$

[TABLE]

B. Let $p=\frac{1}{2}$ . We denote by $\Delta$ a standard normal random variable. There exist $n_{\ast}$ and $C$ such that for $n\geq n_{\ast}$

[TABLE]

Suppose moreover that ${\mathfrak{D}}(\varepsilon,r,R)$ holds. Then (4.12) holds with $d_{\mbox{\rm{\scriptsize{TV}}}}$ instead of $d_{\mbox{\rm{\scriptsize{Kol}}}}.$

Proof A. We extend by symmetry the coefficients $c_{n,p}(\alpha)$ to all indexes $\alpha=(\alpha_{1},\alpha_{2})$ with $\alpha_{1}\neq\alpha_{2}$ . We denote $t_{i}=\frac{i}{n}$ and we define

[TABLE]

Let us prove that

[TABLE]

We take $q=\frac{2}{3}$ and we write

[TABLE]

with

[TABLE]

Note that if $\left|s-t\right|\geq 1/n^{q}$ then

[TABLE]

so that

[TABLE]

Moreover

[TABLE]

Finally, by comparing Riemann sums with the corresponding integral,

[TABLE]

Since $q=\frac{2}{3}$ we obtain (4.13). It follows that, for sufficiently large $n,$

[TABLE]

And we also have

[TABLE]

Note that $S_{n,p}(Z)=S_{2}(c_{n},Z)$ and $S_{2}(c_{n},G)=I_{2}(\psi_{n,p}).$ Using Theorem 2.3 (with $N=2$ ), Theorem 3.4 (see (3.18) with $k=1,m=2,\frac{1}{4}<\theta<1$ ) and (4.13) we obtain

[TABLE]

so (4.10) is proved for $d_{\mbox{\rm{\scriptsize{Kol}}}}.$

We suppose now that $Z$ verifies (3.4) and we use Theorem 3.9 (see (3.22) with $N=2)$ in order to obtain

[TABLE]

so (4.12) is proved for $d_{\mbox{\rm{\scriptsize{TV}}}}$ also.

B. We have $S_{n,1/2}(Z)=S_{2}(c_{n},Z)$ with (recall that $t_{i}=i/n)$

[TABLE]

We note first that

[TABLE]

These inequalities are easily obtained by comparing $\sum_{j=1}^{n}1_{i\neq j}\left|i-j\right|^{-1}$ with $\int_{\{\left|t_{i}-t\right|>1/n\}}\left|t_{i}-t\right|^{-1}dt.$ It immediately follows that

[TABLE]

and $\delta_{\ast}(c_{n})\leq\frac{\sqrt{2}}{\sqrt{n}}.$ Now, using Theorem 2.3

[TABLE]

and, if $Z_{i}$ satisfies ${\mathfrak{D}}(\varepsilon,r,R)$ , we use Theorem 3.9 and we obtain

[TABLE]
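The comparison between the sums $\sum_{j=1}^{n}1_{i\neq j}\left|i-j\right|^{-1}$ and the corresponding integral, used at the start of part B, amounts to logarithmic bounds on harmonic sums. The sketch below checks such bounds numerically; the specific constants ($\ln\frac{n+1}{2}$ from below, $2(1+\ln n)$ from above) are elementary estimates chosen for this illustration and are not claimed to be those of the paper, and the normalization of $c_{n}$ is deliberately left out.

```python
import math


def row_sum(i, n):
    """Sum over j in 1..n, j != i, of 1/|i - j|."""
    return sum(1.0 / abs(i - j) for j in range(1, n + 1) if j != i)


def total_sum(n):
    """Sum over all ordered pairs i != j in 1..n of 1/|i - j|."""
    return sum(row_sum(i, n) for i in range(1, n + 1))


def harmonic(k):
    """k-th harmonic number H_k = 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / d for d in range(1, k + 1))


n = 200
rows = [row_sum(i, n) for i in range(1, n + 1)]
# Counting pairs by their distance d gives 2 * sum_{d=1}^{n-1} (n - d) / d.
closed_form = 2 * (n * harmonic(n - 1) - (n - 1))
print(min(rows), max(rows), total_sum(n), closed_form)
```

Each row sum is of order $\ln n$ uniformly in $i$, so the total is of order $n\ln n$; this is why a single coefficient carries a vanishing share of the total mass as $n$ grows.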

Now we have to estimate the total variation distance between $S_{2}(c_{n},G)=\Phi_{2}(c_{n},G)$ and the normal random variable $\Delta.$ In order to do it we use (3.26), so we have to estimate the kurtosis $\kappa(c_{n}).$ We denote $a(i,j)=1_{i\neq j}\left|i-j\right|^{-1/2}$ and we write

[TABLE]

In order to obtain the last inequality one just compares the graph of the function $t\mapsto(\left|t_{i}-t\right|\left|t_{j}-t\right|)^{-1/2}$ with the graph of its step approximation: the step approximation lies below the function in these regions. Moreover (see [3] Lemma B1 for a complete computation)

[TABLE]

It follows that

[TABLE]

$\square$

5 Stochastic calculus of variation under the Doeblin’s condition

We assume that the sequence $X=(X_{n})_{n\in{\mathbb{N}}},$ $X_{n}=(X_{n,1},\ldots,X_{n,d_{\ast}})\in{\mathbb{R}}^{d_{\ast}}$ , of independent random variables satisfies Hypothesis ${\mathfrak{M}}(\varepsilon,r,R)$ , that is the Doeblin’s condition ${\mathfrak{D}}(\varepsilon,r,R)$ and the moment finiteness one. We strongly use here the representation (3.9) discussed in Section 3.1, that is,

[TABLE]

where $\chi_{n},V_{n},U_{n}$ are independent with laws given in (3.8). The goal of this section is to present a differential calculus based on $V_{n},n\in{\mathbb{N}}$ which has been introduced in [1, 4] (and which is inspired by the Malliavin calculus [31]).
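The splitting can be pictured as a two-stage sampling procedure: a Bernoulli switch $\chi_{n}$ selects either the regular component $V_{n}$ or the remainder $U_{n}$. The following sketch illustrates this in dimension one. The component laws used here (a uniform law for $V$ and a two-point law for $U$) are hypothetical stand-ins chosen for readability; they are not the laws given in (3.8).

```python
import random


def sample_split(rng, p, sample_v, sample_u):
    """Draw X = chi * V + (1 - chi) * U with chi ~ Bernoulli(p), all independent.

    Returns the pair (X, chi) so that the switch can be inspected.
    """
    chi = 1 if rng.random() < p else 0
    v = sample_v(rng)
    u = sample_u(rng)
    return chi * v + (1 - chi) * u, chi


rng = random.Random(0)
draw_v = lambda r: r.uniform(-1.0, 1.0)    # hypothetical regular component
draw_u = lambda r: r.choice([-3.0, 3.0])   # hypothetical remainder, may be singular
x, chi = sample_split(rng, 0.4, draw_v, draw_u)
print(x, chi)
```

Only the $V$ component is differentiated in the calculus below, so on the event where the switch is off the variable $X_{n}$ carries no regularity.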

5.1 Abstract Malliavin calculus and Sobolev spaces

To begin we introduce the space of the simple functionals. We denote by $\Lambda_{m}$ the multi-indexes $\alpha=(\alpha_{1},\ldots,\alpha_{m})$ with $\alpha_{i}=(n_{i},j_{i})\in{\mathbb{N}}\times[d_{\ast}]$ (that is, we do not impose that $n_{1}<\cdots<n_{m}$ ). We consider polynomials with random coefficients

[TABLE]

where $x=(x_{n})_{n\in{\mathbb{N}}}$ with $x_{n}=(x_{n,1},\ldots,x_{n,d_{\ast}})\in{\mathbb{R}}^{d_{\ast}}$ and $x^{\alpha}=\prod_{i=1}^{m}x_{\alpha_{i}}.$ The coefficients $d(\alpha)\in\mathcal{U}$ are random variables which are measurable with respect to $\sigma(\chi_{n},U_{n},n\in{\mathbb{N}})$ and so, in particular, are independent of $(V_{n})_{n\in{\mathbb{N}}}.$ And we define $\mathcal{P}_{N}(\mathcal{U})$ to be the space of the polynomials computed in $x_{n}=V_{n}$ , that is, $F\in\mathcal{P}_{N}(\mathcal{U})$ if

[TABLE]

The simple functionals will be $\mathcal{P}(\mathcal{U})=\cup_{N\in{\mathbb{N}}}\mathcal{P}_{N}(\mathcal{U}).$ In particular our polynomials $Q_{N,k_{\ast}}(c,X)$ belong to $\mathcal{P}_{N}(\mathcal{U}).$ Note that $\mathcal{P}(\mathcal{U})$ is dense in $L^{p}(\Omega,\mathcal{F},P)$ with $\mathcal{F}=\sigma(X_{n},n\in{\mathbb{N}})$ . So we will define first our differential operators on $\mathcal{P}(\mathcal{U}),$ and we extend them in the canonical way to their domains in $L^{p}(\Omega,\mathcal{F},P)$ .

We assume that $\mathcal{U}={\mathbb{R}}^{d}$ (so it is a finite dimensional Hilbert space). Let $F\in\mathcal{P}(\mathcal{U})$ , so $F=Q_{N,k_{\ast}}(c,X)$ . For $n\in{\mathbb{N}}$ and $i\in[d_{\ast}]$ we define the first order derivatives

[TABLE]

We regard $DF=(D_{n,i}F)_{n\in{\mathbb{N}},i\in[d_{\ast}]}$ as a random element of the following Hilbert space $\mathcal{H(U)}$ :

[TABLE]

So $D:\mathcal{P}_{N}(\mathcal{U})\rightarrow\mathcal{P}_{N-1}(\mathcal{H(\mathcal{U})}).$ The Malliavin covariance matrix of $F\in\mathcal{P}(\mathcal{\mathcal{U}})^{d}$ is defined by

[TABLE]

Moreover we define the higher order derivatives in the following way. Let $m\in{\mathbb{N}}$ be fixed and let $\alpha=(\alpha_{1},\ldots,\alpha_{m})$ with $\alpha_{i}=(n_{i},j_{i})\in{\mathbb{N}}\times[d_{\ast}].$ For $F=Q_{N,k_{\ast}}(c,X)\in\mathcal{P(U)}$ , we define

[TABLE]

We regard $D^{(m)}F=(D_{\alpha}^{(m)}F)_{\alpha\in\Gamma_{m}}$ as a random element of $\mathcal{H}_{m}:=\mathcal{H}^{\otimes m}(\mathcal{U}),$ so $D^{(m)}:\mathcal{P}_{N}(\mathcal{U})\rightarrow\mathcal{P}_{N-m}(\mathcal{H}^{\otimes m}(\mathcal{U}))$ . For $m=1$ , we have $D^{(1)}F=DF$ .

We define now the divergence operator

[TABLE]

Standard integration by parts on ${\mathbb{R}}$ gives the following duality relation: for every $F,G\in\mathcal{P(U)}$

[TABLE]

We define now the Sobolev norms. For $q\geq 1$ we set

[TABLE]

Moreover we define

[TABLE]

and

[TABLE]

Finally we define the Sobolev spaces

[TABLE]

The duality relation (5.6) implies that the operators $D^{(n)}$ and $L$ are closable so we may extend these operators to ${\mathbb{D}}^{q,p}$ in a standard way. But in this work we will restrict ourselves to $\mathcal{P(U)}$ .

We recall now the basic computational rules. For $\phi\in C_{\mathrm{{\scriptsize{pol}}}}^{1}({\mathbb{{\mathbb{R}}}}^{M})$ and $F\in\mathcal{P(U)}^{M}$ we have

[TABLE]

and for $\phi\in C_{\mathrm{{\scriptsize{pol}}}}^{2}({\mathbb{R}}^{M})$

[TABLE]

In particular for $F,G\in{\mathbb{D}}^{2,\infty}$

[TABLE]

Let us stress the following fact which is specific to our framework. In order to establish the integration by parts formula in the classical Malliavin calculus one needs $\sigma_{F}$ to be almost surely invertible. And this always fails here: indeed if $F=\phi(X_{1},\ldots,X_{n})$ then $DF=0$ on the set $\{\chi_{1}=\ldots=\chi_{n}=0\}$ which has strictly positive probability. This is why we have to use a localized version of the integration by parts formula. Given $\eta>0$ we consider a function $\Phi_{\eta}:{\mathbb{R}}\rightarrow{\mathbb{R}}_{+}$ such that $1_{\{\left|x\right|\leq\eta\}}\leq\Phi_{\eta}(x)\leq 1_{\{\left|x\right|\leq 2\eta\}}$ and $|\Phi_{\eta}^{(k)}(x)|\leq C_{k}\eta^{-k}$ for every $k\in{\mathbb{N}}.$ Then we define $\Psi_{\eta}=1-\Phi_{\eta}$ and we notice that on the set $\{\Psi_{\eta}(\det\sigma_{F})>0\}$ we have $\det\sigma_{F}\geq\eta$ , so $\sigma_{F}$ is invertible. We denote

[TABLE]
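A localization function with the properties required of $\Phi_{\eta}$ can be built from the classical smooth transition $g(t)=e^{-1/t}1_{\{t>0\}}$. The sketch below is one possible construction, given only as an illustration (the paper does not specify a particular $\Phi_{\eta}$); the names `phi_eta` and `psi_eta` are hypothetical. Since the function depends on $x$ only through $x/\eta$, its $k$-th derivative scales like $\eta^{-k}$.

```python
import math


def _g(t):
    """Smooth on the real line, zero for t <= 0 and positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def phi_eta(x, eta):
    """Equals 1 for |x| <= eta, 0 for |x| >= 2 * eta, and lies in (0, 1) in between."""
    s = abs(x) / eta
    num = _g(2.0 - s)
    # The two arguments 2 - s and s - 1 are never both <= 0,
    # so the denominator is positive.
    return num / (num + _g(s - 1.0))


def psi_eta(x, eta):
    """Complementary cutoff, supported in {|x| > eta}."""
    return 1.0 - phi_eta(x, eta)


print(phi_eta(0.4, 0.5), phi_eta(0.75, 0.5), phi_eta(1.0, 0.5))
```

With this choice, `psi_eta(x, eta) > 0` forces $|x|>\eta$, which is the property used to invert $\sigma_{F}$ on the localized set.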

Theorem 5.1

Let $F=(F^{1},\ldots,F^{d}),F_{i}\in{\mathbb{D}}^{2,\infty}$ and $G\in{\mathbb{D}}^{1,\infty}$ and, for $\eta>0,$ we denote $G_{\eta}=G\times\Psi_{\eta}(\det\sigma_{F}).$ Then for every $\phi\in C_{p}^{\infty}({\mathbb{R}}^{d})$ and every $i=1,\ldots,d$

[TABLE]

with

[TABLE]

Moreover let $m\in{\mathbb{N}},m\geq 2$ and $\alpha=(\alpha_{1},\ldots,\alpha_{m})\in\{1,\ldots,d\}^{m}.$ Suppose that $F=(F^{1},\ldots,F^{d}),F_{i}\in{\mathbb{D}}^{m+1,\infty}$ and $G\in{\mathbb{D}}^{m,\infty}.$ Then

[TABLE]

with $H_{\eta,\alpha}(F,G)$ defined by $H_{\eta,(\alpha_{1},\ldots,\alpha_{m})}(F,G):=H_{\eta,\alpha_{m}}(F,H_{\eta,(\alpha_{1},\ldots,\alpha_{m-1})}(F,G)).$

Proof. The proof is standard so we just sketch it. By the chain rule, $D\phi(F)=\nabla\phi(F)DF$ , so that

[TABLE]

It follows that, on the set $\{\Psi_{\eta}(\det\sigma_{F})>0\},$ one has $\nabla\phi(F)=\gamma_{F_{\eta}}\left\langle D\phi(F),DF\right\rangle_{\mathcal{H}}$ . Then, by using (5.13) and the duality formula (5.6),

[TABLE]

We use once again (5.13) in order to obtain $H_{\eta,i}(F,G)$ in (5.15). By iteration one obtains the higher order integration by parts formulae. $\square$

We give now useful estimates for the weights which appear in (5.16). For $n,k\in{\mathbb{N}}$ we denote

[TABLE]

Lemma 5.2

Let $n,k\in{\mathbb{N}}$ and $F\in\mathcal{P}^{d}$ and $G\in\mathcal{P}.$ There exists a universal constant $C\geq 1$ (depending on $d,n,k$ only) such that for every multi index $\alpha$ with $\left|\alpha\right|=n$ and every $\eta>0$ one has

[TABLE]

In particular, taking $k=0$ and $G=1$ we have

[TABLE]

The proof is straightforward but technical so we leave it for Appendix B.

5.2 Regularization results

We deal here with functions and their derivatives on ${\mathbb{R}}^{d}$ . So, we use a slightly different definition for multi-indexes. Here, for $m\in{\mathbb{N}}$ , a multi-index of length $m$ is given by $\alpha\in\{1,\ldots,d\}^{m}$ and we denote by $|\alpha|=m$ its length. For $y=(y_{1},\ldots,y_{d})\in{\mathbb{R}}^{d}$ , we set $y^{\alpha}=\prod_{i=1}^{m}y_{\alpha_{i}}$ . We allow the case $\alpha=\emptyset$ by setting $|\alpha|=0$ and, for $y\in{\mathbb{R}}^{d}$ , $y^{\alpha}=1$ .

We recall that a super kernel $\phi:{\mathbb{R}}^{d}\rightarrow{\mathbb{R}}$ is a function which belongs to the Schwartz space $\mathbb{S}({\mathbb{R}}^{d})$ (infinitely differentiable functions which, together with all their derivatives, decrease at infinity faster than any polynomial), $\int\phi(x)dx=1,$ and such that for every multi-index $\alpha$ with $|\alpha|=m$ one has

[TABLE]

For $\delta\in(0,1)$ we define $\phi_{\delta}(y)=\delta^{-d}\phi(\delta^{-1}y)$ and for a function $f:{\mathbb{R}}^{d}\rightarrow{\mathbb{R}}$ we denote $f_{\delta}=f\ast\phi_{\delta}$ , the symbol $\ast$ denoting convolution. For $f\in C_{{\mathrm{{\scriptsize{pol}}}}}^{k}({\mathbb{R}}^{d})$ we define $L_{k}(f)$ and $l_{k}(f)$ to be some constants such that

[TABLE]
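The reason super kernels are useful for regularization is that the rescaled kernel $\phi_{\delta}$ inherits the normalization and the vanishing moments of $\phi$, so that convolution with $\phi_{\delta}$ leaves polynomials unchanged. The computation below is a sketch under the assumption that the moment condition above reads $\int y^{\alpha}\phi(y)dy=0$ for every multi-index with $|\alpha|\geq 1$; $P$ denotes an arbitrary polynomial of degree at most $q$, a symbol introduced only for this illustration.

```latex
\begin{align*}
\int_{{\mathbb{R}}^{d}}\phi_{\delta}(y)\,dy
  &=\int_{{\mathbb{R}}^{d}}\delta^{-d}\phi(\delta^{-1}y)\,dy
   =\int_{{\mathbb{R}}^{d}}\phi(z)\,dz=1,\\
\int_{{\mathbb{R}}^{d}}y^{\alpha}\phi_{\delta}(y)\,dy
  &=\delta^{|\alpha|}\int_{{\mathbb{R}}^{d}}z^{\alpha}\phi(z)\,dz=0
   \qquad(|\alpha|\geq 1),\\
(P\ast\phi_{\delta})(x)
  &=\int_{{\mathbb{R}}^{d}}P(x-y)\phi_{\delta}(y)\,dy
   =\sum_{m=0}^{q}\frac{(-1)^{m}}{m!}\sum_{|\alpha|=m}\partial^{\alpha}P(x)
     \int_{{\mathbb{R}}^{d}}y^{\alpha}\phi_{\delta}(y)\,dy
   =P(x).
\end{align*}
```

For a general smooth $f$ the same Taylor expansion leaves only the remainder term, which is where the factor $\delta^{q}$ in regularization estimates of this type comes from.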

We give now a “regularization lemma” which is an improvement of Lemma 2.5 in [2].

Lemma 5.3

Let $F\in\mathcal{P}({\mathbb{R}})^{d}$ and $q,m\in{\mathbb{N}}.$ There exists some constant $C\geq 1,$ depending on $d,m$ and $q$ only, such that for every $f\in C_{{\mathrm{{\scriptsize{pol}}}}}^{q+m}({\mathbb{R}}^{d}),$ every multi index $\gamma$ with $\left|\gamma\right|=m$ and every $\eta,\delta>0$

[TABLE]

with ${\mathcal{K}}_{q+m,0}(F)$ defined in (5.17) and $c_{l,q}=\int|\phi(y)||y|^{q}(1+|y|)^{l}dy$ . Moreover, for every $p>1$

[TABLE]

Proof. Using Taylor expansion of order $q$ ,

[TABLE]

with

[TABLE]

Using (5.20) we obtain $\int I(x,y)\phi_{\delta}(x-y)dy=0$ and by a change of variable we get

[TABLE]

So that

[TABLE]

Using integration by parts formula (5.16) (with $G=1)$

[TABLE]

The upper bound from (5.19) (with $p=2)$ gives

[TABLE]

And since

[TABLE]

we conclude that

[TABLE]

In order to prove (5.24), we write

[TABLE]

So the proof of (5.24) will be completed as soon as we check that $l_{0}(\partial^{\gamma}f_{\delta})=l_{0}(\partial^{\gamma}f)\leq l_{m}(f)$ and $L_{0}(\partial^{\gamma}f_{\delta})\leq L_{0}(\partial^{\gamma}f)c_{l_{0}(\partial^{\gamma}f),0}\leq L_{m}(f)c_{l_{m}(f),0}.$ We write

[TABLE]

$\square$

As a consequence, we get a regularization result involving functions which are just continuous and bounded.

Lemma 5.4

Let $F\in\mathcal{P}({\mathbb{R}})^{d}$ and $q\in{\mathbb{N}}.$ There exists some constant $C\geq 1,$ depending on $d,m$ and $q$ only, such that for every $f\in C_{b}({\mathbb{R}}^{d})$ , every $\eta,\delta>0$ and $a<1$ ,

[TABLE]

with ${\mathcal{K}}_{q,0}(F)$ defined in (5.17).

Proof. Let $g$ denote the density of the standard $d$ -dimensional normal law and for $\varepsilon>0$ , set $g_{\varepsilon}(x)=\frac{1}{\varepsilon^{d}}g(\frac{x}{\varepsilon})$ . We notice that $f\ast g_{\varepsilon}$ , $f_{\delta}\ast g_{\varepsilon}\in C^{\infty}_{b}({\mathbb{R}}^{d})$ . Moreover, $l_{0}(f\ast g_{\varepsilon})=l_{0}(f_{\delta}\ast g_{\varepsilon})=0$ and $L_{0}(f\ast g_{\varepsilon})=L_{0}(f_{\delta}\ast g_{\varepsilon})=\|f\|_{\infty}$ , for every $\varepsilon>0$ . So, we can apply (5.3) with $|\gamma|=0$ and we obtain

[TABLE]

We now let $\varepsilon$ tend to 0 and obtain (5.24). $\square$

5.3 Estimates of the Sobolev norms

Throughout this section we assume that $X$ verifies ${\mathfrak{M}}(\varepsilon,r,R)$ (that is (2.1) and ${\mathfrak{D}}(\varepsilon,r,R))$ and we estimate the Sobolev norms of $Q_{N}(c,X)$ and of $LQ_{N}(c,X).$ We will give our estimates in terms of the norms $\mathcal{N}_{{\mathcal{U}},q}(c,M)$ defined in (2.2).

Proposition 5.5

Let $p\geq 2$ and $N,q\in{\mathbb{N}}$ be given and let $\overline{M}_{p}=b_{p}M_{p}\sqrt{k_{\ast}d_{\ast}}$ with $M_{p}=M_{p}(Z(X)).$ Then

[TABLE]

Remark 5.6

(5.25) says in particular that if $\lim_{N\to\infty}\mathcal{N}_{{\mathcal{U}},q}(c,\overline{M}_{p})<\infty$ (recall that $\mathcal{N}_{{\mathcal{U}},q}(c,\overline{M}_{p})$ is a sum up to $N$ , see (2.2)) then the infinite series $Q_{\infty,k_{\ast}}(c,X)$ belongs to ${\mathbb{D}}^{q,p}.$ Let us compare this result with the corresponding one for functionals on the Wiener space. We take $k_{\ast}=1,d_{\ast}=1,{\mathcal{U}}={\mathbb{R}}$ and $X_{n}$ to be standard normal distributed. Then $\Phi_{m}(c,X)$ is a multiple integral of order $m$ associated to the kernel $f_{c,m}$ which is constant on cubes and equal to the corresponding $c(\alpha).$ So $Q_{\infty,1}(c,X)=\sum_{m=0}^{\infty}c(\alpha)X^{\alpha}=\sum_{m=0}^{\infty}J_{m}(f_{c,m})=\sum_{m=0}^{\infty}\frac{1}{m!}I_{m}(f_{c,m})$ where $J_{m}$ denotes the iterated stochastic integral and $I_{m}$ is the multiple stochastic integral. Note that $b_{2}=1$ and $M_{2}=1$ so $\overline{M}_{2}=1.$ So we have

[TABLE]

It is known that $Q_{\infty,1}(c,X)$ is $q$ times differentiable in $L^{2}$ in the Malliavin sense if and only if the quantity on the right hand side is finite, and the same holds in our framework. But in our calculus we need estimates for a large $p>2$ and then $\overline{M}_{p}>1.$ This is why, in this paper, we give up the case of infinite series and restrict ourselves to finite sums.

Proof. Step 1. For simplicity of notation, we set here $Z=Z(X)$ . For fixed $n_{0}\in{\mathbb{N}}$ , $j_{0}\in[d_{*}]$ and $m\in{\mathbb{N}}$ we denote by $\Lambda_{n_{0},j_{0}}(m,k)$ the set of the multi-indexes of length $m$ which do not contain the pair $(n_{0},kd_{*}+j_{0})$ , the case $m=0$ giving the set $\Lambda_{n_{0},j_{0}}(0,k)$ consisting only of the null multi-index. Then, by observing that $\chi_{n}V_{n}^{k}=\chi_{n}X_{n}^{k}$ for every $n$ and $k$ , one has

[TABLE]

where $(Dc)_{n_{0},j_{0},k}(\beta)=c((n_{0},kd_{*}+j_{0}))$ if $|\beta|=0$ and for $|\beta|=m\geq 1$ ,

[TABLE]

It can be easily checked that

[TABLE]

where, for $|\beta|=m=0,1,\ldots,N$ ,

[TABLE]

and the above coefficients are

[TABLE]

We study $\mathcal{N}_{\mathcal{H(U)},q}(Tc,M)$ . First,

[TABLE]

Moreover, for $m\geq 1$ ,

[TABLE]

and similarly,

[TABLE]

We put all this together and we obtain

[TABLE]

Step 2. Starting from formula (5.26), we use Burkholder’s inequality (2.9) in order to obtain

[TABLE]

$\square$

In order to treat $LQ_{N,k_{\ast}}(c,X)$ we need the following auxiliary lemma:

Proposition 5.7

A. Let $B_{n},\Lambda_{n}\in\mathcal{U}$ be random variables such that $B_{n},\Lambda_{n}\in\mathcal{P(U)}$ for every $n$ and $B_{n}$ is $\sigma(X_{1},\ldots,X_{n})$ measurable. We fix $j\in[d_{\ast}],k\in[k_{\ast}]$ and we consider the process

[TABLE]

For every $q\in{\mathbb{N}}$ and $p\geq 2$ there exists a universal constant $C\geq 1$ depending on $k_{\ast}$ and on $p$ only, such that

[TABLE]

with

[TABLE]

B. If

[TABLE]

then

[TABLE]

Proof. In the following $C\geq 1$ denotes a constant depending on $k_{\ast}$ and on $p$ only and which may change from one line to another.

Step 1. We will use the following facts. First, by the duality formula ${\mathbb{E}}(LX_{n,j}^{k})={\mathbb{E}}(\langle DX_{n,j}^{k},D1\rangle)=0.$ Moreover using the computational rules (see (5.12))

[TABLE]

It follows that

[TABLE]

It is easy to check that $\|X_{n,j}^{k-1}\|_{q,2p}\leq(k-1)!M_{2k_{\ast}p}^{k_{\ast}}(X)$ and a similar estimate holds for $\|X_{n,j}^{k-2}\|_{q,2p}.$ Moreover it is proved in Lemma 3.2 in [1] that there exists a universal constant $C$ such that $\left\|LX_{n,j}\right\|_{q,2p}\leq\frac{C}{r^{q+1}}$ so that

[TABLE]

Step 2. Let $q=0,$ so that $\left\|Y_{J}\right\|_{\mathcal{U},q,p}=\left\|Y_{J}\right\|_{\mathcal{U},p}.$ We have to check that

[TABLE]

Since $B_{n-1}$ is $\sigma(X_{1},\ldots,X_{n-1})$ measurable and ${\mathbb{E}}(LX_{n,j}^{k})=0,$ it follows that $M_{m}=\sum_{n=1}^{m}B_{n-1}LX_{n,j}^{k}$ is a martingale. By (2.6)

[TABLE]

Since $LX_{n,j}^{k}$ and $B_{n-1}$ are independent,

[TABLE]

From $Y_{m}=M_{m}+\Lambda_{m}$ , we conclude that

[TABLE]

so the statement holds for $q=0$ .

Step 3. We estimate the derivatives of $Y_{m}$ . We have

[TABLE]

where $\overline{B}_{n}=DB_{n}$ is $\sigma(X_{1},\ldots,X_{n-1})$ -measurable and $\overline{\Lambda}_{m}=\sum_{n=1}^{m}DLX_{n,j}^{k}B_{n-1}+D\Lambda_{m}.$ Notice that $\overline{Y}_{m}$ , $\overline{B}_{k}$ and $\overline{\Lambda}_{m}$ take values in $\mathcal{H}(\mathcal{U})$ (defined in (5.1)). So, by applying the step above, we get

[TABLE]

where

[TABLE]

If we prove that

[TABLE]

then we obtain

[TABLE]

And by iteration, we get (5.30) for every $q$ . So, let us prove (5.36).

We have $\|\overline{B}_{k}\|_{\mathcal{H}(\mathcal{U}),p}=\|DB_{k}\|_{\mathcal{H}(\mathcal{U}),p}\leq\left\|B_{k}\right\|_{\mathcal{U},1,p}$ . We analyze now $\overline{\Lambda}_{m}.$ First, $\|D\Lambda_{m}\|_{\mathcal{H}(\mathcal{U}),p}\leq\|\Lambda_{m}\|_{\mathcal{U},1,p}$ . Let $I_{m}:=\sum_{n=1}^{m}DLX_{n,j}^{k}B_{n-1}\in\mathcal{H}(\mathcal{U})$ . Since $D_{n^{\prime},j^{\prime}}LX_{n,j}^{k}=0$ if $(n^{\prime},j^{\prime})\neq(n,j)$ we obtain

[TABLE]

Recalling that $D_{n,j}X_{n,j}^{k}$ and $B_{n-1}$ are independent and that $\|D_{n,j}LX_{n,j}^{k}\|_{p}^{2}\leq Cr^{-2}M_{2k_{\ast}p}^{k_{\ast}}(X)$ , we can write

[TABLE]

By inserting all these estimates, we get (5.36). So A is proved. The proof of B is just identical so we skip it. $\square$

Proposition 5.8

For every $q,N\in{\mathbb{N}}$ and $p\geq 2$ there exists a universal constant $C$ depending on $k_{\ast},q$ and $p$ only such that

[TABLE]

where $\overline{M}_{p}=b_{p}M_{p}\sqrt{k_{*}d_{*}}$ and $\widehat{M}_{p}$ is given in (5.31).

Proof. We prove this by recurrence on $N$ . The case $N=1$ is straightforward, so we suppose $N>1$ . We recall that, if $\left|\beta\right|=m$ then $c^{n,j}(\beta)=1_{\beta_{m}^{\prime}<n}c(\beta,(n,j))$ and we write

[TABLE]

where $Z=Z(X)$ . Since $\left\langle DZ_{n,j},DQ_{N-1,k_{\ast}}(c^{n,j},X)\right\rangle_{\mathcal{H(U)}}=0$ we get (see (5.12))

[TABLE]

So we are in the framework of the previous lemma with $B_{n-1}=Q_{N-1,k_{\ast}}(c^{n,j},X)$ and

[TABLE]

Notice that

[TABLE]

Then, using (5.33) (recall that $C_{p}(X)\leq\widehat{M}_{p})$ and the recurrence hypothesis

[TABLE]

Moreover, by the estimates of the Sobolev norms given in (5.25), and the same computations as above

[TABLE]

$\square$

Remark 5.9

By using Proposition 5.5 and 5.8, we give here an upper estimate of the $L^{2}$ -norm of the constant ${\mathcal{K}}_{q,0}(Q_{N,k_{\ast}}(c,X))$ defined in (5.17). This will be very useful in the sequel. By using the Hölder inequality we easily get

[TABLE]

By applying the estimates (5.25) and (5.37) we obtain

[TABLE]

$C>0$ denoting a constant depending on $q,N,k_{\ast}$ and the moment bound $M_{p}(X)$ for a suitable $p>1$ and independent of the coefficients $c$ .

5.4 Estimates of the covariance matrix

In this section we give estimates for the Malliavin covariance matrix of $Q_{N,k_{\ast}}(c,X)$ which we shortly denote by $\sigma_{N}$ . We restrict ourselves to the scalar case, so that $Q_{N,k_{\ast}}(c,X)\in{\mathbb{R}}=\mathcal{U}$ and $\sigma_{N}$ is just a scalar. We start from the formula of the Malliavin derivative of $Q_{N,k_{\ast}}(c,X)$ already discussed in the proof of Proposition 5.5, that is,

[TABLE]

where $\Lambda_{n_{0},j_{0}}(m,k)$ denotes the multi-indexes of length $m$ which do not contain the pair $(n_{0},kd_{\ast}+j_{0})$ and where $(Dc)_{n_{0},j_{0},k}(\beta)=c((n_{0},kd_{\ast}+j_{0}))$ if $|\beta|=0$ and for $|\beta|=m\geq 1$ ,

[TABLE]

The aim of this section is to prove the non-degeneracy estimate (5.44) in Lemma 5.11 below. But we first need to study the conditional expectation of $\sigma_{N}$ given the randomness from $\chi_{n}$ and $U_{n}$ .

Lemma 5.10

Assume ${\mathfrak{D}}(\varepsilon,r,R)$ . We denote by ${\mathbb{E}}_{U,\chi}$ the conditional expectation with respect to $\sigma(U_{n},\chi_{n},$ $n\in{\mathbb{N}}).$ Then

[TABLE]

where $\lambda_{R}>0$ is given in Lemma 3.1 and for $\alpha=((\alpha_{1}^{\prime},\alpha_{1}^{\prime\prime}),\ldots,(\alpha_{m}^{\prime},\alpha_{m}^{\prime\prime}))$ , we set $\alpha^{\prime}=(\alpha^{\prime}_{1},\ldots,\alpha^{\prime}_{m})$ and $\chi^{\alpha^{\prime}}=\prod_{i=1}^{m}\chi_{\alpha^{\prime}_{i}}$ .

Proof. We set here $Z=Z(X)$ . We recall that $X_{n,j}=\chi_{n}V_{n,j}+(1-\chi_{n})U_{n,j}$ and we define (with $k(l)$ and $j(l)$ defined in (3.2))

[TABLE]

Then

[TABLE]

So, we have

[TABLE]

where

[TABLE]

One has

[TABLE]

This is because $\left|\beta\right|<\left|\alpha\right|\leq\left|\theta\right|$ , so there is at least one $\theta_{i}\notin\beta$ and ${\mathbb{E}}_{U,\chi}(\widetilde{V}^{\theta_{i}})=0.$ For the same reason, one has

[TABLE]

We recall that $V_{n_{0},j_{0}}^{k}=\widetilde{V}_{n_{0},kd_{\ast}+j_{0}}+{\mathbb{E}}(V_{n_{0},j_{0}}^{k})$ and we use (5.39) in order to write

[TABLE]

$\Lambda_{n_{0},j_{0}}(m,k)$ denoting the multi-indexes of length $m$ which do not contain the pair $(n_{0},kd_{\ast}+j_{0})$ . By (5.42) and (5.43), one has ${\mathbb{E}}_{U,\chi}(A_{N-1,1}^{n_{0},j_{0}}A_{m,i}^{n_{0},j_{0}})=0$ for every $m\leq N-1$ and $i=2,3$ and ${\mathbb{E}}_{U,\chi}(A_{N-1,1}^{n_{0},j_{0}}A_{m,1}^{n_{0},j_{0}})=0$ for every $m<N-1$ . Thus, $A_{N-1,1}^{n_{0},j_{0}}$ is orthogonal (in $L^{2}({\mathbb{P}}_{U,\chi})$ ) to $D_{n_{0},j_{0}}S_{N}(c,Z)-A_{N-1,1}^{n_{0},j_{0}}$ , so that

[TABLE]

Therefore,

[TABLE]

Now, we write

[TABLE]

For every $\alpha$ there exists at most one $(k,i)$ such that $\alpha_{i}=(n_{0},kd_{\ast}+j_{0})$ so that

[TABLE]

By using (3.13),

[TABLE]

and the statement holds. $\square$

We can now prove the main result of this section.

Lemma 5.11

Assume ${\mathfrak{D}}(\varepsilon,r,R)$ . Let $c\in\mathcal{C}({\mathbb{R}})$ with $|c|_{N}>0$ . For every $\eta>0$ ,

[TABLE]

where $K$ is a universal constant (the one in the Carbery–Wright inequality) and $\lambda_{R}$ is given in Lemma 3.1.

Remark 5.12

Sometimes $|c|_{N}$ is small and we would like to use $|c|_{m}$ instead, with $m<N.$ We denote $\left|c\right|_{m+1,N}^{2}=\sum_{k=m+1}^{N}\sum_{|\alpha|=k}c^{2}(\alpha)$ . Then for every $h\geq 1$ there exists $C>0$ such that

[TABLE]

Indeed: we denote $Q_{m+1,N,k_{\ast}}(c,X)=Q_{N,k_{\ast}}(c,X)-Q_{m,k_{\ast}}(c,X)$ and we use the inequality

[TABLE]

in order to obtain

[TABLE]

Using Chebyshev’s inequality and Proposition 5.5, for every $h$ ,

[TABLE]

so the proof of (5.45) is completed.

Proof of Lemma 5.11. We will use the Carbery–Wright inequality that we recall here (see Theorem 8 in [9]). Let $\mu$ be a probability law on ${\mathbb{R}}^{J}$ which is absolutely continuous with respect to the Lebesgue measure and has a log-concave density. There exists a universal constant $K$ such that for every polynomial $Q(x)$ of order $k_{\ast}N$ and for every $\eta>0$ one has

[TABLE]

We will use this result in the following framework. We recall that the coefficients $c(\alpha)$ are null except for a finite number of them. So we may find $M$ such that, if $\left|\alpha\right|=m$ and $\alpha_{m}^{\prime}>M$ then $c(\alpha)=0.$ It follows that we may write (see (5.39))

[TABLE]

where $q_{q,\overline{U}}(V)$ is a polynomial of order $k_{\ast}N$ with unknowns $V_{n,j},n\leq M,j\leq d_{\ast}$ and coefficients depending on $\chi_{n}$ and $\overline{U}_{n,j,k}.$ Moreover we recall that ${\mathbb{P}}_{U,\chi}$ is the conditional probability with respect to $\sigma(U_{i},\chi_{i},i\in{\mathbb{N}}).$ We denote by $\mu$ the law of $(V_{n,j},n\leq M,j\leq d_{\ast})$ under ${\mathbb{P}}_{U,\chi}$ : this is a product of laws of the form $c\psi_{r}(\left|x-\overline{x}\right|^{2})dx$ so it is log-concave. So we are able to use (5.46). Using (5.41)

[TABLE]

We take now $\theta>0$ (to be chosen in a moment) and we use (5.46) in order to obtain

[TABLE]

The first term in the above inequality is estimated in Appendix A. In order to fit in the notation used there we denote $\Lambda_{N}(\beta^{\prime})=\{\alpha:|\alpha|=N\mbox{ and }\alpha^{\prime}=\beta^{\prime}\}$ and $\overline{c}^{2}(\beta^{\prime})=\sum_{\alpha\in\Lambda_{N}(\beta^{\prime})}c^{2}(\alpha).$ Then

[TABLE]

Now we apply Lemma A.1 with $x=\theta/\lambda_{R}^{N}.$ Recall that $p=\varepsilon\mathfrak{m}_{r}$ and we have the restriction

[TABLE]

We have $\left|\overline{c}\right|_{N}^{2}=|c|_{N}^{2}$ and

[TABLE]

Then (A.2) gives

[TABLE]

Inserting this in (5.47) we obtain

[TABLE]

Now, $\theta$ is any constant satisfying the restriction (5.48). So, by letting $\theta\uparrow\lambda_{R}^{N}((\varepsilon\mathfrak{m}_{r})/2)^{N}|\overline{c}|_{N}^{2}=\lambda_{R}^{N}((\varepsilon\mathfrak{m}_{r})/2)^{N}|c|_{N}^{2}$ , we finally obtain (5.44). $\square$

5.5 Proof of Theorem 3.3

The goal of this section is to give the proof of Theorem 3.3 so we use the notation from Section 3.

We take $q\in{\mathbb{N}}$ , $q\geq 1$ , and we consider the sequence $\lambda_{q}=\frac{q}{q+k}$ . Since $\lambda^{2}_{q}\uparrow 1$ as $q\to\infty$ , we can find $q$ such that $\lambda_{q}^{2}<\theta\leq\lambda^{2}_{q+1}$ . And since $\lambda^{2}_{q+1}\leq\lambda_{q}$ , we get $\lambda_{q}^{2}<\theta\leq\lambda_{q}$ . We work with this value of $q$ and we write simply $\lambda$ in place of $\lambda_{q}$ . Moreover, in the following, $C>0$ stands for a constant which may vary from line to line and which depends on the parameters in the statements but not on the coefficients $c,d\in\mathcal{C}({\mathbb{R}})$ .
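The selection of $q$ can be made explicit. The sketch below searches for the integer $q\geq 1$ with $\lambda_{q}^{2}<\theta\leq\lambda_{q+1}^{2}$, using exact rational arithmetic. It assumes $\lambda_{1}^{2}<\theta<1$ (the admissible range of $\theta$ in Theorem 3.3 is not restated here), and the function names are illustrative.

```python
from fractions import Fraction


def lam(q, k):
    """lambda_q = q / (q + k) as an exact fraction."""
    return Fraction(q, q + k)


def choose_q(theta, k):
    """Return q >= 1 with lam(q)^2 < theta <= lam(q + 1)^2.

    Assumes lam(1)^2 < theta < 1; the loop stops because lam(q)^2 increases to 1.
    """
    theta = Fraction(theta)
    if not (lam(1, k) ** 2 < theta < 1):
        raise ValueError("theta must satisfy lambda_1^2 < theta < 1")
    q = 1
    while lam(q + 1, k) ** 2 < theta:
        q += 1
    return q


theta, k = Fraction(1, 2), 1
q = choose_q(theta, k)
a = theta / lam(q, k)  # the exponent a = theta / lambda used in the proof
print(q, lam(q, k), a)
```

For $\theta=\frac{1}{2}$ and $k=1$ the search stops at $q=2$, so $\lambda=\frac{2}{3}$ and $a=\theta/\lambda=\frac{3}{4}$, in agreement with $\lambda_{q}^{2}<\theta\leq\lambda_{q}$.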

We define $a=\theta/\lambda$ , so $\frac{1}{1+k}\leq\lambda<a\leq 1$ . We consider $\eta,\delta\in(0,1)$ , to be chosen in the sequel, and we use the regularization Lemma 5.4 (see (5.24)) with the above choice of $q$ and $a$ . This gives

[TABLE]

the latter inequality following from (5.38). Moreover by (5.45) (therein, $\sigma_{N}=\det\sigma_{Q_{N,k_{\ast}}}$ ), for every $h\geq 1$ (recall that $\overline{m}=m\vee m^{\prime})$

[TABLE]

So,

[TABLE]

A similar estimate holds for $Q_{N,k_{\ast}}(d,Y).$ We use now $d_{k}$ defined in (3.16). Since $\left\|f_{\delta}\right\|_{k,\infty}\leq\delta^{-k}\left\|f\right\|_{\infty}$ one has

[TABLE]

Putting this together, we get

[TABLE]

We optimize first on $\delta:$ we take $\delta=d_{k}^{1/(q+k)}\eta^{2q/(q+k)}(1+|c|+|d|)^{-5q/(q+k)}$ and we obtain (recall that $\lambda=\frac{q}{q+k}\in(0,1)$ ),

[TABLE]

It follows that

[TABLE]

We optimize now on $\eta:$ we take $\eta=d_{k}^{\lambda k_{\ast}\overline{m}/(a+2\lambda kk_{\ast}\overline{m})}$ , so that

[TABLE]

the latter inequality follows from $d_{k}\leq 1$ and, since $a,\lambda\in(0,1)$ , $a+2\lambda kk_{\ast}\overline{m}\leq 1+2kk_{\ast}\overline{m}$ . By inserting,

[TABLE]
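The exponent arithmetic behind the choice of $\eta$ can be checked exactly. The sketch below assumes that the two $\eta$-dependent terms are $d_{k}^{\lambda}\eta^{-2\lambda k}$ and $\eta^{a/(k_{\ast}\overline{m})}$; this form is inferred from the stated value of $\eta$ and is not quoted from the displayed formulas. The function names and sample parameters are hypothetical.

```python
from fractions import Fraction


def eta_exponent(lam, a, k, k_star, m_bar):
    """Exponent e in eta = d_k**e, as stated in the text."""
    return lam * k_star * m_bar / (a + 2 * lam * k * k_star * m_bar)


def balanced_exponents(lam, a, k, k_star, m_bar):
    """Powers of d_k carried by the two ASSUMED eta-dependent terms."""
    e = eta_exponent(lam, a, k, k_star, m_bar)
    first = lam - 2 * lam * k * e      # from d_k^lam * eta^(-2 lam k)
    second = a * e / (k_star * m_bar)  # from eta^(a / (k_star * m_bar))
    return first, second


# Hypothetical sample parameters with theta = lam * a = 1/2.
lam, a, k, k_star, m_bar = Fraction(2, 3), Fraction(3, 4), 1, 2, 3
theta = lam * a
first, second = balanced_exponents(lam, a, k, k_star, m_bar)
lower = theta / (1 + 2 * k * k_star * m_bar)
print(first, second, lower)
```

Both terms carry the power $\theta/(a+2\lambda kk_{\ast}\overline{m})$ of $d_{k}$, which is at least $\theta/(1+2kk_{\ast}\overline{m})$; since $d_{k}\leq 1$, a larger exponent gives a smaller quantity, which is the direction of the inequality used above.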

Since $\left|c\right|^{2}_{m+1,N}\leq d_{k}^{\frac{k_{\ast}\overline{m}}{2kk_{\ast}\overline{m}+1}}$,

[TABLE]

We note that the above exponent is positive because $a>\lambda$. So we choose $h\geq 1$ such that

[TABLE]

so that

[TABLE]

A similar estimate holds with $|c|_{m+1,N}^{2ha}$ replaced by $|d|_{m^{\prime}+1,N}^{2ha}$ . We then obtain

[TABLE]

The statement now follows by recalling that $\lambda a=\theta$ and, from (3.16), $d_{k}\leq C\big(d_{k}(Q_{N,k_{\ast}}(c,X),Q_{N,k_{\ast}}(d,Y))+|c|^{\frac{2(2kk_{\ast}\overline{m}+1)}{k_{\ast}\overline{m}}}_{m+1,N}+|d|^{\frac{2(2kk_{\ast}\overline{m}+1)}{k_{\ast}\overline{m}}}_{m^{\prime}+1,N}\big)$. $\square$

Appendix A An iterated Hoeffding’s inequality

In this section we work with multi-indexes $\alpha=(\alpha_{1},\ldots,\alpha_{m})\in{\mathbb{N}}^{m}$ with $1\leq\alpha_{1}<\ldots<\alpha_{m}$ and we consider

[TABLE]

where $\chi_{n}$ , $n\in{\mathbb{N}}$ , denote independent Bernoulli random variables and $\chi^{\alpha}=\prod_{i=1}^{m}\chi_{\alpha_{i}}$ . We denote

[TABLE]

Lemma A.1

Let $p={\mathbb{P}}(\chi_{j}=1)\in(0,1)$ . If

[TABLE]

then

[TABLE]

Proof. We proceed by induction on $N$. If $N=1$ we have

[TABLE]

the latter inequality following from (A.1). And by Hoeffding’s inequality

[TABLE]
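For a weighted sum of independent Bernoulli variables, the classical two-sided Hoeffding bound reads ${\mathbb{P}}(|\sum_{n}c_{n}(\chi_{n}-p)|\geq t)\leq 2\exp(-2t^{2}/\sum_{n}c_{n}^{2})$. The sketch below compares this bound with the exact tail probability, computed by enumerating all outcomes. It uses the textbook normalization, which may differ from the one in the display above; the coefficients and function names are hypothetical.

```python
import itertools
import math


def exact_tail(c, p, t):
    """P(|sum_n c_n (chi_n - p)| >= t) for independent Bernoulli(p) chi_n, by enumeration."""
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(c)):
        s = sum(cn * (b - p) for cn, b in zip(c, bits))
        if abs(s) >= t:
            ones = sum(bits)
            total += p ** ones * (1 - p) ** (len(c) - ones)
    return total


def hoeffding_bound(c, t):
    """Classical two-sided Hoeffding bound; each c_n * chi_n has range of length |c_n|."""
    return 2 * math.exp(-2 * t * t / sum(cn * cn for cn in c))


coeffs = [0.5, -0.3, 0.8, 0.2, -0.6, 0.4]  # hypothetical coefficients
for t in (0.5, 1.0, 1.5):
    print(t, exact_tail(coeffs, 0.3, t), hoeffding_bound(coeffs, t))
```

Enumeration costs $2^{n}$ evaluations for $n$ coefficients, so it is only practical for small $n$; it is used here because it avoids any sampling error in the comparison.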

Since

[TABLE]

(A.2) follows for $N=1$ . We suppose now that (A.2) holds for $N-1$ and we prove it for $N.$ For $\beta$ with $\left|\beta\right|=N-1$ we define $c_{n}(\beta)=c(\beta,n)1_{\{\beta_{N-1}<n\}}$ and we write

[TABLE]

Then

[TABLE]

We estimate first $b.$ We write

[TABLE]

Notice that

[TABLE]

and

[TABLE]

We also have

[TABLE]

so we can use the induction hypothesis and we get

[TABLE]

We now estimate $a$. We use Corollary 1.4 in Bentkus [7, p. 1654], which asserts the following: if $M_{k}$, $k\in{\mathbb{N}}$, is a martingale such that $\left|M_{k}-M_{k-1}\right|\leq h_{k}$ almost surely, then, for every $n\in{\mathbb{N}}$,

[TABLE]

Since $0\leq\chi_{n}\leq 1$ we have

[TABLE]

Notice that $h_{n}\leq\delta_{N}^{2}(c)$ so that

[TABLE]

So, using (A.4)

[TABLE]

This, together with (A.3), gives (A.2). $\square$

Appendix B Norms

The aim of this section is to prove Lemma 5.2. For $F=(F_{1},\ldots,F_{d})$ we work with the norms

[TABLE]

To begin we give several easy computational rules:

[TABLE]

Now, for $F=(F_{1},\ldots,F_{d})$ we consider the Malliavin covariance matrix $\sigma_{F}^{i,j}=\left\langle DF^{i},DF^{j}\right\rangle$ and, if $\det\sigma_{F}\neq 0,$ we denote $\gamma_{F}=\sigma_{F}^{-1}.$ We write

[TABLE]

where $\widehat{\sigma}_{F}^{i,j}$ is the algebraic complement. Then, using (B.1)

[TABLE]
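The representation of $\gamma_{F}=\sigma_{F}^{-1}$ through algebraic complements divided by $\det\sigma_{F}$ is the usual cofactor formula for the inverse. The following sketch checks it in exact arithmetic on a small symmetric matrix. The matrix and the function names are hypothetical, and the recursive determinant is meant for small dimensions only.

```python
from fractions import Fraction


def minor(m, i, j):
    """Matrix m with row i and column j removed."""
    n = len(m)
    return [[m[r][c] for c in range(n) if c != j] for r in range(n) if r != i]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 0:
        return 1  # empty determinant, needed for 1x1 cofactors
    return sum((-1) ** j * m[0][j] * det(minor(m, 0, j)) for j in range(len(m)))


def inverse_by_cofactors(m):
    """Inverse as (transposed cofactor matrix) / determinant."""
    d = Fraction(det(m))
    if d == 0:
        raise ValueError("matrix is singular")
    n = len(m)
    # Entry (i, j) of the inverse is the cofactor of entry (j, i) divided by det.
    return [[(-1) ** (i + j) * det(minor(m, j, i)) / d for j in range(n)]
            for i in range(n)]


sigma = [[2, 1, 0], [1, 3, 1], [0, 1, 2]]  # hypothetical covariance-type matrix
gamma = inverse_by_cofactors(sigma)
product = [[sum(gamma[i][l] * sigma[l][j] for l in range(3)) for j in range(3)]
           for i in range(3)]
print(product)
```

Each cofactor is a polynomial of degree $d-1$ in the entries of $\sigma_{F}$ and the determinant is a polynomial of degree $d$, which is consistent with the exponents $2(d-1)$ and $2d$ appearing in the bounds that follow.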

By (B.1) and (B.2), $\left|\widehat{\sigma}_{F}^{i,j}\right|_{k_{1}}\leq C\left|F\right|_{1,k_{1}+1}^{2(d-1)}$ and $\left|\det\sigma_{F}\right|_{k_{2}}\leq C\left|F\right|_{1,k_{2}+1}^{2d}.$ Then, using (B.3)

[TABLE]

so that

[TABLE]

We denote

[TABLE]

and

[TABLE]

We also recall that for $\eta>0,$ we consider a function $\Psi_{\eta}\in C^{\infty}({\mathbb{R}})$ such that $1_{(0,\eta)}\leq\Psi_{\eta}\leq 1_{(0,2\eta)}$ and $\|\Psi_{\eta}^{(k)}\|_{\infty}\leq C_{k}\eta^{-k},\forall k\in{\mathbb{N}}.$ Then we take $\Phi_{\eta}=1-\Psi_{\eta}.$
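One standard way to build such a cutoff, sketched here on the positive half-line only, is to rescale a fixed smooth step: $\Psi_{\eta}(x)=\Psi_{1}(x/\eta)$, so that each derivative picks up a factor $\eta^{-1}$. The construction below is an assumption made for illustration and is not taken from the paper; the function names are hypothetical.

```python
import math


def smooth_step(t):
    """C-infinity step: 0 for t <= 0, 1 for t >= 1, increasing in between."""
    if t <= 0:
        return 0.0
    if t >= 1:
        return 1.0
    a = math.exp(-1.0 / t)
    b = math.exp(-1.0 / (1.0 - t))
    return a / (a + b)


def psi(x, eta):
    """Cutoff equal to 1 for x <= eta and 0 for x >= 2*eta (used for x > 0 only)."""
    return smooth_step(2.0 - x / eta)


def phi(x, eta):
    """Complementary cutoff Phi_eta = 1 - Psi_eta."""
    return 1.0 - psi(x, eta)


eta = 0.1
print(psi(0.5 * eta, eta), psi(1.5 * eta, eta), psi(2 * eta, eta), phi(3 * eta, eta))
```

Because $\Psi_{\eta}(x)=\Psi_{1}(x/\eta)$, the chain rule gives $\Psi_{\eta}^{(k)}(x)=\eta^{-k}\Psi_{1}^{(k)}(x/\eta)$, which yields a bound of the form $C_{k}\eta^{-k}$ with $C_{k}=\|\Psi_{1}^{(k)}\|_{\infty}$.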

Lemma B.1

A. For every $k,n\in{\mathbb{N}}$ there exists a universal constant $C$ (depending on $k$ and $n)$ such that, for $\omega$ such that $\det\sigma_{F}(\omega)>0,$

[TABLE]

B. For every $\eta>0$

[TABLE]

Proof. A. We first prove (B.7) for $n=1$. We have

[TABLE]

Using (B.1)

[TABLE]

For $n>1$, we argue by induction and we obtain

[TABLE]

Then, using (B.1) first and (B.4) secondly, (B.7) follows.

B. Let $G_{\eta}=\Phi_{\eta}(\det\sigma_{F})G$. For every $p\in{\mathbb{N}}$ one has $\left|G_{\eta}\right|_{p}\leq C\eta^{-p}\left|G\right|_{p}\left|F\right|_{1,p+1}^{d}$. Moreover one has $H_{\rho}^{(n)}(F,G_{\eta})=1_{\{\det\sigma_{F}>\eta/2\}}H_{\rho}^{(n)}(F,G_{\eta})$. So (B.7) implies (B.8). $\square$

Bibliography

[1] Bally V., Caramellino L.: Asymptotic development for the CLT in total variation distance. Bernoulli 22, 2442-2485 (2016).
[2] Bally V., Caramellino L.: On the distances between probability density functions. Electronic Journal of Probability 19, no. 110, 1-33 (2014).
[3] Bally V., Caramellino L.: An Invariance principle for Stochastic Series II. Non Gaussian limits. Preprint arXiv:1607.04544 (2016).
[4] Bally V., Caramellino L., Poly G.: Convergence in distribution norms in the CLT for non identical distributed random variables. Preprint arXiv:1606.01629 (2016).
[5] Bally V., Ray C.: Approximation of Markov semigroups in total variation distance. Electronic Journal of Probability 21, no. 12 (2016).
[6] Bakry D., Gentil I., Ledoux M.: Analysis and Geometry of Markov Diffusion Semigroups. Springer (2014).
[7] Bentkus V.: On Hoeffding’s inequalities. Ann. Probab. 32, 1650–1673 (2004).
[8] Bogachev V.I., Kosov V.I., Zelenov G.I.: Fractional smoothness of distributions of polynomials and fractional analog of the Hardy-Landau-Littlewood inequality. arXiv:1602.05207v2.
