Asymptotic independence of point process and Frobenius norm of a large sample covariance matrix

Johannes Heiny; Carolin Kleemann

arXiv:2302.13914·math.PR·February 28, 2023


TL;DR

This paper proves a joint limit theorem for the point process of the off-diagonal entries and the Frobenius norm of a large sample covariance matrix. The two are asymptotically independent under suitable moment conditions, and the limiting law of the Frobenius norm depends on whether the fourth moment of the entries is finite.

Contribution

To the best of the authors' knowledge, it establishes the first joint convergence result for a point process of dependent points and their sum in the non-Gaussian setting, obtained by extending a theorem of Kallenberg.

Findings

01. Central limit theorem for the Frobenius norm when the fourth moment is finite

02. Infinite variance stable law for the Frobenius norm when the fourth moment is infinite

03. Asymptotic independence of the point process and the Frobenius norm

Abstract

A joint limit theorem for the point process of the off-diagonal entries of a sample covariance matrix $S$, constructed from $n$ observations of a $p$-dimensional random vector with iid components, and the Frobenius norm of $S$ is proved. In particular, assuming that $p$ and $n$ tend to infinity, we obtain a central limit theorem for the Frobenius norm in the case of finite fourth moment of the components and an infinite variance stable law in the case of infinite fourth moment. Extending a theorem of Kallenberg, we establish asymptotic independence of the point process and the Frobenius norm of $S$. To the best of our knowledge, this is the first result about joint convergence of a point process of dependent points and their sum in the non-Gaussian case.

Tables

Table 1. Overview of the asymptotic results about $\max S_{ij}$, $N_{n}$ and $\operatorname{tr}({\mathbf{S}}^{2})$.

Statistic	$\operatorname{Var}(X)<\infty$, ${\mathbb{E}}[X^{4}]=\infty$	${\mathbb{E}}[X^{4}]<\infty$
$\max S_{ij}$	Gumbel distribution, [23, Thm. 3.2]	Gumbel distribution, [27, Lem. 3.2]
$N_{n}$	Poisson process, [23, Thm. 3.2]	Poisson process $N$, [23, Thm. 3.2]
$\operatorname{tr}({\mathbf{S}}^{2})$	Stable distribution, Thm. 2.4	Normal distribution $Z$, Thm. 2.2
$(N_{n},\operatorname{tr}({\mathbf{S}}^{2}))$	open	$(N,Z)$ independent, Thm. 2.7

Equations

$${\mathbf{S}}:={\mathbf{S}}_{n}:=\frac{1}{n}\sum_{t=1}^{n}\mathbf{x}_{t}\mathbf{x}_{t}^{\top}=(S_{ij})_{1\leq i,j\leq p}\,.$$

$$\|{\mathbf{S}}\|_{F}^{2}:=\sum_{i,j=1}^{p}S_{ij}^{2}=\operatorname{tr}({\mathbf{S}}^{2})\quad\text{and}\quad\max_{1\leq i<j\leq p}S_{ij}\,.$$

$$N_{n}:=\sum_{1\leq i<j\leq p}\varepsilon_{d_{p}(\sqrt{n}S_{ij}-d_{p})}\,,$$

$$d_{p}:=\sqrt{2\log\tilde{p}}-\frac{\log\log\tilde{p}+\log 4\pi}{2(2\log\tilde{p})^{1/2}}\quad\text{for }\tilde{p}:=p(p-1)/2\,.$$

$$S_{n}:=\sum_{i=1}^{n}Y_{i}\quad\text{and}\quad M_{n}:=\max_{1\leq i\leq n}Y_{i}\,.$$

$$1-F(x)=q_{+}x^{-\alpha}L(x)\quad\text{and}\quad F(-x)=q_{-}x^{-\alpha}L(x),\qquad x>0,\ \alpha\in(0,2),$$

$$N=\sum_{i=1}^{\infty}\varepsilon_{-\log\Gamma_{i}}\,,$$

$$Z_{n}:=\frac{\operatorname{tr}({\mathbf{S}}^{2})-\mu_{n}}{\sigma_{n}}\,,$$

$$\sigma_{n}^{2}$$

$$\mu_{n}$$

$${\mathbb{P}}(|X|>x)=x^{-\alpha}L(x),\qquad x>0,$$

$$a_{k}:=\inf\{x\in\mathbb{R}:{\mathbb{P}}(|X|>x)\leq 1/k\},\qquad k\geq 1,$$

$$\frac{n^{2}}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}}^{2})-\frac{2n(n+p-2)}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}})+\frac{np(n+p-2)}{a_{np}^{4}}\stackrel{\rm d}{\rightarrow}\zeta_{\alpha/4},\qquad n\to\infty,$$

$${\mathbb{E}}[e^{{\rm i}t\zeta_{\alpha/4}}]=\exp\Big({\rm i}tc_{\alpha}+\frac{\alpha}{4}\int_{0}^{\infty}\Big(e^{{\rm i}tx}-1-\frac{{\rm i}tx}{1+x^{2}}\Big)x^{-(\alpha/4+1)}\,dx\Big),$$

$$\frac{n^{2}}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}}^{2})-\frac{2n(n+p-2)}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}})+\frac{np(n+p-2)}{a_{np}^{4}}=\frac{1}{a_{np}^{4}}\sum_{i=1}^{p}\sum_{t=1}^{n}X_{it}^{4}+o_{{\mathbb{P}}}(1)\,.$$

$$\begin{aligned}&{\mathbb{P}}\Big(\Big|\frac{2n(n+p-2)}{a_{np}^{4}}\big(\operatorname{tr}({\mathbf{S}})-{\mathbb{E}}[\operatorname{tr}({\mathbf{S}})]\big)\Big|>\delta\Big)\\&\qquad\sim np\,{\mathbb{P}}\Big(|X^{2}-1|>\tfrac{\delta a_{np}^{4}}{2(n+p-2)}\Big)\to 0\,,\qquad n\to\infty\,,\end{aligned}$$

$$\mathcal{B}_{N}:=\{B\text{ bounded Borel set}:N(\partial B)=0\},$$

$$\lim_{n\to\infty}{\mathbb{P}}(Y_{n}\leq y,N_{n}(B_{1})\leq l_{1},\ldots,N_{n}(B_{k})\leq l_{k})={\mathbb{P}}(Y\leq y)\,{\mathbb{P}}(N(B_{1})\leq l_{1},\ldots,N(B_{k})\leq l_{k})\,.$$

$$(Z_{n},N_{n})\stackrel{\rm d}{\rightarrow}(Z,N),\qquad n\to\infty,$$

$$G_{n,(1)}\geq G_{n,(2)}\geq\cdots\geq G_{n,(p(p-1)/2)}\,.$$

$$\lim_{n\to\infty}{\mathbb{P}}(Z_{n}\leq y,G_{n,(1)}\leq x_{1},\ldots,G_{n,(k)}\leq x_{k})={\mathbb{P}}(Z\leq y)\,{\mathbb{P}}(-\log\Gamma_{1}\leq x_{1},\ldots,-\log\Gamma_{k}\leq x_{k}),$$

$$\begin{aligned}&{\mathbb{P}}(Z_{n}\leq y,G_{n,(1)}\leq x_{1},\ldots,G_{n,(k)}\leq x_{k})\\&\quad={\mathbb{P}}\Big(Z_{n}\leq y,N_{n}(x_{1},\infty)=0,N_{n}(x_{2},\infty)\leq 1,\ldots,N_{n}(x_{k},\infty)\leq k-1\Big)\\&\quad\to{\mathbb{P}}(Z\leq y)\,{\mathbb{P}}\Big(N(x_{1},\infty)=0,N(x_{2},\infty)\leq 1,\ldots,N(x_{k},\infty)\leq k-1\Big)\,,\qquad n\to\infty\,.\end{aligned}$$

$${\mathbb{P}}\Big(N(x_{1},\infty)=0,\ldots,N(x_{k},\infty)\leq k-1\Big)={\mathbb{P}}(-\log\Gamma_{1}\leq x_{1},\ldots,-\log\Gamma_{k}\leq x_{k}),$$

$${\mathbb{P}}(Z_{n}\leq y,N_{n}(U)\neq 0)={\mathbb{P}}\Big(Z_{n}\leq y,\bigcup_{I\in\Lambda_{n}}B_{I}\Big)\,,$$

$$\sum_{d=1}^{2k}(-1)^{d-1}W_{n,d}\leq{\mathbb{P}}\Big(Z_{n}\leq y,\bigcup_{I\in\Lambda_{n}}B_{I}\Big)\leq\sum_{d=1}^{2k-1}(-1)^{d-1}W_{n,d}\,,$$

$$\sum_{I_{1}<\cdots<I_{d}}{\mathbb{P}}(|Z_{n,d}|>\delta)\to 0,\qquad n\to\infty.$$

$${\mathbb{P}}\Big(Z_{n}-Z_{n,d}\leq y,\bigcap_{\ell=1}^{d}B_{I_{\ell}}\Big)={\mathbb{P}}\big(Z_{n}-Z_{n,d}\leq y\big)\,{\mathbb{P}}\Big(\bigcap_{\ell=1}^{d}B_{I_{\ell}}\Big)\,.$$

$${\mathbb{P}}(Z_{n}\leq y,N_{n}(U)\neq 0)-{\mathbb{P}}(Z_{n}\leq y)\,{\mathbb{P}}(N_{n}(U)\neq 0)\to 0,\qquad n\to\infty,$$

$$N_{n}=\sum_{1\leq i<j\leq p}\varepsilon_{d_{p}(\sqrt{n}S_{ij}-d_{p})}\,.$$

Taxonomy

Topics: Random Matrices and Applications · Point processes and geometric inequalities · Geochemistry and Geologic Mapping

Full text

Asymptotic independence of point process and Frobenius norm of a large sample covariance matrix

Johannes Heiny

Fakultät für Mathematik, Ruhruniversität Bochum, Universitätsstrasse 150, D-44801 Bochum, Germany


and

Carolin Kleemann

Fakultät für Mathematik, Ruhruniversität Bochum, Universitätsstrasse 150, D-44801 Bochum, Germany


Abstract.

A joint limit theorem for the point process of the off-diagonal entries of a sample covariance matrix $\mathbf{S}$, constructed from $n$ observations of a $p$-dimensional random vector with iid components, and the Frobenius norm of $\mathbf{S}$ is proved. In particular, assuming that $p$ and $n$ tend to infinity, we obtain a central limit theorem for the Frobenius norm in the case of finite fourth moment of the components and an infinite variance stable law in the case of infinite fourth moment. Extending a theorem of Kallenberg, we establish asymptotic independence of the point process and the Frobenius norm of $\mathbf{S}$. To the best of our knowledge, this is the first result about joint convergence of a point process of dependent points and their sum in the non-Gaussian case.

Key words and phrases:

Gumbel distribution, extreme value theory, maximum entry, point process, central limit theorem, stable distribution, random matrix, joint convergence, asymptotic independence

1991 Mathematics Subject Classification:

Primary 60G70; Secondary 60B20, 60B12, 60G55, 60F05, 60G50, 60E07

Johannes Heiny’s and Carolin Kleemann’s research was partially supported by the Deutsche Forschungsgemeinschaft (DFG) via RTG 2131 High-dimensional Phenomena in Probability – Fluctuations and Discontinuity.

1. Introduction

Over recent years the analysis of high-dimensional data has emerged as an important and active research area driven by a wide range of applications in various fields such as genomics, medical imaging, signal processing, financial engineering and social science. To study large data sets, for instance, in brain connectivity analysis or in gene expression analysis (see [41, 46, 18]), knowledge of the dependence structure plays a central role. Interpreting the data as observations of a $p$ -dimensional random vector, dependence between the components of the vector is often estimated by covariance/correlation statistics and different functions are used to aggregate these estimates of the pairwise dependencies. For example, [40, 37, 45] and [29] propose sum-type tests based on the Frobenius norm which are usually powerful against dense alternatives. Further very popular methods of aggregating estimates of the pairwise dependencies are maximum-type tests, which have good power properties against sparse alternatives and have been investigated for various covariance/correlation statistics in [27, 47, 31, 45, 6, 20, 12, 23] and [21] among others. Since in practice it is difficult to decide whether the underlying covariance matrix is sparse or dense, it is useful to combine these two types of test statistics to cover both cases [17, 15, 21, 44, 7]. Therefore, an understanding of the joint asymptotic behavior of these test statistics is needed.

The objective of this paper is to contribute to this line of research by providing asymptotic theory for the joint distribution of sum-type statistics and generalized maximum-type statistics of a large sample covariance matrix.

For a sample $\mathbf{x}_{1},\ldots,\mathbf{x}_{n}$ from the $p$-dimensional population $\mathbf{x}$ with independent and identically distributed (iid) components with mean zero and variance one, the sample covariance matrix ${\mathbf{S}}$ is given by

$${\mathbf{S}}:={\mathbf{S}}_{n}:=\frac{1}{n}\sum_{t=1}^{n}\mathbf{x}_{t}\mathbf{x}_{t}^{\top}=(S_{ij})_{1\leq i,j\leq p}\,.$$

Throughout this paper, we assume that the dimension $p=p_{n}$ is a positive integer sequence tending to infinity together with the sample size $n$. Thus, the $p\times p$ matrix ${\mathbf{S}}$ is a high-dimensional random matrix whose asymptotic properties are used, for example, in independence testing. Sum-type and maximum-type statistics based on ${\mathbf{S}}$ are given by the (squared) Frobenius norm and the maximum off-diagonal entry of ${\mathbf{S}}$,

$$\|{\mathbf{S}}\|_{F}^{2}:=\sum_{i,j=1}^{p}S_{ij}^{2}=\operatorname{tr}({\mathbf{S}}^{2})\quad\text{and}\quad\max_{1\leq i<j\leq p}S_{ij}\,.$$
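As a concrete illustration of these two statistics, the following minimal Python sketch builds ${\mathbf{S}}$ from simulated data and evaluates the squared Frobenius norm and the maximum off-diagonal entry. The function names, the Gaussian entries and the sizes $n=200$, $p=10$ are illustrative assumptions and are not taken from the paper.

```python
import random


def sample_covariance(xs):
    """Return S = (1/n) * sum_t x_t x_t^T for a list of n vectors of length p."""
    n, p = len(xs), len(xs[0])
    return [[sum(x[i] * x[j] for x in xs) / n for j in range(p)] for i in range(p)]


def frobenius_sq(S):
    """Squared Frobenius norm: the sum of all squared entries of S."""
    return sum(s * s for row in S for s in row)


def trace_of_square(S):
    """tr(S^2), computed as sum_i (S S)_{ii}."""
    p = len(S)
    return sum(S[i][k] * S[k][i] for i in range(p) for k in range(p))


def max_off_diagonal(S):
    """Maximum of S_ij over 1 <= i < j <= p."""
    p = len(S)
    return max(S[i][j] for i in range(p) for j in range(i + 1, p))


# Illustrative data (an assumption): iid standard normal components.
rng = random.Random(0)
n, p = 200, 10
xs = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
S = sample_covariance(xs)

frob = frobenius_sq(S)
tr_s2 = trace_of_square(S)
max_entry = max_off_diagonal(S)
```

Because ${\mathbf{S}}$ is symmetric, `frob` and `tr_s2` should agree up to rounding, which mirrors the identity $\|{\mathbf{S}}\|_{F}^{2}=\operatorname{tr}({\mathbf{S}}^{2})$.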

We are interested in the joint limiting distribution of $\operatorname{tr}({\mathbf{S}}^{2})$ and the sequence of point processes of the off-diagonal entries of ${\mathbf{S}}$

$$N_{n}:=\sum_{1\leq i<j\leq p}\varepsilon_{d_{p}(\sqrt{n}S_{ij}-d_{p})}\,,\tag{1.2}$$

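where $\varepsilon_{x}$ denotes the Dirac measure in $x\in\mathbb{R}$ and

$$d_{p}:=\sqrt{2\log\tilde{p}}-\frac{\log\log\tilde{p}+\log 4\pi}{2(2\log\tilde{p})^{1/2}}\quad\text{for }\tilde{p}:=p(p-1)/2\,.$$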

It is well known that the point process in (1.2) contains information about all order statistics of the $S_{ij}$’s (see [14]). The distribution of the maximum can be recovered from the identity $\{N_{n}((x,\infty))=0\}=\{\max_{i<j}d_{p}(\sqrt{n}S_{ij}-d_{p})\leq x\}$, $x\in\mathbb{R}$. In this sense, the point process $N_{n}$ is a natural and meaningful generalization of the maximum.
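The identity between the void event of $N_{n}$ and the event for the maximum can be checked numerically. The sketch below is a minimal Python illustration; it assumes that $d_{p}$ is the usual normalizing sequence for maxima of standard normal variables with $\tilde{p}=p(p-1)/2$, and the helper names, the Gaussian data and the sizes are hypothetical choices.

```python
import math
import random


def d_p(p):
    """Normalizing sequence, assumed to be the standard Gumbel normalization
    for the maximum of p(p-1)/2 standard normal variables (requires p >= 3)."""
    p_tilde = p * (p - 1) / 2
    root = math.sqrt(2 * math.log(p_tilde))
    return root - (math.log(math.log(p_tilde)) + math.log(4 * math.pi)) / (2 * root)


def point_process_points(xs):
    """Points d_p * (sqrt(n) * S_ij - d_p) for all pairs 1 <= i < j <= p."""
    n, p = len(xs), len(xs[0])
    d = d_p(p)
    points = []
    for i in range(p):
        for j in range(i + 1, p):
            s_ij = sum(x[i] * x[j] for x in xs) / n
            points.append(d * (math.sqrt(n) * s_ij - d))
    return points


def count_points_above(points, x):
    """N_n((x, infinity)): the number of points strictly larger than x."""
    return sum(1 for pt in points if pt > x)


# Illustrative data (an assumption): iid standard normal components.
rng = random.Random(1)
n, p = 300, 12
xs = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
points = point_process_points(xs)

# {N_n((x, inf)) = 0} holds exactly when the largest point is at most x.
levels = (-3.0, -1.0, 0.0, 1.0, 4.0)
identity_holds = all(
    (count_points_above(points, x) == 0) == (max(points) <= x) for x in levels
)
```

The same counting function evaluated at several levels gives the higher order statistics, which is why the point process carries more information than the maximum alone.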

Table 1 gives an overview of the available results about the convergence of $\max S_{ij}$, $N_{n}$ and $\operatorname{tr}({\mathbf{S}}^{2})$ and of the novel contributions of this paper (marked in blue). For the reader’s convenience, the table contains the limit distributions themselves and precise references. It is worth mentioning that the distinction between finite and infinite fourth moment of $X$, which is a generic random variable with the same distribution as the components of $\mathbf{x}$, results from the sum-type statistic $\operatorname{tr}({\mathbf{S}}^{2})$.

1.1. Related literature on sums, maxima and point processes

The joint behavior of the sum and the maximum of a sequence of real-valued random variables has been studied before, motivated for example by the evaluation of wind speed data, which is usually available in the form of the maximum wind speed and the average wind speed during a day or an hour. For iid random variables $(Y_{i})_{i\geq 1}$ we set

$$S_{n}:=\sum_{i=1}^{n}Y_{i}\quad\text{and}\quad M_{n}:=\max_{1\leq i\leq n}Y_{i}\,.$$

If the distribution function $F$ of $Y_{1}$ belongs to the sum domain of attraction of the normal distribution and the maximum domain of attraction of an extreme value distribution, [8] showed that $(S_{n},M_{n})$ converges in distribution to a limit $(S,M)$, where $S$ and $M$ are independent and non-degenerate. They also proved that if

$$1-F(x)=q_{+}x^{-\alpha}L(x)\quad\text{and}\quad F(-x)=q_{-}x^{-\alpha}L(x),\qquad x>0,\ \alpha\in(0,2),$$

where $L$ is a slowly varying function and $0<q_{+}\leq q_{+}+q_{-}=1$, then $(S_{n},M_{n})$ converges to a limit $(S,M)$, where $S$ and $M$ are dependent, and they provide a hybrid characteristic distribution function of $(S,M)$.

The papers [1, 2, 25] generalized these results to strongly mixing stationary random variables. For stationary normal random variables, [24] and [32] proved asymptotic independence under certain correlation assumptions. Recently, [7] proved asymptotic independence of a quadratic form in independent random variables and the maximum of these variables, and [16] showed asymptotic independence of the sum and the maximum of dependent normal random variables that need not be stationary or strongly mixing but satisfy conditions on the smallest and largest eigenvalues of their covariance matrix.

For a triangular array of normally distributed random variables, the asymptotic independence of the point process of exceedances and the partial sum was considered in [26, 42] and extended to the multivariate case in [35]. Moreover, [19] established asymptotic independence of the point processes of clusters and the partial sums of bivariate stationary Gaussian triangular arrays. Asymptotic independence of other quantities derived from the sample covariance matrix ${\mathbf{S}}$ has also been considered in a variety of settings. For example, the asymptotic independence of the maximum of sample correlations and the sum of the squared sample correlations between the residuals from the ordinary least squares is proven in [17], while [30] showed asymptotic independence of the largest sample eigenvalues and the trace of ${\mathbf{S}}$.

1.2. Structure of this paper

This paper is structured as follows. Section 2 contains our main results about the point process of the off-diagonal entries of ${\mathbf{S}}$ and the Frobenius norm of ${\mathbf{S}}$. Under finite fourth moment of $X$ the Frobenius norm satisfies a CLT (Theorem 2.2), while in the case of infinite fourth moment we obtain a stable limit law (Theorem 2.4). The main result of this paper is Theorem 2.7, which shows asymptotic independence of the point process and the Frobenius norm of ${\mathbf{S}}$. The challenges in the proof of this result and our novel technical contributions are outlined in Section 2.3, while Section 2.4 presents an application to independence testing. The proofs are deferred to Section 3 and helpful auxiliary results are given in Section 4.

1.3. Notation

Convergence in distribution (resp. probability) is denoted by $\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}$ (resp. $\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}$ ), equality in distribution by $\stackrel{{\scriptstyle\rm d}}{{=}}$ , and unless explicitly stated otherwise all limits are for $n\to\infty$ . For sequences $(a_{n})_{n}$ and $(b_{n})_{n}$ we write $a_{n}=O(b_{n})$ if $a_{n}/b_{n}\leq C$ for some constant $C>0$ and every $n\in\mathbb{N}$ , and $a_{n}=o(b_{n})$ if $\lim_{n\to\infty}a_{n}/b_{n}=0$ . Additionally, we use the notation $a_{n}\sim b_{n}$ if $\lim_{n\to\infty}a_{n}/b_{n}=1$ .

2. Main results

Consider a sample $\mathbf{x}_{1},\ldots,\mathbf{x}_{n}$ from the $p$-dimensional population $\mathbf{x}$ with iid components with generic element $X$ satisfying ${\mathbb{E}}[X]=0$ and ${\mathbb{E}}[X^{2}]=1$. We will work in the high-dimensional setting, where the dimension $p=p_{n}$ is some positive integer sequence tending to infinity as $n\to\infty$. We aim to study the joint asymptotic behavior of the point processes $N_{n}$ of the off-diagonal entries (see (1.2)) and the Frobenius norm of the sample covariance matrix ${\mathbf{S}}=(S_{ij})=\frac{1}{n}\sum_{t=1}^{n}\mathbf{x}_{t}\mathbf{x}_{t}^{\top}$.

2.1. Asymptotic distributions of point process and Frobenius norm of the sample covariance matrix

First we consider the convergence in distribution of the sequence of point processes $N_{n}$ towards a Poisson random measure. For a detailed background on weak convergence of point processes we refer to [38, 10]. The following result can be found in [23, Theorem 3.2] and [22, Theorem 4.1].

Theorem 2.1.

([23, Theorem 3.2]) Assume that there exist $s>2$ and $\varepsilon>0$ such that ${\mathbb{E}}[|X|^{s+\varepsilon}]<\infty$ and let $p=p_{n}\to\infty$ satisfy $p=O(n^{(s-2)/4})$ as $n\to\infty$. Then it holds that $N_{n}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}N$, where $N$ is a Poisson random measure with mean measure $\mu(x,\infty)=\operatorname{e}^{-x}$ for $x\in\mathbb{R}$.

The Poisson random measure $N$ with mean measure $\mu(x,\infty)=\operatorname{e}^{-x}$ for $x\in\mathbb{R}$ has the representation

$$N=\sum_{i=1}^{\infty}\varepsilon_{-\log\Gamma_{i}}\,,\tag{2.4}$$

where $\Gamma_{i}=E_{1}+\cdots+E_{i}$, $i\geq 1$, and $(E_{i})$ is a sequence of iid standard exponential random variables [38].
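This representation makes the limit process easy to simulate. The following Python sketch, with hypothetical helper names and an arbitrary number of Monte Carlo repetitions, draws the points $-\log\Gamma_{i}$ and estimates the mean number of points in $(x,\infty)$, which should be close to $\operatorname{e}^{-x}$ if the representation is implemented correctly.

```python
import math
import random


def first_points(rng, k):
    """Return the k largest points -log(Gamma_1) > ... > -log(Gamma_k) of N."""
    gamma, points = 0.0, []
    for _ in range(k):
        gamma += rng.expovariate(1.0)  # Gamma_i = E_1 + ... + E_i
        points.append(-math.log(gamma))
    return points


def count_above(rng, x):
    """One draw of N((x, inf)) = #{i : -log(Gamma_i) > x} = #{i : Gamma_i < exp(-x)}."""
    threshold = math.exp(-x)
    gamma, count = rng.expovariate(1.0), 0
    while gamma < threshold:
        count += 1
        gamma += rng.expovariate(1.0)
    return count


def mean_count_above(x, reps=20000, seed=0):
    """Monte Carlo estimate of E[N((x, inf))]; reps and seed are arbitrary choices."""
    rng = random.Random(seed)
    return sum(count_above(rng, x) for _ in range(reps)) / reps


estimate_at_0 = mean_count_above(0.0)    # compare with exp(-0) = 1
estimate_at_m1 = mean_count_above(-1.0)  # compare with exp(1)
largest_five = first_points(random.Random(3), 5)
```

Since the $\Gamma_{i}$ increase in $i$, the points are generated in decreasing order, so the first point is the maximum of the limit process.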

Our second object of interest is the squared Frobenius norm of the sample covariance matrix, that is, $\|{\mathbf{S}}\|_{F}^{2}:=\sum_{i,j=1}^{p}S_{ij}^{2}=\operatorname{tr}({\mathbf{S}}^{2})$. In order to study its asymptotic distribution we define

$$Z_{n}:=\frac{\operatorname{tr}({\mathbf{S}}^{2})-\mu_{n}}{\sigma_{n}}\,,\tag{2.5}$$

where

[TABLE]

The next result provides a CLT for $\operatorname{tr}({\mathbf{S}}^{2})$ in the general case that $p_{n}\to\infty$ and ${\mathbb{E}}[X^{4}]<\infty$ .

Theorem 2.2.

If ${\mathbb{E}}[X^{4}]<\infty$ and $p=p_{n}\to\infty$ , then as $n\to\infty$ we have $Z_{n}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}Z$ for a standard normal random variable $Z$ .

Remark 2.3.

Under special assumptions on $X$ and the growth of $p$ , the behavior of $Z_{n}$ can be deduced from CLTs for so-called linear spectral statistics of sample covariance matrices: for example, [3, Theorem 9.10] and [34, Theorem 1.4] in the case $p/n\to C\in(0,\infty)$ and ${\mathbb{E}}[X^{4}]<\infty$ , or [36, Theorem 3.1] in the case $n^{2}/p=O(1)$ if ${\mathbb{E}}[|X|^{6+\delta}]<\infty$ for some $\delta>0$ .

In contrast to the convergence of the point processes $N_{n}$ in Theorem 2.1, the CLT in Theorem 2.2 requires the existence of the fourth moment of $X$. To characterize the asymptotic distribution of $\operatorname{tr}({\mathbf{S}}^{2})$ in the case ${\mathbb{E}}[X^{4}]=\infty$ we need to assume that $X$ is a regularly varying random variable. We say that a random variable $X$ (or its distribution) is regularly varying with index $\alpha>0$ if

$${\mathbb{P}}(|X|>x)=x^{-\alpha}L(x),\qquad x>0,$$

where $L$ is a slowly varying function, i.e., $\lim_{x\to\infty}L(tx)/L(x)=1$ for $t>0$. Examples of regularly varying distributions are the Pareto distribution with parameter $\alpha$ and the $t$-distribution with $\alpha$ degrees of freedom.

For a regularly varying random variable $X$ with index $\alpha$ it holds that ${\mathbb{E}}[|X|^{\beta}]<\infty$ if $\beta<\alpha$ and ${\mathbb{E}}[|X|^{\beta}]=\infty$ if $\beta>\alpha$. It is well known that the sequence $(a_{k})_{k}$ defined through

$$a_{k}:=\inf\{x\in\mathbb{R}:{\mathbb{P}}(|X|>x)\leq 1/k\},\qquad k\geq 1,$$

is of the form $a_{k}=k^{1/\alpha}\ell(k)$, where $\ell$ is a slowly varying function. For further properties of regularly varying functions we refer to [4, 39]. In the next theorem we consider the case that $X$ has finite variance but infinite fourth moment.
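For an exact Pareto tail the quantile sequence is explicit, which gives a quick way to see the form $a_{k}=k^{1/\alpha}\ell(k)$ with $\ell\equiv 1$. The Python sketch below assumes the tail ${\mathbb{P}}(|X|>x)=x^{-\alpha}$ for $x\geq 1$; the function names, the bisection bounds and the value $\alpha=3$ are illustrative assumptions.

```python
def pareto_tail(x, alpha):
    """P(|X| > x) for an exact Pareto tail: x^(-alpha) for x >= 1, and 1 below 1."""
    return x ** (-alpha) if x >= 1.0 else 1.0


def a_k_closed_form(k, alpha):
    """Closed form a_k = k^(1/alpha), obtained by solving x^(-alpha) = 1/k."""
    return k ** (1.0 / alpha)


def a_k_bisect(k, alpha, upper=1e12, iters=200):
    """Approximate inf{x : P(|X| > x) <= 1/k} by bisection.

    Uses that the tail is non-increasing; k >= 2 is assumed so that 1/k < 1.
    """
    lo, hi = 0.0, upper
    for _ in range(iters):
        mid = (lo + hi) / 2
        if pareto_tail(mid, alpha) <= 1.0 / k:
            hi = mid  # mid belongs to the set, so the infimum is at most mid
        else:
            lo = mid  # mid is below the infimum
    return hi


alpha = 3.0  # lies in (2, 4): finite variance, infinite fourth moment
ks = (2, 10, 1000)
numeric = [a_k_bisect(k, alpha) for k in ks]
exact = [a_k_closed_form(k, alpha) for k in ks]
```

With $\alpha\in(2,4)$ the normalization $a_{np}^{4}=(np)^{4/\alpha}$ grows faster than $np$, which is the scale needed for sums of the heavy-tailed fourth powers.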

Theorem 2.4.

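Let $X$ have a regularly varying distribution with index $\alpha\in(2,4)$ and assume that $p=p_{n}\to\infty$. Then it holds that

$$\frac{n^{2}}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}}^{2})-\frac{2n(n+p-2)}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}})+\frac{np(n+p-2)}{a_{np}^{4}}\stackrel{\rm d}{\rightarrow}\zeta_{\alpha/4},\qquad n\to\infty,\tag{2.8}$$

where $\zeta_{\alpha/4}$ is a non-degenerate $\alpha/4$-stable random variable with characteristic function

$${\mathbb{E}}[e^{{\rm i}t\zeta_{\alpha/4}}]=\exp\Big({\rm i}tc_{\alpha}+\frac{\alpha}{4}\int_{0}^{\infty}\Big(e^{{\rm i}tx}-1-\frac{{\rm i}tx}{1+x^{2}}\Big)x^{-(\alpha/4+1)}\,dx\Big),$$

where ${\rm i}$ is the imaginary unit and $c_{\alpha}$ is a constant depending only on $\alpha$.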

In the proof of Theorem 2.4 we show that

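$$\frac{n^{2}}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}}^{2})-\frac{2n(n+p-2)}{a_{np}^{4}}\operatorname{tr}({\mathbf{S}})+\frac{np(n+p-2)}{a_{np}^{4}}=\frac{1}{a_{np}^{4}}\sum_{i=1}^{p}\sum_{t=1}^{n}X_{it}^{4}+o_{{\mathbb{P}}}(1)\,.$$

Noticing that $\sum_{i,t}X_{it}^{4}$ is a sum of iid regularly varying random variables with index $\alpha/4\in(1/2,1)$, we obtain an $\alpha/4$-stable limit distribution after proper normalization [39].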
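The centering constants can be motivated by an elementary expansion. The following is only a heuristic accounting under the model assumptions (iid entries $X_{it}$ with ${\mathbb{E}}[X]=0$ and ${\mathbb{E}}[X^{2}]=1$), not the argument of the proof; the split into three sums is our own bookkeeping.

```latex
\begin{align*}
n^{2}\operatorname{tr}(\mathbf{S}^{2})
  &= \sum_{i,j=1}^{p}\Big(\sum_{t=1}^{n}X_{it}X_{jt}\Big)^{2} \\
  &= \sum_{i=1}^{p}\sum_{t=1}^{n}X_{it}^{4}
   + \sum_{i=1}^{p}\sum_{t\neq s}X_{it}^{2}X_{is}^{2}
   + \sum_{i\neq j}\Big(\sum_{t=1}^{n}X_{it}X_{jt}\Big)^{2}, \\
\mathbb{E}\Big[\sum_{i=1}^{p}\sum_{t\neq s}X_{it}^{2}X_{is}^{2}\Big] &= pn(n-1),
\qquad
\mathbb{E}\Big[\sum_{i\neq j}\Big(\sum_{t=1}^{n}X_{it}X_{jt}\Big)^{2}\Big] = p(p-1)n, \\
pn(n-1)+p(p-1)n &= np(n+p-2), \\
n^{2}\operatorname{tr}(\mathbf{S}^{2})-2n(n+p-2)\operatorname{tr}(\mathbf{S})+np(n+p-2)
  &= n^{2}\operatorname{tr}(\mathbf{S}^{2})-np(n+p-2)
   -2n(n+p-2)\big(\operatorname{tr}(\mathbf{S})-p\big).
\end{align*}
```

The first sum is the heavy-tailed part, while the other two sums have mean $np(n+p-2)$ in total and require only finite second moments for this computation. The last line is an exact algebraic identity that separates the deterministic centering from the random correction involving $\operatorname{tr}({\mathbf{S}})-p$.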

Remark 2.5.

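(a) In some cases it is possible to replace $\operatorname{tr}({\mathbf{S}})$ by its expectation ${\mathbb{E}}[\operatorname{tr}({\mathbf{S}})]=p$ in (2.8). For example, if $\lim_{n\to\infty}p/n\in(0,\infty)$, we get for $\delta>0$

$$\begin{aligned}&{\mathbb{P}}\Big(\Big|\frac{2n(n+p-2)}{a_{np}^{4}}\big(\operatorname{tr}({\mathbf{S}})-{\mathbb{E}}[\operatorname{tr}({\mathbf{S}})]\big)\Big|>\delta\Big)\\&\qquad\sim np\,{\mathbb{P}}\Big(|X^{2}-1|>\tfrac{\delta a_{np}^{4}}{2(n+p-2)}\Big)\to 0\,,\qquad n\to\infty\,,\end{aligned}$$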

where Theorem A1 of [11] and the fact that $X^{2}-1$ is regularly varying with index $\alpha/2$ were used for the asymptotic equivalence in the last line.

(b) For $\alpha=4$ the limit in (2.8) does not hold in general, which we shall illustrate in the case ${\mathbb{E}}[X^{4}]<\infty$ . As $\operatorname{tr}({\mathbf{S}})$ is a sum of iid random variables with finite variance, it satisfies a CLT. Furthermore, Theorem 2.2 states that $\operatorname{tr}({\mathbf{S}}^{2})$ is also asymptotically normal. Along the lines of the proof of Theorem 2.2 one can show joint asymptotic normality of $\operatorname{tr}({\mathbf{S}})$ and $\operatorname{tr}({\mathbf{S}}^{2})$ . For the sake of brevity we omit a proof, but we mention that in the special case $\lim_{n\to\infty}p/n\in(0,\infty)$ joint asymptotic normality of $\operatorname{tr}({\mathbf{S}})$ and $\operatorname{tr}({\mathbf{S}}^{2})$ was established in [43, Lemma 2.2].

2.2. Joint limiting distribution of point process and Frobenius norm of the sample covariance matrix

In this subsection we are interested in the joint limiting distribution of $Z_{n}$ in (2.5) and the point process $N_{n}$ in (1.2). For this purpose we start by giving a definition of asymptotic independence; cf. [26, p. 284]. For the Poisson random measure $N$ defined in (2.4) we will need the collection of sets

$$\mathcal{B}_{N}:=\{B\text{ bounded Borel set}:N(\partial B)=0\},$$

where $\partial B$ is the boundary of $B$.

Definition 2.6.

Let $(Y_{n})_{n}$ be a sequence of real-valued random variables, which converges to the random variable $Y$ in distribution. Additionally, let $(N_{n})_{n}$ be a sequence of point processes on $\mathbb{R}$, which converges to the point process $N$ in distribution. We call $(Y_{n})_{n}$ and $(N_{n})_{n}$ asymptotically independent, if and only if for every $y\in\mathbb{R}$, $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$ and $l_{1},\ldots,l_{k}\in\mathbb{N}_{0}:=\mathbb{N}\cup\{0\}$

$$\lim_{n\to\infty}{\mathbb{P}}(Y_{n}\leq y,N_{n}(B_{1})\leq l_{1},\ldots,N_{n}(B_{k})\leq l_{k})={\mathbb{P}}(Y\leq y)\,{\mathbb{P}}(N(B_{1})\leq l_{1},\ldots,N(B_{k})\leq l_{k})\,.$$

Now we state our main result about the joint convergence of $(Z_{n},N_{n})$ .

Theorem 2.7.

Assume that there exist $s\geq 4$ and $\varepsilon>0$ such that ${\mathbb{E}}[|X|^{s+\varepsilon}]<\infty$ and let $p=p_{n}\to\infty$ satisfy $p=O(n^{(s-2)/4})$ as $n\to\infty$. Then it holds for the standardized traces $(Z_{n})_{n}$ defined in (2.5) and the point processes $(N_{n})_{n}$ in (1.2) that

$$(Z_{n},N_{n})\stackrel{\rm d}{\rightarrow}(Z,N),\qquad n\to\infty,$$

where $Z\sim\mathcal{N}(0,1)$, $N$ is a Poisson random measure with mean measure $\mu(x,\infty)=\operatorname{e}^{-x}$, $x\in\mathbb{R}$, and $Z$ and $N$ are independent.

Theorem 2.7 only requires the conditions of Theorems 2.1 and 2.2, which are both formulated under minimal assumptions; cf. [22]. Note that if ${\mathbb{E}}[X^{4}]=\infty$ and consequently $\operatorname{tr}({\mathbf{S}}^{2})$ does not converge to the normal distribution (see Theorem 2.4), our methods in the proof of Theorem 2.7 cease to work. Thus the joint limit behavior of $N_{n}$ and $\operatorname{tr}({\mathbf{S}}^{2})$ is still an open problem in the case of regularly varying $X$ with index $\alpha\in(2,4)$. In this case the dominating part of $n^{2}\operatorname{tr}({\mathbf{S}}^{2})$ is $\sum_{i=1}^{p}\sum_{t=1}^{n}X_{it}^{4}$ and hence a sum of iid random variables which are regularly varying with index $\alpha/4$. In [8] it is shown that this sum is not asymptotically independent of the maximum of the $X_{it}^{4}$. Therefore, we conjecture that $Z_{n}$ and $N_{n}$ will no longer be asymptotically independent.

As a consequence of Theorem 2.7 we obtain the asymptotic independence of $Z_{n}$ and a fixed number of upper order statistics of the random variables $(d_{p}(\sqrt{n}S_{ij}-d_{p}))_{1\leq i<j\leq p}$, which we denote by

$$G_{n,(1)}\geq G_{n,(2)}\geq\cdots\geq G_{n,(p(p-1)/2)}\,.$$

Corollary 2.8.

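Let $Z\sim\mathcal{N}(0,1)$ be independent of an iid sequence $(E_{i})_{i\geq 1}$ of standard exponentially distributed random variables and set $\Gamma_{i}:=E_{1}+\ldots+E_{i}$. Under the conditions of Theorem 2.7 and for fixed $k\geq 1$ it holds

$$\lim_{n\to\infty}{\mathbb{P}}(Z_{n}\leq y,G_{n,(1)}\leq x_{1},\ldots,G_{n,(k)}\leq x_{k})={\mathbb{P}}(Z\leq y)\,{\mathbb{P}}(-\log\Gamma_{1}\leq x_{1},\ldots,-\log\Gamma_{k}\leq x_{k}),$$

where $y,x_{1},\ldots,x_{k}\in\mathbb{R}$.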

Proof.

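Since $N_{n}(x,\infty)$ is the number of pairs $(i,j)$ with $1\leq i<j\leq p$ for which $d_{p}(\sqrt{n}S_{ij}-d_{p})\in(x,\infty)$, we get by Theorem 2.7

$$\begin{aligned}&{\mathbb{P}}(Z_{n}\leq y,G_{n,(1)}\leq x_{1},\ldots,G_{n,(k)}\leq x_{k})\\&\quad={\mathbb{P}}\Big(Z_{n}\leq y,N_{n}(x_{1},\infty)=0,N_{n}(x_{2},\infty)\leq 1,\ldots,N_{n}(x_{k},\infty)\leq k-1\Big)\\&\quad\to{\mathbb{P}}(Z\leq y)\,{\mathbb{P}}\Big(N(x_{1},\infty)=0,N(x_{2},\infty)\leq 1,\ldots,N(x_{k},\infty)\leq k-1\Big)\,,\qquad n\to\infty\,.\end{aligned}$$

In view of the representation $N\stackrel{{\scriptstyle\rm d}}{{=}}\sum_{i=1}^{\infty}\varepsilon_{-\log\Gamma_{i}}$ we obtain

$${\mathbb{P}}\Big(N(x_{1},\infty)=0,\ldots,N(x_{k},\infty)\leq k-1\Big)={\mathbb{P}}(-\log\Gamma_{1}\leq x_{1},\ldots,-\log\Gamma_{k}\leq x_{k}),$$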

which proves the corollary. ∎
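The translation between order statistics and counts used in this proof rests on a simple counting argument, spelled out here for convenience. The notation $P_{(j)}$ for the $j$-th largest point of $N$ is ours and does not appear in the paper.

```latex
\begin{align*}
\{P_{(j)}\le x\}
  &= \{\text{at most } j-1 \text{ points of } N \text{ exceed } x\}
   = \{N(x,\infty)\le j-1\}, \\
P_{(j)} &= -\log\Gamma_{j}
  \quad\text{since } \Gamma_{1}<\Gamma_{2}<\cdots \text{ almost surely}, \\
\mathbb{P}(-\log\Gamma_{1}\le x)
  &= \mathbb{P}(E_{1}\ge e^{-x}) = \exp(-e^{-x}), \qquad x\in\mathbb{R}.
\end{align*}
```

For $k=1$ the last line is the Gumbel distribution function, in agreement with the limit of the maximum listed in Table 1.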

2.3. Main challenges in the proof of Theorem 2.7

In this subsection we describe the key challenges in the proof of Theorem 2.7 and our novel technical contributions.

The distribution of a point process $N_{n}$ is determined by the family of the distributions of the finite-dimensional random vectors $(N_{n}(B_{1}),\ldots,N_{n}(B_{k}))$ for any choice of suitable Borel sets $B_{1},\ldots,B_{k}$ ; see [10, Proposition 6.2.III]. The collection of these distributions is called the finite-dimensional distributions of $N_{n}$ . Due to the dependence of the $(S_{ij})$ a direct analysis of the finite-dimensional distributions of $N_{n}$ is intractable. The same applies to the Laplace functional of $N_{n}$ which determines the distribution of $N_{n}$ completely and can be seen as a similar tool for a point process as the characteristic function for a real-valued random variable.

Fortunately, Kallenberg proved a sufficient condition for the weak convergence of a sequence of point processes $N_{n}$ towards $N$ , which is often much easier to verify than the convergence of the finite-dimensional distributions or the Laplace functionals. More precisely, he showed that if $N$ is a simple point process (such as (2.4)), then it is enough to ensure that ${\mathbb{E}}[N_{n}(I)]$ converges to ${\mathbb{E}}[N(I)]$ for any $I\in\mathcal{J}$ and that the probability of the event $\{N_{n}(U)=0\}$ converges to the probability of the event $\{N(U)=0\}$ for any $U\in\mathcal{U}$ ; see, for instance, [28, p. 35, Theorem 4.7] or [14, p. 233, Theorem 5.2.2]. We define $\mathcal{U}$ as the set of finite unions of intervals and $\mathcal{J}$ as the set of intervals in $\mathbb{R}$ .

Therefore, instead of showing the convergence of the random vector $(N_{n}(B_{1}),\ldots,N_{n}(B_{k}))$ for any $k\geq 1$ and $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$ , it is enough to prove the convergence of the probability of the occurrence of points in finite unions of intervals, which often greatly simplifies the proof. Our Theorem 2.9 below contains a similarly helpful tool, which will be essential for studying the joint asymptotic distribution of $Z_{n}$ and the point process $N_{n}$ .

Theorem 2.9 (Extension of Kallenberg’s Theorem).

Let $(Y_{n})_{n}$ be a sequence of real-valued random variables converging in distribution to a random variable $Y$. In addition, let $N$ be a simple point process on $\mathbb{R}$ independent of $Y$ and let $(N_{n})_{n}$ be a sequence of point processes. If the following two conditions

(K1) $\limsup\limits_{n\to\infty}{\mathbb{E}}[N_{n}(I)]\leq{\mathbb{E}}[N(I)]$, $I\in\mathcal{J}$,

(K2) $\lim\limits_{n\to\infty}{\mathbb{P}}(Y_{n}\leq y,\,N_{n}(U)=0)={\mathbb{P}}(Y\leq y){\mathbb{P}}(N(U)=0)$, $y\in\mathbb{R},\,U\in\mathcal{U}$,

hold, then $N_{n}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}N$ and $(N_{n})_{n}$ and $(Y_{n})_{n}$ are asymptotically independent.

Theorem 2.9 essentially shows that the sequence of random variables $Y_{n}$ and a sequence of point processes $N_{n}$ are asymptotically independent if the events $\{Y_{n}\leq y\}$ and $\{N_{n}(U)=0\}$ are asymptotically independent for any $y\in\mathbb{R}$ and $U\in\mathcal{U}$ . Since Theorem 2.9 requires mild assumptions on the real-valued random variables $Y_{n}$ and the point processes $N_{n}$ , it is applicable to a wide variety of other settings. Moreover, as seen in Corollary 2.8, it also yields asymptotic independence of the points of $N_{n}$ and $Y_{n}$ .

In Theorem 2.7 the joint convergence of the point process $N_{n}$ of the entries of the sample covariance matrix ${\mathbf{S}}$ and the standardized Frobenius norm $Z_{n}$ of ${\mathbf{S}}$ is considered, which to the best of our knowledge has not previously been studied. We would like to mention that results about the joint convergence of a point process of dependent points and their sum are only available in the Gaussian case [26, 35, 42], whose techniques are not applicable to non-Gaussian sequences.

In view of Definition 2.6, the main challenge in our case is to show the convergence in distribution of the random vectors $(Z_{n},N_{n}(B_{1}),\ldots,N_{n}(B_{k}))$ for any $k\geq 1$ and $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$ . Note that every summand of $Z_{n}$ is dependent on a lot of points of $N_{n}$ . To overcome this challenge we developed a novel technical tool Theorem 2.9, which allows us to reduce the convergence in distribution of the random vector above to the following two conditions:

(K1’) $\limsup\limits_{n\to\infty}{\mathbb{E}}[N_{n}(I)]={\mathbb{E}}[N(I)]$ for any interval $I$ ,

(K2’) $\lim\limits_{n\to\infty}{\mathbb{P}}(Z_{n}\leq y,N_{n}(U)=0)=\Phi(y){\mathbb{P}}(N(U)=0)$ for any finite union of intervals $U$ .

Condition (K1’) can be shown through normal approximation to large deviation probabilities. The challenging part is condition (K2’), which controls the dependence between $Z_{n}$ and $N_{n}$ . The advantage of considering ${\mathbb{P}}(Z_{n}\leq y,N_{n}(U)=0)$ , respectively ${\mathbb{P}}(Z_{n}\leq y,N_{n}(U)\neq 0)$ , instead of probabilities for the vector $(Z_{n},N_{n}(B_{1}),\ldots,N_{n}(B_{k}))$ is that we can write

[TABLE]

where $B_{I}=\{d_{p}(\sqrt{n}S_{ij}-d_{p})\in U\}$ for $I=(i,j)\in\Lambda_{n}=\{(i,j):1\leq i<j\leq p\}$ . Then the Bonferroni bounds yield for $k\geq 1$

[TABLE]

where $W_{n,d}:=\sum_{I_{1}<\ldots<I_{d}}{\mathbb{P}}\big{(}Z_{n}\leq y,\bigcap_{\ell=1}^{d}B_{I_{\ell}}\big{)}$ is a sum of probabilities of the intersections of finitely many $B_{I}$ ’s. The number of summands of $Z_{n}$ , which are dependent on $B_{I_{1}},\ldots,B_{I_{d}}$ , is of order $p$ . We identify these dependent summands $Z_{n,d}$ and show that they are negligible, i.e.,

[TABLE]

The remaining summands $Z_{n}-Z_{n,d}$ are independent from $B_{I_{1}},\ldots,B_{I_{d}}$ and therefore

[TABLE]

By first letting $n\to\infty$ and then $k\to\infty$ , we can show that

[TABLE]

from which we deduce (K2’). The detailed proof of Theorem 2.7 will be presented in Section 3.4.
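The Bonferroni bounds used in this outline can be checked on a toy example. The following Python sketch is only an illustration on a small finite probability space (the events, the sample space and the function names are hypothetical and not taken from the paper): the alternating partial sums of the intersection probabilities bound the probability of a union from above for odd truncation order and from below for even order.

```python
from fractions import Fraction
from itertools import combinations


def union_probability(events, prob):
    """P(B_1 ∪ ... ∪ B_m) for events given as sets of outcomes."""
    covered = set().union(*events)
    return sum(prob[w] for w in covered)


def bonferroni_partial_sums(events, prob):
    """Return [sum_{d=1}^{k} (-1)^(d-1) S_d for k = 1..m], where S_d is the
    sum of P(intersection) over all d-element subfamilies of the events."""
    partial, total = [], Fraction(0)
    for d in range(1, len(events) + 1):
        s_d = sum(
            sum(prob[w] for w in set.intersection(*combo))
            for combo in combinations(events, d)
        )
        total += (-1) ** (d - 1) * s_d
        partial.append(total)
    return partial


# Toy data (an assumption for illustration): uniform law on {0, ..., 11}
# and three overlapping events.
prob = {w: Fraction(1, 12) for w in range(12)}
events = [set(range(0, 6)), set(range(4, 9)), set(range(5, 12, 2))]

exact = union_probability(events, prob)
for k, bound in enumerate(bonferroni_partial_sums(events, prob), start=1):
    side = "upper" if k % 2 == 1 else "lower"
    print(f"k={k}: {side} bound {bound}, exact {exact}")
```

In the proof the same inequalities are applied to the events $\{Z_{n}\leq y\}\cap B_{I}$, so that only intersections of finitely many $B_{I}$'s have to be controlled.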

2.4. An application to independence testing

Consider a sample $\mathbf{y}_{1},\ldots,\mathbf{y}_{n}$ from the $p$ -dimensional population ${\mathbf{\Sigma}}^{1/2}\mathbf{x}$ , where ${\mathbf{\Sigma}}$ is an (unknown) non-random positive definite $p\times p$ matrix and $\mathbf{x}$ has iid components with mean zero and variance one. The largest off-diagonal entry of the sample covariance matrix ${\mathbf{S}}=(S_{ij})=\frac{1}{n}\sum_{t=1}^{n}\mathbf{y}_{t}\mathbf{y}_{t}^{\top}$ is a popular statistic for structural tests about properties of ${\mathbf{\Sigma}}$ ; we refer to the review paper [5] for an extensive summary and detailed references. We are interested in the null hypothesis of independence $H_{0}$ : $\,{\mathbf{\Sigma}}={\bf I}_{p}$ . In what follows we will present a rather simplistic extension of the classical maximum-type tests as an application of Theorem 2.7. This application is by no means perfect; it rather serves as an illustration of the potential of the asymptotic independence derived in Theorem 2.7 regarding statistical tests. (Footnote: A thorough analysis will be the topic of future research by the authors.)

Under the null hypothesis Theorem 2.7 studies the joint limiting distribution of $(Z_{n},N_{n})$ , where $Z_{n}$ is given in (2.5) and

[TABLE]

From (2.9) recall the definition of $G_{n,(1)}\geq\cdots\geq G_{n,(p(p-1)/2)}$ . Assuming the conditions of Theorem 2.7, the asymptotic independence of $Z_{n}$ and $N_{n}$ implies for fixed $k\geq 1$ (cf. Corollary 2.8) that

[TABLE]

where $Z\sim\mathcal{N}(0,1)$ is independent of the iid sequence $(E_{i})_{i\geq 1}$ of standard exponentially distributed random variables and $\Gamma_{i}:=E_{1}+\ldots+E_{i}$ . Next we introduce a variety of different statistics:

[TABLE]

For the first statistic it holds that $T_{1}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}-\log\Gamma_{1}$ , which is standard Gumbel distributed with distribution function $\Lambda(x)=\exp(-\operatorname{e}^{-x})$ . Recall the well-known fact that

[TABLE]

where the right-hand vector consists of the order statistics of $k$ iid uniform random variables on $[0,1]$ . In combination with (2.11) we get for the other statistics, as $n\to\infty$ ,

[TABLE]
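Two distributional facts behind these limits can be checked by simulation. The sketch below assumes that the well-known fact recalled above is the representation of $(\Gamma_{1}/\Gamma_{k+1},\ldots,\Gamma_{k}/\Gamma_{k+1})$ as the order statistics of $k$ iid uniform variables (the display itself is not reproduced here), and it uses that $-\log\Gamma_{1}$ has distribution function $\Lambda(x)=\exp(-\operatorname{e}^{-x})$. Function names, the seed and the sample sizes are illustrative choices.

```python
import math
import random


def gumbel_cdf(x):
    """Standard Gumbel distribution function Lambda(x) = exp(-exp(-x))."""
    return math.exp(-math.exp(-x))


def simulate_gamma_points(k, rng):
    """Partial sums Gamma_1 < ... < Gamma_{k+1} of iid Exp(1) variables."""
    total, sums = 0.0, []
    for _ in range(k + 1):
        total += rng.expovariate(1.0)
        sums.append(total)
    return sums


def monte_carlo_check(k=3, reps=50_000, x=0.5, seed=0):
    """Return the empirical value of P(-log Gamma_1 <= x) and the empirical
    means of Gamma_i / Gamma_{k+1}, i = 1..k."""
    rng = random.Random(seed)
    below_x = 0
    ratio_sums = [0.0] * k
    for _ in range(reps):
        g = simulate_gamma_points(k, rng)
        if -math.log(g[0]) <= x:
            below_x += 1
        for i in range(k):
            ratio_sums[i] += g[i] / g[k]
    return below_x / reps, [s / reps for s in ratio_sums]


empirical_cdf, mean_ratios = monte_carlo_check()
print("P(-log Gamma_1 <= 0.5):", empirical_cdf, "vs", gumbel_cdf(0.5))
# The i-th uniform order statistic out of k has mean i / (k + 1).
print("mean ratios:", mean_ratios, "vs", [i / 4 for i in (1, 2, 3)])
```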

Next we construct the random variables

[TABLE]

where $F_{2,k}$ , $F_{3,k}$ , $F_{4,k}$ are the distribution functions of $\log(U_{(1)}/U_{(k)})$ , $\max_{1\leq i\leq k-1}\log(U_{(k-i)}/U_{(k-i+1)})$ and $\sum_{i=1}^{k-1}(\log(U_{(k-i)}/U_{(k-i+1)}))^{2}$ , respectively, and $\Phi$ denotes the standard normal distribution function. By (2.11), $P_{Z_{n}}$ is asymptotically independent of $P_{T_{1}},P_{T_{2,k}},P_{T_{3,k}},P_{T_{4,k}}$ . Additionally, each of these random variables converges to the uniform distribution on $[0,1]$ .
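The uniform limits rest on the probability integral transform: applying a continuous distribution function to a random variable with that distribution yields a uniform variable on $[0,1]$. A minimal Python sketch for the standard normal case follows; whether $P_{Z_{n}}$ is defined through $\Phi$ or through $1-\Phi$ is not assumed here, since both give a uniform limit, and the sample size and seed are arbitrary.

```python
import random
from statistics import NormalDist


def probability_integral_transform(sample, cdf):
    """Apply a distribution function to every observation."""
    return [cdf(v) for v in sample]


def empirical_cdf_at(values, u):
    """Fraction of the values that are at most u."""
    return sum(v <= u for v in values) / len(values)


rng = random.Random(1)
z_sample = [rng.gauss(0.0, 1.0) for _ in range(40_000)]
p_values = probability_integral_transform(z_sample, NormalDist().cdf)

# For a uniform variable the distribution function at u equals u.
for u in (0.1, 0.5, 0.9):
    print(u, empirical_cdf_at(p_values, u))
```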

We propose the following four test statistics

[TABLE]

The null hypothesis $H_{0}$ is rejected by test $i\in\{1,\ldots,4\}$ , whenever

[TABLE]

The next result establishes the asymptotic distribution of $\mathcal{T}_{i,n}$ from which we deduce that the tests in (2.13) have asymptotic level $\beta\in(0,1)$ .

Corollary 2.10.

Under the conditions of Theorem 2.7, it holds that

[TABLE]

where $U$ and $V$ are independent random variables uniformly distributed on $[0,1]$ .

Since ${\mathbb{P}}(\min\{U,V\}\leq x)=2x-x^{2}$ , $x\in[0,1]$ , it follows for $\beta\in(0,1)$

[TABLE]
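The level calculation can be made concrete by solving $2x-x^{2}=\beta$ on $[0,1]$, which gives $x=1-\sqrt{1-\beta}$. The Python sketch below assumes that a test statistic with limiting law $\min\{U,V\}$ rejects for small values; this orientation, like the function names, is an assumption made for illustration and should be compared with the rejection rule (2.13).

```python
import math


def min_uniform_cdf(x):
    """P(min{U, V} <= x) for independent U, V uniform on [0, 1], x in [0, 1]."""
    return 2.0 * x - x * x


def critical_value(beta):
    """Solve 2x - x^2 = beta on [0, 1], i.e. x = 1 - sqrt(1 - beta)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie in (0, 1)")
    return 1.0 - math.sqrt(1.0 - beta)


for beta in (0.01, 0.05, 0.10):
    c = critical_value(beta)
    print(f"beta={beta}: threshold {c:.5f}, attained level {min_uniform_cdf(c):.5f}")
```

For $\beta=0.05$ the threshold is roughly $0.0253$, slightly above the value $\beta/2$ that a Bonferroni correction for two tests would give.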

3. Proofs

To simplify the notation in the proofs we will write $a_{n}\sim b_{n}$ for real-valued sequences $a_{n}$ and $b_{n}$ if $\lim_{n\to\infty}a_{n}/b_{n}=1$ , $a_{n}\gg b_{n}$ if $\lim_{n\to\infty}b_{n}/a_{n}=0$ , $a_{n}\lesssim b_{n}$ if $\limsup_{n\to\infty}a_{n}/b_{n}=C^{\prime}$ for some constant $C^{\prime}\in[0,\infty)$ and $a_{n}\asymp b_{n}$ if $a_{n}\lesssim b_{n}$ and $b_{n}\lesssim a_{n}$ . Additionally, throughout the proofs $C$ denotes a positive constant which may vary from line to line. Unless explicitly stated otherwise, all limits are for $n\to\infty$ .

3.1. Proof of Theorem 2.2

First, notice that the magnitude of $\sigma_{n}$ , defined in (2.6), might be different in the cases ${\mathbb{E}}[X^{4}]\neq 1$ and ${\mathbb{E}}[X^{4}]=1$ . If ${\mathbb{E}}[X^{4}]=1$ , Markov’s inequality yields for $\varepsilon>0$ that

[TABLE]

such that $X^{2}=1$ almost surely, which due to ${\mathbb{E}}[X]=0$ is only possible if $X$ follows a symmetric Bernoulli distribution, i.e., ${\mathbb{P}}(X=-1)={\mathbb{P}}(X=1)=1/2$ . We will often consider the case ${\mathbb{E}}[X^{4}]=1$ separately in the course of this proof.
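The Bernoulli case can be verified with exact arithmetic. The sketch below computes the moments of the symmetric Bernoulli distribution and, for contrast, of a three-point distribution with mean zero and variance one (a hypothetical example, not from the paper), using the identity $\operatorname{Var}(X^{2})={\mathbb{E}}[X^{4}]-({\mathbb{E}}[X^{2}])^{2}$.

```python
from fractions import Fraction


def moment(dist, r):
    """r-th moment of a finite distribution given as {value: probability}."""
    return sum(p * x ** r for x, p in dist.items())


def variance_of_square(dist):
    """Var(X^2) = E[X^4] - (E[X^2])^2."""
    return moment(dist, 4) - moment(dist, 2) ** 2


symmetric_bernoulli = {-1: Fraction(1, 2), 1: Fraction(1, 2)}
# Illustrative comparison: mean 0, variance 1, but fourth moment 4.
three_point = {-2: Fraction(1, 8), 0: Fraction(3, 4), 2: Fraction(1, 8)}

for name, dist in (("Bernoulli", symmetric_bernoulli), ("three-point", three_point)):
    print(name, moment(dist, 1), moment(dist, 2), moment(dist, 4), variance_of_square(dist))
```

Only in the first case does $X^{2}$ have zero variance, which is the degenerate situation treated separately in the proof.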

To prove the statement of Theorem 2.2, we use a central limit theorem for martingale differences. As we need the existence of higher moments, we truncate the random variables $X_{it}$ for $i=1,\ldots,p$ and $t=1,\ldots,n$ in an appropriate way. Let $(\beta_{n})_{n}$ be a positive sequence which tends to zero and satisfies $\beta_{n}\gg({\mathbb{E}}[|X|^{4}\mathds{1}_{\{|X|>\beta_{n}(np)^{1/4}\}}])^{1/4}$ . Notice that such a sequence exists since we can choose a sequence $\beta_{n}^{\prime}$ which tends to zero sufficiently slowly such that $\beta_{n}^{\prime}(np)^{1/4}\to\infty$ . Now we set $\beta_{n}^{\prime\prime}:=({\mathbb{E}}[|X|^{4}\mathds{1}_{\{|X|>\beta_{n}^{\prime}(np)^{1/4}\}}])^{1/4}$ , which also tends to zero as $n\to\infty$ , and choose $\beta_{n}\gg\max\{\beta_{n}^{\prime},\beta_{n}^{\prime\prime}\}$ .

For this sequence $(\beta_{n})_{n}$ , we set

[TABLE]

Since ${\mathbb{E}}[X^{4}]<\infty$ an application of Lemma 4.1 with $s=4$ yields that it suffices to verify (2.5) with $Z_{n}$ replaced by

[TABLE]

It will be convenient to write $\overline{Z}_{n}$ as a sum of martingale differences. In what follows, the notation

[TABLE]

will be helpful. Noting that $\overline{T}_{ij}=\overline{T}_{ji}$ , we start by writing $\overline{Z}_{n}$ as

[TABLE]

Using ${\mathbb{E}}[\overline{T}_{ij}^{2}|\tilde{\mathbf{x}}_{i}]=\sum_{t=1}^{n}\sum_{u=1}^{n}\overline{X}_{it}\overline{X}_{iu}{\mathbb{E}}[\overline{X}_{jt}\overline{X}_{ju}]={\mathbb{E}}[\overline{T}_{i,i+1}^{2}|\tilde{\mathbf{x}}_{i}]$ for $j=i+1,\ldots,p$ we have

[TABLE]

Setting

[TABLE]

we get

[TABLE]

where $(M_{j})_{j\geq 1}$ is a martingale difference sequence with respect to the filtration $(\mathcal{F}_{j})_{j\geq 0}$ , where $\mathcal{F}_{j}$ is the sigma algebra generated by $\{\tilde{\mathbf{x}}_{1},\ldots,\tilde{\mathbf{x}}_{j}\}$ . Indeed, we have ${\mathbb{E}}[M_{j}|\mathcal{F}_{j-1}]=0$ .

By the Lindeberg-Feller theorem for martingales (see, for example, [13, Theorem 8.2.4, p. 344]), the convergence $\overline{Z}_{n}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}\mathcal{N}(0,1)$ is implied by the following two assertions:

(1) $A_{n}:=\frac{1}{n^{4}\sigma_{n}^{2}}\sum_{j=1}^{p}{\mathbb{E}}[M_{j}^{2}|\mathcal{F}_{j-1}]\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}1\,,\quad n\to\infty\,,$

(2) $\frac{1}{n^{8}\sigma_{n}^{4}}\sum_{j=1}^{p}{\mathbb{E}}[M_{j}^{4}|\mathcal{F}_{j-1}]\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}0\,,\quad n\to\infty\,.$

Proof of (1).

To prove (1) we will show ${\mathbb{E}}[A_{n}]\to 1$ and $\operatorname{Var}(A_{n})\to 0$ as $n\to\infty$ . Since $(M_{j})$ is a martingale difference sequence, we have ${\mathbb{E}}[A_{n}]=\operatorname{Var}(\overline{Z}_{n})$ and thus we get

[TABLE]

where we used

[TABLE]

and that $\operatorname{Cov}(\overline{T}_{ij}^{2},\overline{T}_{kl}^{2})=0$ if $\{i,j\}\cap\{k,l\}=\emptyset$ . In the case ${\mathbb{E}}[X^{4}]\neq 1$ we then obtain by Lemma 4.2 that

[TABLE]

since $\sigma_{n}^{2}\asymp p/n+(p/n)^{3}$ . If ${\mathbb{E}}[X^{4}]=1$ one can similarly check that ${\mathbb{E}}[A_{n}]\to 1$ .

Now we turn to $\operatorname{Var}(A_{n})$ . To this end, we need the moments of the truncated random variable $\overline{X}$ and note that

[TABLE]

Similarly we obtain

[TABLE]

and for higher moments we get

[TABLE]

From the definitions of $M_{j}$ and the sigma algebra $\mathcal{F}_{j-1}$ , we deduce that there exists some constant $C^{\prime}_{n}$ only depending on $n$ such that

[TABLE]

The second term can be written as

[TABLE]

where $K_{t_{1},t_{2},1}$ can be expressed as

[TABLE]

Notice that $K_{t_{1},t_{2},1}=O(1)$ as $n\to\infty$ since the summands are [math] if $\{t_{1},t_{2}\}\cap\{t_{3},t_{4}\}=\emptyset$ . Otherwise if $t_{3}\neq t_{4}$ we have ${\mathbb{E}}[\overline{X}_{2t_{3}}\overline{X}_{2t_{4}}]={\mathbb{E}}[\overline{X}]^{2}=o((np)^{-3/2})$ . The means are bounded in every case. With the third term of (3.24) we proceed similarly

[TABLE]

where $K_{t_{1},t_{2},2}$ can be written as

[TABLE]

As $n\to\infty$ it holds that $K_{t_{1},t_{2},2}=O(n+(np)^{1/2})$ because the expectation is zero if $\{t_{1},t_{2}\}\cap\{t_{3},t_{4}\}=\emptyset$ , bounded if $t_{3}\neq t_{4}$ and $C\,{\mathbb{E}}[\overline{X}^{6}]=o((np)^{1/2})$ if $t_{1}=t_{2}=t_{3}=t_{4}$ . Hence, we are able to write $A_{n}$ as

[TABLE]

where $C_{n}$ is a constant only depending on $n$ and on the distribution of $X$ and $K_{t_{1},t_{2}}=8(p-j)K_{t_{1},t_{2},1}+4K_{t_{1},t_{2},2}=O(n+p)$ . The expectation of the last term can be written as

[TABLE]

Using Lemma 4.3 we see that, for sequences $C_{n,1},C_{n,2}$ and $C_{n,3}$ tending to constants,

[TABLE]

In view of

[TABLE]

it suffices to show that the variances of $\xi_{n,1}$ , $\xi_{n,2}$ , $\xi_{n,3}$ , $\xi_{n,4}$ and $\xi_{n,5}$ tend to zero as $n\to\infty$ .

We start by bounding the variance of $\xi_{n,1}$ :

[TABLE]

Since the summands are zero if $|\{t_{1},t_{2},t_{3},t_{4}\}|=4$ , bounded by ${\mathbb{E}}[\overline{X}]$ , which tends to zero as $n\to\infty$ if $|\{t_{1},t_{2},t_{3},t_{4}\}|=3$ and bounded above by a constant else, we get

[TABLE]

For the variance of $\xi_{n,2}$ it holds that

[TABLE]

As the covariance above is equal to ${\mathbb{E}}[\overline{X}^{8}]-({\mathbb{E}}[{X}^{4}])^{2}=o(np)$ if $i_{1}=i_{2}=i_{3}$ , ${\mathbb{E}}[\overline{X}^{6}]{\mathbb{E}}[\overline{X}^{2}]-{\mathbb{E}}[\overline{X}^{4}]({\mathbb{E}}[\overline{X}^{2}])^{2}=o((np)^{1/2})$ if $|\{i_{1},i_{2},i_{3}\}|=2$ and bounded above by a constant if $|\{i_{1},i_{2},i_{3}\}|=3$ , we find

[TABLE]

By similar arguments, we obtain

[TABLE]

where the covariance above is ${\mathbb{E}}[\overline{X}^{6}]{\mathbb{E}}[\overline{X}^{2}]-{\mathbb{E}}[\overline{X}^{3}]^{2}{\mathbb{E}}[\overline{X}]^{2}=o((np)^{1/2})$ if $i_{1}=i_{2}=i_{3}=i_{4}$ and $t_{1}=t_{3}$ and $t_{2}=t_{4}$ , zero if $|\{i_{1},\ldots,i_{4}\}|=4$ or $|\{t_{1},\ldots,t_{4}\}|=4$ and bounded by a constant in the remaining cases. Therefore, we get using ${\mathbb{E}}[\overline{X}]^{2}=o((np)^{-3/2})$ that

[TABLE]

For the variance of $\xi_{n,4}$ we have

[TABLE]

Again the covariance above is zero, if $|\{i_{1},\ldots,i_{4}\}|=4$ or $|\{t_{1},\ldots,t_{4}\}|=4$ . If $|\{i_{1},\ldots,i_{4}\}|=3$ it is bounded by ${\mathbb{E}}[\overline{X}^{2}]^{2}{\mathbb{E}}[\overline{X}]^{4}-{\mathbb{E}}[\overline{X}]^{8}=o((np)^{-3})$ and by a constant in the remaining cases. It follows $\operatorname{Var}(\xi_{n,4})=O\big{(}\frac{p^{4}}{n^{5}\sigma_{n}^{4}}\big{)}$ .

Next, since $\operatorname{Var}(\overline{X}_{i_{1}t_{1}}\overline{X}_{i_{1}t_{2}}\overline{X}_{i_{2}t_{1}}\overline{X}_{i_{2}t_{3}})\lesssim 1$ for $|\{t_{1},\ldots,t_{4}\}|=3$ and ${\mathbb{E}}[\overline{X}]^{4}=o((np)^{-3})$ we obtain $\operatorname{Var}(\xi_{n,5})=o(p^{3}n^{-5}\sigma_{n}^{-4})$ .

Finally, we combine our variance estimates. In the case ${\mathbb{E}}[X^{4}]\neq 1$ , it holds $\sigma_{n}^{4}\asymp(p/n)^{2}+(p/n)^{6}$ which implies

[TABLE]

In the Bernoulli case ${\mathbb{E}}[X^{4}]=1$ , we have $\sigma_{n}^{4}\asymp(p/n)^{4}$ . As $K_{t_{1},t_{2},1}$ and $K_{t_{1},t_{2},2}$ are zero in this case, $\xi_{n,1}=0$ . Since ${\mathbb{E}}[\overline{X}]={\mathbb{E}}[X]=0$ and ${\mathbb{E}}[\overline{X}^{2}]^{2}={\mathbb{E}}[\overline{X}^{4}]=1$ one has $\xi_{n,2}=\xi_{n,3}=\xi_{n,5}=0$ . Repeating the above considerations for $\xi_{n,4}$ , one can show that $\operatorname{Var}(\xi_{n,4})\to 0$ as $n\to\infty$ , establishing (3.25) in the Bernoulli case as well.

Equation (3.25) concludes the proof of $\operatorname{Var}(A_{n})\to 0$ , as $n\to\infty$ . In combination with ${\mathbb{E}}[A_{n}]\to 1$ , this proves the desired $A_{n}\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}1$ .

Proof of (2).

We write

[TABLE]

and bound the fourth moment of $M_{j}$ in the following way

[TABLE]

First we take a look at

[TABLE]

For the last mean we have

[TABLE]

where we used (3.20) and (3.21) in the last line. Hence, we get after simplifying $M_{j4}$

[TABLE]

and by the Marcinkiewicz-Zygmund inequality (see, for example, [9, Theorem 2, p.386]) in combination with Hölder's inequality (see also [17, Lemma 2, p.24]), it follows

[TABLE]

To bound the fourth moment of $M_{j2}$ we consider the eighth moment of $\overline{T}_{12}$

[TABLE]

where we used (3.19), (3.20) and (3.21) in the last step. By the Marcinkiewicz-Zygmund inequality (see for instance [9, Theorem 2, p.386]) and Lemma 1 of [17] we conclude that

[TABLE]

Finally, it holds for the fourth moment of $M_{j5}$ that

[TABLE]

Observe that if there are more than five different indices $t_{1},\ldots,t_{8}$ then one factor in the mean above is independent from the rest and the mean is zero, thus by (3.19), (3.20) and (3.21)

[TABLE]

Consequently, we obtain

[TABLE]

since $\sigma_{n}^{4}\asymp(p/n)^{6}+(p/n)^{2}$ if ${\mathbb{E}}[X^{4}]\neq 1$ . In the Bernoulli case it holds that $M_{j1}=M_{j3}=M_{j4}=M_{j5}=0$ and ${\mathbb{E}}[M_{j2}^{4}]=(j-1)^{2}O(n^{4})$ . Therefore, $\frac{1}{n^{8}\sigma_{n}^{4}}\sum_{j=1}^{p}{\mathbb{E}}[M_{j}^{4}]=O(p^{3}/(n^{4}\sigma_{n}^{4}))=o(1)$ as $\sigma_{n}^{4}\asymp(p/n)^{4}$ , which finishes the proof.

3.2. Proof of Theorem 2.4

We split the modified trace of ${\mathbf{S}}^{2}$ into four terms

[TABLE]

As $X^{4}$ is regularly varying with index $\alpha/4$ and

[TABLE]

the first term converges to an $\alpha/4$ -stable distribution by [13, Theorem 3.8.2] (see also [39]). By [13, p. 164] this $\alpha/4$ -stable distribution has the characteristic function

[TABLE]

where $c_{\alpha}$ is a constant only depending on $\alpha$ .

Therefore, it suffices to show that $V_{n,1}$ , $V_{n,2}$ and $V_{n,3}$ tend to zero in probability. We will start by showing $V_{n,3}\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}0$ . As $X$ is regularly varying with index $\alpha>2$ , the second moment of $X$ exists by Proposition 1.3.2 of [33]. An application of Markov’s inequality yields for $\delta>0$

[TABLE]

where we used that $a_{np}=(np)^{1/\alpha}\ell_{1}(np)$ for $\alpha\in(2,4)$ (see [4]) and a slowly varying function $\ell_{1}$ . The last step is a consequence of the following property of slowly varying functions $\ell$ . By the Potter Bounds, which can be found in Theorem 1.5.6 of [4], it holds that for $x$ sufficiently large and any $\epsilon>0$ and $K>1$

[TABLE]

To show that $V_{n,1}\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}0$ we will truncate the random variables $X_{it}^{2}$ at $s_{n}:=p^{2/\alpha}n^{2(1+\epsilon)/\alpha}$ for a positive $\epsilon$ sufficiently small. Then we have for the event $Q:=\bigcup\limits_{i=1}^{p}\bigcup\limits_{t=1}^{n}\{X_{it}^{2}>s_{n}\}$ that

[TABLE]

where we used that $|X|$ is regularly varying with index $\alpha\in(2,4)$ . Letting $\overline{X}^{2}_{it}:=X_{it}^{2}\mathds{1}_{\{X_{it}^{2}\leq s_{n}\}}$ and $\eta_{it}:=\overline{X}_{it}^{2}-{\mathbb{E}}[\overline{X}_{it}^{2}]$ , we get for any $\delta>0$

[TABLE]

where for the last line we used Markov’s inequality and the fact that

[TABLE]

since ${\mathbb{E}}[X^{2}\mathds{1}_{\{X^{2}>s_{n}\}}]\asymp s_{n}{\mathbb{P}}(X^{2}>s_{n})$ by Karamata’s Theorem (see [4]) and due to the Potter bounds (3.26).
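The Karamata-type relation can be made explicit in the simplest regularly varying case. The sketch below assumes an exact Pareto tail ${\mathbb{P}}(X>x)=x^{-\alpha}$ for $x\geq 1$ (an illustrative special case, not the general setting of the paper, with no slowly varying factor). After the substitution $x=t/v$ the truncated second moment equals $\alpha\,t^{2-\alpha}\int_{0}^{1}v^{\alpha-3}\,dv$, so its ratio to $s\,{\mathbb{P}}(X^{2}>s)$ at $s=t^{2}$ is the constant $\alpha/(\alpha-2)$.

```python
def truncated_second_moment(alpha, t, steps=20_000):
    """Approximate E[X^2 1{X > t}] for P(X > x) = x^(-alpha), x >= 1, t >= 1.

    The substitution x = t / v gives alpha * t^(2 - alpha) * int_0^1 v^(alpha - 3) dv,
    and the remaining integral is evaluated by the midpoint rule.
    """
    h = 1.0 / steps
    integral = sum(((j + 0.5) * h) ** (alpha - 3.0) for j in range(steps)) * h
    return alpha * t ** (2.0 - alpha) * integral


def level_times_tail(alpha, t):
    """s * P(X^2 > s) evaluated at s = t^2, which equals t^(2 - alpha)."""
    return t ** (2.0 - alpha)


alpha = 3.5  # any index in (2, 4); values near 2 make the midpoint rule less accurate
for t in (2.0, 10.0, 100.0):
    ratio = truncated_second_moment(alpha, t) / level_times_tail(alpha, t)
    print(t, ratio, alpha / (alpha - 2.0))
```

Both quantities decay like $t^{2-\alpha}$, which is the content of the relation $\asymp$ used above.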

Regarding the first term in (3.29), we obtain that

[TABLE]

since ${\mathbb{E}}[X^{4}\mathds{1}_{\{X^{2}\leq s_{n}\}}]\asymp s_{n}^{2}{\mathbb{P}}(X^{2}>s_{n})$ by Karamata’s theorem. Using Karamata’s theorem again and similar arguments as above, also the second term of (3.29) tends to zero as $n\to\infty$ . Therefore, $V_{n,1}\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}0$ as $n\to\infty$ .

Notice that $V_{n,2}$ is equal to $V_{n,1}$ if we exchange the roles of $n$ and $p$ , which does not matter for the proof given above. Hence, it also holds that $V_{n,2}\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}0$ as $n\to\infty$ .

3.3. Proof of Theorem 2.9

Let now $y\in\mathbb{R}$ and $l_{1},\ldots,l_{k}\in\mathbb{N}_{0}$ . Let $(Q_{n})_{n}$ be a sequence of probability measures defined by the distribution functions

[TABLE]

where $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$ . We recall that for a point process $\xi$ , $\mathcal{B}_{\xi}:=\{B\;\text{bounded Borel set}:\xi(\partial B)=0\}$ . As $\mathbb{R}^{k+1}$ is a Polish space $(Q_{n})_{n}$ is tight. Then, by Prokhorov’s Theorem, $(Q_{n})_{n}$ is relatively compact with respect to convergence in distribution and hence, for every sequence $(n_{m})_{m\in\mathbb{N}}$ in $\mathbb{N}$ there exists a subsequence $(n_{m_{j}})_{j\in\mathbb{N}}$ with

[TABLE]

where $\tilde{Y},\tilde{N}(B_{1}),\ldots,\tilde{N}(B_{k})$ are real valued random variables. Since the sets $B_{1},\ldots,B_{k}$ are arbitrary, we can also find a subsequence $(n_{m_{j}})_{j\in\mathbb{N}}$ such that (3.30) holds for any choice of $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$ . Let $\tilde{N}$ be the point process defined by the random vectors $(\tilde{N}(B_{1}),\ldots,\tilde{N}(B_{k}))$ for $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$ .

Assumption (K2) implies for every $U\in\mathcal{U}$

[TABLE]

Therefore, we get by Lemma 4.6 of [28] that $\mathcal{B}_{N}\subset\mathcal{B}_{\tilde{N}}$ and hence $\mathcal{U}\subset\mathcal{B}_{\tilde{N}}$ . Then, we get

[TABLE]

for every $U\in\mathcal{U}$ . Let $\mathfrak{R}$ be the set of locally finite measures $\mu$ on $(\mathbb{R},\mathcal{B})$ , where $\mathcal{B}$ consists of all bounded Borel sets with $\mu(B)\in\mathbb{N}_{0}$ for all $B\in\mathcal{B}$ . Additionally, $\mathcal{N}$ is the $\sigma$ -algebra on $\mathfrak{R}$ that is generated by the mappings $\mu\mapsto\mu(B),\,B\in\mathcal{B}$ , i.e., the smallest $\sigma$ -algebra making these mappings measurable.

We now introduce the Dynkin-system

[TABLE]

As $Y_{n}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}Y$ , we have ${\mathbb{P}}(Y\leq y)={\mathbb{P}}(\tilde{Y}\leq y)$ and therefore, $\mathfrak{R}\in\mathcal{D}$ . Moreover, $\mathcal{D}$ is closed under proper differences and monotone limits. Let

[TABLE]

By assumption (K2), $\mathcal{C}\subset\mathcal{D}$ and $\mathcal{C}$ is closed under finite intersection. Therefore, by 15.2.1 of [28] it follows that $\sigma(\mathcal{C})\subset\mathcal{D}$ . By Lemmas 2.2, 1.3 and 1.4 of [28] it holds that $\varphi:\mu\to\mu^{*}$ is measurable $\sigma(\mathcal{C})\to\mathcal{N}$ , where $\mu^{*}(B)=\sum_{s\in B}\mathds{1}_{[1,\infty)}(\mu\{s\})$ for every $B\in\mathcal{B}$ . As $\sigma(\mathcal{C})\subset\mathcal{D}$ we get for every $M\in\mathcal{N}$

[TABLE]

A simple point process $\mu$ can be written as

[TABLE]

where $I$ is an index set and the $X_{i}$ ’s are random elements. Therefore, it holds for every $B\in\mathcal{B}$ that

[TABLE]

As $N$ is simple, we get

[TABLE]

for every $M\in\mathcal{N}$ and every $y\in\mathbb{R}$ . Therefore, we also get ${\mathbb{P}}(N\in M)={\mathbb{P}}(\tilde{N}^{*}\in M)$ for every $M\in\mathcal{N}$ . We define the set of measures

[TABLE]

Then, it follows that

[TABLE]

and as $B_{1},\ldots,B_{k},l_{1},\ldots,l_{k}$ were chosen arbitrarily, we have $N\stackrel{{\scriptstyle d}}{{=}}\tilde{N}^{*}$ . Additionally, for $I\in\mathcal{J}$ it holds that as $j\to\infty$

[TABLE]

Then, by $N\stackrel{{\scriptstyle d}}{{=}}\tilde{N}^{*}$ , the definition of $\tilde{N}^{*}$ , (3.32), 15.4.3 of [28] and assumption (K1) we get

[TABLE]

Therefore, $\tilde{N}$ is a.s. simple and consequently

[TABLE]

for every $M\in\mathcal{N}$ . Inserting $\hat{M}$ , we get

[TABLE]

As the subsequence $(n_{m})_{m}$ was arbitrary, we conclude for every $l_{1},\ldots,l_{k}\in\mathbb{N}_{0}$ and $B_{1},\ldots,B_{k}\in\mathcal{B}_{N}$

[TABLE]

and therefore $(Y_{n})_{n}$ and $(N_{n})_{n}$ are asymptotically independent.

3.4. Proof of Theorem 2.7

By the proof of Theorem 2.1 we know that $\lim\limits_{n\to\infty}{\mathbb{E}}[N_{n}(I)]={\mathbb{E}}[N(I)]$ for every interval $I$ . In view of Theorem 2.9, it suffices to show

[TABLE]

for every $y\in\mathbb{R}$ and $U\in\mathcal{U}$ , where $\Phi$ is the distribution function of the standard normal distribution and $\mathcal{U}$ is the set of finite unions of intervals. As $Z_{n}\stackrel{{\scriptstyle\rm d}}{{\rightarrow}}\mathcal{N}(0,1)$ , this is equivalent to

[TABLE]

for every $U\in\mathcal{U}$ and $y\in\mathbb{R}$ . Throughout this proof, we let $U\in\mathcal{U}$ and $y\in\mathbb{R}$ be arbitrary.

Recall that $p=O\big{(}n^{(s-2)/4}\big{)}$ for some $s\geq 4$ and that $\varepsilon>0$ is such that ${\mathbb{E}}[|X|^{s+\varepsilon}]<\infty$ . We need the following notation. For $\tilde{s}=s+\varepsilon$ let $(\beta_{n})_{n}$ be a positive sequence, which tends to zero and satisfies $\beta_{n}\gg({\mathbb{E}}[|X|^{\tilde{s}}\mathds{1}_{\{|X|>\beta_{n}(np)^{1/\tilde{s}}\}}])^{1/\tilde{s}}$ . Such a sequence exists by similar reasons as in the proof of Theorem 2.2. We set

[TABLE]

By Lemma 4.4, we have for $\delta>0$

[TABLE]

where

[TABLE]

Lemma 4.1 asserts that $\lim_{n\to\infty}{\mathbb{P}}(|Z_{n}-\overline{Z}_{n}|>\delta)=0$ for every $\delta>0$ .

Assume for the moment that

[TABLE]

In conjunction with (3.34), this yields for $\delta>0$

[TABLE]

so that taking the limit $\delta\to 0$ establishes (3.33) by the continuity of the normal distribution. Therefore, it remains to show (3.36) for which we proceed similarly to [17].

To this end, we set $A_{n}=A_{n}(y)=\{\overline{Z}_{n}\leq y\}$ and $B_{I}=B_{I}(U)=\{d_{p}(\sqrt{n}S_{ij}-d_{p})\in U\}$ , where $I=(i,j)\in\Lambda_{n}:=\{(i,j):\,1\leq i<j\leq p\}$ . For $I_{1}=(i_{1},j_{1})\in\Lambda_{n}$ and $I_{2}=(i_{2},j_{2})\in\Lambda_{n}$ we write $I_{1}<I_{2}$ if $i_{1}<i_{2}$ or ( $i_{1}=i_{2}$ and $j_{1}<j_{2}$ ). Then we have

[TABLE]

where $A_{n}B_{I}=A_{n}\cap B_{I}$ . Using $Z_{n}-\overline{Z}_{n}\stackrel{{\scriptstyle{\mathbb{P}}}}{{\rightarrow}}0$ , Theorem 2.2 and Theorem 2.1, it holds

[TABLE]

so that equation (3.36) follows from

[TABLE]

To prove (3.38), we start by setting

[TABLE]

Then the Bonferroni bounds yield for $k\geq 1$

[TABLE]

By (3.39), we have for $k\geq 1$

[TABLE]

From [23, p. 555] we know that

[TABLE]

Note that $\lim_{k\to\infty}\frac{(\mu(U))^{k}}{k!}=0$ . By first letting $n\to\infty$ and then $k\to\infty$ in (3.40) we now obtain (3.38) provided that

[TABLE]
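The order of the two limits can be illustrated numerically. Assuming, as the remark on $\mu(U)^{k}/k!$ suggests, that the $d$-th Bonferroni term has a limit proportional to $\mu(U)^{d}/d!$ (the precise statement is in the display taken from [23] and is not restated here), the truncated alternating sums bracket $\operatorname{e}^{-\mu(U)}$ and the gap between consecutive truncations is $\mu(U)^{k+1}/(k+1)!$. The function names and the value of $\mu$ in the sketch are illustrative.

```python
import math


def alternating_partial_sum(mu, k):
    """sum_{d=0}^{k} (-mu)^d / d!, the order-k truncation of exp(-mu)."""
    return sum((-mu) ** d / math.factorial(d) for d in range(k + 1))


def bracket(mu, k):
    """Lower and upper bounds for exp(-mu) from truncation orders k and k + 1."""
    a = alternating_partial_sum(mu, k)
    b = alternating_partial_sum(mu, k + 1)
    return min(a, b), max(a, b)


mu = 2.5  # stands in for mu(U); any positive value works
for k in (1, 2, 5, 10, 20):
    lower, upper = bracket(mu, k)
    print(k, lower, math.exp(-mu), upper, upper - lower)
```

For fixed $k$ the bounds do not coincide, which is why $k\to\infty$ is taken only after $n\to\infty$.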

Proof of (3.42). For fixed $I_{1}<\ldots<I_{d}\in\Lambda_{n}$ with $I_{l}=(i_{l},j_{l})$ for $l=1,\ldots,d$ we will identify the summands of $\overline{Z}_{n}$ in (3.35) that are dependent on $B_{I_{1}},\ldots,B_{I_{d}}$ and show that their contribution is negligible (in a suitable way). Therefore, we introduce the set $\Lambda_{n,d}=\Lambda_{n,d}(I_{1},\ldots,I_{d})$ through

[TABLE]

The set $\Lambda_{n,d}$ includes the indices $(i,j)$ of all summands of $\overline{Z}_{n}$ that are dependent on $B_{I_{1}},\ldots,B_{I_{d}}$ . Notice that $\Lambda_{n,d}$ is not a subset of $\Lambda_{n}$ because $\Lambda_{n,d}$ might also contain indices $(i,i)$ corresponding to diagonal elements $S_{ii}$ of the covariance matrix. For our further arguments the following bound on the cardinality of $\Lambda_{n,d}$ is important:

[TABLE]
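A small enumeration shows why the number of dependent summands is of order $p$ rather than $p^{2}$. The sketch below assumes that $\Lambda_{n,d}$ consists of all pairs $(i,j)$ with $i\leq j$ that share at least one index with $I_{1},\ldots,I_{d}$; this reading of the definition, the bound $2dp$ that is checked, and the function names are assumptions for illustration and need not coincide with the display above. It also uses that Python compares tuples lexicographically, which matches the order $I_{1}<I_{2}$ on $\Lambda_{n}$.

```python
from itertools import combinations_with_replacement


def index_pairs(p):
    """Lambda_n: all pairs (i, j) with 1 <= i < j <= p, in lexicographic order."""
    return [(i, j) for i in range(1, p + 1) for j in range(i + 1, p + 1)]


def dependent_pairs(p, chosen):
    """Pairs (i, j), i <= j, sharing at least one index with the chosen pairs
    (an assumed reading of Lambda_{n,d}; diagonal pairs (i, i) are included)."""
    touched = {idx for pair in chosen for idx in pair}
    return [
        (i, j)
        for i, j in combinations_with_replacement(range(1, p + 1), 2)
        if i in touched or j in touched
    ]


p = 50
chosen = [(1, 2), (1, 7), (3, 9)]  # I_1 < I_2 < I_3 in lexicographic order
dependent = dependent_pairs(p, chosen)
print(len(index_pairs(p)), len(dependent), 2 * len(chosen) * p)
```

With $m$ distinct indices among the chosen pairs the count is $m(2p-m+1)/2\leq mp\leq 2dp$, so it grows linearly in $p$ for fixed $d$.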

By the definition of $\Lambda_{n,d}$ , $\overline{Z}_{n}-\overline{Z}_{n,d}$ is independent of $B_{I_{1}},\ldots,B_{I_{d}}$ , where

[TABLE]

Using this independence and applying Lemma 4.4 twice we get for $\delta>0$ ,

[TABLE]

and similarly,

[TABLE]

Therefore, it follows that

[TABLE]

If we assume

[TABLE]

we get using Theorem 2.2 and (3.41) that

[TABLE]

Sending $\delta$ to zero establishes (3.42). Therefore it remains to show (3.43).

By Markov’s inequality we obtain for even $\tau\in\mathbb{N}$ and $\delta>0$ ,

[TABLE]

Letting $\mathcal{K}:=\{i_{l},j_{l}\,|\,l=1,\ldots,d\}$ we write the first term on the right-hand side of (3.44) as follows

[TABLE]

where we used the Marcinkiewicz-Zygmund inequality (see [9, Theorem 2, p. 386]) and the inequality $(a+b)^{c}\leq 2^{c-1}(a^{c}+b^{c})$ for $a>0$ , $b>0$ and $c\geq 1$ . We apply the law of total expectation and the Marcinkiewicz-Zygmund inequality to the first term of (3.45) and obtain

[TABLE]

where $K_{\tau}$ is a constant only depending on $\tau$ , which may vary from line to line. Recalling the notation $\overline{T}_{ij}=\sum_{t=1}^{n}\overline{X}_{it}\overline{X}_{jt}$ we may write

[TABLE]

If more than $\tau+1$ of the indices $t_{1},u_{1},\ldots,\,t_{\tau},u_{\tau}$ are different, then there exists a tuple $(t_{k},u_{k})$ such that $t_{k}\neq t_{l}$ , $t_{k}\neq u_{l}$ , $u_{k}\neq t_{l}$ and $u_{k}\neq u_{l}$ for every $l\neq k$ and therefore one of the factors in the mean of (3.46) is independent, so that the summand disappears. The remaining summands are bounded above by

[TABLE]

Similar to (3.19), (3.20) and (3.21) it holds that ${\mathbb{E}}[\overline{X}]=o((np)^{-(\tilde{s}-1)/\tilde{s}})$ , ${\mathbb{E}}[\overline{X}^{r}]\leq C$ for $r\leq\tilde{s}$ and ${\mathbb{E}}[\overline{X}^{r}]=o((np)^{(r-\tilde{s})/\tilde{s}})$ for $r>\tilde{s}$ . From this we deduce that if $|\{t_{1},u_{1},\ldots,t_{\tau},u_{\tau}\}|=\ell$ one has

[TABLE]

Therefore, we get for the first term of (3.45)

[TABLE]

The term in the maximum is either increasing or decreasing with $\ell$ or it is equal to $o((np)^{(4\tau+4-2\tilde{s})/\tilde{s}})$ if $n\asymp(np)^{4/\tilde{s}}$ . Therefore, the right-hand side in (3.49) is

[TABLE]

where we used that $p=O(n^{(s-2)/4})$ and $(s+2)/\tilde{s}>1$ for an $\varepsilon<1$ , and additionally, $\sigma_{n}^{2}\asymp(p/n)^{3}+p/n$ if ${\mathbb{E}}[X^{4}]\neq 1$ . In the Bernoulli case ${\mathbb{E}}[X^{4}]=1$ , (3.47) is bounded by a constant and therefore the first term of (3.45) is $O(n/p^{\tau/2})$ as $\sigma_{n}^{2}\asymp(p/n)^{2}$ .

By Jensen’s inequality the second term of (3.45) is bounded above by

[TABLE]

where $K^{\prime}_{\tau}$ is a constant only depending on $\tau$ , which may vary from line to line. For the mean in the sum we can write

[TABLE]

We consider one of the summands above. There can be $0\leq q\leq\tau$ pairs of indices $(t_{i},u_{i})$ with $t_{i}=u_{i}$ . Assume these pairs are $(t_{1},u_{1}),\ldots,(t_{q},u_{q})$ and for $i>q$ it holds that $t_{i}\neq u_{i}$ . Then the mean in the sum equals

[TABLE]

If there are more than $\tau-q/2+1$ different indices, the summand is equal to zero. For $1\leq\ell\leq\tau-q/2+1$ different indices the summand is bounded above by

[TABLE]

Therefore, we get for the second term of (3.45)

[TABLE]

The term in the maximum is either increasing or decreasing with $\ell$ or it is equal to $o((np)^{(2\tau+2-2(\tilde{s}-1)(\tau-q))/\tilde{s}})$ if $n\asymp(np)^{2/\tilde{s}}$. Therefore, the expression in (3.51) is

[TABLE]

since in the first step both terms in the maximum are growing with $q$ , so that the terms are the largest for $q=\tau$ , and since in the second step the last term of the maximum is larger than the first term for every $p=O(n^{(s-2)/4})$ . As $\sigma_{n}^{2}\asymp p/n+(p/n)^{3}$ if ${\mathbb{E}}[X^{4}]\neq 1$ , we have

[TABLE]

In the Bernoulli case the second term of (3.45) is equal to zero.

For the second term of (3.44) it holds that

[TABLE]

where

[TABLE]

Again the summands with more than $\tau+1$ indices disappear and a summand with $\ell\leq\tau+1$ indices is bounded by

[TABLE]

by similar arguments as for (3.48), so that we obtain

[TABLE]

The term in the maximum is either increasing or decreasing with $\ell$ or it is equal to $o((np)^{(4\tau+4-\tilde{s})/\tilde{s}})$ if $n\asymp(np)^{4/\tilde{s}}$. We deduce

[TABLE]

where we used that $(s+2)/\tilde{s}>1$ for an $\varepsilon<1$ and $\sigma_{n}^{2}\asymp p/n+(p/n)^{3}$ , if ${\mathbb{E}}[X^{4}]\neq 1$ . In the Bernoulli case the second term of (3.44) is zero.

In summary we derive by (3.50), (3.52), (3.55) and noting that all estimates are uniform in $I_{1},\ldots,I_{d}$ that

[TABLE]

where $\tau$ can be chosen as the smallest even integer larger than

[TABLE]

Since the number of possible choices for $I_{1}<\ldots<I_{d}$ is $\binom{p(p-1)/2}{d}\leq p^{2d}$ and $p=O(n^{(s-2)/4})$ we conclude that

[TABLE]

as $n\to\infty$ , which finishes the proof of (3.43).
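The counting bound used in this last step can be verified directly. Writing $N=p(p-1)/2$ for the number of index pairs (a shorthand introduced only for this remark), one has

$$\binom{N}{d}=\frac{N(N-1)\cdots(N-d+1)}{d!}\leq N^{d}\leq\Big(\frac{p^{2}}{2}\Big)^{d}\leq p^{2d}.$$

The bound is crude, but it suffices here because the power of $p$ it produces is absorbed once $\tau$ is chosen large enough.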

4. Auxiliary results

Throughout this section $p=p_{n}$ is a sequence of positive integers tending to infinity as $n\to\infty$ . Furthermore, let $X,(X_{it})_{i,t\geq 1}$ be iid random variables with ${\mathbb{E}}[X]=0$ and ${\mathbb{E}}[X^{2}]=1$ .

Lemma 4.1.

Assume ${\mathbb{E}}[|X|^{s}]<\infty$ for some $s\geq 4$ . For a positive sequence $(\beta_{n})_{n}$ , which tends to zero and satisfies $\beta_{n}\gg({\mathbb{E}}[|X|^{s}\mathds{1}_{\{|X|>\beta_{n}(np)^{1/s}\}}])^{1/s}$ , set

[TABLE]

Then it holds that

[TABLE]

where $\sigma_{n}$ is defined in (2.6).

Proof.

We set

[TABLE]

The probability of $Q_{n}$ tends to zero as $n\to\infty$ , since by the union bound and Markov’s inequality

[TABLE]

where the properties of the sequence $\beta_{n}$ were used in the last step. Therefore, we have for $\delta>0$ ,

[TABLE]

We observe that

[TABLE]

and similarly it follows

[TABLE]

Thus we obtain for $1\leq i<j\leq p$ ,

[TABLE]

and as $T_{11}^{2}-\bar{T}_{11}^{2}$ is nonnegative we get

[TABLE]

Consequently, we obtain

[TABLE]

In the case ${\mathbb{E}}[X^{4}]\neq 1$ , the right-hand side tends to zero as $\sigma_{n}\asymp(p/n)^{1/2}+(p/n)^{3/2}$ . In the symmetric Bernoulli case $X^{2}=1$ , the probability in (4.58) is zero, establishing the desired result. ∎
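The truncation step in this proof can be illustrated numerically. The following is a minimal sketch, not the authors' construction: it assumes that $T_{11}$ is the sum of squared entries of the first row (the definition is in (3.14), outside this section), uses a hypothetical value for $\beta_{n}$, and draws from a symmetrised Pareto law that is not normalised to unit variance. It only shows that truncating at level $\beta_{n}(np)^{1/s}$ can never increase the sum of squares, which is why $T_{11}^{2}-\bar{T}_{11}^{2}$ is nonnegative.

```python
import random


def truncate(x, level):
    """Return X * 1{|X| <= level}: keep x if it is within the level, else 0."""
    return x if abs(x) <= level else 0.0


def sum_of_squares(row):
    """Assumed form of T_11: the sum of squared entries of the first row."""
    return sum(v * v for v in row)


# Illustrative sizes and a hypothetical value for beta_n (not from the paper).
n, p, s = 200, 20, 4
beta_n = 0.25
level = beta_n * (n * p) ** (1.0 / s)

# Symmetrised Pareto draws as a stand-in heavy-tailed sample (an assumption).
rng = random.Random(0)
row = [rng.choice((-1.0, 1.0)) * rng.paretovariate(5.0) for _ in range(n)]
row_bar = [truncate(x, level) for x in row]

t11 = sum_of_squares(row)
t11_bar = sum_of_squares(row_bar)
# Truncation only removes squared terms, so this gap is nonnegative.
gap = t11 ** 2 - t11_bar ** 2
print(level, t11, t11_bar, gap)
```

The nonnegativity holds for every sample, not only on average, so the same check passes for any seed or distribution.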

Lemma 4.2.

Let $\bar{T}_{11}$, $\bar{T}_{12}$ and $\bar{T}_{13}$ be defined as in (3.14). Under the assumptions of Theorem 2.2 it holds, as $n\to\infty$,

[TABLE]

Proof.

Recalling that $\bar{X}=X\mathds{1}_{\{|X|\leq(np)^{1/4}\beta_{n}\}}$, we get by (4.59) and (4.60) with $s=4$

[TABLE]

For higher moments we obtain

[TABLE]

After these preparations we will now prove the first claim of the lemma. By the multinomial theorem we have

[TABLE]

where we used (4.61) and (4.62) in the last step. The same arguments also yield

[TABLE]

Since ${\mathbb{E}}[\bar{T}_{11}^{2}]^{2}=n^{4}+2n^{3}({\mathbb{E}}[X^{4}]-1)+o(n^{3})$, we conclude that

[TABLE]

which proves the first part of the lemma. For the second part we consider $\operatorname{Var}(\bar{T}_{12}^{2})={\mathbb{E}}[\bar{T}_{12}^{4}]-{\mathbb{E}}[\bar{T}_{12}^{2}]^{2}$. Using (4.61) and (4.62), we get

[TABLE]

as well as

[TABLE]

which implies ${\mathbb{E}}[\bar{T}_{12}^{2}]^{2}=n^{2}+o(n)$. Since $\operatorname{Var}(\bar{T}_{12}^{2})={\mathbb{E}}[\bar{T}_{12}^{4}]-{\mathbb{E}}[\bar{T}_{12}^{2}]^{2}$, the second part of the lemma is established.
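As a sanity check on the leading terms in the first two parts, consider the untruncated analogues $T_{11}=\sum_{t=1}^{n}X_{1t}^{2}$ and $T_{12}=\sum_{t=1}^{n}X_{1t}X_{2t}$. These forms are an assumption made for this remark, since the definition in (3.14) lies outside this section. With ${\mathbb{E}}[X]=0$, ${\mathbb{E}}[X^{2}]=1$ and ${\mathbb{E}}[X^{4}]<\infty$, independence gives

$$\begin{aligned}
{\mathbb{E}}[T_{11}^{2}]&=\sum_{t=1}^{n}{\mathbb{E}}[X_{1t}^{4}]+\sum_{t\neq u}{\mathbb{E}}[X_{1t}^{2}]\,{\mathbb{E}}[X_{1u}^{2}]=n^{2}+n({\mathbb{E}}[X^{4}]-1),\\
{\mathbb{E}}[T_{11}^{2}]^{2}&=n^{4}+2n^{3}({\mathbb{E}}[X^{4}]-1)+n^{2}({\mathbb{E}}[X^{4}]-1)^{2},\\
{\mathbb{E}}[T_{12}^{2}]&=\sum_{t,u=1}^{n}{\mathbb{E}}[X_{1t}X_{1u}]\,{\mathbb{E}}[X_{2t}X_{2u}]=n.
\end{aligned}$$

These match the leading terms $n^{4}+2n^{3}({\mathbb{E}}[X^{4}]-1)$ and $n^{2}$ stated above. The truncated versions should differ only in lower-order terms, which is what (4.61) and (4.62) are used to control.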

To show part three of the lemma, we compute, using (4.61), that

[TABLE]

In conjunction with $\eqref{t12}$ , we then obtain

[TABLE]

Finally, using (4.61) and (4.62), we have

[TABLE]

Therefore, we obtain in combination with (4.63) and (4.64),

[TABLE]

completing the proof of the lemma. ∎
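The leading-order moment identities behind this lemma can be checked exactly on a small example. The sketch below is an illustration under stated assumptions, not part of the proof: it works with the untruncated sums $T_{11}=\sum_{t}X_{1t}^{2}$ and $T_{12}=\sum_{t}X_{1t}X_{2t}$ (assumed forms, since (3.14) is outside this section) and a hypothetical three-point law with ${\mathbb{E}}[X]=0$, ${\mathbb{E}}[X^{2}]=1$ and ${\mathbb{E}}[X^{4}]=4$. Expectations are computed by full enumeration with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

# Hypothetical three-point law: P(X=-2) = P(X=2) = 1/8, P(X=0) = 3/4.
LAW = {-2: Fraction(1, 8), 0: Fraction(3, 4), 2: Fraction(1, 8)}


def moment(k):
    """Exact k-th moment E[X^k] under LAW."""
    return sum(prob * x ** k for x, prob in LAW.items())


def expect(f, m):
    """Exact E[f(X_1, ..., X_m)] for iid X_i ~ LAW, by enumerating all outcomes."""
    total = Fraction(0)
    for xs in product(LAW, repeat=m):
        weight = Fraction(1)
        for x in xs:
            weight *= LAW[x]
        total += weight * f(xs)
    return total


def second_moment_t11(n):
    """E[T_11^2] with T_11 = sum_t X_{1t}^2 (assumed form)."""
    return expect(lambda xs: sum(x * x for x in xs) ** 2, n)


def second_moment_t12(n):
    """E[T_12^2] with T_12 = sum_t X_{1t} X_{2t} (assumed form).

    The first n coordinates play the role of row 1, the last n of row 2.
    """
    return expect(lambda xs: sum(xs[t] * xs[n + t] for t in range(n)) ** 2, 2 * n)


n = 3
m4 = moment(4)
e11 = second_moment_t11(n)
e12 = second_moment_t12(n)
print(m4, e11, e12)
```

For this law the enumeration reproduces ${\mathbb{E}}[T_{11}^{2}]=n^{2}+n({\mathbb{E}}[X^{4}]-1)$ and ${\mathbb{E}}[T_{12}^{2}]=n$ exactly. The enumeration grows like $3^{2n}$, so it is only practical for very small $n$.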

Lemma 4.3.

Let $(\bar{T}_{ij})$ be defined as in (3.14) and write $\mathcal{F}_{j}$ for the sigma algebra generated by $\{\tilde{\mathbf{x}}_{1},\ldots,\tilde{\mathbf{x}}_{j}\}$, where $\tilde{\mathbf{x}}_{i}=(X_{i1},\ldots,X_{in})$. Under the conditions of Theorem 2.2 it holds for $1\leq i_{1},i_{2}<j\leq p$ that

[TABLE]

Proof.

By straightforward calculation we get

[TABLE]

Additionally it holds that

[TABLE]

and therefore

[TABLE]

The lemma follows from (4.65) and (4.66). ∎

Lemma 4.4.

Let $Y$ and $Y^{\prime}$ be real-valued random variables and $B$ an arbitrary event. Then it holds for every $y\in\mathbb{R}$ and $\delta>0$ that

[TABLE]

Proof.

We have

[TABLE]

and

[TABLE]

∎

