This paper proves a joint limit theorem for the off-diagonal entries point process and Frobenius norm of a large sample covariance matrix, revealing asymptotic independence and different limiting laws depending on moment conditions.
Contribution
It establishes the first joint convergence result for dependent point processes and sums in the non-Gaussian setting, extending Kallenberg's theorem.
Findings
01 Central limit theorem for the Frobenius norm under a finite fourth moment
02 Infinite-variance stable law for the Frobenius norm under an infinite fourth moment
03 Asymptotic independence between point process and Frobenius norm
Abstract
A joint limit theorem for the point process of the off-diagonal entries of a sample covariance matrix S, constructed from n observations of a p-dimensional random vector with iid components, and the Frobenius norm of S is proved. In particular, assuming that p and n tend to infinity we obtain a central limit theorem for the Frobenius norm in the case of finite fourth moment of the components and an infinite variance stable law in the case of infinite fourth moment. Extending a theorem of Kallenberg, we establish asymptotic independence of the point process and the Frobenius norm of S. To the best of our knowledge, this is the first result about joint convergence of a point process of dependent points and their sum in the non-Gaussian case.
Table 1. Overview of the asymptotic results about $\max S_{ij}$, $N_n$ and $\operatorname{tr}(\mathbf{S}^2)$.
Asymptotic independence of point process and Frobenius norm of a large sample covariance matrix
Johannes Heiny
Fakultät für Mathematik,
Ruhruniversität Bochum,
Universitätsstrasse 150,
D-44801 Bochum,
Germany
Key words and phrases:
Gumbel distribution, extreme value theory, maximum entry, point process, central limit theorem, stable distribution, random matrix, joint convergence, asymptotic independence
Johannes Heiny’s and Carolin Kleemann’s research was partially supported by the Deutsche Forschungsgemeinschaft (DFG) via RTG 2131 High-dimensional Phenomena in Probability – Fluctuations and Discontinuity.
1. Introduction
Over recent years the analysis of high-dimensional data has emerged as an important and active research area driven by a wide range of applications in various fields such as genomics, medical imaging, signal processing, financial engineering and social science.
To study large data sets, for instance, in brain connectivity analysis or in gene expression analysis (see [41, 46, 18]), knowledge of the dependence structure plays a central role. Interpreting the data as observations of a p-dimensional random vector, dependence between the components of the vector is often estimated by covariance/correlation statistics and different functions are used to aggregate these estimates of the pairwise dependencies. For example, [40, 37, 45] and [29] propose sum-type tests based on the Frobenius norm which are usually powerful against dense alternatives.
Further popular methods of aggregating estimates of the pairwise dependencies are maximum-type tests, which have good power against sparse alternatives and have been investigated for various covariance/correlation statistics in [27, 47, 31, 45, 6, 20, 12, 23] and [21], among others.
Since in practice it is difficult to decide whether the underlying covariance matrix is sparse or dense, it is useful to combine these two types of test statistics to cover both cases [17, 15, 21, 44, 7]. Therefore, an understanding of the joint asymptotic behavior of these test statistics is needed.
The objective of this paper is to contribute to this line of research by providing asymptotic theory for the joint distribution of sum-type statistics and generalized maximum-type statistics of a large sample covariance matrix.
For a sample $x_1,\ldots,x_n$ from the $p$-dimensional population $x$ with independent and identically distributed (iid) components with mean zero and variance one, the sample covariance matrix $S$ is given by
$$S=(S_{ij})=\frac{1}{n}\sum_{t=1}^{n}x_tx_t^{\top}.$$
Throughout this paper, we assume that the dimension $p=p_n$ is a positive integer sequence tending to infinity together with the sample size $n$. Thus, the $p\times p$ matrix $S$ is a high-dimensional random matrix whose asymptotic properties are used, for example, in independence testing.
Sum-type and maximum-type statistics based on S are given by the (squared) Frobenius norm and the maximum off-diagonal entry of S,
[TABLE]
We are interested in the joint limiting distribution of $\operatorname{tr}(S^2)$ and the sequence of point processes of the off-diagonal entries of $S$
[TABLE]
where εx denotes the Dirac measure in x∈R and
[TABLE]
It is well known that the point process in (1.2) contains information about all order statistics of the $S_{ij}$ (see [14]). The distribution of the maximum can be recovered from the identity $\{N_n((x,\infty))=0\}=\{\max_{i<j}d_p(nS_{ij}-d_p)\le x\}$, $x\in\mathbb{R}$. In this sense, the point process $N_n$ is a natural and meaningful generalization of the maximum.
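The link between counts of a point process and order statistics can be checked on a toy example. The following is a minimal sketch, not taken from the paper: the list `points` is a hypothetical stand-in for the normalized off-diagonal entries, and the helper names `count_above` and `kth_largest` are introduced only for illustration.

```python
def count_above(points, x):
    """N((x, inf)): number of points strictly greater than x."""
    return sum(1 for pt in points if pt > x)

def kth_largest(points, k):
    """k-th largest point (k = 1 gives the maximum)."""
    return sorted(points, reverse=True)[k - 1]

# Hypothetical normalized points standing in for the off-diagonal entries.
points = [0.3, -1.2, 2.5, 0.9, -0.4, 1.7]

for x in [-2.0, 0.0, 1.0, 2.0, 3.0]:
    # No point above x  <=>  the maximum is at most x.
    assert (count_above(points, x) == 0) == (max(points) <= x)
    for k in range(1, len(points) + 1):
        # Fewer than k points above x  <=>  the k-th largest is at most x.
        assert (count_above(points, x) < k) == (kth_largest(points, k) <= x)
```

The second equivalence is the reason why convergence of the point process carries over to any fixed number of upper order statistics.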
Table 1 gives an overview of the available results about the convergence of $\max S_{ij}$, $N_n$ and $\operatorname{tr}(S^2)$ and of the novel contributions of this paper (marked in blue). For the reader’s convenience, the table contains the limit distributions themselves and precise references. It is worth mentioning that the distinction between finite and infinite fourth moment of $X$, which is a generic random variable with the same distribution as the components of $x$, results from the sum-type statistic $\operatorname{tr}(S^2)$.
1.1. Related literature on sums, maxima and point processes
The joint behavior of the sum and the maximum of a sequence of real-valued random variables has been studied before, motivated for example by the evaluation of wind speed data, which is usually available in the form of the maximum wind speed and the average wind speed during a day or an hour. For iid random variables (Yi)i≥1 we set
[TABLE]
If the distribution function $F$ of $Y_1$ belongs to the sum domain of attraction of the normal distribution and the maximum domain of attraction of an extreme value distribution, [8] showed that $(S_n,M_n)$ converges in distribution to a limit $(S,M)$, where $S$ and $M$ are independent and non-degenerate. They also proved that if
[TABLE]
where $L$ is a slowly varying function and $0<q_+\le q_++q_-=1$, then $(S_n,M_n)$ converges to a limit $(S,M)$, where $S$ and $M$ are dependent, and they provide a hybrid characteristic distribution function of $(S,M)$.
The papers [1, 2, 25] generalized these results to strongly mixing stationary random variables. For stationary normal random variables [24] and [32] proved asymptotic independence under certain correlation assumptions. Recently, asymptotic independence of a quadratic form in, and the maximum of, independent random variables was proved in [7]. Asymptotic independence of the sum and the maximum of dependent normal random variables, which need not be stationary or strongly mixing but fulfill conditions on the smallest and largest eigenvalue of their covariance matrix, was shown in [16].
For a triangular array of normally distributed random variables the asymptotic independence of the point process of exceedances and the partial sum was considered in [26, 42] and extended to the multivariate case in [35]. Moreover, [19] established asymptotic independence of the point processes of clusters and the partial sums of bivariate stationary Gaussian triangular arrays. Asymptotic independence of other quantities derived from the sample covariance matrix $S$ has also been considered in a variety of settings. For example, the asymptotic independence of the maximum of sample correlations and the sum of the squared sample correlations between the residuals from ordinary least squares is proven in [17], while [30] showed asymptotic independence of the largest sample eigenvalues and the trace of $S$.
1.2. Structure of this paper
This paper is structured as follows. Section 2 contains our main results about the point process of the off-diagonal entries of S and the Frobenius norm of S. Under finite fourth moment of X the Frobenius norm satisfies a CLT (Theorem 2.2), while in the case of infinite fourth moment we obtain a stable limit law (Theorem 2.4).
The main result of this paper is Theorem 2.7, which shows asymptotic independence of the point process and the Frobenius norm of S. The challenges in the proof of this result and our novel technical contributions are outlined in Section 2.3, while Section 2.4 presents an application to independence testing. The proofs are deferred to Section 3 and helpful auxiliary results are given in Section 4.
1.3. Notation
Convergence in distribution (resp. probability) is denoted by $\xrightarrow{d}$ (resp. $\xrightarrow{P}$), equality in distribution by $\overset{d}{=}$, and unless explicitly stated otherwise all limits are for $n\to\infty$. For sequences $(a_n)_n$ and $(b_n)_n$ we write $a_n=O(b_n)$ if $a_n/b_n\le C$ for some constant $C>0$ and every $n\in\mathbb{N}$, and $a_n=o(b_n)$ if $\lim_{n\to\infty}a_n/b_n=0$. Additionally, we use the notation $a_n\sim b_n$ if $\lim_{n\to\infty}a_n/b_n=1$.
2. Main results
Consider a sample x1,…,xn from the p-dimensional population x with iid components with generic element X satisfying E[X]=0 and E[X2]=1. We will work in the high-dimensional setting, where the dimension p=pn is some positive integer sequence tending to infinity as n→∞.
We aim to study the joint asymptotic behavior of the point processes $N_n$ of the off-diagonal entries (see (1.2)) and the Frobenius norm of the sample covariance matrix $S=(S_{ij})=\frac{1}{n}\sum_{t=1}^{n}x_tx_t^{\top}$.
2.1. Asymptotic distributions of point process and Frobenius norm of the sample covariance matrix
First we consider the convergence in distribution of the sequence of point processes Nn towards a Poisson random measure. For a detailed background on weak convergence of point processes we refer to [38, 10].
The following result can be found in [23, Theorem 3.2] and [22, Theorem 4.1].
Theorem 2.1 ([23, Theorem 3.2]).
Assume that there exist $s>2$ and $\varepsilon>0$ such that $\mathbb{E}[|X|^{s+\varepsilon}]<\infty$ and let $p=p_n\to\infty$ satisfy $p=O(n^{(s-2)/4})$ as $n\to\infty$.
Then it holds that $N_n\xrightarrow{d}N$,
where $N$ is a Poisson random measure with mean measure $\mu(x,\infty)=e^{-x}$ for $x\in\mathbb{R}$.
The Poisson random measure $N$ with mean measure $\mu(x,\infty)=e^{-x}$ for $x\in\mathbb{R}$ has the representation
[TABLE]
where $\Gamma_i=E_1+\cdots+E_i$, $i\ge 1$, and $(E_i)$ is a sequence of iid standard exponential random variables [38].
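The limiting Poisson random measure is easy to simulate from the points $-\log\Gamma_i$. The sketch below is an illustration under stated assumptions rather than part of the paper: the function names, the seed, the level `x` and the number of repetitions are arbitrary choices. It compares the empirical frequency of the event that no point exceeds `x` with the Gumbel value $\exp(-e^{-x})$, which is the probability that a Poisson random measure with mean measure $\mu(x,\infty)=e^{-x}$ puts no mass on $(x,\infty)$.

```python
import math
import random

def poisson_points(rng, k):
    """First k points -log(Gamma_i), with Gamma_i = E_1 + ... + E_i."""
    gamma, points = 0.0, []
    for _ in range(k):
        gamma += rng.expovariate(1.0)  # iid standard exponential E_i
        points.append(-math.log(gamma))
    return points

def gumbel_cdf(x):
    """P(N((x, inf)) = 0) = exp(-mu(x, inf)) = exp(-e^{-x})."""
    return math.exp(-math.exp(-x))

rng = random.Random(0)  # arbitrary seed for reproducibility
x, reps = 0.5, 20000
# The largest point is -log(Gamma_1); N((x, inf)) = 0 iff it is <= x.
empty = sum(1 for _ in range(reps) if poisson_points(rng, 1)[0] <= x)
print(empty / reps, gumbel_cdf(x))
```

Since the $\Gamma_i$ are increasing, the simulated points come out in decreasing order, so the first point is the maximum.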
Our second object of interest is the squared Frobenius norm of the sample covariance matrix, that is $\|S\|_F^2:=\sum_{i,j=1}^{p}S_{ij}^2=\operatorname{tr}(S^2)$.
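The two statistics can be computed directly from data. The following is a small pure-Python sketch with made-up numbers; the function names are hypothetical and no normalization from the paper is applied. It checks numerically that the sum of squared entries of $S$ agrees with $\operatorname{tr}(S^2)$, which holds because $S$ is symmetric.

```python
def sample_covariance(xs):
    """S = (1/n) * sum_t x_t x_t^T for observations xs (list of length-p lists)."""
    n, p = len(xs), len(xs[0])
    return [[sum(x[i] * x[j] for x in xs) / n for j in range(p)] for i in range(p)]

def frobenius_sq(S):
    """Squared Frobenius norm: sum of all squared entries."""
    return sum(v * v for row in S for v in row)

def trace_of_square(S):
    """tr(S^2) = sum_i (S S)_{ii}."""
    p = len(S)
    return sum(S[i][k] * S[k][i] for i in range(p) for k in range(p))

def max_off_diagonal(S):
    """Largest entry S_ij with i < j."""
    p = len(S)
    return max(S[i][j] for i in range(p) for j in range(i + 1, p))

# Toy data: n = 4 observations of a p = 3 dimensional vector (made-up numbers).
xs = [[1.0, -1.0, 0.5], [0.0, 2.0, -1.0], [-1.0, 0.5, 1.5], [2.0, 0.0, -0.5]]
S = sample_covariance(xs)
print(frobenius_sq(S), trace_of_square(S), max_off_diagonal(S))
```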
In order to study its asymptotic distribution we define
[TABLE]
where
[TABLE]
The next result provides a CLT for $\operatorname{tr}(S^2)$ in the general case that $p_n\to\infty$ and $\mathbb{E}[X^4]<\infty$.
Theorem 2.2.
If $\mathbb{E}[X^4]<\infty$ and $p=p_n\to\infty$, then as $n\to\infty$ we have $Z_n\xrightarrow{d}Z$ for a standard normal random variable $Z$.
Remark 2.3.
Under special assumptions on $X$ and the growth of $p$, the behavior of $Z_n$ can be deduced from CLTs for so-called linear spectral statistics of sample covariance matrices: for example, [3, Theorem 9.10] and [34, Theorem 1.4] in the case $p/n\to C\in(0,\infty)$ and $\mathbb{E}[X^4]<\infty$, or [36, Theorem 3.1] in the case $n^2/p=O(1)$ if $\mathbb{E}[|X|^{6+\delta}]<\infty$ for some $\delta>0$.
In contrast to the convergence of the point processes $N_n$ in Theorem 2.1, the CLT in Theorem 2.2 requires the existence of the fourth moment of $X$. To characterize the asymptotic distribution of $\operatorname{tr}(S^2)$ in the case $\mathbb{E}[X^4]=\infty$ we need to assume that $X$ is a regularly varying random variable.
We say that a random variable X (or its distribution) is regularly varying with index α>0 if
[TABLE]
where $L$ is a slowly varying function, i.e., $\lim_{x\to\infty}L(tx)/L(x)=1$ for $t>0$. Examples of regularly varying distributions are the Pareto distribution with parameter $\alpha$ and the $t$-distribution with $\alpha$ degrees of freedom.
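For the Pareto example the tail and the moments are available in closed form, which makes the role of the index $\alpha$ concrete. The sketch below assumes a Pareto law with scale one, $P(X>x)=x^{-\alpha}$ for $x\ge 1$; this parametrization and the function names are assumptions made for illustration. With $\alpha\in(2,4)$ the second moment is finite while the fourth moment is infinite.

```python
import math

def pareto_tail(x, alpha):
    """P(X > x) = x^{-alpha} for x >= 1 (Pareto with scale 1)."""
    return 1.0 if x < 1.0 else x ** (-alpha)

def pareto_moment(beta, alpha):
    """E[X^beta] = alpha / (alpha - beta) if beta < alpha, else infinite."""
    return alpha / (alpha - beta) if beta < alpha else math.inf

alpha = 3.0  # index in (2, 4): finite variance, infinite fourth moment
print(pareto_moment(2, alpha))  # second moment is finite
print(pareto_moment(4, alpha))  # fourth moment is infinite
# The tail ratio P(X > t*x) / P(X > x) equals t^{-alpha} exactly for this law,
# so the slowly varying function is constant.
t, x = 2.0, 10.0
print(pareto_tail(t * x, alpha) / pareto_tail(x, alpha), t ** (-alpha))
```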
For a regularly varying random variable $X$ with index $\alpha$ it holds that $\mathbb{E}[|X|^{\beta}]<\infty$ if $\beta<\alpha$ and $\mathbb{E}[|X|^{\beta}]=\infty$ if $\beta>\alpha$. It is well known that the sequence $(a_k)_k$ defined through
[TABLE]
is of the form $a_k=k^{1/\alpha}\ell(k)$, where $\ell$ is a slowly varying function. For further properties of regularly varying functions we refer to [4, 39]. In the next theorem we consider the case that $X$ has finite variance but infinite fourth moment.
Theorem 2.4.
Let $X$ have a regularly varying distribution with index $\alpha\in(2,4)$ and assume that $p=p_n\to\infty$. Then it holds that
[TABLE]
where $\zeta_{\alpha/4}$ is a non-degenerate, $\alpha/4$-stable random variable with characteristic function
[TABLE]
where $i$ is the imaginary unit and $c_\alpha$ is a constant only depending on $\alpha$.
Noticing that $\sum_{i,t}X_{it}^4$ is a sum of iid regularly varying random variables with index $\alpha/4\in(1/2,1)$, we obtain an $\alpha/4$-stable limit distribution after proper normalization [39].
Remark 2.5.
(a) In some cases it is possible to replace $\operatorname{tr}(S)$ by its expectation $\mathbb{E}[\operatorname{tr}(S)]=p$ in (2.8). For example, if $\lim_{n\to\infty}p/n\in(0,\infty)$, we get for $\delta>0$
[TABLE]
where Theorem A1 of [11] and the fact that $X^2-1$ is regularly varying with index $\alpha/2$ were used for the asymptotic equivalence in the last line.
(b) For $\alpha=4$ the limit in (2.8) does not hold in general, which we shall illustrate in the case $\mathbb{E}[X^4]<\infty$. As $\operatorname{tr}(S)$ is a sum of iid random variables with finite variance, it satisfies a CLT. Furthermore, Theorem 2.2 states that $\operatorname{tr}(S^2)$ is also asymptotically normal. Along the lines of the proof of Theorem 2.2 one can show joint asymptotic normality of $\operatorname{tr}(S)$ and $\operatorname{tr}(S^2)$. For the sake of brevity we omit a proof, but we mention that in the special case $\lim_{n\to\infty}p/n\in(0,\infty)$ joint asymptotic normality of $\operatorname{tr}(S)$ and $\operatorname{tr}(S^2)$ was established in [43, Lemma 2.2].
2.2. Joint limiting distribution of point process and Frobenius norm of the sample covariance matrix
In this subsection we are interested in the joint limiting distribution of Zn in (2.5) and the point process Nn in (1.2).
For this purpose we start by giving a definition of asymptotic independence; cf. [26, p. 284].
For the Poisson random measure N defined in (2.4) we will need the collection of sets
[TABLE]
where ∂B is the boundary of B.
Definition 2.6.
Let $(Y_n)_n$ be a sequence of real-valued random variables which converges to the random variable $Y$ in distribution. Additionally, let $(N_n)_n$ be a sequence of point processes on $\mathbb{R}$ which converges to the point process $N$ in distribution. We call $(Y_n)_n$ and $(N_n)_n$ asymptotically independent if for every $y\in\mathbb{R}$, $B_1,\ldots,B_k\in\mathcal{B}_N$ and $l_1,\ldots,l_k\in\mathbb{N}_0:=\mathbb{N}\cup\{0\}$
[TABLE]
Now we state our main result about the joint convergence of (Zn,Nn).
Theorem 2.7.
Assume that there exist $s\ge 4$ and $\varepsilon>0$ such that $\mathbb{E}[|X|^{s+\varepsilon}]<\infty$ and let $p=p_n\to\infty$ satisfy $p=O(n^{(s-2)/4})$ as $n\to\infty$.
Then it holds for the standardized traces $(Z_n)_n$ defined in (2.5) and the point processes $(N_n)_n$ in (1.2) that
[TABLE]
where $Z\sim N(0,1)$, $N$ is a Poisson random measure with mean measure $\mu(x,\infty)=e^{-x}$, $x\in\mathbb{R}$, and $Z$ and $N$ are independent.
Theorem 2.7 only requires the conditions of Theorems 2.1 and 2.2, which are both formulated under minimal assumptions, cf. [22].
Note that if $\mathbb{E}[X^4]=\infty$ and consequently $\operatorname{tr}(S^2)$ does not converge to the normal distribution (see Theorem 2.4), our methods in the proof of Theorem 2.7 cease to work. Thus the joint limit behavior of $N_n$ and $\operatorname{tr}(S^2)$ is still an open problem in the case of regularly varying $X$ with index $\alpha\in(2,4)$. In this case the dominating part of $n^2\operatorname{tr}(S^2)$ is $\sum_{i=1}^{p}\sum_{t=1}^{n}X_{it}^4$ and hence a sum of iid random variables which are regularly varying with index $\alpha/4$. In [8] it is shown that this sum is not asymptotically independent of the maximum of the $X_{it}^4$. Therefore, we conjecture that $Z_n$ and $N_n$ will no longer be asymptotically independent.
As a consequence of Theorem 2.7 we obtain the asymptotic independence of $Z_n$ and a fixed number of upper order statistics of the random variables $(d_p(nS_{ij}-d_p))_{1\le i<j\le p}$, which we denote by
[TABLE]
Corollary 2.8.
Let $Z\sim N(0,1)$ be independent of an iid sequence $(E_i)_{i\ge 1}$ of standard exponentially distributed random variables and set $\Gamma_i:=E_1+\ldots+E_i$.
Under the conditions of Theorem 2.7 and for fixed $k\ge 1$ it holds that
[TABLE]
where y,x1,…,xk∈R.
Proof.
Since $N_n(x,\infty)$ is the number of pairs $(i,j)$ with $1\le i<j\le p$ for which $d_p(nS_{ij}-d_p)\in(x,\infty)$, we get by Theorem 2.7
[TABLE]
In view of the representation $N\overset{d}{=}\sum_{i=1}^{\infty}\varepsilon_{-\log\Gamma_i}$
we obtain
In this subsection we describe the key challenges in the proof of Theorem 2.7 and our novel technical contributions.
The distribution of a point process $N_n$ is determined by the family of the distributions of the finite-dimensional random vectors $(N_n(B_1),\ldots,N_n(B_k))$ for any choice of suitable Borel sets $B_1,\ldots,B_k$; see [10, Proposition 6.2.III].
The collection of these distributions is called the finite-dimensional distributions of $N_n$. Due to the dependence of the $(S_{ij})$, a direct analysis of the finite-dimensional distributions of $N_n$ is intractable. The same applies to the Laplace functional of $N_n$, which determines the distribution of $N_n$ completely and plays a role for point processes similar to that of the characteristic function for real-valued random variables.
Fortunately, Kallenberg proved a sufficient condition for the weak convergence of a sequence of point processes $N_n$ towards $N$, which is often much easier to verify than the convergence of the finite-dimensional distributions or the Laplace functionals. We define $\mathcal{U}$ as the set of finite unions of intervals and $\mathcal{J}$ as the set of intervals in $\mathbb{R}$. Kallenberg showed that if $N$ is a simple point process (such as (2.4)), then it is enough to ensure that $\mathbb{E}[N_n(I)]$ converges to $\mathbb{E}[N(I)]$ for any $I\in\mathcal{J}$ and that the probability of the event $\{N_n(U)=0\}$ converges to the probability of the event $\{N(U)=0\}$ for any $U\in\mathcal{U}$; see, for instance, [28, p. 35, Theorem 4.7] or [14, p. 233, Theorem 5.2.2].
Therefore, instead of showing the convergence of the random vector (Nn(B1),…,Nn(Bk)) for any k≥1 and B1,…,Bk∈BN, it is enough to prove the convergence of the probability of the occurrence of points in finite unions of intervals, which often greatly simplifies the proof. Our Theorem 2.9 below contains a similarly helpful tool, which will be essential for studying the joint asymptotic distribution of Zn and the point process Nn.
Theorem 2.9 (Extension of Kallenberg’s Theorem).
Let $(Y_n)_n$ be a sequence of real-valued random variables converging in distribution to a random variable $Y$. In addition, let $N$ be a simple point process on $\mathbb{R}$ independent of $Y$ and let $(N_n)_n$ be a sequence of point processes.
If the following two conditions
(K1) $\limsup_{n\to\infty}\mathbb{E}[N_n(I)]\le\mathbb{E}[N(I)]$, $I\in\mathcal{J}$,
(K2) $\lim_{n\to\infty}\mathbb{P}(Y_n\le y,N_n(U)=0)=\mathbb{P}(Y\le y)\,\mathbb{P}(N(U)=0)$, $y\in\mathbb{R}$, $U\in\mathcal{U}$,
hold, then $N_n\xrightarrow{d}N$ and $(N_n)_n$ and $(Y_n)_n$ are asymptotically independent.
Theorem 2.9 essentially shows that the sequence of random variables Yn and a sequence of point processes Nn are asymptotically independent if the events {Yn≤y} and {Nn(U)=0} are asymptotically independent for any y∈R and U∈U. Since Theorem 2.9 requires mild assumptions on the real-valued random variables Yn and the point processes Nn, it is applicable to a wide variety of other settings. Moreover, as seen in Corollary 2.8, it also yields asymptotic independence of the points of Nn and Yn.
In Theorem 2.7 the joint convergence of the point process Nn of the entries of the sample covariance matrix S and the standardized Frobenius norm Zn of S is considered, which to the best of our knowledge has not previously been studied. We would like to mention that results about the joint convergence of a point process of dependent points and their sum are only available in the Gaussian case [26, 35, 42], whose techniques are not applicable to non-Gaussian sequences.
In view of Definition 2.6, the main challenge in our case is to show the convergence in distribution of the random vectors $(Z_n,N_n(B_1),\ldots,N_n(B_k))$ for any $k\ge 1$ and $B_1,\ldots,B_k\in\mathcal{B}_N$. Note that every summand of $Z_n$ depends on many points of $N_n$.
To overcome this challenge we developed a novel technical tool, Theorem 2.9, which allows us to reduce the convergence in distribution of the random vector above to the following two conditions:
(K1’) $\limsup_{n\to\infty}\mathbb{E}[N_n(I)]=\mathbb{E}[N(I)]$ for any interval $I$,
(K2’) $\lim_{n\to\infty}\mathbb{P}(Z_n\le y,N_n(U)=0)=\Phi(y)\,\mathbb{P}(N(U)=0)$ for any finite union of intervals $U$.
Condition (K1’) can be shown through normal approximation to large deviation probabilities. The challenging part is condition (K2’), which controls the dependence between Zn and Nn.
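The limit on the right-hand side of (K2’) is explicit, because a Poisson random measure with mean measure $\mu$ satisfies $P(N(U)=0)=\exp(-\mu(U))$. The following sketch evaluates it for a finite union of disjoint intervals, using $\mu((a,b])=e^{-a}-e^{-b}$, which follows from $\mu(x,\infty)=e^{-x}$. The function names and the example set `U` are hypothetical choices for illustration.

```python
import math

def mu_interval(a, b):
    """mu((a, b]) for mean measure mu((x, inf)) = e^{-x}; b may be math.inf."""
    upper = 0.0 if b == math.inf else math.exp(-b)
    return math.exp(-a) - upper

def void_probability(intervals):
    """P(N(U) = 0) = exp(-mu(U)) for U a union of disjoint intervals (a, b]."""
    return math.exp(-sum(mu_interval(a, b) for a, b in intervals))

def std_normal_cdf(y):
    """Phi(y) via the error function."""
    return 0.5 * (1.0 + math.erf(y / math.sqrt(2.0)))

def k2_limit(y, intervals):
    """Right-hand side of (K2'): Phi(y) * P(N(U) = 0)."""
    return std_normal_cdf(y) * void_probability(intervals)

U = [(-1.0, 0.0), (1.0, math.inf)]  # hypothetical disjoint intervals
print(k2_limit(0.0, U))
```

Whether the interval endpoints are included does not matter here, since $\mu$ has no atoms.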
The advantage of considering $\mathbb{P}(Z_n\le y,N_n(U)=0)$, respectively $\mathbb{P}(Z_n\le y,N_n(U)>0)$, instead of probabilities for the vector $(Z_n,N_n(B_1),\ldots,N_n(B_k))$ is that we can write
[TABLE]
where $B_I=\{d_p(nS_{ij}-d_p)\in U\}$ for $I=(i,j)\in\Lambda_n=\{(i,j):1\le i<j\le p\}$.
Then the Bonferroni bounds yield for $k\ge 1$
[TABLE]
where $W_{n,d}:=\sum_{I_1<\ldots<I_d}\mathbb{P}\big(Z_n\le y,\bigcap_{\ell=1}^{d}B_{I_\ell}\big)$ is a sum of probabilities of the intersections of finitely many $B_I$’s.
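The Bonferroni bounds state that truncating the inclusion-exclusion formula for the probability of a union after an odd number of terms gives an upper bound and after an even number of terms a lower bound. The sketch below verifies this with exact rational arithmetic on a small, made-up finite probability space; the events are hypothetical and the sums `W` play the role of the $W_{n,d}$ without the additional event $\{Z_n\le y\}$.

```python
from fractions import Fraction
from itertools import combinations

def bonferroni_sums(events, omega_size):
    """W_d = sum over d-subsets of P(intersection), uniform law on a finite space."""
    m = len(events)
    sums = []
    for d in range(1, m + 1):
        w = sum(Fraction(len(set.intersection(*combo)), omega_size)
                for combo in combinations(events, d))
        sums.append(w)
    return sums

# Hypothetical events on the sample space {0, ..., 9} with uniform probability.
events = [{0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 6}, {0, 3, 6, 7}]
union_prob = Fraction(len(set.union(*events)), 10)
W = bonferroni_sums(events, 10)

partial = Fraction(0)
for d, w in enumerate(W, start=1):
    partial += (-1) ** (d - 1) * w
    # Odd truncation gives an upper bound, even truncation a lower bound.
    if d % 2 == 1:
        assert partial >= union_prob
    else:
        assert partial <= union_prob
assert partial == union_prob  # full inclusion-exclusion is exact
```

The same bounds apply when every event is intersected with a fixed further event, which is how they are used with $\{Z_n\le y\}$ in the text.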
The number of summands of $Z_n$ which depend on $B_{I_1},\ldots,B_{I_d}$ is of order $p$. We identify these dependent summands $Z_{n,d}$ and show that they are negligible, i.e.,
[TABLE]
The remaining summands $Z_n-Z_{n,d}$ are independent of $B_{I_1},\ldots,B_{I_d}$ and therefore
[TABLE]
By first letting n→∞ and then k→∞, we can show that
[TABLE]
from which we deduce (K2’). The detailed proof of Theorem 2.7 will be presented in Section 3.4.
2.4. An application to independence testing
Consider a sample $y_1,\ldots,y_n$ from the $p$-dimensional population $\Sigma^{1/2}x$, where $\Sigma$ is an (unknown) non-random positive definite $p\times p$ matrix and $x$ has iid components with mean zero and variance one.
The largest off-diagonal entry of the sample covariance matrix $S=(S_{ij})=\frac{1}{n}\sum_{t=1}^{n}y_ty_t^{\top}$
is a popular statistic for structural tests about properties of $\Sigma$; we refer to the review paper [5] for an extensive summary and detailed references. We are interested in the null hypothesis of independence $H_0:\Sigma=I_p$.
In what follows we will present a rather simplistic extension of the classical maximum-type tests as an application of Theorem 2.7. This application is by no means perfect; it rather serves as an illustration of the potential of the asymptotic independence derived in Theorem 2.7 for statistical tests. (Footnote 1: A thorough analysis will be the topic of future research by the authors.)
Under the null hypothesis Theorem 2.7 studies the joint limiting distribution of (Zn,Nn), where Zn is given in (2.5) and
[TABLE]
From (2.9) recall the definition of
$G_{n,(1)}\ge\cdots\ge G_{n,(p(p-1)/2)}$.
Assuming the conditions of Theorem 2.7, the asymptotic independence of $Z_n$ and $N_n$ implies for fixed $k\ge 1$ (cf. Corollary 2.8) that
[TABLE]
where Z∼N(0,1) is independent of the iid sequence (Ei)i≥1 of standard exponentially distributed random variables and Γi:=E1+…+Ei.
Next we introduce a variety of different statistics:
[TABLE]
For the first statistic it holds that $T_1\xrightarrow{d}-\log\Gamma_1$, which is standard Gumbel distributed with distribution function $\Lambda(x)=\exp(-e^{-x})$. Recall the well-known fact that
[TABLE]
where the right-hand vector consists of the order statistics of k iid uniform random variables on [0,1]. In combination with (2.11) we get for the other statistics, as n→∞,
[TABLE]
Next we construct the random variables
[TABLE]
where F2,k, F3,k, F4,k are the distribution functions of log(U(1)/U(k)), max1≤i≤k−1log(U(k−i)/U(k−i+1)) and ∑i=1k−1(log(U(k−i)/U(k−i+1)))2, respectively, and Φ denotes the standard normal distribution function. By (2.11), PZn is asymptotically independent of PT1,PT2,k,PT3,k,PT4,k. Additionally, each of these random variables converges to the uniform distribution on [0,1].
We propose the following four test statistics
[TABLE]
The null hypothesis H0 is rejected by test i∈{1,…,4}, whenever
[TABLE]
The next result establishes the asymptotic distribution of Ti,n from which we deduce that the tests in (2.13) have asymptotic level β∈(0,1).
Corollary 2.10.
Under the conditions of Theorem 2.7, it holds that
[TABLE]
where U and V are independent random variables uniformly distributed on [0,1].
Since $\mathbb{P}(\min\{U,V\}\le x)=2x-x^2$, $x\in[0,1]$, it follows for $\beta\in(0,1)$
[TABLE]
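The distribution of the minimum of two independent uniform random variables translates directly into a critical value. The sketch below assumes, as the displayed rejection rule is not reproduced here, that a test rejects when its statistic falls below a threshold $c$; solving $2c-c^2=\beta$ on $[0,1]$ then gives $c=1-\sqrt{1-\beta}$. The function names, the seed and the simulation size are illustrative choices.

```python
import math
import random

def min_uniform_cdf(x):
    """P(min{U, V} <= x) = 1 - (1 - x)^2 = 2x - x^2 for x in [0, 1]."""
    return 2.0 * x - x * x

def critical_value(beta):
    """Solve 2c - c^2 = beta on [0, 1]: c = 1 - sqrt(1 - beta)."""
    return 1.0 - math.sqrt(1.0 - beta)

beta = 0.05
c = critical_value(beta)
rng = random.Random(0)  # arbitrary seed for reproducibility
reps = 100000
hits = sum(1 for _ in range(reps) if min(rng.random(), rng.random()) <= c)
print(c, hits / reps)  # the empirical frequency should be close to beta
```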
3. Proofs
To simplify the notation in the proofs we will write $a_n\sim b_n$ for real-valued sequences $a_n$ and $b_n$ if $\lim_{n\to\infty}a_n/b_n=1$, $a_n\gg b_n$ if $\lim_{n\to\infty}b_n/a_n=0$, $a_n\lesssim b_n$ if $\limsup_{n\to\infty}a_n/b_n=C'$ for some constant $C'\in[0,\infty)$ and $a_n\asymp b_n$ if $a_n\lesssim b_n$ and $b_n\lesssim a_n$. Additionally, throughout the proofs $C$ denotes a positive constant which may vary from line to line.
Unless explicitly stated otherwise, all limits are for n→∞.
First, notice that the magnitude of $\sigma_n$, defined in (2.6), might be different in the cases $\mathbb{E}[X^4]\neq 1$ and $\mathbb{E}[X^4]=1$. If $\mathbb{E}[X^4]=1$, Markov’s inequality yields for $\varepsilon>0$ that
[TABLE]
such that $X^2=1$ almost surely, which due to $\mathbb{E}[X]=0$ is only possible if $X$ follows a symmetric Bernoulli distribution, i.e., $\mathbb{P}(X=-1)=\mathbb{P}(X=1)=1/2$. We will often consider the case $\mathbb{E}[X^4]=1$ separately in the course of this proof.
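The passage from $\mathbb{E}[X^4]=1$ to the symmetric Bernoulli law can also be checked by a direct second-moment computation. The following is a short sketch under the standing assumptions $\mathbb{E}[X]=0$ and $\mathbb{E}[X^2]=1$; it is an alternative route and not necessarily the argument used in the display above. Note also that $\mathbb{E}[X^4]\ge(\mathbb{E}[X^2])^2=1$ by Jensen's inequality, so the only alternative to $\mathbb{E}[X^4]=1$ is $\mathbb{E}[X^4]>1$.

```latex
\begin{align*}
\mathbb{E}\big[(X^2-1)^2\big]
  &= \mathbb{E}[X^4] - 2\,\mathbb{E}[X^2] + 1 = 1 - 2 + 1 = 0
  \quad\Longrightarrow\quad X^2 = 1 \ \text{a.s.},\\
0 = \mathbb{E}[X]
  &= \mathbb{P}(X=1) - \mathbb{P}(X=-1),
  \qquad \mathbb{P}(X=1) + \mathbb{P}(X=-1) = 1,\\
  &\Longrightarrow\quad \mathbb{P}(X=1) = \mathbb{P}(X=-1) = \tfrac12 .
\end{align*}
```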
To prove the statement of Theorem 2.2, we use a central limit theorem for martingale differences. As we need the existence of higher moments, we truncate the random variables $X_{it}$ for $i=1,\ldots,p$ and $t=1,\ldots,n$ in an appropriate way.
Let $(\beta_n)_n$ be a positive sequence which tends to zero and satisfies $\beta_n\gg(\mathbb{E}[|X|^4\mathbb{1}\{|X|>\beta_n(np)^{1/4}\}])^{1/4}$. Notice that such a sequence exists since we can choose a sequence $\beta_n'$ which tends to zero sufficiently slowly such that $\beta_n'(np)^{1/4}\to\infty$. Now we set $\beta_n'':=(\mathbb{E}[|X|^4\mathbb{1}\{|X|>\beta_n'(np)^{1/4}\}])^{1/4}$, which also tends to zero as $n\to\infty$, and choose $\beta_n\gg\max\{\beta_n',\beta_n''\}$.
For this sequence (βn)n, we set
[TABLE]
Since E[X4]<∞ an application of Lemma 4.1 with s=4 yields that it suffices to verify (2.5) with Zn replaced by
[TABLE]
It will be convenient to write $\bar Z_n$ as a sum of martingale differences. In what follows, the notation
[TABLE]
will be helpful. Noting that $\bar T_{ij}=\bar T_{ji}$, we start by writing $\bar Z_n$ as
[TABLE]
Using $\mathbb{E}[\bar T_{ij}^2\mid\tilde x_i]=\sum_{t=1}^{n}\sum_{u=1}^{n}\bar X_{it}\bar X_{iu}\,\mathbb{E}[\bar X_{jt}\bar X_{ju}]=\mathbb{E}[\bar T_{i,i+1}^2\mid\tilde x_i]$ for $j=i+1,\ldots,p$ we have
[TABLE]
Setting
[TABLE]
we get
[TABLE]
where $(M_j)_{j\ge 1}$ is a martingale difference sequence with respect to the filtration $(\mathcal{F}_j)_{j\ge 0}$, where $\mathcal{F}_j$ is the sigma algebra generated by $\{\tilde x_1,\ldots,\tilde x_j\}$. Indeed, we have $\mathbb{E}[M_j\mid\mathcal{F}_{j-1}]=0$.
By the Lindeberg-Feller theorem for martingales (see, for example, [13, Theorem 8.2.4, p. 344]), the convergence $\bar Z_n\xrightarrow{d}N(0,1)$ is implied by the following two assertions:
(1) $A_n:=\frac{1}{n^4\sigma_n^2}\sum_{j=1}^{p}\mathbb{E}[M_j^2\mid\mathcal{F}_{j-1}]\xrightarrow{P}1$, $n\to\infty$,
(2) $\frac{1}{n^8\sigma_n^4}\sum_{j=1}^{p}\mathbb{E}[M_j^4\mid\mathcal{F}_{j-1}]\xrightarrow{P}0$, $n\to\infty$.
Proof of (1).
To prove (1) we will show $\mathbb{E}[A_n]\to 1$ and $\operatorname{Var}(A_n)\to 0$ as $n\to\infty$. Since $(M_j)$ is a martingale difference sequence, we have $\mathbb{E}[A_n]=\operatorname{Var}(\bar Z_n)$ and thus we get
[TABLE]
where we used
[TABLE]
and that $\operatorname{Cov}(\bar T_{ij}^2,\bar T_{kl}^2)=0$ if $\{i,j\}\cap\{k,l\}=\emptyset$. In the case $\mathbb{E}[X^4]\neq 1$ we then obtain by Lemma 4.2 that
[TABLE]
since $\sigma_n^2\asymp p/n+(p/n)^3$. If $\mathbb{E}[X^4]=1$ one can similarly check that $\mathbb{E}[A_n]\to 1$.
Now we turn to $\operatorname{Var}(A_n)$. To this end, we need the moments of the truncated random variable $\bar X$ and note that
[TABLE]
Similarly we obtain
[TABLE]
and for higher moments we get
[TABLE]
From the definitions of Mj and the sigma algebra Fj−1, we deduce that there exists some constant Cn′ only depending on n such that
[TABLE]
The second term can be written as
[TABLE]
where Kt1,t2,1 can be expressed as
[TABLE]
Notice that $K_{t_1,t_2,1}=O(1)$ as $n\to\infty$ since the summands are [math] if $\{t_1,t_2\}\cap\{t_3,t_4\}=\emptyset$. Otherwise, if $t_3\neq t_4$ we have $\mathbb{E}[\bar X_{2t_3}\bar X_{2t_4}]=\mathbb{E}[\bar X]^2=o((np)^{-3/2})$. The means are bounded in every case.
With the third term of (3.24) we proceed similarly
[TABLE]
where Kt1,t2,2 can be written as
[TABLE]
As $n\to\infty$ it holds that $K_{t_1,t_2,2}=O(n+(np)^{1/2})$ because the expectation is zero if $\{t_1,t_2\}\cap\{t_3,t_4\}=\emptyset$, bounded if $t_3=t_4$ and $C\,\mathbb{E}[\bar X^6]=o((np)^{1/2})$ if $t_1=t_2=t_3=t_4$.
Hence, we are able to write An as
[TABLE]
where $C_n$ is a constant only depending on $n$ and on the distribution of $X$ and $K_{t_1,t_2}=8(p-j)K_{t_1,t_2,1}+4K_{t_1,t_2,2}=O(n+p)$.
The expectation of the last term can be written as
[TABLE]
Using Lemma 4.3 we see that, for sequences Cn,1,Cn,2 and Cn,3 tending to constants,
[TABLE]
In view of
[TABLE]
it suffices to show that the variances of ξn,1, ξn,2, ξn,3, ξn,4 and ξn,5 tend to zero as n→∞.
We start by bounding the variance of ξn,1:
[TABLE]
Since the summands are zero if $|\{t_1,t_2,t_3,t_4\}|=4$, bounded by $\mathbb{E}[\bar X]$, which tends to zero as $n\to\infty$, if $|\{t_1,t_2,t_3,t_4\}|=3$, and bounded above by a constant otherwise, we get
[TABLE]
For the variance of ξn,2 it holds that
[TABLE]
As the covariance above is equal to $\mathbb{E}[\bar X^8]-(\mathbb{E}[X^4])^2=o(np)$ if $i_1=i_2=i_3$, $\mathbb{E}[\bar X^6]\mathbb{E}[\bar X^2]-\mathbb{E}[\bar X^4](\mathbb{E}[\bar X^2])^2=o((np)^{1/2})$ if $|\{i_1,i_2,i_3\}|=2$ and bounded above by a constant if $|\{i_1,i_2,i_3\}|=3$, we find
[TABLE]
By similar arguments, we obtain
[TABLE]
where the covariance above is $\mathbb{E}[\bar X^6]\mathbb{E}[\bar X^2]-\mathbb{E}[\bar X^3]^2\mathbb{E}[\bar X]^2=o((np)^{1/2})$ if $i_1=i_2=i_3=i_4$ and $t_1=t_3$ and $t_2=t_4$, zero if $|\{i_1,\ldots,i_4\}|=4$ or $|\{t_1,\ldots,t_4\}|=4$ and bounded by a constant in the remaining cases. Therefore, we get using $\mathbb{E}[\bar X]^2=o((np)^{-3/2})$ that
[TABLE]
For the variance of ξn,4 we have
[TABLE]
Again the covariance above is zero if $|\{i_1,\ldots,i_4\}|=4$ or $|\{t_1,\ldots,t_4\}|=4$. If $|\{i_1,\ldots,i_4\}|=3$ it is bounded by $\mathbb{E}[\bar X^2]^2\mathbb{E}[\bar X]^4-\mathbb{E}[\bar X]^8=o((np)^{-3})$ and by a constant in the remaining cases. It follows that $\operatorname{Var}(\xi_{n,4})=O\big(\frac{p^4}{n^5\sigma_n^4}\big)$.
Next, since $\operatorname{Var}(\bar X_{i_1t_1}\bar X_{i_1t_2}\bar X_{i_2t_1}\bar X_{i_2t_3})\lesssim 1$ for $|\{t_1,\ldots,t_4\}|=3$ and $\mathbb{E}[\bar X]^4=o((np)^{-3})$ we obtain $\operatorname{Var}(\xi_{n,5})=o(p^3n^{-5}\sigma_n^{-4})$.
Finally, we combine our variance estimates. In the case $\mathbb{E}[X^4]\neq 1$, it holds $\sigma_n^4\asymp(p/n)^2+(p/n)^6$ which implies
[TABLE]
In the Bernoulli case $\mathbb{E}[X^4]=1$, we have $\sigma_n^4\asymp(p/n)^4$. As $K_{t_1,t_2,1}$ and $K_{t_1,t_2,2}$ are zero in this case, $\xi_{n,1}=0$. Since $\mathbb{E}[\bar{X}]=\mathbb{E}[X]=0$ and $\mathbb{E}[\bar{X}^2]^2=\mathbb{E}[\bar{X}^4]=1$, one has $\xi_{n,2}=\xi_{n,3}=\xi_{n,5}=0$. Repeating the above considerations for $\xi_{n,4}$, one can show that $\operatorname{Var}(\xi_{n,4})\to 0$ as $n\to\infty$, establishing (3.25) in the Bernoulli case as well.
Equation (3.25) concludes the proof of $\operatorname{Var}(A_n)\to 0$ as $n\to\infty$. In combination with $\mathbb{E}[A_n]\to 1$, this proves the desired $A_n\stackrel{\mathbb{P}}{\to}1$.
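The last implication is the usual Chebyshev argument. As a short sketch, with $\delta>0$ an arbitrary tolerance introduced here only for illustration:

```latex
\[
\mathbb{P}\big(|A_n-1|>\delta\big)
\le \frac{\mathbb{E}\big[(A_n-1)^2\big]}{\delta^2}
= \frac{\operatorname{Var}(A_n)+\big(\mathbb{E}[A_n]-1\big)^2}{\delta^2}
\longrightarrow 0, \qquad n\to\infty .
\]
```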
Proof of (2).
We write
[TABLE]
and bound the fourth moment of $M_j$ in the following way
[TABLE]
First we take a look at
[TABLE]
For the last mean we have
[TABLE]
where we used (3.20) and (3.21) in the last line. Hence, we get after simplifying $M_{j4}$
[TABLE]
and by the Marcinkiewicz-Zygmund inequality (see, for example, [9, Theorem 2, p. 386]) in combination with Hölder’s inequality (see also [17, Lemma 2, p. 24]), it follows
[TABLE]
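For orientation, the Marcinkiewicz-Zygmund inequality is typically applied in the following form. This is a generic statement for independent, centered random variables $\xi_1,\ldots,\xi_m$ with finite $q$-th moment; the symbols $\xi_i$, $m$, $q$ and $C_q$ are placeholders introduced here and are not the paper's notation. The second inequality is the Hölder step for sums.

```latex
\[
\mathbb{E}\Big|\sum_{i=1}^{m}\xi_i\Big|^{q}
\le C_q\,\mathbb{E}\Big[\Big(\sum_{i=1}^{m}\xi_i^{2}\Big)^{q/2}\Big]
\le C_q\, m^{q/2-1}\sum_{i=1}^{m}\mathbb{E}|\xi_i|^{q},
\qquad q\ge 2 .
\]
```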
To bound the fourth moment of $M_{j2}$ we consider the eighth moment of $\bar{T}_{12}$
[TABLE]
where we used (3.19), (3.20) and (3.21) in the last step. By the Marcinkiewicz-Zygmund inequality (see for instance [9, Theorem 2, p.386]) and Lemma 1 of [17] we conclude that
[TABLE]
Finally, it holds for the fourth moment of $M_{j5}$ that
[TABLE]
Observe that if there are more than five different indices among $t_1,\ldots,t_8$, then one factor in the mean above is independent of the rest and the mean is zero. Thus, by (3.19), (3.20) and (3.21),
[TABLE]
Consequently, we obtain
[TABLE]
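since $\sigma_n^4\asymp(p/n)^6+(p/n)^2$ if $\mathbb{E}[X^4]\neq 1$. In the Bernoulli case it holds that $M_{j1}=M_{j3}=M_{j4}=M_{j5}=0$ and $\mathbb{E}[M_{j2}^4]=(j-1)^2O(n^4)$. Therefore, $\frac{1}{n^8\sigma_n^4}\sum_{j=1}^{p}\mathbb{E}[M_j^4]=O(p^3/(n^4\sigma_n^4))=o(1)$ as $\sigma_n^4\asymp(p/n)^4$, which finishes the proof.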
the first term converges to an α/4-stable distribution by [13, Theorem 3.8.2] (see also [39]). By [13, p. 164] this α-stable distribution has the characteristic function
[TABLE]
where $c_\alpha$ is a constant only depending on $\alpha$.
Therefore, it suffices to show that $V_{n,1}$, $V_{n,2}$ and $V_{n,3}$ tend to zero in probability. We will start by showing $V_{n,3}\stackrel{\mathbb{P}}{\to}0$. As $X$ is regularly varying with index $\alpha>2$, the second moment of $X$ exists by Proposition 1.3.2 of [33]. An application of Markov’s inequality yields for $\delta>0$
[TABLE]
where we used that $a_{np}=(np)^{1/\alpha}\ell_1(np)$ for $\alpha\in(2,4)$ (see [4]) and a slowly varying function $\ell_1$. The last step is a consequence of the following property of slowly varying functions $\ell$. By the Potter bounds, which can be found in Theorem 1.5.6 of [4], it holds that for $x$ sufficiently large and any $\epsilon>0$ and $K>1$
[TABLE]
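For reference, Potter's theorem for a slowly varying function $\ell$ is commonly stated as follows. The constants $A$, $\delta$ and the threshold $x_0$ are generic names chosen here, and the paper's display (3.26) may be phrased differently.

```latex
% For every A > 1 and \delta > 0 there exists x_0 = x_0(A,\delta) such that
\[
\frac{\ell(y)}{\ell(x)}
\le A\,\max\Big\{\Big(\frac{y}{x}\Big)^{\delta},\Big(\frac{y}{x}\Big)^{-\delta}\Big\},
\qquad x,\,y\ge x_0 .
\]
% Fixing x = x_0 and using the bound in both directions gives, for y \ge x_0,
\[
A^{-1}\,\ell(x_0)\,x_0^{\delta}\,y^{-\delta}
\;\le\; \ell(y) \;\le\;
A\,\ell(x_0)\,x_0^{-\delta}\,y^{\delta}.
\]
```

In words, a slowly varying function is eventually squeezed between any two powers $y^{-\delta}$ and $y^{\delta}$, up to constants.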
To show that $V_{n,1}\stackrel{\mathbb{P}}{\to}0$ we will truncate the random variables $X_{it}^2$ at $s_n:=p^{2/\alpha}n^{2(1+\epsilon)/\alpha}$ for a sufficiently small $\epsilon>0$.
Then we have for the event $Q:=\bigcup_{i=1}^{p}\bigcup_{t=1}^{n}\{X_{it}^2>s_n\}$ that
[TABLE]
where we used that $|X|$ is regularly varying with index $\alpha\in(2,4)$. Letting $\bar{X}_{it}^2:=X_{it}^2\mathds{1}\{X_{it}^2\le s_n\}$ and $\eta_{it}:=\bar{X}_{it}^2-\mathbb{E}[\bar{X}_{it}^2]$, we get for any $\delta>0$
[TABLE]
where for the last line we used Markov’s inequality and the fact that
[TABLE]
since $\mathbb{E}[X^2\mathds{1}\{X^2>s_n\}]\asymp s_n\mathbb{P}(X^2>s_n)$ by Karamata’s theorem (see [4]) and due to the Potter bounds (3.26).
Regarding the first term in (3.29), we obtain that
[TABLE]
since $\mathbb{E}[X^4\mathds{1}\{X^2\le s_n\}]\asymp s_n^2\mathbb{P}(X^2>s_n)$ by Karamata’s theorem. Using Karamata’s theorem again and arguments similar to those above, the second term of (3.29) also tends to zero as $n\to\infty$.
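The two applications of Karamata's theorem can be made explicit. The following sketch assumes the tail representation $\mathbb{P}(X^2>x)=x^{-\beta}\ell(x)$ with $\beta=\alpha/2\in(1,2)$ and $\ell$ slowly varying; the symbols $\beta$, $\ell$ and the generic level $s\to\infty$ are notation introduced here for illustration.

```latex
\begin{align*}
\mathbb{E}\big[X^2\mathds{1}\{X^2>s\}\big]
 &= s\,\mathbb{P}(X^2>s)+\int_s^{\infty}\mathbb{P}(X^2>x)\,dx
 \;\sim\; \frac{\beta}{\beta-1}\,s\,\mathbb{P}(X^2>s),\\
\mathbb{E}\big[X^4\mathds{1}\{X^2\le s\}\big]
 &= \int_0^{s}2x\,\mathbb{P}(X^2>x)\,dx-s^2\,\mathbb{P}(X^2>s)
 \;\sim\; \frac{\beta}{2-\beta}\,s^2\,\mathbb{P}(X^2>s).
\end{align*}
```

With $\beta=\alpha/2$ the two constants are $\alpha/(\alpha-2)$ and $\alpha/(4-\alpha)$, both finite and positive for $\alpha\in(2,4)$, which is what the relation $\asymp$ requires.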
Therefore, $V_{n,1}\stackrel{\mathbb{P}}{\to}0$ as $n\to\infty$.
Notice that $V_{n,2}$ is equal to $V_{n,1}$ if we exchange the roles of $n$ and $p$, which does not matter for the proof given above. Hence, it also holds that $V_{n,2}\stackrel{\mathbb{P}}{\to}0$ as $n\to\infty$.
Let now $y\in\mathbb{R}$ and $l_1,\ldots,l_k\in\mathbb{N}_0$. Let $(Q_n)_n$ be a sequence of probability measures defined by the distribution functions
[TABLE]
where $B_1,\ldots,B_k\in\mathcal{B}_N$. We recall that for a point process $\xi$, $\mathcal{B}_\xi:=\{B \text{ bounded Borel set}:\xi(\partial B)=0\}$. As $\mathbb{R}^{k+1}$ is a Polish space, $(Q_n)_n$ is tight. Then, by Prokhorov’s theorem, $(Q_n)_n$ is relatively compact with respect to convergence in distribution and hence, for every sequence $(n_m)_{m\in\mathbb{N}}$ in $\mathbb{N}$ there exists a subsequence $(n_{m_j})_{j\in\mathbb{N}}$ with
[TABLE]
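where $\widetilde{Y},\widetilde{N}(B_1),\ldots,\widetilde{N}(B_k)$ are real-valued random variables. Since the sets $B_1,\ldots,B_k$ are arbitrary, we can also find a subsequence $(n_{m_j})_{j\in\mathbb{N}}$ such that (3.30) holds for any choice of $B_1,\ldots,B_k\in\mathcal{B}_N$. Let $\widetilde{N}$ be the point process defined by the random vectors $(\widetilde{N}(B_1),\ldots,\widetilde{N}(B_k))$ for $B_1,\ldots,B_k\in\mathcal{B}_N$.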
Assumption (K2) implies for every $U\in\mathcal{U}$
[TABLE]
Therefore, we get by Lemma 4.6 of [28] that $\mathcal{B}_N\subset\mathcal{B}_{\widetilde{N}}$ and hence $\mathcal{U}\subset\mathcal{B}_{\widetilde{N}}$.
Then, we get
[TABLE]
for every $U\in\mathcal{U}$.
Let $\mathcal{R}$ be the set of locally finite measures $\mu$ on $(\mathbb{R},\mathcal{B})$, where $\mathcal{B}$ consists of all bounded Borel sets, with $\mu(B)\in\mathbb{N}_0$ for all $B\in\mathcal{B}$. Additionally, $\mathcal{N}$ is the $\sigma$-algebra on $\mathcal{R}$ that is generated by the mappings $\mu\mapsto\mu(B)$, $B\in\mathcal{B}$, i.e., the smallest $\sigma$-algebra making these mappings measurable.
We now introduce the Dynkin system
[TABLE]
As $Y_n\stackrel{d}{\to}Y$, we have $\mathbb{P}(Y\le y)=\mathbb{P}(\widetilde{Y}\le y)$ and therefore $\mathcal{R}\in\mathcal{D}$. Moreover, $\mathcal{D}$ is closed under proper differences and monotone limits. Let
[TABLE]
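By assumption (K2), $\mathcal{C}\subset\mathcal{D}$ and $\mathcal{C}$ is closed under finite intersections. Therefore, by 15.2.1 of [28] it follows that $\sigma(\mathcal{C})\subset\mathcal{D}$. By Lemmas 2.2, 1.3 and 1.4 of [28] the map $\varphi:\mu\mapsto\mu^*$ is measurable $\sigma(\mathcal{C})\to\mathcal{N}$, where $\mu^*(B)=\sum_{s\in B}\mathds{1}_{[1,\infty)}(\mu\{s\})$ for every $B\in\mathcal{B}$. As $\sigma(\mathcal{C})\subset\mathcal{D}$ we get for every $M\in\mathcal{N}$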
[TABLE]
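The step from a generating class to the generated $\sigma$-algebra is an instance of Dynkin's $\pi$-$\lambda$ theorem. A generic statement, with $\Omega$ an arbitrary ground set (a symbol introduced here only for illustration), reads:

```latex
% Dynkin system: \Omega \in \mathcal{D}; A \subset B in \mathcal{D} implies
% B \setminus A \in \mathcal{D}; A_1 \subset A_2 \subset \dots in \mathcal{D}
% implies \bigcup_n A_n \in \mathcal{D}.
\[
\mathcal{C}\subset\mathcal{D}\subset 2^{\Omega},\quad
\mathcal{C}\ \text{closed under finite intersections},\quad
\mathcal{D}\ \text{a Dynkin system}
\;\Longrightarrow\;
\sigma(\mathcal{C})\subset\mathcal{D}.
\]
```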
A simple point process μ can be written as
[TABLE]
where $I$ is an index set and the $X_i$’s are random elements.
Therefore, it holds for every $B\in\mathcal{B}$ that
[TABLE]
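To make the map $\mu\mapsto\mu^*$ concrete, here is a small worked example. The points $a\neq b$ and the measure $\mu$ are hypothetical and chosen only for illustration.

```latex
\[
\mu = 2\,\delta_a+\delta_b
\quad\Longrightarrow\quad
\mu^* = \delta_a+\delta_b,
\qquad
\mu(\{a,b\}) = 3,\quad \mu^*(\{a,b\}) = 2 .
\]
```

The map forgets multiplicities and keeps only the support, so $\mu^*\le\mu$ for every integer-valued point measure, with equality exactly when $\mu$ is simple.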
As N is simple, we get
[TABLE]
for every $M\in\mathcal{N}$ and every $y\in\mathbb{R}$. Therefore, we also get $\mathbb{P}(N\in M)=\mathbb{P}(\widetilde{N}^*\in M)$ for every $M\in\mathcal{N}$. We define the set of measures
[TABLE]
Then, it follows that
[TABLE]
and as $B_1,\ldots,B_k,l_1,\ldots,l_k$ were chosen arbitrarily, we have $N\stackrel{d}{=}\widetilde{N}^*$. Additionally, for $I\in\mathcal{J}$ it holds that as $j\to\infty$
[TABLE]
Then, by $N\stackrel{d}{=}\widetilde{N}^*$, the definition of $\widetilde{N}^*$, (3.32), 15.4.3 of [28] and assumption (K1) we get
[TABLE]
Therefore, $\widetilde{N}$ is a.s. simple and consequently
[TABLE]
for every $M\in\mathcal{N}$. Inserting $\widehat{M}$, we get
[TABLE]
As the subsequence $(n_m)_m$ was arbitrary, we conclude for every $l_1,\ldots,l_k\in\mathbb{N}_0$ and $B_1,\ldots,B_k\in\mathcal{B}_N$
[TABLE]
and therefore $(Y_n)_n$ and $(N_n)_n$ are asymptotically independent.
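The passage from subsequences to the full sequence rests on an elementary fact about real sequences, stated here generically; $a_n$ and $a$ are placeholder symbols, to be read as the joint probabilities and their claimed limit.

```latex
\[
\text{every subsequence } (a_{n_m})_m \text{ has a further subsequence with } a_{n_{m_j}}\to a
\quad\Longrightarrow\quad
a_n\to a .
\]
% Proof sketch: if a_n \not\to a, there exist \varepsilon > 0 and a subsequence
% with |a_{n_m} - a| \ge \varepsilon for all m; no further subsequence of it
% can converge to a, which contradicts the assumption.
```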
By the proof of Theorem 2.1 we know that $\lim_{n\to\infty}\mathbb{E}[N_n(I)]=\mathbb{E}[N(I)]$ for every interval $I$. In view of Theorem 2.9, it suffices to show
[TABLE]
for every $y\in\mathbb{R}$ and $U\in\mathcal{U}$, where $\Phi$ is the distribution function of the standard normal distribution and $\mathcal{U}$ is the set of finite unions of intervals. As $Z_n\stackrel{d}{\to}N(0,1)$, this is equivalent to
[TABLE]
for every $U\in\mathcal{U}$ and $y\in\mathbb{R}$. Throughout this proof, we let $U\in\mathcal{U}$ and $y\in\mathbb{R}$ be arbitrary.
Recall that $p=O\big(n^{(s-2)/4}\big)$ for some $s\ge 4$ and that $\varepsilon>0$ is such that $\mathbb{E}[|X|^{s+\varepsilon}]<\infty$. We need the following notation. For $\tilde{s}=s+\varepsilon$ let $(\beta_n)_n$ be a positive sequence which tends to zero and satisfies $\beta_n\gg\big(\mathbb{E}[|X|^{\tilde{s}}\mathds{1}\{|X|>\beta_n(np)^{1/\tilde{s}}\}]\big)^{1/\tilde{s}}$. Such a sequence exists for reasons similar to those in the proof of Theorem 2.2. We set
so that taking the limit δ→0 establishes (3.33) by the continuity of the normal distribution.
Therefore, it remains to show (3.36) for which we proceed similarly to [17].
To this end, we set $A_n=A_n(y)=\{\bar{Z}_n\le y\}$ and $B_I=B_I(U)=\{d_p(nS_{ij}-d_p)\in U\}$, where $I=(i,j)\in\Lambda_n:=\{(i,j):1\le i<j\le p\}$. For $I_1=(i_1,j_1)\in\Lambda_n$ and $I_2=(i_2,j_2)\in\Lambda_n$ we write $I_1<I_2$ if $i_1<i_2$ or ($i_1=i_2$ and $j_1<j_2$). Then we have
[TABLE]
where $A_nB_I=A_n\cap B_I$.
Using $Z_n-\bar{Z}_n\stackrel{\mathbb{P}}{\to}0$, Theorem 2.2 and Theorem 2.1, it holds
Note that $\lim_{k\to\infty}\frac{(\mu(U))^k}{k!}=0$. By first letting $n\to\infty$ and then $k\to\infty$ in (3.40) we now obtain (3.38) provided that
[TABLE]
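The factorial limit used in this step holds for every finite value $x=\mu(U)\ge 0$. A one-line justification, where $x$ and $k_0$ are notation introduced here:

```latex
\[
\frac{x^{k}}{k!}
= \frac{x^{k_0}}{k_0!}\prod_{m=k_0+1}^{k}\frac{x}{m}
\le \frac{x^{k_0}}{k_0!}\,2^{-(k-k_0)}
\longrightarrow 0
\quad (k\to\infty),
\qquad k_0:=\lceil 2x\rceil,\ k\ge k_0 ,
\]
% since x/m < 1/2 for every m > 2x.
```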
Proof of (3.42).
For fixed $I_1<\ldots<I_d\in\Lambda_n$ with $I_l=(i_l,j_l)$ for $l=1,\ldots,d$ we will identify the summands of $\bar{Z}_n$ in (3.35) that are dependent on $B_{I_1},\ldots,B_{I_d}$ and show that their contribution is negligible (in a suitable way). To this end, we introduce the set $\Lambda_{n,d}=\Lambda_{n,d}(I_1,\ldots,I_d)$ through
[TABLE]
The set $\Lambda_{n,d}$ includes the indices $(i,j)$ of all summands of $\bar{Z}_n$ that are dependent on $B_{I_1},\ldots,B_{I_d}$. Notice that $\Lambda_{n,d}$ is not a subset of $\Lambda_n$ because $\Lambda_{n,d}$ might also contain indices $(i,i)$ corresponding to diagonal elements $S_{ii}$ of the covariance matrix.
For our further arguments the following bound on the cardinality of $\Lambda_{n,d}$ is important:
[TABLE]
By the definition of $\Lambda_{n,d}$, $\bar{Z}_n-\bar{Z}_{n,d}$ is independent of $B_{I_1},\ldots,B_{I_d}$, where
[TABLE]
Using this independence and applying Lemma 4.4 twice we get for δ>0,
Sending δ to zero establishes (3.42). Therefore it remains to show (3.43).
By Markov’s inequality we obtain for even $\tau\in\mathbb{N}$ and $\delta>0$,
[TABLE]
Letting $K:=\{i_l,j_l\mid l=1,\ldots,d\}$ we write the first term on the right-hand side of (3.44) as follows
[TABLE]
where we used the Marcinkiewicz-Zygmund inequality (see [9, Theorem 2, p. 386]) and the inequality $(a+b)^c\le 2^{c-1}(a^c+b^c)$ for $a>0$, $b>0$ and $c\ge 1$.
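The elementary power inequality is a direct consequence of the convexity of $x\mapsto x^c$ on $[0,\infty)$ for $c\ge 1$, as the following short derivation shows:

```latex
\[
\Big(\frac{a+b}{2}\Big)^{c}\le\frac{a^{c}+b^{c}}{2}
\quad\Longrightarrow\quad
(a+b)^{c}\le 2^{c-1}\big(a^{c}+b^{c}\big),
\qquad a,b>0,\ c\ge 1 .
\]
```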
We apply the law of total expectation and the Marcinkiewicz-Zygmund inequality to the first term of (3.45) and obtain
[TABLE]
where $K_\tau$ is a constant only depending on $\tau$, which may vary from line to line.
Recalling the notation $\bar{T}_{ij}=\sum_{t=1}^{n}\bar{X}_{it}\bar{X}_{jt}$ we may write
[TABLE]
If more than $\tau+1$ of the indices $t_1,u_1,\ldots,t_\tau,u_\tau$ are different, then there exists a tuple $(t_k,u_k)$ such that $t_k\neq t_l$, $t_k\neq u_l$, $u_k\neq t_l$ and $u_k\neq u_l$ for every $l\neq k$. Therefore one of the factors in the mean of (3.46) is independent of the others, so that the summand vanishes.
The remaining summands are bounded above by
[TABLE]
Similar to (3.19), (3.20) and (3.21) it holds that $\mathbb{E}[\bar{X}]=o((np)^{-(\tilde{s}-1)/\tilde{s}})$, $\mathbb{E}[\bar{X}^r]\le C$ for $r\le\tilde{s}$ and $\mathbb{E}[\bar{X}^r]=o((np)^{(r-\tilde{s})/\tilde{s}})$ for $r>\tilde{s}$. From this we deduce that if $|\{t_1,u_1,\ldots,t_\tau,u_\tau\}|=\ell$ one has
The term in the maximum is either increasing or decreasing with $\ell$, or it is equal to $o((np)^{(4\tau+4-2\tilde{s})/\tilde{s}})$ if $n\asymp(np)^{4/\tilde{s}}$.
Therefore, the right-hand side in (3.49) is
[TABLE]
where we used that $p=O(n^{(s-2)/4})$ and $(s+2)/\tilde{s}>1$ for $\varepsilon<1$, and additionally, $\sigma_n^2\asymp(p/n)^3+p/n$ if $\mathbb{E}[X^4]\neq 1$.
In the Bernoulli case $\mathbb{E}[X^4]=1$, (3.47) is bounded by a constant and therefore the first term of (3.45) is $O(n/p^{\tau/2})$ as $\sigma_n^2\asymp(p/n)^2$.
By Jensen’s inequality the second term of (3.45) is bounded above by
[TABLE]
where $K_\tau'$ is a constant only depending on $\tau$, which may vary from line to line.
For the mean in the sum we can write
[TABLE]
We consider one of the summands above. There can be $0\le q\le\tau$ pairs of indices $(t_i,u_i)$ with $t_i=u_i$. Assume these pairs are $(t_1,u_1),\ldots,(t_q,u_q)$ and that $t_i\neq u_i$ for $i>q$. Then the mean in the sum equals
[TABLE]
If there are more than $\tau-q/2+1$ different indices, the summand is equal to zero. For $1\le\ell\le\tau-q/2+1$ different indices the summand is bounded above by
The term in the maximum is either increasing or decreasing with $\ell$, or it is equal to $o((np)^{(2\tau+2-2(\tilde{s}-1)(\tau-q))/\tilde{s}})$ if $n\asymp(np)^{2/\tilde{s}}$.
Therefore, the expression in (3.51) is
[TABLE]
since in the first step both terms in the maximum are growing with $q$, so that the terms are largest for $q=\tau$, and since in the second step the last term of the maximum is larger than the first term for every $p=O(n^{(s-2)/4})$. As $\sigma_n^2\asymp p/n+(p/n)^3$ if $\mathbb{E}[X^4]\neq 1$, we have
[TABLE]
In the Bernoulli case the second term of (3.45) is equal to zero.
Throughout this section $p=p_n$ is a sequence of positive integers tending to infinity as $n\to\infty$. Furthermore, let $X,(X_{it})_{i,t\ge 1}$ be iid random variables with $\mathbb{E}[X]=0$ and $\mathbb{E}[X^2]=1$.
Lemma 4.1.
Assume $\mathbb{E}[|X|^s]<\infty$ for some $s\ge 4$. For a positive sequence $(\beta_n)_n$ which tends to zero and satisfies $\beta_n\gg\big(\mathbb{E}[|X|^s\mathds{1}\{|X|>\beta_n(np)^{1/s}\}]\big)^{1/s}$, set
The probability of $Q_n$ tends to zero as $n\to\infty$, since by the union bound and Markov’s inequality
[TABLE]
where the properties of the sequence $\beta_n$ were used in the last step.
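The union bound and Markov step can be sketched as follows, under the assumption (made here, since the definition is not restated) that $Q_n$ denotes the event that at least one of the $np$ entries exceeds the truncation level, $Q_n=\bigcup_{i\le p}\bigcup_{t\le n}\{|X_{it}|>\beta_n(np)^{1/s}\}$:

```latex
\[
\mathbb{P}(Q_n)
\le np\,\mathbb{P}\big(|X|>\beta_n(np)^{1/s}\big)
\le np\,\frac{\mathbb{E}\big[|X|^{s}\mathds{1}\{|X|>\beta_n(np)^{1/s}\}\big]}{\beta_n^{s}\,np}
= \frac{\mathbb{E}\big[|X|^{s}\mathds{1}\{|X|>\beta_n(np)^{1/s}\}\big]}{\beta_n^{s}}
\longrightarrow 0 ,
\]
% the limit being zero because
% \beta_n \gg ( E[ |X|^s 1{ |X| > \beta_n (np)^{1/s} } ] )^{1/s}.
```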
Therefore, we have for δ>0,
[TABLE]
We observe that
[TABLE]
and similarly it follows
[TABLE]
Thus we obtain for $1\le i<j\le p$,
[TABLE]
and as $T_{11}^2-\bar{T}_{11}^2$ is nonnegative we get
[TABLE]
Thereby, it holds
[TABLE]
In the case $\mathbb{E}[X^4]\neq 1$, the right-hand side tends to zero as $\sigma_n\asymp(p/n)^{1/2}+(p/n)^{3/2}$.
In the symmetric Bernoulli case $X^2=1$, the probability in (4.58) is zero, establishing the desired result.
∎
Lemma 4.2.
Let $\bar{T}_{11}$, $\bar{T}_{12}$ and $\bar{T}_{13}$ be defined as in (3.14). Under the assumptions of Theorem 2.2 it holds, as $n\to\infty$,
[TABLE]
Proof.
Recalling that $\bar{X}=X\mathds{1}\{|X|\le(np)^{1/4}\beta_n\}$, we get by (4.59) and (4.60) with $s=4$
[TABLE]
For higher moments we obtain
[TABLE]
After these preparations we will now prove the first claim of the lemma. By the multinomial theorem we have
[TABLE]
where we used (4.61) and (4.62) in the last step. The same arguments also yield
[TABLE]
Since $\mathbb{E}[\bar{T}_{11}^2]^2=n^4+2n^3(\mathbb{E}[X^4]-1)+o(n^3)$, we conclude that
[TABLE]
which proves the first part of the lemma. For the second part we consider $\operatorname{Var}(\bar{T}_{12}^2)=\mathbb{E}[\bar{T}_{12}^4]-\mathbb{E}[\bar{T}_{12}^2]^2$. Using (4.61) and (4.62), we get
[TABLE]
as well as
[TABLE]
which implies $\mathbb{E}[\bar{T}_{12}^2]^2=n^2+o(n)$. Since $\operatorname{Var}(\bar{T}_{12}^2)=\mathbb{E}[\bar{T}_{12}^4]-\mathbb{E}[\bar{T}_{12}^2]^2$, the second part of the lemma is established.
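As a sanity check on the orders of magnitude, the corresponding moments without truncation can be computed exactly. The sketch below assumes the untruncated analogue $T_{ij}=\sum_{t=1}^{n}X_{it}X_{jt}$ (notation introduced here), iid entries with $\mathbb{E}[X]=0$, $\mathbb{E}[X^2]=1$ and $\mathbb{E}[X^4]<\infty$; the truncated quantities in the lemma differ from these by lower-order terms.

```latex
\begin{align*}
\mathbb{E}[T_{11}^2] &= n\,\mathbb{E}[X^4]+n(n-1) = n^2+n\big(\mathbb{E}[X^4]-1\big),\\
\mathbb{E}[T_{12}^2] &= n,\\
\mathbb{E}[T_{12}^4] &= n\big(\mathbb{E}[X^4]\big)^2+3n(n-1),\\
\operatorname{Var}(T_{12}^2) &= 2n^2+n\Big(\big(\mathbb{E}[X^4]\big)^2-3\Big).
\end{align*}
```

Squaring the first line gives $n^4+2n^3(\mathbb{E}[X^4]-1)+O(n^2)$, in agreement with the leading terms used in the proof.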
To show part three of the lemma, we compute, using (4.61), that
Therefore, we obtain in combination with (4.63) and (4.64),
[TABLE]
completing the proof of the lemma.
∎
Lemma 4.3.
Let $(\bar{T}_{ij})$ be defined as in (3.14) and write $\mathcal{F}_j$ for the sigma algebra generated by $\{\tilde{x}_1,\ldots,\tilde{x}_j\}$, where $\tilde{x}_i=(X_{i1},\ldots,X_{in})$. Under the conditions of Theorem 2.2 it holds for $1\le i_1,i_2<j\le p$ that
Let $Y$ and $Y'$ be real-valued random variables and $B$ an arbitrary event. Then it holds for every $y\in\mathbb{R}$ and $\delta>0$ that
[TABLE]
Proof.
We have
[TABLE]
and
[TABLE]
∎
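A standard inequality of this type, given here only as a hedged sketch that may differ in form from the lemma's exact statement, compares the joint probabilities of two random variables that are close with high probability:

```latex
\[
\mathbb{P}(Y\le y-\delta,\,B)-\mathbb{P}(|Y-Y'|>\delta)
\;\le\; \mathbb{P}(Y'\le y,\,B)
\;\le\; \mathbb{P}(Y\le y+\delta,\,B)+\mathbb{P}(|Y-Y'|>\delta).
\]
% Both bounds follow from the inclusions
% \{Y' \le y\} \subset \{Y \le y+\delta\} \cup \{|Y-Y'| > \delta\} and
% \{Y \le y-\delta\} \subset \{Y' \le y\} \cup \{|Y-Y'| > \delta\},
% intersected with B.
```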
Bibliography
[1] Anderson, C. W., and Turkman, K. F. The joint limiting distribution of sums and maxima of stationary sequences. J. Appl. Probab. 28, 1 (1991), 33–44.
[2] Anderson, C. W., and Turkman, K. F. Sums and maxima of stationary sequences with heavy tailed distributions. Sankhyā Ser. A 57, 1 (1995), 1–10.
[3] Bai, Z., and Silverstein, J. W. Spectral Analysis of Large Dimensional Random Matrices, second ed. Springer Series in Statistics. Springer, New York, 2010.
[4] Bingham, N. H., Goldie, C. M., and Teugels, J. L. Regular Variation, vol. 27 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 1987.
[5] Cai, T. T. Global testing and large-scale multiple testing for high-dimensional covariance structures. Annual Review of Statistics and Its Application 4 (2017), 423–446.
[6] Cai, T. T., and Jiang, T. Phase transition in limiting distributions of coherence of high-dimensional random matrices. J. Multivariate Anal. 107 (2012), 24–39.
[7] Chen, D., and Feng, L. Asymptotic independence of the quadratic form and maximum of independent random variables with applications to high-dimensional tests. arXiv preprint arXiv:2204.08628 (2022).
[8] Chow, T. L., and Teugels, J. L. The sum and the maximum of i.i.d. random variables. In Proceedings of the Second Prague Symposium on Asymptotic Statistics (Hradec Králové, 1978) (1979), North-Holland, Amsterdam-New York, pp. 81–92.