Local semicircle law under fourth moment condition
Friedrich Götze, Alexey Naumov, Alexander Tikhomirov

TL;DR
This paper extends the local semicircle law for Wigner matrices to the case where only the fourth moment is finite, removing the previous need for a higher moment condition, and discusses implications for spectral convergence and eigenvector localization.
Contribution
It proves that the local semicircle law holds under only the finite fourth moment condition, improving previous results that required a higher moment assumption.
Findings
The local semicircle law holds with only finite fourth moment.
Convergence rates of the empirical spectral distribution are established.
Results on eigenvalue localization and eigenvector delocalization are provided.
Abstract
We consider a random symmetric matrix $\mathbf{X} = [X_{jk}]_{j,k=1}^{n}$ with upper triangular entries being independent random variables with mean zero and unit variance. Assuming that $\max_{jk} \mathbb{E}|X_{jk}|^{4+\delta} < \infty$, $\delta > 0$, it was proved in [Götze, Naumov and Tikhomirov, Bernoulli, 2018] that with high probability the typical distance between the Stieltjes transform $m_n(z)$, $z = u + iv$, of the empirical spectral distribution (ESD) and the Stieltjes transform $m_{sc}(z)$ of the semicircle law is of order $(nv)^{-1} \log n$. The aim of this paper is to remove $\delta > 0$ and show that this result still holds if we assume that $\max_{jk} \mathbb{E}|X_{jk}|^{4} < \infty$. We also discuss applications to the rate of convergence of the ESD to the semicircle law in the Kolmogorov distance, rates of localization of the eigenvalues around the classical positions…
Friedrich Götze
Faculty of Mathematics, Bielefeld University, Bielefeld, Germany
Alexey A. Naumov
National Research University Higher School of Economics, Moscow, Russia; and IITP RAS, Moscow, Russia
Alexander N. Tikhomirov
Department of Mathematics, Komi Science Center of Ural Division of RAS, Syktyvkar, Russia; and National Research University Higher School of Economics, Moscow, Russia
(Date: March 17, 2024)
Abstract.
We consider a random symmetric matrix $\mathbf{X} = [X_{jk}]_{j,k=1}^{n}$ with upper triangular entries being independent random variables with mean zero and unit variance. Assuming that $\max_{jk} \mathbb{E}|X_{jk}|^{4+\delta} < \infty$, $\delta > 0$, it was proved in [17] that with high probability the typical distance between the Stieltjes transform $m_n(z)$, $z = u + iv$, of the empirical spectral distribution (ESD) and the Stieltjes transform $m_{sc}(z)$ of the semicircle law is of order $(nv)^{-1} \log n$. The aim of this paper is to remove $\delta > 0$ and show that this result still holds if we assume that $\max_{jk} \mathbb{E}|X_{jk}|^{4} < \infty$. We also discuss applications to the rate of convergence of the ESD to the semicircle law in the Kolmogorov distance, rates of localization of the eigenvalues around the classical positions and rates of delocalization of eigenvectors.
Key words and phrases:
Wigner's random matrices, local semicircle law, Stieltjes transform, Stein's method, rigidity, delocalization, empirical spectral distribution
1. Introduction and main result
One of the main questions in random matrix theory is to investigate the limiting behaviour of spectral statistics of eigenvalues of large dimensional random matrices, for example, the distance between neighbouring eigenvalues or the $k$-point correlation function. It turns out that there is a universality phenomenon: the distribution of these statistics is independent of the particular distribution of the matrix entries, but depends on some global characteristics like the existence of moments. In recent years there has been significant progress in the analysis of the universality phenomenon for the Wigner ensemble of random matrices, i.e. Hermitian matrices with independent entries subject to the symmetry constraint. For a comprehensive literature review and more details we refer the interested reader to the forthcoming book by L. Erdős and H.-T. Yau [11]. In the current paper we will not discuss the question of universality, but turn our attention to the local semicircle law, which is a necessary intermediate step towards universality but has its own important applications.
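In what follows we consider a Hermitian random matrix $\mathbf{X} = [X_{jk}]_{j,k=1}^{n}$ such that $X_{jk}$, $1 \le j \le k \le n$, are independent random variables (r.v.) with zero mean. We also allow the distribution of the matrix entries to depend on $n$, but omit the latter from the matrix notation. Furthermore, for simplicity we will assume that $\mathbb{E}|X_{jk}|^{2} = 1$ for all $1 \le j, k \le n$. As mentioned above, we refer to such matrices as Wigner's ensemble. Denote the eigenvalues of the normalized matrix $\mathbf{W} := n^{-1/2}\mathbf{X}$ in increasing order by $\lambda_1 \le \ldots \le \lambda_n$ and introduce the empirical spectral distribution (ESD) $F_n(x) := \frac{1}{n}\sum_{k=1}^{n} \mathbb{1}[\lambda_k \le x]$. It was proved by E. Wigner [26] and further generalized by many authors (see e.g. the monographs [3], [2], [23]) that with probability one $F_n$ converges weakly to the deterministic limit $G_{sc}$ with absolutely continuous density
$$ g_{sc}(\lambda) := \frac{1}{2\pi}\sqrt{(4 - \lambda^{2})_{+}}, \tag{1.1} $$
where $(x)_{+} := \max(x, 0)$. In particular, these results imply convergence in the macroscopic regime, i.e. for all intervals of fixed length, independent of $n$, which contain a macroscopically large number of eigenvalues. It turned out that an appropriate analytical tool is the Stieltjes transform of the ESD given by
$$ m_n(z) := \int_{-\infty}^{\infty} \frac{dF_n(\lambda)}{\lambda - z} = \frac{1}{n}\sum_{k=1}^{n} \frac{1}{\lambda_k - z}, \tag{1.2} $$
where $z = u + iv$, $v > 0$. Under rather general conditions one may show (see e.g. [23]) that with probability one for fixed $v > 0$
$$ \lim_{n \to \infty} m_n(z) = m_{sc}(z) := \int_{-\infty}^{\infty} \frac{dG_{sc}(\lambda)}{\lambda - z} = -\frac{z}{2} + \sqrt{\frac{z^{2}}{4} - 1}. \tag{1.3} $$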
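As a numerical companion to the objects just introduced, the following sketch evaluates the semicircle density and the two Stieltjes transforms in plain Python. It assumes the standard normalization in which the semicircle law is supported on $[-2, 2]$ and its Stieltjes transform is the root of $m^2 + zm + 1 = 0$ with positive imaginary part; the function names are illustrative and do not come from the paper.

```python
import cmath
import math


def semicircle_density(x):
    """Density of the semicircle law on [-2, 2] (assumed normalization)."""
    return math.sqrt(max(4.0 - x * x, 0.0)) / (2.0 * math.pi)


def m_sc(z):
    """Stieltjes transform of the semicircle law for Im z > 0.

    Picks the root of m^2 + z*m + 1 = 0 whose imaginary part is positive.
    """
    root = cmath.sqrt(z * z - 4.0)
    m = (-z + root) / 2.0
    if m.imag <= 0:
        m = (-z - root) / 2.0
    return m


def m_sc_numeric(z, steps=20000):
    """Midpoint-rule approximation of the integral of g_sc(x) / (x - z)."""
    h = 4.0 / steps
    total = 0j
    for k in range(steps):
        x = -2.0 + (k + 0.5) * h
        total += semicircle_density(x) / (x - z)
    return total * h


def empirical_stieltjes(eigenvalues, z):
    """Stieltjes transform of the ESD: (1/n) * sum_k 1 / (lambda_k - z)."""
    return sum(1.0 / (lam - z) for lam in eigenvalues) / len(eigenvalues)
```

Comparing `m_sc(z)` with `m_sc_numeric(z)` at a point such as `z = 0.5 + 0.1j` gives a quick sanity check that the closed form and the integral definition agree.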
It is of interest to investigate the microscopic regime, i.e. the case of smaller intervals, where the number of eigenvalues ceases to be macroscopically large. This regime is essential for many applications, such as the rate of convergence of $F_n$ to the limiting distribution $G_{sc}$, rigidity of eigenvalues or delocalization of the corresponding eigenvectors, among others. To deal with this regime one needs to establish the convergence of $m_n(z)$ to $m_{sc}(z)$ in the region , where is some function of $n$. Significant progress in that direction was recently made in a series of results by L. Erdős, B. Schlein, H.-T. Yau, J. Yin et al. [9], [8], [10], [12], [6] showing that with high probability uniformly in
[TABLE]
where and is some positive constant. This result was called the local semicircle law. It means that the fluctuations of $m_n(z)$ around $m_{sc}(z)$ are of order $(nv)^{-1}$ (up to a logarithmic factor). In the papers [9], [8], [10], [12] the inequality (1.4) was proved assuming that the distribution of $X_{jk}$ has sub-exponential tails for all $1 \le j, k \le n$. Moreover, in [6] this assumption was relaxed to requiring for all , where are some constants. In recent years a series of results has appeared in which the latter assumptions were further relaxed to the condition that
$$ \max_{1 \le j, k \le n} \mathbb{E}|X_{jk}|^{4+\delta} < \infty \tag{1.5} $$
for some $\delta > 0$, see e.g. [7], [5], [20], [13], [14], [15], [18] and [17]. In particular, the result of [17] implies that (1.4) holds with .
The main emphasis of the current paper is to remove $\delta > 0$ from the condition (1.5). The main idea of the proof is motivated by the recent result of A. Aggarwal [1], who established bulk universality for Wigner's matrices with finite moments of order $2 + \varepsilon$. He proved that (1.4) still holds true, but the factor is replaced by , where is some constant depending on and . In the current paper we show that (1.4) still holds assuming a finite fourth moment only. Taking into account the behaviour of the extreme eigenvalues of $\mathbf{W}$ we also believe that this is the best possible moment assumption for (1.4) to remain valid. In Section 1.3 below we briefly discuss how one may achieve this aim using techniques from [1] and [17].
1.1. Notations
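Throughout the paper we will use the following notations. We assume that all random variables are defined on a common probability space $(\Omega, \mathcal{F}, \mathbb{P})$ and let $\mathbb{E}$ be the mathematical expectation with respect to $\mathbb{P}$. For a r.v. $\xi$ we use the notation $\mathbb{E}^{1/p}|\xi|^{p}$ to denote $(\mathbb{E}|\xi|^{p})^{1/p}$. We denote by $\mathbb{1}[A]$ the indicator function of the set $A$.
We denote by $\mathbb{R}$ and $\mathbb{C}$ the sets of all real and complex numbers. Let $\operatorname{Im} z$, $\operatorname{Re} z$ be the imaginary and real parts of $z \in \mathbb{C}$. We also define $\mathbb{C}^{+} := \{z \in \mathbb{C} : \operatorname{Im} z > 0\}$. Let $\mathbb{T} = \{1, \ldots, n\}$ denote the set of the first $n$ positive integers. For any $\mathbb{J} \subset \mathbb{T}$ introduce $\mathbb{T}_{\mathbb{J}} := \mathbb{T} \setminus \mathbb{J}$. To simplify all notations we will write $\mathbb{T}_{j}$, $\mathbb{T}_{\mathbb{J},j}$ instead of $\mathbb{T}_{\{j\}}$ and $\mathbb{T}_{\mathbb{J} \cup \{j\}}$ respectively.
For any matrix $\mathbf{W}$ together with its resolvent $\mathbf{R}$ and Stieltjes transform $m_n$ we shall systematically use the corresponding notations $\mathbf{W}^{(\mathbb{J})}$, $\mathbf{R}^{(\mathbb{J})}$, $m_n^{(\mathbb{J})}$, respectively, for the sub-matrix of $\mathbf{W}$ with entries $X_{jk}$, $j, k \in \mathbb{T} \setminus \mathbb{J}$. For simplicity we write $\mathbf{W}^{(j)}$, $\mathbf{W}^{(\mathbb{J},j)}$ instead of $\mathbf{W}^{(\{j\})}$, $\mathbf{W}^{(\mathbb{J} \cup \{j\})}$. The same applies to $\mathbf{R}$, $m_n$ etc.
By $C$ and $c$ we denote some positive constants.
For an arbitrary matrix $\mathbf{A}$ taking values in $\mathbb{C}^{n \times n}$ we define the operator norm by $\|\mathbf{A}\| := \sup_{x \in \mathbb{R}^{n} : \|x\| = 1} \|\mathbf{A}x\|_{2}$, where $\|x\|_{2} := \big(\sum_{j=1}^{n} |x_j|^{2}\big)^{1/2}$. We also define the Hilbert-Schmidt norm by $\|\mathbf{A}\|_{2} := \big(\sum_{j,k=1}^{n} |\mathbf{A}_{jk}|^{2}\big)^{1/2}$.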
1.2. Main results
Without loss of generality we will assume in what follows that is a real symmetric matrix which satisfies the following conditions.
Definition 1.1 (Conditions ).
We say that a Hermitian random matrix satisfies conditions if its entries in the upper triangular part are independent random variables with and .
Our results proven below apply to the case of Hermitian matrices as well. Here we may additionally assume for simplicity that real and imaginary parts, , are independent r.v. for all . Otherwise one needs to extend the moment inequalities for linear and quadratic forms in complex r.v. (see [13][Theorem A.1-A.2]) to the case of dependent real and imaginary parts, the details of which we omit.
We will also often refer to the following condition .
Definition 1.2 (Conditions ).
We say that the set of conditions holds if are satisfied and , where .
Let us introduce the following notation
[TABLE]
where were defined in (1.2) and (1.3) respectively. Recall that is the imaginary part of . The main result of this paper is the following theorem, which estimates the fluctuations (1.4).
Theorem 1.3.
Assume that the conditions hold and let be some constant.
- There exist positive constants and depending on and such that
[TABLE]
for all , and .
- For any there exist positive constants and depending on such that
[TABLE]
for all , and .
Remark.
- Using Markov's inequality the bound (1.6) may be used to show that for any there exists some positive constant such that with probability at least for all and :
[TABLE]
Hence, (1.4) holds with .
- It is interesting to investigate the case of a generalised matrix when for any , but could be different. Unfortunately, the technique of the current paper does not allow us to deal with this case directly. However, one may apply a combination of the multiplicative descent used in this paper (and first developed in [4]) together with the additive descent developed in the series of papers by L. Erdős, B. Schlein, H.-T. Yau, J. Yin et al.; see e.g. [11]. This combination was recently used in [16]. We do not give the details here in order to keep the proof simple.
The result of the previous theorem may be formulated under conditions . In this case one may truncate and re-normalize the entries of by means of Lemmas A.1-A.3 in the appendix. We obtain the following corollary.
Corollary 1.4.
Assume that the conditions hold and let be some constant. There exist positive constants and depending on and such that
[TABLE]
for all , and . A similar result holds for (1.7).
We believe that the power of the logarithm could be reduced. The main technical problem is in Lemmas A.1-A.3 in the appendix. Truncation on the level near requires additional logarithmic factors.
1.3. Sketch of the proof of Theorem 1.3
To prove Theorem 1.3 we use the strategy from [17].
(1) Applying inequality (2.6) we may estimate (depending on $z$ being near or far from the spectral interval $[-2, 2]$) by the moments . This inequality first appeared in [4][Proposition 2.2].
(2) Estimation of consists of two parts:
a) Estimation of ; see Lemma 3.1. This bound requires estimating high moments of (i.e. quadratic and linear forms ); see (2). This step also uses the following crude bound
[TABLE]
Unfortunately, the technique from [13], [15] does not work since we may only truncate on the level , where is of logarithmic order (as opposed to the case when , which allows one to truncate on the level for some small ). Let us demonstrate this on the quadratic form . Applying [15][Theorem 7] or [13][Theorem A.2] we obtain
[TABLE]
where satisfies . There is no problem in dealing with the first two terms on the r.h.s. of the previous inequality. The most difficult term is the last one. In the sub-Gaussian case this term has the order (see [17][Lemma 4.4]) and is small for (here we also use the fact that ). Under assumptions we may only guarantee that . But . Hence, the last term in the estimate for is bounded by , which could be very large for large . It is worth mentioning here that if one can truncate on the level, say, , then there will be an additional factor in the denominator.
To overcome this problem we use ideas from [1]. We introduce a configuration matrix $\mathbf{L}$ such that if and if for some of logarithmic order; see (3.4). One may show that with high probability this matrix has a block structure (see (3.6)). This means that with high probability each row and each column of contains only a small (of logarithmic order) number of large entries () and a large number of small entries. Fixing an admissible (see Definition 3.6 below) configuration (corresponding to the block structure) one may estimate $\mathbb{E}(|\mathbf{R}_{jk}(z)|^{p} \,\big|\, \mathbf{L})$, $z = u + iv$; see Lemma 3.7. For each subrow with small entries we use bounds from [13], [15]. For each subrow with large entries we may use crude bounds which do not contain the factor ; see decomposition (3) and the corresponding estimates below. Using now the total probability rule and the crude bound if is not admissible we estimate .
b) More accurate (than (1.8)) bounds for ; see section 4. We use Lemma 4.1 which provides a general framework for estimation of moments of statistics of independent r.v. This requires estimation of for . The latter could be done since has moments of order up to .
1.4. Applications of the main results
This section is devoted to applications of Theorem 1.3 and Corollary 1.4 to questions such as the rate of convergence of the ESD to the semicircle law, rigidity estimates for the eigenvalues and delocalization bounds for the corresponding eigenvectors. Up to the power of logarithmic factors these results repeat the corresponding results from [14], [17]. We formulate all results with comments, but omit the proofs. The interested reader may recover the proofs from the corresponding papers mentioned above. It is worth mentioning that these questions have been intensively studied under stronger assumptions in many papers; see e.g. [9], [8], [10], [5], [7], [6] and [25]. We also refer to the recent monograph [11] and the survey [24].
1.4.1. Rate of convergence of ESD
Our first result provides quantitative estimates for the rate of convergence of the ESD to the semicircle law in the Kolmogorov distance.
Corollary 1.5.
Assume that the conditions hold. For any there exists a positive constant such that with probability at least
[TABLE]
For the proof see [17][Theorem 1.4]. The difference is in application of Corollary 1.4 instead of [17][Theorem 1.1]. The proof is mainly based on application of the smoothing inequality (see e.g. [17][Corollary 6.2]) and Corollary 1.4. We believe that the power of the logarithm could be reduced from to or even , which would be optimal due to the result of Gustavsson [19] for the Gaussian Unitary Ensembles (GUE).
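The Kolmogorov distance appearing in Corollary 1.5 is straightforward to evaluate for a given list of eigenvalues. The sketch below is only an illustration: it assumes the semicircle law supported on $[-2, 2]$ with its closed-form distribution function, takes the eigenvalues as a plain Python list (however they were computed), and uses hypothetical function names.

```python
import math


def semicircle_cdf(x):
    """Distribution function of the semicircle law on [-2, 2]."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi


def kolmogorov_distance(eigenvalues):
    """sup_x |F_n(x) - G_sc(x)| for the ESD F_n of the given eigenvalues.

    Since F_n is a step function and G_sc is continuous and non-decreasing,
    the supremum is attained at a jump of F_n, approached from the left or
    from the right.
    """
    points = sorted(eigenvalues)
    n = len(points)
    worst = 0.0
    for k, x in enumerate(points, start=1):
        g = semicircle_cdf(x)
        worst = max(worst, abs(k / n - g), abs((k - 1) / n - g))
    return worst
```

A single eigenvalue at the origin, for example, is at distance one half from the semicircle law, which is what `kolmogorov_distance([0.0])` returns.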
Using this result one may prove the following corollary.
Corollary 1.6.
Assume that conditions hold. For any there exists a positive constant such that for all with probability at least :
[TABLE]
1.4.2. Rigidity
Taking into account the result of Corollary 1.5 and using Smirnov's transform one may also get rigidity estimates for the majority of the eigenvalues . More precisely, one may control the eigenvalues in the bulk of the spectrum. To deal with the smallest (largest) eigenvalues one needs a more accurate bound than in Theorem 1.3 for the distance between the Stieltjes transforms. For any we define .
Theorem 1.7.
Assume that the conditions hold and and . There exist positive constants and depending on and such that
[TABLE]
for all , and .
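Let us define the quantile position $\gamma_j$ of the $j$-th eigenvalue by
$$ \gamma_j : \quad \int_{-\infty}^{\gamma_j} g_{sc}(\lambda)\, d\lambda = \frac{j}{n}, \qquad 1 \le j \le n. $$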
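The quantile positions can be computed numerically by inverting the semicircle distribution function. The following sketch assumes the usual definition, namely that the $j$-th quantile position is the point where the semicircle distribution function equals $j/n$, and solves for it by bisection; the helper names are illustrative.

```python
import math


def semicircle_cdf(x):
    """Distribution function of the semicircle law on [-2, 2]."""
    if x <= -2.0:
        return 0.0
    if x >= 2.0:
        return 1.0
    return 0.5 + x * math.sqrt(4.0 - x * x) / (4.0 * math.pi) + math.asin(x / 2.0) / math.pi


def classical_position(j, n, tol=1e-12):
    """Solve G_sc(gamma) = j / n for gamma in [-2, 2] by bisection."""
    target = j / n
    lo, hi = -2.0, 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if semicircle_cdf(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

By the symmetry of the semicircle law, the positions for indices $j$ and $n - j$ are mirror images of each other, which gives a simple consistency check.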
The following results give bounds for the fluctuations of $\lambda_j$ around $\gamma_j$.
Corollary 1.8.
Assume that the conditions hold and let be an integer. Then
- (bulk) Let . For any there exists a positive constant such that with probability at least :
[TABLE]
- (edge) Let or . There exists a positive constant such that with probability at least :
[TABLE]
For the detailed proof see [14][Theorem 1.3] making minor changes. For the bulk of the spectrum we mainly use the following formula
[TABLE]
see proof of [14][Theorem 1.3]. Here, . Taking into account that
[TABLE]
and Corollary 1.5 one may obtain the estimates for the bulk of the spectrum. Clearly, the factor comes from the bound for the . The proof for the edge of the spectrum requires more involved technique. In particular, following [6][Theorem 7.6] we write
[TABLE]
where for some . The first case when is trivial since in this situation (see (1.4.2)) and we may repeat the calculations for the case of the bulk to get
[TABLE]
Applying (1.4.2) we obtain . This enables us to write the estimate
[TABLE]
Estimation of the r.h.s. of the previous inequality requires a truncation technique, which leads to very poor probability bounds (of order ). Namely, we need to replace satisfying by the corresponding matrix satisfying . To estimate the r.h.s. of (1.10) one may follow [6][Theorem 7.3] and use Theorem 1.7.
1.4.3. Delocalization of eigenvectors
Let us denote by the eigenvectors of corresponding to the eigenvalue . The following corollary is a direct consequence of Lemma 3.1.
Corollary 1.9.
Assume that conditions hold. There exists a positive constant such that with probability at least :
[TABLE]
Comparison with a similar result for the Gaussian Orthogonal Ensemble (GOE) (see [2][Corollary 2.5.4]) shows that this result is optimal with respect to the power of the logarithm. For the proof see [17][Theorem 1.4], replacing [17][Lemma 3.1] with Lemma 3.1 (with ). For the reader's convenience we give an idea of the proof. We introduce the following distribution function
[TABLE]
Using the eigenvalue decomposition of it is easy to see that
[TABLE]
For any we have
[TABLE]
Estimation of the r.h.s. of the previous inequality again requires a truncation technique, which leads to very poor probability bounds. Similarly to the edge case of Corollary 1.8 we replace satisfying by the corresponding matrix satisfying , and apply Lemma 3.1 (with ). Replacing conditions by one may improve the estimate.
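The mechanism behind such delocalization bounds can be checked on a toy example. The sketch below is not the paper's argument in full; it only illustrates the elementary fact that, by the spectral decomposition, the imaginary part of a diagonal resolvent entry evaluated at height $v$ above an eigenvalue dominates the squared eigenvector coordinate divided by $v$. The eigenpairs are supplied by hand for a $2 \times 2$ matrix, and all names are hypothetical.

```python
import math


def resolvent_diagonal(eigenvalues, eigenvectors, j, z):
    """R_jj(z) = sum_k |u_k(j)|^2 / (lambda_k - z) via the spectral decomposition."""
    return sum(abs(u[j]) ** 2 / (lam - z) for lam, u in zip(eigenvalues, eigenvectors))


def eigenvector_bound(eigenvalues, eigenvectors, j, l, v):
    """Upper bound v * Im R_jj(lambda_l + i v) for the squared coordinate |u_l(j)|^2."""
    z = complex(eigenvalues[l], v)
    return v * resolvent_diagonal(eigenvalues, eigenvectors, j, z).imag


# Eigenpairs of the symmetric matrix [[2, 1], [1, 2]], written out by hand.
s = 1.0 / math.sqrt(2.0)
EIGENVALUES = [1.0, 3.0]
EIGENVECTORS = [(s, -s), (s, s)]
```

Here `eigenvector_bound(EIGENVALUES, EIGENVECTORS, 0, 0, 0.1)` is slightly larger than the true squared coordinate, which equals one half; a bound on the resolvent entry at small $v$ therefore translates into a bound on eigenvector coordinates.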
2. Proof of the main result
We start this section with the recursive representation of the diagonal entries of the resolvent . As noted before we shall systematically use for any matrix together with its resolvent , Stieltjes transform and etc. the corresponding quantities and etc. for the corresponding sub matrix with entries . Here and . We will often omit the argument from and write instead. We may express in the following way
[TABLE]
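The recursive representation of a diagonal resolvent entry is an instance of the Schur complement formula, and it can be verified numerically on a small matrix. The sketch below assumes the standard form of that identity, written for a matrix that is already normalized (so the factor $n^{-1/2}$ is absorbed into its entries); the pure-Python inverse and the function names are illustrative only.

```python
def inverse(a):
    """Gauss-Jordan inverse of a small complex matrix with partial pivoting."""
    n = len(a)
    aug = [[complex(v) for v in row] + [1.0 + 0j if i == j else 0j for j in range(n)]
           for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def resolvent(w, z):
    """R(z) = (W - z I)^{-1}."""
    n = len(w)
    return inverse([[w[i][j] - (z if i == j else 0) for j in range(n)] for i in range(n)])


def schur_diagonal_entry(w, z, j):
    """R_jj(z) computed from the resolvent of the minor with row and column j removed."""
    n = len(w)
    rest = [k for k in range(n) if k != j]
    minor = [[w[a][b] for b in rest] for a in rest]
    r_minor = resolvent(minor, z)
    quad = sum(w[j][rest[a]] * r_minor[a][b] * w[rest[b]][j]
               for a in range(n - 1) for b in range(n - 1))
    return 1.0 / (w[j][j] - z - quad)
```

For any real symmetric `w` and non-real `z`, `schur_diagonal_entry(w, z, j)` should agree with `resolvent(w, z)[j][j]` up to rounding error.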
Let , where
[TABLE]
Using these notations we may rewrite (2.1) as follows
[TABLE]
Introduce
[TABLE]
and
[TABLE]
Applying (2.3) we arrive at the following representation for in terms of and
[TABLE]
It was proved in [4][Proposition 2.2] (see also [13][Lemma B.1]) that for all and (using the quantities (2.4))
[TABLE]
Moreover, for all and
[TABLE]
It is easy to check that and moreover, there exist constants such that for all , , where is defined in the section 1.4.2; see e.g. [6][Lemma 4.3]. Hence, in order to bound (or respectively) it is enough to control .
Let us introduce the following region in the complex plane:
[TABLE]
where are arbitrary fixed positive real numbers and is some large constant defined below.
The following theorem provides a bound for for all in terms of diagonal resolvent entries.
Theorem 2.1.
Assume that the conditions hold and and . There exist positive constants and depending on and such that for all we have
[TABLE]
where .
Remark.
To prove Theorem 1.7 one needs a stronger bound for than (2.9). Minor changes in the proof of Theorem 2.1 lead to the following estimate
[TABLE]
where $\mathcal{A}(q) := \max\big(\max_{j=1,\ldots,n} \mathbb{E}^{\frac{1}{q}} \operatorname{Im}^{q} \mathbf{R}_{jj},\, \operatorname{Im} m_{sc}(z)\big)$. This estimate is sufficient for our purposes. The term may be estimated using Lemma 3.1. We omit the details.
The proof of Theorem 2.1 is one of the crucial steps in the proof of the main result and will be given in the next section. We finish this section with the proof of Theorems 1.3 and 1.7.
Proof of Theorem 1.3.
To estimate we may choose one of the bounds (2.7), depending on whether is near the edge of the spectrum or away from it. If then we may take the bound and obtain
[TABLE]
If the opposite inequality holds, , then we will use the bound :
[TABLE]
Both inequalities combined yield
[TABLE]
Similar arguments are applicable to . ∎
Proof of Theorem 1.7.
Following the remark after Theorem 2.1 we may conclude that
[TABLE]
Using the bound we get
[TABLE]
Since for all , , and
[TABLE]
(see e.g. [6][Lemma 4.3]) we finally get
[TABLE]
This bound concludes the proof of the theorem. ∎
3. Bounds for moments of diagonal entries of the resolvent
The main result of this section is the following lemma which provides a bound for moments of the diagonal entries of the resolvent. Recall that (see the definition (2.8)) for
[TABLE]
where are any fixed real numbers and is some large constant determined below. The first value is sufficient to obtain optimal bounds for delocalization of eigenvectors. The second value, , is necessary for the main Theorem 1.3.
Lemma 3.1.
Assuming the conditions there exist a positive constant depending on and positive constants depending on such that for all and we have
[TABLE]
We provide the proof of (3.1) only. The proof of (3.2) and (3.3) is the same and will be omitted. For the details see [17][Lemma 3.1].
We start with introducing the following events
[TABLE]
where is some quantity depending on . We also denote
[TABLE]
Using Markov's inequality, it is easy to check that
[TABLE]
Following [1] let us introduce the following configuration matrix
[TABLE]
with . Let and , , be mutually independent random variables distributed as conditioned on and resp. Let where
[TABLE]
We may consider matrix as the matrix conditioned on the configuration . We repeat some classification of configuration matrices from [1].
Definition 3.2.
Fix an $n \times n$ configuration matrix $\mathbf{L}$. We call linked (w.r.t. $\mathbf{L}$) if ; otherwise we call them unlinked.
Definition 3.3.
The indices and are connected if there exists an integer and a sequence of indices such that is linked to for each .
Definition 3.4.
We call an index deviant (w.r.t. $\mathbf{L}$) if there exists some index such that and are linked. Otherwise is called typical. Let denote the set of deviant indices, and let denote the set of typical indices.
Definition 3.5.
We say that is:
- deviant-inadmissible if there exist at least deviant indices, where may depend on .
- -connected-inadmissible if there exist distinct indices that are pairwise connected.
For any define
[TABLE]
Definition 3.6.
We call a configuration -admissible if it is not deviant-inadmissible and
[TABLE]
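The classification in Definitions 3.2-3.4 can be illustrated with a small sketch. It assumes, purely for illustration, that two indices are linked when the corresponding matrix entry is at least a given threshold in absolute value; the actual threshold and the 0/1 convention of the configuration matrix are not reproduced here. The helper names `large_entry_mask`, `deviant_indices` and `connected_classes` are hypothetical.

```python
def large_entry_mask(x, threshold):
    """Boolean mask marking entries with |x_jk| >= threshold (assumed link rule)."""
    n = len(x)
    return [[abs(x[j][k]) >= threshold for k in range(n)] for j in range(n)]


def deviant_indices(mask):
    """Indices linked to at least one index; all remaining indices are typical."""
    return {j for j, row in enumerate(mask) if any(row)}


def connected_classes(mask):
    """Groups of two or more indices joined by chains of links (union-find)."""
    n = len(mask)
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for j in range(n):
        for k in range(j + 1, n):
            if mask[j][k]:
                parent[find(j)] = find(k)
    classes = {}
    for j in range(n):
        classes.setdefault(find(j), []).append(j)
    return [members for members in classes.values() if len(members) > 1]


# A symmetric 4 x 4 example with two large off-diagonal entries.
X = [[0.1, 5.0, 0.2, 0.0],
     [5.0, -0.3, -4.0, 0.1],
     [0.2, -4.0, 0.0, 0.5],
     [0.0, 0.1, 0.5, 1.0]]
```

With threshold 3, indices 0 and 2 are not linked directly but are connected through index 1, so the three of them form one class while index 3 is typical. Counting deviant indices and measuring the size of the largest class are the two quantities that the admissibility conditions restrict.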
Following [1] we may estimate
[TABLE]
Applying Chernoff's inequality we may also show that
[TABLE]
Denote by the set of all -admissible configurations. In what follows we take . Then
[TABLE]
for some large .
Let us fix the -admissible configuration . By definition of -admissibility we may find Hermitian matrices of order , such that and matrix may be rewritten as follows
[TABLE]
Moreover, the zero-entries of matrix can only be inside of , and in each row (column) may contain at most zero-entries.
Denote and . We also assume that .
Lemma 3.7.
Let be -admissible. Assuming the conditions there exist a positive constant depending on and positive constants depending on such that for all and we have
[TABLE]
Since is fixed and we shall omit from the notation of the resolvent and denote . Sometimes in order to simplify notations we shall also omit the argument in and just write . For any (see section 1.1) we may express in the following way (compare with (2.1))
[TABLE]
where and . Here
[TABLE]
We also introduce the quantities and
[TABLE]
The following lemma allows us to estimate the moments of recursively.
Lemma 3.8.
For an arbitrary set and all there exists a positive constant depending on only such that for all with and we have
[TABLE]
Proof.
The proof may be found in [4][Lemma 3.4] or [13][Lemma 4.2]. ∎
Let us take .
Lemma 3.9.
Let be -admissible and assume that the conditions hold. Let and be arbitrary numbers such that . There exist a sufficiently large constant and small constant depending on only such that the following statement holds. Fix some . Suppose that for some integer , all such that
[TABLE]
Then for all such that ,
[TABLE]
Proof.
Let us fix an arbitrary and , such that . In the following let . First we note that for any and we may write
[TABLE]
(the same inequalities hold for replaced by ). The first inequality follows from the fact that (see e.g [4][Lemma 3.4] or [13][Lemma C.1]) and the second inequality follows from the assumption (3.7). Moreover, for any we have
[TABLE]
Applying Lemma 3.8 we get
[TABLE]
By the Cauchy-Schwarz inequality
[TABLE]
The last two inequalities imply that
[TABLE]
It remains to estimate . By an obvious inequality we have
[TABLE]
Let . For a -admissible configuration . It is easy to check that
[TABLE]
Moreover, for
[TABLE]
The bound for is the direct corollary of (3.10)
[TABLE]
Let us consider . We may rewrite it as a sum , where
[TABLE]
Applying a crude bound and (3.9) we get
[TABLE]
Using a Burkholder-type inequality and (3.10) we obtain the following estimate
[TABLE]
This inequality and (3.8)-(3.9) together imply
[TABLE]
For the term we use the Rosenthal inequality, see e.g. [21],
[TABLE]
Again the crude bound implies
[TABLE]
Similarly,
[TABLE]
Finally,
[TABLE]
For the term we may proceed similarly. We get that , where
[TABLE]
The crude bound implies that
[TABLE]
Applying the Rosenthal inequality we get
[TABLE]
It is straightforward to check that
[TABLE]
Finally, for we may write
[TABLE]
The off-diagonal entries may be expressed as follows
[TABLE]
Applying
[TABLE]
We proceed similarly to the estimation of . For simplicity let us denote
[TABLE]
The crude bound implies that
[TABLE]
By Rosenthal's inequality
[TABLE]
Finally,
[TABLE]
Analysing (3.12)-(3.25) it is easy to see that one may choose a sufficiently large constant and a small constant such that
[TABLE]
∎
Proof of Lemma 3.7.
Let us choose some sufficiently large constant , where is defined in Lemma 3.8. We also choose as in Lemma 3.9 . Let . Since we may write
[TABLE]
for all such that and . Fix arbitrary and . Lemma 3.9 yields that
[TABLE]
for , . We may repeat this procedure times and finally obtain
[TABLE]
for and . ∎
The previous lemma allows us to obtain by taking . Without loss of generality we may consider only (otherwise one may apply Lyapunov's inequality for moments). It follows from Lemma 3.7 that for any -admissible :
[TABLE]
for all . We may descend from to while keeping . Indeed, first we may take and show that for
[TABLE]
It remains to remove the log factor from the r.h.s. of the previous equation. To this aim we shall adopt the moment matching technique which has been successfully used recently in [20] and [17].
We consider the pairs , and denote by random variables such that: , for some chosen later, and
[TABLE]
It follows from [20][Lemma 5.2] that such a set of random variables exists. Let us denote such that
[TABLE]
Introduce and . Then, in Lemma A.4 we show that for all and there exist positive constants such that
[TABLE]
It is easy to see that are sub-Gaussian random variables. Repeating the proof of Lemma 3.9, see Lemma A.5 in the appendix, we get
[TABLE]
for some . We omit the details and proceed to the proof of Lemma 3.1.
Proof of Lemma 3.1.
From (3.5) we conclude that
[TABLE]
for some large . It is easy to see that
[TABLE]
for some . ∎
4. Estimate of
In this section we prove Theorem 2.1. We will follow the main idea of the proof of the corresponding results in [15]. The main technical problem is to estimate the r.h.s. of (4.5). Using the definition of we come to the problem of estimating . Since and are dependent we need to use the Cauchy-Schwarz inequality. Unfortunately, we can only estimate without truncation. To estimate higher moments we need to use truncation arguments, i.e. use . This leads to non-optimal bounds. It is worth mentioning that in the case when we can estimate without truncation. To overcome the problem mentioned above we split the r.h.s. of (4.5) into two terms corresponding to or for some large . To obtain bounds of order for we need to take in Lemma 3.1 of the order ().
To simplify the proof of Theorem 2.1 we will formulate below a simple lemma, which provides a general framework to estimate the moments of some statistics of independent random variables.
Let us consider the following statistic
[TABLE]
where and are -measurable r.v. for some -algebra . Assume that there exist -algebras such that
[TABLE]
For simplicity we denote $\mathbb{E}_{j}(\cdot) := \mathbb{E}(\cdot \,\big|\, \mathfrak{M}^{(j)})$. Let be an arbitrary -measurable r.v. and denote
[TABLE]
Lemma 4.1.
For all there exists an absolute constant such that
[TABLE]
where
[TABLE]
Proof.
See [16][Lemma 6.1]. ∎
Remark.
We conclude the statement of the last lemma with several remarks.
(1) It follows from the definition of that instead of estimating high moments of one needs to estimate the conditional expectation for some small . Typically, .
(2) Moreover, to get the desired bounds one needs to choose an appropriate approximation of and estimate .
Proof of Theorem 2.1.
Recalling the definition of (see (2.5)) we may rewrite it in the following way
[TABLE]
One may see that is a special case of , where
[TABLE]
We estimate each term in Lemma 4.1. Here, and .
4.1. Bound for
Applying the Schur complement formula we get
[TABLE]
Since we rewrite
[TABLE]
Hence, using Lemma A.6 we obtain
[TABLE]
We may apply the bound (see (2.7)) and Young's inequality to get
[TABLE]
where .
4.2. Bound for
The term , may be bounded from above by the following quantity
[TABLE]
Let us fix an arbitrary , and choose
[TABLE]
Then
[TABLE]
This equation implies that
[TABLE]
Let us take some positive constant such that for :
[TABLE]
It is straightforward to check that
[TABLE]
Moreover, from (4.6) and negligibility of high moments of we may conclude that
[TABLE]
The last two inequalities (4.7) and (4.8) imply that
[TABLE]
4.3. Bound for
We note that
[TABLE]
We estimate the conditional expectation. Let
[TABLE]
where is positive r.v. with sufficiently many bounded moments. For example, to estimate the r.h.s. of (4.10) one may take . But for further analysis it will be necessary to consider more general .
Representation of . By definition we may write the following representation
[TABLE]
where and . Hence,
[TABLE]
where
[TABLE]
The equation (4.2) and Lemma A.6[Inequality (A.11)] yield that
[TABLE]
For simplicity we denote the quadratic form in (4.2) by
[TABLE]
and rewrite it as a sum of the three terms , where
[TABLE]
It follows from (4.2) and that
[TABLE]
Using representation  (4.4) we estimate
[TABLE]
Applying this inequality and Lemma A.6[Inequality (A.11)] we may write
[TABLE]
Then
[TABLE]
where
[TABLE]
Hence, taking , we may estimate
[TABLE]
It is straightforward to check that
[TABLE]
We proceed to estimation of . The arguments for all other terms are similar and will be omitted. It follows from (4.13) that Hence,
[TABLE]
It remains to estimate . Applying the Cauchy-Schwarz inequality and arguments similar to (4.6)-(4.8) one may write
[TABLE]
Here we also use the moment bounds for quadratic and linear forms, see [13][Lemmas A.3-A.12]. Repeating the same arguments for we come to the following bound
[TABLE]
where we also used the crude estimate . Using Young's inequality we immediately obtain
[TABLE]
4.4. Bound for and
It is easy to see that one may estimate and simultaneously. Indeed, it is enough to estimate
[TABLE]
where . Let us fix . Using (4.12) we get
[TABLE]
Applying Young's inequality we obtain
[TABLE]
Similarly to (4.6)
[TABLE]
Repeating now all calculations above we get
[TABLE]
Collecting (4.3), (4.9), (4.14) and (4.15) we obtain the claim of Theorem 2.1. □
5. Acknowledgements
We would like to thank the Associate Editor and the Reviewer for helpful comments and suggestions.
The results have been obtained with the support of RSF grant No. 18-11-00132 (HSE University). F. Götze has been supported by the DFG through the Collaborative Research Centre 1283 “Taming uncertainty and profiting from randomness and low regularity in analysis, stochastics and their applications”.
Appendix A Auxiliary results
A.1. Truncation
In this section we will show that the conditions allow us to assume that for all we have , where is some positive constant.
Let , and finally , where . We denote symmetric random matrices by and formed from and respectively. Similar notations are used for the corresponding resolvent matrices, ESD and Stieltjes transforms.
Lemma A.1.
Assuming the conditions we have for all
[TABLE]
Proof.
From Bai's rank inequality (see [3][Theorem A.43]) we conclude that
[TABLE]
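For reference, the rank inequality is commonly stated for empirical spectral distribution functions as below. The symbols $A$, $B$, $F^{A}$, $F^{B}$ are assumptions of this sketch, standing for two $n \times n$ Hermitian matrices and their ESDs. The displayed bound in the proof may be the corresponding consequence for the truncated matrices rather than this general form.

```latex
% Rank inequality for ESDs (generic Hermitian A, B of size n x n);
% the notation is assumed for illustration.
\[
  \sup_{x \in \mathbb{R}} \bigl| F^{A}(x) - F^{B}(x) \bigr|
  \;\le\; \frac{1}{n}\,\operatorname{rank}(A - B) .
\]
```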
Integrating by parts we get
[TABLE]
It is easy to see that
[TABLE]
Applying Rosenthal's inequality [21], we get that
[TABLE]
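Rosenthal's inequality, in the standard form for independent centred summands, is recalled below. The variables $\xi_1, \dots, \xi_n$ and the constant $C_p$ are generic placeholders. How the inequality is specialised in the display above is not recoverable from the text.

```latex
% Rosenthal's inequality: xi_1, ..., xi_n independent, E xi_j = 0,
% E|xi_j|^p < infinity, p >= 2; C_p depends on p only.
\[
  \mathbb{E}\,\Bigl| \sum_{j=1}^{n} \xi_j \Bigr|^{p}
  \;\le\; C_p \Bigl( \Bigl( \sum_{j=1}^{n} \mathbb{E}\,\xi_j^{2} \Bigr)^{p/2}
        + \sum_{j=1}^{n} \mathbb{E}\,|\xi_j|^{p} \Bigr) .
\]
```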
From these inequalities we may conclude the statement of the lemma. □
Lemma A.2.
Assuming the conditions we have for all
[TABLE]
Proof.
It is easy to see that
[TABLE]
Applying the resolvent equality we get
[TABLE]
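The resolvent equality used here is, in generic form, the identity below. The matrices $A$, $B$ are placeholders for two symmetric matrices of the same size (for instance the original and the truncated one), and $z$ lies in the upper half-plane; this labelling is an assumption of the sketch.

```latex
% Resolvent identity for generic symmetric matrices A, B and Im z > 0.
% It follows by writing B - A = (B - zI) - (A - zI).
\[
  (A - zI)^{-1} - (B - zI)^{-1}
  \;=\; (A - zI)^{-1}\,(B - A)\,(B - zI)^{-1} .
\]
```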
From (A.1) and (A.2) we may conclude
[TABLE]
Taking the -th power and mathematical expectation we get
[TABLE]
Since satisfies conditions we may apply Lemma 3.1 and conclude
[TABLE]
We also have
[TABLE]
To finish the proof it remains to estimate the term
[TABLE]
Applying the obvious inequality we get
[TABLE]
From this inequality and (A.3) we conclude the statement of the lemma. □
Lemma A.3.
Assuming the conditions we have for all :
[TABLE]
Proof.
It is easy to see that
[TABLE]
Applying the obvious inequalities and we get
[TABLE]
From
[TABLE]
we obtain
[TABLE]
By Lemma A.2 we know . This implies that
[TABLE]
Finally
[TABLE]
□
A.2. Replacement
We say that the conditions are satisfied if satisfies the conditions and have a sub-Gaussian distribution. It is well known that the random variables are sub-Gaussian if and only if for some constant .
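One standard way to write the moment characterisation of sub-Gaussian random variables is sketched below. The symbols $\xi$, $C$ and $K$ are generic placeholders, and the exact form of the condition in the paper may differ by constants.

```latex
% Equivalent descriptions of a sub-Gaussian random variable xi
% (the constants C and K are comparable up to absolute factors).
\[
  \bigl( \mathbb{E}\,|\xi|^{p} \bigr)^{1/p} \;\le\; C \sqrt{p}
  \quad \text{for all } p \ge 1
  \qquad \Longleftrightarrow \qquad
  \mathbb{P}\bigl( |\xi| > t \bigr) \;\le\; 2 \exp\bigl( -t^{2}/K^{2} \bigr)
  \quad \text{for all } t \ge 0 .
\]
```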
Lemma A.4.
For all and there exist positive constants such that
[TABLE]
where is defined in (3.26).
Proof.
The method is based on the following replacement scheme, which has been used in recent results [5], [20] and [17]. We replace all by for such that , thus replacing the corresponding resolvent entries by for every pair of . Let . Denote by the random matrix with all entries in the positions replaced by . Assume that we have already exchanged all entries in positions and are going to replace an additional entry in the position with . Without loss of generality we may assume that (hence ) and then denote . The following additional notations will be needed.
[TABLE]
and , where denotes a unit column-vector with all zeros except -th position. In these notations we may write
[TABLE]
Recall that and denote and . Let us assume that we have already proved the following fact
[TABLE]
where is some quantity depending on (see (A.9) below for precise definition) and are some numbers. Similarly,
[TABLE]
where . It follows from (A.4) and (A.5) that
[TABLE]
Let us denote . We get
[TABLE]
with some positive constant . Repeating (A.6) recursively for we arrive at the following bound
[TABLE]
where . It is easy to see from the definition of that for some , say , we have
[TABLE]
From this inequality and (A.7) we deduce that
[TABLE]
with some positive constants and . From the last inequality we may conclude the statement of the lemma. It remains to prove (A.4) (resp. (A.5)). Applying the resolvent equation we get for
[TABLE]
The same identity holds for
[TABLE]
We investigate (A.8). In order to handle arbitrarily high moments of we apply a Stein-type technique similar to Theorem. Let us introduce the following function and write
[TABLE]
Applying (A.8) we get
[TABLE]
Repeating the arguments from [17] one may show that
[TABLE]
For the term one may write down the following representation
[TABLE]
with the remainder term bounded in absolute value
[TABLE]
and
[TABLE]
where
[TABLE]
One may see that the term does not depend on but depends on . □
Lemma A.5.
Let be -admissible and assume that the conditions hold. Let and be arbitrary numbers such that . There exist a sufficiently large constant and a small constant , depending only on , such that the following statement holds. Fix some . Suppose that for some integer , all such that
[TABLE]
Then for all such that ,
[TABLE]
Proof.
We first observe the fact that the factor appears only in the terms with . Let us consider only one term, for example, :
[TABLE]
Applying the Hanson-Wright inequality, see e.g. [22], we obtain that
[TABLE]
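For the reader's convenience, the Hanson-Wright inequality in its usual form is recalled below. The vector $X$, the matrix $A$, the sub-Gaussian norm bound $K$ and the absolute constant $c$ are generic placeholders. The quadratic form to which it is applied in this proof is the one in the display above and is not reproduced here.

```latex
% Hanson-Wright inequality: X = (X_1, ..., X_n) has independent, mean-zero
% components with sub-Gaussian norm at most K; A is a fixed n x n matrix.
\[
  \mathbb{P}\Bigl( \bigl| X^{\mathsf T} A X - \mathbb{E}\, X^{\mathsf T} A X \bigr| > t \Bigr)
  \;\le\; 2 \exp\Bigl[ -c \,\min\Bigl( \frac{t^{2}}{K^{4}\,\|A\|_{\mathrm{HS}}^{2}},\;
        \frac{t}{K^{2}\,\|A\|} \Bigr) \Bigr],
  \qquad t \ge 0 .
\]
```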
□
A.3. Inequalities for resolvent
Lemma A.6.
For any we have
[TABLE]
For any
[TABLE]
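Typical resolvent inequalities of this kind rest on the identity sketched below, often called the Ward identity. Whether it coincides with the displayed inequalities of Lemma A.6, in particular (A.11), is an assumption. The notation $R(z) = (W - zI)^{-1}$ for a real symmetric $W$ and $z = u + iv$ with $v > 0$ is also assumed.

```latex
% Ward identity and the trivial norm bound for the resolvent of a
% real symmetric matrix W, with z = u + iv, v > 0.
\[
  \sum_{k=1}^{n} |R_{jk}(z)|^{2} \;=\; \frac{1}{v}\,\operatorname{Im} R_{jj}(z),
  \qquad
  |R_{jk}(z)| \;\le\; \|R(z)\| \;\le\; \frac{1}{v} .
\]
```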
References
- [1] A. Aggarwal. Bulk universality for generalized Wigner matrices with few moments. Probability Theory and Related Fields, 173(1-2):375–432, 2019.
- [2] G. Anderson, A. Guionnet, and O. Zeitouni. An introduction to random matrices, volume 118 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010.
- [3] Z. Bai and J. Silverstein. Spectral analysis of large dimensional random matrices. Springer, New York, second edition, 2010.
- [4] C. Cacciapuoti, A. Maltsev, and B. Schlein. Bounds for the Stieltjes transform and the density of states of Wigner matrices. Probability Theory and Related Fields, 163(1):1–59, 2015.
- [5] L. Erdős, A. Knowles, H.-T. Yau, and J. Yin. Spectral statistics of Erdős-Rényi graphs II: Eigenvalue spacing and the extreme eigenvalues. Comm. Math. Phys., 314(3):587–640, 2012.
- [6] L. Erdős, A. Knowles, H.-T. Yau, and J. Yin. The local semicircle law for a general class of random matrices. Electron. J. Probab., 18:no. 59, 58, 2013.
- [7] L. Erdős, A. Knowles, H.-T. Yau, and J. Yin. Spectral statistics of Erdős-Rényi graphs I: Local semicircle law. Ann. Probab., 41(3B):2279–2375, 2013.
- [8] L. Erdős, B. Schlein, and H.-T. Yau. Local semicircle law and complete delocalization for Wigner random matrices. Comm. Math. Phys., 287(2):641–655, 2009.
