Noise sensitivity of the top eigenvector of a Wigner matrix

Charles Bordenave; G\'abor Lugosi; Nikita Zhivotovskiy

arXiv:1903.04869·math.PR·March 3, 2020


TL;DR

This paper studies how the top eigenvector of a Wigner matrix changes when a small or large number of entries are randomly resampled, revealing a phase transition in its sensitivity.

Contribution

It establishes a precise threshold at which the top eigenvector shifts from being nearly aligned to nearly orthogonal due to entry resampling.

Findings

01

For k much less than N^{5/3}, eigenvectors remain almost collinear.

02

For k much greater than N^{5/3}, eigenvectors become almost orthogonal.

03

Identifies a phase transition in eigenvector sensitivity at k ~ N^{5/3}.

Abstract

We investigate the noise sensitivity of the top eigenvector of a Wigner matrix in the following sense. Let $v$ be the top eigenvector of an $N\times N$ Wigner matrix. Suppose that $k$ randomly chosen entries of the matrix are resampled, resulting in another realization of the Wigner matrix with top eigenvector $v^{[k]}$ . We prove that, with high probability, when $k\ll N^{5/3-o(1)}$ , then $v$ and $v^{[k]}$ are almost collinear and when $k\gg N^{5/3}$ , then $v^{[k]}$ is almost orthogonal to $v$ .

Equations

$$X^{[k]}_{i,j}=\begin{cases}X^{\prime}_{i,j}&\text{if }(i,j)\in S_{k},\\ X_{i,j}&\text{otherwise},\end{cases}$$

$$\mathbb{E}\langle v,v^{[k]}\rangle=o(1).$$

$$\mathbb{E}\max_{1\leq k\leq\varepsilon_{N}N^{5/3}}\min_{s\in\{-1,1\}}\|v-sv^{[k]}\|_{2}=o(1).$$

$$\lambda^{[1]}-\lambda\simeq(1+\mathbbm{1}(i_{1}\neq j_{1}))\,v_{i_{1}}(X^{\prime}_{i_{1},j_{1}}-X_{i_{1},j_{1}})v_{j_{1}}\simeq\frac{X^{\prime}_{i_{1},j_{1}}-X_{i_{1},j_{1}}}{N^{1+o(1)}},$$

$$\lambda^{[k]}-\lambda=\sum_{t=0}^{k-1}(\lambda^{[t+1]}-\lambda^{[t]})\simeq\frac{\sqrt{k}}{N^{1+o(1)}}.$$

$$\left(\mathbb{E}\langle v,v^{[k]}\rangle\right)^{2}\lesssim\frac{N^{2}\,\mathrm{Var}(\lambda)}{k}.$$

$$X^{(i)}=(X_{1},\ldots,X_{i-1},X_{i}^{\prime},X_{i+1},\ldots,X_{n})\quad\text{and}\quad X^{[i]}=(X_{1}^{\prime},\ldots,X_{i}^{\prime},X_{i+1},\ldots,X_{n})$$

$$\mathrm{Var}(f(X))=\frac{1}{2}\sum_{i=1}^{n}\mathbb{E}\left[(f(X)-f(X^{(i)}))(f(X^{[i-1]})-f(X^{[i]}))\right].$$

$$\mathrm{Var}(f(X))=\frac{1}{2}\sum_{i=1}^{n}\mathbb{E}\left[(f(X)-f(X^{(\sigma(i))}))(f(X^{\sigma([i-1])})-f(X^{\sigma([i])}))\right].$$

$$B_{i}=\mathbb{E}\left[(f(X)-f(X^{(\sigma(i))}))(f(X^{\sigma([i-1])})-f(X^{\sigma([i])}))\right],$$

$$B_{k}\leq\frac{2\,\mathrm{Var}(f(X))}{k}.$$

$$B^{\prime}_{k}\leq\frac{2\,\mathrm{Var}(f(X))}{k}\left(\frac{n+1}{n}\right),$$

$$B^{\prime}_{i}=\mathbb{E}\left[(f(X)-f(X^{(j)}))(f(X^{\sigma([i-1])})-f(X^{(j)\circ\sigma([i-1])}))\right].$$

$$\mathrm{Var}(\lambda)\leq(c+o(1))N^{-1/3},$$

$$\mathrm{Var}(\lambda)\lesssim(\log N)^{C\log\log N}N^{-1/3},$$

$$\|w\|_{\infty}\leq\frac{(\log N)^{C}}{\sqrt{N}}.$$

$$\max_{1\leq i,j\leq N}\inf_{s\in\{-1,1\}}\|sv-u^{(ij)}\|_{\infty}\leq N^{-\frac{1}{2}-\alpha},$$

$$\frac{2\,\mathrm{Var}(\lambda)}{k}\cdot\frac{\binom{N}{2}+N+1}{\binom{N}{2}+N}\geq\mathbb{E}\left[(\lambda-\mu)(\lambda^{[k]}-\mu^{[k]})\right].$$

$$\mathbb{E}\left[(\lambda-\mu)(\lambda^{[k]}-\mu^{[k]})\right]\simeq\frac{1}{N^{2}}\mathbb{E}\left[\langle v,v^{[k]}\rangle^{2}\right].$$

$$\mathbb{E}\left[\langle v,v^{[k]}\rangle^{2}\right]\lesssim\frac{N^{5/3}}{k},$$

$$\mathbb{E}\left[(\lambda-\mu)(\lambda^{[k]}-\mu^{[k]})\right]=\mathbb{E}\left[(\langle v,Xv\rangle-\langle u,Yu\rangle)(\langle v^{[k]},X^{[k]}v^{[k]}\rangle-\langle u^{[k]},Y^{[k]}u^{[k]}\rangle)\right].$$

$$\langle u,(X-Y)u\rangle\leq\langle v,Xv\rangle-\langle u,Yu\rangle\leq\langle v,(X-Y)v\rangle.$$

$$\langle x,(X-Y)x\rangle=U_{t,s}x_{t}x_{s}$$

$$(\langle v,Xv\rangle-\langle u,Yu\rangle)(\langle v^{[k]},X^{[k]}v^{[k]}\rangle-\langle u^{[k]},Y^{[k]}u^{[k]}\rangle)\geq I,$$

$$I=V_{t,s}\min\left\{v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s},\ u_{t}u_{s}v^{[k]}_{t}v^{[k]}_{s},\ v_{t}v_{s}u^{[k]}_{t}u^{[k]}_{s},\ u_{t}u_{s}u^{[k]}_{t}u^{[k]}_{s}\right\},$$

$$V_{i,j}=U_{i,j}U^{\prime}_{i,j}=(1+\mathbbm{1}(i\neq j))^{2}(X_{i,j}-X^{\prime\prime}_{i,j})(X^{\prime}_{i,j}-X^{\prime\prime}_{i,j}).$$

$$x\in\left\{v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s},\ u_{t}u_{s}v^{[k]}_{t}v^{[k]}_{s},\ v_{t}v_{s}u^{[k]}_{t}u^{[k]}_{s},\ u_{t}u_{s}u^{[k]}_{t}u^{[k]}_{s}\right\},$$

$$\left|x-w_{t}w_{s}w^{[k]}_{t}w^{[k]}_{s}\right|\leq\frac{4(\log N)^{3C}}{N^{2+\alpha}}.$$

$$v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s}=(w_{t}-\delta_{t})(w_{s}-\delta_{s})(w^{[k]}_{t}-\delta^{[k]}_{t})(w^{[k]}_{s}-\delta^{[k]}_{s}).$$

$$\max\{|\delta_{t}|,|\delta_{s}|,|\delta^{[k]}_{t}|,|\delta^{[k]}_{s}|\}\leq N^{-\frac{1}{2}-\alpha}\quad\text{and}\quad\max\{|w_{t}|,|w_{s}|,|w^{[k]}_{t}|,|w^{[k]}_{s}|\}\leq(\log N)^{3C}/\sqrt{N}.$$


Full text

Noise sensitivity of the top eigenvector of a Wigner matrix

Gábor Lugosi was supported by the Spanish Ministry of Economy and Competitiveness, Grant PGC2018-101643-B-I00; “High-dimensional problems in structured probabilistic models - Ayudas Fundación BBVA a Equipos de Investigación Cientifica 2017”; and Google Focused Award “Algorithms and Learning for AI”. Charles Bordenave was supported by the research grants ANR-14-CE25-0014 and ANR-16-CE40-0024-01. Nikita Zhivotovskiy was supported by RSF grant No. 18-11-00132.

Charles Bordenave Institut de Mathématiques de Marseille, CNRS & Aix-Marseille University, Marseille, France.

Gábor Lugosi Department of Economics and Business, Pompeu Fabra University, Barcelona, Spain, [email protected], Pg. Lluís Companys 23, 08010 Barcelona, Spain; Barcelona Graduate School of Economics

Nikita Zhivotovskiy This work was prepared while Nikita Zhivotovskiy was a postdoctoral fellow at the department of Mathematics, Technion I.I.T. and researcher at National University Higher School of Economics. Now at Google Research, Brain Team.

Abstract

We investigate the noise sensitivity of the top eigenvector of a Wigner matrix in the following sense. Let $v$ be the top eigenvector of an $N\times N$ Wigner matrix. Suppose that $k$ randomly chosen entries of the matrix are resampled, resulting in another realization of the Wigner matrix with top eigenvector $v^{[k]}$ . We prove that, with high probability, when $k\ll N^{5/3-o(1)}$ , then $v$ and $v^{[k]}$ are almost collinear and when $k\gg N^{5/3}$ , then $v^{[k]}$ is almost orthogonal to $v$ .

1 Introduction

In this paper we study the noise sensitivity of top eigenvectors of Wigner matrices. For a positive integer $N$ , let $X=(X_{i,j})$ be a symmetric $N\times N$ matrix such that, for $i\leq j$ , the $X_{i,j}$ are independent real random variables, such that for some constant $\delta>0$ and for all $i\leq j$ , $\mathbb{E}X_{i,j}=0$ and $\mathbb{E}\exp(|X_{i,j}|^{\delta})\leq 1/\delta$ . Note that this assumption is satisfied for a wide class of distributions with a sufficiently light tail. Uniformly bounded, sub-gaussian, and sub-exponential distributions fall in this class. To guarantee that $X$ is a symmetric matrix, we set $X_{i,j}=X_{j,i}$ . Finally, we assume that the off-diagonal entries have the unit variance: for all $i\neq j$ , $\mathbb{E}X_{ij}^{2}=1$ and for all $i$ , $\mathbb{E}X_{ii}^{2}=\sigma_{0}^{2}$ , for some $\sigma_{0}\geq 0$ . Throughout this text, we call such matrix $X$ a Wigner matrix. In this paper we are concerned with large matrices and the main results are asymptotic, concerning $N\to\infty$ . The distribution of the entries $X_{i,j}$ may change with $N$ though we suppress this dependence in the notation. However, the values of $\sigma_{0}$ and $\delta$ are assumed to be the same for all $N$ .

Let $\lambda=\sup_{w\in S^{N-1}}\left\langle w,Xw\right\rangle$ be the top eigenvalue of $X$ and let $v$ denote the corresponding unit eigenvector. In this paper we study the noise sensitivity of $v$ . In particular, we are interested in the behavior of the top eigenvector $v^{[k]}$ of the symmetric matrix $X^{[k]}$ obtained by resampling $k$ random entries of $X$ . The main finding of the paper is that, with high probability, when $k\leq N^{5/3-o(1)}$ , then $v$ and $v^{[k]}$ are almost collinear and when $k\gg N^{5/3}$ , then $v^{[k]}$ is almost orthogonal to $v$ .

Related work and proof technique

Noise sensitivity is an important notion in probability that has been extensively studied since the pioneering work of Benjamini, Kalai, and Schramm [2]. Noise sensitivity has mostly been studied in the context of Boolean functions and it has been shown to have deep connections with threshold phenomena, measure concentration, and isoperimetric inequalities, see Talagrand [22], Friedgut and Kalai [11], Kahn, Kalai, and Linial [16], Bourgain, Kahn, Kalai, Katznelson, and Linial [4] for some of the key early work and Garban [12], Garban and Steif [14], Kalai and Safra [17], O’Donnell [20] for surveys. The key techniques for studying noise sensitivity typically use elements of harmonic analysis, in particular, hypercontractivity ([22], [16]) but also the “randomized algorithm” approach of Schramm and Steif [21] and other techniques, see Garban, Pete, and Schramm [13].

Our approach is inspired by the work of Chatterjee [7], who shows that, for functions of independent standard Gaussian random variables, the notion of noise sensitivity (or “chaos” as Chatterjee calls it) is deeply related to the notion of “superconcentration”.

In fact, a result in a similar spirit to ours for the Gaussian Unitary Ensemble was proved by Chatterjee [7, Section 3.6]. However, instead of resampling random entries of the matrix, the perturbations considered in [7] are different. In Chatterjee’s model, every entry of the matrix $X$ is perturbed by replacing $X$ by $Y=e^{-t}X+\sqrt{1-e^{-2t}}X^{\prime}$ where $X^{\prime}$ is an independent copy of $X$ and $t>0$ . It is proved in [7] that the top eigenvectors of $X$ and $Y$ are approximately orthogonal (in the sense that the expectation of their inner product goes to zero as $N\to\infty$ ) as soon as $t\gg N^{-1/3}$ .

Chatterjee uses this example to illustrate how “superconcentration” implies “chaos”. His techniques crucially depend on the Gaussian assumption as in that case explicit formulas may be exploited. Our techniques are similar in the sense that our starting point is also “superconcentration” (i.e., the fact that the variance of the largest eigenvalue of a Wigner matrix is small). However, outside of the Gaussian realm, the notions of superconcentration and chaos are murkier. Starting from a general formula for the variance of a function of independent random variables, due to Chatterjee [5], we establish a monotonicity lemma that allows us to make the connection between the variance of the top eigenvalue and the inner product of interest. Then we use the fact that the top eigenvalue has a small variance (i.e., in a sense, it is “superconcentrated”). The monotonicity lemma may be of independent interest and it may have further uses when one tries to prove that “superconcentration implies chaos” for functions of independent (not necessarily Gaussian) random variables.

Result

To formally describe the setup, let $X$ be a symmetric $N\times N$ Wigner matrix as defined above. For a positive integer $k\leq\binom{N}{2}+N=N(N+1)/2$ , let the random matrix $X^{[k]}$ be defined as follows. Let $S_{k}=\{(i_{1},j_{1}),\ldots,(i_{k},j_{k})\}$ be a set of $k$ pairs chosen uniformly at random (without replacement) from the set of all ordered pairs $(i,j)$ of indices with $1\leq i\leq j\leq N$ . We also assume that $S_{k}$ is independent of the entries of $X$ . The entries of $X^{[k]}$ above the diagonal are

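$$X^{[k]}_{i,j}=\begin{cases}X^{\prime}_{i,j}&\text{if }(i,j)\in S_{k},\\ X_{i,j}&\text{otherwise},\end{cases}$$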

where $(X^{\prime}_{i,j})_{1\leq i\leq j\leq N}$ are independent random variables, independent of $X$ and $X^{\prime}_{i,j}$ has the same distribution as $X_{i,j}$ , for all $i\leq j$ . In words, $X^{[k]}$ is obtained from $X$ by resampling $k$ random entries of the matrix above and including the diagonal and also the corresponding terms below the diagonal. Clearly, $X^{[k]}$ has the same distribution as $X$ . Denote unit eigenvectors corresponding to the largest eigenvalues of $X$ and $X^{[k]}$ by $v$ and $v^{[k]}$ , respectively. Note that with overwhelming probability, the spectrum of a Wigner matrix is simple and, in particular, the top unit eigenvector is unique (up to changing the sign), see [1].
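The resampling model can be simulated directly. The sketch below is illustrative only and is not taken from the paper: it assumes standard Gaussian entries (so $\sigma_{0}=1$), and the helper names `wigner`, `resample`, and `top_eigvec`, the matrix size, and the seed are hypothetical choices. It builds $X$, resamples $k$ uniformly chosen positions $(i,j)$ with $i\leq j$, and prints the overlap $|\langle v,v^{[k]}\rangle|$ for a few values of $k$.

```python
import numpy as np

def wigner(n, rng):
    """Symmetric n x n matrix with independent N(0,1) entries on and above the diagonal."""
    a = rng.standard_normal((n, n))
    # Keep entries with i <= j and mirror the strictly upper part below the diagonal.
    return np.triu(a) + np.triu(a, 1).T

def resample(x, k, rng):
    """Copy of x with k uniformly chosen positions (i <= j) redrawn, keeping symmetry."""
    n = x.shape[0]
    rows, cols = np.triu_indices(n)                    # all n(n+1)/2 pairs with i <= j
    pick = rng.choice(rows.size, size=k, replace=False)  # sampling without replacement
    fresh = rng.standard_normal(k)
    y = x.copy()
    y[rows[pick], cols[pick]] = fresh
    y[cols[pick], rows[pick]] = fresh                  # mirror below the diagonal
    return y

def top_eigvec(x):
    """Unit eigenvector of the largest eigenvalue (eigh returns ascending eigenvalues)."""
    return np.linalg.eigh(x)[1][:, -1]

rng = np.random.default_rng(0)
n = 200
x = wigner(n, rng)
v = top_eigvec(x)
for k in (1, 100, n * (n + 1) // 2):
    overlap = abs(v @ top_eigvec(resample(x, k, rng)))
    print(k, round(float(overlap), 3))
```

The absolute value is taken because the sign of an eigenvector is arbitrary. At such a small $N$ the transition is only suggestive; the statements in this paper are asymptotic.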

Our main results are the following.

Theorem 1.

Assume that $X$ is a Wigner matrix as above. If $k/N^{5/3}\to\infty$ , then

$$\mathbb{E}\langle v,v^{[k]}\rangle=o(1).$$

Conversely, our second result asserts that when $k\leq N^{5/3-o(1)}$ then $v$ and $v^{[k]}$ are almost aligned.

Theorem 2.

Assume that $X$ is a Wigner matrix as above. There exists a constant $c>0$ such that, with $\varepsilon_{N}=(\log N)^{-c\log\log N}$ ,

$$\mathbb{E}\max_{1\leq k\leq\varepsilon_{N}N^{5/3}}\min_{s\in\{-1,1\}}\|v-sv^{[k]}\|_{2}=o(1).$$

The proof of Theorem 2 actually establishes that $\max_{k}\min_{s}\sqrt{N}\|v-sv^{[k]}\|_{\infty}$ goes to $0$ in probability.

The following heuristic argument may provide an intuition of why the threshold in the lower bound of Theorem 2 is at $k=N^{5/3-o(1)}$ . Since the seminal work of Erdős, Schlein, and Yau [10], it is well known that unit eigenvectors of random matrices are delocalized in the sense that $\|v\|_{\infty}=N^{-1/2+o(1)}$ with high probability. Denoting the top eigenvalue of $X^{[k]}$ by $\lambda^{[k]}$ , we might infer from the derivative of a simple eigenvalue as the function of the matrix entries that

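$$\lambda^{[1]}-\lambda\simeq(1+\mathbbm{1}(i_{1}\neq j_{1}))\,v_{i_{1}}(X^{\prime}_{i_{1},j_{1}}-X_{i_{1},j_{1}})v_{j_{1}}\simeq\frac{X^{\prime}_{i_{1},j_{1}}-X_{i_{1},j_{1}}}{N^{1+o(1)}},$$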

where $v_{i}$ is the $i$ -th component of $v$. Assuming that $v_{i}$ is nearly independent of any matrix entry $X_{ij}$ , since $X_{ij}$ is centered with unit variance, we would get from the central limit theorem that

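$$\lambda^{[k]}-\lambda=\sum_{t=0}^{k-1}(\lambda^{[t+1]}-\lambda^{[t]})\simeq\frac{\sqrt{k}}{N^{1+o(1)}}.$$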

On the other hand, the known behavior of random matrices at the edge of the spectrum implies that the second largest eigenvalue of $X$ is at distance of order $N^{-1/6}$ from $\lambda$ . The above heuristic should thus break down when $\sqrt{k}/N^{1+o(1)}$ is of order $N^{-1/6}$ . It gives the threshold at $k=N^{5/3+o(1)}$ .
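The first-order approximation behind this heuristic can be checked numerically. The sketch below is illustrative only: it assumes Gaussian entries, and the matrix size, the seed, and the resampled position `(i, j)` are arbitrary choices not taken from the paper. It compares the exact change of the top eigenvalue after one off-diagonal entry is resampled with the first-order term $2v_{i}(X^{\prime}_{ij}-X_{ij})v_{j}$.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 300
a = rng.standard_normal((n, n))
x = np.triu(a) + np.triu(a, 1).T           # symmetric matrix with N(0,1) entries

evals, evecs = np.linalg.eigh(x)            # eigenvalues in ascending order
lam, v = evals[-1], evecs[:, -1]

i, j = 3, 7                                 # hypothetical resampled position, i != j
new_entry = rng.standard_normal()
delta = new_entry - x[i, j]                 # X'_{ij} - X_{ij}

x1 = x.copy()
x1[i, j] = x1[j, i] = new_entry             # resample the entry and its mirror image
lam1 = np.linalg.eigh(x1)[0][-1]

# First-order term; the factor 2 appears because both (i,j) and (j,i) change.
predicted = 2 * v[i] * delta * v[j]

print("exact change    :", lam1 - lam)
print("first-order term:", predicted)
print("top gap         :", evals[-1] - evals[-2])
```

Since $\lambda^{[1]}\geq\langle v,X^{[1]}v\rangle$, the exact change is never smaller than the first-order term, and by Weyl's inequality it is at most $|X^{\prime}_{ij}-X_{ij}|$ in absolute value.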

To get an idea of how Theorem 1 is proved, consider the variance of the largest eigenvalue $\lambda$ of $X$ . The key inequality we prove is that

$$\left(\mathbb{E}\langle v,v^{[k]}\rangle\right)^{2}\lesssim\frac{N^{2}\,\mathrm{Var}(\lambda)}{k}.$$

By the Tracy-Widom law [24, 25] for the largest eigenvalue, we expect that $\mathrm{Var}(\lambda)$ is of order $N^{-1/3}$ , which implies the desired asymptotic orthogonality whenever $k/N^{5/3}\to\infty$ . The proof of the inequality above is based on a variance formula for general functions of independent random variables due to Chatterjee [5], see Lemma 1 below. The variance formula suggests that small variance implies noise sensitivity of the top eigenvalue in a certain sense. This is made precise by Lemmas 2 and 3. Finally, noise sensitivity of the top eigenvalue translates to the inequality above.
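To spell out the arithmetic, here is a short worked step. It assumes the key inequality $(\mathbb{E}\langle v,v^{[k]}\rangle)^{2}\lesssim N^{2}\,\mathrm{Var}(\lambda)/k$ together with a variance bound of order $N^{-1/3}$, as provided by Lemma 4 below.

```latex
\bigl(\mathbb{E}\langle v, v^{[k]}\rangle\bigr)^{2}
  \;\lesssim\; \frac{N^{2}\,\mathrm{Var}(\lambda)}{k}
  \;\lesssim\; \frac{N^{2}\cdot N^{-1/3}}{k}
  \;=\; \frac{N^{5/3}}{k}
  \;\longrightarrow\; 0
  \qquad\text{whenever } k/N^{5/3}\to\infty .
```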

Remark. We expect that the arguments of Theorem 1 for the noise sensitivity of the top eigenvalue may be modified to prove analogous results for the eigenvector corresponding to the $j$ -th largest eigenvalue, $1\leq j\leq N$ . However, the threshold is expected to occur at values different from $N^{5/3}$ . In particular, a simple heuristic argument suggests that for the $j$ -th eigenvector the threshold occurs around $N^{5/3+o(1)}\min(j,N-j+1)^{-2/3}$ . However, to keep the presentation transparent, in this paper we focus on the top eigenvalue.

Interestingly, the proof that the top eigenvalue is very sensitive to resampling more than $\Theta(N^{5/3})$ entries involves proving that it is insensitive to resampling just a single entry. As a consequence the proofs of Theorems 1 and 2 share common techniques.

The rest of the paper is dedicated to proving Theorems 1 and 2. In Section 2 we introduce a general tool for proving noise sensitivity that generalizes Chatterjee’s ideas based on “superconcentration” to functions of independent, not necessarily standard normal random variables. In Section 3 we summarize some of the tools from random matrix theory that are crucial for our arguments. In Sections 4 and 5 we give the proofs of Theorems 1 and 2.

2 Variance and noise sensitivity

The first building block in the proof of Theorem 1 is a formula for the variance of an arbitrary function of independent random variables, due to Chatterjee [5]. For any positive integer $i$ , denote $[i]=\{1,\ldots,i\}$ .

Lemma 1.

[5] Let $X_{1},\ldots,X_{n}$ be independent random variables taking values in some set $\mathcal{X}$ and let $f:\mathcal{X}^{n}\to\mathbb{R}$ be a measurable function. Denote $X=(X_{1},\ldots,X_{n})$ . Let $X^{\prime}=(X_{1}^{\prime},\ldots,X_{n}^{\prime})$ be an independent copy of $X$ . Under the notation

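$$X^{(i)}=(X_{1},\ldots,X_{i-1},X_{i}^{\prime},X_{i+1},\ldots,X_{n})\quad\text{and}\quad X^{[i]}=(X_{1}^{\prime},\ldots,X_{i}^{\prime},X_{i+1},\ldots,X_{n})$$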

and, in particular, $X^{[0]}=X$ and $X^{[n]}=X^{\prime}$ , we have

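$$\mathrm{Var}(f(X))=\frac{1}{2}\sum_{i=1}^{n}\mathbb{E}\left[(f(X)-f(X^{(i)}))(f(X^{[i-1]})-f(X^{[i]}))\right].$$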

In general, for $A\subseteq[n]$ let $X^{A}$ denote the random vector, obtained from $X$ by replacing the components indexed by $A$ by corresponding components of $X^{\prime}$ .
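The variance formula of Lemma 1 can be verified exactly on a small example. The sketch below is illustrative and not part of the paper: it assumes three independent Bernoulli coordinates with arbitrarily chosen success probabilities `p` and a hypothetical test function `f`, and it enumerates all values of $(X,X^{\prime})$ to compute both sides without sampling error.

```python
import itertools
import numpy as np

def f(x):
    # Hypothetical test function; any real-valued function of the coordinates works.
    return max(x) + 0.5 * x[0] * x[2] - x[1]

n = 3
p = [0.3, 0.5, 0.8]          # P(X_i = 1), an arbitrary illustrative choice

def prob(x):
    """Probability of the point x under independent Bernoulli(p_i) coordinates."""
    return float(np.prod([pi if xi else 1 - pi for pi, xi in zip(p, x)]))

points = list(itertools.product((0, 1), repeat=n))

# Left-hand side: exact variance of f(X).
mean = sum(prob(x) * f(x) for x in points)
variance = sum(prob(x) * (f(x) - mean) ** 2 for x in points)

def swap_one(x, xp, i):
    """X^{(i)}: replace only coordinate i (1-based) of x by that of xp."""
    return x[: i - 1] + (xp[i - 1],) + x[i:]

def swap_prefix(x, xp, i):
    """X^{[i]}: replace the first i coordinates of x by those of xp."""
    return xp[:i] + x[i:]

# Right-hand side: exact expectation over the independent pair (X, X').
rhs = 0.0
for x in points:
    for xp in points:
        weight = prob(x) * prob(xp)
        for i in range(1, n + 1):
            d1 = f(x) - f(swap_one(x, xp, i))
            d2 = f(swap_prefix(x, xp, i - 1)) - f(swap_prefix(x, xp, i))
            rhs += 0.5 * weight * d1 * d2

print(variance, rhs)
```

The two printed numbers should agree up to floating-point rounding, whatever `f` and `p` are chosen.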

In the variance formula above, the order of the variables does not matter and the formula remains valid after permuting the indices $1,\ldots,n$ arbitrarily. In particular, one may take the variables in random order. Thus, if $\sigma=(\sigma(1),\ldots,\sigma(n))$ is a random permutation sampled uniformly from the symmetric group $S_{n}$ and $\sigma([i])$ denotes $\{\sigma(1),\ldots,\sigma(i)\}$ , then

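$$\mathrm{Var}(f(X))=\frac{1}{2}\sum_{i=1}^{n}\mathbb{E}\left[(f(X)-f(X^{(\sigma(i))}))(f(X^{\sigma([i-1])})-f(X^{\sigma([i])}))\right].\tag{2.1}$$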

Note that on the right-hand side of (2.1) the expectation is taken with respect to both $X,X^{\prime}$ , and the random permutation $\sigma$ .

One would intuitively expect that the terms on the right-hand side of (2.1) decrease with $i$ , as the differences $f(X)-f(X^{(\sigma(i))})$ and $f(X^{\sigma([i-1])})-f(X^{\sigma([i])})$ become less correlated as more randomly chosen components get resampled. This is indeed the case and this fact is one of our main tools in proving noise sensitivity. We believe that the following lemma can be useful in diverse situations. The proof is given in Section 4.1 below.

Lemma 2.

Consider the setup of Lemma 1 and the notation above. For $i\in[n]$ , denote

$$B_{i}=\mathbb{E}\left[(f(X)-f(X^{(\sigma(i))}))(f(X^{\sigma([i-1])})-f(X^{\sigma([i])}))\right],$$

where the expectation is taken with respect to components of vectors and random permutations. Then $B_{i}\geq B_{i+1}$ for all $i=1,\ldots,n-1$ and $B_{n}\geq 0$ . In particular, for any $k\in[n]$ ,

$$B_{k}\leq\frac{2\,\mathrm{Var}(f(X))}{k}.$$
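The final bound of Lemma 2 follows from the monotonicity in one line. The derivation below is a sketch that assumes $B_{i}$ is exactly the $i$-th summand of the variance formula (2.1), so that $\mathrm{Var}(f(X))=\frac{1}{2}\sum_{i=1}^{n}B_{i}$.

```latex
k\,B_{k}
  \;\le\; \sum_{i=1}^{k} B_{i}          % B_1 \ge B_2 \ge \dots \ge B_k
  \;\le\; \sum_{i=1}^{n} B_{i}          % B_i \ge B_n \ge 0 for all i
  \;=\; 2\,\mathrm{Var}(f(X)),
\qquad\text{hence}\qquad
B_{k} \;\le\; \frac{2\,\mathrm{Var}(f(X))}{k}.
```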

We also introduce a modification of Lemma 2 that will be more convenient for our purposes. To do so, we introduce the following notation. Let $j$ have uniform distribution on $[n]$ . Let $X^{(j)\circ\sigma([i-1])}$ denote the vector obtained from $X^{\sigma([i-1])}$ by replacing its $j$ -th component by an independent copy of the random variable $X_{j}$ , denoted by $X_{j}^{\prime\prime}$ . Observe that $j$ may belong to $\sigma([i-1])$ and in this case $X_{j}^{\prime\prime}$ is independent of $X_{j}^{\prime}$ appearing in $X^{\sigma([i-1])}$ . With this notation in mind we may prove the following version of Lemma 2.

Lemma 3.

Using the notation of Lemma 2, assuming that $j$ is chosen uniformly at random from the set $[n]$ and independently of other random variables involved, we have for any $k\in[n]$ ,

$$B^{\prime}_{k}\leq\frac{2\,\mathrm{Var}(f(X))}{k}\left(\frac{n+1}{n}\right),$$

where for any $i\in[n]$ ,

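$$B^{\prime}_{i}=\mathbb{E}\left[(f(X)-f(X^{(j)}))(f(X^{\sigma([i-1])})-f(X^{(j)\circ\sigma([i-1])}))\right].$$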

3 Random matrix results

In the proof of Theorem 1 we apply Lemma 3 with $f$ being the top eigenvalue of a Wigner matrix. The usefulness of this bound crucially hinges on the fact that the variance of the top eigenvalue is small, that is, in a sense, the top eigenvalue is “superconcentrated”. This fact is quantified in this section.

Our first lemma on the variance of $\lambda$ is obtained as a combination of a result of Ledoux and Rider [19] on Gaussian ensembles and the universality of fluctuations for Wigner matrices as stated in Erdős, Yau and Yin [9].

Lemma 4.

Assume that $X$ is a Wigner matrix as in Theorem 1. Let $\lambda$ denote the largest eigenvalue of $X$ . Then,

$$\mathrm{Var}(\lambda)\leq(c+o(1))N^{-1/3},$$

where $c>0$ is an absolute constant.

Remark. The result of Lemma 4 implies an improved version of the variance bound

$$\mathrm{Var}(\lambda)\lesssim(\log N)^{C\log\log N}N^{-1/3},$$

following from [9, Theorem 2.2].

We also need the following delocalization result of the top eigenvector of a Wigner matrix which can be found in Tao and Vu [23, Proposition 1.12].

Lemma 5.

[23] Assume that $X$ is a Wigner matrix as in Theorem 1. For any real $c_{0}>0$ , there exists a constant $C>0$ , such that, with probability at least $1-CN^{-c_{0}}$ , any eigenvector $w$ of $X$ with $\|w\|_{2}=1$ satisfies

$$\|w\|_{\infty}\leq\frac{(\log N)^{C}}{\sqrt{N}}.$$
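Delocalization is easy to observe numerically. The sketch below is illustrative only: it assumes Gaussian entries, and the helper name `sup_norm_ratio`, the sizes, and the seed are hypothetical choices. For each size it prints $\sqrt{N}$ times the largest absolute coordinate over all unit eigenvectors, which Lemma 5 bounds by a power of $\log N$ with high probability.

```python
import numpy as np

def sup_norm_ratio(n, rng):
    """sqrt(n) times the largest absolute coordinate over all unit eigenvectors."""
    a = rng.standard_normal((n, n))
    x = np.triu(a) + np.triu(a, 1).T       # symmetric matrix with N(0,1) entries
    vecs = np.linalg.eigh(x)[1]            # columns are orthonormal eigenvectors
    return float(np.sqrt(n) * np.abs(vecs).max())

rng = np.random.default_rng(2)
for n in (100, 200, 400, 800):
    print(n, round(sup_norm_ratio(n, rng), 2))
```

For any unit vector the ratio lies between $1$ and $\sqrt{N}$; delocalization says it stays close to the lower end of that range.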

Our final lemma is a perturbation inequality in $\ell^{\infty}$ -norm of the top eigenvector of a Wigner matrix when a single entry is re-sampled. The proof uses precise estimates on the eigenvalue spacings in Wigner matrices proved in Tao and Vu [23] and Erdős, Yau, and Yin [9].

Lemma 6.

Let $X$ be a Wigner matrix as in Theorem 1 and $X^{\prime}$ be an independent copy of $X$ . For any $(i,j)$ with $1\leq i,j\leq N$ , denote by $X^{(ij)}$ the symmetric matrix obtained from $X$ by replacing the entry $X_{ij}$ by $X^{\prime}_{ij}$ and $X_{ji}$ by $X^{\prime}_{ji}$ . For any $0<\alpha<1/10$ , there exists $\kappa>0$ such that, for all $N$ large enough, with probability at least $1-N^{-\kappa}$ ,

$$\max_{1\leq i,j\leq N}\inf_{s\in\{-1,1\}}\|sv-u^{(ij)}\|_{\infty}\leq N^{-\frac{1}{2}-\alpha},$$

where $v$ and $u^{(ij)}$ are any unit eigenvectors corresponding to the largest eigenvalues of $X$ and $X^{(ij)}$ .

4 Proof of Theorem 1

Now we are ready for the proof of the main results of the paper.

We start by fixing some notation. Let $\lambda$ denote the largest eigenvalue of the Wigner matrix $X$ of Theorem 1 and let $v\in S^{N-1}$ be a corresponding normalized eigenvector. Let $k\in\left[\binom{N}{2}+N\right]$ be an integer to be specified later, and let $X^{[k]}$ be the random symmetric matrix obtained by resampling $k$ random entries on or above the diagonal, as defined in the introduction. We denote by $S_{k}\subset\left[\binom{N}{2}+N\right]$ the set of random positions of the $k$ resampled entries. Let $\lambda^{[k]}$ denote the top eigenvalue of $X^{[k]}$ and $v^{[k]}$ a corresponding normalized eigenvector.

For $1\leq i\leq j\leq N$ , we denote by $Y_{(ij)}$ the symmetric matrix obtained from $X$ by replacing the entry $X_{ij}$ by $X^{\prime\prime}_{ij}$ where $X^{\prime\prime}$ is an independent copy of $X$ . We obtain $Y^{[k]}_{(ij)}$ from $X^{[k]}$ by the same operation. We denote by $(\mu_{(ij)},u_{(ij)})$ , and $(\mu^{[k]}_{(ij)},u^{[k]}_{(ij)})$ the top eigenvalue/eigenvector pairs of $Y_{(ij)}$ and $Y^{[k]}_{(ij)}$ , respectively. Let $(s,t)$ be a pair of indices chosen uniformly at random from $\left[\binom{N}{2}+N\right]$ and satisfying $1\leq s\leq t\leq N$ . For ease of notation, we set $Y=Y_{(st)}$ , $\mu=\mu_{(st)}$ and $u=u_{(st)}$ . We define similarly $Y^{[k]}=Y^{[k]}_{(st)}$ , $\mu^{[k]}=\mu^{[k]}_{(st)}$ and $u^{[k]}=u^{[k]}_{(st)}$ .

By applying Lemma 3 to the function of $n=\binom{N}{2}+N$ independent random variables $f\left((X_{i,j})_{1\leq i\leq j\leq N}\right)=\lambda$ , we obtain that, for any $k\in\left[\binom{N}{2}+N\right]$ ,

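$$\frac{2\,\mathrm{Var}(\lambda)}{k}\cdot\frac{\binom{N}{2}+N+1}{\binom{N}{2}+N}\geq\mathbb{E}\left[(\lambda-\mu)(\lambda^{[k]}-\mu^{[k]})\right].\tag{4.1}$$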

In what follows, we show that the right-hand side of (4.1) satisfies

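$$\mathbb{E}\left[(\lambda-\mu)(\lambda^{[k]}-\mu^{[k]})\right]\simeq\frac{1}{N^{2}}\mathbb{E}\left[\langle v,v^{[k]}\rangle^{2}\right].$$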

This relation, combined with Lemma 4 and (4.1), implies

$$\mathbb{E}\left[\langle v,v^{[k]}\rangle^{2}\right]\lesssim\frac{N^{5/3}}{k},$$

which is sufficient for Theorem 1. We proceed with the formal argument.

Using the notation of the previous section we have

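$$\mathbb{E}\left[(\lambda-\mu)(\lambda^{[k]}-\mu^{[k]})\right]=\mathbb{E}\left[(\langle v,Xv\rangle-\langle u,Yu\rangle)(\langle v^{[k]},X^{[k]}v^{[k]}\rangle-\langle u^{[k]},Y^{[k]}u^{[k]}\rangle)\right].$$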

Using the fact that $v$ maximizes $\left\langle v,Xv\right\rangle$ and $u$ maximizes $\left\langle u,Yu\right\rangle$ we have

$$\langle u,(X-Y)u\rangle\leq\langle v,Xv\rangle-\langle u,Yu\rangle\leq\langle v,(X-Y)v\rangle.$$

Observe that the elements of $X-Y$ are all zeros except at most two that correspond to resampled values. If the element $X_{t,s}$ of $X$ was resampled to get $Y$ , we have, for any vector $x$ ,

$$\langle x,(X-Y)x\rangle=U_{t,s}x_{t}x_{s}$$

with $U_{t,s}=(X_{t,s}-X^{\prime\prime}_{t,s})(1+\mathbbm{1}(t\neq s))$ . Similarly, if we set $U^{\prime}_{t,s}=(X^{\prime}_{t,s}-X^{\prime\prime}_{t,s})(1+\mathbbm{1}(t\neq s))$ , we have $\left\langle x,(X^{[k]}-Y^{[k]})x\right\rangle=U^{\prime}_{t,s}x_{t}x_{s}$ . Therefore, it is straightforward to see that

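$$(\langle v,Xv\rangle-\langle u,Yu\rangle)(\langle v^{[k]},X^{[k]}v^{[k]}\rangle-\langle u^{[k]},Y^{[k]}u^{[k]}\rangle)\geq I,$$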

where we have set,

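$$I=V_{t,s}\min\left\{v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s},\ u_{t}u_{s}v^{[k]}_{t}v^{[k]}_{s},\ v_{t}v_{s}u^{[k]}_{t}u^{[k]}_{s},\ u_{t}u_{s}u^{[k]}_{t}u^{[k]}_{s}\right\},$$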

and for $1\leq i\leq j\leq N$ ,

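$$V_{i,j}=U_{i,j}U^{\prime}_{i,j}=(1+\mathbbm{1}(i\neq j))^{2}(X_{i,j}-X^{\prime\prime}_{i,j})(X^{\prime}_{i,j}-X^{\prime\prime}_{i,j}).$$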
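The two elementary facts used in this step can be checked numerically: the difference of top eigenvalues $\lambda-\mu$ is sandwiched between the quadratic forms of $X-Y$ at $u$ and at $v$, and each of these quadratic forms reduces to $U_{t,s}x_{t}x_{s}$. The sketch below is illustrative only; it assumes Gaussian entries, and the size, the seed, and the resampled position `(s, t)` are arbitrary choices not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50
a = rng.standard_normal((n, n))
x = np.triu(a) + np.triu(a, 1).T            # symmetric matrix with N(0,1) entries

s, t = 4, 9                                  # hypothetical resampled position, s <= t
y = x.copy()
y[s, t] = y[t, s] = rng.standard_normal()    # Y: one entry (and its mirror) resampled

def top_pair(m):
    """Largest eigenvalue and a corresponding unit eigenvector."""
    vals, vecs = np.linalg.eigh(m)
    return vals[-1], vecs[:, -1]

lam, v = top_pair(x)
mu, u = top_pair(y)

# U_{t,s} as in the text: the factor is 2 off the diagonal and 1 on it.
u_ts = (x[t, s] - y[t, s]) * (1 + (t != s))

lower = u @ (x - y) @ u                      # <u, (X - Y) u>
upper = v @ (x - y) @ v                      # <v, (X - Y) v>

print(lower <= lam - mu <= upper)
print(np.isclose(upper, u_ts * v[t] * v[s]), np.isclose(lower, u_ts * u[t] * u[s]))
```

The sandwich holds deterministically because $v$ maximizes $\langle w,Xw\rangle$ and $u$ maximizes $\langle w,Yw\rangle$ over unit vectors $w$.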

In order to have some extra independence, we introduce yet another independent copy of our random variables. For $1\leq i\leq j\leq N$ , let $Z_{(ij)}$ be the symmetric matrix obtained from $X$ by replacing the entry $X_{ij}$ by $X^{\prime\prime\prime}_{ij}$ where $X^{\prime\prime\prime}$ is an independent copy of $X$ , independent of $X^{\prime}$ and $X^{\prime\prime}$ . We obtain $Z^{[k]}_{(ij)}$ from $X^{[k]}$ by the same operation. As above, we denote by $w_{(ij)}$ and $w^{[k]}_{(ij)}$ the top unit eigenvectors of $Z_{(ij)}$ and $Z^{[k]}_{(ij)}$ , respectively. For ease of notation, with $(s,t)$ as above, we define $w=w_{(st)}$ and $w^{[k]}=w^{[k]}_{(st)}$ . The key observation is that $V_{i,j}$ is independent of $Z_{(ij)}$ and $Z^{[k]}_{(ij)}$ .

Fix $0<\alpha<1/10$ and let $C$ be as in Lemma 5 for $c_{0}=10$ . We define $\mathcal{E}=\mathcal{E}_{1}\cap\mathcal{E}_{2}$ to be the intersection of the following two events:

• $\mathcal{E}_{1}$ : for all $1\leq i\leq j\leq N$ : $\max(\|v-w_{(ij)}\|_{\infty},\|u_{(ij)}-w_{(ij)}\|_{\infty},\|v^{[k]}-w_{(ij)}^{[k]}\|_{\infty},\|u_{(ij)}^{[k]}-w_{(ij)}^{[k]}\|_{\infty})\leq N^{-\frac{1}{2}-\alpha}$ .

• $\mathcal{E}_{2}$ : $\|x\|_{\infty}\leq\frac{(\log N)^{C}}{\sqrt{N}}$ for all $x\in\left\{v,u_{(ij)},w_{(ij)},v^{[k]},u_{(ij)}^{[k]},w_{(ij)}^{[k]}:1\leq i,j\leq N\right\}$ .

By Lemmas 5, 6, and the union bound, we have, for all $N$ large enough, $\mathbb{P}(\mathcal{E}_{2}^{c})\leq N^{-6}$ and for some $\kappa>0$ , $\mathbb{P}(\mathcal{E}^{c})\leq N^{-\kappa}$ (provided that we choose properly the $\pm$ -phase for the eigenvectors $u$ , $w$ , $u^{[k]}$ and $w^{[k]}$ ). Observe that when $\mathcal{E}$ holds, for all

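$$x\in\left\{v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s},\ u_{t}u_{s}v^{[k]}_{t}v^{[k]}_{s},\ v_{t}v_{s}u^{[k]}_{t}u^{[k]}_{s},\ u_{t}u_{s}u^{[k]}_{t}u^{[k]}_{s}\right\},$$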

we have, for all $N$ large enough,

$$\left|x-w_{t}w_{s}w^{[k]}_{t}w^{[k]}_{s}\right|\leq\frac{4(\log N)^{3C}}{N^{2+\alpha}}.$$

We show this, for brevity, only for $v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s}$ . Denoting $\delta_{t}=w_{t}-v_{t}$ and $\delta_{t}^{[k]}=w_{t}^{[k]}-v_{t}^{[k]}$ , we write

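$$v_{t}v_{s}v^{[k]}_{t}v^{[k]}_{s}=(w_{t}-\delta_{t})(w_{s}-\delta_{s})(w^{[k]}_{t}-\delta^{[k]}_{t})(w^{[k]}_{s}-\delta^{[k]}_{s}).$$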

Then open the brackets and use that, on $\mathcal{E}$ ,

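$$\max\{|\delta_{t}|,|\delta_{s}|,|\delta^{[k]}_{t}|,|\delta^{[k]}_{s}|\}\leq N^{-\frac{1}{2}-\alpha}\quad\text{and}\quad\max\{|w_{t}|,|w_{s}|,|w^{[k]}_{t}|,|w^{[k]}_{s}|\}\leq(\log N)^{3C}/\sqrt{N}.$$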

If $\mathcal{E}$ holds, we thus have

[TABLE]

On the other hand, if $\mathcal{E}_{2}\backslash\mathcal{E}$ holds, we get

[TABLE]

Finally, if $\mathcal{E}_{2}$ does not hold, using that all the vectors are of unit norm (and therefore, $\max\{|v_{t}|,|v_{s}|,|v^{[k]}_{t}|,|v^{[k]}_{s}|\}\leq 1$ ), we have

[TABLE]

The same bounds hold for $V_{t,s}w_{t}w_{s}w^{[k]}_{t}w^{[k]}_{s}$ on $\mathcal{E}_{2}\backslash\mathcal{E}$ and $\mathcal{E}_{2}^{c}$ . Note also that $\mathbb{E}V_{t,s}^{2}\leq c^{2}_{1}$ for some constant $c_{1}\geq 1$ depending on $\delta$ . Combining altogether the last three bounds, by the Cauchy-Schwarz inequality, we arrive at

[TABLE]

Recalling (4.1), we find

[TABLE]

Integrating over the random choice of $(s,t)$ , we have

[TABLE]

Now, using (4.3) and using $\frac{\binom{N}{2}+N+1}{\binom{N}{2}+N}\leq 2$ , we get

[TABLE]

where $\tilde{V}_{i,j}=V_{i,j}/2$ if $i\neq j$ , $\tilde{V}_{i,i}=V_{i,i}$ and

[TABLE]

Note that for $i\neq j$ , $\mathbb{E}\tilde{V}_{i,j}=2$ and $\mathbb{E}\tilde{V}_{i,i}=\sigma_{0}^{2}$ . We have

[TABLE]

Hence, using that the variable $V_{i,j}$ is independent of the vectors $w_{(ij)},w_{(ij)}^{[k]}$ , we deduce that

[TABLE]

where

[TABLE]

We now argue that in (4.5), we may replace the vectors $w_{(ij)}$ and $w_{(ij)}^{[k]}$ by $v$ and $v^{[k]}$ respectively. We repeat the above argument. Recall the event $\mathcal{E}=\mathcal{E}_{1}\cap\mathcal{E}_{2}$ defined above. As already pointed out, on the event $\mathcal{E}$ , we have

[TABLE]

If $\mathcal{E}_{2}$ holds, we have

[TABLE]

Finally, there is the deterministic bound

[TABLE]

Combining the last three bounds we obtain that

[TABLE]

The right-hand side is upper bounded by $2\varepsilon_{N}$ . We thus have proved that

[TABLE]

with $\varepsilon^{\prime\prime}_{N}=\varepsilon^{\prime}_{N}+2\varepsilon_{N}$ . As already pointed out, by Lemmas 5, 6, and the union bound, we have, for all $N$ large enough, $\mathbb{P}({\mathcal{E}^{\prime}_{2}}^{c})\leq N^{-6}$ and $\mathbb{P}({\mathcal{E}^{\prime}}^{c})\leq N^{-\kappa}$ . It follows that $\varepsilon^{\prime\prime}_{N}\to 0$ with $N$ .

Now, combining Jensen’s inequality and (4.6),

[TABLE]

From Lemma 4, the claim follows.

4.1 Proof of Lemma 2 and Lemma 3

We start with the following technical lemma.

Lemma 7.

Let $f:\mathcal{X}^{n}\to\mathbb{R}$ be a measurable function and let $\sigma\in S_{n}$ be any fixed permutation. Fix $i\in[n-1]$ and $j\in[n]$ such that $j\notin\sigma([i])$ . Let $X_{1},\ldots,X_{n}$ be independent random variables taking values in $\mathcal{X}$ . Then

[TABLE]

Proof. Without loss of generality, we may consider one particular permutation $\sigma$ , defined as follows: set $\sigma(k)=k$ for $k\notin\{1,i\}$ , $\sigma(i)=1$ , $\sigma(1)=i$ , and we may also assume that $j=i+1$ . The proof is identical for any other $\sigma$ and $j$ . In our case,

[TABLE]

Moreover, we have

[TABLE]

We introduce a simplifying notation. Denote $B=(X_{2},\ldots,X_{i})$ , $B^{\prime}=(X^{\prime}_{2},\ldots,X^{\prime}_{i})$ and $C=(X_{i+2},\ldots,X_{n})$ . Therefore, we may rewrite

[TABLE]

and

[TABLE]

Denote $h(X_{1},X_{1}^{\prime},X_{i+1},C)=\mathbb{E}[\left(f(X_{1},B,X_{i+1},C)-f(X_{1}^{\prime},B,X_{i+1},C)\right)\big{|}X_{1},X_{1}^{\prime},X_{i+1},C]$ . Using the independence of $B,B^{\prime}$ and their independence of the remaining random variables, we have

[TABLE]

At the same time, using the same notation for $h$ we have, by the Cauchy-Schwarz inequality and the fact that $X_{i+1}$ and $X_{i+1}^{\prime}$ have the same distribution,

[TABLE]

Now to prove that $A_{i}\geq 0$ , it is sufficient to show that $A_{n}\geq 0$ . Denoting $g(X_{1})=\mathbb{E}[f(X)|\ X_{1}]$ , we have

[TABLE]

where we used Jensen’s inequality and that $\mathbb{E}g(X_{1})=\mathbb{E}f(X)$ .
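The final Jensen step can be checked by exact enumeration on a small product space. The sketch below is only an illustration: the test function `f`, the space $\{0,1\}^{3}$ with independent uniform coordinates, and all variable names are assumptions that do not come from the paper. It computes $g(x_{1})=\mathbb{E}[f(X)|X_{1}=x_{1}]$ and confirms that $\mathbb{E}g(X_{1})^{2}\geq(\mathbb{E}f(X))^{2}$, and that $\mathbb{E}g(X_{1})^{2}$ coincides with $\mathbb{E}f(X)f(\tilde{X})$ when $\tilde{X}$ keeps $X_{1}$ and resamples every other coordinate.

```python
from itertools import product
from fractions import Fraction

def f(x):
    # Hypothetical test function on {0,1}^3; any real-valued function would do.
    return x[0] * x[1] + 2 * x[2] - x[0]

n = 3
points = list(product([0, 1], repeat=n))
w = Fraction(1, len(points))          # uniform weight of each point

mean_f = sum(w * f(x) for x in points)

def g(x1):
    # g(x1) = E[f(X) | X_1 = x1], averaging over the remaining coordinates.
    rest = list(product([0, 1], repeat=n - 1))
    return sum(Fraction(f((x1,) + r), len(rest)) for r in rest)

second_moment_g = sum(Fraction(1, 2) * g(x1) ** 2 for x1 in (0, 1))

# E[f(X) f(X~)], where X~ shares X_1 with X and has fresh copies of X_2, X_3.
cross = sum(
    Fraction(1, 2) * Fraction(1, 4) * Fraction(1, 4)
    * f((x1,) + r) * f((x1,) + r2)
    for x1 in (0, 1)
    for r in product([0, 1], repeat=n - 1)
    for r2 in product([0, 1], repeat=n - 1)
)

jensen_gap = second_moment_g - mean_f ** 2   # nonnegative by Jensen's inequality
```

Exact rational arithmetic is used so that the inequality is verified without floating-point tolerance.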

We proceed with the proof of Lemma 2.

Proof. In this proof, by writing $i+1$ we mean $i+1\ (\text{mod}\ n)$. For each permutation $\sigma\in S_{n}$ and fixed $i\in[n]$ we construct a corresponding permutation $\sigma^{\prime}$ by defining $\sigma^{\prime}(i)=\sigma(i+1),\ \sigma^{\prime}(i+1)=\sigma(i)$ and $\sigma^{\prime}(k)=\sigma(k)$ for $k\notin\{i,i+1\}$.

It is straightforward to see that for any fixed $i$ there is a one-to-one correspondence between $\sigma\in S_{n}$ and $\sigma^{\prime}$. By observing that $\sigma^{\prime}([i])=\sigma([i-1])\cup\{\sigma(i+1)\}$ and $\sigma^{\prime}([i+1])=\sigma([i+1])$ we have, conditionally on $\sigma$,

[TABLE]

where in the last step we used Lemma 7. Using the one-to-one correspondence between all $\sigma$ and $\sigma^{\prime}$, we have

[TABLE]

The proof that $B_{n}\geq 0$ follows from Lemma 7 as well.
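The combinatorial facts used in this proof about the swapped permutation $\sigma^{\prime}$ can be verified by brute force for small $n$. The following sketch is an illustration under stated assumptions: the helper names `swap_adjacent` and `check_swap_properties` are hypothetical, permutations are stored as tuples with `s[m-1]` standing for $\sigma(m)$, and only the positions $1\leq i\leq n-1$ are checked (the wrap-around case $i=n$ is left out).

```python
from itertools import permutations

def swap_adjacent(sigma, i):
    """Return sigma' with the values at (1-based) positions i and i+1 exchanged."""
    s = list(sigma)
    s[i - 1], s[i] = s[i], s[i - 1]
    return tuple(s)

def check_swap_properties(n):
    perms = list(permutations(range(1, n + 1)))
    for i in range(1, n):                      # positions i, i+1 without wrap-around
        images = {swap_adjacent(s, i) for s in perms}
        if images != set(perms):               # sigma -> sigma' is a bijection of S_n
            return False
        for s in perms:
            sp = swap_adjacent(s, i)
            # sigma'([i]) = sigma([i-1]) united with {sigma(i+1)}
            if set(sp[:i]) != set(s[:i - 1]) | {s[i]}:
                return False
            # sigma'([i+1]) = sigma([i+1])
            if set(sp[:i + 1]) != set(s[:i + 1]):
                return False
    return True
```

Calling `check_swap_properties(4)` enumerates all $24$ permutations and each admissible position $i$.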

Finally, we prove Lemma 3.

Proof. To prove this lemma we show an upper bound for $B_{i}^{\prime}$. We have

[TABLE]

Observe that $\mathbb{P}(j\in\sigma([i-1]))=\frac{i-1}{n}$ and the second summand is equal to $B_{i}\frac{n-i+1}{n}$. We proceed with the first summand. For $i\geq 1$, we have

[TABLE]

Finally, we prove that

[TABLE]

Without loss of generality, we consider a particular choice of $\sigma$ and $j$ such that $\sigma(k)=k$ , for $k\in[n]$ and $j=1$ . Therefore, (4.7) will follow from

[TABLE]

Since $X^{(1)}=(X_{1}^{\prime\prime},X_{2},\ldots,X_{n})$ , we have $\mathbb{E}f(X)f(X^{[i-1]})=\mathbb{E}f(X^{(1)})f(X^{[i-1]})$ . This implies that (4.7) is valid whenever

[TABLE]

As in the proof of Lemma 7, this relation holds due to Jensen’s inequality. These lines together imply that

[TABLE]

which, using Lemma 2, proves the claim.

4.2 Proof of Lemma 4

We start with a special case. Let us say that a Wigner matrix as in Theorem 1 is standard if for all $i$ , $\mathbb{E}X_{ii}^{2}=2$ . In this case, the variance of the entries of $X$ is equal to the variance of the entries of a random matrix $Y$ sampled from the Gaussian Orthogonal Ensemble (GOE). If $\mu$ is the largest eigenvalue of $Y$ , it follows from [19, Corollary 3] that for some absolute constant $c>0$ ,

[TABLE]

On the other hand, it follows from [9, Theorem 2.4] (see also [18, Theorem 1.6] for a statement which can be used directly) that,

[TABLE]

We obtain the first claim of the lemma for standard Wigner matrices. To conclude the proof of the lemma for Wigner matrices, it suffices to prove that for any Wigner matrix $X$ , for some $\kappa\geq 1/3$ , we have for all $N$ large enough,

[TABLE]

where $\lambda_{0}$ is the largest eigenvalue of a matrix $X_{0}$ obtained from $X$ by setting all diagonal entries to $0$. We will prove it for any $\kappa<1/2$ (an improvement of the forthcoming Lemma 11 would give (4.8) for any $\kappa<1$). The proof requires some care since the operator norm of $X-X_{0}$ may be much larger than $1$ and the rank of $X-X_{0}$ could be $N$.

There is an easy inequality which is half of (4.8). Let $v_{0}$ be a unit eigenvector of $X_{0}$ with eigenvalue $\lambda_{0}$ . We have

[TABLE]

where $(v_{0})_{i}$ is the $i$ -th coordinate of $v_{0}$ . We observe that $v_{0}$ is independent of $X_{ii}$ for all $i$ and $\mathbb{E}X_{ii}X_{jj}=0$ for $i\neq j$ . Denoting $(x)^{2}_{+}=\max(x,0)^{2}$ , by the Cauchy-Schwarz inequality, we deduce that

[TABLE]

We write $\mathbb{E}\|v_{0}\|^{2}_{\infty}\leq(\log N)^{2C}/N+\mathbb{P}(\|v_{0}\|_{\infty}\geq(\log N)^{C}/\sqrt{N})$. From Lemma 5 applied with $c_{0}=2$, we deduce that for some constant $C>0$,

[TABLE]

It implies the easy half of (4.8) for any $\kappa<1$ .
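The starting point of this easy half is the variational characterization of the largest eigenvalue: since $v_{0}$ is a unit vector and $X_{0}$ has zero diagonal, $\lambda\geq\langle v_{0},Xv_{0}\rangle=\lambda_{0}+\sum_{i}X_{ii}(v_{0})_{i}^{2}$. The sketch below checks this deterministic inequality numerically. It is only an illustration: the Gaussian entries, the diagonal variance $2$, the size `N=200` and the function name `easy_half_gap` are assumptions and are not prescribed by the text.

```python
import numpy as np

def easy_half_gap(N=200, seed=0):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    X0 = np.triu(A, 1) + np.triu(A, 1).T           # symmetric, zero diagonal
    diag = np.sqrt(2.0) * rng.standard_normal(N)   # assumed diagonal law N(0, 2)
    X = X0 + np.diag(diag)

    evals0, evecs0 = np.linalg.eigh(X0)            # eigenvalues in ascending order
    lam0, v0 = evals0[-1], evecs0[:, -1]
    lam = np.linalg.eigvalsh(X)[-1]

    lower = lam0 + np.sum(diag * v0**2)            # equals <v0, X v0>
    rayleigh = v0 @ X @ v0
    return lam, lower, rayleigh
```

The returned `lower` and `rayleigh` agree up to rounding, and `lam` dominates both; the probabilistic part of the argument then controls the size of $\sum_{i}X_{ii}(v_{0})_{i}^{2}$.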

The proof of the converse inequality is more involved. For ease of notation, we introduce, for $N\geq 3$, the number

[TABLE]

We say that a sequence of events $(A_{N})$ holds with overwhelming probability if for any $C>0$ , there exists a constant $c>0$ such that $\mathbb{P}(A_{N})\geq 1-cN^{-C}$ . We repeatedly use the fact that a polynomial intersection of events of overwhelming probability is an event of overwhelming probability. We start with a small deviation lemma which can be found, for example, in [8, Appendix B].

Lemma 8.

Assume that $(Z_{i})_{1\leq i\leq N}$ are independent centered complex variables such that for some $\delta>0$, for all $i$, $\mathbb{E}\exp\left(|Z_{i}|^{\delta}\right)\leq 1/\delta$. Then, for any $(x_{i})\in\mathbb{C}^{N}$, with overwhelming probability,

[TABLE]

For $z=E+{\mathbf{i}}\eta$ with $\eta>0$ and $E\in\mathbb{R}$ , we introduce the resolvent matrices

[TABLE]

where $I$ denotes the identity matrix. The following lemma asserts that the resolvent can be used to estimate the largest eigenvalue of $X$ and $X_{0}$ .

Lemma 9.

Let $X$ be a Wigner matrix as in Theorem 1 and let $\lambda_{1}\geq\ldots\geq\lambda_{N}$ be its eigenvalues. For any $1\leq k\leq N$ , there exists an integer $1\leq i\leq N$ such that for all $E$ and $\eta>0$

[TABLE]

Moreover, let $1\leq k\leq L$ . There exists $c_{0}>0$ such that with overwhelming probability, we have $|\lambda_{k}-2\sqrt{N}|<L^{c_{0}}N^{-1/6}$ and for all integers $1\leq i\leq N$ , and all $E$ such that $|E-2\sqrt{N}|<L^{c_{0}}N^{-1/6}$ ,

[TABLE]

Proof. From the spectral theorem, we have

[TABLE]

where $(v_{1},\ldots,v_{N})$ is an orthonormal basis of eigenvectors of $X$ and $(v_{p})_{i}$ is the $i$ -th coordinate of $v_{p}$ . In particular,

[TABLE]

From the pigeonhole principle, for some $i$ , $({v_{p}})_{i}^{2}\geq 1/N$ and the first statement of the lemma follows.
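The two ingredients of this first part, the spectral decomposition of the resolvent and the pigeonhole choice of a coordinate, can be illustrated numerically. The sketch below is written under assumptions: the Gaussian test matrix, the parameters `N`, `k`, `eta` and the function name `resolvent_check` are hypothetical. It compares $R(z)=(X-zI)^{-1}$ with $\sum_{p}v_{p}v_{p}^{*}/(\lambda_{p}-z)$ and checks that, at $z=\lambda_{k}+{\mathbf{i}}\eta$, a coordinate $i$ with $(v_{k})_{i}^{2}\geq 1/N$ satisfies $N\eta\,\Im R(z)_{ii}\geq 1$, which follows by keeping only the term $p=k$ of the decomposition.

```python
import numpy as np

def resolvent_check(N=50, k=1, eta=0.1, seed=1):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    X = (A + A.T) / np.sqrt(2.0)             # symmetric test matrix (assumed Gaussian)
    evals, V = np.linalg.eigh(X)             # ascending eigenvalues, columns = eigenvectors
    lam_k, v_k = evals[-k], V[:, -k]         # k-th largest eigenvalue and its eigenvector

    z = lam_k + 1j * eta
    R = np.linalg.inv(X - z * np.eye(N))
    R_spec = (V / (evals - z)) @ V.T         # sum_p v_p v_p^T / (lambda_p - z)

    i = int(np.argmax(v_k**2))               # pigeonhole: the largest coordinate works
    return np.max(np.abs(R - R_spec)), v_k[i]**2, N * eta * R[i, i].imag
```

The third returned value is at least $1$ for every choice of the parameters, since all terms of $\Im R(z)_{ii}$ are nonnegative when $\eta>0$.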

Fix an integer $1\leq k\leq L$ . From [9, Theorem 2.2] and Lemma 5, for some constants $c_{0},C_{0}>0$ , we have, with overwhelming probability, that the following event $\mathcal{E}$ holds: $|\lambda_{k}-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ , for all integers $1\leq p\leq N$ ,

[TABLE]

and $\|v_{p}\|^{2}_{\infty}\leq L/N$ . We set $q=\lfloor CL^{3c_{0}}\rfloor$ for some $C$ . Let $E$ be such that $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ . On the event $\mathcal{E}$ , if $C$ is large enough, we have, for all $p>q$ , $E-\lambda_{p}\geq C_{0}p^{2/3}N^{-1/6}$ and

[TABLE]

On the other hand, on the same event $\mathcal{E}$ , we have

[TABLE]

It remains to adjust the value of the constant $c_{0}$ to conclude the proof.

The next step in the proof of (4.8) is a comparison between the resolvent of $X$ and $X_{0}$ for $z$ close to $2\sqrt{N}$ . The following result is a corollary of [9, Theorem 2.1 (ii)].

Lemma 10.

Let $X$ be a Wigner matrix as in Theorem 1. There exists $c>0$ such that, with overwhelming probability, the following event holds: for all $z=E+{\mathbf{i}}\eta$ such that $|2\sqrt{N}-E|\leq\sqrt{N}$ and $N^{-1/2}L^{c}\leq\eta\leq N^{1/2}$ , all $i\neq j$ , we have

[TABLE]

where $\Delta=L^{c}(|E-2\sqrt{N}|+\eta)^{1/4}N^{-7/8}\eta^{-1/2}+L^{c}N^{-2}\eta^{-1}$ .

Proof. Let $Y=X/\sqrt{N}$ and for $z\in\mathbb{C}$ , $\Im(z)>0$ , $G(z)=(Y-zI)^{-1}$ . We have $R(z)=N^{-1/2}G(zN^{-1/2})$ . Theorem 2.1 (ii) in [9] asserts that with overwhelming probability for all $w=a+{\mathbf{i}}b$ such that $|a|\leq 5$ and $N^{-1}L^{c}\leq b\leq 1$ , all $i\neq j$ , we have

[TABLE]

where $\delta=L^{c}\sqrt{\Im(m(w))/(Nb)}+L^{c}(Nb)^{-1}$ and $m(w)$ is the Cauchy-Stieltjes transform of the semi-circular law (for its precise definition see [9]). Then [9, Lemma 3.4] implies that, for some $C>0$ , for all $w=a+{\mathbf{i}}b$ , $|a|\leq 5$ and $0\leq b\leq 1$ , we have $|m(w)|\leq C$ and $|\Im(m(w))|\leq C\sqrt{|a-2|+b}$ . We apply the above result for $a=E/\sqrt{N}$ and $b=\eta/\sqrt{N}$ . We obtain the claimed statement for $R(z)=N^{-1/2}G(zN^{-1/2})$ .
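The change of scale $R(z)=N^{-1/2}G(zN^{-1/2})$ used in this proof is a purely algebraic identity, and it maps the window $N^{-1/2}L^{c}\leq\eta\leq N^{1/2}$ to $N^{-1}L^{c}\leq b\leq 1$ through $b=\eta/\sqrt{N}$. A minimal numerical check is sketched below; the test matrix, the value of `z` and the function name `scaling_error` are assumptions chosen for illustration.

```python
import numpy as np

def scaling_error(N=40, z=2.0 + 0.5j, seed=2):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    X = (A + A.T) / np.sqrt(2.0)                   # symmetric test matrix (assumed Gaussian)
    Y = X / np.sqrt(N)
    I = np.eye(N)
    R = np.linalg.inv(X - z * I)                   # R(z) = (X - zI)^{-1}
    G = np.linalg.inv(Y - (z / np.sqrt(N)) * I)    # G(w) at w = z N^{-1/2}
    return np.max(np.abs(R - G / np.sqrt(N)))      # entrywise error of the identity
```

The returned error is at the level of floating-point rounding, as expected from $X-zI=\sqrt{N}\,(Y-zN^{-1/2}I)$.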

We use Lemma 10 to estimate the difference between $R(z)$ and $R_{0}(z)$ .

Lemma 11.

Let $X$ be a Wigner matrix as in Theorem 1, let $X_{0}$ be obtained from $X$ by setting all diagonal entries to $0$, and let $c_{0}$ be as in Lemma 9. With overwhelming probability, the following event holds: for all $z=E+{\mathbf{i}}\eta$ such that $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/4}$, all $i$,

[TABLE]

Proof. The resolvent identity states that if $A-zI$ and $B-zI$ are invertible matrices then

[TABLE]
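For orientation, one standard form of the resolvent identity is $(A-zI)^{-1}=(B-zI)^{-1}+(B-zI)^{-1}(B-A)(A-zI)^{-1}$; the ordering and sign convention used in (4.10) may differ. The sketch below checks this form, and the second-order expansion obtained by substituting it into itself once, on random symmetric matrices. The matrices, the point `z` and the function name are assumptions made for illustration.

```python
import numpy as np

def resolvent_identity_errors(N=30, z=0.3 + 0.2j, seed=3):
    rng = np.random.default_rng(seed)

    def sym():
        M = rng.standard_normal((N, N))
        return (M + M.T) / 2.0

    A, B = sym(), sym()
    I = np.eye(N)
    RA = np.linalg.inv(A - z * I)
    RB = np.linalg.inv(B - z * I)
    D = B - A

    # One application: RA = RB + RB D RA.
    first = RA - (RB + RB @ D @ RA)
    # Two applications: RA = RB + RB D RB + RB D RB D RA.
    second = RA - (RB + RB @ D @ RB + RB @ D @ RB @ D @ RA)
    return np.max(np.abs(first)), np.max(np.abs(second))
```

Both residuals vanish up to rounding. In the twice-applied form the last term is quadratic in $B-A$, which is what makes the expansion useful when $B-A$ is a small or sparse perturbation.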

Applying this identity twice, we obtain

[TABLE]

(where we omit the parameter $z$ for ease of notation). For any integer $1\leq i\leq N$, we thus have

[TABLE]

Note that $X_{jj}$ is independent of $R_{0}$ . By Lemma 8 and Lemma 10 we find that, with overwhelming probability,

[TABLE]

For a given $z=E+{\mathbf{i}}\eta$ such that $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/4}$ , it is straightforward to check that, for some $c>0$ , $\Delta\leq L^{c}N^{-19/24}$ and $|I(z)|\leq L^{c}N^{-13/12}=o(1/(N\eta))$ .
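The exponent bookkeeping behind these two bounds can be reproduced with exact fractions. The sketch below tracks only powers of $N$ and ignores all factors $L^{c}$; the variable names are assumptions. It evaluates the exponent of $\Delta$ from the formula in Lemma 10 at $\eta=N^{-1/4}$ and $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$, and compares $N^{-13/12}$ with $1/(N\eta)$.

```python
from fractions import Fraction as F

eta_exp = F(-1, 4)       # eta = N^{-1/4}
window_exp = F(-1, 6)    # |E - 2 sqrt(N)| <= L^{c0} N^{-1/6}

# (|E - 2 sqrt(N)| + eta)^{1/4}: the sum is dominated by the larger power of N.
sum_exp = max(window_exp, eta_exp)

term1 = sum_exp / 4 + F(-7, 8) - eta_exp / 2   # first term of Delta
term2 = F(-2) - eta_exp                        # second term of Delta
delta_exp = max(term1, term2)                  # exponent of Delta

inv_N_eta_exp = -(1 + eta_exp)                 # exponent of 1/(N eta)
I_exp = F(-13, 12)                             # exponent claimed for |I(z)|
```

Here `delta_exp` equals $-19/24$ and `inv_N_eta_exp` equals $-3/4$, so $N^{-13/12}$ is indeed $o(1/(N\eta))$. One may also note that $-13/12=1/2+2\cdot(-19/24)$, which is consistent with a bound of order $\sqrt{N}\Delta^{2}$ on the off-diagonal part of the sum, although that reading is an interpretation rather than a statement taken from the text.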

Similarly, we have

[TABLE]

For a given $z$ , by Lemma 8 and Lemma 10, we have with overwhelming probability, for all $k$ , $|G_{k}|\leq L^{c}N^{-13/12}$ and $|J(z)|\leq L(\Delta N+cN^{-1/2})L^{c}N^{-13/12}=o(1/(N\eta))$ .

For a given $z$ , let $\mathcal{E}_{z}$ be the event that $\max_{1\leq i\leq N}|R(z)_{ii}-R_{0}(z)_{ii}|\leq(8N\eta)^{-1}$ and $\mathcal{E}^{\prime}_{z}$ the event that $\max_{1\leq i\leq N}|R(z)_{ii}-R_{0}(z)_{ii}|\leq(4N\eta)^{-1}$ . We have proved so far that for a given $z=E+{\mathbf{i}}\eta$ such that $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/4}$ , with overwhelming probability, $\mathcal{E}_{z}$ holds. By a net argument, it implies that with overwhelming probability, the events $\mathcal{E}_{z}^{\prime}$ hold jointly for all $z=E+{\mathbf{i}}\eta$ with $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/4}$ . Indeed, from the resolvent identity (4.10), we have $|R_{ij}(E+{\mathbf{i}}\eta)-R_{ij}(E^{\prime}+{\mathbf{i}}\eta)|\leq\eta^{-2}|E-E^{\prime}|$ . It follows that if $|E-E^{\prime}|\leq\eta^{2}(8N\eta)^{-1}\leq N^{-1}$ then $|R(z)_{ii}-R_{0}(z)_{ii}|\leq(8N\eta)^{-1}$ . Let $\mathcal{N}$ be a finite subset of the interval $K=\{E:|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}\}$ such that for all $E\in K$ , $\min_{E^{\prime}\in\mathcal{N}}|E-E^{\prime}|\leq N^{-1}$ . We may assume that $\mathcal{N}$ has at most $N$ elements. From what precedes we have the inclusion, with $\eta=N^{-1/4}$ ,

[TABLE]

From the union bound, the right-hand side holds with overwhelming probability. It concludes the proof of the lemma.

Now we have all ingredients necessary to conclude the proof of (4.8). Let $\eta=N^{-1/4}$ . We prove that for some $c>0$ , with overwhelming probability,

[TABLE]

By Lemma 9, with overwhelming probability, $|\lambda-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ and for some $j$ ,

[TABLE]

and if $\lambda>\lambda_{0}$ ,

[TABLE]

By Lemma 11, we deduce that with overwhelming probability, if $\lambda>\lambda_{0}$ ,

[TABLE]

Hence, $\lambda\leq\lambda_{0}+2L^{c_{0}/2}\eta$ , concluding the proof of (4.8).

4.3 Proof of Lemma 6

Let $\lambda=\lambda_{1}\geq\cdots\geq\lambda_{N}$ be the eigenvalues of $X$ . For any $(i,j)$ , let $\lambda^{(ij)}$ be the largest eigenvalue of ${X^{(ij)}}$ . We start by proving that $\lambda$ and $\lambda^{(ij)}$ are close compared to their fluctuations. We have

[TABLE]

where $u^{(ij)}$ is as in Lemma 6. Since $X$ and ${X^{(ij)}}$ have the same distribution, we deduce from Lemma 5 that, for any $c_{0}>0$ , there exists $C>0$ such that with probability at least $1-CN^{2-c_{0}}$ , $\|v\|_{\infty}\leq(\log N)^{C}/\sqrt{N}$ and $\max_{ij}\|u^{(ij)}\|_{\infty}\leq(\log N)^{C}/\sqrt{N}$ . For all $N$ large enough, we have $(\log N)^{C}\leq L$ , where $L$ is defined in (4.9). Hence for any $c_{0}$ , for some new constant $C>0$ , with probability at least $1-CN^{2-c_{0}}$ , $\|v\|_{\infty}\leq L/\sqrt{N}$ and $\max_{ij}\|u^{(ij)}\|_{\infty}\leq L/\sqrt{N}$ . Since $c_{0}$ can be taken arbitrarily large, we deduce that with overwhelming probability, $\|v\|_{\infty}\leq L/\sqrt{N}$ , $\max_{ij}\|u^{(ij)}\|_{\infty}\leq L/\sqrt{N}$ and $\max_{ij}(|X_{ij}|+|X^{\prime}_{ij}|)\leq L/2$ . On this event, we get

[TABLE]

Reversing the roles of $X$ and ${X^{(ij)}}$ and using the union bound, we deduce that, with overwhelming probability,

[TABLE]

It follows from [23, Theorem 1.14] that, for any $\rho>0$ , there exists $\kappa>0$ such that, for all $N$ large enough,

[TABLE]

Let $(v_{1},\ldots,v_{N})$ be an orthonormal basis of eigenvectors of $X$ associated to the eigenvalues $(\lambda_{1},\ldots,\lambda_{N})$ with $v_{1}=v$. We set $\theta=2/5-3\rho/5$ and $q=\lfloor N^{\theta}\rfloor$. For some constant $c>0$ to be defined and $\rho\in(0,1/16)$, we introduce the event $\mathcal{E}_{\rho}$ such that

•

$\lambda_{2}<\lambda-N^{-1/2-\rho}$ and $\lambda_{q}\leq\lambda-cq^{2/3}N^{-1/6}$ ;

•

$\max_{1\leq p\leq q}\|v_{p}\|_{\infty}\leq L/\sqrt{N}$ and $\max_{ij}\|u^{(ij)}\|_{\infty}\leq L/\sqrt{N}$ ;

•

$\max_{ij}(|X_{ij}|+|X^{\prime}_{ij}|)\leq L/2$ .

From what precedes, Lemma 5 and [9, Theorem 2.2], for some $c$ small enough, for any $\rho>0$ there exists $\kappa>0$ such that for all $N$ large enough, $\mathbb{P}(\mathcal{E}_{\rho})\geq 1-N^{-\kappa}$. Note also that we have checked that if $\mathcal{E}_{\rho}$ holds then $\max_{ij}|\lambda-\lambda^{(ij)}|\leq L^{3}/N$.

On the event $\mathcal{E}_{\rho}$ , we now prove that $v$ and $u^{(ij)}$ are close in $\ell^{\infty}$ -norm. For a fixed $(i,j)$ , we write, $u^{(ij)}=\alpha v+\beta x+\gamma y$ , where $\alpha^{2}+\beta^{2}+\gamma^{2}=1$ with $\beta,\gamma$ non-negative real numbers, $x$ is a unit vector in the vector space spanned by $(v_{2},\ldots,v_{q})$ , and $y$ is a unit vector in the vector space spanned by $(v_{q+1},\ldots,v_{N})$ . Set

[TABLE]

We have

[TABLE]

Taking the scalar product with $y$ , we find

[TABLE]

Hence,

[TABLE]

Similarly, taking the scalar product with $x$ , we find

[TABLE]

Since $|\langle a,b\rangle|\leq\|a\|_{\infty}\|b\|_{1}\leq m\|a\|_{\infty}\|b\|_{\infty}$ where $m$ is the number of non-zero entries of $b$, we have $\left|\langle x,(X-{X^{(ij)}})u^{(ij)}\rangle\right|\leq\|x\|_{\infty}L^{2}/\sqrt{N}$. By construction, $x=\sum_{p=2}^{q}\gamma_{p}v_{p}$ where $\sum_{p}|\gamma_{p}|^{2}=1$. If $\mathcal{E}_{\rho}$ holds, using the Cauchy-Schwarz inequality and $\|v_{p}\|_{2}=1$, we deduce that

[TABLE]

So finally,

[TABLE]

We deduce that $|\alpha|=\sqrt{1-\beta^{2}-\gamma^{2}}\geq 1-\beta-\gamma$ is positive for all $N$ large enough. We set $s=\alpha/|\alpha|$ . We find, since $\|y\|_{\infty}\leq\|y\|_{2}\leq 1$ ,

[TABLE]

For our choice of $\theta=2/5-3\rho/5$ , this last expression is $O(L^{4}N^{-3/5+8\rho/5})$ . Indeed, we have

[TABLE]

Since $\rho<1/16$ , we have $3/5-8\rho/5>1/2$ . Hence, finally, if we set ${\kappa^{\prime}}=1/10-8\rho/5>0$ , we get that $\|sv-u^{(ij)}\|_{\infty}=O(L^{4}N^{-1/2{-\kappa^{\prime}}})$ . This concludes the proof of the lemma.
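The arithmetic linking $\rho$, $\theta$ and $\kappa^{\prime}$ in this last step is easy to verify with exact fractions. The sketch below is a small illustration; the function name `exponents` is an assumption, and only powers of $N$ are tracked, the factor $L^{4}$ being ignored.

```python
from fractions import Fraction as F

def exponents(rho):
    """Exponents of N appearing at the end of the proof, for a given rho."""
    theta = F(2, 5) - 3 * rho / 5
    decay = F(3, 5) - 8 * rho / 5      # ||s v - u^{(ij)}||_inf = O(L^4 N^{-decay})
    kappa_prime = decay - F(1, 2)      # so that N^{-decay} = N^{-1/2 - kappa'}
    return theta, decay, kappa_prime
```

For instance, `exponents(F(1, 32))` gives $\kappa^{\prime}=1/20$, while the boundary value $\rho=1/16$ gives $\kappa^{\prime}=0$, which shows why the restriction $\rho<1/16$ is exactly what makes $\kappa^{\prime}$ positive.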

5 Proof of Theorem 2

The proof of Theorem 2 relies on the rigorous justification of the heuristic argument sketched below the statement of Theorem 2, see the forthcoming Lemma 12. This is performed by a careful perturbation argument on the resolvent in Lemma 13. Indeed, the resolvent has nice analytical properties and it is intimately connected to the spectrum, as illustrated in Lemma 14.

Recall that $S_{k}=\{(i_{1},j_{1}),\ldots,(i_{k},j_{k})\}$ is the set of $k$ pairs chosen uniformly at random (without replacement) from the set of all ordered pairs $(i,j)$ of indices with $1\leq i\leq j\leq N$ which is used in the definition of $X^{[k]}$ . We denote by $\lambda$ and $\lambda^{[k]}$ the largest eigenvalues of $X$ and $X^{[k]}$ . Recall the definition of $L=L_{N}$ in (4.9) and the notion of overwhelming probability immediately below (4.9). The main technical lemma is the following:

Lemma 12.

Let $X$ be a Wigner matrix as in Theorem 2 and let $\lambda=\lambda_{1}\geq\cdots\geq\lambda_{N}$ be its eigenvalues. For any $c>0$ there exists a constant $c_{2}>0$ such that for all $\varepsilon>0$ , for all $N$ large enough, with probability at least $1-\varepsilon$ ,

[TABLE]

We postpone the proof of Lemma 12 to the next subsection. We denote by $R(z)=(X-zI)^{-1}$ and $R^{[k]}(z)=(X^{[k]}-zI)^{-1}$ the resolvent of $X$ and $X^{[k]}$ . The proof of Lemma 12 is based on this comparison lemma on the resolvents.

Lemma 13.

Let $X$ be a Wigner matrix as in Theorem 1. Let $c_{0}>0$ be as in Lemma 9 and let $c_{1}>0$ . There exists $c_{2}>0$ such that, with overwhelming probability, the following event holds: for all $k\leq N^{5/3}L^{-c_{2}}$ , for all $z=E+{\mathbf{i}}\eta$ such that $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/6}L^{-c_{1}}$ ,

[TABLE]

We postpone the proof of Lemma 13 to the next subsection. Our next lemma connects the resolvent with eigenvectors.

Lemma 14.

Let $X$ be a Wigner matrix as in Theorem 1 and let $\varepsilon>0$ . There exist $c_{1},c_{2}$ such that the following event holds for all $N$ large enough with probability at least $1-\varepsilon$ : for all $k\leq N^{5/3}L^{-c_{2}}$ , we have, with $z=\lambda+{\mathbf{i}}\eta$ , $\eta=N^{-1/6}L^{-c_{1}}$ ,

[TABLE]

Proof. Let $\lambda=\lambda_{1}\geq\lambda_{2}\geq\cdots\geq\lambda_{N}$ be the eigenvalues of $X$ . Let $(v_{1},\ldots,v_{N})$ be an eigenvector basis of $X$ . Recall that

[TABLE]

As in the proof of Lemma 9, from [9, Theorem 2.2] and Lemma 5, for some constants $c_{0},C>0$ , we have with overwhelming probability that the following event $\mathcal{E}$ holds: $|\lambda-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ , for all integers $1\leq p\leq N$ , $\|v_{p}\|^{2}_{\infty}\leq L/N$ and for all $q>p$ with $q=\lfloor L^{c_{0}}\rfloor$ and $E$ such that $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ we have

[TABLE]

On the other hand, let $\mathcal{E}_{\delta}$ be the event that $\lambda_{2}\leq\lambda-\delta N^{-1/6}$ . Fix $\varepsilon>0$ . From [3, Theorem 2.7] and, e.g., [1, Chapter 3], there exists $\delta>0$ such that

[TABLE]

On the event $\mathcal{E}\cap\mathcal{E}_{\delta}$ , if $|\lambda-E|\leq(\delta/2)N^{-1/6}$ , we have

[TABLE]

Finally, if $|\lambda-E|\leq\eta/L^{2}$, on the event $\mathcal{E}$, we easily find, where $v_{i}$ denotes the $i$-th coordinate of $v$,

[TABLE]

For some $c_{1}>0$ , we thus find, that if $\eta=N^{-1/6}L^{-c_{1}}$ then on the event $\mathcal{E}\cap\mathcal{E}_{\delta}$ , for all $E$ such that $|\lambda-E|\leq\eta/L^{2}$ we have

[TABLE]

We apply this last estimate to $R$ and $E=\lambda$. For each $k$, let $\mathcal{E}^{[k]}$ be the event corresponding to $\mathcal{E}$ for $X^{[k]}$ instead of $X$. We apply the above estimate on the event $\mathcal{E}^{\prime}_{k}=\mathcal{E}^{[k]}\cap\mathcal{E}_{\delta}\cap\{\max_{p=1,2}|\lambda_{p}-\lambda_{p}^{[k]}|\leq\eta/L^{2}\}$ to $R^{[k]}$ and $E=\lambda$. By Lemma 12 and the union bound, $\cap_{k\leq N^{5/3}L^{-c_{2}}}\mathcal{E}^{\prime}_{k}$ has probability at least $1-2\varepsilon$. It concludes the proof.
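The mechanism behind Lemma 14 is that, when $\eta$ is small compared with the gap $\lambda-\lambda_{2}$, the matrix $\eta\,\Im R(\lambda+{\mathbf{i}}\eta)$ is close to the rank-one projector $vv^{*}$. Indeed, the spectral decomposition gives $\eta\,\Im R(\lambda+{\mathbf{i}}\eta)=\sum_{p}\frac{\eta^{2}}{(\lambda_{p}-\lambda)^{2}+\eta^{2}}v_{p}v_{p}^{*}$, so the operator norm of the difference is at most $\eta^{2}/((\lambda-\lambda_{2})^{2}+\eta^{2})$. The sketch below checks this deterministic bound; the Gaussian test matrix, the ratio `eta_over_gap` and the function name are assumptions, and the precise error terms of Lemma 14 are not reproduced.

```python
import numpy as np

def rank_one_error(N=60, eta_over_gap=0.05, seed=4):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    X = (A + A.T) / np.sqrt(2.0)                 # symmetric test matrix (assumed Gaussian)
    evals, V = np.linalg.eigh(X)
    lam, v = evals[-1], V[:, -1]                 # top eigenvalue and eigenvector
    gap = lam - evals[-2]
    eta = eta_over_gap * gap

    R = np.linalg.inv(X - (lam + 1j * eta) * np.eye(N))
    err = np.linalg.norm(eta * R.imag - np.outer(v, v), 2)   # operator norm
    bound = eta**2 / (gap**2 + eta**2)
    return err, bound
```

Entrywise, this says that $\eta\,\Im R_{ij}(\lambda+{\mathbf{i}}\eta)$ approximates $v_{i}v_{j}$, which is how a resolvent comparison between $X$ and $X^{[k]}$ can be turned into a comparison of the products $v_{i}v_{j}$ and $v^{[k]}_{i}v^{[k]}_{j}$.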

We may now conclude the proof of Theorem 2. Let $c_{1},c_{2}$ be as in Lemma 14, $k\leq N^{5/3}L^{-c_{2}}$ and $\eta=N^{-1/6}L^{-c_{1}}$ . Up to increasing the value of $c_{2}$ , we may also assume that the conclusion of Lemma 13 holds. By Lemma 5, Lemma 13 and Lemma 14, for any $\varepsilon>0$ , for all $N$ large enough, with probability at least $1-\varepsilon$ , it holds that for some $c>0$ : $\sqrt{N}\|v\|_{\infty}\leq(\log N)^{c}$ , $\sqrt{N}\|v^{[k]}\|_{\infty}\leq(\log N)^{c}$ and

[TABLE]

Applied to $i=j$ , we get that for some $s_{i}\in\{-1,1\}$ ,

[TABLE]

Notably, we find

[TABLE]

Let $J=\{1\leq i\leq N:\sqrt{N}|v_{i}|\geq L^{-1/3}\}$ . It follows from the above inequality that for $i,j\in J$ , $s_{i}=s_{j}$ . Let $s$ be this common value. We have for all $i\in J$ ,

[TABLE]

Moreover, for all $i\notin J$ , by definition,

[TABLE]

It concludes the proof of Theorem 2.
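The resampling procedure that defines $X^{[k]}$ is straightforward to simulate, which can help build intuition for the statement just proved. The sketch below is an illustration under assumptions: entries are taken Gaussian with off-diagonal variance $1$ and diagonal variance $2$, and the helper names `wigner`, `resample`, `top_eigenvector` and `overlap` are hypothetical. It draws $X$ and an independent copy $X^{\prime}$, replaces $k$ uniformly chosen pairs $(i,j)$ with $i\leq j$ (without replacement) symmetrically, and returns $|\langle v,v^{[k]}\rangle|$.

```python
import numpy as np

def wigner(N, rng):
    # Symmetric matrix with N(0,1) off-diagonal and N(0,2) diagonal entries (assumed law).
    A = rng.standard_normal((N, N))
    U = np.triu(A, 1)
    return U + U.T + np.diag(np.sqrt(2.0) * rng.standard_normal(N))

def resample(X, Xp, k, rng):
    # Replace k uniformly chosen pairs (i, j), i <= j, by the entries of Xp.
    N = X.shape[0]
    iu, ju = np.triu_indices(N)
    chosen = rng.choice(len(iu), size=k, replace=False)
    Xk = X.copy()
    Xk[iu[chosen], ju[chosen]] = Xp[iu[chosen], ju[chosen]]
    Xk[ju[chosen], iu[chosen]] = Xp[iu[chosen], ju[chosen]]   # keep symmetry
    return Xk

def top_eigenvector(X):
    return np.linalg.eigh(X)[1][:, -1]

def overlap(N, k, seed=0):
    rng = np.random.default_rng(seed)
    X, Xp = wigner(N, rng), wigner(N, rng)
    Xk = resample(X, Xp, k, rng)
    return abs(top_eigenvector(X) @ top_eigenvector(Xk)), X, Xk
```

Comparing `overlap(N, k)[0]` for $k$ well below and well above $N^{5/3}$ gives a rough picture of the two regimes, with the caveat that the statements are asymptotic and the separation need not be sharp at sizes that are practical to simulate.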

5.1 Proof of Lemma 13

The proof of Lemma 13 is based on a technical martingale argument. Thanks to the resolvent identity (4.10), we will write $R^{[k]}_{ij}(z)-R_{ij}(z)$ as a sum of martingale differences up to small error terms; this is carried out in Equation (5.5). These martingales will allow us to use concentration inequalities. Each term of the martingale differences will be estimated thanks to the upper bound on resolvent entries given in Lemma 10.

We apply the resolvent identity many times and, for technical convenience, it will be easier to have a uniform bound on our random variables. We thus start by truncating our random variables $(X_{ij})$. Set $\tilde{X}_{ij}=X_{ij}\mathbbm{1}(|X_{ij}|\leq(\log N)^{c})$ and $\tilde{X}^{\prime}_{ij}=X^{\prime}_{ij}\mathbbm{1}(|X^{\prime}_{ij}|\leq(\log N)^{c})$ with $c=2/\delta$. The matrix $\tilde{X}$ has independent entries above the diagonal. Moreover, since $\mathbb{E}\exp(|X_{ij}|^{\delta})\leq 1/\delta$, with overwhelming probability, $X=\tilde{X}$ and $X^{\prime}=\tilde{X}^{\prime}$. It is also straightforward to check that $\mathbb{E}|X_{ij}|^{2}\mathbbm{1}(|X_{ij}|\geq(\log N)^{c})=O(\exp(-(\log N)^{2}/2))$. It implies that $|\mathbb{E}\tilde{X}_{ij}|=O(\exp(-(\log N)^{2}/2))$ and $\mathrm{Var}(\tilde{X}_{ij})=1+O(\exp(-(\log N)^{2}/2))$ for $i\neq j$. We define the matrix $\bar{X}$ by setting, for $i\neq j$,

[TABLE]

The matrix $\bar{X}$ is a Wigner matrix as in Theorem 2 with entries in $[-L/4,L/4]$. Moreover, from Gershgorin’s circle theorem [15, Theorem 6.6.1], with overwhelming probability, the operator norm of $X-\bar{X}$ satisfies $\|X-\bar{X}\|=O(N\exp(-(\log N)^{2}/2))$. Observe that from the spectral theorem, for any Hermitian matrix $A$, $\|(A-z)^{-1}\|\leq|\Im(z)|^{-1}$. In particular, from the resolvent identity (4.10), we get $\|(X-z)^{-1}-(\bar{X}-z)^{-1}\|=\|(X-z)^{-1}(X-\bar{X})(\bar{X}-z)^{-1}\|\leq\Im(z)^{-2}\|X-\bar{X}\|=O(N^{3}\exp(-(\log N)^{2}/2))$ if $\Im(z)\geq N^{-1}$. The same truncation procedure applies for $X^{[k]}$. In the proof of Lemma 13, we may thus assume without loss of generality that the random variables $X_{ij}$ have support in $[-L/4,L/4]$.
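The two deterministic facts used in this truncation step, $\|(A-z)^{-1}\|\leq|\Im(z)|^{-1}$ for Hermitian $A$ and the resulting perturbation bound $\|(X-z)^{-1}-(\bar{X}-z)^{-1}\|\leq\Im(z)^{-2}\|X-\bar{X}\|$, can be checked numerically. In the sketch below the matrices, the perturbation size `eps`, the point `z` and the function name are assumptions made for illustration; the perturbation is a generic small symmetric matrix, not the truncation itself.

```python
import numpy as np

def resolvent_norm_checks(N=40, eta=0.05, eps=1e-3, seed=5):
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((N, N))
    X = (A + A.T) / np.sqrt(2.0)
    P = rng.standard_normal((N, N))
    Xbar = X + eps * (P + P.T) / 2.0         # small symmetric perturbation of X

    z = 1.0 + 1j * eta
    I = np.eye(N)
    R = np.linalg.inv(X - z * I)
    Rbar = np.linalg.inv(Xbar - z * I)

    def op(M):
        return np.linalg.norm(M, 2)          # operator (spectral) norm

    return op(R), 1.0 / eta, op(R - Rbar), op(X - Xbar) / eta**2
```

The first value never exceeds the second, and the third never exceeds the fourth. With $\Im(z)\geq N^{-1}$ the factor $\Im(z)^{-2}$ costs at most $N^{2}$, which is negligible against $\exp(-(\log N)^{2}/2)$.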

It will also be convenient to assume that the random subset $S_{k}$ does not contain too many points on a given row or column. To that end, for $0\leq t\leq k$, let ${\cal F}_{t}$ be the $\sigma$-algebra generated by the random variables $X$, $S_{k}$ and $(X^{\prime}_{i_{s},j_{s}})_{1\leq s\leq t}$. For $1\leq i,j\leq N$, we set

[TABLE]

Note that $T_{ij}$ is ${\cal F}_{0}$ -measurable. We have

[TABLE]

Besides, from [6, Proposition 1.1], for any $u>0$ ,

[TABLE]

If $k\leq N^{5/3}L^{-c_{2}}$ , it follows that with overwhelming probability, the following event, say $\mathcal{T}$ , holds: $\max_{ij}|T_{ij}|\leq 4k^{\prime}/N$ where for ease of notation we have set

[TABLE]

Now, let $c$ be as in Lemma 10 and, for $0\leq t\leq k$ , we denote by $\mathcal{E}_{t}\in{\cal F}_{t}$ the event that $\mathcal{T}$ holds and that the conclusion of Lemma 10 holds for $X^{[t]}$ and $R^{[t]}$ (with the convention $X^{[0]}=X$ ). If $\mathcal{E}_{t}$ holds, then for all $z=E+{\mathbf{i}}\eta$ with $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/6}L^{-c_{1}}$ , we have,

[TABLE]

where $c^{\prime}=1+c+\max(c_{0}/2,c_{0}/4+c_{1}/2)$ .

After these preliminaries, we may now write the resolvent expansion. Our goal is to write $R^{[k]}_{ij}(z)-R_{ij}(z)$ as a sum of martingale differences up to error terms. The outcome will be Equation (5.5) below. We define $X_{0}^{[t]}$ as the symmetric matrix obtained from $X^{[t]}$ by setting the entries $(i_{t},j_{t})$ and $(j_{t},i_{t})$ to $0$. By construction $X^{[t+1]}_{0}$ is ${\cal F}_{t}$-measurable. We denote by $R_{0}^{[t]}$ the resolvent of $X_{0}^{[t]}$. The resolvent identity (4.10) implies that

[TABLE]

(we omit the parameter $z$ for ease of notation). Now, for $i\neq j$, we set $E^{s}_{ij}=e_{i}e_{j}^{*}+e_{j}e_{i}^{*}$ and $E_{ii}^{s}=e_{i}e_{i}^{*}$, where $e_{i}$ denotes the canonical basis vector of $\mathbb{R}^{N}$ with all entries equal to $0$ except the $i$-th entry, which is equal to $1$. We have

[TABLE]

We use that $|X_{ij}|\leq L/4$ and $|(R^{[t+1]}_{0})_{ij}|\leq\eta^{-1}$. If $\mathcal{E}_{t}$ holds, we deduce that for all $z=E+{\mathbf{i}}\eta$ with $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/6}L^{-c_{1}}$, we have

[TABLE]

Similarly, the resolvent identity (4.10) with $R^{[t+1]}$ and $R^{[t]}$ implies that, if $\mathcal{E}_{t}$ holds, for all $z=E+{\mathbf{i}}\eta$ with $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/6}L^{-c_{1}}$ , we have

[TABLE]

Finally, the resolvent identity with $R^{[t+1]}$ and $R_{0}^{[t+1]}$ gives

[TABLE]

Note that $\mathbb{E}[X^{\prime}_{i_{t+1}j_{t+1}}|{\cal F}_{t}]=0$. Using $|X_{i_{t}j_{t}}|\leq L/4$ and (5.1)-(5.2)-(5.3), we deduce that

[TABLE]

where $s^{[t]}_{ij}=((R_{0}^{[t]}E^{s}_{i_{t}j_{t}})^{2}R_{0}^{[t]})_{ij}$ and, if $\mathcal{E}_{t}$ holds,

[TABLE]

We apply the resolvent identity one last time, with $R_{0}^{[t+1]}$ and $R^{[t]}$:

[TABLE]

If $\mathcal{E}_{t}$ holds, we arrive at,

[TABLE]

where $r^{[t]}_{ij}=(R_{0}^{[t]}E^{s}_{i_{t}j_{t}}R_{0}^{[t]})_{ij}$ . We have thus found that

[TABLE]

where we have set, with $Y_{ij}=X_{ij}^{2}-\mathbb{E}X_{ij}^{2}$ , $Y^{\prime}_{ij}={X^{\prime}_{ij}}^{2}-\mathbb{E}X_{ij}^{2}$ ,

[TABLE]

In this final step of the proof, we use concentration inequalities to estimate the terms in (5.5). We set $Z_{t+1}=(R^{[t+1]}_{ij}-\mathbb{E}[R^{[t+1]}_{ij}|{\cal F}_{t}])\mathbbm{1}_{\mathcal{E}_{t}}$ . We write, for any $u\geq 0$ ,

[TABLE]

By Lemma 10, we have for any $c>0$ , $\sum_{t=0}^{k-1}\mathbb{P}(\mathcal{E}_{t}^{c})=O(N^{-c})$ . Since $\mathcal{E}_{t}\in{\cal F}_{t}$ , we have that $\mathbb{E}[Z_{t+1}|{\cal F}_{t}]=0$ . Also, from (5.2)-(5.4), $|Z_{t}|\leq 2b_{t}$ . On the event $\mathcal{T}$ , we have

[TABLE]

Azuma-Hoeffding martingale inequality implies that, for $u\geq 0$ ,

[TABLE]

We apply the latter inequality to $u=\log N$. We deduce that, with overwhelming probability,

[TABLE]
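For reference, a standard form of the Azuma-Hoeffding inequality states that a martingale with increments bounded by $b_{t}$ satisfies $\mathbb{P}(|\sum_{t}Z_{t}|\geq u)\leq 2\exp(-u^{2}/(2\sum_{t}b_{t}^{2}))$; the normalization used in the display above may differ. The sketch below compares this bound with the exact tail in the simplest case of independent uniform $\pm 1$ increments, where the tail is a binomial sum. The choice of increments, the value `n = 200` and the function names are assumptions made for illustration.

```python
from math import comb, exp

def exact_tail(n, u):
    """P(|S_n| >= u) for S_n a sum of n independent uniform +/-1 signs."""
    total = sum(comb(n, b) for b in range(n + 1) if abs(2 * b - n) >= u)
    return total / 2**n

def azuma_bound(n, u, b=1.0):
    """2 exp(-u^2 / (2 n b^2)) for n increments bounded by b in absolute value."""
    return 2.0 * exp(-u**2 / (2.0 * n * b**2))

n = 200
# Largest ratio exact tail / bound over all integer thresholds u.
worst_ratio = max(exact_tail(n, u) / azuma_bound(n, u) for u in range(0, n + 1))
```

The ratio stays below $1$ for every threshold. Taking $u$ of order $\log N$ times $(\sum_{t}b_{t}^{2})^{1/2}$ makes the right-hand side smaller than any power of $N$, which is the sense in which such bounds hold with overwhelming probability.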

We may treat similarly the random variable $s^{\prime}_{ij}$ in (5.5). We set $Z^{\prime}_{t+1}=s^{[t+1]}_{ij}Y^{\prime}_{i_{t+1}j_{t+1}}\mathbbm{1}_{\mathcal{E}_{t}}$ . Note that $s^{[t+1]}_{ij}$ is ${\cal F}_{t}$ -measurable and $\mathbb{E}[Y^{\prime}_{i_{t+1}j_{t+1}}|{\cal F}_{t}]=0$ . Thus $\mathbb{E}[Z^{\prime}_{t+1}|{\cal F}_{t}]=0$ . Moreover, since $|Y_{ij}|\leq L^{2}/16$ , from (5.2), we find $|Z^{\prime}_{t+1}|\leq b^{\prime}_{t}=L^{2}(\delta^{2}\delta_{0}+\delta_{0}^{3}\mathbbm{1}_{(t\in T_{ij})})$ . If $\mathcal{T}$ holds, we get

[TABLE]

We write, for $u\geq 0$ ,

[TABLE]

From Azuma-Hoeffding martingale inequality, we deduce that, with overwhelming probability,

[TABLE]

We now estimate the random variable $r_{ij}$ in (5.5). We will also use Azuma-Hoeffding inequality but we need to introduce a backward filtration (because we have to deal with the random variables $X_{i_{t},j_{t}}$ instead of $X^{\prime}_{i_{t},j_{t}}$ as in $s^{\prime}_{ij}$ ). We define ${\cal F}^{\prime}_{t}$ as the $\sigma$ -algebra generated by the random variables, $X^{\prime}$ , $S_{k}$ and $\{(X_{ij}):\{i,j\}\notin\{i_{s},j_{s}\},s\leq t\}$ . By construction $X^{[t]}$ and $X_{0}^{[t]}$ are ${\cal F}^{\prime}_{t}$ -measurable random variables. Let $\mathcal{E}^{\prime}_{t}\in{\cal F}^{\prime}_{t}$ be the event that $\mathcal{T}$ holds and that the conclusion of Lemma 10 holds for $X^{[t]}$ . If $\mathcal{E}^{\prime}_{t}$ holds, then for all $z=E+{\mathbf{i}}\eta$ with $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/6}L^{-c_{1}}$ , we have,

[TABLE]

Arguing as in (5.2), if $\mathcal{E}^{\prime}_{t}$ holds then

[TABLE]

The variable $r^{[t]}_{ij}$ is ${\cal F}^{\prime}_{t}$ -measurable and $\mathbb{E}(X_{i_{t}j_{t}}|{\cal F}^{\prime}_{t})=0$ . We write, for $u\geq 0$ ,

[TABLE]

where $\tilde{Z}_{t+1}=r^{[t]}_{ij}X_{i_{t}j_{t}}\mathbbm{1}_{\mathcal{E}^{\prime}_{t}}$ . We have $\mathbb{E}(\tilde{Z}_{t+1}|{\cal F}^{\prime}_{t})=0$ and

[TABLE]

Arguing as above, from Azuma-Hoeffding martingale inequality, we deduce that with overwhelming probability,

[TABLE]

Similarly, repeating the argument leading to (5.7) with $s_{ij}$ and the filtration $({\cal F}^{\prime}_{t})$ gives with overwhelming probability,

[TABLE]

We note also that if $\mathcal{T}$ holds then

[TABLE]

where the last inequality holds provided that $k\leq N^{5/3}$ . So finally, from (5.5)-(5.6)-(5.7)-(5.8)-(5.9), we have proved that for a given $z=E+{\mathbf{i}}\eta$ such that $|E-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ and $\eta=N^{-1/6}L^{-c_{1}}$ , with overwhelming probability

[TABLE]

where the inequality holds provided that $k\leq N^{5/3}$ . Recall that $|R_{ij}(E+{\mathbf{i}}\eta)-R_{ij}(E^{\prime}+{\mathbf{i}}\eta)|\leq\eta^{-2}|E-E^{\prime}|$ . By a net argument (as in the proof of Lemma 11), we deduce that with overwhelming probability for all $z=E+{\mathbf{i}}\eta$ such that $|2\sqrt{N}-E|\leq L^{c_{0}}N^{-1/6}$ , $\left|R^{[k]}_{ij}(z)-R_{ij}(z)\right|\leq 4L^{2}\sqrt{k^{\prime}}\delta^{2}$ . It concludes the proof of Lemma 13.

5.2 Proof of Lemma 12

Let $c_{0}$ be as in Lemma 9 and $c>0$. We set $c_{1}=c_{0}/2+2c$ and let $\eta=N^{-1/6}L^{-c_{1}}$. Let $p\in\{1,2\}$. We start by bounding $\min_{j}|\lambda_{p}-\lambda_{j}^{[k]}|$ and $\min_{j}|\lambda^{[k]}_{p}-\lambda_{j}|$. Since $X$ and $X^{[k]}$ have the same distribution, we only prove that with overwhelming probability

[TABLE]

By Lemma 9, with overwhelming probability, $|\lambda_{p}-2\sqrt{N}|\leq L^{c_{0}}N^{-1/6}$ and for some integer $1\leq i\leq N$ ,

[TABLE]

and,

[TABLE]

By Lemma 13, we deduce that if $k\leq N^{5/3}L^{-c_{2}}$ , with overwhelming probability,

[TABLE]

It proves (5.10).

We may now conclude the proof of Lemma 12. Fix $\varepsilon>0$ . As already noticed, from [3, Theorem 2.7], there exists $\delta>0$ such that, with probability at least $1-\varepsilon$ , $\lambda_{2}<\lambda-\delta N^{-1/6}$ . From what precedes, with probability at least $1-2\varepsilon$ , $\mathcal{E}_{\delta}$ holds and for all $k\leq N^{5/3}L^{-c_{2}}$ , we have

[TABLE]

with $\alpha=2L^{c_{0}/2}\eta$. On this event, we readily find $|\lambda-\lambda^{[k]}|\leq\alpha$ and, for some $p$, $|\lambda_{p}-\lambda^{[k]}_{2}|\leq\alpha$. Assume that this last inequality is false for $p=2$. Since $2\alpha<\delta N^{-1/6}$, if $p\neq 2$, then $p\geq 3$ and we deduce that $\lambda_{2}>\lambda_{2}^{[k]}+\alpha$. We note that, on our event, for some $q$, we have $|\lambda_{2}-\lambda_{q}^{[k]}|\leq\alpha$. In particular, $\lambda_{q}^{[k]}>\lambda_{2}^{[k]}$. So necessarily, $q=1$ and, from the triangle inequality, $|\lambda_{2}-\lambda_{1}|\leq 2\alpha$. This is a contradiction since $2\alpha<\delta N^{-1/6}$. It concludes the proof of Lemma 12.

Acknowledgments We would like to thank Jaehun Lee for pointing out a mistake in the proof of Lemma 3 in an early version of this paper. We also would like to thank the referees for their valuable reports.

Bibliography


[1] Greg W. Anderson, Alice Guionnet, and Ofer Zeitouni. An introduction to random matrices, volume 118 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2010.
[2] I. Benjamini, G. Kalai, and O. Schramm. Noise sensitivity of Boolean functions and applications to percolation. Publications Mathématiques de l’Institut des Hautes Etudes Scientifiques, 90(1):5–43, 1999.
[3] Paul Bourgade, László Erdős, and Horng-Tzer Yau. Edge universality of beta ensembles. Comm. Math. Phys., 332(1):261–353, 2014.
[4] J. Bourgain, J. Kahn, G. Kalai, Y. Katznelson, and N. Linial. The influence of variables in product spaces. Israel Journal of Mathematics, 77(1-2):55–64, 1992.
[5] Sourav Chatterjee. Concentration inequalities with exchangeable pairs. PhD thesis, Stanford University, 2005.
[6] Sourav Chatterjee. Stein’s method for concentration inequalities. Probab. Theory Related Fields, 138(1-2):305–321, 2007.
[7] Sourav Chatterjee. Superconcentration and related topics. Springer, 2016.
[8] László Erdős, Horng-Tzer Yau, and Jun Yin. Bulk universality for generalized Wigner matrices. Probab. Theory Related Fields, 154(1-2):341–407, 2012.
