Tail bounds for gaps between eigenvalues of sparse random matrices
Patrick Lopatto, Kyle Luh

TL;DR
This paper establishes the first eigenvalue repulsion bounds for sparse random matrices, demonstrating simple spectra and applying these results to Erdős–Rényi graphs to unify weak and strong nodal domains.
Contribution
It introduces novel eigenvalue gap bounds for sparse matrices, extending previous work and improving sparsity and error probability ranges.
Findings
Sparse random matrices have simple spectrum with high probability, as a consequence of eigenvalue repulsion.
Tail bounds are established for the gaps between eigenvalues of sparse random matrices.
Weak and strong nodal domains coincide in sparse Erdős–Rényi graphs.
Abstract
We prove the first eigenvalue repulsion bound for sparse random matrices. As a consequence, we show that these matrices have simple spectrum, improving the range of sparsity and error probability from the work of the second author and Vu. As an application of our tail bounds, we show that for sparse Erdős–Rényi graphs, weak and strong nodal domains are the same, answering a question of Dekel, Lee, and Linial.
P.L. is partially supported by the NSF Graduate Research Fellowship Program under grant DGE-1144152.
K. Luh was partially supported by NSF postdoctoral fellowship DMS-1702533.
Contents
- 1 Introduction
- 2 Main Results
- 3 Proof Strategy
- 4 Compressible and Dominated Vectors
- 5 Incompressible Vectors
- 6 Proofs of Main Results
1. Introduction
The gaps between eigenvalues of symmetric random matrices have been extensively studied by mathematicians and physicists. For the classical integrable ensembles, the Gaussian Orthogonal Ensemble and Gaussian Unitary Ensemble, the limiting spectral distribution follows the semicircle law. For an individual eigenvalue gap, however, the limiting distribution was only recently obtained [60]. Rapid progress in random matrix theory has permitted the extension of this result to a large class of random matrix models [61, 30, 57, 11, 25, 26, 27, 28, 69, 58, 59, 3, 2, 37, 17, 19, 18, 51, 16].
Much effort has been expended on understanding the extremal eigenvalue gaps, in particular the largest eigenvalue gap in the bulk of the spectrum, . Ben Arous and Bourgade [12] demonstrated that for the GUE normalized so that its spectrum is supported on , so that the typical inter-particle distance in the bulk is about , the largest bulk gap is of order . Figalli and Guionnet extended this result to β-ensembles with [34]. In [32], Feng and Wei showed that the fluctuations of the largest gap are of order and computed the limiting distribution. In work of the first author with Landon and Marcinek, the largest gap results of [12, 32] were extended to generalized Wigner matrices [38], including those with discrete entry distributions. We note that recent work of Bourgade [14], which presents a concise analysis of the convergence to equilibrium of Dyson Brownian motion, is able to recover the same result at the cost of imposing a weak smoothness assumption on the matrix entries.
While we now have a substantial understanding of the largest eigenvalue gap, the smallest gap, , is more difficult to investigate because it lies well below the typical inter-particle distance. Ben Arous and Bourgade [12] showed using the determinantal structure of the GUE that its smallest gap is of order . In [31], Feng, Tian, and Wei identified the normalized limit of the smallest eigenvalue gap of the GOE and found that the gap is of order ; their argument builds on techniques previously developed by Feng and Wei to study circular β-ensembles [33]. Currently, the smallest gap lies outside of the purview of traditional universality results such as the Four Moment Theorem [62], and the techniques in the recent work [38] are not applicable. The strongest available result is in the recent work of Bourgade [14], which shows universality of the smallest gap, but requires that the matrix entries possess a weak form of smoothness. At present, no universality results exist for the smallest gap for matrices that are sparse or have discrete entry distributions, such as a matrix of Bernoulli random variables.
While tail bounds are known for the individual gaps when the matrix entries are more general random variables [61, 58], the error rates are not strong enough to take a union bound to conclude anything about the minimum gap. We now scale the matrices so that their spectrum lies on , which makes the average inter-particle distance ; we take this convention to match the existing tail bound literature, and it remains in force throughout the rest of the paper. For Hermitian matrices, under stringent smoothness and decay assumptions on the random variables, a result of Erdős, Schlein, and Yau [29] implies that there exists a small constant such that
[TABLE]
for any . For discrete random variables, it was a milestone just to show that [63]. In particular, Tao and Vu showed that for any , with probability at least a random symmetric matrix has simple spectrum, meaning every eigenvalue appears with multiplicity one. In follow-up work with Nguyen [48], they showed the following tail bound for the eigenvalue gaps. Given eigenvalues labeled in ascending order, we denote the gaps by .
Theorem 1.1 ([48, Theorem 2.1]).
There exists a constant such that the following holds for the eigenvalue gaps, , of a real symmetric Wigner matrix. For any and ,
[TABLE]
Setting , one can deduce that a real symmetric random matrix has simple spectrum with probability at least . A related problem, posed by Babai, is whether the adjacency matrix of an Erdős–Rényi random graph has simple spectrum. This was resolved affirmatively for all dense random graphs in [63, 48]. A consequence in complexity theory is that for such random graphs the graph isomorphism problem is in complexity class P [6].
In this work we study the eigenvalue gaps of sparse random matrices. The theory of sparse random matrices is of interest in its own right, but it also has innumerable applications in computer science and statistics. In contexts where sparse random matrices have similar spectral guarantees as their dense counterparts, they offer significant advantages as they require less space to store, allow quicker multiplication, and need fewer random bits to generate [8, 7, 5, 22, 47, 21]. A popular model for such matrices is to consider the Hadamard (entrywise) product of a dense random matrix and a sparse matrix of independent (up to symmetry) indicator variables with expectation . Much work has been done to transfer the results known for dense random matrices to the sparse setting [9, 10, 15, 39, 37, 42, 68, 55]. Although the results resemble their dense analogues, the sparsity brings about a variety of complications in the proofs. Only recently, the second author and Vu showed that for a large class of random variables and for with , a sparse random matrix has simple spectrum with probability at least [43], where this notation indicates that the implied constant depends on . This implies that the graph isomorphism problem restricted to this class of sparse random graphs is in complexity class P.
Our main contribution is to go beyond verifying such matrices have simple spectrum and prove a tail bound for the minimal eigenvalue gap of sparse random matrices with . In comparison with [43], our results represent an improvement in both error probability and the range of sparsity considered. As an application of our tail bound, we show that for sparse Erdős–Rényi graphs, weak and strong nodal domains are the same, answering a question of Dekel, Lee, and Linial [24]. Our results also expand the range of sparse graphs for which the graph isomorphism problem is known to be in P. Related to this last application is the graph matching problem, for which various algorithms contingent on simple spectrum are known [65, 44, 1]; our results similarly extend their range of applicability.
Acknowledgments. The authors thank the anonymous referees for their detailed comments, which substantially improved the paper.
2. Main Results
We begin with a formal definition of our random matrix model.
Definition 2.1.
We let denote a symmetric random matrix with entries
[TABLE]
where the are independent (for ), mean zero, variance one, and subgaussian with subgaussian moment , and the are independent (for ) Bernoulli random variables with .
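As a rough illustration of the model in Definition 2.1, the following sketch samples a symmetric matrix whose entries on and above the diagonal are products of a sign and a sparsity indicator. Several choices here are assumptions made for the example only: the dense entries are taken to be Rademacher signs (one instance of a mean zero, variance one, subgaussian variable), independence is imposed over index pairs with i <= j, and the function name `sample_sparse_symmetric` is hypothetical.

```python
import random

def sample_sparse_symmetric(n, p, seed=0):
    """Sample a symmetric n x n matrix with entries delta_ij * xi_ij.

    Assumptions for illustration: xi_ij is a Rademacher sign and
    delta_ij is Bernoulli(p), independent over pairs i <= j.
    """
    rng = random.Random(seed)
    a = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            xi = rng.choice((-1, 1))              # mean zero, variance one
            delta = 1 if rng.random() < p else 0  # sparsity indicator
            a[i][j] = a[j][i] = xi * delta        # enforce symmetry
    return a

A = sample_sparse_symmetric(n=8, p=0.4, seed=1)
```

Roughly a fraction p of the entries is nonzero, so smaller p gives a sparser matrix; at p = 0 every entry vanishes and the spectrum is maximally degenerate.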
Theorem 2.2.
Let be as in Definition 2.1, and fix . There exist constants , depending only on the subgaussian moment , such that for
[TABLE]
and
[TABLE]
the following holds for the gaps between the eigenvalues, . For any ,
[TABLE]
Observe that there is a trade-off in the strength of the error bound and the size of the eigenvalue gap, determined by the value of . For example, if we choose , we obtain the following result.
Corollary 2.3.
Let be as in Definition 2.1, and fix . There exist , such that for
[TABLE]
for . By a union bound,
[TABLE]
At the other extreme, setting and , we have the following result.
Corollary 2.4.
Let be as in Definition 2.1, and fix . For
[TABLE]
Observe that when , which is the dense case considered in [48], the above two corollaries recover [48, Corollary 2.2] and [48, Corollary 2.3], which are the analogous extreme cases of the bound in [48, Theorem 2.1].
Remark 2.5.
This result improves the range of sparsity in [43] from for some to . Even in the regime , our result improves on the bound in [43] where the probability of not having a simple spectrum was less than . However, we suspect that the optimal bound should be for some constant . The sparsity range of Theorem 2.2 is near optimal as yields multiple rows and columns entirely of zeros. This generates repeated eigenvalues at 0.
We also have the same result for adjacency matrices of random Erdős–Rényi graphs. Let denote the random graph on vertices with edges appearing independently and with probability .
Theorem 2.6.
Let be the adjacency matrix of the random Erdős–Rényi graph , and fix . There exist constants , depending only on the subgaussian moment , such that for
[TABLE]
and
[TABLE]
the following holds for the gaps between the eigenvalues, . For any ,
[TABLE]
Remark 2.7.
Note that an upper bound on is necessary in this case as generates a deterministic matrix with repeated eigenvalues. Additionally, our argument can be easily applied to random perturbations of a finite rank matrix; see Remark 6.2. However, for perturbations of an arbitrary matrix, new ideas are needed as many of the delicate net arguments cannot be adapted when the operator norm of the perturbed matrix is large. For dense random graphs, this was done in [48, Theorem 2.6].
2.1. Non-degeneration of Eigenvectors and Nodal Domains of a Random Graph
Consider the eigenfunctions of the Laplacian on a Riemannian manifold. The zero sets of these eigenfunctions partition the space into so-called nodal domains. These domains are of great interest to geometers and have been intensively studied (see [20, 46, 40] and the references therein). Here we consider a discrete analogue, the nodal domains of eigenvectors for adjacency matrices of random graphs, which has its roots in graph theory and has recently found uses in data science [35, 23, 24]. Given an eigenvector of an adjacency matrix , we call a subset of the vertices a weak nodal domain if it is connected, for , and is a maximal subset under these two conditions. A strong nodal domain is defined similarly using the strict inequality . Dekel, Lee, and Linial conjectured that the notions of strong and weak domains are equivalent for random graphs [24], and this was shown for with constant in [48]. A consequence of the following non-degeneration result is that we are able to resolve this conjecture for .
Theorem 2.8.
Let be the adjacency matrix of the random graph , and fix . For any , there exists a such that for
[TABLE]
the probability that there exists an eigenvector of with for some is at most .
Theorem 2.8 provides a quantitative lower bound on the mass of the eigenvector components, complementing the vast literature on eigenvector delocalization, which provides upper bounds (see [50, Section 4] and [13]).
Corollary 2.9.
For any , there exists such that with probability at least , the strong and weak nodal domains of are the same.
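To make the two notions of nodal domain concrete, here is a minimal sketch that computes both for a given graph and vector. It assumes the reading of the definitions in which a weak domain is a maximal connected vertex set on which the vector does not change sign (zeros allowed) and a strong domain is a maximal connected set on which it is strictly of one sign. The function names and the adjacency-list representation are hypothetical.

```python
from collections import deque

def components(adj, keep):
    """Connected components of the subgraph induced on the vertex set `keep`."""
    seen, comps = set(), []
    for s in sorted(keep):
        if s in seen:
            continue
        comp, queue = {s}, deque([s])
        seen.add(s)
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w in keep and w not in seen:
                    seen.add(w)
                    comp.add(w)
                    queue.append(w)
        comps.append(frozenset(comp))
    return comps

def strong_domains(adj, v):
    """Maximal connected sets on which v is strictly positive or strictly negative."""
    pos = {i for i, x in enumerate(v) if x > 0}
    neg = {i for i, x in enumerate(v) if x < 0}
    return set(components(adj, pos) + components(adj, neg))

def weak_domains(adj, v):
    """Maximal connected sets on which v is nonnegative or nonpositive."""
    nonneg = {i for i, x in enumerate(v) if x >= 0}
    nonpos = {i for i, x in enumerate(v) if x <= 0}
    cands = set(components(adj, nonneg) + components(adj, nonpos))
    # Discard candidates that sit strictly inside a larger candidate.
    return {c for c in cands if not any(c < d for d in cands)}

# Path on three vertices; (1, 0, -1) is an eigenvector with eigenvalue 0.
path = {0: [1], 1: [0, 2], 2: [1]}
print(strong_domains(path, [1.0, 0.0, -1.0]))
print(weak_domains(path, [1.0, 0.0, -1.0]))
```

On the path example the zero coordinate makes the two notions differ: the strong domains are the two end vertices, while each weak domain also absorbs the middle vertex. When no coordinate vanishes, the two functions return the same sets, which is the situation that the non-degeneration result guarantees with high probability.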
Arora and Bhaskara [4] showed that for random graphs with , where is a constant that may be determined explicitly, all non-first eigenvectors of the adjacency matrix of have exactly two weak nodal domains with high probability. (The authors give an exact value for this constant. However, the published version of an eigenvector delocalization estimate used to prove the result differs slightly from the version given in [4], where it is cited by the authors in pre-publication form. The value of the constant should be adjusted in light of this.) Recall that since the adjacency matrix is not centered, the eigenvector corresponding to the largest eigenvalue behaves differently, tending to align itself with the all-ones vector [45]. Combining this result with our previous corollary yields the following simple statement.
Corollary 2.10.
There exists such that the following holds. For any and , there exists such that with probability at least , each eigenvector of (except the first) has exactly two strong nodal domains which partition the vertices.
An identical non-degeneration result applies to matrices defined in Definition 2.1.
Theorem 2.11.
Fix . For any , there exists a such that for
[TABLE]
the probability that there exists an eigenvector of with for some is at most .
Remark 2.12.
Theorems 2.8 and 2.11 represent specific examples of a range of possible results. Specifically, varying in Theorem 2.2 can lead to trade-offs in the size of the entries and the strength of the probability bound. We have chosen to give a simple polynomial bound on the size and probability for the sake of simplifying the presentation.
We also remark that nodal domains were studied in the recent work [36], which showed that there exists a constant such that for the two nodal domains identified in [4] are balanced, meaning they each contain close to vertices with high probability. Further, [54] shows that, with high probability, any vertex is connected to some vertex in the other domain.
The remainder of the paper is organized as follows. In Section 3, we outline the key steps and intuition for the proof of Theorem 2.2. In Sections 4 and 5, we prove several preliminary results about eigenvectors of sparse random matrices. In Section 6.1, we provide the proof of Theorem 2.2. In Section 6.2 we provide the necessary modifications to extend Theorem 2.2 to non-centered random matrices, such as the adjacency matrices of Erdős–Rényi graphs, proving Theorem 2.6. Finally, in Section 6.3, we prove Theorem 2.8.
3. Proof Strategy
The proof follows the same broad outline as [43]. For as in Definition 2.1, we decompose the matrix as
[TABLE]
where . For a matrix , let be the eigenvalues of . Fix an integer such that and let (where and ) be the unit eigenvector associated to . By definition we have
[TABLE]
For the top coordinates this gives (writing for )
[TABLE]
Let be the eigenvector of corresponding to . Multiplying on the left by , we obtain
[TABLE]
By the Cauchy interlacing theorem, we have .
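The interlacing step can be checked by hand in the smallest case. The sketch below takes a 2 x 2 symmetric matrix, treats its top-left entry as the 1 x 1 minor, and verifies that the minor's single eigenvalue lies between the two eigenvalues of the full matrix. The helper names `eig2` and `interlaces` and the numerical tolerance are assumptions for this example only.

```python
import math

def eig2(a, b, d):
    """Eigenvalues (ascending) of the symmetric matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2
    rad = math.hypot((a - d) / 2, b)
    return mean - rad, mean + rad

def interlaces(a, b, d, tol=1e-12):
    """Check lambda_1 <= a <= lambda_2, where a is the eigenvalue of the 1 x 1 minor."""
    lo, hi = eig2(a, b, d)
    return lo - tol <= a <= hi + tol

a, x, d = 0.3, -1.2, 0.7          # full matrix [[a, x], [x, d]], minor [a]
lo, hi = eig2(a, x, d)
assert interlaces(a, x, d)
# The minor's eigenvalue is trapped inside the gap of the full matrix.
assert abs(a - lo) <= (hi - lo) + 1e-12
```

The second assertion is the point used in the argument: a small gap between consecutive eigenvalues of the full matrix forces an eigenvalue of the minor to sit close to an eigenvalue of the full matrix.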
Since the entries of are subgaussian, we have with high probability that
[TABLE]
for some constant that depends only on the subgaussian moment of the entries. Therefore, the average size of an eigenvalue gap is roughly For any , let denote the event that
[TABLE]
We also let be the intersection of the event with the event that the eigenvector with eigenvalue has . Therefore, by (3.2) and using , on the event , we have
[TABLE]
We wish to show this is unlikely.
Recall that the theory of small ball probability (e.g. [49]) examines the probability that a random variable takes values in a small interval. Therefore, we have reduced the problem to understanding the small ball probability of the inner product of a random vector with the eigenvector . It is known that this small ball probability is related to the amount of “disorder” in the coordinates of the eigenvector. Broadly speaking, a large amount of disorder implies the small ball probability is small. We deal with the case that has high disorder eigenvectors using these results. To exclude all eigenvectors with low disorder, we employ a covering argument, varying our approach according to the structure of the eigenvector.
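The link between disorder and small ball probability can be seen in a toy computation. The sketch below enumerates all sign patterns of a Rademacher vector X and computes the largest probability that the inner product of X with a fixed unit vector v lands in an interval of half-width eps. It is an illustration under simplifying assumptions (dense Rademacher entries, tiny dimension, brute-force enumeration) and not the estimate used in the proofs; the names `small_ball`, `flat`, and `spread` are hypothetical.

```python
from itertools import product
import math

def small_ball(v, eps):
    """sup over u of P(|<X, v> - u| <= eps), X uniform on {-1, 1}^n, by enumeration."""
    sums = sorted(sum(s * c for s, c in zip(signs, v))
                  for signs in product((-1, 1), repeat=len(v)))
    best, lo = 0, 0
    for hi in range(len(sums)):
        # Shrink the window until it fits in an interval of length 2 * eps.
        while sums[hi] - sums[lo] > 2 * eps:
            lo += 1
        best = max(best, hi - lo + 1)
    return best / len(sums)

n = 10
# Highly structured direction: all coordinates equal.
flat = [1 / math.sqrt(n)] * n
# Less structured direction: square roots of distinct primes, normalized.
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
norm = math.sqrt(sum(primes))
spread = [math.sqrt(q) / norm for q in primes]

print(small_ball(flat, 0.05), small_ball(spread, 0.05))
```

For the flat vector the inner product takes only eleven distinct values, so a single atom carries a large share of the mass. For the spread vector the possible values are scattered, and the same window captures a much smaller fraction of the sign patterns.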
The covering argument is completed in multiple stages. For a fixed , we consider acting on the unit sphere, where is the identity operator. Following the prescription initiated in a series of works [41, 64, 56, 53, 9, 10], we decompose the sphere into several sets that each offer their own advantages. Compressible vectors are those vectors that are close to -sparse vectors for some parameter . In [9], it was shown that the product of the matrix with a compressible vector has many large coordinates and therefore large norm. We adapt this argument to our symmetric matrix case to exclude compressible vectors. We next consider dominated vectors, which are those vectors whose coordinates outside the largest coordinates have a small ratio of norm to norm. This type of vector was introduced in [9]. As these vectors are also nearly sparse, they can be excluded similarly to the compressible vectors.
Finally, for vectors that are neither compressible nor dominated, we use a stratification according to a measure of structure, the LCD. The LCD was introduced in [56] and is defined later. As our random matrix is symmetric, there is dependence between the rows which prevents us from applying small ball probability estimates to each coordinate independently. (This obstacle is what prevents us from reaching the optimal threshold for by simply following the argument in [9], which considered non-symmetric matrices for .) To address this problem, for a fixed we partition the coordinates of into small subsets; this is similar to the method used in [66]. For a fixed subset, after conditioning on the columns of outside of the subset, we can extract more independent coordinates to use in small ball estimates. There is some flexibility in the size of these subsets, and this ultimately results in the trade-off between the error probability and gap size in Theorem 2.2.
The previous steps are done for a fixed and hold with exponentially high probability. Taking a union bound over a fine enough net of the interval completes the argument.
A similar approach was applied in [43], under the assumption that for some and therefore small polynomial terms could often be neglected. In our current setting, where is on the order of , it turns out that the above decomposition is insufficient primarily because the vectors that are not dominated or compressible can have a wide range of mass in their coordinates outside of the largest. Therefore, we further decompose the vectors by their mass in the relevant coordinates. Working in each of these classes allows some key technical estimates that bypass the small polynomial losses from [43]. These technical improvements generate the improvement in the range of sparsity and the error probability. Furthermore, in [43], the result was only concerned with a non-zero separation of the eigenvalues. A more careful accounting of the small ball probability greatly improves the (implicit) small ball estimate in [43].
4. Compressible and Dominated Vectors
The goal of this section is to prove Proposition 4.6, which shows that any eigenvector of cannot be close to a sparse vector, in a certain quantitative sense (with high probability). Before proceeding to its proof, we introduce a few necessary definitions and lemmas.
4.1. Decomposition of the sphere
We now formally define the decomposition of the unit sphere used in the proof sketch of Section 3.
Definition 4.1.
Fix . The set of -sparse vectors is given by
[TABLE]
Furthermore, for , we define the compressible and incompressible vectors by
[TABLE]
and
[TABLE]
For any , we let denote the set and denote the set .
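A small sketch may help fix the idea of compressibility. It assumes the common convention that a unit vector is compressible with parameters (m, rho) when it lies within Euclidean distance rho of some vector with at most m nonzero coordinates; the precise comparison set in Definition 4.1 may differ (for instance, it may consist of sparse unit vectors only). The function names are hypothetical.

```python
import math

def dist_to_sparse(x, m):
    """Euclidean distance from x to the set of vectors with at most m nonzero entries.

    The nearest such vector keeps the m largest coordinates of x in absolute
    value, so the distance is the norm of the remaining coordinates.
    """
    tail = sorted((abs(t) for t in x), reverse=True)[m:]
    return math.sqrt(sum(t * t for t in tail))

def is_compressible(x, m, rho):
    """True if x is within distance rho of an m-sparse vector (assumed convention)."""
    return dist_to_sparse(x, m) <= rho

print(dist_to_sparse([0.8, 0.6, 0.0, 0.0], 1))   # mass outside the largest coordinate
print(is_compressible([0.5, 0.5, 0.5, 0.5], 2, 0.5))
```

A vector concentrated on a few coordinates is close to sparse, while a vector with all coordinates of equal size keeps most of its mass outside any small set of coordinates.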
Definition 4.2.
For any , let be a permutation which arranges the absolute values of the coordinates of in non-increasing order. For denote by the vector with coordinates
[TABLE]
For any and , define the set of vectors with dominated tail by
[TABLE]
This definition was first given in [9]. Like compressible vectors, vectors with dominated tail are close to being sparse, though in a different way. This approximate sparsity facilitates the proof of the following key bound, Proposition 4.4.
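The rank-based restriction of a vector and the resulting tail ratio can be sketched as follows. Two assumptions are made here because the displayed formulas are not available in this text: the restricted vector keeps the coordinates whose rank in the non-increasing ordering of absolute values lies in the given range and zeroes the rest, and the ratio compared against a threshold is the Euclidean norm of the tail divided by its largest absolute coordinate. The function names are hypothetical.

```python
import math

def restrict_by_rank(x, m1, m2):
    """Keep coordinates whose rank (1 = largest |x_j|) lies in [m1, m2]; zero the rest."""
    order = sorted(range(len(x)), key=lambda j: -abs(x[j]))  # ties broken by index
    keep = set(order[m1 - 1:m2])
    return [x[j] if j in keep else 0.0 for j in range(len(x))]

def tail_ratio(x, m):
    """Ratio of the l2 norm to the max norm of x outside its m largest coordinates."""
    tail = restrict_by_rank(x, m + 1, len(x))
    l2 = math.sqrt(sum(t * t for t in tail))
    linf = max(abs(t) for t in tail)
    return l2 / linf if linf > 0 else 0.0

x = [0.1, -3.0, 2.0, 0.5]
print(restrict_by_rank(x, 2, 3))      # second and third largest coordinates only
print(tail_ratio([5.0, 1.0, 1.0, 1.0, 1.0], 1))
```

A tail made of k equal coordinates has ratio sqrt(k), the largest possible for k nonzero entries, while a tail dominated by one coordinate has ratio close to 1. Under the assumed convention, a small ratio is what marks a vector as having a dominated tail.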
4.2. Bounds for compressible and dominated vectors
We first state a high probability bound on the operator norm of , which was defined in Definition 2.1.
Lemma 4.3 ([43, Proposition 5.2] and [67, Proposition 1.10]).
For defined in Definition 2.1, there exist constants depending only on the subgaussian moment , such that for and ,
[TABLE]
For the remainder of this work, all references to the constant refer to the provided by Lemma 4.3.
The compressible and dominated vectors were previously resolved in [43] down to the optimal scale . Given some , we define the parameters
[TABLE]
Proposition 4.4 ([43, Proposition 5.3]).
There exist constants , depending only on the subgaussian moment of Definition 2.1, such that the following holds. If satisfy
[TABLE]
then with probability at least ,
[TABLE]
for all and .
Remark 4.5.
Note that if for some constant , then is bounded below by a constant. At the optimal scale , there exist constants such that
[TABLE]
We now come to the main result of this section, which combines the previous two results to exclude the possibility of compressible or dominated eigenvectors.
Proposition 4.6.
Let be as in Definition 2.1 with . For and ,
[TABLE]
for some constant .
Proof.
Let denote a -net of the interval with
[TABLE]
If there exists a compressible or dominated eigenvector with eigenvalue , then there exists a such that
[TABLE]
By a union bound and Proposition 4.4, the probability of this event is bounded by
[TABLE]
for large enough and small enough ; to bound , we used Remark 4.5. Finally, the probability that there exists an eigenvalue outside of the interval is bounded by , by Lemma 4.3. Shrinking allows us to take a union bound to include this event, and concludes the proof. ∎
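The net-and-union-bound pattern in the proof above can be sketched numerically. The code builds a finite set of points in an interval [-K, K] such that every point of the interval is within delta of some net point, and then multiplies the net size by a per-point failure probability. The values of K, delta, and the per-point probability are placeholders chosen for illustration, not the parameters of the proof, and the function name is hypothetical.

```python
import math

def interval_net(K, delta):
    """Points in [-K, K] such that every point of the interval is within delta of one."""
    count = max(1, math.ceil(K / delta))   # cells of width 2K / count <= 2 * delta
    width = 2 * K / count
    return [-K + (k + 0.5) * width for k in range(count)]   # cell midpoints

net = interval_net(K=3.0, delta=0.25)

# Union bound: if each fixed net point fails with probability at most q,
# then some net point fails with probability at most len(net) * q.
q = 1e-6                                   # placeholder per-point failure probability
print(len(net), len(net) * q)
```

The size of the net grows like K / delta, so the per-point bound must be strong enough to absorb this factor, which is why the bounds for a fixed point need to hold with very high probability.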
5. Incompressible Vectors
In this section, we show that does not have structured eigenvectors. We begin with Section 5.1, where we elucidate the connection between small ball probability and our measure of structure, the Least Common Denominator (LCD). Section 5.2 and Section 5.3 are devoted to the proof of Proposition 5.17, which shows it is unlikely an eigenvector of has an LCD lying in a given level set. This proposition is the main technical achievement of this section. Finally, we derive Proposition 5.18 as a straightforward consequence of Proposition 5.17 and a union bound, which excludes the possibility of structured eigenvectors altogether. Together with Proposition 4.6, Proposition 5.18 will allow us to complete the outline of Section 3 and prove our main theorems in the next section.
5.1. Small Ball Probability
Recall from the proof sketch in Section 3 that we wish to bound the probability that the inner product of an eigenvector and a random vector is small. This motivates the definition of Lévy concentration, which bounds the small ball probabilities of a random vector .
Definition 5.1.
The Lévy concentration of a random vector is defined to be
[TABLE]
When is a random vector and is a fixed vector, the structure of will greatly influence the Lévy concentration of the random variable . To formalize this concept, we begin with a measure of arithmetic structure for a unit vector.
Definition 5.2 ([66, Definition 6.1]).
Let be as in Theorem 2.2. We define the least common denominator (LCD) of as
[TABLE]
where is an appropriate constant that is defined in Remark 5.3 below.
Remark 5.3.
There exist constants such that for any ,
[TABLE]
where is a Bernoulli random variable such that and is a subgaussian random variable with unit variance. We fix such a in Definition 5.2.
Proposition 5.4 ([9, Proposition 4.2]).
Let be a random vector with i.i.d. coordinates of the form , where the ’s are Bernoulli random variables with and the ’s are random variables with unit variance and finite fourth moment. Then for any ,
[TABLE]
where depends only on the fourth moment of .
We may tensorize Proposition 5.4 to obtain a bound on the Lévy concentration of . The argument is almost identical to the proof of [9, Proposition 4.3], and we note only the necessary modifications here. Recall the notation from Definition 4.2. For any index set , we extend this notation to in the canonical way.
Proposition 5.5 (Small ball probabilities of via regularized LCD).
There exists a constant such that for any and index set of size ,
[TABLE]
Proof Sketch.
We first observe that conditioning on elements of never decreases (and may increase) . We therefore condition on all elements not in columns indexed by elements of , and also condition the elements whose indices satisfy . The remaining elements are i.i.d. and consist of rows. The remainder of the argument is nearly identical to the one leading to [9, Proposition 4.3], where an analogous statement was shown for non-symmetric matrices. ∎
The following result provides a lower bound for the LCD in terms of the norm.
Proposition 5.6 ([66, Lemma 6.2]).
For all ,
[TABLE]
As in [66], we define a regularized version of the LCD. However, our definition is slightly different from the one in [66]. Recall the notation given after Definition 4.1, and observe that the set in the following definition takes a distinguished role and is not included in the maximum. Here, represents a parameter that will be fixed later, in the material preceding (5.2).
Definition 5.7 (Regularized LCD).
Let be any partition of with elements. We define the regularized LCD of a vector as
[TABLE]
In our use of Definition 5.7 below, will be (approximately) the largest coordinates of . Hence gives a measure of the structure of the elements of left over after approximating by an -sparse vector.
5.2. Decomposition of Incompressible Vectors
In this section, we define a way to decompose incompressible vectors, which is used in the proof of Proposition 5.15 below. In order to give this decomposition, we first introduce a classification of the incompressible vectors, which allows us to control the amount of mass that is not in the largest coordinates.
Definition 5.8.
For and , define
[TABLE]
Remark 5.9.
By definition, for any , which gives rise to the condition in the preceding definition.
We will consider the sets of incompressible vectors for , where is a parameter that will be chosen later. For brevity, we introduce the shorthand
[TABLE]
For the remainder of this section we primarily use the fact that the vectors in are not dominated. That they are not compressible is used only in the proof of Proposition 5.17.
We begin with a straightforward upper bound. Recall was defined in Proposition 4.4. Fix and consider a vector . Since ,
[TABLE]
Furthermore, since by definition,
[TABLE]
On the other hand, we can also find a large set of coordinates that are uniformly lower-bounded.
Lemma 5.10.
For , the set
[TABLE]
satisfies .
Proof.
For the sake of contradiction, assume that . Then by (5.1),
[TABLE]
contradicting the definition of . ∎
We now define a partitioning procedure. For this, we introduce some new notation.
Definition 5.11.
For a set with , we use to denote all the elements from the -th to the -th in (inclusive), where we order the elements from least to greatest. For example, if then .
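The ordered-slice notation of Definition 5.11 amounts to sorting the index set and taking a contiguous, 1-indexed, inclusive range. A minimal sketch follows; the function name and the example set are hypothetical and are not the example from the definition.

```python
def ordered_slice(index_set, k1, k2):
    """Elements of index_set from the k1-th to the k2-th smallest, inclusive (1-indexed)."""
    return sorted(index_set)[k1 - 1:k2]

I = {3, 8, 1, 10, 5}
print(ordered_slice(I, 2, 4))   # second through fourth smallest elements of I
```

Consecutive slices of equal length give disjoint blocks of indices, which is how the notation is used to cut an index set into the pieces of a partition.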
Let be a vector, let be a parameter satisfying
[TABLE]
and set . We define as the largest number of disjoint subsets with elements one can have of whose union does not contain the indices of the largest elements of . We consider disjoint index sets , each of size , each not containing any indices of the largest elements of . Therefore,
[TABLE]
In our definition, the index sets depend on , but we suppress this dependence in the notation. For a vector , let denote the set of indices of the largest coordinates. By Lemma 5.10, we can choose a subset of size exactly , where was defined in the statement of that lemma. We observe that and are disjoint.
Let For , we define
[TABLE]
[TABLE]
For the rest of this work, we drop floor and ceiling functions because they do not influence the argument in a substantial way.
Finally, we define . In words, contains the largest coordinates and the smaller coordinates left over from divisibility issues. In particular, . Since the sets were chosen to be disjoint for , it follows that is a partition of .
The primary objective of this partition is recorded in the following lemma, where we also define the constants .
Lemma 5.12.
For and ,
[TABLE]
Also,
[TABLE]
Proof.
The bounds on follow from the coordinate-wise bounds of our construction. For the lower bound, we ignore all elements not in . We obtain
[TABLE]
The claim (5.4) then follows from Lemma 5.10, (5.1), and (5.2).
For the second claim, applying Proposition 5.6 and recalling Definition 5.7 yields
[TABLE]
Then the claim follows from the lower bound on in the previous paragraph and (5.1). ∎
5.3. Vectors with Small LCD
We now exclude vectors with small regularized LCD as potential eigenvectors of . This is the content of the next proposition, Proposition 5.15, which shows that any vector in with small regularized LCD is unlikely to be near an eigenvector. We first define level sets of vectors according to their regularized LCD.
Definition 5.13.
For any , we define the level sets
[TABLE]
We also require a preliminary lemma. Recall was defined in Remark 5.3.
Lemma 5.14 ([43, Lemma 6.13]).
Let , and let be a function such that
[TABLE]
Then for , the set of unit vectors
[TABLE]
admits a -net of size at most
[TABLE]
where is a universal constant and
[TABLE]
We now state and prove the main technical result of this section. Recall was defined in (5.13), was defined in Lemma 5.12, and is the constant given by Lemma 4.3.
Proposition 5.15.
Fix . There exist constants such that for , , , and for any
[TABLE]
and
[TABLE]
the following holds for :
[TABLE]
where
[TABLE]
Proof.
We set , and define
[TABLE]
In outline, this proof implements the following steps:
- (1) Construct a suitable net for .
- (2) Upper bound the size of .
- (3) Show the claim holds for all .
- (4) Extend the result from all to all .
For Step 1, let be a vector and consider the partition of the coordinates of constructed in (5.3) with the parameter . For the coordinates , by a standard volume estimate (see, for example, [52, (5.7)]), there exists a -net, , of the values such that
[TABLE]
where we recall .
For the coordinates in with , we use a construction that exploits the LCD structure. Observe that the hypothesis of Lemma 5.14 holds for because
[TABLE]
as shown in the proof of Lemma 5.12 (see (5.5)), and the lower bound tends to infinity as . For with , let denote the -net guaranteed by Lemma 5.14 applied to . (Observe that we are applying this lemma when the upper limit is , according to the definition of , not . The definition of is adjusted accordingly below.)
We next implement a net of scaling factors. Let be a -net of such that
[TABLE]
As observed earlier, the partition of the coordinates of is entirely determined by the sets of indices and . To approximate all , we define the preliminary set
[TABLE]
We currently have no guarantee that
[TABLE]
However, this is easily fixed. If there exists such that
[TABLE]
we replace by any such . Otherwise, we discard . This creates a new net such that . This completes Step 1.
We now enter Step 2 of the proof and upper bound the size of . We may combinatorially determine the size of using the sizes of the and . This leads to the following bound on the cardinality of our net:
[TABLE]
The combinatorial factors come from the choices of and in (5.7).
We now proceed to simplify this bound. From the elementary bound
[TABLE]
we have the following exponential bound for :
[TABLE]
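As a numerical sanity check, the following Python sketch verifies the standard binomial estimate $\binom{n}{k} \le (en/k)^k$ for all $1 \le k \le n$. That this is the elementary bound invoked above is an assumption, since the displayed formula is not reproduced in this version. The function name `binomial_bound_holds` is hypothetical and used only for illustration.

```python
import math


def binomial_bound_holds(n: int) -> bool:
    """Check C(n, k) <= (e * n / k)^k for every 1 <= k <= n.

    Assumption: this is the 'elementary bound' used to control the
    combinatorial factors. Logarithms are compared to avoid overflow.
    """
    for k in range(1, n + 1):
        # log C(n, k) via log-gamma
        log_binom = (
            math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        )
        # log of (e * n / k)^k
        log_bound = k * (1.0 + math.log(n / k))
        if log_binom > log_bound + 1e-9:
            return False
    return True
```

Comparing logarithms keeps the check stable for large $n$, where the binomial coefficients themselves would overflow floating-point arithmetic.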
For the second factor, we recall that , so that the product from to in (5.8) has at most individual terms. Using , , , and (from (5.2)), we find
[TABLE]
Recall that was defined in terms of in Lemma 5.12, and by Remark 4.5. Note also that . Then there exists such that
[TABLE]
From this, we find
[TABLE]
This completes Step 2.
We now begin Step 3 of the outline and prove the result for all the points in our net . Set
[TABLE]
By Proposition 5.5 applied with , for any and such that ,
[TABLE]
where we recall from Lemma 5.12 that . Since , by the definition of we find there exists such that . We use this in the above expression to find
[TABLE]
Straightforward computations show
[TABLE]
Recall that was defined as the minimum of the two upper bounds in (5.10), so
[TABLE]
Then
[TABLE]
Setting and applying a union bound over all elements , we obtain
[TABLE]
To bound from (5.12), we use (5.9) and divide into two cases. First, suppose . By (5.9), we have
[TABLE]
Combining this with (5.12) and absorbing the into the exponential yields
[TABLE]
so
[TABLE]
In the last line we used and (the latter is by direct calculation), so and the term inside the brackets tends to .
For the case , recalling the definition of and that gives
[TABLE]
[TABLE]
Now (5.11) shows that
[TABLE]
This, along with the stipulated range of , implies that
[TABLE]
Therefore, taking small enough in (5.15), we have
[TABLE]
This completes Step 3.
We now proceed to Step 4. Having shown the result for all the points in the net, we now extend to the entire level set . Again, we divide into cases.
We assume first that
[TABLE]
For any , let be the closest element of the net . Then, by the definition of ,
[TABLE]
In the third inequality, we used that there are terms in the sum, that the form a -net, and the upper bound on from (5.4). In the fourth inequality, we used from (5.2) and the inequality
[TABLE]
where is a constant that depends only on . The inequality (5.17) follows from the definition of and the hypothesized upper bound on . The last inequality follows by direct calculation using the value of given in (5.16) and the assumed lower bound on .
For the other case, suppose
[TABLE]
For any , let be the closest element of the net . Then, by the definition of ,
[TABLE]
In the third line, we used that there are terms in the sum, that the ’s form a -net, and the upper bound on from (5.4). The fourth line follows from the definition of . The fifth line is a result of the observation that is a decreasing function for large , , and . We also used the bound from (5.2). In the sixth line, we used the definition of in (5.18). For the last line, we used and took large enough.
Therefore, if , then using Lemma 4.3,
[TABLE]
with exponentially small error probability, which contradicts the conclusion of Step 3 above. After adjusting by a factor of , this completes the proof. ∎
Remark 5.16.
As noted in Remark 2.5, the optimal result should permit as small as . The restriction that in the above proof comes from the requirement that .
We now extend the previous result to all vectors with small LCD.
Proposition 5.17.
Fix . There exists a constant such that for , , and for any
[TABLE]
the following holds. The probability that there exists such that
[TABLE]
is at most for , where
[TABLE]
Proof.
We set and recall that by . We can decompose the relevant vectors as
[TABLE]
where we used . Recall by Remark 4.5. Similarly, the number of indices in the union is because each of and are . Therefore, taking a union bound, applying Proposition 5.15, and observing and for the defined in Proposition 5.15 yields the result. ∎
5.4. Eigenvector Bound
We now come to a key proposition used in the proof of the main theorem.
Proposition 5.18.
For as in Definition 2.1, there exists a constant such that for
[TABLE]
the probability that has an eigenvector v such that
[TABLE]
is at most , for .
Proof.
Consider a -net of , where was defined in Proposition 5.17. For an eigenvalue , there exists a point of the net such that for the corresponding eigenvector we have
[TABLE]
However, by a union bound and Proposition 5.17, the probability of this event is bounded by for some . By Lemma 4.3, decreasing the value of can account for the event that there exists an eigenvalue of outside the interval . This concludes the proof. ∎
6. Proofs of Main Results
6.1. Proof of Theorem 2.2
In preparation for the main proof, we record the following lemma from [43].
Lemma 6.1 ([43, Lemma 6.1]).
For any ,
[TABLE]
Proof of Theorem 2.2.
We repeat the decomposition described in Section 3. Let
[TABLE]
where . Let (where and ) be the unit eigenvector associated to . Because is an eigenvector with eigenvalue ,
[TABLE]
Considering the top coordinates gives
[TABLE]
Let be the eigenvector of corresponding to . After multiplying on the left by , we arrive at
[TABLE]
Since by the Cauchy–Schwarz inequality, this implies
[TABLE]
By the Cauchy interlacing law, we must have . For any , let denote the event that
[TABLE]
On , (6.3) implies
[TABLE]
Now note that the decomposition (6.1) can be done along any coordinate, not just the last. For any , let be the number of coordinates with absolute value at least , and let be a parameter. Therefore, repeating the argument leading to (6.5) with the coordinate chosen uniformly at random, and considering the probability that we choose a coordinate with absolute value at least , and obtains, we find
[TABLE]
Setting in Proposition 4.6 shows that any eigenvector will not be in with exponentially high probability. When , by Lemma 6.1, there are greater than coordinates whose absolute values are larger than . We set and in (6.7) to find
[TABLE]
With probability at least ,
[TABLE]
by Proposition 4.6 (applied with ) and Proposition 5.18. At this point, we would like to apply Proposition 5.4 to control the probability in (6.8). However, this proposition applies to the LCD , not the regularized LCD , so a slightly more delicate argument is required.
By the definition of regularized LCD, there exists some subset of coordinate indices such that
[TABLE]
To adjust for the regularized LCD, we observe that conditioning on a subset of can only increase the Lévy function for any . We condition on all the random variables in whose indices do not lie in the subset . Also, to apply Proposition 5.4, we need to normalize this subset to be on the unit sphere. Therefore, by Proposition 5.4,
[TABLE]
for all . By Lemma 5.12, . Therefore, putting (6.9) into (6.8), we find
[TABLE]
We set . Then the above holds for . Recall that . Thus, we obtain the theorem after lowering , which constrains the range of . ∎
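The deterministic part of this argument can be checked numerically. The sketch below assumes that the decomposition (6.1) has the standard block form in which the last row and column of the matrix are split off, with $X$ the last column without its diagonal entry, $b$ the last coordinate of the unit eigenvector for the $i$-th eigenvalue of the full matrix, and $u$ the $i$-th unit eigenvector of the minor. Under that assumption, Cauchy interlacing and the Cauchy–Schwarz step give $|b| \, |u^T X| \le \lambda_{i+1} - \lambda_i$. The sparse Gaussian model and the function names are illustrative choices, not the paper's Definition 2.1.

```python
import numpy as np


def sparse_symmetric(n, p, rng):
    """Symmetric matrix with entries (Bernoulli(p) mask) * (standard Gaussian).

    Illustrative model only; the paper's entry distribution may differ.
    """
    mask = rng.random((n, n)) < p
    vals = rng.standard_normal((n, n))
    upper = np.triu(mask * vals)          # upper triangle including diagonal
    return upper + np.triu(upper, 1).T    # mirror the strict upper triangle


def check_gap_decomposition(n=200, p=0.1, seed=0, tol=1e-9):
    """Return (interlacing holds, |b| |u^T X| <= gap holds for every i)."""
    rng = np.random.default_rng(seed)
    M = sparse_symmetric(n, p, rng)
    minor = M[:-1, :-1]                   # top-left (n-1) x (n-1) block
    X = M[:-1, -1]                        # last column without the corner entry

    lam, V = np.linalg.eigh(M)            # eigenpairs of the full matrix
    mu, U = np.linalg.eigh(minor)         # eigenpairs of the minor

    # Cauchy interlacing: lam[i] <= mu[i] <= lam[i + 1]
    interlaced = bool(np.all(lam[:-1] <= mu + tol) and np.all(mu <= lam[1:] + tol))

    b = np.abs(V[-1, :-1])                # last coordinate of each eigenvector
    proj = np.abs(U.T @ X)                # |u_i^T X| for each i
    gaps = np.diff(lam)                   # lam[i + 1] - lam[i]
    bound = bool(np.all(b * proj <= gaps + tol))
    return interlaced, bound
```

Both checks are deterministic consequences of linear algebra, so they hold for every realization. The probabilistic content of the proof lies in showing that $|b|$ and $|u^T X|$ are unlikely to be small simultaneously.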
6.2. Proof of Theorem 2.6
Let denote the Erdős–Rényi random graph on vertices with edge probability , and let denote the adjacency matrix of . In other words, is a symmetric matrix of Bernoulli variables with parameter , with all entries on the diagonal equal to zero. We have where is the matrix of all ones, so our main theorem does not apply. However, only small modifications are necessary to handle this case, which we detail in this section, following closely the analogous argument in [43, Section 8].
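For readers who want to experiment, here is a minimal Python sketch that samples such an adjacency matrix and computes its consecutive eigenvalue gaps. The function names and parameter values are illustrative assumptions. The only facts used are those stated above: the matrix is symmetric, has Bernoulli off-diagonal entries with parameter $p$, and has zero diagonal, so its entrywise mean is $p(J - I)$, with $J$ the all-ones matrix.

```python
import numpy as np


def erdos_renyi_adjacency(n, p, rng):
    """Adjacency matrix of G(n, p): symmetric, 0/1 entries, zero diagonal."""
    upper = np.triu(rng.random((n, n)) < p, k=1).astype(float)
    return upper + upper.T


def eigenvalue_gaps(A):
    """Consecutive gaps lambda_{i+1} - lambda_i of a symmetric matrix."""
    return np.diff(np.linalg.eigvalsh(A))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = erdos_renyi_adjacency(300, 0.05, rng)
    gaps = eigenvalue_gaps(A)
    print("smallest gap:", gaps.min())
```

Repeating the experiment over many samples gives an empirical picture of the lower tail of the smallest gap, which is the quantity the theorem controls.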
First, we observe that Proposition 4.4 can be adapted so that the proposition holds for in place of . This was proved in [43, Appendix B]. It follows that Proposition 4.6 also holds for (by repeating the proof of Proposition 4.6 using the analogue of Proposition 4.4 for ).
Next, we claim that Proposition 5.17 can be adapted to hold for the matrix in place of , with the additional restriction that we must suppose . The restriction is due to the fact that we will write the off-diagonal entries of this matrix as , where is Bernoulli with parameter and is Bernoulli with parameter (as in the definition of ). Our arguments for Proposition 5.17 revolved around Lévy concentration and nets. The use of Lévy concentration in Proposition 5.5 does not need to be modified for the random graph case, since it is invariant under changes in the mean of the matrix. (However, it does require the aforementioned decomposition , giving rise to the restriction.) For the nets, we required the operator norm bound of Lemma 4.3; we claim the analogue of this statement for also holds. A straightforward modification of the proof of [9, Theorem 1.7] shows
[TABLE]
for some . We obtain that Proposition 5.17 holds for , if .
Additionally, we need a slight generalization of Proposition 5.17, which lower bounds not just , but
[TABLE]
for any fixed vector . This generalization holds because the high probability lower bounds used to prove Proposition 5.17 come from Proposition 5.5, and the latter proposition concerns Lévy concentration, which is by definition translation invariant.
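The translation invariance used here can be seen concretely on a small example. The sketch below takes the usual definition of the Lévy concentration function, $\sup_u \mathbb{P}(|S - u| \le \varepsilon)$, and computes it exactly for a weighted sum $S = \sum_i x_i \xi_i$ of i.i.d. discrete variables by enumerating all outcomes. The entry distribution (values $-1, 0, 1$ with probabilities $0.15, 0.7, 0.15$), the weights, and the function names are assumptions chosen for illustration. The same code also illustrates the related fact that restricting the sum to a subset of coordinates can only increase the concentration function.

```python
import itertools

import numpy as np


def sum_distribution(x, values, probs):
    """Exact atoms and weights of S = sum_i x[i] * xi_i, xi_i i.i.d. discrete."""
    atoms, weights = [], []
    for combo in itertools.product(range(len(values)), repeat=len(x)):
        atoms.append(sum(xi * values[k] for xi, k in zip(x, combo)))
        weights.append(float(np.prod([probs[k] for k in combo])))
    return np.array(atoms), np.array(weights)


def levy_concentration(atoms, weights, eps):
    """sup_u P(|S - u| <= eps) for a discrete random variable S.

    The supremum is attained by a window [a, a + 2 * eps] whose left
    endpoint a is an atom, so it suffices to scan over atoms.
    """
    order = np.argsort(atoms)
    a, w = atoms[order], weights[order]
    cum = np.concatenate([[0.0], np.cumsum(w)])
    # index one past the last atom inside each window (small float tolerance)
    right = np.searchsorted(a, a + 2 * eps + 1e-12, side="right")
    return float(np.max(cum[right] - cum[:-1]))


if __name__ == "__main__":
    x = [0.5, 0.3, 0.2, 0.7, 0.1, 0.4]
    values, probs = (-1.0, 0.0, 1.0), (0.15, 0.7, 0.15)
    atoms, w = sum_distribution(x, values, probs)
    print(levy_concentration(atoms, w, 0.125))
    print(levy_concentration(atoms + 3.7, w, 0.125))  # same value after a shift
```

Because the supremum ranges over all centers $u$, adding a fixed constant to $S$ only relabels the centers, which is why the two printed values agree.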
We now turn to the proof of Theorem 2.6.
Proof of Theorem 2.6.
Above, we established that the analogue of Proposition 5.17 holds for , if . This restriction motivates the following division into cases.
Case I: . Our preliminary goal is to establish that Proposition 5.18 holds for . We have
[TABLE]
where is the vector of all ones. Set . Let be a -net of such that
[TABLE]
For , the reverse triangle inequality yields
[TABLE]
so any with can be well approximated by for some .
Define
[TABLE]
By (6.15), a union bound over the net , and the analogue of Proposition 5.17 for (6.12) stated above, we obtain
[TABLE]
for any single . After observing that
[TABLE]
we find
[TABLE]
Using (6.18) in place of Proposition 5.17 in the proof of Proposition 5.18, we find that Proposition 5.18 holds for in place of .
We can now repeat the proof of Theorem 2.2 to prove the theorem in this case, with the appropriate analogues for substituting for Proposition 5.18 and Proposition 4.6. (The latter was noted at the beginning of Section 6.2.)
Case II: . Observe that the adjacency matrix of is equal in distribution to . Hence controlling
[TABLE]
is equivalent to controlling
[TABLE]
This reduces the problem to Case I and completes the proof. ∎
Remark 6.2.
The size of the one-dimensional net in (6.14) is compensated by the error probability used for the union bound in (6.16). For general finite-rank perturbations by a finite linear combination of matrices of the form for , one simply adds more one-dimensional nets and completes the argument in the same way. However, for perturbations whose rank grows even moderately quickly, the combined size of the necessary supplemental nets becomes too large.
6.3. Proof of Theorem 2.8
The following is essentially Lemma 9.1 of [48]. We provide the proof for completeness.
Lemma 6.3.
For any there exists such that the following holds with probability at least . If there exist and such that , then has an eigenvector and corresponding eigenvalue such that
[TABLE]
Proof.
From our main result, Theorem 2.6, we may suppose that all eigenvalue gaps satisfy . Let express as a linear combination of unit eigenvectors of . There must exist such that . So
[TABLE]
implies, assuming , that . This implies the first conclusion. Then because all gaps satisfy we have that for all . But then we must have for , implying the second conclusion. ∎
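The first step of this proof rests on an elementary identity that is easy to check numerically. If $v = \sum_j c_j u_j$ is a unit vector expanded in an orthonormal eigenbasis of a symmetric matrix $A$, then $\|(A - \lambda) v\|^2 = \sum_j c_j^2 (\lambda_j - \lambda)^2$. Some coefficient satisfies $|c_i| \ge n^{-1/2}$, and for that index $|\lambda_i - \lambda| \le \sqrt{n} \, \|(A - \lambda) v\|$. The Python sketch below verifies these statements on a generic symmetric matrix. The function name and the test matrix are illustrative assumptions, and the specific thresholds of Lemma 6.3 are not reproduced.

```python
import numpy as np


def spectral_expansion_check(A, lam, v):
    """Compare the residual ||(A - lam) v|| with its eigenbasis expansion.

    A is assumed symmetric and v is assumed to have unit norm.
    """
    evals, evecs = np.linalg.eigh(A)
    c = evecs.T @ v                                   # coefficients of v
    residual = np.linalg.norm(A @ v - lam * v)
    expansion = np.sqrt(np.sum(c**2 * (evals - lam) ** 2))
    i = int(np.argmax(np.abs(c)))                     # largest coefficient
    return {
        "residual": float(residual),
        "expansion": float(expansion),
        "coeff": float(abs(c[i])),
        "eigen_distance": float(abs(evals[i] - lam)),
    }


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 80
    G = rng.standard_normal((n, n))
    A = (G + G.T) / 2
    evals, evecs = np.linalg.eigh(A)
    v = evecs[:, 10] + 1e-3 * rng.standard_normal(n)  # perturbed eigenvector
    v /= np.linalg.norm(v)
    print(spectral_expansion_check(A, evals[10] + 1e-4, v))
```

The second conclusion of the lemma additionally uses the lower bound on all eigenvalue gaps, which forces the remaining coefficients $c_j$, $j \ne i$, to be small.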
Proof of Theorem 2.8.
We follow the proof of Theorem 3.3 in [48]. After adjusting by adding , it suffices to prove the claim for a single coordinate and use a union bound. Write and let its first column be where is a vector of coordinates. Let be an eigenvector with eigenvalue so that
[TABLE]
Suppose that where will be chosen later. By taking large enough, using that the entries of are bounded, and adding mass to the first component of to make it unit norm, it suffices to show that
[TABLE]
occur jointly with low probability. By Lemma 6.3, if the first condition holds then there exists an eigenvector of with . Then implies . We claim this contradicts a statement established in the proof of Theorem 2.2.
In (6.9) and the following lines, we showed
[TABLE]
where was defined below (6.10) (in terms of ). Now we take , and , which proves the theorem after taking large enough. ∎
References
- [1] Yonathan Aflalo, Alex Bronstein, and Ron Kimmel. Graph matching: relax or not? arXiv preprint arXiv:1401.7623, 2014.
- [2] Amol Aggarwal. Bulk universality for generalized Wigner matrices with few moments. Probability Theory and Related Fields, 173(1-2):375–432, 2019.
- [3] Amol Aggarwal, Patrick Lopatto, and Horng-Tzer Yau. GOE statistics for Lévy matrices. arXiv preprint arXiv:1806.07363, 2018.
- [4] Sanjeev Arora and Aditya Bhaskara. Eigenvectors of random graphs: delocalization and nodal domains. https://theory.epfl.ch/bhaskara/files/deloc.pdf, 2011.
- [5] Enrico Au-Yeung. Sparse signal recovery using a new class of random matrices. Adv. Pure Appl. Math., 8(2):79–89, 2017.
- [6] László Babai, D. Yu. Grigoryev, and David Mount. Isomorphism of graphs with bounded eigenvalue multiplicity. In Proceedings of the fourteenth annual ACM symposium on Theory of computing, pages 310–324. ACM, 1982.
- [7] Bubacarr Bah and Jared Tanner. On construction and analysis of sparse random matrices and expander graphs with applications to compressed sensing. arXiv preprint arXiv:1307.6477, 2013.
- [8] Grey Ballard, Aydin Buluc, James Demmel, Laura Grigori, Benjamin Lipshitz, Oded Schwartz, and Sivan Toledo. Communication optimal parallel multiplication of sparse random matrices. In Proceedings of the twenty-fifth annual ACM symposium on Parallelism in algorithms and architectures, pages 222–231. ACM, 2013.
