Kesten-McKay law for random subensembles of Paley equiangular tight frames
Mark Magsino, Dustin G. Mixon, Hans Parshall

TL;DR
This paper proves a conjecture about the singular value distribution of random subensembles of Paley equiangular tight frames using the method of moments, with potential for broader applications.
Contribution
It introduces a novel proof of a conjecture on singular value distribution for Paley equiangular tight frames and extends the analysis to more general real equiangular tight frames.
Findings
Confirmed the conjecture for Paley equiangular tight frames
Extended analysis to real equiangular tight frames of redundancy 2
Suggests potential for broader generalizations in frame theory
Abstract
We apply the method of moments to prove a recent conjecture of Haikin, Zamir and Gavish (2017) concerning the distribution of the singular values of random subensembles of Paley equiangular tight frames. Our analysis applies more generally to real equiangular tight frames of redundancy 2, and we suspect similar ideas will eventually produce more general results for arbitrary choices of redundancy.
1 Introduction
Frame theory concerns redundant representation in a Hilbert space. A frame [13] is a sequence $\{x_i\}_{i\in I}$ in a Hilbert space $H$ for which there exist $A,B>0$ such that
\[ A\|x\|^2 \le \sum_{i\in I}|\langle x,x_i\rangle|^2 \le B\|x\|^2 \]
for every $x\in H$. If every $x_i$ has unit norm, then we say the frame is unit norm, and if $A=B$, we say the frame is tight [9]. In the special case where $H=\mathbb{R}^d$, a frame is simply a spanning set, but unit norm tight frames are still interesting and useful [4, 21]. For example, equiangular tight frames are unit norm tight frames with the additional property that $|\langle x_i,x_j\rangle|$ is constant over the choice of pair $\{i,j\}$ with $i\neq j$. Equiangular tight frames are important because they necessarily span optimally packed lines, which in turn find applications in multiple description coding [25], digital fingerprinting [22], compressed sensing [1], and quantum state tomography [24]; see [14] for a survey.
Various applications demand control over the singular values of subensembles of frames. In quantum physics, Weaver’s conjecture [28] (equivalent to the Kadison–Singer problem [18, 8], and recently resolved in [19]) concerns the existence of subensembles of unit norm tight frames with appropriately small spectral norm. Compressed sensing [7, 11] has spurred the pursuit of explicit frames with the property that every subensemble is well conditioned [10, 5, 1]. Motivated by applications in erasure-robust analog coding, Haikin, Zamir and Gavish [16, 15] recently launched a new line of inquiry: identify frames for which the singular values of random subensembles exhibit a predictable distribution. (One might consider this to be a more detailed analogue of Tropp’s estimates on the conditioning of random subensembles [27].) Of particular interest are random subensembles of equiangular tight frames, and in this paper, we consider equiangular tight frames consisting of $2d$ vectors in $\mathbb{R}^d$, which correspond to symmetric conference matrices. (Note that such frames have already received some attention in the context of compressed sensing [1, 2].)
An $n\times n$ matrix $S$ is said to be a conference matrix if
- (i)
$S_{ii}=0$ for every $i\in[n]$,
- (ii)
$S_{ij}\in\{\pm1\}$ for every $i,j\in[n]$ with $i\neq j$, and
- (iii)
$S^\top S=(n-1)I$.
A symmetric conference matrix of order $n$ exists whenever $n-1$ is a prime power congruent to $1$ modulo $4$ (by a Paley-based construction), and only if $n\equiv2\pmod 4$ and $n-1$ is a sum of two squares [17]. Explicitly, the Paley conference matrices are obtained by building a circulant matrix from the Legendre symbol and then padding with ones, for example:
[TABLE]
where “$\pm$” denotes $\pm1$. One may verify that the above example satisfies $S^\top S=(n-1)I$. For every symmetric conference matrix $S$ of order $n$, it holds that $I+\frac{1}{\sqrt{n-1}}S$ is the Gram matrix of an equiangular tight frame consisting of $n$ vectors in $\mathbb{R}^{n/2}$ [25]. In particular, the equiangular tight frames that arise from the Paley conference matrices are known as Paley equiangular tight frames. In what follows, we consider random principal submatrices of symmetric conference matrices with the understanding that they may be identified with the Gram matrix of a random subensemble of the corresponding equiangular tight frame.
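The construction described above can be sketched in a few lines. This is a minimal illustration, assuming a prime $q\equiv1\pmod 4$ (the function names `legendre`, `paley_conference_matrix` and `gram_of_square` are ours, and primality of `q` is not checked); it builds the circulant core from the Legendre symbol, pads with ones, and leaves the conference-matrix identities to be checked by integer arithmetic.

```python
def legendre(a, q):
    """Legendre symbol of a modulo an odd prime q, with values in {-1, 0, 1}."""
    a %= q
    if a == 0:
        return 0
    # Euler's criterion: a is a square mod q iff a^((q-1)/2) = 1 mod q.
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def paley_conference_matrix(q):
    """Symmetric conference matrix of order q + 1 for a prime q = 1 (mod 4).

    Assumption: q is prime (not verified here).
    """
    if q % 4 != 1:
        raise ValueError("need q = 1 (mod 4) for a symmetric matrix")
    # Circulant core: entry (i, j) is the Legendre symbol of j - i.
    core = [[legendre(j - i, q) for j in range(q)] for i in range(q)]
    # Pad with a first row and first column of ones and a zero corner.
    top = [0] + [1] * q
    return [top] + [[1] + row for row in core]


def gram_of_square(S):
    """Return S^T S as a list of lists, using exact integer arithmetic."""
    n = len(S)
    return [[sum(S[k][i] * S[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


S = paley_conference_matrix(5)   # order n = 6
n = len(S)
G = gram_of_square(S)            # expected to equal (n - 1) * identity
```

Since $S^2=(n-1)I$ for symmetric $S$, the matrix $I+\frac{1}{\sqrt{n-1}}S$ squares to twice itself, which is one way to see that the associated frame is tight with redundancy 2.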
Given an $n\times n$ symmetric matrix $Z$ with eigenvalues $\lambda_1\le\cdots\le\lambda_n$, we let $\mu_Z$ denote the uniform probability measure over the spectrum of $Z$ (counted with multiplicity):
\[ \mu_Z := \frac{1}{n}\sum_{i=1}^{n}\delta_{\lambda_i}. \]
This is known as the empirical spectral distribution of $Z$. If $Z$ is a random matrix, then its empirical spectral distribution is a random measure. We say a sequence $\{\zeta_i\}_{i=1}^{\infty}$ of random measures converges almost surely to a non-random absolutely continuous measure $\zeta$ if for every $a,b\in\mathbb{R}$ with $a<b$, it holds that the random variable $\zeta_i([a,b])$ converges to $\zeta([a,b])$ almost surely.
We are interested in random matrices of a particular form. Let $K$ denote a random subset of $[n]$ such that the events $\{i\in K\}$, $i\in[n]$, are independent, each with probability $p$. Then for any fixed $n\times n$ matrix $A$, we write $A_K$ to denote the (random) principal submatrix of $A$ with rows and columns indexed by $K$. Following [12], we define the Kesten–McKay distribution with parameter $v\ge2$ by
\[ d\mu_{\mathrm{KM}(v)}(x) := \frac{v\sqrt{4(v-1)-x^2}}{2\pi(v^2-x^2)}\,\mathbf{1}_{[-2\sqrt{v-1},\,2\sqrt{v-1}]}(x)\,dx. \]
Recall that a lacunary sequence is a set $L=\{n_1<n_2<\cdots\}$ of natural numbers for which there exists $\lambda>1$ such that $n_{i+1}\ge\lambda n_i$ for every $i$. We are now ready to state our main result, which corresponds to one of many conjectures posed in [16]; see Figure 1 for an illustration.
Theorem 1.
Fix $p\in(0,\frac{1}{2})$, take any lacunary sequence $L$ for which there exists a sequence $\{S_n\}_{n\in L}$ of symmetric conference matrices of increasing size $n$, and consider the corresponding random matrices $X_n:=(S_n)_K$. Then the empirical spectral distribution of $\frac{1}{p\sqrt{n}}X_n$ converges almost surely to the Kesten–McKay distribution with parameter $v=1/p$.
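As a numerical sanity check on the limiting law, the following sketch integrates the standard Kesten–McKay density $\frac{v\sqrt{4(v-1)-x^2}}{2\pi(v^2-x^2)}$ on $[-2\sqrt{v-1},2\sqrt{v-1}]$ with a plain midpoint rule. The function names, the choice $v=4$, and the number of quadrature steps are illustrative assumptions; the point is only that the total mass is close to $1$, the second moment close to $v$, and the fourth moment close to $2v^2-v$.

```python
import math


def kesten_mckay_density(x, v):
    """Kesten-McKay density with parameter v > 2, evaluated at x."""
    edge_sq = 4.0 * (v - 1.0)
    if x * x >= edge_sq:
        return 0.0  # outside the support
    return v * math.sqrt(edge_sq - x * x) / (2.0 * math.pi * (v * v - x * x))


def kesten_mckay_moment(k, v, steps=20000):
    """Approximate the k-th moment by the midpoint rule on the support."""
    edge = 2.0 * math.sqrt(v - 1.0)
    h = 2.0 * edge / steps
    total = 0.0
    for i in range(steps):
        x = -edge + (i + 0.5) * h  # midpoint of the i-th subinterval
        total += (x ** k) * kesten_mckay_density(x, v)
    return total * h


v = 4.0  # illustrative parameter
mass = kesten_mckay_moment(0, v)
second = kesten_mckay_moment(2, v)
fourth = kesten_mckay_moment(4, v)
```

The midpoint rule is used here only because it needs no external library; for $v$ close to $2$ the density becomes steep near the edges and a finer grid or a change of variables would be advisable.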
In the next section, we prove this theorem using the method of moments, saving the more technical portions for Section 3.
1.1 Notation
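Given $x\in\mathbb{R}^n$, let $\operatorname{diag}(x)$ denote the diagonal matrix whose diagonal entries are the entries of $x$. Given $A\in\mathbb{R}^{m\times n}$, let $\|A\|_{2\to2}$ denote the induced $2$-norm of $A$ (i.e., the largest singular value of $A$), and let $\|A\|_{S^q}$ denote the Schatten $q$-norm of $A$ (i.e., the $q$-norm of the singular values of $A$). Throughout this paper, we will investigate how quantities relate as $n\to\infty$. For example, suppose we are interested in a quantity $f(n,\alpha)$ that depends on both $n$ and some additional parameters $\alpha$. Then we write $f(n,\alpha)=o_\alpha(1)$ if for every $\alpha$, it holds that $f(n,\alpha)\to0$ as $n\to\infty$. We write $f(n,\alpha)\lesssim g(n,\alpha)$ if there exists $C>0$ such that $f(n,\alpha)\le Cg(n,\alpha)$ for all $n$ and $\alpha$, and we write $f(n,\alpha)\lesssim_\alpha g(n,\alpha)$ if for every $\alpha$, there exists $C(\alpha)>0$ such that $f(n,\alpha)\le C(\alpha)g(n,\alpha)$ for all $n$. Finally, we write $f\asymp_\alpha g$ if both $f\lesssim_\alpha g$ and $g\lesssim_\alpha f$.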
2 Proof of the main result
Our proof makes use of a standard sufficient condition for the almost sure convergence of random measures, which is a consequence of the moment continuity theorem, the Borel–Cantelli lemma, and Chebyshev’s inequality, cf. Exercise 2.4.6 in [26]:
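Proposition 2.
Let $\{\zeta_i\}_{i=1}^{\infty}$ be a sequence of uniformly subgaussian random probability measures, and let $\zeta$ be a non-random subgaussian probability measure. Suppose that for every $k\in\mathbb{N}$, it holds that
- (i)
$\displaystyle\lim_{i\to\infty}\mathbb{E}\int_{\mathbb{R}}x^{k}\,d\zeta_{i}(x)=\int_{\mathbb{R}}x^{k}\,d\zeta(x)$, and
- (ii)
$\displaystyle\sum_{i=1}^{\infty}\operatorname{Var}\bigg(\int_{\mathbb{R}}x^{k}\,d\zeta_{i}(x)\bigg)<\infty$.
Then $\zeta_i$ converges almost surely to $\zeta$.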
As we will see, verifying hypothesis (i) in our case reduces to a combinatorics problem, whereas hypothesis (ii) can be treated separately with the help of Talagrand concentration:
Proposition 3 (Talagrand concentration, Theorem 2.1.13 in [26]).
There exists a universal constant for which the following holds: Suppose is both convex and -Lipschitz in , and let be a random vector in with independent coordinates satisfying almost surely. Then for every , it holds that
[TABLE]
Throughout, denotes an symmetric conference matrix, we draw and put . We typically suppress the subscript . While the size of is random, its average size is , and so we use as a proxy for . As one might expect, this is a good approximation:
Lemma 4.
Put and . Then
[TABLE]
Proof.
Since is a submatrix of , it holds that
[TABLE]
almost surely. Similarly, almost surely. Next, let denote the (random) size of . Then , and so our bound on gives
[TABLE]
where the last step applies the fact that has binomial distribution. This immediately implies the desired bound on . Finally, since almost surely, we have
[TABLE]
which completes the result. ∎
As such, to demonstrate hypothesis (i) from Proposition 2 in our case, it suffices to prove
[TABLE]
The Kesten–McKay moments are implicitly computed in [20], and are naturally expressed in terms of entries of Catalan’s triangle:
[TABLE]
Proposition 5 (Lemma 2.1 in [20]).
For every and , it holds that
[TABLE]
Recalling that , then Proposition 5 gives that (1) is equivalent to
[TABLE]
where denotes an entry of Borel’s triangle:
[TABLE]
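The passage from Catalan's triangle to Borel's triangle can be checked numerically. The sketch below assumes the standard definitions (which may be indexed differently from the displays in this paper): Catalan's triangle $T(n,s)=\frac{n-s+1}{n+1}\binom{n+s}{s}$, Borel's triangle $B(n,t)=\sum_{s\ge t}\binom{s}{t}T(n,s)$, and McKay's expression $\sum_{j=1}^{k}T(k-1,k-j)\,v^{j}(v-1)^{k-j}$ for the $2k$-th Kesten–McKay moment. Expanding $(v-1)^{k-j}$ by the binomial theorem turns the latter into an alternating polynomial in $v$ whose coefficients are entries of Borel's triangle. All function names are ours.

```python
from fractions import Fraction
from math import comb


def catalan_triangle(n, s):
    """Entry (n, s) of Catalan's triangle, for 0 <= s <= n."""
    return (n - s + 1) * comb(n + s, s) // (n + 1)


def borel_triangle(n, t):
    """Entry (n, t) of Borel's triangle: binomial transform of Catalan's."""
    return sum(comb(s, t) * catalan_triangle(n, s) for s in range(t, n + 1))


def moment_via_catalan(v, k):
    """2k-th Kesten-McKay moment as a sum over Catalan's triangle."""
    return sum(catalan_triangle(k - 1, k - j) * v ** j * (v - 1) ** (k - j)
               for j in range(1, k + 1))


def moment_via_borel(v, k):
    """Same moment as an alternating polynomial in v with Borel coefficients."""
    return sum((-1) ** t * borel_triangle(k - 1, t) * v ** (k - t)
               for t in range(k))


# Exact comparison at a rational parameter, e.g. v = 7/2.
v = Fraction(7, 2)
pairs = [(moment_via_catalan(v, k), moment_via_borel(v, k)) for k in range(1, 7)]
```

For instance, the rows $(1)$, $(2,1)$, $(5,6,2)$ of Borel's triangle give the moments $v$, $2v^2-v$, and $5v^3-6v^2+2v$.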
To compute these limits, we first find a convenient expression for . To this end, recall that is the submatrix of with index set , and let denote the random diagonal matrix such that . Then
[TABLE]
Considering , it follows that
[TABLE]
It remains to show that these coefficients converge to the corresponding coefficients in (2).
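For $k=2$ the expectation can be evaluated exactly, which gives a quick check on the normalization $\frac{1}{pn}\operatorname{tr}\big((\frac{1}{p\sqrt{n}}X)^{k}\big)$ that appears in Lemma 11. The sketch below is an illustration under our own naming: it enumerates every subset $K$ of the index set of a small Paley conference matrix, weights it by $p^{|K|}(1-p)^{n-|K|}$, and computes $\mathbb{E}\operatorname{tr}(X^2)$ with exact rational arithmetic. Since every off-diagonal entry is $\pm1$, one expects $\mathbb{E}\operatorname{tr}(X^2)=p^2n(n-1)$, so the normalized second moment is $\frac{n-1}{pn}$, which tends to $1/p$.

```python
from fractions import Fraction
from itertools import combinations


def legendre(a, q):
    """Legendre symbol of a modulo an odd prime q."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1


def paley_conference_matrix(q):
    """Symmetric conference matrix of order q + 1 (q prime, q = 1 mod 4)."""
    core = [[legendre(j - i, q) for j in range(q)] for i in range(q)]
    return [[0] + [1] * q] + [[1] + row for row in core]


def trace_of_square(S, K):
    """tr(X^2) for the principal submatrix X of S indexed by K."""
    return sum(S[i][j] * S[j][i] for i in K for j in K)


def expected_trace_of_square(S, p):
    """Exact E tr(X^2) when each index joins K independently with prob. p."""
    n = len(S)
    total = Fraction(0)
    for size in range(n + 1):
        weight = p ** size * (1 - p) ** (n - size)
        for K in combinations(range(n), size):
            total += weight * trace_of_square(S, K)
    return total


S = paley_conference_matrix(5)   # n = 6, so 2^6 subsets are enumerated
n = len(S)
p = Fraction(1, 4)               # illustrative choice
second_moment = expected_trace_of_square(S, p)
normalized = second_moment / (p * n) / (p * p * n)
```

The brute-force enumeration is exponential in $n$ and is meant only for very small matrices; higher moments depend on the sign pattern of $S$ and are the subject of the combinatorial analysis that follows.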
First, we introduce some additional notation. Taking inspiration from Bargmann invariants [3], it is convenient to write
[TABLE]
Next, we say is a partition of into blocks if such that , and we let denote the set of all such partitions. For each partition of , we consider the set of functions whose level sets are the blocks of , namely
[TABLE]
With this, we define
[TABLE]
Considering (3), it therefore holds that
[TABLE]
As such, to demonstrate (2), it suffices to determine the limit of for every partition of . We start with a quick calculation:
Lemma 6.
For every with , it holds that .
Proof.
Estimate using the triangle inequality to obtain a sum of terms, each of size at most . ∎
For each , this establishes that the coefficient of in (4) approaches zero, i.e., the corresponding coefficient in (2). Now we wish to tackle the limiting value of in general. In light of the related literature [23], it comes as no surprise that depends on whether is a so-called crossing partition. We say a partition of is crossing if there exist with for which there exist and such that . Otherwise, is said to be non-crossing. Next, for each , we let denote the unique member of such that . Consider the graph with vertex set and edges given by for every ; here, we interpret modulo so that . Let denote the set of non-crossing for which the edges of partition into simple even cycles. Finally, let denote the th Catalan number. With these notions, we can describe the limit of each :
Lemma 7 (Key combinatorial lemma).
- (i)
Suppose . Then .
- (ii)
Suppose and the edges of partition into simple cycles of sizes . Then and
[TABLE]
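The combinatorial objects entering Lemma 7 are easy to enumerate for small $k$. The following sketch uses our own names and represents a partition of $[k]$ as a list of blocks; it tests the crossing condition ($a_1<b_1<a_2<b_2$ with $a_1,a_2$ in one block and $b_1,b_2$ in another) and lists the edges $\{\pi(i),\pi(i+1)\}$ of the graph, with $k+1$ read as $1$. As a consistency check, the number of non-crossing partitions of $[k]$ should be the $k$th Catalan number.

```python
from itertools import combinations


def set_partitions(elements):
    """Yield every partition of a list as a list of blocks (lists)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        # Put `first` into one of the existing blocks ...
        for idx in range(len(partition)):
            yield (partition[:idx] + [[first] + partition[idx]]
                   + partition[idx + 1:])
        # ... or into a new block of its own.
        yield [[first]] + partition


def is_crossing(partition):
    """True if distinct blocks A, B contain a1 < b1 < a2 < b2."""
    for A, B in combinations(partition, 2):
        for X, Y in ((A, B), (B, A)):
            for a1, a2 in combinations(sorted(X), 2):
                inside = [b for b in Y if a1 < b < a2]
                beyond = [b for b in Y if b > a2]
                if inside and beyond:
                    return True
    return False


def graph_edges(partition, k):
    """Edges (pi(i), pi(i+1)) for i in [k], blocks labelled by list index."""
    block_of = {i: idx for idx, block in enumerate(partition) for i in block}
    return [(block_of[i], block_of[i % k + 1]) for i in range(1, k + 1)]


partitions_of_4 = list(set_partitions([1, 2, 3, 4]))
noncrossing_of_4 = [P for P in partitions_of_4 if not is_crossing(P)]
```

For $k=4$ the only crossing partition is $\{\{1,3\},\{2,4\}\}$, so 14 of the 15 partitions are non-crossing.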
The proof of Lemma 7 is rather technical (involving multiple rounds of induction), and so we save it for Section 3. In the meantime, we demonstrate how Lemma 7 can be applied to prove that the coefficients in (4) converge to the coefficients in (2). Recall that a Dyck path of semi-length $m$ is a path in the plane from $(0,0)$ to $(2m,0)$ consisting of $m$ steps along the vector $(1,1)$, called up-steps, and $m$ steps along the vector $(1,-1)$, called down-steps, that never goes below the $x$-axis. We say a Dyck path is strict if none of the path’s interior vertices reside on the $x$-axis. Each (strict) Dyck path determines a sequence of letters from $\{U,D\}$ that represent up- and down-steps in the path; this sequence is known as a (strict) Dyck word. With these notions, we may prove the following result by leveraging the fact that Borel’s triangle counts so-called marked Dyck paths [6]; see Figure 2 for an illustration.
Lemma 8.
It holds that
[TABLE]
Proof.
When , the result follows from Lemma 6, and when is odd, the edges in each fail to partition into even simple cycles, and so the result follows from Lemma 7(i). Now suppose is even and . For , recall that the edges of are indexed by and partitioned into simple even cycles. Define to be the words such that for every simple cycle in with edges indexed by , the restriction is a strict Dyck word with all but its first up-steps marked (here, denotes a marked up-step). Note that strict Dyck words of semi-length are in one-to-one correspondence with Dyck words of semi-length , and so there are of them. As such, Lemma 7 implies that for every , it holds that
[TABLE]
Let denote the set of marked Dyck words with marked up-steps, none of which are at ground level. We observe that
[TABLE]
Then equations (5) and (6) together give
[TABLE]
where the last step applies Theorem 2 in [6]. ∎
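The counting fact used in the proof above, that strict Dyck words of a given semi-length are in bijection with Dyck words of semi-length one less, can be confirmed by enumeration. In this sketch the letters `U` and `D` for up- and down-steps and all function names are our own choices.

```python
from math import comb


def dyck_words(m):
    """Yield all Dyck words of semi-length m over the letters 'U' and 'D'."""
    def extend(word, ups, downs):
        if ups == m and downs == m:
            yield word
            return
        if ups < m:
            yield from extend(word + "U", ups + 1, downs)
        if downs < ups:  # never step below the axis
            yield from extend(word + "D", ups, downs + 1)
    yield from extend("", 0, 0)


def is_strict(word):
    """True if the path meets the axis only at its two endpoints."""
    height = 0
    for letter in word[:-1]:  # skip the final step, which returns to height 0
        height += 1 if letter == "U" else -1
        if height == 0:
            return False
    return True


def catalan(m):
    """The m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


strict_counts = {m: sum(1 for w in dyck_words(m) if is_strict(w))
                 for m in range(1, 7)}
```

A strict Dyck word is exactly `U` followed by an arbitrary Dyck word of semi-length $m-1$ followed by `D`, which is why the count is a Catalan number with shifted index.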
At this point, we are in a position to verify hypothesis (i) from Proposition 2 in our case. For hypothesis (ii), we follow the approach suggested by Remark 2.4.5 in [26] of leveraging Talagrand concentration to bound the variance. First, we pass to a setting that is more amenable to analysis with Talagrand concentration. Here and throughout, for each , we fix an matrix such that .
Lemma 9.
It holds that $\displaystyle\operatorname{Var}\big(\operatorname{tr}\big((\tfrac{1}{p\sqrt{n}}X)^{k}\big)\big)\lesssim_{p,k}\max_{j\in[k]}\operatorname{Var}\big(\|FP\|_{S^{2j}}^{2j}\big)+n^{1/2}$.
Proof.
Define , and observe that
[TABLE]
Since commutes with and , the binomial theorem gives
[TABLE]
and so rearranging gives
[TABLE]
The following estimate holds for any choice of random variables :
[TABLE]
The lemma follows from applying this estimate to (7) by induction on . ∎
Next, we establish the convexity and Lipschitz continuity required by Talagrand:
Lemma 10.
For each , consider the mapping defined by . Then is convex and -Lipschitz.
Proof.
We adopt the shorthand notation . First, is convex since satisfies the triangle inequality and is convex:
[TABLE]
To compute a Lipschitz bound, we apply the factorization
[TABLE]
with and to get
[TABLE]
where the last step follows from the fact that , meaning (and similarly for ). Next, we apply the reverse triangle inequality to get
[TABLE]
which implies the result. ∎
Finally, we apply Talagrand concentration to obtain a variance bound:
Lemma 11.
It holds that $\displaystyle\operatorname{Var}\big(\tfrac{1}{pn}\operatorname{tr}\big((\tfrac{1}{p\sqrt{n}}X)^{k}\big)\big)\lesssim_{p,k}n^{-1/k}$.
Proof.
Given the mapping from Lemma 10, define in terms of subgradients by
[TABLE]
This is known as the smallest convex extension of to , and it is straightforward to verify that is convex and -Lipschitz with . Let have independent entries, each equal to with probability and [math] otherwise. Since almost surely, it holds that has the same distribution as , and we let denote its expectation. By Talagrand concentration (Proposition 3), there exists such that
[TABLE]
Combining with Lemma 9 then gives
[TABLE]
as desired. ∎
We may now verify hypotheses (i) and (ii) from Proposition 2 in our case.
Proof of Theorem 1.
Put and . First, we modify the random measure so that we may apply Proposition 2 to prove the result. Indeed, fails to be a probability measure with probability , since when is the empty set. To rectify this, we define
[TABLE]
Then it suffices to prove almost surely, since the Borel–Cantelli lemma implies almost surely, and so
[TABLE]
for every with . Conveniently, for every and , it holds that
[TABLE]
almost surely, and so the left-hand side inherits moments from the right-hand side.
To apply Proposition 2, we first observe that
[TABLE]
almost surely, and so are uniformly bounded, and therefore uniformly subgaussian. Similarly, is bounded and therefore subgaussian. Fix . As a consequence of Lemma 8, it holds that
[TABLE]
and so by Lemma 4, we have
[TABLE]
As such, satisfies hypothesis (i) from Proposition 2. Next, Lemma 11 establishes that $\operatorname{Var}\big(\tfrac{1}{pn}\operatorname{tr}(Z_{n}^{k})\big)\lesssim_{p,k}n^{-1/k}$, and so Lemma 4 implies
[TABLE]
Writing , select such that for every . Then
[TABLE]
As such, also satisfies hypothesis (ii) from Proposition 2, and so almost surely, as desired. ∎
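The role of lacunarity in the last step can be spelled out. As a sketch, enumerate the lacunary sequence as $n_1<n_2<\cdots$ with $n_{i+1}\ge\lambda n_i$ for some $\lambda>1$ (this indexing is our own), so that $n_i\ge\lambda^{i-1}n_1$. A variance bound of order $n^{-1/k}$, as in Lemma 11, is then summable along the sequence by comparison with a geometric series:

```latex
\sum_{i=1}^{\infty} n_i^{-1/k}
\;\le\; n_1^{-1/k}\sum_{i=1}^{\infty}\lambda^{-(i-1)/k}
\;=\; \frac{n_1^{-1/k}}{1-\lambda^{-1/k}}
\;<\;\infty .
```

Without lacunarity, for example along all admissible $n$, the same bound need not be summable, since $\sum_n n^{-1/k}$ diverges for every $k\ge1$.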
3 Proof of Lemma 7
It remains to compute, for each , the limit of
[TABLE]
where is the set of whose level sets are the blocks of and
[TABLE]
We begin with some basic properties of .
Lemma 12.
For every , each of the following holds:
- (i)
If , then .
- (ii)
If for any or , then .
- (iii)
If is any cyclic permutation of , then .
- (iv)
If , then .
- (v)
If and , then .
Proof.
First, (i) follows from the fact that is symmetric with off-diagonal entries in . Next, (ii) follows from the fact that the diagonal entries of are 0. Recalling the definition of , then (iii) follows from commutativity. Next suppose . Then
[TABLE]
and (iv) follows since is the entry of . Finally, in the case where , we have
[TABLE]
and (v) follows since provided . ∎
Let be a partition of . Recall that for , we let denote the block of containing . We extend this notation to any integer by considering to be the block of containing a representative of modulo . For convenience, we record the following immediate consequence of Lemma 12(iii).
Lemma 13.
Let be a partition of and fix . Define to be the partition of with for all . Then
To establish Lemma 7(i), we will show separately that for every crossing partition and that for every non-crossing partition such that contains an odd cycle.
Lemma 14.
Let be a crossing partition. Then .
Proof.
For to be a crossing partition, it must hold that and . Observe that the case follows immediately from Lemma 6 since . Now consider and suppose the lemma has been established for every crossing partition on blocks. By Lemma 6, we may further suppose that satisfies . Then for , the pigeonhole principle guarantees that contains a singleton block . By Lemma 13, we may assume . We proceed in cases:
Case I: . We may apply Lemma 12(v) to obtain
[TABLE]
The restriction of to results in a crossing partition of into blocks. Moreover, the above expression for implies
[TABLE]
and so our induction hypothesis provides .
Case II: . Writing out , we choose representatives with . Then by Lemma 12(iv), we have
[TABLE]
For , we define new blocks
[TABLE]
and the corresponding crossing partition . Then
[TABLE]
Since each by our induction hypothesis, we see as well. ∎
For non-crossing partitions, we will study the structure of the graph for , which we recall has vertex set and edges for all . Observe that if has a loop, then for some , and so by Lemma 12(ii). For this reason, we direct our attention to loop-free partitions , that is, partitions for which is loop-free.
Given a loop-free graph on vertices with edges , we say is a cut vertex if the induced subgraph of on is disconnected. A graph with no cut vertices is called biconnected, and the biconnected components of a graph are its maximal biconnected subgraphs. When the biconnected components of are all simple cycles, we call a cactus.
Lemma 15.
If is a loop-free non-crossing partition, then and is a cactus whose edges partition into simple cycles.
Proof.
First, suppose is a cactus whose edges partition into simple cycles. Since has no loops, every cycle contains at least two edges, and so the number of cycles is at most half the number of edges, that is, . Rearranging then gives . It remains to verify that is, indeed, a cactus whose edges partition into simple cycles.
Fixing , we proceed by induction on . If , then is itself a simple cycle and hence a cactus. For , we now consider a loop-free non-crossing partition . By the pigeonhole principle, we may select such that contains at least two elements of . Let denote the least element of . Writing , we consider defined by . Since is also loop-free and non-crossing, our induction hypothesis guarantees that is a cactus with simple cycles. Our task is to use this information to show that is a cactus with simple cycles.
Suppose first that and reside in the same simple cycle of . Then the simple cycles of not containing and remain simple cycles and biconnected components of . Moreover, by identifying and , we see that the simple cycle of containing and corresponds to two simple cycles of sharing the cut vertex . As such, contains biconnected components, each of which is a simple cycle.
We now claim that and must reside in the same simple cycle of , in which case we are done by the previous paragraph. Suppose instead that there exists a cut vertex that separates and within , and select with . Since is the least element of , we necessarily have . Furthermore, since separates and , we can traverse along a trail in from to , to , and back to to obtain indices and with . These indices satisfy and , contradicting our assumption that is non-crossing. Hence, and must reside in the same simple cycle of as claimed. ∎
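The block-count consequence of Lemma 15 can be tested by brute force for small $k$. The sketch below assumes the bound reads $|\pi|\ge k/2+1$, which is equivalent to the graph having at most $k/2$ cycles, because a connected graph with $k$ edges and $|\pi|$ vertices whose edges partition into simple cycles has exactly $k-|\pi|+1$ of them. This reading of the inequality, and all names in the code, are our assumptions.

```python
from itertools import combinations


def set_partitions(elements):
    """Yield every partition of a list as a list of blocks (lists)."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for partition in set_partitions(rest):
        for idx in range(len(partition)):
            yield (partition[:idx] + [[first] + partition[idx]]
                   + partition[idx + 1:])
        yield [[first]] + partition


def is_crossing(partition):
    """True if distinct blocks A, B contain a1 < b1 < a2 < b2."""
    for A, B in combinations(partition, 2):
        for X, Y in ((A, B), (B, A)):
            for a1, a2 in combinations(sorted(X), 2):
                if any(a1 < b < a2 for b in Y) and any(b > a2 for b in Y):
                    return True
    return False


def has_loop(partition, k):
    """True if i and i+1 (with k+1 read as 1) share a block for some i."""
    block_of = {i: idx for idx, block in enumerate(partition) for i in block}
    return any(block_of[i] == block_of[i % k + 1] for i in range(1, k + 1))


def cycle_count(partition, k):
    """Cyclomatic number k - |pi| + 1 of the connected graph on the blocks."""
    return k - len(partition) + 1


def bound_holds(k):
    """Check 2|pi| >= k + 2 over loop-free non-crossing partitions of [k]."""
    return all(2 * len(P) >= k + 2
               for P in set_partitions(list(range(1, k + 1)))
               if not is_crossing(P) and not has_loop(P, k))


results = {k: bound_holds(k) for k in range(2, 8)}
```

The non-crossing hypothesis matters: the crossing partition $\{\{1,3\},\{2,4\}\}$ of $[4]$ is loop-free but has only two blocks.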
Lemma 16.
For every loop-free non-crossing , each of the following holds:
- (i)
If contains any odd cycles, then .
- (ii)
If the edges of partition into simple cycles of sizes , then
[TABLE]
Proof.
Any loop-free non-crossing partition must have at least two blocks. When , we may assume by Lemma 6 so that the only partition under consideration is , in which case and . Lemma 12(i) allows us to verify the result in this case:
[TABLE]
Now consider , and suppose the lemma has been established for every loop-free non-crossing partition on blocks. By Lemma 15, we may assume that satisfies . Then for , the pigeonhole principle guarantees that contains a singleton block . By Lemma 13, we may assume . We proceed in cases:
Case I: . We may apply Lemma 12(v) to obtain
[TABLE]
The restriction of to results in a loop-free non-crossing partition of into blocks. Moreover, the above expression for implies
[TABLE]
For (i), observe that if contains any odd cycles, then must also contain odd cycles. In this case, we may apply our induction hypothesis to to conclude . For (ii), the edges of partition into simple cycles of sizes with , and so the edges of partition into simple cycles of sizes . Then (8) and our induction hypothesis together imply
[TABLE]
Since , this establishes (ii).
Case II: . In this case, necessarily resides in a cycle of length . Select representatives with and so that the vertices in the cycle are given by . Then we may apply Lemma 12(iv) to obtain
[TABLE]
For , define new blocks
[TABLE]
and the corresponding partitions . Then
[TABLE]
By Lemma 14, whenever is a crossing partition. Since is obtained from by merging blocks and , we can argue as in the proof of Lemma 15 to conclude that is crossing if and only if and do not reside in the same simple cycle of . Hence,
[TABLE]
where each is non-crossing for . Both and contain loops, so by Lemma 12(ii). When , this gives , as desired by (i). Supposing for the remainder that , we must still compute the limit of
[TABLE]
Observe that our cycle in of length corresponds to the two simple cycles in of and with lengths and and share the cut vertex . Moreover, all other simple cycles are identical between the two graphs.
If is odd, then for each , either or is odd, and so must have an odd cycle. Since each has blocks and an odd cycle, we can apply our induction hypothesis to conclude that each so that , as desired by (i). Suppose instead that is even, but has an odd cycle. This odd cycle is also contained in each for , and again we can apply our induction hypothesis to conclude that , thereby establishing (i).
Finally, for (ii), suppose that is even and that the edges of partition into cycles of lengths with . Notice that if is odd, then contains an odd cycle, and . Since the contribution of these terms is negligible, we must compute the limit of
[TABLE]
The cycles of lengths are common to both and , while the cycle of length in corresponds to two cycles of length and in . Applying our induction hypothesis, we have
[TABLE]
Reindexing and applying the convolution identity for Catalan numbers, we have
[TABLE]
Hence, , thereby establishing (ii). ∎
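The convolution identity for Catalan numbers invoked above is $C_{m+1}=\sum_{j=0}^{m}C_jC_{m-j}$. The sketch below checks it numerically, together with the shifted form that arises when an even cycle of length $2h$ is split into two even cycles of lengths $2a$ and $2(h-a)$, each contributing a Catalan number with index one less than its half-length. This indexing is our reading of the argument, and the function names are ours.

```python
from math import comb


def catalan(m):
    """The m-th Catalan number."""
    return comb(2 * m, m) // (m + 1)


def catalan_convolution(m):
    """Sum of C_j * C_{m-j} over j = 0, ..., m."""
    return sum(catalan(j) * catalan(m - j) for j in range(m + 1))


def split_cycle_sum(h):
    """Sum over a = 1, ..., h-1 of C_{a-1} * C_{h-a-1}, for h >= 2."""
    return sum(catalan(a - 1) * catalan(h - a - 1) for a in range(1, h))


convolution_ok = all(catalan_convolution(m) == catalan(m + 1)
                     for m in range(12))
split_ok = all(split_cycle_sum(h) == catalan(h - 1) for h in range(2, 12))
```

The second check is the first one with $m=h-2$ after reindexing $j=a-1$.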
Proof of Lemma 7.
To prove (i), consider . Then either is crossing, contains a loop, or contains an odd cycle. If is crossing, then by Lemma 14. If contains a loop, then by Lemma 12(ii). If is loop-free and non-crossing but contains an odd cycle, then by Lemma 16(i). This establishes (i). Finally, (ii) follows from applying both Lemmas 15 and 16(ii). ∎
Acknowledgments
MM and DGM were partially supported by AFOSR FA9550-18-1-0107. DGM was also supported by NSF DMS 1829955 and the Simons Institute for the Theory of Computing.
- [1] A. S. Bandeira, M. Fickus, D. G. Mixon, P. Wong, The road to deterministic matrices with the restricted isometry property, J. Fourier Anal. Appl. 19 (2013) 1123–1149.
- [2] A. S. Bandeira, D. G. Mixon, J. Moreira, A conditional construction of restricted isometries, Int. Math. Res. Notices 2017 (2017) 372–381.
- [3] V. Bargmann, Note on Wigner’s theorem on symmetry operations, J. Math. Phys. 5 (1964) 862–868.
- [4] J. J. Benedetto, M. Fickus, Finite normalized tight frames, Adv. Comput. Math. 18 (2003) 357–385.
- [5] J. Bourgain, S. Dilworth, K. Ford, S. Konyagin, D. Kutzarova, Explicit constructions of RIP matrices and related problems, Duke Math. J. 159 (2011) 145–185.
- [6] Y. Cai, C. Yan, Counting with Borel’s triangle, Discrete Math. 342 (2019) 529–539.
- [7] E. J. Candès, J. Romberg, T. Tao, Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information, IEEE Trans. Inf. Theory 52 (2006) 489–509.
- [8] P. G. Casazza, M. Fickus, J. C. Tremain, E. Weber, The Kadison–Singer problem in mathematics and engineering: a detailed account, in: Operator Theory, Operator Algebras, and Applications, D. Han, P. E. T. Jorgensen, D. R. Larson (eds.), Contemporary Mathematics, vol. 414, Providence, RI: American Mathematical Society, 2006, pp. 299–355.
