Finite-rank perturbations of random band matrices via infinitesimal free probability
Benson Au

TL;DR
This paper investigates the infinitesimal spectral distribution of banded GUE matrices, establishing a phase transition at band width proportional to the square root of matrix size, and extends results on finite-rank perturbations and outlier detection.
Contribution
It proves a sharp phase transition for the infinitesimal distribution of banded GUE matrices and extends infinitesimal free probability results to this setting.
Findings
Sharp $\sqrt{N}$ transition for infinitesimal distribution
Model is infinitesimally free from matrix units and all-ones matrix for large band widths
Finite-rank perturbations produce outliers at classical positions
Abstract
We prove a sharp $\sqrt{N}$ transition for the infinitesimal distribution of a periodically banded GUE matrix. For band widths $b_N = \Omega(\sqrt{N})$, we further prove that our model is infinitesimally free from the matrix units and the normalized all-ones matrix. Our results allow us to extend previous work of Shlyakhtenko on finite-rank perturbations of Wigner matrices in the infinitesimal framework. For finite-rank perturbations of our model, we find outliers at the classical positions from the deformed Wigner ensemble.
Finite-rank perturbations of random band matrices via infinitesimal free probability
Benson Au
University of California, San Diego
Department of Mathematics
9500 Gilman Drive # 0112
La Jolla, CA 92093-0112
USA
Key words and phrases:
BBP transition; finite-rank perturbation; infinitesimal free probability; random band matrix; traffic probability; Wigner matrix
2010 Mathematics Subject Classification:
15B52; 46L53; 46L54; 60B20
1. Introduction
1.1. Motivation
The contact between random matrices and free probability first appeared in the seminal work of Voiculescu [Voi91]. By now, a well-developed theory exists to illustrate the depth of this connection: see, for example, the monographs [VDN92, NS06, AGZ10, MS17]. We summarize the basic paradigm as follows: in many generic situations, independent random matrices become freely independent in the large $N$ limit. The analytic machinery of free probability then allows us to understand various joint asymptotics associated to such multi-matrix models.
Despite the tremendous success of this approach, the standard free probability framework comes with inherent limitations. In particular, free independence only prescribes the zeroth order behavior of our random variables: for random matrices, this shortcoming already manifests itself at the level of outliers. To make this precise, we introduce some notation. In this article, we restrict our attention to self-adjoint matrices. For such a matrix $A_N$, we write $\lambda_1(A_N) \geq \cdots \geq \lambda_N(A_N)$ for its eigenvalues, counting multiplicity, arranged in non-increasing order. We further write $\mu(A_N)$ for the empirical spectral distribution (ESD) of $A_N$. Thus,
$$\mu(A_N) = \frac{1}{N} \sum_{k=1}^{N} \delta_{\lambda_k(A_N)}.$$
Hereafter, when we refer to a matrix $A_N$, we implicitly refer to a sequence of matrices $(A_N)_{N \in \mathbb{N}}$.
Now, suppose that we have random matrices $A_N$ and $B_N$ such that the ESDs $\mu(A_N)$ and $\mu(B_N)$ converge weakly in expectation to some compactly supported probability measures $\mu$ and $\nu$ respectively. If we further assume that $A_N$ and $B_N$ are asymptotically free, then we can even compute the limiting spectral distribution (LSD) of rational functions in the pair $(A_N, B_N)$ [HMS18]. In particular, the freeness relationship completely determines the LSD of the sum $A_N + B_N$ from the marginals $\mu$ and $\nu$. By analogy with the classical case, this operation is known as the free (additive) convolution, for which we use the notation $\mu \boxplus \nu$. We recall the following characterization of the free convolution in terms of subordination functions (see, for example, [MS17, Chapter 3]). For a probability measure $\mu$ on $\mathbb{R}$, we denote its Cauchy transform by $G_\mu$, where
$$G_\mu(z) = \int_{\mathbb{R}} \frac{1}{z - t} \, \mu(dt), \qquad z \in \mathbb{C}^+.$$
We use the notation $F_\mu = 1/G_\mu$ for the reciprocal Cauchy transform.
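Theorem 1.1 ([Voi93, Bia98]).
For any pair of probability measures $\mu, \nu$ on $\mathbb{R}$, there exists a unique pair of analytic functions $\omega_1, \omega_2 : \mathbb{C}^+ \to \mathbb{C}^+$ such that
- (i) $G_\mu(\omega_1(z)) = G_\nu(\omega_2(z))$;
- (ii) $\omega_1(z) + \omega_2(z) = z + F_\mu(\omega_1(z))$.
Moreover, the common function in property (i) corresponds to the Cauchy transform of a unique probability measure on $\mathbb{R}$. We define the free convolution $\mu \boxplus \nu$ as this unique probability measure, namely
$$G_{\mu \boxplus \nu}(z) = G_\mu(\omega_1(z)) = G_\nu(\omega_2(z)).$$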
The tools of free harmonic analysis enable a great many practical computations. For example, if one takes $A_N$ to be a normalized matrix from the Gaussian unitary ensemble (GUE), then a classical result of Wigner shows that the LSD $\mu$ is the so-called semicircle distribution [Wig55]. At the same time, the unitary invariance of the GUE implies that $A_N$ is asymptotically free from a large class of random matrices [Voi91]. In the setting above, one can take $B_N$ to be an independent diagonal matrix with i.i.d. Rademacher entries, in which case $\nu = \frac{1}{2}(\delta_{-1} + \delta_{1})$. Implementing Theorem 1.1, we obtain the LSD of the sum $A_N + B_N$:
[TABLE]
Such additive perturbations appear naturally as models of interaction and noise. Under suitable conditions, we see that free probability allows us to understand the spectral distribution at the aggregate level; however, this approach fails to capture the behavior of the extremal eigenvalues. Indeed, consider the case of a rank one perturbation $\theta E_N^{(1,1)}$, where $E_N^{(j,k)}$ is the matrix unit in the $(j,k)$-th coordinate and $\theta \in \mathbb{R}$. For $A_N$ a normalized GUE matrix as before, with semicircle LSD $\mu_{\mathrm{sc}}$, the free convolution calculation reduces to the trivial identity
$$\mu_{\mathrm{sc}} \boxplus \delta_0 = \mu_{\mathrm{sc}}.$$
From this perspective, the effect of the perturbation appears no different than the unperturbed model $A_N$.
In actuality, we know that the behavior of the extremal eigenvalue exhibits a phase transition depending on the magnitude of the perturbation (a so-called BBP transition in view of the original work [BBAP05] on complex sample covariance matrices). In the case of the deformed GUE, Péché showed that the fluctuations of the extremal eigenvalue deviate from the Tracy-Widom distribution [TW94] when $\theta \geq 1$, with the extremal eigenvalue even separating from the bulk when $\theta > 1$ [Péc06]. The unitary invariance of the GUE implies that the same result holds for any rank one self-adjoint perturbation with nontrivial eigenvalue $\theta$. Féral and Péché then extended the result to complex sub-Gaussian Wigner matrices under the perturbation $\frac{\theta}{N} J_N$, where $J_N$ is the all-ones matrix [FP07]. Notably, they proved the universality of the fluctuations of the extremal eigenvalue (cf. [FK81, Sos99]). Maïda established a large deviation principle for the extremal eigenvalue of the deformed Gaussian ensembles: as a corollary, this proves the same bulk separation phenomenon for the deformed Gaussian orthogonal ensemble (GOE) [Maï07]. Capitaine, Donati-Martin, and Féral generalized the bulk separation phenomenon to finite-rank perturbations: for example, of the form $\sum_{i=1}^{r} \theta_i E_N^{(i,i)}$ for some fixed $r$. In this case, multiple eigenvalues exit the bulk, one for each supercritical value of $\theta_i$. Their result holds for general Wigner matrices, real and complex, under the technical assumption that the entries satisfy a Poincaré inequality. At the same time, they extended the universality of the fluctuations of the extremal eigenvalue under perturbations of the form $\frac{\theta}{N} J_N$ to real Wigner matrices. In contrast, they also proved the non-universality of the fluctuations of the extremal eigenvalue under perturbations of the form $\theta E_N^{(1,1)}$ [CDMF09]. In a later work, the same authors also determined the joint fluctuations of the extremal eigenvalues [CDMF12].
Pizzo, Renfrew, and Soshnikov [PRS13] and later Renfrew and Soshnikov [RS13] removed the technical assumptions for these results: the version we state below is due to them. For additional reading and related results, see the surveys [Péc14, CDM17].
Theorem 1.2 (BBP transition).
For each , let be a family of independent random variables, the off-diagonal entries possibly being complex-valued. We assume that the diagonal entries are centered with uniformly bounded variance satisfying the Lindeberg condition:
[TABLE]
For real-valued, we assume that the off-diagonal entries are centered with identical variance and uniformly bounded fourth moments satisfying a Lindeberg type condition:
[TABLE]
For complex-valued, we assume that the real and imaginary parts of each off-diagonal entry are independent with identical variance in addition to the conditions above. As a consequence,
[TABLE]
Let denote the corresponding (unnormalized) Wigner matrix with the usual normalization . Assume that is a deterministic self-adjoint matrix of the same symmetry class as with fixed rank independent of the dimension. We further assume that the non-trivial eigenvalues of are independent of , say , where occurs with multiplicity for . Let and , and define
[TABLE]
Then we have the following asymptotic behavior at the edge of the spectrum of the deformed Wigner ensemble :
- (i)
For any and ,
[TABLE] 2. (ii)
; 3. (iii)
; 4. (iv)
For any and ,
[TABLE]
where denotes convergence in probability.
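The displayed formulas of Theorem 1.2 are not reproduced above, so the following is only a minimal sketch of the outlier location under a stated assumption: the standard BBP form, in which a spike $\theta$ with $|\theta| > \sigma$ produces an outlier at $\theta + \sigma^2/\theta$ and a spike with $|\theta| \leq \sigma$ produces none (the extremal eigenvalue then sticks to the bulk edge $\pm 2\sigma$). The function name and signature are hypothetical.

```python
from typing import Optional


def classical_outlier_position(theta: float, sigma: float = 1.0) -> Optional[float]:
    """Return the assumed limiting outlier location for a spike of size theta.

    Assumption (standard BBP form, not copied from the theorem's displays):
    a spike with |theta| > sigma creates an outlier at theta + sigma^2 / theta,
    while a spike with |theta| <= sigma creates no outlier.
    """
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    if abs(theta) > sigma:
        return theta + sigma**2 / theta
    return None  # subcritical spike: no eigenvalue leaves the bulk


if __name__ == "__main__":
    for theta in (0.5, 1.0, 1.5, 2.0, -3.0):
        print(theta, classical_outlier_position(theta))
```

Under this assumption, every supercritical spike lands strictly outside the bulk $[-2\sigma, 2\sigma]$, since $\theta + \sigma^2/\theta > 2\sigma$ whenever $\theta > \sigma$.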
Recall that our earlier free convolution calculation failed to identify such outliers. Nevertheless, it turns out that the behavior of the outlying eigenvalues (as well as their eigenvectors) can be understood in terms of the subordination functions from Theorem 1.1 [CDMFF11, Cap13, BBCF17] (see also [BGN11] for related results). This suggests that free probability may yet prove useful to this end. Shlyakhtenko explained this connection using the framework of infinitesimal free probability, an extension of free probability to the first order. In particular, by calculating a type B free convolution, one obtains the correction to the LSD of such deformed ensembles. The outlying eigenvalues then appear in this correction in the form of Dirac masses [Shl18]. We review this framework in the next section.
1.2. Background
We begin by recalling the usual free probability framework.
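Definition 1.3 (Free probability).
By a non-commutative (NC) probability space $(\mathcal{A}, \varphi)$, we mean a unital algebra $\mathcal{A}$ over $\mathbb{C}$ paired with a unital linear functional $\varphi : \mathcal{A} \to \mathbb{C}$. We say that $\varphi$ is tracial if $\varphi(ab) = \varphi(ba)$ for all $a, b \in \mathcal{A}$. The distribution of a family of random variables $\mathbf{a} = (a_i)_{i \in I} \subset \mathcal{A}$ is the linear functional
$$\mu_{\mathbf{a}} : \mathbb{C}\langle \mathbf{x} \rangle \to \mathbb{C}, \qquad P \mapsto \varphi(P(\mathbf{a})),$$
where $\mathbf{x} = (x_i)_{i \in I}$ is a set of non-commuting indeterminates and $P(\mathbf{a})$ is the usual evaluation of NC polynomials. A sequence of families $(\mathbf{a}_N)_{N \in \mathbb{N}}$, each living in a possibly different NC probability space $(\mathcal{A}_N, \varphi_N)$, converges in distribution if the sequence $(\mu_{\mathbf{a}_N})_{N \in \mathbb{N}}$ converges pointwise. Note that the limit $\mu$ defines a new NC probability space $(\mathbb{C}\langle \mathbf{x} \rangle, \mu)$.
Unital subalgebras $(\mathcal{A}_i)_{i \in I}$ of $\mathcal{A}$ are said to be freely independent (or simply free) if for any $n \geq 1$ and consecutively distinct indices $i(1) \neq i(2) \neq \cdots \neq i(n)$,
$$\varphi(a_1 a_2 \cdots a_n) = 0 \qquad \text{whenever } a_j \in \mathring{\mathcal{A}}_{i(j)} \text{ for } j = 1, \ldots, n,$$
where $\mathring{\mathcal{A}}_i = \{a \in \mathcal{A}_i : \varphi(a) = 0\}$ denotes the subspace of centered elements. We say that collections of random variables are free if the unital subalgebras that they generate are free. If a sequence of families $(\mathbf{a}_N)_{N \in \mathbb{N}}$ converges in distribution, then we say that the random variables $\mathbf{a}_N$ are asymptotically free if the indeterminates $\mathbf{x}$ are free in $(\mathbb{C}\langle \mathbf{x} \rangle, \mu)$.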
Remark 1.4.
The reader might wonder how the notion of a distribution above relates to the usual notion of a distribution for a real-valued random variable. If we assume both existence and uniqueness for the moment problem defined by $\mu_a$, then the two notions coincide. The moment sequences we consider in this paper will satisfy this assumption, so we speak of the two notions interchangeably. In particular, if $a, b \in \mathcal{A}$ are free with determinate moment problems, then $\mu_{a+b} = \mu_a \boxplus \mu_b$.
Example 1.5 (Random matrices).
Let $\mathcal{A}_N$ denote the algebra of $N \times N$ random matrices whose entries, possibly complex-valued, have finite absolute moments of all orders. Then $(\mathcal{A}_N, \mathbb{E}\frac{1}{N}\operatorname{Tr})$ defines a tracial NC probability space.
Voiculescu showed that independent unitarily invariant random matrices are asymptotically free [Voi91], the GUE being a prototypical example. Dykema later extended this result to general Wigner matrices [Dyk93]. We now know freeness to be a ubiquitous phenomenon for invariant/mean-field multi-matrix models in the large $N$ limit [MS17] (see also [Spe17]).
Understanding the spectral behavior of non mean-field ensembles constitutes a major ongoing program of research, where random band matrices emerge as an attractive interpolative model (see [Bou18] and the references therein). Here, the primary questions concern the local eigenvalue statistics and localization versus delocalization for the eigenvectors. In a different direction, we showed that freeness governs random band matrices for band widths $b_N \to \infty$ [Au18], motivating the investigations in this paper at the infinitesimal level. The results in [Au18] rely on an extension of free probability introduced by Male called traffic probability [Mal]: we make use of the traffic framework again, this time in conjunction with the infinitesimal framework. We refer the reader to [Mal17, MP, Gaba, Gabb, Gabc, CDM, ACD+] for additional reading on traffic probability and its applications.
Belinschi and Shlyakhtenko introduced infinitesimal free probability in [BS12] to provide an analytic interpretation of the type $B$ free probability of Biane, Goodman, and Nica [BGN03]. We content ourselves with the basic framework: for more on the interplay between these two notions, see [FN10]. For recent work on infinitesimal free probability and its applications to random matrices, we mention the contributions [Min, DF, Tse].
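Definition 1.6 (Infinitesimal free probability).
By an infinitesimal NC probability space $(\mathcal{A}, \varphi, \varphi')$, we mean an NC probability space $(\mathcal{A}, \varphi)$ with an additional linear functional $\varphi' : \mathcal{A} \to \mathbb{C}$ satisfying $\varphi'(1) = 0$. The infinitesimal distribution of a family of random variables $\mathbf{a} = (a_i)_{i \in I} \subset \mathcal{A}$ is the linear functional
$$\mu'_{\mathbf{a}} : \mathbb{C}\langle \mathbf{x} \rangle \to \mathbb{C}, \qquad P \mapsto \varphi'(P(\mathbf{a})).$$
We refer to the pair $(\mu_{\mathbf{a}}, \mu'_{\mathbf{a}})$ as the type $B$ distribution of $\mathbf{a}$.
Unital subalgebras $(\mathcal{A}_i)_{i \in I}$ of $\mathcal{A}$ are said to be infinitesimally free if
- (i) the $(\mathcal{A}_i)_{i \in I}$ are free in $(\mathcal{A}, \varphi)$;
- (ii) for any $n \geq 1$ and consecutively distinct indices $i(1) \neq i(2) \neq \cdots \neq i(n)$,
$$\varphi'(a_1 \cdots a_n) = \sum_{j=1}^{n} \varphi\big(a_1 \cdots a_{j-1} \, \varphi'(a_j) \, a_{j+1} \cdots a_n\big)$$
whenever each $a_j \in \mathcal{A}_{i(j)}$ is centered, $\varphi(a_j) = 0$.
Conditions (i) and (ii) are equivalent to the following asymptotic:
$$\varphi_t\big((a_1 - \varphi_t(a_1)) \cdots (a_n - \varphi_t(a_n))\big) = o(t) \qquad \text{as } t \to 0,$$
where $\varphi_t = \varphi + t\varphi'$ and $a_j \in \mathcal{A}_{i(j)}$ for $j = 1, \ldots, n$. Thus, heuristically, we think of infinitesimal freeness as “freeness to the first order”.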
Remark 1.7.
In view of Remark 1.4, the reader might wonder how the notion of an infinitesimal distribution relates to the usual notion of a signed measure on the real line. If we assume both existence and uniqueness for the signed moment problem defined by $\mu'_a$, then the two notions coincide. The signed moment sequences we consider in this paper will typically satisfy this assumption, so we speak of the two notions interchangeably when possible. Note that the condition $\varphi'(1) = 0$ implies that the corresponding signed measure has total mass zero.
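Example 1.8 (Random matrices, revisited).
Let $\mathbf{A}_N$ be a family of random matrices in the algebra $\mathcal{A}_N$ of Example 1.5. Assume that $\mathbf{A}_N$ converges in distribution with limit $\mu$. If we further assume that the limit
$$\mu'(P) = \lim_{N \to \infty} N \big( \mu_{\mathbf{A}_N}(P) - \mu(P) \big), \qquad P \in \mathbb{C}\langle \mathbf{x} \rangle,$$
exists, then $(\mathbb{C}\langle \mathbf{x} \rangle, \mu, \mu')$ defines a tracial infinitesimal NC probability space (both $\mu$ and $\mu'$ vanish on the commutators). By a slight abuse of terminology, we often refer to $\mu'$ (resp., $(\mu, \mu')$) as the infinitesimal distribution (resp., type $B$ distribution) of $\mathbf{A}_N$.
In the single matrix case, say $A_N$, the infinitesimal distribution corresponds to the $1/N$ correction to the LSD $\mu$. Indeed, by definition,
$$\mu'(x^k) = \lim_{N \to \infty} \Big( \mathbb{E}\Big[ \sum_{i=1}^{N} \lambda_i(A_N)^k \Big] - N \int_{\mathbb{R}} t^k \, \mu(dt) \Big), \tag{1}$$
where we recall that $A_N$ is assumed to be self-adjoint. For example, in the case of the GUE, the infinitesimal distribution is null, $\mu' \equiv 0$, a consequence of the genus expansion [HZ86]. On the other hand, a result of Johansson [Joh98] shows that the situation becomes much different for the GOE, where
$$\mu'(dx) = \frac{1}{4} \big( \delta_{-2} + \delta_{2} \big)(dx) - \frac{1}{2\pi} \frac{\mathbb{1}\{|x| < 2\}}{\sqrt{4 - x^2}} \, dx.$$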
We mention that such corrections also exist for complex Wishart matrices [MN04, Min] and $\beta$-ensembles [DE06].
Note that the eigenvalues appear in (1) via the unnormalized trace. This suggests that the infinitesimal distribution is sensitive to outliers. To see this, we will need the following subordination result for the type $B$ free (additive) convolution.
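Theorem 1.9 ([BS12]).
Suppose that $a, b \in \mathcal{A}$ are infinitesimally free with compactly supported type $B$ distributions $(\mu_a, \mu'_a)$ and $(\mu_b, \mu'_b)$. By this, we mean that both coordinates of the type $B$ distribution have compact support. Then, in the notation of Theorem 1.1, the sum $a + b$ also has a compactly supported type $B$ distribution $(\mu_{a+b}, \mu'_{a+b})$ characterized by
- (i) $\mu_{a+b} = \mu_a \boxplus \mu_b$;
- (ii) $G_{\mu'_{a+b}}(z) = G_{\mu'_a}(\omega_1(z)) \, \omega_1'(z) + G_{\mu'_b}(\omega_2(z)) \, \omega_2'(z)$,
where $\omega_1', \omega_2'$ denote the usual derivatives. We define the type $B$ convolution $\boxplus_B$ as this unique type $B$ distribution, namely
$$(\mu_a, \mu'_a) \boxplus_B (\mu_b, \mu'_b) = (\mu_{a+b}, \mu'_{a+b}).$$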
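Theorem 1.10 ([Shl18]).
Let $X_N$ be a normalized GUE matrix. Then for any fixed $r$, the matrices $X_N$ and $(E_N^{(i,j)})_{1 \leq i, j \leq r}$ are asymptotically infinitesimally free.
Of course, we can easily compute the type $B$ distribution of the matrix units. For $\theta E_N^{(1,1)}$, we see that
$$(\mu, \mu') = (\delta_0, \, \delta_\theta - \delta_0).$$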
Using Theorem 1.9, one obtains the correction to the LSD of the deformed Gaussian ensemble (cf. Theorem 1.2).
Corollary 1.11 ([Shl18]).
If and , then the type distribution of is given by
[TABLE]
where
[TABLE]
is a probability measure if ; otherwise, is a signed measure of total mass zero with Jordan decomposition , where
[TABLE]
If instead , then the type distribution of is given by
[TABLE]
where is as before.
The proof of Theorem 1.10 relies on Wick’s formula for Gaussian integration. Naturally, one can ask if the result extends to general Wigner matrices. In this case, one needs to first prove the existence of an infinitesimal distribution for the single matrix model, a calculation carried out by Enriquez and Ménard (see also [KKP96]). We state a slight generalization of their result to allow for entries with possibly different distributions: the proof remains unchanged.
Theorem 1.12 ([EM16]).
For each , let be a family of independent random variables, the off-diagonal entries possibly being complex-valued. We assume that the diagonal entries are centered with identical variance:
[TABLE]
For real-valued entries ($\beta = 1$), we assume that the off-diagonal entries are centered with identical variance and fourth moments:
[TABLE]
For complex-valued entries ($\beta = 2$), we assume that the pseudo-variance of each off-diagonal entry vanishes in addition to the conditions above:
[TABLE]
Lastly, we assume a strong uniform control on the moments:
[TABLE]
Then the corresponding Wigner matrix has an infinitesimal distribution $\nu = \frac{1}{2} \Big[ \frac{\mathbb{1}\{\beta = 1\}}{2} \delta_{\pm 2\sigma} + \nu_{\mathrm{ac}} \Big]$, where
[TABLE]
1.3. Statement of results
Our first result extends Theorem 1.10 to general Wigner matrices. We also consider perturbations of the form $\frac{\theta}{N} J_N$, where we recall that $J_N$ is the all-ones matrix.
Theorem 1.13.
Let $W_N$ be a Wigner matrix of the form in Theorem 1.12. Then for any fixed $r$, the matrices $W_N$, $(E_N^{(i,j)})_{1 \leq i, j \leq r}$, and $\frac{1}{N} J_N$ are asymptotically infinitesimally free.
Note that the type $B$ distribution of $\frac{\theta}{N} J_N$ is identical to that of $\theta E_N^{(1,1)}$, allowing us to essentially repeat the calculation of Corollary 1.11.
Corollary 1.14.
The type $B$ distribution of the deformed Wigner ensemble
[TABLE]
is given by
[TABLE]
where is as in Theorem 1.12 and is as in Corollary 1.11.
Remark 1.15.
The result above shows that while the infinitesimal distribution is sensitive to outliers, it fails to distinguish their fluctuations. Indeed, recall that the fluctuations of the extremal eigenvalue under perturbations of the form $\theta E_N^{(1,1)}$ (resp., $\frac{\theta}{N} J_N$) are non-universal (resp., universal) for $\theta > \sigma$, whereas the infinitesimal distributions of the corresponding deformed ensembles are identical. In general, the fluctuations of the extremal eigenvalues depend on the geometry of the eigenvectors of the perturbation: localized (as in the case of $E_N^{(1,1)}$) versus delocalized (as in the case of $\frac{1}{N} J_N$) [CDMF12].
The usual strategy for studying outliers relies on a fine analysis of the resolvent, using delicate estimates currently unavailable for non mean-field ensembles. In contrast, the purview of the infinitesimal framework extends quite naturally to random band matrices. We restrict ourselves to the idealized situation of a periodically banded GUE matrix.
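Definition 1.16 (Random band matrix).
Let $X_N$ be an unnormalized GUE matrix. For a band width $b_N$, we define $\mathbf{B}_N$ to be the corresponding periodic band matrix of ones:
$$\mathbf{B}_N(i, j) = \mathbb{1}\{|i - j|_N \leq b_N\},$$
where
$$|i - j|_N = \min\{|i - j|, \, N - |i - j|\}.$$
We assume that the band width $b_N \to \infty$, and we set
$$\xi_N = \min\{2 b_N + 1, \, N\}.$$
We call the random matrix
$$\Xi_N = \frac{1}{\sqrt{\xi_N}} \, \mathbf{B}_N \circ X_N,$$
where $\circ$ denotes the entrywise product, a (normalized) periodically banded GUE matrix (of band width $b_N$). Of course, if $b_N \geq N/2$, then $\Xi_N$ is a normalized GUE matrix.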
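The band structure in Definition 1.16 can be illustrated with a short standard-library sketch. It assumes the usual conventions, which are not confirmed by the displays above: the periodic distance is $\min\{|i-j|, N-|i-j|\}$, an entry is kept when this distance is at most the band width, and each row then has $\min\{2b+1, N\}$ nonzero entries. The names `periodic_distance`, `band_mask`, and `row_weight` are hypothetical.

```python
from typing import List


def periodic_distance(i: int, j: int, n: int) -> int:
    """Distance between indices i and j on a cycle of length n (assumed convention)."""
    d = abs(i - j) % n
    return min(d, n - d)


def band_mask(n: int, b: int) -> List[List[int]]:
    """0/1 matrix with a 1 wherever the periodic distance is at most b."""
    return [
        [1 if periodic_distance(i, j, n) <= b else 0 for j in range(n)]
        for i in range(n)
    ]


def row_weight(n: int, b: int) -> int:
    """Number of ones in each row of band_mask(n, b)."""
    return min(2 * b + 1, n)


if __name__ == "__main__":
    for row in band_mask(8, 2):
        print("".join(str(x) for x in row))
```

Because the distance is periodic, every row has the same number of nonzero entries, which is the homogeneity that the later counting arguments rely on.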
Bogachev, Molchanov, and Pastur proved that the ESD converges weakly almost surely to the semicircle distribution [BMP91]. In particular, this holds regardless of the rate because of the periodic band width structure (2). We considered the multi-matrix case in [Au18], where it was shown that independent copies of are asymptotically free, regardless of the relative rates of growth of the band widths . So, for example, it could be that
[TABLE]
We highlight this homogeneity around $\sqrt{N}$ because of its conjectural role, confirmed at the level of physical rigor, as the critical value for the localization-delocalization transition for random band matrices (again, see [Bou18] for a recent survey).
While the rate did not play a role in our calculations at the zeroth order, a factor appears quite naturally at the first order. Our next result proves a sharp transition for the infinitesimal distribution around this rate.
Theorem 1.17.
Let be a periodically banded GUE matrix of band width . Then for any ,
[TABLE]
where is the th Catalan number, , and for . In particular, if , then the type distribution of exists and agrees with that of a usual GUE matrix .
The numbers correspond to sums of volumes of regions cut out of a hypercube and satisfy
[TABLE]
Thus, a solution to the signed moment problem defined by the sequence
[TABLE]
would necessarily be unique; however, we do not prove existence. Nevertheless, given a finite limit for the infinitesimal distribution, we can consider the question of finite-rank perturbations.
Theorem 1.18.
Let be a periodically banded GUE matrix of band width such that or . Then for any fixed , the matrices , , and are asymptotically infinitesimally free.
For band widths $b_N \gg \sqrt{N}$, this allows us to repeat the calculation of Corollary 1.11. In particular, we find outliers at the classical positions from the deformed Wigner ensemble.
Corollary 1.19.
For $b_N \gg \sqrt{N}$, the type $B$ distribution of the deformed RBM
[TABLE]
is given by
[TABLE]
where is as in Corollary 1.11.
Remark 1.20.
A solution to the signed moment problem at the rate $\sqrt{N}$ would allow us to deduce the type $B$ distribution of the corresponding deformed model: one simply needs to add the hypothetical signed measure to the infinitesimal distribution in Corollary 1.19.
In this article, we consider the BBP transition for random band matrices exclusively within the infinitesimal framework. Naturally, one can ask if the usual form of these results holds, namely, convergence in probability of the extremal eigenvalues and convergence in distribution of the fluctuations. This will be the subject of future work. In the next section, we record the outcome of numerical simulations for various band widths. Notably, the data suggest that the position of the outliers and their fluctuations extend below the rate $\sqrt{N}$.
2. Numerical simulations
We consider the fluctuations of the largest eigenvalue under both localized and delocalized perturbations of our model separately. In particular, we record the data
[TABLE]
for 5000 realizations of the matrix , where , , and . The peculiar choice of dimension allows for the precise band widths and . For reference, we also consider the band width , in which case reduces to the usual GUE and by a result of Péché [Péc06]. We emphasize the difference in scaling between and . Indeed, the data strongly suggest that we still have the convergence under the respective normalizations (even at the rate ). The simulations were performed in Julia [BEKS17] and the data plotted using Gadfly [JAN+18].
The scaling in should come as no surprise. To see this, note that the periodic band width structure in some sense reduces the trace expansion at each entry locally to that of a matrix. So, heuristically, we think of as a perturbation of . On the other hand, in the case of , adding forces us to consider the entire matrix, removing any notion of homogeneity. Moreover, the entries of this perturbation come in at a different scale than our matrix entries . We can still make (non-rigorous) sense of the scaling in by considering the trace expansion as a choice at each entry between the original matrix and the perturbation , where the first option is available iff . But this precisely balances with the normalization of the entries in , and so the scaling should follow the usual case of .
For (undeformed) random band matrices, Sodin proved that the extremal eigenvalues converge to the edge of the support for band widths with the fluctuations exhibiting a crossover at the rate [Sod10]. The simulations do not support the idea of a similar crossover for the deformed model, suggesting that the perturbations regularize the fluctuations of the extremal eigenvalues.
3. The infinitesimal distribution of a random band matrix
For convenience, we fix the variance in this section: the general result follows from a simple scaling. Section 3.1 proves the existence of an infinitesimal distribution for a periodically banded GUE matrix in the regime $b_N = \Omega(\sqrt{N})$ using a band variant of the genus expansion. Section 3.2 then proves the asymptotic infinitesimal freeness of our model from the matrix units and the normalized all-ones matrix, allowing us to carry out the advertised type $B$ free convolution calculation.
3.1. A band variant of the genus expansion
We consider traces in powers of our matrix $\Xi_N$. To begin, note that
[TABLE]
This follows from the usual symmetry argument, which still holds even in the presence of the band width condition. We turn our attention to the even powers, where we must now account for the band width explicitly:
[TABLE]
where . Using Wick’s formula, we obtain the expansion
[TABLE]
where . Here, we consider a pair partition as a -permutation when computing the composition . Interchanging the sums, we arrive at the expression
[TABLE]
where
[TABLE]
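To make the role of the composition of the long cycle with a pair partition concrete, here is a small enumeration sketch. It assumes 0-based labels $\{0, \ldots, 2k-1\}$, the long cycle $\gamma(s) = s + 1 \bmod 2k$, and the standard relation $\#(\gamma\pi) = k + 1 - 2g$ between the number of cycles and the genus $g$ of a pairing $\pi$. The helper names are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterator, List, Tuple

Pairing = List[Tuple[int, int]]


def pair_partitions(elements: List[int]) -> Iterator[Pairing]:
    """Yield every pair partition of a list with an even number of elements."""
    if not elements:
        yield []
        return
    first, rest = elements[0], elements[1:]
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in pair_partitions(remaining):
            yield [(first, partner)] + tail


def count_cycles_gamma_pi(pairing: Pairing, k: int) -> int:
    """Number of cycles of s -> gamma(pi(s)), where gamma(s) = s + 1 mod 2k."""
    pi = {}
    for s, t in pairing:
        pi[s], pi[t] = t, s
    seen = set()
    cycles = 0
    for start in range(2 * k):
        if start in seen:
            continue
        cycles += 1
        s = start
        while s not in seen:
            seen.add(s)
            s = (pi[s] + 1) % (2 * k)
    return cycles


def genus_counts(k: int) -> Dict[int, int]:
    """Count pairings of 2k points by genus g, using #cycles = k + 1 - 2g."""
    counts: Counter = Counter()
    for pairing in pair_partitions(list(range(2 * k))):
        g, remainder = divmod(k + 1 - count_cycles_gamma_pi(pairing, k), 2)
        assert remainder == 0  # parity is forced by the genus relation
        counts[g] += 1
    return dict(counts)


if __name__ == "__main__":
    for k in range(1, 5):
        print(k, genus_counts(k))
```

The genus zero counts produced this way are the Catalan numbers, which is the combinatorial source of the semicircle law.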
Note that we have the simple upper bound
[TABLE]
where denotes the number of cycles of . Indeed, starting with an arbitrary cycle of , say the cycle that contains 1, we have choices for the common index of the elements in this cycle. After making this choice, we must then choose the indices of the remaining cycles to satisfy the band width condition , for which there are at most choices at each step. In general, this upper bound is strict: by the time you arrive to choose the index of a cycle of , you might have fewer than choices if the cycle is neighboring two cycles whose indices have already been chosen. As an example, take and . In this case, . Suppose that we pick the indices and for the cycles and respectively. Then the index of the cycle must satisfy both
[TABLE]
If we assume that , then we only have choices for .
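The gap between the greedy upper bound and the true count can be checked by brute force for small sizes. The sketch below is an illustration under assumptions, not the paper's computation: labels are 0-based, the long cycle is $\gamma(s) = s + 1 \bmod 2k$, a labeling assigns one index in $\{0, \ldots, N-1\}$ to each cycle of $\gamma\pi$, and cycles containing adjacent elements $s$ and $\gamma(s)$ must lie within periodic distance $b$. The upper bound is taken to be $N \min\{2b+1, N\}^{\#(\gamma\pi) - 1}$, and the sample pairings are hypothetical choices.

```python
from itertools import product
from typing import List, Tuple

Pairing = List[Tuple[int, int]]


def periodic_distance(i: int, j: int, n: int) -> int:
    d = abs(i - j) % n
    return min(d, n - d)


def cycle_labels(pairing: Pairing, k: int) -> Tuple[List[int], int]:
    """Label each point of {0,...,2k-1} by the cycle of s -> pi(s) + 1 mod 2k containing it."""
    pi = {}
    for s, t in pairing:
        pi[s], pi[t] = t, s
    label = [-1] * (2 * k)
    num_cycles = 0
    for start in range(2 * k):
        if label[start] != -1:
            continue
        s = start
        while label[s] == -1:
            label[s] = num_cycles
            s = (pi[s] + 1) % (2 * k)
        num_cycles += 1
    return label, num_cycles


def count_admissible(pairing: Pairing, k: int, n: int, b: int) -> int:
    """Brute-force count of cycle labelings that satisfy every band constraint."""
    label, num_cycles = cycle_labels(pairing, k)
    constraints = {(label[s], label[(s + 1) % (2 * k)]) for s in range(2 * k)}
    return sum(
        all(periodic_distance(phi[u], phi[v], n) <= b for u, v in constraints)
        for phi in product(range(n), repeat=num_cycles)
    )


def greedy_upper_bound(num_cycles: int, n: int, b: int) -> int:
    """Assumed form of the bound: n choices for one cycle, then min(2b+1, n) per remaining cycle."""
    return n * min(2 * b + 1, n) ** (num_cycles - 1)


if __name__ == "__main__":
    n, b = 7, 1
    tree_like = [(0, 1), (2, 3)]                   # three cycles, constraints form a path
    triangle = [(0, 5), (1, 4), (2, 6), (3, 7)]    # three cycles, pairwise constrained
    print(count_admissible(tree_like, 2, n, b), greedy_upper_bound(3, n, b))
    print(count_admissible(triangle, 4, n, b), greedy_upper_bound(3, n, b))
```

In the second pairing the three cycles constrain each other pairwise, so the last index has fewer than $2b+1$ choices whenever the first two were placed at distance exactly $b$, and the count falls strictly below the bound.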
We quickly see the problem. By using up all of our leeway, we could potentially leave the indices too far apart to meet up again. For a simple parallel, consider placing three points in such that any pair of points must be within unit distance of each other. Choosing the first point arbitrarily, say at the origin, and placing at , we can no longer place at an arbitrary point in the unit circle. This analogy gives us a lower bound for our original problem. If we instead divide our leeway by and pick the indices of the successive cycles arbitrarily at periodic distance less than or equal to this quotient, then we will stay within the permitted region (essentially just the triangle inequality). Thus,
[TABLE]
We define a graph to keep track of the constraints on induced by cycles of with adjacent elements . Let be the directed cycle graph on the vertices with edges in the direction . We equate the map with a labeling of the vertices in the obvious way. The edges then indicate the band width constraint by virtue of the equivalence
[TABLE]
At the moment, the direction of the edges does not play a role.
For a pair partition , we define as the directed multigraph obtained from by identifying the vertices according to the blocks of as follows: if is a block of , then we identify the source of the edge with the target of the edge (so ) and the source of the edge with the target of the edge (so ). In other words, for each block , we overlay the edges and head-to-tail. The vertices in the graph correspond to the cycles of with the edges indicating a constraint on the labels of the cycles induced by the constraint on the labels of the vertices. Note that the graph might have loops. Of course, the constraint from a loop is vacuous, nor do the multiplicity/direction of the edges indicate any additional constraint at the level of the cycles of . So, we define as the underlying simple graph.
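The quotient construction just described can be sketched as follows, again under assumed conventions (0-based labels, long cycle $\gamma(s) = s + 1 \bmod 2k$, vertices given by the cycles of $\gamma\pi$, and one edge per step $s \to \gamma(s)$). The function names are hypothetical; the code records loops and edge multiplicities separately so that the underlying simple graph and the double tree property can both be inspected.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Pairing = List[Tuple[int, int]]


def cycle_labels(pairing: Pairing, k: int) -> Tuple[List[int], int]:
    """Label each point of {0,...,2k-1} by the cycle of s -> pi(s) + 1 mod 2k containing it."""
    pi = {}
    for s, t in pairing:
        pi[s], pi[t] = t, s
    label = [-1] * (2 * k)
    num_cycles = 0
    for start in range(2 * k):
        if label[start] != -1:
            continue
        s = start
        while label[s] == -1:
            label[s] = num_cycles
            s = (pi[s] + 1) % (2 * k)
        num_cycles += 1
    return label, num_cycles


def quotient_graph(pairing: Pairing, k: int) -> Tuple[int, Counter, int]:
    """Return (number of vertices, multiplicity of each non-loop edge, number of loops)."""
    label, num_vertices = cycle_labels(pairing, k)
    multiplicity: Counter = Counter()
    loops = 0
    for s in range(2 * k):
        u, v = label[s], label[(s + 1) % (2 * k)]
        if u == v:
            loops += 1
        else:
            multiplicity[frozenset((u, v))] += 1
    return num_vertices, multiplicity, loops


def is_tree(num_vertices: int, edges: Iterable[frozenset]) -> bool:
    """A simple graph on vertices 0..num_vertices-1 is a tree iff connected with |V| - 1 edges."""
    edges = list(edges)
    if len(edges) != num_vertices - 1:
        return False
    adjacency = {v: set() for v in range(num_vertices)}
    for edge in edges:
        u, v = tuple(edge)
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen, stack = {0}, [0]
    while stack:
        for w in adjacency[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == num_vertices


def is_double_tree(pairing: Pairing, k: int) -> bool:
    """No loops, simple graph is a tree, and every edge has multiplicity two."""
    num_vertices, multiplicity, loops = quotient_graph(pairing, k)
    return (
        loops == 0
        and is_tree(num_vertices, multiplicity.keys())
        and all(m == 2 for m in multiplicity.values())
    )


if __name__ == "__main__":
    print(is_double_tree([(0, 1), (2, 3), (4, 5)], 3))   # non-crossing pairing
    print(quotient_graph([(0, 3), (1, 4), (2, 5)], 3))    # genus one pairing
```

In small cases the non-crossing pairings give double trees, while a higher genus pairing can still have a tree as its underlying simple graph with edges of higher multiplicity.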
At the same time, we know that
[TABLE]
where
[TABLE]
by a result of Biane [Bia97]. In particular, if , then the graph is a double tree in the sense of Male [Mal]. By this, we mean that has no loops and is a tree such that the multiplicity of each edge in is two (so-called twin edges). To see this, note that the graph is connected with , which implies that . At the same time, is obtained from by overlaying pairs of edges, whence . Thus,
[TABLE]
as was to be shown. In the case of a tree , we do not run into a problem when choosing the vertices greedily using the entire leeway at each step, and so the upper bound for in (6) becomes an equality. In other words,
[TABLE]
Indeed, recall that our earlier counterexample
[TABLE]
Let . Applying (7) to our earlier (4) and rearranging, we obtain
[TABLE]
Using our bounds (6) for , we see that
[TABLE]
Since , we also have the lower bound
[TABLE]
At this point, we use the genus expansion to count the number of pair partitions that contribute to a given exponent appearing in the summands of our bounds. Altogether, this allows us to write
[TABLE]
where . Naturally, this calculation recovers the semicircle law. To see this, we simply take the normalized limit
[TABLE]
however, without this normalization, the ratio arises in (8) as the leading order term. In particular, if , then and
[TABLE]
In this case, the infinitesimal distribution of a periodically banded GUE matrix is null, which matches the calculation for the usual GUE. On the other hand, if , then the lower bound in (8) implies that
[TABLE]
Finally, in the intermediate regime , we see that
[TABLE]
where and
[TABLE]
the equality (resp., asymptotic) following from the three-term recurrence of Harer and Zagier [HZ86] (resp., Stirling’s formula).
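For reference, the three-term recurrence of Harer and Zagier can be run directly. The sketch below uses the form $(k+1)\,\varepsilon_g(k) = 2(2k-1)\,\varepsilon_g(k-1) + (2k-1)(k-1)(2k-3)\,\varepsilon_{g-1}(k-2)$, where $\varepsilon_g(k)$ counts the pairings of $2k$ points of genus $g$; the symbol $\varepsilon_g(k)$ and the function name are notational assumptions made here and need not match the lost displays.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def epsilon(g: int, k: int) -> int:
    """Number of genus-g pairings of 2k points via the Harer-Zagier recurrence."""
    if g < 0 or k < 0:
        return 0
    if k == 0:
        return 1 if g == 0 else 0
    numerator = (
        2 * (2 * k - 1) * epsilon(g, k - 1)
        + (2 * k - 1) * (k - 1) * (2 * k - 3) * epsilon(g - 1, k - 2)
    )
    quotient, remainder = divmod(numerator, k + 1)
    assert remainder == 0  # the recurrence always yields an integer
    return quotient


def double_factorial_odd(k: int) -> int:
    """(2k - 1)!!, the total number of pairings of 2k points."""
    result = 1
    for j in range(1, 2 * k, 2):
        result *= j
    return result


if __name__ == "__main__":
    for k in range(1, 7):
        row = [epsilon(g, k) for g in range(k // 2 + 1)]
        print(k, row, sum(row) == double_factorial_odd(k))
```

The genus zero column reproduces the Catalan numbers, and the genus one column matches the closed form $\frac{(2k)!}{12 \, k! \, (k-2)!}$ for $k \geq 2$.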
Of course, one expects to be able to say more than just (9), namely, that a limit exists in the intermediate regime (and hopefully with some nice formula or interpretation). This amounts to calculating
[TABLE]
for such that ; however, the value of this limit crucially depends on the particular geometry of . For example, consider the pair partitions
[TABLE]
Then , and so . Going through the graph construction, we see that is a tree, which means that attains the upper bound. On the other hand, is the undirected cycle graph , which means that for some .
In fact, the graph contains precisely the information we need to compute the limit. To see this, note that we can rewrite (5) as
[TABLE]
Indeed, the vertices of correspond to the cycles of by construction and the edges function precisely to keep track of the band width constraint. This suggests computing the limit of as an integral over the hypercube after scaling by (for example, as in [Au18]); however, in this case, the band width and (recall that ). So, the integral interpretation must take into account the vanishing scale of the mesh size and the difference in the scaling exponent . To accomplish this, we use the fact that the periodic band width structure implies a certain homogeneity in our choice of admissible maps . In particular, fixing a vertex , we see that
[TABLE]
So, we consider the equivalent expression
[TABLE]
where
[TABLE]
This reduces the problem to computing
[TABLE]
which we can interpret using (11). After fixing the label , we must choose the labels of the remaining vertices according to the band width constraint. The exponent of the normalization now matches the remaining degrees of freedom, and the base matches the maximum step size .
The remaining issue concerns the region of integration. After the normalization, the step size becomes a single unit length. To ensure that the integral captures the full range of possibilities, we must choose a hypercube of appropriate side length. If , then the image will necessarily be disjoint from . So, the hypercube will ensure that the region of integration is sufficiently large.
We can now give the integral representation for our limit. For , we define an integral associated to the graph as follows. Pick an arbitrary vertex , and let be the set of edges adjacent to . We write for the remaining edges. By construction, the integral
[TABLE]
then satisfies
[TABLE]
where
[TABLE]
For , we set by convention, in which case the integral reduces to for the only genus one partition . As an example, consider the case of , , and . Then
[TABLE]
One can easily verify that this agrees with the direct calculation
[TABLE]
Thus, for , we see that
[TABLE]
Altogether, this proves Theorem 1.17.
We use our earlier bound (9) and the asymptotic (10) for to see that
[TABLE]
At the same time,
[TABLE]
where . Note that is “one-crossing”. In particular, since . Moreover, the corresponding graph is the star graph with vertices. Since is a tree, we know that , whence
[TABLE]
which proves (3). The hypothetical signed measure associated to the infinitesimal distribution in the intermediate regime then satisfies
[TABLE]
In fact, we expect that , but we do not prove this here.
Remark 3.1.
Naturally, one can ask about the joint infinitesimal distribution of independent periodically banded GUE matrices . To answer this question, we partition the index set according to the rates , where
[TABLE]
Repeating our banded genus expansion for a mixed trace in shows that the joint infinitesimal distribution of is null, which implies that the are asymptotically infinitesimally free. Similarly, any mixed trace in the two families and vanishes in the limit, which implies that and are asymptotically infinitesimally free as well.
On the other hand, the same calculation shows that the are not asymptotically infinitesimally free. For example, if , then
[TABLE]
whereas asymptotic infinitesimal freeness would insist that this limit be zero. Note that our integral interpretation still holds and prescribes a rule for computing the infinitesimal distribution in this case. To account for the possibly different ratios , we must adjust both the integrand and the region of integration via a straightforward combination of the ideas above and [Au18, 4.3]. We leave the details to the interested reader.
Note that the infinitesimal calculation is very specific (GUE and periodically banded) and cannot be extended to regular band matrices
[TABLE]
In particular, let now denote the banded GUE matrix constructed with as above. For , we know that still converges weakly almost surely to the semicircle distribution [BMP91]. A simple calculation shows that
[TABLE]
In this case,
[TABLE]
and so an infinitesimal distribution does not exist for any such band width.
3.2. Finite-rank perturbations
We now consider the multi-matrix model
[TABLE]
Most of our calculations in this section remain valid in a more general setting. In particular, we extend our definition of
[TABLE]
to independent unnormalized Wigner matrices of the form
[TABLE]
where the band widths satisfy
[TABLE]
Dykema proved that the family converges in distribution to a semicircular system [Dyk93]. We generalized this result to the family in [Au18] (recall that if , then ). For concreteness, we write for this limiting distribution, where
[TABLE]
The results in [Au18] as well as the remainder of this section make use of the traffic probability framework [Mal], which we briefly review.
Definition 3.2 (Traffic probability).
By a multidigraph , we mean a non-empty set of vertices , a set of edges , and a pair of functions specifying the source and target of each edge. A test graph is a finite multidigraph with edge labels . For a partition , we define as the test graph obtained from by identifying the vertices of according to blocks of . Formally, we construct as
- (i) and ;
- (ii) and ;
- (iii) .
Since , we often omit the superscript and use the same notation for the edge set of the quotient . We write for the set of all test graphs in and for the complex vector space spanned by .
We define the traffic state as the unique linear functional
[TABLE]
where denotes the number of connected components of . For convenience, we abbreviate as . Similarly, we define the injective traffic state as the unique linear functional
[TABLE]
Henceforth, we use the notation to indicate an injective map. The functionals and satisfy the relations
[TABLE]
where denotes the singleton partition and Möb is the usual Möbius function on the poset of partitions.
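The relation between the two functionals rests on a purely combinatorial fact: a sum over all vertex labelings splits according to which vertices share a label, that is, into sums over injective labelings of the quotient graphs. The sketch below checks this for a small deterministic matrix. It works with unnormalized sums and assumes the convention that an edge from `src` to `tar` contributes the entry `A[phi(tar), phi(src)]`; both choices, along with the function names, are assumptions made for illustration and may differ from the paper's normalization.

```python
import itertools

import numpy as np


def set_partitions(items):
    """Yield every partition of the list `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for partition in set_partitions(rest):
        yield [[first]] + partition
        for i, block in enumerate(partition):
            yield partition[:i] + [[first] + block] + partition[i + 1:]


def graph_sum(num_vertices, edges, matrices, injective):
    """Unnormalized sum over (injective) labelings of the product of edge entries."""
    n = next(iter(matrices.values())).shape[0]
    if injective:
        maps = itertools.permutations(range(n), num_vertices)
    else:
        maps = itertools.product(range(n), repeat=num_vertices)
    total = 0
    for phi in maps:
        term = 1
        for src, tar, label in edges:
            term *= int(matrices[label][phi[tar], phi[src]])
        total += term
    return total


def quotient(edges, partition):
    """Relabel the edges after identifying vertices that lie in the same block."""
    block_of = {v: i for i, block in enumerate(partition) for v in block}
    return [(block_of[s], block_of[t], label) for s, t, label in edges]


A = np.arange(16).reshape(4, 4)
cycle = [(0, 1, "a"), (1, 2, "a"), (2, 0, "a")]  # directed 3-cycle
full = graph_sum(3, cycle, {"a": A}, injective=False)
injective_total = sum(
    graph_sum(len(p), quotient(cycle, p), {"a": A}, injective=True)
    for p in set_partitions([0, 1, 2])
)
```

For the directed 3-cycle, the full sum is the unnormalized trace of A^3, and it agrees with the total of the injective sums over the five partitions of the vertex set.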
Example 3.3.
Let be a monomial . Then
[TABLE]
where
[TABLE]
We can now prove the following generalization of Lemma 3.2 in [Shl18].
Lemma 3.4.
For any NC polynomials ,
[TABLE]
where .
Proof.
Note that we can rewrite the desired trace as
[TABLE]
which reduces the problem to computing
[TABLE]
Furthermore, by linearity, it suffices to prove the result for monomials . For concreteness, we write
[TABLE]
where . To convert this to the traffic notation, let be the test graph
[TABLE]
where , , , and . We define as the disjoint union of the , in which case
[TABLE]
where we recall that . Note that the inner sum on the previous line might be empty: for example, if for some . Conversely, if for some , then we must have . Thus, taking into account the various indices, we can restrict the outer summation over to
[TABLE]
We now analyze the contribution from a quotient . First, we decompose into its connected components , where the notation is meant to distinguish between the test graphs and . For an injective map , the independence of our matrix entries implies that
[TABLE]
Let us then focus on a connected component . We define as the set of loop edges of , which divides . As before, we write for the underlying simple graph. For an edge , we define
[TABLE]
Naturally, we can think of . By a slight abuse of notation, we also write . Separating the normalization
[TABLE]
we can again use the injectivity of to decompose the remaining expectation as
[TABLE]
The asymptotic follows from our strong moment assumption (12), which bounds the contribution from such a term uniformly in and , where is the total degree of our monomials . Strictly speaking, the asymptotic depends on both and the finite set , but both are fixed independent of by our monomials . For convenience, we omit this last detail from the notation.
We would then like to bound the number of injective maps that actually contribute (i.e., the number of such that the term in (16) is non-zero). Note that since the off-diagonal entries of our matrices are centered, we can assume that
[TABLE]
otherwise, one of the factors in the product above vanishes. Of course, the graph is still connected with the same vertex set as , whence
[TABLE]
We must also remember to include the band width constraint in our bound. In particular, a contributing map satisfies
[TABLE]
We introduce some notation for the set of admissible maps
[TABLE]
Similarly, we define
[TABLE]
Note that
[TABLE]
Consider a spanning tree of . We think of a spanning tree as recording a minimal working subset of the band width constraints. In particular, bounds the number of contributing maps by
[TABLE]
To see this, pick an arbitrary initial vertex of . Clearly, we have options for at this stage. The bound then follows from walking through the rest of our graph while satisfying the band width constraints imposed by the edges . Note that this fails to account for the special vertices and , which have fixed labels and respectively. In particular, each connected component has at least one such special vertex. Choosing this special vertex to be the initial vertex removes the factor of in our earlier bound, and so
[TABLE]
where the asymptotic follows from (17). In view of (15)-(21), we conclude that
[TABLE]
Altogether, our analysis implies that the expectation survives the normalization, but only just barely. Indeed, in formulating the bound (21), we only considered one of the special vertices despite the fact that each test graph has two such vertices before the identifications by . Assume then that for some . In this case, are distinct vertices in some connected component . As a result, we lose an additional degree of freedom when choosing a contributing map . To see this, we return to our spanning tree . We denote the last edge on the unique path from to in by . Running through the same argument as before with as the initial vertex now gives the improved bound
[TABLE]
where the asymptotic follows from (13). Putting this back into (22) proves that
[TABLE]
Let us now assume that for every . A partition then necessarily identifies for every . Furthermore, if for some , then must also identify . We imagine making these identifications first before carrying out the rest of the identifications prescribed by . At the first step, this corresponds to identifying the ends of the test graph (14), creating a directed cycle with a special vertex that we can think of as a root. We then identify the roots of different cycles if . It will be convenient to redefine (14) to account for this first step beforehand, namely
[TABLE]
Now, suppose that identifies vertices across different cycles:
[TABLE]
If , then we claim that
[TABLE]
To see this, let denote the connected component of that contains the vertex . By assumption, also contains the vertex . Our earlier work shows that satisfies the asymptotic (23) since the component has two special vertices to account for, which proves (25).
If , then we claim that (25) holds if . To see this, let be as before. If , then are distinct vertices in . In particular, there are four edge-disjoint paths from to . Indeed, there are two edge-disjoint paths from to using only the edges of , and there are two edge-disjoint paths from to using only the edges of . Thus, any spanning tree of will necessarily omit (at least) one of the total edges from these paths. In view of (17) and (13), we conclude that
[TABLE]
which again proves (25).
Thus, we are left to consider partitions
[TABLE]
where
[TABLE]
Note that factorizes into partitions of the test graphs (24) via the bijection
[TABLE]
where is the partition obtained from by first taking the disjoint union of the blocks of the and then identifying the vertices that satisfy . Of course, the resulting quotient test graph might have fewer than connected components; however, the defining property (26) of implies that can be obtained as follows: first, let be the factorization of as above. Next, apply the partitions to obtain the quotient test graphs . Finally, in the disjoint union of the , identify the vertices and if . The injectivity of the maps then implies that
[TABLE]
We would also like to factorize the set
[TABLE]
where
[TABLE]
however, in general, this map is not bijective since for . Nevertheless, we do have the asymptotic equality
[TABLE]
The contributions from the additional terms counted by the maps in can still be bounded uniformly via (16). In view of (27), this implies that such overcounting will not affect our calculations in the limit. In other words,
[TABLE]
So, we will be done if we can prove that
[TABLE]
The main result in [Au18] implies that
[TABLE]
where a colored double tree is a double tree whose twin edges each have the same color and
[TABLE]
In short, this follows from (17), (18), and the spanning tree argument. Similarly, we can restrict the outer sum in (28) to the same class of partitions . Note that for large , the periodicity of the band width condition implies that
[TABLE]
We use the fact that a quotient of a directed cycle is a double tree only if each of its twin edges go in opposite directions and [Au18, Figure 5]. In that case,
[TABLE]
since the calculations in the expectation only involve variances (as opposed to pseudo-variances). This homogeneity allows us to conclude that averaging over the labels does not affect the calculation. Consequently,
[TABLE]
as was to be shown. ∎
Assuming an infinitesimal distribution for the family , Lemma 3.4 proves that and are asymptotically infinitesimally free. Indeed, this follows from a straightforward application of the following criterion for infinitesimal freeness.
Proposition 3.5 ([Shl18]).
Let be a tracial infinitesimal NC probability space. Suppose that and are subalgebras of such that (in particular, is non-unital). Then and are infinitesimally free iff for any -tuples and , we have the identities
- (i) ;
- (ii) .
For example, this proves a preliminary version of Theorem 1.13 (resp., Theorem 1.18) restricted to the matrices (resp., ) and . We now extend the calculation to include the matrix . For this, we will need the following lemma concerning the formation of double trees as quotients of paths.
Lemma 3.6.
Let be a path graph of length , where
[TABLE]
If is such that is a double tree, then .
Proof.
Since a double tree has an even number of edges, we only need to prove the result for even values of . We proceed by induction on the length of the path. If , then the statement follows. So, assume the result is true for paths of length , and consider . If is a double tree, then it must identify with another vertex for some . Indeed, this follows from the fact that the degree of every vertex in a double tree is even. The edges then form a trail in starting and ending at the same vertex . This implies that the subgraph spanned by these edges is also a double tree. Since the remaining edges span a connected subgraph of , the fact that is a double tree implies that is a double tree as well. We can then apply the induction hypothesis to conclude that . ∎
We use this to prove the analogue of Lemma 3.4 for .
Lemma 3.7.
For any NC polynomials ,
[TABLE]
Proof.
We carry forward the notation from the proof of Lemma 3.4. In particular, restricting to monomials , we write for the test graphs (14); for their disjoint union; and for the connected components of a quotient . We redefine the set of admissible maps since we no longer have special vertices with fixed labels to account for, namely
[TABLE]
We still have the bounds (19) and (20), which imply the following analogue of (22):
[TABLE]
Thus, we can restrict to partitions such that has exactly connected components. Of course, since already has connected components, this means that we are simply considering the disjoint union of partitions for . As before, even though , the fact that
[TABLE]
allows us to factor
[TABLE]
So, we will be done if we can prove that
[TABLE]
but this follows from Lemma 3.6 and Example 3.3 (recall that [Au18] allows us to restrict to such that is a colored double tree). ∎
As in the case of the matrix units, assuming an infinitesimal distribution for , Lemma 3.7 proves that and are asymptotically infinitesimally free. To complete the proof of Theorems 1.13 and 1.18, we turn our attention to the non-unital algebra generated by and .
Lemma 3.8.
The algebra is spanned by elements of the form
- (i) , where ;
- (ii) , where ;
- (iii) $\big[\prod_{s=1}^{t}\big(\mathbf{E}_{N}^{(j_{s-1},k_{s})}\mathbf{K}_{N}\big)\big]\mathbf{E}_{N}^{(j_{t},k_{t+1})}=\frac{1}{N^{t}}\mathbf{E}_{N}^{(j_{0},k_{t+1})}$, where ;
- (iv) $\big[\prod_{s=1}^{t}\big(\mathbf{K}_{N}\mathbf{E}_{N}^{(j_{s-1},k_{s})}\big)\big]\mathbf{K}_{N}=\frac{1}{N^{t}}\mathbf{K}_{N}$, where .
Proof.
The result follows from a simple computation using the fact that is idempotent and the identity . ∎
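The two product identities in items (iii) and (iv) are easy to confirm numerically. The following sketch uses NumPy with 0-based indices; the helper names `matrix_unit` and `normalized_all_ones` are hypothetical, and the check is a spot test for one choice of indices rather than a proof.

```python
from functools import reduce

import numpy as np


def matrix_unit(n, j, k):
    """The n x n matrix unit with a single 1 in position (j, k)."""
    e = np.zeros((n, n))
    e[j, k] = 1.0
    return e


def normalized_all_ones(n):
    """The n x n matrix with every entry equal to 1/n (a rank-one projection)."""
    return np.full((n, n), 1.0 / n)


n, t = 5, 2
K = normalized_all_ones(n)
j = [0, 3, 1]      # j_0, ..., j_t
k = [None, 2, 4, 3]  # k_1, ..., k_{t+1} (index 0 unused)

# Item (iii): [prod_s E^{(j_{s-1}, k_s)} K] E^{(j_t, k_{t+1})}
factors_iii = []
for s in range(1, t + 1):
    factors_iii += [matrix_unit(n, j[s - 1], k[s]), K]
factors_iii.append(matrix_unit(n, j[t], k[t + 1]))
lhs_iii = reduce(np.matmul, factors_iii)
rhs_iii = matrix_unit(n, j[0], k[t + 1]) / n**t

# Item (iv): [prod_s K E^{(j_{s-1}, k_s)}] K
factors_iv = []
for s in range(1, t + 1):
    factors_iv += [K, matrix_unit(n, j[s - 1], k[s])]
factors_iv.append(K)
lhs_iv = reduce(np.matmul, factors_iv)
rhs_iv = K / n**t
```

Both sides agree up to floating-point error, reflecting the facts that the normalized all-ones matrix is idempotent and that sandwiching it between two matrix units produces a single matrix unit scaled by 1/N.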
Corollary 3.9.
 and are asymptotically infinitesimally free.
Proof.
The characterization of in Lemma 3.8 implies that
[TABLE]
Once again, a straightforward application of Proposition 3.5 proves the result. ∎
We adopt the notation to characterize the type B distribution of . The following result implies that the only non-trivial values of have already been computed in Lemmas 3.4 and 3.7.
Lemma 3.10.
For any NC monomials and ,
[TABLE]
Otherwise, for each , in which case and
[TABLE]
where . In particular, since the index [math] only comes in pairs , the limit vanishes if there exist such that and .
Proof.
The proof of (30) will follow from our analysis of (31), which we prove first. By our earlier work, we need only to consider the case of such that and . Moreover, the cyclic invariance of the trace allows us to assume that this occurs precisely at the values
[TABLE]
We think of each occurrence of as providing its normalization to the test graph associated to . Similarly, each occurrence of a matrix unit creates a special vertex in each of the test graphs and .
To adapt our earlier work, we define
[TABLE]
where . Similarly, we redefine
[TABLE]
Note that each connected component of satisfies at least one of the following conditions:
- (i) has at least one special vertex with a fixed label, in which case we can apply (21);
- (ii) contains the edges of a test graph that has been assigned the normalization of its adjacent term , in which case we can apply (20),
where the number of connected components of type (ii) satisfies . In particular, the connected component that contains the edges of will satisfy both of these conditions. Indeed, this follows from (32). Rearranging, we can count the connected components of type (ii) first, namely, with . The analogue of (22) and (29) in this case then follows:
[TABLE]
which proves (31).
To prove (30), we use the characterization of a monomial given in Lemma 3.8. In particular, we imagine replacing the terms in the trace
[TABLE]
according to the following scheme:
- (a) if is of the form (i) or (ii), then we replace with ;
- (b) if is of the form (iii), then we replace with the corresponding matrix unit without the factor of ;
- (c) if is of the form (iv), then we replace with without the factor of .
After this procedure, our work above shows that the resulting trace satisfies
[TABLE]
however, based on our analysis of , the original trace then necessarily satisfies
[TABLE]
Indeed, consider the following interpretation of the replacement scheme. If the original term is of type (i) (resp., type (ii)), then it creates a special vertex in the test graph (resp., ) and contributes a factor of to the test graph , where . In contrast, its replacement only contributes a factor of to . Similarly, if the original term is of type (iii) or type (iv), then its replacement simply drops the factor of , where . In any case, since , we know that a replacement of type (a), (b), or (c) occurs with . Our work in establishing (33) then proves (34). The result now follows. ∎
Corollary 3.11.
Assume that the family has an infinitesimal distribution. Then the matrices , , and are asymptotically infinitesimally free.
Proof.
Under the assumption for , we already know that each pair of the families , , and is asymptotically infinitesimally free by Lemmas 3.4 and 3.7 and Corollary 3.9. So, we will be done if we can prove that and are asymptotically infinitesimally free. Again, this follows from applying the criterion in Proposition 3.5 to Lemma 3.10. ∎
This completes the proof of Theorems 1.13 and 1.18. The type B free convolution calculations in Corollaries 1.14 and 1.19 essentially already appear in [Shl18, §4.1.1], so we do not repeat them.
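The outlier statement can be illustrated numerically. The sketch below is a rough simulation under stated assumptions rather than the paper's construction: it takes a periodically banded GUE matrix to have independent unit-variance Hermitian Gaussian entries within periodic distance b of the diagonal, zeros elsewhere, and normalization by the square root of 2b + 1; it perturbs by θ times the normalized all-ones matrix; and it compares the top eigenvalue with the classical deformed Wigner position θ + 1/θ for θ > 1. The function name, parameter values, and seed are arbitrary choices.

```python
import numpy as np


def periodic_band_gue(n, b, rng):
    """Hermitian Gaussian matrix supported on the periodic band |i - j|_n <= b."""
    g = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h = (g + g.conj().T) / 2  # unit-variance entries, real diagonal
    idx = np.arange(n)
    diff = np.abs(idx[:, None] - idx[None, :])
    mask = np.minimum(diff, n - diff) <= b  # periodic distance to the diagonal
    return h * mask / np.sqrt(2 * b + 1)    # each row carries total variance 1


rng = np.random.default_rng(0)
n, b, theta = 300, 60, 2.0                  # b well above sqrt(n)
X = periodic_band_gue(n, b, rng)
K = np.full((n, n), 1.0 / n)                # normalized all-ones matrix
top = np.linalg.eigvalsh(X + theta * K)[-1]
predicted = theta + 1.0 / theta             # classical outlier position, 2.5 here
```

At this matrix size the largest eigenvalue should sit near 2.5, well separated from the bulk edge at 2, although finite-size fluctuations mean the agreement is only approximate.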
Acknowledgements
The author thanks Paul Bourgade, James Mingo, and Dimitri Shlyakhtenko for helpful conversations. The figures in this article were produced in Inkscape.
References
- [ACD+] Benson Au, Guillaume Cébron, Antoine Dahlqvist, Franck Gabriel, and Camille Male, Large permutation invariant random matrices are asymptotically free over the diagonal, Preprint. https://arxiv.org/abs/1805.07045v1.
- [AGZ10] Greg W. Anderson, Alice Guionnet, and Ofer Zeitouni, An introduction to random matrices, Cambridge Studies in Advanced Mathematics, vol. 118, Cambridge University Press, Cambridge, 2010. MR 2760897 (2011m:60016)
- [Au18] Benson Au, Traffic distributions of random band matrices, Electron. J. Probab. 23 (2018), paper no. 77, 48 pp. MR 3858905
- [BBAP05] Jinho Baik, Gérard Ben Arous, and Sandrine Péché, Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices, Ann. Probab. 33 (2005), no. 5, 1643–1697. MR 2165575
- [BBCF17] Serban T. Belinschi, Hari Bercovici, Mireille Capitaine, and Maxime Février, Outliers in the spectrum of large deformed unitarily invariant models, Ann. Probab. 45 (2017), no. 6A, 3571–3625. MR 3729610
- [BEKS17] Jeff Bezanson, Alan Edelman, Stefan Karpinski, and Viral B. Shah, Julia: a fresh approach to numerical computing, SIAM Rev. 59 (2017), no. 1, 65–98. MR 3605826
- [BGN03] Philippe Biane, Frederick Goodman, and Alexandru Nica, Non-crossing cumulants of type B, Trans. Amer. Math. Soc. 355 (2003), no. 6, 2263–2303. MR 1973990
- [BGN11] Florent Benaych-Georges and Raj Rao Nadakuditi, The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices, Adv. Math. 227 (2011), no. 1, 494–521. MR 2782201
