Large Deviations and the Lukic Conjecture
Jonathan Breuer, Barry Simon, Ofer Zeitouni

Jonathan Breuer (1, 4), Barry Simon (2, 5), and Ofer Zeitouni (3, 6)
Abstract.
We use the large deviation approach to sum rules pioneered by Gamboa, Nagel and Rouault to prove higher order sum rules for orthogonal polynomials on the unit circle. In particular, we prove one half of a conjectured sum rule of Lukic in the case of two singular points, one simple and one double. This is important because it is known that the conjecture of Simon fails in exactly this case, so this paper provides support for the idea that Lukic’s replacement for Simon’s conjecture might be true.
Key words and phrases: sum rules, large deviations, orthogonal polynomials
2010 Mathematics Subject Classification: 60F10, 35P05, 42C05
1 Institute of Mathematics, The Hebrew University, 91904 Jerusalem, Israel. E-mail: [email protected]
2 Departments of Mathematics and Physics, Mathematics 253-37, California Institute of Technology, Pasadena, CA 91125. E-mail: [email protected]
3 Faculty of Mathematics, Weizmann Institute of Science, POB 26, Rehovot 76100, Israel and Courant Institute, NYU. E-mail: [email protected].
4 Research supported in part by the Israel Science Foundation (Grant no. 399/16) and in part by the United States-Israel Binational Science Foundation (Grant No. 2014337).
5 Research supported in part by NSF grant DMS-1265592 and in part by the United States-Israel Binational Science Foundation (Grant No. 2014337).
6 Research supported in part by a grant from the Israel Science Foundation.
1. Introduction
This paper is a contribution to the theory of sum rules in the spectral theory of orthogonal polynomials. The earliest such result is Szegő’s Theorem for orthogonal polynomials on the unit circle (OPUC) in Verblunsky’s form [29] of which we’ll say more soon. The modern theory was initiated by Killip–Simon [15] for orthogonal polynomials on the real line (OPRL) with considerable work by others [6, 12, 16, 17, 18, 19, 20, 27].
Here we’ll consider OPUC. Given a probability measure on , one can form the non–zero (in ), monic orthogonal polynomials where if has exactly points in its support and if has infinitely many points in its support. In the case there are exactly points, one defines to be the unique degree monic polynomial vanishing at all points (so in ). The recursion (aka Verblunsky) coefficients, , are given by the recursion relations, :
[TABLE]
For , (see [22]) and for , only are defined (since is only defined for ) and .
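The recursion can be exercised on a small example. The following is a minimal Python sketch, not the authors' code. It assumes the standard convention Φ_{n+1}(z) = zΦ_n(z) − conj(α_n) Φ_n^*(z), with Φ_n^* the reversed polynomial (the displayed relation is not reproduced above, so this convention is an assumption taken from the usual OPUC literature such as [22]). It also assumes the example measure (1 − cos θ) dθ/2π with Verblunsky coefficients α_n = −1/(n+2). All helper names are hypothetical.

```python
from fractions import Fraction


def szego_step(phi, alpha):
    """One Szego step for real alpha: Phi_{n+1} = z*Phi_n - alpha*Phi_n^*.

    phi lists the coefficients of Phi_n from degree 0 upward. For real
    coefficients, Phi_n^* is the reversed coefficient list (assumed convention).
    """
    shifted = [Fraction(0)] + phi             # coefficients of z * Phi_n
    reversed_phi = phi[::-1] + [Fraction(0)]  # Phi_n^*, padded to degree n+1
    return [s - alpha * r for s, r in zip(shifted, reversed_phi)]


def moment(j):
    """Integral of e^{i j theta} against (1 - cos theta) dtheta / (2 pi)."""
    if j == 0:
        return Fraction(1)
    if abs(j) == 1:
        return Fraction(-1, 2)
    return Fraction(0)


def inner_with_monomial(k, phi):
    """<z^k, Phi> = sum_m Phi_m * moment(m - k) for the example measure."""
    return sum(c * moment(m - k) for m, c in enumerate(phi))


def monic_opuc(n_max):
    """Phi_0, ..., Phi_{n_max} for the assumed alpha_n = -1/(n+2)."""
    polys = [[Fraction(1)]]
    for n in range(n_max):
        alpha = Fraction(-1, n + 2)  # assumption, see the lead-in
        polys.append(szego_step(polys[-1], alpha))
    return polys
```

With exact rational arithmetic one can then confirm that each Φ_n produced this way is monic, is orthogonal to 1, z, ..., z^{n−1}, and has squared norm equal to the product of the factors 1 − |α_j|² for j < n.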
Verblunsky’s Theorem states that there is a one–one correspondence, , from probability measures to Verblunsky coefficients with the above restrictions, i.e. measures with infinite support) and –point measures) . Moreover in the natural topologies, is a homeomorphism.
Szegő’s Theorem in Verblunsky form says that
[TABLE]
where is the Kullback-Leibler (KL) divergence (aka the relative entropy, depending on the sign convention for the relative entropy):
[TABLE]
(1.2) always holds although both sides may be . (The latter is the case if, e.g., since in that case the term in the sum is ). In particular, the condition that both sides are finite at the same time implies
[TABLE]
where
[TABLE]
where is singular w.r.t . Simon [24] calls a result like (1.4) that gives equivalence of spectral data and coefficient data a “spectral theory gem”. (1.4) in particular implies the existence of measures with arbitrarily bad singular part mixed in with a.c. spectrum and with decaying Verblunsky coefficients.
The current paper is devoted to higher order sum rules of which the first is that of Simon [22, Section 2.8]:
[TABLE]
where . This implies the gem
[TABLE]
In the same section, Simon conjectured (wrongly as we’ll see!) that for distinct in and strictly positive integers we have that
[TABLE]
if and only if
[TABLE]
and
[TABLE]
In (1.9), is the operator
[TABLE]
Moreover, Simon–Zlatoš [27] proved this conjecture in case , i.e. and or . For simplicity the remainder of this section will mainly discuss the case although the next two sections will revert to the general case. We’ll use the symbol to describe this case.
In [18], Lukic found a counterexample to this conjecture for the case. He found an explicit example where (S1), (S2) hold but
[TABLE]
To have any hope of an equivalence one needs more than (S1), (S2). Lukic made an improved conjecture that replaced (S1), (S2) by
[TABLE]
[TABLE]
[TABLE]
Lukic also proved a flawed gem, i.e. an equivalence under an a priori condition on the Verblunsky coefficients, that provides evidence for his conjecture. In Section 9 we’ll obtain some additional evidence for the correctness of the Lukic conjecture. In Section 2, we’ll consider equivalent versions of Lukic’s conditions that are directly expressible in terms of without reference to a decomposition as a sum. In a sense, is the part of localized near in Fourier space, so that for the case Lukic’s conditions are equivalent to (the conditions will appear in the next section).
[TABLE]
[TABLE]
[TABLE]
In this case , and is an extra condition. The precise result we’ll prove in Section 9 is that the conditions imply the finiteness of the integral in (1.12) (at least when the s are real).
Recently, Gamboa, Nagel and Rouault [9] (henceforth GNR; see also [10, 11]) discovered a new approach to Szegő’s Theorem (and the Killip–Simon Theorem) using the theory of large deviations (LD). We wrote a pedagogical presentation of some of these ideas [4]. Our main goal in this paper is to use large deviation methods to study higher order sum rules. We note that GNR [11] discussed (1.6) using LD methods although for technical reasons, they were unable to prove the actual sum rules. Below we will assume the reader familiar with some of the basics of LD theory either from books [7, 5] or from our paper [4].
In Section 3, we will prove a sum rule and gem where one side of the gem is the integral in (1.8). In general, the other side of the gem will be a very complicated polynomial in the ’s (with some non–polynomial terms of the form ). This leads to a new insight. The Lukic conjecture (if true) provides much more humane conditions on the ’s than what one gets from the naive sum rule. We suspect that our sum rules are identical to the ones found by Denisov–Kupin [6] who did not carry through the examples of Sections 4-9. Section 4 will use these ideas to get the sum rule (1.6) in a new way.
In the last four sections, we make two simplifying assumptions:
- (A1) when (essentially is what is important) and we’ll also consider the symmetric situation where one has general points symmetrically arranged as the roots of unity with all .
- (A2) for all .
These are mainly to make the sometimes involved calculations simpler. We have no doubt that one can do the calculations without (A2) and suspect one can drop (A1) although with some effort.
In Sections 5 and 6, we recover the Simon–Zlatoš gems (i.e. and ), at least under the assumptions (A1)–(A2). One thing we’ll see in these sections is that it is simpler to show that the conditions on Verblunsky coefficients imply the measure condition than the converse so in the last two sections, we’ll settle for the simpler half. In Section 7, we’ll prove this one direction for equally spaced points, all with order 1, that is we’ll prove that and in Section 8, we discuss an arbitrarily high order single singular point under the hypothesis that . The results in Sections 7 and 8 are not new but recover, using new methods, special cases of results of Golinskii–Zlatoš [12]. In Section 9, we will prove that for the case under (A1)/(A2), the Lukic conditions imply finiteness of the integral. Recall that this is a case where the Simon conditions do not imply finiteness of the integral so we regard this as strong evidence for the Lukic conjecture.
We believe our main results in this paper are the general sum rule and gem and the realization that the Lukic conjecture is just about finding a simpler version of the naive Verblunsky coefficient side. In addition we show how to use LD methods to recover the gems of Simon and Simon–Zlatoš and some results of Golinskii–Zlatoš. Finally, we provide evidence for the general Lukic conjecture by finding a situation where his conditions imply the finiteness of the relevant integral and where Simon’s do not. (Footnote 1: Building on the general sum rules of this paper, Jun Yan developed an algebraic machinery that allowed him to obtain new examples where (one half of) the Lukic conjecture can be verified. We refer to [30] for details.)
We thank Peter Yuditskii for telling two of us about [9] and Fabrice Gamboa, Jan Nagel and Alain Rouault for useful discussions.
2. The Lukic Condition
In this section, we want to discuss some equivalent forms of the Lukic conditions . This and some of the analysis in later sections will require some discrete hard analysis that we set up here. First, we’ll consider
[TABLE]
[TABLE]
In some sense, says that in “–space” is locally near . Our first result is
Theorem 2.1.
Remark.
The same argument shows that (S1-2) are equivalent to (2.1) and (2.2) but with replaced by . This illustrates the difference between the Simon and Lukic conditions.
The proof will depend on momentum space localization. We can view as a subspace of and define by restricting to . We can think of either as a map between spaces which clearly has norm or as a map of to itself whose range is those with for all . In the latter view, is a projection of norm . We can extend to by setting . This is an invertible isometry (on it doesn’t have a left inverse).
is unitary on with spectrum all of , so, by the spectral theorem, we can define on for any and then on by . These are sometimes called Laurent and Toeplitz operators respectively. is made most transparent by using Fourier transform, , mapping to by
[TABLE]
These are, of course, Fourier coefficients of in the orthonormal basis of , , so we can define from sequences to functions by defining (with convergence in –sense):
[TABLE]
Then .
If is a trigonometric polynomial so
[TABLE]
then, for ,
[TABLE]
i.e. is convolution with . If , by a simple argument (see [25, Section 6.3]), decays faster than any inverse polynomial, so, in particular, . Taking limits in (2.5), we see that formula still holds but with replaced by . Thus, since , we see that as maps on or , maps to itself, and since maps to itself, we see that map to itself i.e.
Proposition 2.2.
maps any to itself and maps any for for any function, , on .
In particular, we can localize in –space by picking a convenient partition of unity on and writing .
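The statement that a multiplier given by a trigonometric polynomial acts on sequences as a convolution can be checked directly. The sketch below is an illustration only. It assumes the convention f(θ) = Σ_k f_k e^{ikθ} (the paper's own sign convention for the Fourier transform is not reproduced above), and the helper names are hypothetical.

```python
import cmath


def evaluate(coeffs, theta):
    """Evaluate sum_k coeffs[k] * e^{i k theta}; coeffs maps int k -> number."""
    return sum(c * cmath.exp(1j * k * theta) for k, c in coeffs.items())


def convolve(f_hat, g_hat):
    """Coefficients of the product f*g: (f_hat * g_hat)_n = sum_k f_hat[k] * g_hat[n-k]."""
    out = {}
    for k, fk in f_hat.items():
        for m, gm in g_hat.items():
            out[k + m] = out.get(k + m, 0) + fk * gm
    return out


# Example multiplier 1 - cos(theta), which vanishes at theta = 0.
ONE_MINUS_COS = {0: 1.0, 1: -0.5, -1: -0.5}
```

Convolving a finitely supported sequence with `ONE_MINUS_COS` and evaluating the result at any θ should agree with the product of the two evaluated symbols, which is the finite-dimensional shadow of the identity used for general smooth multipliers.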
Corollary 2.3.
Let be a Laurent polynomial on . Let be a function on so that has no zeros in the support of . Suppose that lies in some . Let . Then
[TABLE]
Proof.
Suppose first we are dealing with the maps on . By the zero condition, it is easy to find a function, , on so that for all . Thus, if , then
[TABLE]
since maps to .
Now suppose and extend it to by for . Since is convolution with a function of very rapid decay, and both have rapid decay to the left so since lies in , we see that lies in . Since has rapid decay on the left, lies in and so lies in as well. By the argument in the first paragraph, lies in so lies in . ∎
Proof of Theorem 2.1.
() Let obey . Pick , functions on so that and vanishes in the neighborhood of . Let . follows from . Since commutes with any polynomial in , by (2.1),
[TABLE]
(with a small argument to deal with the P operator) so, by Corollary 2.3, (1.14) holds. A similar argument shows that (2.2) implies (1.15).
() Suppose obeys . Since polynomials in map to itself, (1.14), so by (1.13), we have (2.1). By (1.14), if , then . Also (1.15) implies . Therefore, by (1.13), we get (2.2). ∎
For comparison with Simon’s conjecture, the following version (which appeared already in the last section) is useful. Let ,
[TABLE]
[TABLE]
[TABLE]
Theorem 2.4.
Proof.
Clearly, (2.11) implies (2.2) when , so .
On the other hand, by Theorem 2.1, and trivially, (1.13) and (1.15) (2.11) ∎
To find some equivalent forms of the Lukic conditions, it will be useful to have the following:
Theorem 2.5.
For any sequence of finite support, we have that:
[TABLE]
Remarks.
- This is a discrete case of an inequality on derivatives due to Gagliardo [8] and Nirenberg [21]; see Simon [26, Section 6.3] and Taylor [28]. Here replaces . The general version (with essentially the same proof) is
[TABLE]
for . (2.12) is .
- Once one has Theorem 2.5 then it is easy to show, by dominated convergence, that and that (2.12) holds even without the condition on finite support of .
- This result is in [27] and probably other places but the proof is so simple that we give it for the reader’s convenience.
- (2) below can be thought of as resulting from a summation by parts.
Proof.
Given , define by . We begin by noting that for , we have by the triangle inequality that
[TABLE]
so that if for all , then
[TABLE]
Note next that Leibniz rule takes the form (where )
[TABLE]
so
[TABLE]
Choose and use the fact that a sum of is zero when has finite support (because of telescoping) to see that if has finite support, then
[TABLE]
where we used (2.14) to bound by .
Hölder’s inequality and says that the first sum on the right is bounded by . The second sum has the same bound which shows that
[TABLE]
which implies (2.12) ∎
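A discrete inequality of this Gagliardo–Nirenberg type can be probed numerically. The sketch below is a hedged illustration, not the statement of (2.12), whose display is not reproduced here. It tests the bound ‖δα‖_4² ≤ 3 ‖α‖_∞ ‖δ²α‖_2 for real, finitely supported sequences on Z, where δ is the forward difference. The constant 3 is this sketch's own bookkeeping: summation by parts followed by the three-term Leibniz rule and Cauchy–Schwarz. It may differ from the constant in the paper. All function names are hypothetical.

```python
import random


def forward_diff(seq):
    """Forward difference of a finitely supported sequence on Z.

    The list holds the support; zeros are implied on both sides, so the
    result has one more entry than the input.
    """
    padded = [0.0] + list(seq) + [0.0]
    return [padded[i + 1] - padded[i] for i in range(len(padded) - 1)]


def lp_norm(seq, p):
    """The l^p norm of a finite list for finite p."""
    return sum(abs(x) ** p for x in seq) ** (1.0 / p)


def gn_ratio(seq):
    """||d a||_4^2 / (||a||_inf * ||d^2 a||_2) for a nonzero sequence a."""
    d1 = forward_diff(seq)
    d2 = forward_diff(d1)
    sup = max(abs(x) for x in seq)
    return lp_norm(d1, 4) ** 2 / (sup * lp_norm(d2, 2))


def random_sequence(length, rng=random):
    """A random real sequence with entries uniform in [-1, 1]."""
    return [rng.uniform(-1.0, 1.0) for _ in range(length)]
```

Under the stated assumptions, `gn_ratio` should never exceed 3, whatever sequence is supplied; observed values are typically much smaller.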
Let us focus on the case , so we have
[TABLE]
[TABLE]
[TABLE]
We want to note that
Theorem 2.6.
for is equivalent to
[TABLE]
[TABLE]
[TABLE]
Moreover, one also has that if these conditions hold, then
[TABLE]
Remarks.
- The proof shows that when and hold, then is equivalent to for any k fixed .
- The example obeys but doesn’t have .
Proof.
Clearly since maps to itself. So suppose we have . Applying (2.12) to and noting that , we conclude that proving (2.24).
Since , we see that . Thus
[TABLE]
∎
3. Sum Rules
In this section, we’ll explain how to use LD methods to obtain sum rules for any choice of and where one side is (1.8). The sum rules imply gems. In fact, it will be easier to obtain the gems and we’ll prove them first as part of the proof of sum rules. While we haven’t tried to prove it in general, we believe our sum rules are the same as those of Denisov–Kupin [6] obtained using the method of Nazarov et al. [20].
We begin by finding matrix models whose LDP on the spectral side involves (1.8) up to constants. Our basic random matrix measures will have the form
[TABLE]
where is a normalization factor, is a function of of the form
[TABLE]
where is a Laurent polynomial
[TABLE]
(if and/or , we say that is the degree of or ) and where is Haar measure (aka circular unitary ensemble, ). GNR [9, 11] also discussed these models, especially the case (discussed first, in a different context, by Gross–Witten [13] whose name GNR assign to the model) but they do not prove sum rules or gems for these models.
There is a huge literature on these matrix models, discussed for example in [2, Section 2.7]. Much of the literature discusses perturbations of GUE rather than CUE but the results that we need extend to CUE, which is technically simpler because random unitary matrices, unlike random self–adjoint matrices, are automatically uniformly bounded. A major result (see, for example, [2, Section 2.6]) is that the associated limit of empirical measures (aka density of states), , obeys
[TABLE]
for some constant (which when we start with we will take to be zero).
Any fixed vector, , is a cyclic vector for a.e. . Associated to each such is a probability measure on which is an –point measure with masses at the eigenvalues of and weights the absolute square of the components of in the corresponding eigenvectors. Thus picking (conventionally to be ), we get a many-to-one correspondence between a set of unitaries of full measure and all -point spectral measures. Thus the measure in (3.1) induces a probability measure on -point probability measures and so on sets of Verblunsky coefficients. The unitaries and correspond to the same spectral measure if and only if there is a unitary which has as an eigenvector with . It is important to notice that the spectral measure determines the eigenvalues of and so for any k, so these traces are only functions of the Verblunsky coefficients and we can compute the traces in any convenient representation of one of the unitaries associated to a given spectral measure.
The measure in (3.1) induces a measure on –point measures (the spectral measures, viewed as elements of ), and the Verblunsky map drags that to a measure on the set of -point Verblunsky coefficients, i.e. .
The measure in (3.1) induces another measure on the sequence of empirical measures , where are the eigenvalues of . Recall that if obeys (3.4), then converges a.s. as to and by the method of Ben Arous–Guionnet [3], the sequence obeys a LDP (in the usual topology of weak convergence of probability measures) with speed and rate function at measure , where is the 2D Coulomb energy in external field which is minimized at (by (3.4)).
By the arguments in [4, Section 3], if the support of is all of and possesses a density with respect to Lebesgue’s measure which is strictly positive -almost everywhere, one finds that the spectral measure obeys an LDP in with speed and rate function
[TABLE]
where is given by (1.3). On the other hand, as discussed in [9] and [4], by the continuity of the map , the latter LDP induces a LDP on the infinite sequence of Verblunsky coefficients, viewed as elements of equipped with the product topology, with rate function given in terms of . By the uniqueness of the rate functions in large deviations theory, if one has an expression for the rate function in terms of the Verblunsky coefficients then one gets a sum rule with the integral in (1.8) on one side (up to constants due to the normalization of and a ) term.
We remark that the regularity assumptions stated above for (namely full support and a.e. positive density) make it possible to mimic the proof in [4, Section 3] and approximate the spectral measure throughout its support; to see what goes wrong when there are gaps in the support of , it is enough to consider the analogous problem for Hermitian matrices where is replaced by . In that case, there may be “stray eigenvalues” which are not controlled by the LDP for the empirical measure. We refer to [9] for a discussion of this issue, and [11] for a detailed proof of the LDP for the spectral measure in the cases treated in this paper.
In what follows, we will be interested in of the form
[TABLE]
which automatically satisfies the regularity assumption stated above.
In computing (3.4) with that , the following is useful
Proposition 3.1.
For any , , we have that
[TABLE]
If , the integral is zero.
Proof.
While this integral is in the tables, the proof is so simple we give it. Replacing by , we can suppose that . By taking complex conjugates, we can suppose that . Write and
[TABLE]
Then note that for
[TABLE]
by the Cauchy integral theorem. By the Cauchy formula for Taylor coefficients and the well known series , for (since the series only converges inside the disk, one needs to note that the integral over the unit circle is a limit of integrals over slightly smaller circles)
[TABLE]
∎
Thus, for of the form (3.6), defined by (3.4) is a Laurent polynomial with no constant term.
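The kind of integral evaluated in Proposition 3.1 can be checked numerically. The sketch below assumes the classical expansion log|1 − e^{iψ}| = −Σ_{m≥1} cos(mψ)/m, which gives the Fourier coefficient −e^{ikθ}/(2|k|) for k ≠ 0 and 0 for k = 0. That this matches the exact normalization of the proposition is an assumption, since its display is not reproduced here. The function names and the grid size are hypothetical choices.

```python
import cmath
import math


def log_kernel_coefficient(k, theta, n_grid=4000):
    """Midpoint-rule value of
        int log|e^{i theta} - e^{i phi}| * e^{i k phi} dphi / (2 pi).

    The grid is offset so that the logarithmic singularity at phi = theta
    sits halfway between two nodes.
    """
    total = 0j
    for j in range(n_grid):
        psi = 2.0 * math.pi * (j + 0.5) / n_grid  # psi = phi - theta
        kernel = math.log(abs(1.0 - cmath.exp(1j * psi)))
        total += kernel * cmath.exp(1j * k * (theta + psi))
    return total / n_grid


def expected_coefficient(k, theta):
    """Assumed closed form: -e^{i k theta} / (2|k|) for k != 0, and 0 for k = 0."""
    if k == 0:
        return 0j
    return -cmath.exp(1j * k * theta) / (2.0 * abs(k))
```

Because only the modes k ≠ 0 contribute, integrating this kernel against a density that is a Laurent polynomial in e^{iφ} returns a Laurent polynomial in e^{iθ} with no constant term, in line with the remark above.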
As a preliminary to the calculation of the Verblunsky coefficient side, we want to make two comments about the sum rules and their relation to the rate function. The first one regards the fact that rather than the integral in (1.8), the form of the rate function on the measure side is , which involves an additional term of the form
[TABLE]
Computing this constant term is important in writing the sum rule. As an example, rather than the left side of (1.6), the LD calculation will give where
[TABLE]
Noting that (which follows as in the proof of Proposition 3.1; see [22, Section 2.8]) we can write (1.6) as
[TABLE]
The right hand side has to vanish when the are the Verblunsky coefficients of the measure (since ). Let us confirm this not only as a check but because it will let us compute the constant in Section 4 when we only know the sum rule up to a constant.
The Verblunsky coefficients for the of (3.9) are not hard to compute [22, Example 1.6.4 and equation (1.6.14)]
[TABLE]
Since , we can cancel the terms in the sums on the right side in (3.10) and see that when the right side is
[TABLE]
The sum telescopes since so the sum is and . To evaluate the infinite product, note Euler’s formula that
[TABLE]
so
[TABLE]
and thus the term in (3.12) is which cancels the . Thus, we confirm that the expression in (3.12) is 0.
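The infinite product entering this check can also be confirmed by exact arithmetic. The sketch below is an illustration under two assumptions that are not visible in the text above: the reference measure is (1 − cos θ) dθ/2π, and its Verblunsky coefficients are α_n = −1/(n+2). It compares the partial products of 1 − α_j² with ratios of Toeplitz determinants of the measure, which equal the squared norms of the monic orthogonal polynomials. The function names are hypothetical.

```python
from fractions import Fraction


def toeplitz_dets(n_max):
    """Return [D_0, ..., D_{n_max}], where D_n is the (n+1) x (n+1) Toeplitz
    determinant of the assumed measure (1 - cos theta) dtheta / (2 pi).

    Its moments are c_0 = 1 and c_{+1} = c_{-1} = -1/2, so the matrices are
    tridiagonal and D_n = D_{n-1} - (1/4) * D_{n-2}, with D_{-1} = D_0 = 1.
    """
    dets = [Fraction(1), Fraction(1)]  # D_{-1}, D_0
    for _ in range(n_max):
        dets.append(dets[-1] - Fraction(1, 4) * dets[-2])
    return dets[1:]


def verblunsky_product(n):
    """prod_{j=0}^{n-1} (1 - alpha_j^2) for the assumed alpha_j = -1/(j+2)."""
    prod = Fraction(1)
    for j in range(n):
        prod *= 1 - Fraction(1, (j + 2) ** 2)
    return prod
```

Both quantities equal (n+2)/(2(n+1)), which tends to 1/2. Since 1 − cos θ = |1 − e^{iθ}|²/2, the integral of log(1 − cos θ) against dθ/2π is −log 2, so the limit 1/2 is what Szegő's Theorem predicts for this measure.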
The other issue concerns a huge difference in getting sum rules once a is added to the mix. Recall that under the measure, i.e. in case , the measure on the Verblunsky coefficients has the property that if , then the Verblunsky coefficients are independent of so, with denoting the continuous projection from to , the rate function of is easy to compute (see [4, Section 2] for a discussion of ). Since has cross terms between and for suitable and (in (1.6) the terms), one no longer has independence and the exact calculation of involves the limiting distribution of . In the case of (1.6), we want to show that where has a piece and a piece from the term. Instead of computing exactly, we’ll show that (up to constants) . This fact and Rakhmanov’s Theorem (see [23, Chapter 9]) allow one to prove that has the required form.
We begin the analysis of the general case with
Theorem 3.2.
Let be a Laurent polynomial of degree and let be an unitary CMV matrix. Then there exist –independent polynomials and , depending on successive ’s and ’s and on such variables so that
[TABLE]
Moreover, .
Remarks.
- The unitary, , associated to any spectral measure is multiplication by on . To get a matrix related with that spectral measure associated to , one needs to pick an orthonormal basis for this space with the function . [22, Chapter 4] discusses two natural bases for which the matrix elements are explicit functions of the ’s and . One choice is the set of orthonormal polynomials for . This yields the GGT matrix. The other is to orthonormalize which yields the CMV matrix. One issue is that for general , the orthonormal polynomials may not be a basis so the naive GGT matrix may not be unitary but for –point measures, it is unitary. The CMV matrix is five-diagonal while the GGT matrix is a Hessenberg matrix, i.e. only one non-vanishing diagonal below the principal diagonal but, in general, all non–vanishing matrix elements above the diagonal. The proof of this theorem will discuss the explicit form of the CMV matrix and (9.9) below the explicit form of the GGT matrix.
- These polynomials have degree at most . (The CMV matrix has matrix elements that are products of exactly two, , and so written in terms of the three variables is of homogeneous degree if but removing the ’s produces lower degree terms even in this special case.)
- are not unique. If H is any function of successive pairs and
[TABLE]
then (3.13) holds for if and only if it holds for .
Proof.
Recall (see [22, Section 4.2]) the representation of the CMV matrix, , which we write when is even. Define the matrices
[TABLE]
Let . Then
[TABLE]
( is a direct sum of matrices while has matrices at the top and bottom and in between). And one has that (i.e. our parametrization of ) is given by
[TABLE]
We will also write for with replaced by zero (only remains in the direct sum) and similarly for (where are matrices). Thus we have that
[TABLE]
We note that
[TABLE]
For odd, there is a similar representation but now has a matrix at the bottom and only a matrix at the top.
We’ll prove the theorem when . For , the argument is similar since replacing by just interchanges and and replaces by (since ). And for , yields polynomials of the same form (since functions of fewer variables can be viewed as having more variables; there will be some lost ’s near the bottom but they can be made part of ).
We’ll show first that we have the required function of exact degree where it is a polynomial in and and then that each occurs as an even power so using we get the result without any ’s.
We write
[TABLE]
where a symbol like means the matrix element of the matrix . In (3.21), we sum from to running through even and odd integers respectively and running from [math] to . The only non–zero terms have , with further restrictions since, for example, is actually and not .
This clearly writes as a polynomial in of homogeneous degree . For each , group together all where the smallest index of is . It is easy to see that the resulting sum, call it , has independent of and gives the terms. The terms with (coming from , and hence ) we put into and those whose smallest so that we put into . It is easy to see that is –independent and that the –dependence of comes only from translating the indices. Thus we have proven (3.13) except we have some dependence.
For each product in (3.21), the terms come from increasing some to or a decrease in the opposite direction and it is only through such terms that such an increase or decrease can happen. Since and each step only increases or decreases by a single step, for every going in one direction, there must be one going in the other, so an even number in all.
To confirm the assertion that , we prove that no term in (3.21) can only have ’s, that is there must be at least one with . For if it is easy to see that either or with the same sign , that is one can’t change direction without an term. But to return where one started, one must change direction. ∎
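The factorization of the finite CMV matrix used in this proof can be sketched concretely. The code below is a minimal illustration for even N. It assumes the convention of [22, Section 4.2] in which Θ_j = [[conj(α_j), ρ_j], [ρ_j, −α_j]] with ρ_j = (1 − |α_j|²)^{1/2}, the matrix is the product of a direct sum of Θ_0, Θ_2, ... with a direct sum of a 1×1 block 1, then Θ_1, Θ_3, ..., then a final 1×1 block conj(α_{N−1}), and |α_{N−1}| = 1. The displayed matrices are not reproduced above, so these conventions and all helper names are assumptions.

```python
import math


def theta_block(alpha):
    """2 x 2 block [[conj(alpha), rho], [rho, -alpha]] (assumed convention)."""
    alpha = complex(alpha)
    rho = math.sqrt(max(0.0, 1.0 - abs(alpha) ** 2))
    return [[alpha.conjugate(), rho], [rho, -alpha]]


def block_diag(blocks, n):
    """Place square blocks along the diagonal of an n x n zero matrix."""
    mat = [[0j] * n for _ in range(n)]
    pos = 0
    for block in blocks:
        size = len(block)
        for i in range(size):
            for j in range(size):
                mat[pos + i][pos + j] = complex(block[i][j])
        pos += size
    if pos != n:
        raise ValueError("block sizes do not add up to n")
    return mat


def matmul(a, b):
    """Plain product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def cmv(alphas):
    """Finite CMV matrix for N = len(alphas) even, with |alphas[-1]| = 1."""
    n = len(alphas)
    if n % 2 != 0:
        raise ValueError("this sketch only treats even N")
    left = block_diag([theta_block(alphas[j]) for j in range(0, n, 2)], n)
    middle = [theta_block(alphas[j]) for j in range(1, n - 1, 2)]
    last = [[complex(alphas[-1]).conjugate()]]
    right = block_diag([[[1.0]]] + middle + [last], n)
    return matmul(left, right)


def diagonal_trace(alphas):
    """-sum_j conj(alpha_j) * alpha_{j-1} with alpha_{-1} = -1 (assumed formula)."""
    prev = -1.0
    total = 0j
    for alpha in alphas:
        total += -complex(alpha).conjugate() * prev
        prev = alpha
    return total
```

For a small admissible choice of coefficients one can check that `cmv` returns a unitary, five-diagonal matrix whose trace equals `diagonal_trace`, so that the trace depends on the α's alone and involves no ρ's, consistent with the parity argument in the proof.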
Remark.
It is an interesting exercise to use the GGT representation [22, Section 4.1] to prove that the ’s only occurs in even powers and that every term in has at least one power of or .
In the next theorem, we use to denote the subset of consisting of the probability measures of infinite support on , i.e., not supported on finitely many points.
Theorem 3.3.
Let V be a potential of the form (3.4) with measure whose support is and let be given by (3.13). Let be the rate function from (3.5) on the measure side. Let mapping to its first Verblunsky coefficients. Let be the rate function corresponding to the LDP for , and write . There is a constant independent of and so that if and is the sequence of Verblunsky coefficients of , then for all such and ,
[TABLE]
Remarks.
- Recall that is the Verblunsky map taking measures to Verblunsky coefficient sequences, defined in the Introduction. The mapping is the projection onto the first elements.
- Recall (see [4, Theorem 2.6 and Theorem 2.7]) that obeys a LDP with speed and rate related to by
[TABLE]
Proof.
By writing the induced measures on Verblunsky coefficients according to Killip–Nenciu [14] (see Theorem 4.2 of [4]) and according to (3.13), we see that for and
[TABLE]
where and
[TABLE]
For fixed , the function , obtained by dropping the term and all the terms where is a product of a function of and a function of . Since (up to a set of zero measure), the integrals over in the numerator and denominator of the modified (3.24) cancel.
The modified formula defines a probability measure
[TABLE]
where
[TABLE]
Since and , and are polynomials, the dropped terms are bounded, so that for some constant, ,
[TABLE]
By an elementary argument (see [4, Theorems 2.1 and 2.2]), obeys a LDP with speed and rate function
[TABLE]
where is such that (forced by the condition on the function (different from our here) in [4, Theorem 2.2]).
With given by (3.23), we conclude by (3.28) that
[TABLE]
Taking for which is finite and using and , we conclude that is bounded as so is finite. (3.22) follows with . ∎
While not essential, the following lovely lemma of Nazarov et al. [20, Lemma 3.1] will simplify some arguments.
Proposition 3.4.
Let be a continuous function on where is compact. Suppose and that . Let be the sequences so that eventually i.e. only finitely many are non-zero. For define
[TABLE]
Suppose there is a so that for all , . Then, there exist continuous functions on and on so that
[TABLE]
and
[TABLE]
Remark.
The point, of course, is that if we add a constant to so that (which doesn’t change (3.33)), then
[TABLE]
which assures that we can extend to infinite sequences with a convergent sum or else a sum that diverges to .
Theorem 3.5 (Abstract Gem).
Let V be a potential of the form (3.4) and given by (3.13). Let and let be the measure with those Verblunsky coefficients and the measure obeying (3.6). Then
[TABLE]
exists and the limit is finite if and only if is finite.
We refer to the sum in (3.34) as the Verblunsky side of the gem.
Remark.
Golinskii–Zlatoš [12, Theorem 3.3] have a general abstract gem derived by very different means.
Proof.
By the theory of projective limits (see [4, Theorem 2.7]), . Thus by (3.22), if , the limit in (3.34) exists and is .
Assume now that . We would like to use Proposition 3.4, but first we need to restrict attention to a compact subset of the unit disc. Since , the weight of is a.e. non–zero, so, by Rakhmanov’s Theorem (see [23, Chapter 9]), as . Thus . Let . This is compact so we can apply Proposition 3.4, (3.22) and for all to conclude that there is and so that
[TABLE]
and by adding a constant to we can suppose that .
The sum in (3.34) is thus
[TABLE]
Since and is continuous, the last term goes to 0 as . Since , the sum has a limit (which may be infinite). By (3.22) and , we see that the sum is bounded, hence convergent. ∎
Finally, we turn to the abstract sum rule. For any define
[TABLE]
may be infinite if the limit is.
Theorem 3.6 (Abstract Sum Rule).
Under the hypothesis of Theorem 3.5, for any with infinite support
[TABLE]
Remark.
Basically, on the basis of (3.24), one expects that the rate function is where is a constant coming from the th root of the denominator in (3.24). Given that , the constant has to be .
Proof.
We begin with a formula like (3.26) but with two changes. First, rather than look at for a single , we look at a ratio
[TABLE]
for two open sets in so we needn’t concern ourselves with the normalization integral over all of but can focus on small sets where we have control over the ’s.
Secondly, we don’t drop all of the monomials in those terms for which intersects both and . We keep those monomials which only have . Thus the dropped terms all have a factor of some with . What results is that one obtains (still using for the probability with the, now slightly different, dropped terms):
[TABLE]
where
[TABLE]
for some constant because the dropped terms, being polynomials that are not of degree zero in all the ’s, are at least linear in some .
Note that because of lower semicontinuity of , for any ,
[TABLE]
where runs over all open neighborhoods of ordered by inverse inclusion. Moreover, because is continuous, one has that
[TABLE]
Thus taking in (3.40) and shrinking the open sets to two measures, and , we get from (3.37) that
[TABLE]
where is the sum in (3.34) when the infinite sum is replaced by the sum to .
When , we’ve already proven (3.38) so suppose . Then the density of the absolutely continuous part of with respect to is a.e. non–vanishing, so, by Rakhmanov’s Theorem, . Take so also, . Thus the right side of (3.41) goes to zero and we find that
[TABLE]
proving (3.38). ∎
4. The (1,0) Case
In this section, we’ll consider the case of a single singularity of order 1 and recover the sum rule of Simon (1.6). The calculations are so simple that we need not make the simplifying assumption, made in later sections, that the Verblunsky coefficients are real.
The normalized empirical measure is
[TABLE]
[TABLE]
and
[TABLE]
In the CMV basis, where . Thus, the Verblunsky side of the sum rule is
[TABLE]
for a suitable constant, .
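The diagonal CMV matrix elements that enter this computation can be checked numerically. The following is a minimal sketch, assuming the standard factorization of the CMV matrix as a product of two block-diagonal matrices built from 2×2 blocks Θ(α) = [[ᾱ, ρ], [ρ, −α]] with ρ = (1 − |α|²)^{1/2} (see [22, Section 4.2]) and the convention α₋₁ = −1. The helper names and the truncation to a finite corner are ours, chosen only for illustration.

```python
import math

def theta_block(alpha):
    """2x2 block Theta(alpha) = [[conj(alpha), rho], [rho, -alpha]]."""
    alpha = complex(alpha)
    rho = math.sqrt(1.0 - abs(alpha) ** 2)
    return [[alpha.conjugate(), rho], [rho, -alpha]]

def block_diagonal(size, alphas, first):
    """Identity matrix with Theta(alphas[p]) placed at rows/columns p, p+1
    for p = first, first + 2, ... as long as the block fits."""
    mat = [[1 + 0j if r == c else 0j for c in range(size)] for r in range(size)]
    for p in range(first, size - 1, 2):
        blk = theta_block(alphas[p])
        for r in range(2):
            for c in range(2):
                mat[p + r][p + c] = blk[r][c]
    return mat

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def cmv_corner(alphas):
    """Finite corner of C = L M. Only the diagonal entries with index
    0 .. len(alphas) - 1 are independent of the (arbitrary) truncation."""
    size = len(alphas) + 1
    return matmul(block_diagonal(size, alphas, 0),
                  block_diagonal(size, alphas, 1))

def expected_diagonal(alphas):
    """Assumed closed form: C_jj = -conj(alpha_j) * alpha_{j-1}, alpha_{-1} = -1."""
    prev = -1 + 0j
    out = []
    for a in alphas:
        a = complex(a)
        out.append(-a.conjugate() * prev)
        prev = a
    return out
```

For a sample sequence of coefficients in the unit disc, the diagonal of `cmv_corner(alphas)` should agree with `expected_diagonal(alphas)` up to rounding.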
In (4.4), the sum rule involves limits of finite objects, so here and below, sums should involve finite matrices and finite sums. But, as we explained above, we are interested in the limits of such finite sums. So we’ll write sums up to infinity, indicating what one gets after taking the limit at the end of the calculation.
Since
[TABLE]
we can rewrite (4.4) as (changed )
[TABLE]
That in this form the constant is follows from the requirement that this vanish if and the calculations in Section 3 that (3.10) is [math]. Thus, we have a LD proof of (1.6).
To get the gem (1.7), we need the case of
Proposition 4.1**.**
For any
[TABLE]
if and only if
[TABLE]
Remark**.**
Since
[TABLE]
for any , the summand in (4.6) is non–negative so the sum either converges or diverges to .
Proof.
By (4.8), we have that
[TABLE]
By (4.9), we have that (4.6) ⇒ (4.7). On the other hand, if (4.7) holds, then so, for all large , so we can apply (4.10) to the tail of the sum in (4.6) and conclude that (4.7) ⇒ (4.6). ∎
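The two-sided comparison behind this argument can be illustrated numerically. The sketch below assumes that the summand in (4.6) has the form g(x) = −log(1 − x) − x with x = |α_j|², and that the bounds being used are x²/2 ≤ g(x) on [0, 1) and g(x) ≤ x² on [0, 1/2]; both follow from the power series of the logarithm, but the identification with (4.8)–(4.10) is our reading, not a quotation.

```python
import math

def summand(x: float) -> float:
    """Assumed summand g(x) = -log(1 - x) - x, where x stands for |alpha_j|^2."""
    return -math.log1p(-x) - x

def check_bounds(num_points: int = 1000) -> bool:
    """Check x^2/2 <= g(x) on a grid in [0, 1) and g(x) <= x^2 on [0, 1/2]."""
    for i in range(num_points):
        x = 0.999 * i / num_points
        g = summand(x)
        if g < 0.5 * x * x - 1e-15:
            return False
        if x <= 0.5 and g > x * x + 1e-15:
            return False
    return True
```

The lower bound gives the implication from convergence of the sum of g(|α_j|²) to convergence of the sum of |α_j|⁴, and the upper bound, applied to the tail where |α_j|² ≤ 1/2, gives the converse.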
We thus have a quick proof of the gem of Simon [22, Section 2.8]:
Theorem 4.2**.**
With as in (1.5), if and only if
[TABLE]
5. The (1,1) Case
In terms of (1.8), this section will consider gems where the measure side is
[TABLE]
To figure out the normalization, we note that
[TABLE]
One can also figure this out by noting that the extreme sides of (5.2) are degree 2 Laurent polynomials in vanishing at to second order with maximum on . For later use, we note that the same argument shows that for
[TABLE]
for a constant .
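The root-of-unity product behind this normalization is easy to test numerically. The sketch below assumes the identity in question is ∏ |e^{iθ} − ω_m|² = 2 − 2cos(nθ), where ω_m runs over the n th roots of unity (so that the product of the factors 1 − cos(θ − θ_m) is a constant multiple of 1 − cos(nθ)); the function name is ours.

```python
import cmath
import math

def product_over_roots(n: int, theta: float) -> float:
    """Product over the n-th roots of unity omega_m of |e^{i theta} - omega_m|^2."""
    z = cmath.exp(1j * theta)
    prod = 1.0
    for m in range(n):
        omega_m = cmath.exp(2j * math.pi * m / n)
        prod *= abs(z - omega_m) ** 2
    return prod

def closed_form(n: int, theta: float) -> float:
    """|z^n - 1|^2 = 2 - 2 cos(n theta) on the unit circle."""
    return 2.0 - 2.0 * math.cos(n * theta)
```

For n = 2 this reduces to the (1,1) case, with singular points at θ = 0 and θ = π.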
Since , we see that the normalized is
[TABLE]
so by (3.4) and (3.7), we have that
[TABLE]
We discussed in Section 4, we’ll discuss in this section (thereby recovering, using large deviations, a special case of a result of Simon–Zlatoš [27]) and general in Section 7. Thus in this section, we’ll prove
Theorem 5.1**.**
Let be real. Then
[TABLE]
[TABLE]
Note that if is real. For such , the CMV matrix has the form for (see [22, eqn(4.2.14)]):
[TABLE]
There are also matrix elements that are two off–diagonal, but if , then , so these terms don’t contribute to (this is also clear from the factorization and from the GGT representation). Thus
[TABLE]
where bdy is short for boundary and refers to some finite number of terms involving small indices (and, later, when it appears with a finite sum, involving finitely many terms involving large indices with the number of terms bounded as the upper index of the sum changes).
Therefore, using and , we see after some algebraic manipulations that the Verblunsky side of the gem, see (3.34), is
[TABLE]
[TABLE]
We claim that up to boundary terms
[TABLE]
Accepting this for a moment, we can show that the conditions of Simon and Lukic (which agree in this case)
[TABLE]
[TABLE]
imply the measure condition, that is (given the gem) that . For clearly, by (S1), (5.13) and, by Hölder’s inequality, is bounded by . Since (on account of ), is finite by Proposition 4.1 with .
To see (5.13), define for
[TABLE]
Proposition 5.2**.**
Let and be two functions on sequences of real which are boundary terms plus a sum of the form (3.31) where is a quadratic function of its variables (i.e. a second degree homogeneous polynomial). Suppose for some , has no terms of the form with . Then up to boundary terms, each is a linear combination of and up to boundary terms if they are the same linear combinations.
Remarks**.**
- This result is obvious. More subtle is the fact that “if” in the last sentence can be replaced by “if and only if” but we won’t need that harder half of this.
- Again, what is being stated involves limits of finite sums. The equalities only hold up to finite boundary terms. There are also boundary terms at the upper limit but those go to zero by Rakhmanov’s Theorem. One infinite sum converges if and only if the other one does.
Corollary 5.3**.**
* of (5.10) is given by (5.13) up to constants.*
Proof.
The RHS of (5.10) is, up to boundary terms . Expanding the square, the RHS of (5.13) is . ∎
Proof of Theorem 5.1.
We’ve already proven that (S1-2) imply that the integral in (5.6) is finite. So we need to go in the opposite direction. Therefore, we suppose the integral is finite.
By the abstract gems discussed in Section 3, we know that is finite (in that the cutoff sums are uniformly bounded) with given by (5.13). In this form is positive and so is as noted in the remark after Proposition 4.1. So we look at which we write up to boundary terms as
[TABLE]
By Hölder’s inequality, , so up to boundary terms, is positive and thus is finite. Since each term is positive, they are all finite, i.e. and . is (S1).
and . means . Since , we conclude that
[TABLE]
Since , we see that . All the other terms in are positive, so all are finite. In particular, which is (S2). ∎
In particular, we see that the proof from Lukic conditions to convergence of the integral is much easier than the converse.
6. The (2,0) Case
Our goal in this section is to prove:
Theorem 6.1**.**
Let be as in (1.5) and assume that its Verblunsky sequence is real. Then
[TABLE]
if and only if
[TABLE]
and
[TABLE]
Remarks**.**
- In this case the Lukic and Simon conditions agree.
- This result, indeed without the reality restriction, is in Simon–Zlatoš [27]. The main difference in our approach is the method of deriving the sum rules. Once one has the sum rules, the arguments are related, but we feel our presentation is more transparent.
To begin we need to normalize , i.e. determine so that . We’ll use
[TABLE]
as one can see by expanding the square or by using . Thus
[TABLE]
Since , we see that
[TABLE]
[TABLE]
and thus when is real
[TABLE]
We computed in (4.4) and in (5.8). Thus the Verblunsky side of the sum rule, see (3.34), is
[TABLE]
[TABLE]
In terms of the quantities of (5.16)
[TABLE]
up to boundary terms. On the other hand, expanding the square, we see that up to boundary terms
[TABLE]
since and . Thus we see that up to boundary terms
[TABLE]
Proof of half of Theorem 6.1 that (6.1) ⇒ (S1),(S2).
Since , up to boundary terms, by Hölder’s inequality. By the abstract sum rule, is finite. Since each of these terms is positive (by (6.15) and Proposition 4.1), each is individually finite. (S1) and, by Proposition 4.1, (S2). ∎
Proof of other half of Theorem 6.1 that (S1),(S2) ⇒ (6.1).
Clearly (S1) and (S2) by Proposition 4.1, so we need only control . Hölder lets one control if and . Since , we can’t just look at products of four ’s. However since , we can control products of three ’s and one . By the Gagliardo-Nirenberg inequality, (2.12), (S1)+(S2). Since , a product of two ’s and two is also summable. So the goal is to write as sums of these two terms. We write
[TABLE]
The term is a sum of products of two terms and two terms so by the above, it is a convergent sum by (S1),(S2). Let so
[TABLE]
and thus
[TABLE]
We’ve already seen that the sum in is absolutely convergent. By (6.21), is a sum of and terms and so a convergent sum. Thus . ∎
7. The th Roots of Unity Case
Fix . In this section, we’ll consider the conditions
[TABLE]
[TABLE]
[TABLE]
By (5.3), this is the same as taking so are the th roots of unity. Of course, if is a primitive th root of unity, then , so (7.2)/(7.3) are precisely the Simon (=Lukic) conditions for this case. In this section, we’ll prove
Theorem 7.1**.**
Suppose obeys (S2). Then
[TABLE]
In particular, (S1-2) ⇒ (7.1).
Remarks**.**
- need not be assumed real.
- This is a special case of a result of Golinskii–Zlatoš [12].
The key input to proving this will be
Proposition 7.2**.**
If is written in terms of ’s only, the term quadratic in is
[TABLE]
Remark**.**
This proof will rely on the CMV representation of unitaries. It is an interesting exercise to give a different proof using the GGT representation and ideas of Section 9.
Proof.
By (3.21), is a homogeneous polynomial of degree in and . To be left with quadratic terms after using , we need products with ’s and two of and/or .
As the end of the proof of Theorem 3.2 explains, one gets strings of increasing or decreasing ’s and or at turn around points. The ’s must occur in a string of increasing and a second string of decreasing ’s. The form, (3.15), of shows we get at the bottom turn around and at the top turn around, so the only quadratic terms are .
Each diagonal matrix element has such a term for , so in all which yields (7.4). ∎
Proposition 7.3**.**
The quadratic term in the sum rule, (3.29), for (7.1) with “borrowed” from is (up to a boundary term)
[TABLE]
Proof.
The normalized is , so by (5.5), the potential is and is
[TABLE]
Thus, since in (7.4) cancels the in (7.6), the quadratic term including the borrowed is
[TABLE]
which is (7.5). ∎
Proof of Theorem 7.1.
The Verblunsky side of the sum rule associated to (7.1) has quadratic term (7.5) and a remainder that is finite if . Thus the equivalence is immediate. ∎
8. Single th Order Singularity
We are interested here in measures which obey
[TABLE]
Here the Simon–Lukic conditions are
[TABLE]
[TABLE]
Our main goal is to prove that
Theorem 8.1**.**
Suppose . Then
[TABLE]
Remark**.**
This is a special case of a result of Golinskii–Zlatoš [12].
To put this in perspective, we note that Lukic [19] has proven
Theorem 8.2** ([19]).**
Suppose . Then
[TABLE]
These two extreme cases are consistent with (S1-2) and suggest its truth.
The key to our proof will be to show that the quadratic term in the sum rule is for an explicit . We’ve seen that (4.5) and (6.15). The reader might stop and try to figure out the general formula.
By (6.4)
[TABLE]
Thus the normalized is
[TABLE]
where
[TABLE]
Using the binomial expansion , we have that
[TABLE]
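The binomial expansion used here can be checked with exact integer arithmetic. The sketch below assumes the expansion in question is that of (1 − cos θ)^k = 2^{-k}(2 − z − z^{-1})^k with z = e^{iθ}, whose coefficient of z^l is 2^{-k}(−1)^l C(2k, k + l); the factor 2^{-k} is left out so that everything stays an integer, and the function names are ours.

```python
from math import comb

def laurent_power(k: int) -> list:
    """Integer coefficients of (2 - z - 1/z)^k.
    Entry k + l of the result is the coefficient of z^l, for l = -k .. k."""
    coeffs = [1]
    for _ in range(k):
        new = [0] * (len(coeffs) + 2)
        for i, c in enumerate(coeffs):
            new[i] -= c          # contribution of the -1/z term
            new[i + 1] += 2 * c  # contribution of the constant term 2
            new[i + 2] -= c      # contribution of the -z term
        coeffs = new
    return coeffs

def binomial_formula(k: int) -> list:
    """Claimed closed form (-1)^l * C(2k, k + l) for l = -k .. k."""
    return [(-1) ** abs(l) * comb(2 * k, k + l) for l in range(-k, k + 1)]
```

In particular the constant term is C(2k, k), which, after restoring the factor 2^{-k}, is the integral of (1 − cos θ)^k against dθ/2π.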
Therefore, we may rewrite (8.5) as
[TABLE]
It follows from (5.5) that
[TABLE]
Recalling that we need to borrow from , and that the quadratic term in equals up to boundary terms, we see that the quadratic term in the sum rule is
[TABLE]
where now, instead of (5.16)
[TABLE]
On the other hand,
[TABLE]
where we use the fact that a term will contribute to if .
Proposition 8.3**.**
For any and , we have that
[TABLE]
Proof.
To pick elements from among numbered objects, we can pick from the first and from the second. Thus
[TABLE]
Since , we have that and . We thus get (8.12). ∎
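The counting argument in this proof is Vandermonde's convolution. The following sketch checks it with exact integers, together with the symmetric consequence obtained from C(k, j) = C(k, k − j). We assume, as our reading of the garbled statement, that (8.12) is of the form Σ_j C(k, j) C(k, j + l) = C(2k, k + l); the function names are ours.

```python
from math import comb

def vandermonde_lhs(m: int, n: int, r: int) -> int:
    """Pick r objects from m + n by picking j from the first m and r - j from the last n."""
    return sum(comb(m, j) * comb(n, r - j) for j in range(0, r + 1))

def symmetric_sum(k: int, l: int) -> int:
    """Sum over j of C(k, j) * C(k, j + l), for 0 <= l <= k."""
    return sum(comb(k, j) * comb(k, j + l) for j in range(0, k - l + 1))
```

The second function agrees with C(2k, k + l) because replacing C(k, j + l) by C(k, k − l − j) turns the sum into the Vandermonde sum with r = k − l, and C(2k, k − l) = C(2k, k + l).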
Proof of Theorem 8.1.
Picking in (8.9), we see that
[TABLE]
which by (8.11),(8.12) and (8.6) equals . When , by Hölder’s inequality, all terms in the sum rule but the quadratic are finite. So the Verblunsky side of the sum rule is finite if and only if . By the sum rule, we conclude the result. ∎
9. The (2,1) Case
Our main result in this section is half of the Lukic conjecture in the (2,1) case, specifically:
Theorem 9.1**.**
Let be a probability measure on of the form (1.5) with real Verblunsky coefficients obeying (1.16)–(1.18). Then the integral on the left side of (1.12) is finite.
Remark**.**
As noted, this is important because there are examples where Simon’s conditions (i.e. (1.16) and (1.18) without (1.17)) hold, but the integral in (1.12) is .
We’ll compute the sum rule guaranteed by Section 3 to say the integral in (1.13) is finite, see (9.29)-(9.34) for notation. Then we’ll show that (1.16)–(1.18) . We start by computing the potential, , of (3.4) for the case. As noted (see (5.2)), we have that
[TABLE]
Similarly
[TABLE]
Thus
[TABLE]
by (9.1)–(9.2). Thus, since , the normalized is
[TABLE]
Using (3.4) and (3.7), we conclude that
[TABLE]
so that if is real then
[TABLE]
In earlier sections, we used the CMV matrix representation to compute and . While initially we computed in this way also, we realized the calculations are simpler in the GGT matrix representation. (GGT and CMV representations are discussed in Sections 4.1 and 4.2 of Simon [22].) This is given by
[TABLE]
The explicit calculation is (Simon [22, (4.15)])
[TABLE]
In [22], this is calculated using if . An easier alternative is to use the Szegő recursion ([22, (1.5.25)]) and inverse Szegő recursion ([22, (1.5.46)])
[TABLE]
so
[TABLE]
which upon iterating yields
[TABLE]
with given by (9.9).
When dealing with the GGT representation, it can be an issue that is not a basis but the calculations need only be done for finite matrices where the OPs are a basis (or one can use the extended GGT basis of [22, Section 4.1] noting that diagonal matrix elements of in the extra basis elements are zero).
Define to be the th diagonal of so ()
[TABLE]
Of course, only with have non–zero main diagonal and so if we expand using (9.16), only those terms contribute to so
[TABLE]
We can now understand why calculations are easier with the GGT than CMV matrix. In (9.17), the sums start at while in the analog for CMV, we start at , so at least for not too large, there are fewer terms with GGT. Moreover, the form of (9.14)–(9.15) is covariant under translation along the diagonal while the CMV matrix diagonals have an even–odd structure.
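Since the GGT and CMV matrices represent the same operator in different orthonormal bases, traces of their powers must agree, and this can be tested on a finite example. The sketch below assumes the standard matrix elements from [22, Section 4.1]: G_{kl} = −ᾱ_l α_{k−1} ρ_k ⋯ ρ_{l−1} for k ≤ l (with α₋₁ = −1), G_{l+1,l} = ρ_l, and zero below the first subdiagonal. It uses a finite unitary truncation in which the last coefficient has modulus one. All function names are ours.

```python
import math

def rho(alpha):
    return math.sqrt(max(0.0, 1.0 - abs(alpha) ** 2))

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def ggt_matrix(alphas):
    """Finite GGT matrix; the last entry of alphas must have modulus one."""
    n = len(alphas)
    g = [[0j] * n for _ in range(n)]
    for k in range(n):
        for l in range(n):
            if k <= l:
                prev = -1.0 if k == 0 else alphas[k - 1]
                prod = 1.0
                for j in range(k, l):
                    prod *= rho(alphas[j])
                g[k][l] = -complex(alphas[l]).conjugate() * prev * prod
            elif k == l + 1:
                g[k][l] = complex(rho(alphas[l]))
    return g

def cmv_matrix(alphas):
    """Finite CMV matrix L M; the last Theta block degenerates to a 1x1 block."""
    n = len(alphas)

    def layer(first):
        mat = [[1 + 0j if r == c else 0j for c in range(n)] for r in range(n)]
        for p in range(first, n, 2):
            a = complex(alphas[p])
            if p == n - 1:
                mat[p][p] = a.conjugate()
            else:
                r = rho(a)
                mat[p][p], mat[p][p + 1] = a.conjugate(), r
                mat[p + 1][p], mat[p + 1][p + 1] = r, -a
        return mat

    return matmul(layer(0), layer(1))

def trace_of_power(mat, m):
    power = mat
    for _ in range(m - 1):
        power = matmul(power, mat)
    return sum(power[i][i] for i in range(len(mat)))
```

The GGT entries are given by one translation-covariant formula, whereas the CMV construction has to treat even and odd indices separately, which is the structural difference noted in the text.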
For , we must have and for , we have or . Moreover, by cyclicity of the trace, the and terms are equal, i.e. . We thus recover (4.4) and (5.9) when is real, that is up to boundary terms:
[TABLE]
For , we have up to cyclic permutations, (once), (each three times). Thus up to boundary terms:
[TABLE]
We also write
[TABLE]
so the coefficient side of the sum rule is where
[TABLE]
where we use the fact that adding a constant to all indices in a sum only changes the sum by a boundary term.
We start with by using Proposition 5.2. In terms of the of (5.16), up to boundary terms
[TABLE]
by (9.30). On the other hand, by the same calculation that gave (9.3), so
[TABLE]
Thus
[TABLE]
We conclude by Proposition 5.2 that up to boundary terms
[TABLE]
and thus
[TABLE]
By Hölder’s inequality
[TABLE]
By Proposition 4.1
[TABLE]
Thus, we need to focus on . Let . By Theorem 2.6 we have that
[TABLE]
Here is the key first step:
Proposition 9.2**.**
(a) For any , we have that
[TABLE]
(b) For any , we have that
[TABLE]
(c) For any , we have that
[TABLE]
is conditionally convergent.
Remarks**.**
- We only need conditional summability so, since , (c) implies the conditional summability of the sum in (9.43) without the . However, we use (a) in the proof of (c).
- To avoid having to worry about boundary terms at [math], we extend all sequences to by setting for . This doesn’t affect conditional convergence of any sums. Since , all of go to zero as .
Proof.
(a) , so since , Hölder’s inequality implies (9.43).
(b) , so since , Hölder’s inequality implies (9.44).
(c) The intuition is simple. The continuum analog is that if is on , as , then has a zero limit. The sum in (9.45) is a discrete analog so the key will be a summation by parts.
Since we’ll be summing by parts, we need to know the appropriate discrete Leibniz rule. Let and so . Then
[TABLE]
or . By induction, one sees that
[TABLE]
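The discrete Leibniz rule can be verified with exact integer arithmetic. The sketch below assumes the conventions (τa)_n = a_{n+1} and (δa)_n = a_{n+1} − a_n, so that δ = τ − 1, and the k-fold rule in the form δ(a¹⋯a^k) = Σ_j a¹⋯a^{j−1} (δa^j) (τa^{j+1})⋯(τa^k). The placement of the shifts is one of several equivalent choices and may differ from the one in (9.47).

```python
def termwise_product(seqs):
    """Termwise product of several sequences of equal length."""
    out = [1] * len(seqs[0])
    for seq in seqs:
        out = [x * y for x, y in zip(out, seq)]
    return out

def difference(seq):
    """(delta a)_n = a_{n+1} - a_n."""
    return [seq[n + 1] - seq[n] for n in range(len(seq) - 1)]

def leibniz_rhs(seqs):
    """Sum over j of (prod_{i<j} a^i_n) * (delta a^j)_n * (prod_{i>j} a^i_{n+1})."""
    length = len(seqs[0]) - 1
    total = [0] * length
    for j in range(len(seqs)):
        for n in range(length):
            term = seqs[j][n + 1] - seqs[j][n]
            for i in range(j):
                term *= seqs[i][n]          # unshifted factors to the left
            for i in range(j + 1, len(seqs)):
                term *= seqs[i][n + 1]      # shifted factors to the right
            total[n] += term
    return total
```

The identity holds because the sum over j telescopes between the product evaluated at n + 1 and the product evaluated at n.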
Consider the sum in (9.45) first if . Let . By (9.47)
[TABLE]
Given two sequences, and , write to mean . In (9.48), so if we write , the term produces products of two ’s and two ’s, so in by (a). Thus
[TABLE]
The conditional sum of is finite and indeed zero since and
[TABLE]
Thus is conditionally summable.
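The mechanism of this step can be seen in a simple model case. The sketch below assumes the sum has the form Σ a_n³ (a_{n+1} − a_n) for a finitely supported real sequence extended by zero; this is our illustrative stand-in for the case treated above, not a transcription of (9.48). Writing d = a_{n+1} − a_n, one has a_{n+1}⁴ − a_n⁴ = 4a_n³d + 6a_n²d² + 4a_n d³ + d⁴, so the cubic sum equals a telescoping sum minus terms containing at least two difference factors.

```python
from fractions import Fraction

def cubic_times_difference(a):
    """Sum over n of a_n^3 * (a_{n+1} - a_n)."""
    return sum(a[n] ** 3 * (a[n + 1] - a[n]) for n in range(len(a) - 1))

def higher_order_remainder(a):
    """Sum over n of (6 a_n^2 d^2 + 4 a_n d^3 + d^4) / 4 with d = a_{n+1} - a_n."""
    total = Fraction(0)
    for n in range(len(a) - 1):
        d = a[n + 1] - a[n]
        total += Fraction(6 * a[n] ** 2 * d ** 2 + 4 * a[n] * d ** 3 + d ** 4, 4)
    return total

def telescoping_part(a):
    """Sum over n of (a_{n+1}^4 - a_n^4) / 4; only the endpoints survive."""
    return Fraction(a[-1] ** 4 - a[0] ** 4, 4)
```

When the sequence vanishes at both ends the telescoping part is zero, and what remains is controlled by products with two difference factors, which is the kind of term handled by part (a).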
Consider next the case . By (9.47) and the same argument that led to (9.49)
[TABLE]
since, as above, we can replace by making an error in the four–fold product.
Telescoping as in (9.49), we have that is conditionally summable. Note that whether a sequence is conditionally summable or not doesn’t change by a translation of index so we can replace by and conclude that
[TABLE]
is conditionally summable and thus is conditionally summable proving the result when .
Now consider general . Since , we can change to any value we want making an change. Similarly, by shifting by multiples of 2 units, we can change each of to [math] or . If they are all equal after this, set to the common value and get conditional convergence by the case (0,0,0,0). If the first three ’s have two equal and one unequal, set to the unequal value and get either (1,1,0,0) or (0,0,1,1). We’ve handled the first and by using the trick, (0,0,1,1) is the same as (0,0,-1,-1) and by covariance, that is the same as (1,1,0,0). ∎
Next, we recall the remarkable fact that if (1.16)+(1.18), then ! (see Theorem 2.6).
Proof of Theorem 9.1.
As we’ve seen, we need only show that is conditionally convergent. We only used (1.16)+(1.18) so far, but not (1.17) which we’ll use in the form .
We begin by noting that because of (c) of the last Proposition, is conditionally convergent. Using that index shifts modify sums only by boundary terms, we conclude that
[TABLE]
is conditionally convergent.
Since , using again that index shifts do not affect conditional convergence and (9.53), we see that implies that
[TABLE]
is conditionally convergent.
On the other hand, by (c) of the last Proposition, in (9.32) we can replace by and by without affecting conditional convergence. If we do that and use (9.53) again, we see that is a conditionally convergent sum plus
[TABLE]
This is half the sum in (9.54) so (1.17) implies conditional convergence of the sum in . ∎
- [1]
- [2] G. Anderson, A. Guionnet and O. Zeitouni, An Introduction to Random Matrices, Cambridge University Press, 2010.
- [3] G. Ben Arous and A. Guionnet, Large deviations for Wigner’s law and Voiculescu’s non-commutative entropy, Probab. Theory Rel. Fields 108 (1997), 517–542.
- [4] J. Breuer, B. Simon and O. Zeitouni, Large Deviations and Sum Rules for Spectral Theory – A Pedagogical Approach, J. Spec. Th., to appear.
- [5] A. Dembo and O. Zeitouni, Large Deviations Techniques and Applications, 2nd Edition, Springer, Berlin, 1998.
- [6] S. Denisov and S. Kupin, Asymptotics of the orthogonal polynomials for the Szegő class with a polynomial weight, J. Approx. Theory 139 (2006), 8–28.
- [7] J. Deuschel and D. Stroock, Large Deviations, Academic Press, Boston, 1989.
- [8] E. Gagliardo, Proprietà di alcune classi di funzioni in più variabili, Ric. Mat. 7 (1958), 102–137.
