Orthogonal and multiple orthogonal polynomials, random matrices, and Painlevé equations
Walter Van Assche

TL;DR
This paper introduces orthogonal and multiple orthogonal polynomials, explores their applications in random matrix theory, and discusses their connection to Painlevé equations, highlighting their significance in mathematical physics and related fields.
Contribution
It provides an overview of the theory of orthogonal and multiple orthogonal polynomials and elucidates their links with Painlevé equations in the context of random matrices.
Findings
Orthogonal polynomials are fundamental in mathematical physics and probability.
Multiple orthogonal polynomials extend classical theory with new applications.
Connections between orthogonal polynomials and Painlevé equations are established.
Abstract
Orthogonal polynomials and multiple orthogonal polynomials are interesting special functions because there is a beautiful theory for them, with many examples and useful applications in mathematical physics, numerical analysis, statistics and probability and many other disciplines. In these notes we give an introduction to the use of orthogonal polynomials in random matrix theory, we explain the notion of multiple orthogonal polynomials, and we show the link with certain non-linear difference and differential equations known as Painlevé equations.
Department of Mathematics
KU Leuven
Celestijnenlaan 200B box 2400
BE 3001 Leuven, Belgium
Key words and phrases:
Orthogonal polynomials, random matrices, multiple orthogonal polynomials, Painlevé equations
1991 Mathematics Subject Classification:
Primary 33C45, 42C05, 60B20, 33E17; Secondary 15B52, 34M55, 41A21
1. Introduction
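For these lecture notes I assume the reader is familiar with the basic theory of orthogonal polynomials; in particular, the classical orthogonal polynomials (Jacobi, Laguerre, Hermite) should be known. In this introduction we will fix the notation and terminology. Let $\mu$ be a positive measure on the real line for which all the moments $m_n$, $n \in \mathbb{N} = \{0,1,2,\ldots\}$, exist, where
$$ m_n = \int_{\mathbb{R}} x^n \, d\mu(x). $$
The orthonormal polynomials $(p_n)_{n \in \mathbb{N}}$ are such that $p_n(x) = \gamma_n x^n + \cdots$, with $\gamma_n > 0$, satisfying the orthogonality condition
$$ \int_{\mathbb{R}} p_n(x) p_m(x) \, d\mu(x) = \delta_{m,n}, \qquad m,n \in \mathbb{N}. $$
It is well known that the zeros of $p_n$ are real and simple, and we denote them by
$$ x_{1,n} < x_{2,n} < \cdots < x_{n,n}. $$
Orthonormal polynomials on the real line always satisfy a three-term recurrence relation
$$ x p_n(x) = a_{n+1} p_{n+1}(x) + b_n p_n(x) + a_n p_{n-1}(x), \qquad n \geq 0, $$
with initial condition $p_0 = 1/\sqrt{m_0}$ and $p_{-1} = 0$, with recurrence coefficients $a_{n+1} > 0$ and $b_n \in \mathbb{R}$ for $n \geq 0$. Often we will also use monic orthogonal polynomials, which we denote by capital letters:
$$ P_n(x) = \frac{1}{\gamma_n} p_n(x) = x^n + \cdots. $$
Their recurrence relation is of the form
$$ P_{n+1}(x) = (x - b_n) P_n(x) - a_n^2 P_{n-1}(x), \qquad n \geq 0, $$
with initial conditions $P_0 = 1$ and $P_{-1} = 0$. The classical families of orthogonal polynomials are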
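- The Jacobi polynomials $P_n^{(\alpha,\beta)}$, for which
$$ \int_{-1}^{1} P_n^{(\alpha,\beta)}(x) P_m^{(\alpha,\beta)}(x) (1-x)^{\alpha} (1+x)^{\beta} \, dx = 0, \qquad m \neq n, $$
with parameters $\alpha, \beta > -1$.
- The Laguerre polynomials $L_n^{(\alpha)}$, for which
$$ \int_{0}^{\infty} L_n^{(\alpha)}(x) L_m^{(\alpha)}(x) x^{\alpha} e^{-x} \, dx = 0, \qquad m \neq n, $$
with parameter $\alpha > -1$.
- The Hermite polynomials $H_n$, for which
$$ \int_{-\infty}^{\infty} H_n(x) H_m(x) e^{-x^2} \, dx = 0, \qquad m \neq n. $$
Usually these polynomials are neither normalized nor monic; another normalization is used (for historical reasons), and one has to be a bit careful with some of the general formulas for orthonormal or monic orthogonal polynomials.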
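The caveat about normalization can be made concrete with a small numerical sketch. It assumes the monic recurrence $P_{n+1}(x) = (x-b_n)P_n(x) - a_n^2 P_{n-1}(x)$ together with the classical values $b_n = 0$ and $a_n^2 = n/2$ for the weight $e^{-x^2}$ (quoted from the standard theory, not from these notes), and compares the result with NumPy's Hermite polynomials $H_n$, whose leading coefficient is $2^n$. The helper name `monic_hermite_coeffs` is hypothetical.

```python
import numpy as np
from numpy.polynomial import hermite as H
from numpy.polynomial import polynomial as P


def monic_hermite_coeffs(n):
    """Power-basis coefficients (lowest degree first) of the monic
    Hermite polynomial P_n for the weight exp(-x^2).

    Uses P_{k+1}(x) = x P_k(x) - (k/2) P_{k-1}(x), i.e. b_k = 0 and
    a_k^2 = k/2 (assumed classical values).
    """
    prev, cur = np.array([0.0]), np.array([1.0])  # P_{-1} = 0, P_0 = 1
    for k in range(n):
        nxt = P.polysub(P.polymulx(cur), (k / 2.0) * prev)
        prev, cur = cur, nxt
    return cur


# H_n in the power basis, as normalized by NumPy (leading coefficient 2^n).
n = 5
numpy_hermite = H.herm2poly([0] * n + [1])
rescaled_monic = 2**n * monic_hermite_coeffs(n)
```

Here `rescaled_monic` and `numpy_hermite` agree, which illustrates that the classical normalization differs from the monic one by the factor $2^n$.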
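The matrix
$$ M_n = \begin{pmatrix} m_0 & m_1 & \cdots & m_{n-1} \\ m_1 & m_2 & \cdots & m_n \\ \vdots & \vdots & & \vdots \\ m_{n-1} & m_n & \cdots & m_{2n-2} \end{pmatrix} $$
is the Hankel matrix with the moments of the orthogonality measure $\mu$. The Hankel determinant is
$$ D_n = \det M_n, \qquad D_0 = 1. \tag{1.3} $$
If the support of $\mu$ contains infinitely many points, then $D_n > 0$ for all $n \in \mathbb{N}$.
The monic orthogonal polynomials are given by
$$ P_n(x) = \frac{1}{D_n} \det \begin{pmatrix} m_0 & m_1 & \cdots & m_n \\ m_1 & m_2 & \cdots & m_{n+1} \\ \vdots & \vdots & & \vdots \\ m_{n-1} & m_n & \cdots & m_{2n-1} \\ 1 & x & \cdots & x^n \end{pmatrix} \tag{1.4} $$
and
$$ \frac{1}{\gamma_n^2} = \frac{D_{n+1}}{D_n}. \tag{1.5} $$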
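As a hedged numerical illustration of the Hankel determinants, the sketch below takes the Hermite weight $e^{-x^2}$, whose moments are $m_{2k} = \Gamma(k+\tfrac12)$ and $m_{2k+1} = 0$, and checks positivity together with the ratio $D_{n+1}/D_n = \int P_n^2\,d\mu$. The closed form $\sqrt{\pi}\,n!/2^n$ for this integral is the classical value for monic Hermite polynomials and is an assumption imported from the standard theory; the convention that $D_n$ is the $n \times n$ determinant with $D_0 = 1$ is also assumed. Function names are hypothetical.

```python
import math
import numpy as np


def hermite_moment(k):
    """Moment m_k of the weight exp(-x^2) on the real line."""
    return math.gamma((k + 1) / 2) if k % 2 == 0 else 0.0


def hankel_det(n):
    """D_n = det(m_{i+j})_{i,j=0}^{n-1}, with the convention D_0 = 1."""
    if n == 0:
        return 1.0
    M = np.array([[hermite_moment(i + j) for j in range(n)] for i in range(n)])
    return float(np.linalg.det(M))


# Ratios D_{n+1}/D_n compared with the squared norms of monic Hermite polynomials.
ratios = [hankel_det(n + 1) / hankel_det(n) for n in range(5)]
expected = [math.sqrt(math.pi) * math.factorial(n) / 2**n for n in range(5)]
```

The two lists agree to rounding error for these small sizes; for large $n$ Hankel matrices become badly conditioned, so this direct approach is only meant for illustration.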
The Christoffel-Darboux kernel is defined as
$$ K_n(x,y) = \sum_{k=0}^{n-1} p_k(x) p_k(y). $$
This Christoffel-Darboux kernel is a reproducing kernel: for every polynomial $q_{n-1}$ of degree $\leq n-1$ one has
$$ \int K_n(x,y) q_{n-1}(y) \, d\mu(y) = q_{n-1}(x). $$
If $f$ is a function in $L^2(\mu)$, then
$$ f_n(x) = \int K_n(x,y) f(y) \, d\mu(y) $$
gives a polynomial of degree $\leq n-1$ which is the least squares approximant of $f$ in the space of polynomials of degree $\leq n-1$. The Christoffel-Darboux kernel is a sum of $n$ terms containing all the polynomials $p_0, p_1, \ldots, p_{n-1}$, but there is a nice formula that expresses the kernel in just two terms containing the polynomials of degree $n-1$ and $n$ only:
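Property 1.1.
The Christoffel-Darboux formula is
$$ \sum_{k=0}^{n-1} \gamma_k^2 P_k(x) P_k(y) = \gamma_{n-1}^2 \, \frac{P_n(x) P_{n-1}(y) - P_{n-1}(x) P_n(y)}{x-y}, $$
and its confluent version is
$$ \sum_{k=0}^{n-1} \gamma_k^2 P_k^2(x) = \gamma_{n-1}^2 \bigl( P_n'(x) P_{n-1}(x) - P_{n-1}'(x) P_n(x) \bigr). $$
The version for orthonormal polynomials is
Property 1.2.
The Christoffel-Darboux formula is
$$ \sum_{k=0}^{n-1} p_k(x) p_k(y) = a_n \, \frac{p_n(x) p_{n-1}(y) - p_{n-1}(x) p_n(y)}{x-y}, $$
and its confluent version is
$$ \sum_{k=0}^{n-1} p_k^2(x) = a_n \bigl( p_n'(x) p_{n-1}(x) - p_{n-1}'(x) p_n(x) \bigr). $$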
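The orthonormal form of the Christoffel-Darboux formula, $\sum_{k=0}^{n-1} p_k(x)p_k(y) = a_n \bigl(p_n(x)p_{n-1}(y) - p_{n-1}(x)p_n(y)\bigr)/(x-y)$, can be checked numerically. The sketch below does this for the Hermite weight $e^{-x^2}$, assuming the classical recurrence coefficients $a_k = \sqrt{k/2}$, $b_k = 0$ and $p_0 = \pi^{-1/4}$; the summation range $0,\ldots,n-1$ is the convention assumed here, and the function names are hypothetical.

```python
import math


def orthonormal_hermite(n, x):
    """Return [p_0(x), ..., p_n(x)] for the weight exp(-x^2), using
    x p_k = a_{k+1} p_{k+1} + a_k p_{k-1} with a_k = sqrt(k/2) (assumed)."""
    p = [math.pi ** -0.25]  # p_0 = 1/sqrt(m_0) with m_0 = sqrt(pi)
    for k in range(n):
        below = math.sqrt(k / 2.0) * p[k - 1] if k > 0 else 0.0
        p.append((x * p[k] - below) / math.sqrt((k + 1) / 2.0))
    return p


def kernel_sum(n, x, y):
    """K_n(x, y) as the sum of n products p_k(x) p_k(y), k = 0, ..., n-1."""
    px, py = orthonormal_hermite(n - 1, x), orthonormal_hermite(n - 1, y)
    return sum(u * v for u, v in zip(px, py))


def kernel_closed(n, x, y):
    """K_n(x, y) from the two-term Christoffel-Darboux formula (x != y)."""
    px, py = orthonormal_hermite(n, x), orthonormal_hermite(n, y)
    a_n = math.sqrt(n / 2.0)
    return a_n * (px[n] * py[n - 1] - px[n - 1] * py[n]) / (x - y)


difference = abs(kernel_sum(6, 0.3, -1.1) - kernel_closed(6, 0.3, -1.1))
```

The closed form needs only two polynomial values per point, which is what makes it useful for asymptotic analysis; numerically it suffers from cancellation when $x$ and $y$ are close, where the confluent version is the better choice.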
2. Orthogonal polynomials and random matrices
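The link between orthogonal polynomials and random matrices is via the Christoffel-Darboux kernel and Heine’s formula for orthogonal polynomials, see Property 2.1. Useful references for random matrices are Mehta’s book [31], the book by Anderson, Guionnet and Zeitouni [1], and Deift’s monograph [11]. First of all, let $x_1, x_2, \ldots, x_n$ be real or complex numbers, then we define the Vandermonde determinant as
$$ \Delta_n(x_1,\ldots,x_n) = \det \begin{pmatrix} 1 & 1 & \cdots & 1 \\ x_1 & x_2 & \cdots & x_n \\ x_1^2 & x_2^2 & \cdots & x_n^2 \\ \vdots & \vdots & & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1} \end{pmatrix}. \tag{2.1} $$
This Vandermonde determinant can be evaluated explicitly:
$$ \Delta_n(x_1,\ldots,x_n) = \prod_{1 \leq i < j \leq n} (x_j - x_i). $$
From this it is clear that $\Delta_n(x_1,\ldots,x_n) \neq 0$ when all the $x_i$ are distinct, and if $x_1 < x_2 < \cdots < x_n$ then $\Delta_n(x_1,\ldots,x_n) > 0$. Heine’s formula expresses the Hankel determinant with the moments of a measure $\mu$ as an $n$-fold integral:
Property 2.1 (Heine).
The Hankel determinants in (1.3) can be written as
$$ D_n = \frac{1}{n!} \int_{\mathbb{R}^n} \Delta_n^2(x_1,\ldots,x_n) \, d\mu(x_1) \cdots d\mu(x_n), \tag{2.2} $$
where $\Delta_n$ is the Vandermonde determinant (2.1). Furthermore, the monic orthogonal polynomial $P_n$ is also given by an $n$-fold integral
$$ P_n(x) = \frac{1}{n! \, D_n} \int_{\mathbb{R}^n} \prod_{i=1}^n (x - x_i) \, \Delta_n^2(x_1,\ldots,x_n) \, d\mu(x_1) \cdots d\mu(x_n). \tag{2.3} $$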
Proof.
If we write all the moments in the first row of (1.3) as an integral and use linearity of the determinant (for one row), then
[TABLE]
Repeating this for every row gives
[TABLE]
In each row we can take out the common factors to find
[TABLE]
Now write the integral over as a sum of integrals over all simplices , where is a permutation of . Then
[TABLE]
With the change of variables one has , with and
[TABLE]
Observe that , so that
[TABLE]
Now use
[TABLE]
to find
[TABLE]
This is an integral over one simplex in . This integral is the same for every simplex, and since there are simplices (because there are permutations of ), we find the required formula (2.2).
The proof for formula (2.3) is similar, using the determinant expression (1.4) for the monic orthogonal polynomial. ∎
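Heine's formula for the Hankel determinant, $D_n = \frac{1}{n!}\int \Delta_n^2 \, d\mu(x_1)\cdots d\mu(x_n)$, can be tested exactly on a discrete measure, where the $n$-fold integral becomes a finite sum over $n$-tuples of mass points. The following sketch uses a hypothetical measure with four mass points; the points, weights and function names are illustrative assumptions, not taken from the notes.

```python
import itertools
import math
import numpy as np

# Hypothetical discrete measure: mass WEIGHTS[i] at the point POINTS[i].
POINTS = [-1.0, 0.0, 0.5, 2.0]
WEIGHTS = [0.1, 0.2, 0.3, 0.4]


def hankel_det(n):
    """D_n = det(m_{i+j})_{i,j=0}^{n-1} for the discrete measure above."""
    m = [sum(w * t**k for t, w in zip(POINTS, WEIGHTS)) for k in range(2 * n - 1)]
    M = np.array([[m[i + j] for j in range(n)] for i in range(n)])
    return float(np.linalg.det(M))


def heine_integral(n):
    """(1/n!) times the n-fold integral of the squared Vandermonde determinant,
    written as a sum over all n-tuples of mass points."""
    total = 0.0
    for idx in itertools.product(range(len(POINTS)), repeat=n):
        xs = [POINTS[i] for i in idx]
        vdm = math.prod(xs[j] - xs[i] for i in range(n) for j in range(i + 1, n))
        total += vdm**2 * math.prod(WEIGHTS[i] for i in idx)
    return total / math.factorial(n)


pairs = [(hankel_det(n), heine_integral(n)) for n in (1, 2, 3)]
```

Tuples with a repeated point contribute nothing because the Vandermonde determinant vanishes there, which is the discrete counterpart of the repulsion built into $\Delta_n^2$.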
It is remarkable that Szegő writes in his book [40]:
[These] Formulas … are not suitable in general for derivation of properties of the polynomials in question. To this end we shall generally prefer the orthogonality property itself, or other representations derived by means of the orthogonality property.
Heine’s formulas have now become crucial in the theory of random matrices.
2.1. Point processes
An $n$-point process is a stochastic process where a set of $n$ points is selected, and the joint distribution of the random variables $X_1, \ldots, X_n$ is given. Since we are dealing with a set of random numbers, the order of the random variables is irrelevant and hence we use a probability distribution which is invariant under permutations. Our interest is in the $n$-point process where the joint probability distribution has a density (with respect to the product measure $d\mu(x_1) \cdots d\mu(x_n)$) given by
$$ P(x_1,\ldots,x_n) = \frac{1}{n! \, D_n} \Delta_n^2(x_1,\ldots,x_n), \tag{2.4} $$
where we mean that
$$ \mathrm{Prob}(X_1 \leq y_1, \ldots, X_n \leq y_n) = \int_{-\infty}^{y_1} \cdots \int_{-\infty}^{y_n} P(x_1,\ldots,x_n) \, d\mu(x_1) \cdots d\mu(x_n). $$
Observe that by Heine’s formula (2.2) this is indeed a probability distribution since it is positive and integrates over $\mathbb{R}^n$ to one. The points in this $n$-point process are not independent and the factor $\Delta_n^2(x_1,\ldots,x_n)$ describes the dependence of the points. Two points are unlikely to be close together because then $\Delta_n^2$ is small, and by the maximum likelihood principle the points will prefer to choose a position that maximizes $\Delta_n^2$. This $n$-point process therefore has points that repel each other.
An important property of this -point process is that it is a determinantal point process. To see this, we will express the probability density in terms of the Christoffel-Darboux kernel. We need a few important properties of that kernel.
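Property 2.2.
The Christoffel-Darboux kernel satisfies
$$ \int K_n(x,y) K_n(y,z) \, d\mu(y) = K_n(x,z) $$
and
$$ \int K_n(x,x) \, d\mu(x) = n. $$
Proof.
The first property follows from the reproducing property of the Christoffel-Darboux kernel. For the second property we have
$$ \int K_n(x,x) \, d\mu(x) = \sum_{k=0}^{n-1} \int p_k^2(x) \, d\mu(x) = n. $$
∎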
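Both kernel identities, $\int K_n(x,y)K_n(y,z)\,d\mu(y) = K_n(x,z)$ and $\int K_n(x,x)\,d\mu(x) = n$, involve only polynomial integrands, so for the Hermite weight they can be verified exactly (up to rounding) with Gauss-Hermite quadrature. The sketch below assumes the weight $e^{-x^2}$, the classical recurrence coefficients $a_k = \sqrt{k/2}$, and the kernel convention $K_n = \sum_{k=0}^{n-1} p_k \otimes p_k$; function names are hypothetical.

```python
import math
import numpy as np
from numpy.polynomial.hermite import hermgauss


def orthonormal_hermite(n, x):
    """Array of shape (n+1, len(x)) with rows p_0(x), ..., p_n(x) for exp(-x^2)."""
    x = np.asarray(x, dtype=float)
    p = [np.full_like(x, math.pi ** -0.25)]
    for k in range(n):
        below = math.sqrt(k / 2.0) * p[k - 1] if k > 0 else 0.0
        p.append((x * p[k] - below) / math.sqrt((k + 1) / 2.0))
    return np.array(p)


def cd_kernel(n, x, y):
    """Matrix with entries K_n(x_i, y_j) = sum_{k<n} p_k(x_i) p_k(y_j)."""
    return orthonormal_hermite(n - 1, x).T @ orthonormal_hermite(n - 1, y)


n = 5
# n + 2 nodes integrate polynomials of degree <= 2n + 3 exactly against exp(-x^2).
nodes, weights = hermgauss(n + 2)

trace = np.sum(weights * np.diag(cd_kernel(n, nodes, nodes)))

x, z = np.array([0.4]), np.array([-0.9])
lhs = (cd_kernel(n, x, nodes) * weights) @ cd_kernel(n, nodes, z)
rhs = cd_kernel(n, x, z)
```

Here `trace` reproduces $n$ and `lhs` reproduces `rhs`, so the kernel acts as an orthogonal projection of rank $n$ in $L^2(\mu)$.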
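Property 2.3.
The density (2.4) can be written as
$$ P(x_1,\ldots,x_n) = \frac{1}{n!} \det \bigl( K_n(x_i,x_j) \bigr)_{i,j=1}^n, $$
where $K_n$ is the Christoffel-Darboux kernel.
Proof.
If we add rows in the Vandermonde determinant (2.1), then
$$ \Delta_n(x_1,\ldots,x_n) = \det \bigl( P_{i-1}(x_j) \bigr)_{i,j=1}^n $$
for any sequence $(P_k)_{k \geq 0}$ of monic polynomials with $\deg P_k = k$. If we take the monic orthogonal polynomials, then
$$ \Delta_n(x_1,\ldots,x_n) = \frac{1}{\gamma_0 \gamma_1 \cdots \gamma_{n-1}} \det \bigl( p_{i-1}(x_j) \bigr)_{i,j=1}^n, $$
where $p_k = \gamma_k P_k$. Then use (1.5) to find that $\gamma_0^2 \gamma_1^2 \cdots \gamma_{n-1}^2 = 1/D_n$, so that
$$ \Delta_n^2(x_1,\ldots,x_n) = D_n \det \Bigl( \sum_{k=0}^{n-1} p_k(x_i) p_k(x_j) \Bigr)_{i,j=1}^n = D_n \det \bigl( K_n(x_i,x_j) \bigr)_{i,j=1}^n, $$
which combined with (2.4) gives the required result. ∎
For this reason we call the $n$-point process with density (2.4) the Christoffel-Darboux point process.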
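The equality between the Vandermonde form of the density and its determinantal form can be checked at a sample point. The sketch below does so for the Hermite weight $e^{-x^2}$, assuming that the density is $\Delta_n^2/(n!\,D_n)$ with $D_n = \prod_{k=0}^{n-1} \sqrt{\pi}\,k!/2^k$ (the product of the classical squared norms of monic Hermite polynomials) and that the kernel is $K_n = \sum_{k=0}^{n-1} p_k \otimes p_k$. The function names and the sample point are hypothetical.

```python
import math
import numpy as np


def orthonormal_hermite(n, x):
    """Array of shape (n+1, len(x)) with rows p_0(x), ..., p_n(x) for exp(-x^2)."""
    x = np.asarray(x, dtype=float)
    p = [np.full_like(x, math.pi ** -0.25)]
    for k in range(n):
        below = math.sqrt(k / 2.0) * p[k - 1] if k > 0 else 0.0
        p.append((x * p[k] - below) / math.sqrt((k + 1) / 2.0))
    return np.array(p)


def density_vandermonde(x):
    """Squared Vandermonde determinant divided by n! D_n (assumed form of the density)."""
    n = len(x)
    D_n = math.prod(math.sqrt(math.pi) * math.factorial(k) / 2**k for k in range(n))
    vdm = math.prod(x[j] - x[i] for i in range(n) for j in range(i + 1, n))
    return vdm**2 / (math.factorial(n) * D_n)


def density_kernel(x):
    """(1/n!) det(K_n(x_i, x_j)) with the Christoffel-Darboux kernel."""
    n = len(x)
    p = orthonormal_hermite(n - 1, x)  # p[k, i] = p_k(x_i)
    return float(np.linalg.det(p.T @ p)) / math.factorial(n)


sample = np.array([-1.2, 0.1, 0.7, 1.5])
values = (density_vandermonde(sample), density_kernel(sample))
```

Both entries of `values` coincide, and both vanish when two coordinates of the sample point are equal.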
2.2. Determinantal point process
The fact that the density can be written as a determinant of a kernel function that satisfies Property 2.2 is important and allows us to compute correlation functions for $k$ points of the point process, in particular the probability density of one point (for $k=1$).
Definition 2.4.
For the th correlation function is
[TABLE]
The interpretation of these th correlation functions is the following: if (), and is the number of points in , then
[TABLE]
The th correlation function can also be seen as the density of the marginal distribution of points in the -point process, up to a normalization factor:
Property 2.5.
The th correlation function is obtained from by
[TABLE]
Proof.
For we have, by expanding the determinant along the last row,
[TABLE]
By Property 2.2 the last term is . Expanding the remaining determinant along the last column gives
[TABLE]
The determinant does not contain , so the remaining integration can be done using Property 2.2 and gives
[TABLE]
The sum over gives the determinant (recall that column which contains is missing since )
[TABLE]
and to get the last column in the th position, we need to interchange columns times, which gives
[TABLE]
and hence
[TABLE]
To prove the case for all one uses induction on , for which we just proved the case . ∎
Definition 2.6.
A point process on with correlation functions is a determinantal point process if there exists a kernel such that for every and every
[TABLE]
The following theorem shows that Property 2.2 is indeed crucial.
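Theorem 2.7.
Suppose $K$ is a kernel such that
- $\int K(x,x) \, d\mu(x) = n \in \mathbb{N}$,
- For every $k \leq n$ and $x_1, \ldots, x_k \in \mathbb{R}$, one has $\det\bigl(K(x_i,x_j)\bigr)_{i,j=1}^{k} \geq 0$.
- $K(x,y) = \int K(x,s) K(s,y) \, d\mu(s)$.
Then
$$ P(x_1,\ldots,x_n) = \frac{1}{n!} \det \bigl( K(x_i,x_j) \bigr)_{i,j=1}^n $$
is a probability density on $\mathbb{R}^n$ which is invariant under permutations of coordinates. The associated $n$-point process is determinantal.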
The most important example (at least in the context of this section) is when , and then one can take
[TABLE]
2.3. Random matrices
To see the relation with random matrices, we claim that the eigenvalues of certain random matrices of order form a determinantal point process with the Christoffel-Darboux kernel for a particular family of orthogonal polynomials. The Gaussian unitary ensemble (GUE) consists of Hermitian random matrices of order with random entries
[TABLE]
[TABLE]
where all are independent normal random variables with mean zero and variance (if ) or (if ). The multivariate density is
[TABLE]
where is a normalizing constant. But this is also equal to
[TABLE]
where for , , and .
We are mostly interested in the eigenvalues of the random matrix . To find the density of the eigenvalues, we use the change of variables: , where is a unitary matrix for which
[TABLE]
and , and then integrate over the unitary part , which leaves only the eigenvalues. This change of variables is done using the Weyl integration formula (see, e.g., [1, §4.1.3]):
Theorem 2.8 (Weyl integration formula).
For the change of variables one has
[TABLE]
where is a constant and is the Haar measure on the unitary group.
We will use a simplified version of this result, for which one does not need the Haar measure on the unitary group. This works when the expression that we want to integrate only depends on the eigenvalues of . Let be the Hermitian matrices of order .
Definition 2.9.
A function is a class function if
[TABLE]
for all unitary matrices .
Theorem 2.10 (Weyl integration formula for class functions).
For an integrable class function we have
[TABLE]
with
[TABLE]
The characteristic polynomial of a matrix only depends on the eigenvalues, hence is a class function. For random matrices in GUE one finds for the average characteristic polynomial
[TABLE]
and by (2.3) this is the monic Hermite polynomial . More generally, the eigenvalues of a random matrix in GUE form a determinantal point process with the Christoffel-Darboux kernel of (scaled) Hermite polynomials. The average number of eigenvalues of in is in terms of the correlation function :
[TABLE]
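The statement that the average characteristic polynomial of a GUE matrix is a monic Hermite polynomial can be probed by simulation. The sketch below assumes the normalization in which the matrix density is proportional to $\exp(-\operatorname{Tr} M^2)$, so that diagonal entries have variance $1/2$ and the real and imaginary parts of off-diagonal entries have variance $1/4$; with this choice the relevant weight is $e^{-x^2}$ and the monic Hermite polynomial of degree 3 is $x^3 - \tfrac32 x$. Other scalings of GUE rescale the polynomial accordingly. The function name, sample size and seed are arbitrary illustrative choices.

```python
import numpy as np


def sample_gue(n, size, rng):
    """Draw `size` Hermitian n-by-n matrices with density proportional to
    exp(-Tr M^2) (assumed normalization of GUE)."""
    sigma = np.sqrt(0.5)
    g = rng.normal(scale=sigma, size=(size, n, n)) \
        + 1j * rng.normal(scale=sigma, size=(size, n, n))
    # Hermitian part: diagonal variance 1/2, off-diagonal real/imag variance 1/4.
    return (g + np.conj(np.swapaxes(g, 1, 2))) / 2


rng = np.random.default_rng(0)
eigenvalues = np.linalg.eigvalsh(sample_gue(3, 20000, rng))  # shape (20000, 3)

# np.poly builds the monic polynomial with the given roots (highest degree first),
# i.e. the characteristic polynomial of each sampled matrix.
avg_charpoly = np.mean([np.poly(e) for e in eigenvalues], axis=0)

monic_hermite_3 = np.array([1.0, 0.0, -1.5, 0.0])  # x^3 - (3/2) x
```

The Monte Carlo average `avg_charpoly` matches `monic_hermite_3` up to sampling error of order $10^{-2}$ for this sample size.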
2.4. Random matrix ensembles
Here we give a few more random matrix ensembles for which the eigenvalues form a determinantal point process with the Christoffel-Darboux kernel of classical orthogonal polynomials.
- We already defined GUE (Gaussian Unitary Ensemble): this contains random matrices in with density
[TABLE]
The average characteristic polynomial is
[TABLE]
This suggests that on the average the eigenvalues behave like the zeros of (scaled) Hermite polynomials. This is indeed true, but for this one needs the correlation function and the result that
[TABLE]
where are the zeros of the Hermite polynomial .
- The Wishart ensemble. Let be a matrix with independent complex Gaussian entries . Then has the Wishart distribution with density
[TABLE]
The average characteristic polynomial is
[TABLE]
Observe that is a positive definite matrix so that all the eigenvalues are positive. On the average they behave like the zeros of Laguerre polynomials.
- Truncated unitary matrices. Let be a random unitary matrix of order and let be the upper left corner . Then is an matrix and
[TABLE]
Unitary matrices have their eigenvalues on the unit circle, and a truncated unitary matrix has its singular values (the eigenvalues of ) in . These eigenvalues behave on the average like the zeros of Jacobi polynomials.
Exercise.
Let be the Hermitian random matrix with entries
where and are independent random variables with means and variances . Show that satisfies the three-term recurrence relation
with and . Identify this as , where is the Hermite polynomial of degree . This shows that the Hermite polynomial is the average characteristic polynomial of a large class of Hermitian random matrices, not only GUE.
So far we found that on the average the eigenvalues of random matrices from these ensembles behave like zeros of orthogonal polynomials. To get more information about individual eigenvalues, for example the largest eigenvalue or the smallest eigenvalue, one needs a more detailed analysis of the point process, and in particular of the asymptotic behavior of the Christoffel-Darboux kernels. For example, to understand the spacing between the eigenvalues in the neighborhood of a point in the bulk of the spectrum, one needs results for
[TABLE]
or, when is at the end of the spectrum,
[TABLE]
where depends on the nature of the endpoint (hard or soft edge). This will give kernels of well-known point processes.
An important quantity of interest is the probability that there are exactly eigenvalues in the set . If there are eigenvalues in , then the number of ordered -tuples in is and thus
[TABLE]
because this is the expected number of ordered -tuples in . For one has
[TABLE]
therefore
[TABLE]
Changing the order of summation (we assume that this is allowed) and using
[TABLE]
we find that
[TABLE]
This is the so-called gap probability: the probability to find no eigenvalues in . For a determinantal point process, such as the eigenvalues of various random matrices, this gap probability is in fact the Fredholm determinant of the operator defined by
[TABLE]
Letting the size of the random matrices tend to infinity then gives the Fredholm determinant of the operator whose kernel is the limit of the Christoffel-Darboux kernel as described above. The lesson to be learned from this is that the asymptotic behavior of orthogonal polynomials and their Christoffel-Darboux kernel gives important insight into the behavior of eigenvalues of random matrices.
3. Multiple orthogonal polynomials
In this section we will explain the notion of multiple orthogonal polynomials. Useful references are Ismail’s book [20, Ch. 23], Nikishin and Sorokin’s book [33, Ch. 4] and the papers [2, 29, 48]. Instead of orthogonality conditions with respect to one measure on the real line, the orthogonality will be with respect to measures, where . For one has the usual orthogonal polynomials, but for one gets two types of multiple orthogonal polynomials.
Let and let be positive measures on the real line, for which all the moments exist. We use multi-indices and denote their length by .
Definition 3.1 (type I).
Type I multiple orthogonal polynomials for consist of the vector of polynomials, with , for which
[TABLE]
with normalization
[TABLE]
Definition 3.2 (type II).
The type II multiple orthogonal polynomial for is the monic polynomial of degree for which
[TABLE]
for .
The conditions for type I and type II multiple orthogonal polynomials give a system of linear equations for the unknown coefficients of the polynomials. This system may not have a solution, or when a solution exists it may not be unique. A multi-index is said to be normal if the type I vector exists and is unique, and this is equivalent to the existence and uniqueness of the monic type II multiple orthogonal polynomial , because the matrix of the linear system for type II is the transpose of the matrix for the type I linear system. Hence is a normal multi-index if and only if
[TABLE]
where
[TABLE]
are rectangular Hankel matrices containing the moments
[TABLE]
3.1. Special systems
Interesting systems of measures are those for which all the multi-indices are normal. We call such systems perfect. Here we will describe two such systems.
Definition 3.3 (Angelesco system).
The measures are an Angelesco system if the supports of the measures are subsets of disjoint intervals , i.e., and whenever .
Usually one allows that the intervals are touching, i.e., whenever .
Theorem 3.4 (Angelesco, Nikishin).
The type II multiple orthogonal polynomial for an Angelesco system has exactly distinct zeros on for .
This means that the type II multiple orthogonal polynomial can be factored as , where has all its zeros on . In fact, is an ordinary orthogonal polynomial of degree on the interval for the measure :
[TABLE]
Observe that for the polynomial has constant sign on .
Corollary 3.5.
Every multi-index is normal (an Angelesco system is perfect).
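The zero location result for Angelesco systems can be illustrated numerically. The sketch below takes a hypothetical Angelesco system with $r=2$, namely Lebesgue measure on the touching intervals $[-1,0]$ and $[0,1]$, writes the type II orthogonality conditions $\int P_{\vec{n}}(x) x^k \, d\mu_j(x) = 0$ for $0 \leq k \leq n_j - 1$ as a linear system for the coefficients of the monic polynomial, and counts the zeros in each interval. The choice of measures, the multi-index $(2,3)$ and the function names are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical Angelesco system: Lebesgue measure on two touching intervals.
INTERVALS = [(-1.0, 0.0), (0.0, 1.0)]


def moment(j, k):
    """k-th moment of Lebesgue measure on the j-th interval."""
    a, b = INTERVALS[j]
    return (b ** (k + 1) - a ** (k + 1)) / (k + 1)


def type2_polynomial(n):
    """Coefficients (highest degree first) of the monic type II polynomial
    of degree |n| = n_1 + n_2 for the multi-index n."""
    N = sum(n)
    rows, rhs = [], []
    for j, nj in enumerate(n):
        for k in range(nj):
            # Condition: sum_i c_i m^{(j)}_{i+k} = -m^{(j)}_{N+k}.
            rows.append([moment(j, i + k) for i in range(N)])
            rhs.append(-moment(j, N + k))
    # The multi-index is normal exactly when this square system is uniquely solvable.
    c = np.linalg.solve(np.array(rows), np.array(rhs))  # c[i] multiplies x^i
    return np.concatenate(([1.0], c[::-1]))


def zeros_per_interval(n):
    """Number of zeros in each open interval, and the largest imaginary part."""
    roots = np.roots(type2_polynomial(n))
    real = roots.real
    counts = [int(np.sum((a < real) & (real < b))) for a, b in INTERVALS]
    return counts, float(np.max(np.abs(roots.imag)))


counts, max_imag = zeros_per_interval((2, 3))
```

For the multi-index $(2,3)$ the computed zeros are real, with two of them in $(-1,0)$ and three in $(0,1)$, in agreement with the theorem. The moment matrix is Hilbert-like and ill-conditioned, so this direct solve is only suitable for small multi-indices.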
Exercise.
Show that every has zeros on .
For another system of measures, which are all supported on the same interval , we need to recall the notion of a Chebyshev system.
Definition 3.6.
The functions are a Chebyshev system on if every linear combination with has at most zeros on .
We can then define an Algebraic Chebyshev system:
Definition 3.7 (AT-system).
The measures are an AT-system on the interval if the measures are all absolutely continuous with respect to a positive measure on , i.e., , and for every the functions
[TABLE]
are a Chebyshev system on .
For an AT-system we have some control of the zeros of the type I and type II multiple orthogonal polynomials.
Theorem 3.8.
For an AT-system the function
[TABLE]
has exactly sign changes on . Furthermore, the type II multiple orthogonal polynomial has exactly distinct zeros on .
Corollary 3.9.
Every multi-index in an AT-system is normal (an AT-system is perfect).
A very special system of measures was introduced by Nikishin in 1980.
Definition 3.10 (Nikishin system for $r=2$).
A Nikishin system of order consists of two measures , both supported on an interval , and such that
[TABLE]
where is a positive measure on an interval and .
Nikishin showed that indices with are perfect. Driver and Stahl [12] proved the more general statement.
Theorem 3.11 (Nikishin, Driver-Stahl).
A Nikishin system of order two is perfect.
In order to define a Nikishin system of order we need some notation. We write for the measure which is absolutely continuous with respect to and for which the Radon-Nikodym derivative is the Stieltjes transform of :
[TABLE]
Nikishin systems of order can then be defined by induction.
Definition 3.12 (Nikishin system for general $r$).
A Nikishin system of order on an interval is a system of measures supported on such that , , where is a Nikishin system of order on an interval and .
Fidalgo Prieto and López Lagomasino proved [13]
Theorem 3.13.
Every Nikishin system is perfect.
In most cases the measures are absolutely continuous with respect to one fixed measure :
[TABLE]
We then define the type I function
[TABLE]
The type I functions and the type II polynomials then are very complementary: they form a biorthogonal system for many multi-indices.
Property 3.14 (biorthogonality).
[TABLE]
3.2. Nearest neighbor recurrence relations
The usual orthogonal polynomials (the case ) on the real line always satisfy a three-term recurrence relation that expresses in terms of the polynomials with neighboring degrees . A similar result is true for multiple orthogonal polynomials, but there are more neighbors for a multi-index. Indeed, the multi-index has neighbors from above by adding 1 to one of the components of . We denote these neighbors from above by for , where with in position . There are also neighbors from below, namely , for . The nearest neighbor recurrence relations for type II multiple orthogonal polynomials are [45]
[TABLE]
Observe that one always uses the same linear combination of the neighbors from below. The nearest neighbor recurrence relations for type I multiple orthogonal polynomials are
[TABLE]
These are using the same recurrence coefficients , but there is a shift for the recurrence coefficients . For the recurrence coefficients and are connected:
Theorem 3.15 (Van Assche [45]).
The recurrence coefficients and satisfy the partial difference equations
[TABLE]
for all .
By combining the equations of the nearest neighbor recurrence relations, one can also find a recurrence relation of order for the multiple orthogonal polynomials along a path from to in . Let be a path in starting from , such that for some . Then
[TABLE]
These coefficients can be expressed in terms of the recurrence coefficients in the nearest neighbor recurrence relations, but the explicit expression is rather complicated for general . An important case is the stepline:
[TABLE]
This recurrence relation of order can be expressed in terms of a Hessenberg matrix with diagonals below the main diagonal:
[TABLE]
[TABLE]
3.3. Christoffel-Darboux formula
The Christoffel-Darboux kernel, which is the important reproducing kernel for orthogonal polynomials, has a counterpart in the theory of multiple orthogonal polynomials. It uses both the type I and type II multiple orthogonal polynomials, and is a sum over a path from to as described before. The Christoffel-Darboux kernel is defined as
[TABLE]
where , and the path in is such that for some satisfying , i.e., in every step the multi-index is increased by 1 in one component. This definition seems to depend on the choice of the path from to , but surprisingly this kernel is independent of that chosen path. This is a consequence of the relations between the recurrence coefficients given by Theorem 3.15 and is best explained by the following analogue of the Christoffel-Darboux formula for orthogonal polynomials:
Theorem 3.16 (Daems and Kuijlaars).
Let be a path in starting from and ending in (where ), such that for some . Then
[TABLE]
Proof.
This was first proved in [9] and a proof based on the nearest neighbor recurrence relations can be found in [45]. ∎
The sum depends only on the endpoint of the path in and not on the path from to this point. In many cases this Christoffel-Darboux kernel can be used to generate a determinantal process by using Theorem 2.7 and the biorthogonality in Property 3.14. The only thing which is not obvious is the positivity , which needs to be checked separately. See [23] for more details about such determinantal processes.
3.4. Hermite-Padé approximation
Multiple orthogonal polynomials have their roots in Hermite-Padé approximation, which was introduced by Hermite and investigated in detail by Padé (for ). Hermite-Padé approximation is a method to approximate functions simultaneously by rational functions. Multiple orthogonal polynomials appear when one uses Hermite-Padé approximation near infinity. Let be Markov functions, i.e.,
[TABLE]
Definition 3.17 (Type I Hermite-Padé approximation).
Type I Hermite-Padé approximation is to find polynomials , with , and a polynomial such that
[TABLE]
The solution is that is the type I multiple orthogonal polynomial vector, and
[TABLE]
The error in this approximation problem can also be expressed in terms of the type I multiple orthogonal polynomials. One has
[TABLE]
and the orthogonality properties of the type I multiple orthogonal polynomials indeed show that (3.1) holds.
Definition 3.18 (Type II Hermite-Padé approximation).
Type II Hermite-Padé approximation is to find a polynomial of degree and polynomials such that
[TABLE]
for .
The solution for this approximation problem is to take the type II multiple orthogonal polynomial and
[TABLE]
Observe that this approximation problem is to find rational approximants to each with a common denominator, and this common denominator turns out to be the type II multiple orthogonal polynomial. The error can again be expressed in terms of the multiple orthogonal polynomial:
[TABLE]
which can be verified by using the orthogonality conditions for the type II multiple orthogonal polynomial.
Hermite-Padé approximants are used frequently in number theory to find good rational approximants for real numbers and to prove irrationality and transcendence of some important real numbers. Hermite used these approximants (but at $0$ rather than $\infty$) to prove that $e$ is a transcendental number.
3.5. Multiple Hermite polynomials
As an example we will describe multiple Hermite polynomials in some detail and explain some applications where they are used. The type II multiple Hermite polynomials satisfy
[TABLE]
for , with whenever . This condition on the parameters guarantees that every multi-index is normal, since the measures with weight function form an AT-system. These multiple orthogonal polynomials can be obtained by using the Rodrigues formula
[TABLE]
Exercise.
Show that the differential operators
are commuting. Use this (and integration by parts) to show that this indeed gives the type II multiple Hermite polynomial.
By using this Rodrigues formula (and the Leibniz rule for the th derivative of a product), one finds the explicit expression
[TABLE]
where are the usual Hermite polynomials. The nearest neighbor recurrence relations for multiple Hermite polynomials are quite simple:
[TABLE]
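The nearest neighbor recurrence can be turned into a small generator for multiple Hermite polynomials. The sketch below assumes the weights $e^{-x^2 + c_j x}$ and the recurrence in the form $x H_{\vec{n}}(x) = H_{\vec{n}+\vec{e}_k}(x) + \frac{c_k}{2} H_{\vec{n}}(x) + \frac12 \sum_{j=1}^r n_j H_{\vec{n}-\vec{e}_j}(x)$, which is the standard form in the literature and is stated here as an assumption since it is not legible above. It then checks the type II orthogonality conditions by Gauss-Hermite quadrature. The parameter values in `C` and all function names are hypothetical.

```python
from functools import lru_cache

import numpy as np
from numpy.polynomial import Polynomial
from numpy.polynomial.hermite import hermgauss

C = (0.7, -0.4)  # hypothetical parameters c_1 != c_2 (r = 2)


@lru_cache(maxsize=None)
def multiple_hermite(n):
    """Monic type II multiple Hermite polynomial for the multi-index n (a tuple),
    built from the assumed nearest neighbor recurrence."""
    if sum(n) == 0:
        return Polynomial([1.0])
    # Step down in the first positive component: n = m + e_k.
    k = next(i for i, nk in enumerate(n) if nk > 0)
    m = tuple(nj - (j == k) for j, nj in enumerate(n))
    result = Polynomial([-C[k] / 2, 1.0]) * multiple_hermite(m)
    for j, mj in enumerate(m):
        if mj > 0:
            lower = tuple(mi - (i == j) for i, mi in enumerate(m))
            result = result - 0.5 * mj * multiple_hermite(lower)
    return result


def inner(poly, power, c, nodes=60):
    """Approximate the integral of poly(x) x^power exp(-x^2 + c x) over the
    real line by Gauss-Hermite quadrature (the factor exp(c x) is entire,
    so the rule converges quickly)."""
    x, w = hermgauss(nodes)
    return float(np.sum(w * poly(x) * x**power * np.exp(c * x)))


H32 = multiple_hermite((3, 2))
residuals = [inner(H32, k, C[0]) for k in range(3)] + [inner(H32, k, C[1]) for k in range(2)]
```

All entries of `residuals` vanish to rounding error, so the polynomial of degree 5 produced by the recurrence is orthogonal to $1, x, x^2$ for the first weight and to $1, x$ for the second. Any other path to the multi-index $(3,2)$ gives the same polynomial, because the type II polynomial is unique for a normal multi-index.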
They also have some useful differential properties: there are raising operators
[TABLE]
and one lowering operator
[TABLE]
By combining these raising operators and the lowering operator one finds a differential equation of order :
[TABLE]
where
[TABLE]
One can also find some integral representations (see [4])
[TABLE]
For the type I multiple Hermite polynomials one has
[TABLE]
where is a closed contour encircling once and none of the other , and
[TABLE]
where is a closed contour encircling all .
3.5.1. Random matrices
These multiple Hermite polynomials are useful for investigating random matrices with external source [5]. Let be a random Hermitian matrix and consider the ensemble with probability distribution
[TABLE]
where is a fixed Hermitian matrix (the external source). The average characteristic polynomial is a multiple Hermite polynomial:
Property 3.19.
Suppose has eigenvalues with multiplicities , then
[TABLE]
Furthermore, the eigenvalues form a determinantal process with the Christoffel-Darboux kernel for multiple Hermite polynomials:
Property 3.20.
The density of the eigenvalues is given by
[TABLE]
where the kernel is given by
[TABLE]
with a path from to in and
[TABLE]
This means that we can also find the correlation functions:
Property 3.21.
The -point correlation function
[TABLE]
is given by
[TABLE]
where the kernel is given by
[TABLE]
3.5.2. Non-intersecting Brownian motions
Another interesting problem in which multiple Hermite polynomials appear is to determine what happens with independent Brownian motions (in fact, Brownian bridges) under the constraint that they are not allowed to intersect, see [10].
The density of the probability that the non-intersecting paths, leaving at and arriving at , are at at time is (Karlin and McGregor [22])
[TABLE]
where
[TABLE]
When and (see Fig. 1) then
[TABLE]
where the kernel is given by
[TABLE]
This kernel is related to the Christoffel-Darboux kernel for the usual Hermite polynomials.
When and , (see Fig. 2) then
[TABLE]
with
[TABLE]
with multiple orthogonal polynomials for the weights
[TABLE]
This kernel is related to the Christoffel-Darboux kernel for multiple Hermite polynomials. An interesting phenomenon appears: for small values of the points at level accumulate on one interval, but for larger values of in the points accumulate on two disjoint intervals. There is a phase transition at a critical point . A detailed asymptotic analysis of the kernel near this point will require a special function satisfying a third order differential equation (the Pearcey equation) which is a limiting case of the third order differential equation of multiple Hermite polynomials. The limiting kernel is known as the Pearcey kernel.
3.6. Multiple Laguerre polynomials
The Laguerre weight is
$$ w(x) = x^{\alpha} e^{-x}, \qquad x \in [0,\infty), \quad \alpha > -1. $$
There are two easy ways to obtain multiple Laguerre polynomials:
- (1) Changing the parameter $\alpha$ to $\alpha_1, \ldots, \alpha_r$. This gives multiple Laguerre polynomials of the first kind.
- (2) Changing the exponential decay at infinity from $e^{-x}$ to $e^{-c_j x}$ with parameters $c_1, \ldots, c_r$. This gives multiple Laguerre polynomials of the second kind.
3.6.1. Multiple Laguerre polynomials of the first kind
Type II multiple Laguerre polynomials of the first kind satisfy
[TABLE]
for . In order that all multi-indices are normal we need to have parameters and whenever , in which case the measures form an AT-system. The multiple orthogonal polynomials can be found from the Rodrigues formula
[TABLE]
An explicit formula is
[TABLE]
Another explicit expression with hypergeometric functions is
[TABLE]
The nearest neighbor recurrence relations are
[TABLE]
with
[TABLE]
and
[TABLE]
These multiple Laguerre polynomials also have some differential properties. There are raising operators
[TABLE]
and there is one lowering operator
[TABLE]
Combining them gives the differential equation
[TABLE]
[TABLE]
3.6.2. Multiple Laguerre polynomials of the second kind
Type II multiple Laguerre polynomials of the second kind satisfy
[TABLE]
for . The parameters need to satisfy and with whenever . The Rodrigues formula is
[TABLE]
from which one finds the explicit expression
[TABLE]
The nearest neighbor recurrence relations are
[TABLE]
with
[TABLE]
The differential properties include raising operators
[TABLE]
and one lowering operator
[TABLE]
They give the differential equation
[TABLE]
where
[TABLE]
3.6.3. Random matrices: Wishart ensemble
Wishart (1928) introduced the Wishart distribution for positive definite Hermitian matrices
[TABLE]
where all the columns of are independent and have a multivariate Gauss distribution with covariance matrix . The density for the Wishart distribution is
[TABLE]
If then Laguerre polynomials (with ) play an important role. If has eigenvalues with multiplicities , then we need multiple Laguerre polynomials of the second kind. The average characteristic polynomial is
[TABLE]
3.7. Jacobi-Piñeiro polynomials
There are several ways to find multiple Jacobi polynomials. Here we only mention one way which uses the same differential operators as the multiple Laguerre polynomials of the first kind. The Jacobi-Piñeiro polynomials satisfy
[TABLE]
for . Hence we are using Jacobi weights on the interval , with but with different parameters . In order to have a perfect system we require whenever . They can be obtained using the Rodrigues formula
[TABLE]
An expression in terms of generalized hypergeometric functions is
[TABLE]
This hypergeometric function does not terminate when is not an integer. Another useful expression is
[TABLE]
Again there are raising differential operators and one lowering operator and the recurrence coefficients are known explicitly. These polynomials are useful for rational approximation of polylogarithms, and in particular for the zeta function at integers. The polylogarithms are defined by
\[ \mathrm{Li}_k(z) = \sum_{n=1}^{\infty} \frac{z^n}{n^k}, \qquad |z| < 1, \]
and one has
[TABLE]
Simultaneous rational approximation to can be done using Hermite-Padé approximation with a limiting case of Jacobi-Piñeiro polynomials where and , which is possible when . This is particularly interesting if we let , since . Apéry’s construction of good rational approximants for (proving that is irrational) essentially makes use of these multiple orthogonal polynomials, see, e.g. [43].
4. Orthogonal polynomials and Painlevé equations
In this section we describe how orthogonal polynomials are related to non-linear difference and differential equations, in particular to discrete Painlevé equations and the six Painlevé differential equations. For a recent discussion on this relation between orthogonal polynomials and Painlevé equations we refer to the monograph [46]. Other useful references are [8, 7, 44].
Painlevé equations (discrete and continuous) appear at various places in the theory of orthogonal polynomials, in particular:
- The recurrence coefficients of some semiclassical orthogonal polynomials satisfy discrete Painlevé equations.
- The recurrence coefficients of orthogonal polynomials with a Toda-type evolution satisfy Painlevé differential equations for which special solutions depending on special functions (Airy, Bessel, (confluent) hypergeometric, parabolic cylinder functions) are relevant.
- Rational solutions of Painlevé equations can be expressed in terms of Wronskians of orthogonal polynomials.
- The local asymptotics for orthogonal polynomials at critical points often involves special transcendental solutions of Painlevé equations.
In this section we will only deal with the first two of these.
What are Painlevé (differential) equations? They are second order nonlinear differential equations
\[ y'' = R(y', y, x), \qquad R \text{ rational}, \]
that have the Painlevé property: The general solution is free from movable branch points. The only singularities which may depend on the initial conditions are poles. Painlevé and his collaborators found 50 families (up to Möbius transformations), all of which could be reduced to known equations and six new equations (new at least at the beginning of the 20th century). The six Painlevé equations are
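\begin{align*}
\mathrm{P_I}:\quad & y'' = 6y^2 + x, \\
\mathrm{P_{II}}:\quad & y'' = 2y^3 + xy + \alpha, \\
\mathrm{P_{III}}:\quad & y'' = \frac{(y')^2}{y} - \frac{y'}{x} + \frac{\alpha y^2 + \beta}{x} + \gamma y^3 + \frac{\delta}{y}, \tag{4.2} \\
\mathrm{P_{IV}}:\quad & y'' = \frac{(y')^2}{2y} + \frac{3}{2}y^3 + 4xy^2 + 2(x^2-\alpha)y + \frac{\beta}{y}, \tag{4.3} \\
\mathrm{P_V}:\quad & y'' = \Bigl(\frac{1}{2y} + \frac{1}{y-1}\Bigr)(y')^2 - \frac{y'}{x} + \frac{(y-1)^2}{x^2}\Bigl(\alpha y + \frac{\beta}{y}\Bigr) + \frac{\gamma y}{x} + \frac{\delta y(y+1)}{y-1}, \tag{4.4} \\
\mathrm{P_{VI}}:\quad & y'' = \frac12\Bigl(\frac1y + \frac1{y-1} + \frac1{y-x}\Bigr)(y')^2 - \Bigl(\frac1x + \frac1{x-1} + \frac1{y-x}\Bigr)y' \\
 & \qquad + \frac{y(y-1)(y-x)}{x^2(x-1)^2}\Bigl(\alpha + \frac{\beta x}{y^2} + \frac{\gamma(x-1)}{(y-1)^2} + \frac{\delta x(x-1)}{(y-x)^2}\Bigr).
\end{align*}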
Discrete Painlevé equations are somewhat more difficult to describe. Roughly speaking they are second order nonlinear recurrence equations for which the continuous limit is a Painlevé equation. They have the singularity confinement property, but this property is not sufficient to characterize discrete Painlevé equations. A quote by Kruskal [24] is:
Anything simpler becomes trivially integrable, anything more complicated becomes hopelessly non-integrable.
A more correct description is that they are nonlinear recurrence relations with ‘nice’ symmetry and geometry. A full classification of discrete (and continuous) Painlevé equations has been found by Sakai [36]. This is based on rational surfaces associated with affine root systems. It describes the space of initial values which parametrizes all the solutions (Okamoto [34]). A fine tuning of this classification was given recently by Kajiwara, Noumi and Yamada [21]: they also include the symmetry, i.e., the group of Bäcklund transformations, which are transformations that map a solution of a Painlevé equation to another solution with different parameters. A partial list of discrete Painlevé equations is:
[TABLE]
where and are constants.
[TABLE]
where and are constants.
[TABLE]
The latter corresponds to where is the surface type and is the symmetry type. Sakai’s classification (surface type) corresponds to the following diagram:
[TABLE]
4.1. Compatibility and Lax pairs
There is a general philosophy behind the reason why Painlevé equations appear for the recurrence coefficients of orthogonal polynomials. Orthogonal polynomials $P_n(x)$ are really functions of two variables: a discrete variable $n$ and a continuous variable $x$. The three-term recurrence relation (1.2) gives a difference equation in the variable $n$, and if the measure is absolutely continuous with a weight function $w$ that satisfies a Pearson equation
\[ \frac{d}{dx}\bigl[\sigma(x)w(x)\bigr] = \tau(x)w(x), \tag{4.7} \]
where $\sigma$ and $\tau$ are polynomials, then the orthogonal polynomials also satisfy differential relations in the variable $x$. If $\deg\sigma \le 2$ and $\deg\tau = 1$ then we are dealing with classical orthogonal polynomials which satisfy the second order differential equation
\[ \sigma(x)y''(x) + \tau(x)y'(x) = \lambda_n y(x), \]
where $\lambda_n = n(n-1)\sigma''/2 + n\tau'$. In the semiclassical case we still have the Pearson equation (4.7) but we allow $\deg\sigma > 2$ or $\deg\tau > 1$. In that case there is a structure relation
\[ \sigma(x)P_n'(x) = \sum_{k=n-t}^{n+s-1} A_{n,k}P_k(x), \tag{4.8} \]
where $s = \deg\sigma$ and $t = \max\{\deg\tau, \deg\sigma - 1\}$. The structure relation (4.8) and the three-term recurrence relation (1.2) have to be compatible: if we differentiate the terms in the recurrence relation (1.2) and replace all the derivatives $P_k'$ using the structure relation (4.8), then we get a linear combination of a finite number of orthogonal polynomials that is equal to $0$. Since (orthogonal) polynomials are linearly independent in the linear space of polynomials, the coefficients in this linear combination have to be zero, and this gives relations between the recurrence coefficients $a_n^2, b_n$ and the coefficients $A_{n,k}$ in the structure relation. Eliminating the $A_{n,k}$ gives recurrence relations for the $a_n^2, b_n$, which turn out to be non-linear. If they are of second order, then we can identify them as discrete Painlevé equations. In this way the three-term recurrence relation and the structure relation can be considered as a Lax pair for the obtained discrete Painlevé equation.
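The mechanics of this compatibility argument can be seen in a minimal worked sketch for the classical Hermite weight $w(x) = e^{-x^2}$. This case is not one of the semiclassical examples of this section; it is used here only as an illustration, assuming the symmetric monic recurrence $xP_n = P_{n+1} + a_n^2P_{n-1}$ and the standard Hermite structure relation $P_n' = nP_{n-1}$. Differentiating the recurrence, replacing the derivatives, and then using the recurrence once more for $xP_{n-1}$ gives:

```latex
\begin{align*}
P_n + xP_n' &= P_{n+1}' + a_n^2 P_{n-1}' \\
P_n + n\,xP_{n-1} &= (n+1)P_n + (n-1)a_n^2 P_{n-2} \\
(n+1)P_n + n\,a_{n-1}^2 P_{n-2} &= (n+1)P_n + (n-1)a_n^2 P_{n-2}
\end{align*}
```

Comparing the coefficients of $P_{n-2}$ gives $n\,a_{n-1}^2 = (n-1)a_n^2$, so $a_n^2/n$ is constant and $a_n^2 = n a_1^2 = n/2$. In this classical case the compatibility relation is linear; the non-linear relations described above only show up once $\sigma$ or $\tau$ has higher degree.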
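In order to get to the Painlevé differential equation, we need to introduce an extra continuous parameter $t$. For this we will use an exponential modification of the measure $\mu$ and investigate orthogonal polynomials for the measure $d\mu_t(x) = e^{xt}\,d\mu(x)$, whenever all the moments of this modified measure exist. We will denote the monic orthogonal polynomials by $P_n(x;t)$ and in this way the orthogonal polynomial is now a function of three variables $(n,x,t)$. The behavior for the parameter $t$ is given by:
Theorem 4.1.
The monic orthogonal polynomials $P_n(x;t)$ for the measure $e^{xt}\,d\mu(x)$ satisfy
\[ \frac{d}{dt}P_n(x;t) = C_n(t)P_{n-1}(x;t), \tag{4.9} \]
where $C_n(t)$ depends only on $t$ and $n$.
Proof.
First of all, since $P_n(x;t) = x^n + \cdots$ is a monic polynomial, the derivative $\frac{d}{dt}P_n(x;t)$ is a polynomial of degree $\le n-1$. We will show that it is orthogonal to $x^k$ for $0 \le k \le n-2$ for the measure $e^{xt}\,d\mu(x)$, so that it is proportional to $P_{n-1}(x;t)$, which proves (4.9). We start from the orthogonality relations
\[ \int P_n(x;t)x^k e^{xt}\,d\mu(x) = 0, \qquad 0 \le k \le n-1, \]
and take derivatives with respect to $t$ to find
\[ \int \Bigl(\frac{d}{dt}P_n(x;t)\Bigr)x^k e^{xt}\,d\mu(x) + \int P_n(x;t)x^{k+1}e^{xt}\,d\mu(x) = 0, \qquad 0 \le k \le n-1. \]
The second integral vanishes for $0 \le k \le n-2$ by orthogonality, hence
\[ \int \Bigl(\frac{d}{dt}P_n(x;t)\Bigr)x^k e^{xt}\,d\mu(x) = 0, \qquad 0 \le k \le n-2, \]
which is what we needed to prove. ∎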
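This relation is not new, see e.g. [39, §4], but has not been sufficiently appreciated in the literature. If we now check the compatibility between (4.9) and the three-term recurrence relation (1.2), then we find differential-difference equations for the recurrence coefficients $a_n^2$ and $b_n$.
Theorem 4.2 (Toda equations).
The recurrence coefficients $a_n^2(t)$ and $b_n(t)$ for the orthogonal polynomials $P_n(x;t)$ satisfy
\begin{align*}
\frac{d}{dt}a_n^2 &= a_n^2(b_n - b_{n-1}), \qquad n \ge 1, \tag{4.10} \\
\frac{d}{dt}b_n &= a_{n+1}^2 - a_n^2, \qquad n \ge 0, \tag{4.11}
\end{align*}
with $a_0^2 = 0$.
Proof.
If we take derivatives with respect to $t$ in the three-term recurrence relation (1.2), then
\[ x\frac{d}{dt}P_n = \frac{d}{dt}P_{n+1} + b_n'P_n + b_n\frac{d}{dt}P_n + (a_n^2)'P_{n-1} + a_n^2\frac{d}{dt}P_{n-1}. \]
Use (4.9) to find
\[ C_n\,xP_{n-1} = (C_{n+1} + b_n')P_n + \bigl(b_nC_n + (a_n^2)'\bigr)P_{n-1} + a_n^2C_{n-1}P_{n-2}. \]
If we compare this with (1.2) (with $n$ shifted to $n-1$), then we find
\begin{align*}
C_n &= C_{n+1} + b_n', \tag{4.12} \\
C_nb_{n-1} &= b_nC_n + (a_n^2)', \tag{4.13} \\
C_na_{n-1}^2 &= a_n^2C_{n-1}. \tag{4.14}
\end{align*}
From (4.14) we find that $C_n/a_n^2$ does not depend on $n$, so that $C_n = c\,a_n^2$, and from (4.12) with $n=0$ (where $C_0 = 0$) we find that $b_0' = -c\,a_1^2$. A simple exercise shows that $b_0' = a_1^2$, so that $c = -1$. If we use this in (4.13), then we find (4.10). If we use it in (4.12), then we find (4.11). ∎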
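A quick sanity check of these statements, sketched here for the Gaussian measure $d\mu(x) = e^{-x^2}dx$ (an illustrative choice, not an example from the text). It assumes the relation $\frac{d}{dt}P_n = -a_n^2P_{n-1}$ and the Toda equations in the form $\frac{d}{dt}a_n^2 = a_n^2(b_n-b_{n-1})$, $\frac{d}{dt}b_n = a_{n+1}^2-a_n^2$, and uses the classical Hermite polynomials $H_n$ with $H_n' = 2nH_{n-1}$:

```latex
\begin{align*}
e^{xt}e^{-x^2} &= e^{t^2/4}\,e^{-(x-t/2)^2}
  \quad\Longrightarrow\quad
  P_n(x;t) = 2^{-n}H_n\bigl(x-\tfrac{t}{2}\bigr),\quad a_n^2 = \tfrac{n}{2},\quad b_n = \tfrac{t}{2}, \\
\frac{d}{dt}P_n(x;t) &= -\tfrac12\,2^{-n}H_n'\bigl(x-\tfrac{t}{2}\bigr)
  = -\tfrac{n}{2}\,2^{-(n-1)}H_{n-1}\bigl(x-\tfrac{t}{2}\bigr) = -a_n^2\,P_{n-1}(x;t), \\
\frac{d}{dt}a_n^2 &= 0 = a_n^2\,(b_n-b_{n-1}), \qquad
\frac{d}{dt}b_n = \tfrac12 = a_{n+1}^2-a_n^2 .
\end{align*}
```

Here the exponential modification only shifts the Gaussian, so the flow is trivial: $a_n^2$ is constant and $b_n$ grows linearly in $t$. Non-trivial flows require a measure that is not merely translated by the factor $e^{xt}$.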
The system (4.10)–(4.11) is closely related to a chain of interacting particles with exponential interaction with their neighbors, introduced by Toda [41] in 1967. If $x_n(t)$ is the position of particle $n$, then the Toda system of equations is
\[ x_n'' = \exp(x_{n-1}-x_n) - \exp(x_n-x_{n+1}). \]
The relation with orthogonal polynomials was made by Flaschka [15, 16] and Manakov [28], who suggested the change of variables
\[ a_n^2 = \exp(x_{n-1}-x_n), \qquad b_n = -x_n', \]
which gives the system (4.10)–(4.11).
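If we are dealing with symmetric orthogonal polynomials, i.e., when the measure $\mu$ is symmetric and all the odd moments are zero, then the three-term recurrence relation simplifies to
\[ xP_n(x) = P_{n+1}(x) + a_n^2P_{n-1}(x). \tag{4.15} \]
A symmetric modification of the measure is given by $d\mu_t(x) = e^{tx^2}\,d\mu(x)$ and the relation (4.9) becomes
\[ \frac{d}{dt}P_n(x;t) = C_n(t)P_{n-2}(x;t). \tag{4.16} \]
The compatibility between (4.15) and (4.16) then gives:
Theorem 4.3 (Langmuir lattice).
Let $\mu$ be a symmetric positive measure on $\mathbb{R}$ for which all the moments exist and let $\mu_t$ be the measure for which $d\mu_t(x) = e^{tx^2}\,d\mu(x)$, where $t \in \mathbb{R}$ is such that all the moments of $\mu_t$ exist. Then the recurrence coefficients of the orthogonal polynomials for $\mu_t$ satisfy the differential-difference equations
\[ \frac{d}{dt}a_n^2 = a_n^2\bigl(a_{n+1}^2 - a_{n-1}^2\bigr), \qquad n \ge 1. \tag{4.17} \]
Proof.
If we differentiate (4.15) with respect to $t$ and then use (4.16), then we find
\[ C_n\,xP_{n-2} = \bigl(C_{n+1} + (a_n^2)'\bigr)P_{n-1} + a_n^2C_{n-1}P_{n-3}. \]
Comparing with (4.15) (with $n$ replaced by $n-2$) gives
\begin{align*}
C_n &= C_{n+1} + (a_n^2)', \tag{4.18} \\
C_na_{n-2}^2 &= a_n^2C_{n-1}. \tag{4.19}
\end{align*}
From (4.19) it follows that $C_n/(a_n^2a_{n-1}^2)$ is constant and therefore equal to $C_2/(a_2^2a_1^2)$. Now $C_2 = -(a_1^2)'$ and one can easily compute $a_1^2$ and $a_2^2$ in terms of the moments to find that $C_2 = -a_1^2a_2^2$, so that $C_n = -a_n^2a_{n-1}^2$. If one uses this in (4.18), then one finds (4.17). ∎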
This differential-difference equation is known as the Langmuir lattice or the Kac-van Moerbeke lattice. We will now illustrate this with a number of explicit examples.
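Before the explicit examples, here is a minimal check of the lattice equation on the simplest symmetric case, the Hermite weight (an illustrative choice, not taken from the text). It assumes the Langmuir lattice in the form $\frac{d}{dt}a_n^2 = a_n^2(a_{n+1}^2 - a_{n-1}^2)$ and the modification $e^{tx^2}e^{-x^2} = e^{-(1-t)x^2}$ with $t<1$, which is a rescaled Gaussian:

```latex
\begin{align*}
a_n^2(t) &= \frac{n}{2(1-t)}, \\
\frac{d}{dt}a_n^2 &= \frac{n}{2(1-t)^2}, \\
a_n^2\bigl(a_{n+1}^2-a_{n-1}^2\bigr) &= \frac{n}{2(1-t)}\cdot\frac{(n+1)-(n-1)}{2(1-t)} = \frac{n}{2(1-t)^2}.
\end{align*}
```

Both sides agree for every $n \ge 1$. The solution blows up as $t \to 1$, which matches the requirement that all moments of the modified measure exist.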
4.2. Discrete Painlevé I
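Let us consider orthogonal polynomials for the weight function $w(x) = e^{-x^4+tx^2}$ on $(-\infty,\infty)$. The symmetry of this weight function implies that the recurrence coefficients $b_n$ in (1.1) or (1.2) vanish and the three-term recurrence relation is (4.15). The orthogonal polynomials also have a nice differential property: the structure relation is
\[ P_n'(x) = A_nP_{n-1}(x) + C_nP_{n-3}(x), \tag{4.20} \]
for certain sequences $(A_n)_n$ and $(C_n)_n$. Indeed, we can express $P_n'$ in terms of the orthogonal polynomials as
\[ P_n'(x) = \sum_{k=0}^{n-1} c_{n,k}P_k(x), \]
where
\[ c_{n,k}\int_{-\infty}^{\infty}P_k^2(x)w(x)\,dx = \int_{-\infty}^{\infty}P_n'(x)P_k(x)w(x)\,dx. \]
Using integration by parts gives
\[ c_{n,k}\int_{-\infty}^{\infty}P_k^2(x)w(x)\,dx = -\int_{-\infty}^{\infty}P_n(x)P_k'(x)w(x)\,dx + \int_{-\infty}^{\infty}P_n(x)P_k(x)(4x^3-2tx)w(x)\,dx, \]
and the last two integrals are zero for $k < n-3$ by orthogonality, so that only $c_{n,n-1}$, $c_{n,n-2}$ and $c_{n,n-3}$ are left. The symmetry of $w$ implies that $P_{2n}$ is an even polynomial and $P_{2n+1}$ is an odd polynomial for every $n$, hence $c_{n,n-2} = 0$. Taking $A_n = c_{n,n-1}$ and $C_n = c_{n,n-3}$ then gives the structure relation.
We now have a recurrence relation (4.15) which describes the behavior of $P_n(x)$ in the (discrete) variable $n$, and a structure relation (4.20) which describes the behavior of $P_n(x)$ in the (continuous) variable $x$. Both relations have to be compatible: if we differentiate (4.15) and then use (4.20) to replace all the derivatives, then comparing coefficients of the polynomials gives the compatibility relations
\[ 4a_n^2\Bigl(a_{n+1}^2 + a_n^2 + a_{n-1}^2 - \frac{t}{2}\Bigr) = n. \tag{4.21} \]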
This simple non-linear recurrence relation is known as discrete Painlevé I (d-P$_{\rm I}$) and is a special case of (4.5) we gave earlier. This particular equation already appears in work of Shohat [37] from 1939, who extended earlier work of Laguerre [25] from 1885. Later it was obtained again by Freud [18] in 1976, who was unaware of the work of Shohat. The special positive solution needed to get the recurrence coefficients was analyzed by Nevai [32] and Lew and Quarles [26]. An asymptotic expansion was found by Máté-Nevai-Zaslavsky [30]. Only later (in 1991) was it recognized as a discrete Painlevé equation by Fokas, Its and Kitaev [17], who coined the name d-P$_{\rm I}$. Magnus [27] used the extra parameter $t$ and showed that, as a function of $t$, the recurrence coefficient $a_n^2$ satisfies the differential equation Painlevé IV, as we will see later.
The discrete Painlevé equation (4.21) makes it easy to find the asymptotic behavior as $n \to \infty$:
Theorem 4.4 (Freud).
The recurrence coefficients for the weight $w(x) = e^{-x^4+tx^2}$ on $(-\infty,\infty)$ satisfy
\[ \lim_{n\to\infty}\frac{a_n}{n^{1/4}} = \frac{1}{12^{1/4}}. \]
Observe that (4.21) is a second order recurrence relation, so one needs two initial conditions $a_0^2$ and $a_1^2$ to generate all the recurrence coefficients. It turns out that the recurrence coefficients are a special solution with $a_0^2 = 0$ for which all $a_n^2$ are positive for $n \ge 1$. This means that there is only one special initial value $a_1^2$ that gives a positive solution. Put $x_n = a_n^2$, then (for $t=0$)
\[ x_n(x_{n+1} + x_n + x_{n-1}) = \frac{n}{4}. \tag{4.22} \]
Theorem 4.5 (Lew and Quarles, Nevai).
There is a unique solution of (4.22) for which $x_0 = 0$ and $x_n > 0$ for all $n \ge 1$.
Hence one should not use the recurrence relation (4.22) to generate the recurrence coefficients starting from $x_0 = 0$ and $x_1$, because a small error in $x_1$ will produce a sequence for which not all the terms are positive. A small perturbation in the initial condition $x_1$ has a very important effect on the solution as $n \to \infty$. This is not unusual for non-linear recurrence relations. Instead it is better to generate the positive solution by using a fixed point algorithm, because the positive solution turns out to be the fixed point of a contraction in an appropriate normed space of infinite sequences. See, e.g., [46, §2.3].
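The contrast between the two strategies can be tried numerically. The following is a minimal sketch, not the algorithm of [46, §2.3]: it assumes the weight $e^{-x^4}$ (so $t=0$), the recurrence in the form $x_n(x_{n+1}+x_n+x_{n-1}) = n/4$, and the initial value $x_1 = \Gamma(3/4)/\Gamma(1/4)$ (the ratio of the second and zeroth moments). The function names, the truncation with a frozen boundary value $\sqrt{(N+1)/12}$, and the particular iteration map (solving the quadratic in $x_n$ for its positive root) are choices made for this illustration.

```python
import math


def forward_recursion(x1, n_max):
    """Run x_{n+1} = n/(4 x_n) - x_n - x_{n-1} from x_0 = 0 and the given x_1.

    Returns the computed terms, stopping at the first non-positive one.
    """
    xs = [0.0, x1]
    for n in range(1, n_max):
        nxt = n / (4.0 * xs[n]) - xs[n] - xs[n - 1]
        xs.append(nxt)
        if nxt <= 0.0:
            break
    return xs


def positive_solution(n_max, sweeps=200):
    """Approximate x_0..x_{n_max} of the positive solution by fixed-point sweeps.

    Each sweep replaces x_n by the positive root of
    x_n^2 + (x_{n+1} + x_{n-1}) x_n - n/4 = 0, keeping x_0 = 0 and freezing
    the boundary value x_{n_max+1} at the asymptotic guess sqrt((n_max+1)/12).
    """
    x = [math.sqrt(n / 12.0) for n in range(n_max + 2)]  # x[0] = 0
    for _ in range(sweeps):
        new = x[:]
        for n in range(1, n_max + 1):
            s = x[n + 1] + x[n - 1]
            new[n] = 0.5 * (-s + math.sqrt(s * s + n))
        x = new
    return x[: n_max + 1]


x1_exact = math.gamma(0.75) / math.gamma(0.25)
stable = positive_solution(60)
unstable = forward_recursion(x1_exact, 200)

print("x_1 from the fixed-point sweeps:", stable[1])
print("x_1 from the moments           :", x1_exact)
print("forward recursion stopped at n =", len(unstable) - 1)
print("x_50 / sqrt(50) =", stable[50] / math.sqrt(50), " vs 1/sqrt(12) =", 1 / math.sqrt(12))
```

In double precision the forward recursion loses positivity after a few dozen steps even though it starts from the correctly rounded $x_1$, while the sweeps recover $x_1$ without ever being told its value.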
4.3. Langmuir lattice and Painlevé IV
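We will modify the measure $e^{-x^4}\,dx$ by multiplying it with the symmetric function $e^{tx^2}$, where $t$ is a real parameter. This gives the Langmuir lattice (4.17). We can combine this with the discrete Painlevé equation (4.21) to find a differential equation for $a_n^2$ as a function of the variable $t$. Put $x_n = a_n^2$, then
\begin{align*}
x_n\Bigl(x_{n+1} + x_n + x_{n-1} - \frac{t}{2}\Bigr) &= \frac{n}{4}, \tag{4.23} \\
x_n' &= x_n(x_{n+1} - x_{n-1}), \tag{4.24}
\end{align*}
where the ′ denotes the derivative with respect to $t$. Differentiate (4.24) to find
\[ x_n'' = x_n'(x_{n+1} - x_{n-1}) + x_n(x_{n+1}' - x_{n-1}'). \]
Replace $x_{n+1}'$ and $x_{n-1}'$ by (4.24), then
\[ x_n'' = x_n'(x_{n+1} - x_{n-1}) + x_n\bigl[x_{n+1}(x_{n+2} - x_n) - x_{n-1}(x_n - x_{n-2})\bigr]. \]
Eliminate the neighboring terms $x_{n\pm1}$ and $x_{n\pm2}$ using (4.23)–(4.24) to find
\[ x_n'' = \frac{(x_n')^2}{2x_n} + \frac32x_n^3 - tx_n^2 + \Bigl(\frac{n}{4} + \frac{t^2}{8}\Bigr)x_n - \frac{n^2}{32x_n}. \]
This is Painlevé IV if we use the transformation $2x_n(t) = y(-t/2)$. This means that Painlevé IV has a solution which can be described completely in terms of the moments of $e^{-x^4+tx^2}$, since $x_n = a_n^2$ and by (1.5) $a_n^2 = \Delta_{n+1}\Delta_{n-1}/\Delta_n^2$, where $\Delta_n$ is the Hankel determinant (1.3) containing the moments. Notice that all the odd moments are zero, and for the even moments one has
\[ m_{2k}(t) = \int_{-\infty}^{\infty}x^{2k}e^{-x^4+tx^2}\,dx = \frac{d^k}{dt^k}m_0(t). \]
Hence the special solution of Painlevé IV is in terms of $m_0$ only, and this is a special function: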
[TABLE]
where is a parabolic cylinder function.
4.4. Singularity confinement
In this section we will explain the notion of singularity confinement for the discrete Painlevé I equation
[TABLE]
From this equation one finds
[TABLE]
If then becomes infinite. This need not be a problem, but problems arise later when we have to add or subtract infinities. So we need to be careful and suppose that is small. Then
[TABLE]
and
[TABLE]
and
[TABLE]
and one more
[TABLE]
and for we see that is finite again and recovers the value we had before we started to get singularities. The singularities are confined to and and one can continue the recurrence relation from . This has some meaning in terms of the orthogonal polynomials for the weight , but we have to consider this weight on the set and look for orthogonal polynomials for which
[TABLE]
with . They satisfy the recurrence relation
[TABLE]
and the recurrence coefficients still satisfy (4.22) but with initial condition and . If then generates a singularity for and gives , hence does not exist if we define it using (1.4). The singularity, however, is confined to a finite number of terms. We have
Property 4.6.
For one has for the Hankel determinants, so that and as defined by (1.4) do not exist for . Furthermore
[TABLE]
The polynomials and can be identified as Laguerre polynomials with parameter and respectively. The problem with and is not so much that they do not exist, but rather that they are not unique.
Exercise.
Show that for every the polynomials are monic polynomials of degree that are orthogonal to for , so that the monic orthogonal polynomial is not unique. In a similar way are monic polynomials of degree that are orthogonal to for for every so that the monic orthogonal polynomial is not unique.
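The confinement mechanism of this subsection can be watched in exact rational arithmetic. The sketch below assumes the recurrence in the form $x_{k+1} = ak/x_k - x_k - x_{k-1}$ with $a = 1/4$ and starts from an arbitrary value $x_{n-1} = u$ and a small value $x_n = \varepsilon$; the function name and the sample values $n=3$, $u=1/2$ are illustrative choices. A short expansion in $\varepsilon$ for this normalization suggests that $x_{n+1}$ and $x_{n+2}$ behave like $\pm an/\varepsilon$, that $x_{n+3} \approx -(n+3)\varepsilon/n$, and that $x_{n+4}$ tends to the finite value $nu/(n+3)$.

```python
from fractions import Fraction


def dp1_orbit(x_prev, x_curr, n, steps, a=Fraction(1, 4)):
    """Iterate x_{k+1} = a*k/x_k - x_k - x_{k-1} exactly, from x_{n-1} and x_n.

    The returned list satisfies orbit[j] = x_{n-1+j}.
    """
    orbit = [x_prev, x_curr]
    for k in range(n, n + steps):
        orbit.append(a * k / orbit[-1] - orbit[-1] - orbit[-2])
    return orbit


n, u = 3, Fraction(1, 2)
for eps in (Fraction(1, 10**3), Fraction(1, 10**6)):
    orbit = dp1_orbit(u, eps, n, steps=4)
    # Columns: x_{n-1}, x_n, x_{n+1}, x_{n+2}, x_{n+3}, x_{n+4}
    print(["%.6g" % float(v) for v in orbit])
```

Exact fractions are used because the large terms $\pm an/\varepsilon$ cancel almost completely; in floating point that cancellation would cost many digits for very small $\varepsilon$.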
4.5. Generalized Charlier polynomials
Our next example is a family of discrete orthogonal polynomials , which satisfy
[TABLE]
Without the factor the polynomials are the Charlier polynomials, but with the factor we have a semiclassical family of discrete orthogonal polynomials. The case was investigated in [47] and the general case in [38], see also [46, §3.2]. The structure relation for discrete orthogonal polynomials is now in terms of a difference operator instead of a differential operator. For these generalized Charlier polynomials it is
[TABLE]
where is the forward difference operator acting on a function by
[TABLE]
and and are certain sequences. If one works out the compatibility of (1.2) and (4.25), then one finds
[TABLE]
This corresponds to a limiting case of discrete Painlevé with surface/symmetry in Sakai’s classification.
If we put , then the weights with parameter are a Toda modification of the weights with parameter ,
[TABLE]
and hence the recurrence coefficients satisfy the Toda equations given in Theorem 4.2. Put and , then
[TABLE]
and if , , the Toda lattice equations are
[TABLE]
Eliminate and (this requires quite a few computations) and put , then satisfies (after even more computations)
[TABLE]
This is a Painlevé V differential equation as in (4.4) with . Such an equation can always be transformed to Painlevé III.
4.6. Discrete Painlevé II
We will now give an example of a family of orthogonal polynomials on the unit circle, for which the recurrence coefficients satisfy a discrete Painlevé equation. Orthogonal polynomials on the unit circle (OPUC) are defined by the orthogonality relations
[TABLE]
where . We denote the monic polynomials by . They satisfy a nice recurrence relation
[TABLE]
where is the reversed polynomial. The recurrence coefficients are nowadays known as Verblunsky coefficients, but earlier they were also known as Schur parameters or reflection coefficients. Let for . The trigonometric moments for this weight function are modified Bessel functions
[TABLE]
which is why Ismail [20, Example 8.4.3] calls them modified Bessel polynomials. The symmetry implies that are real-valued. If we write
[TABLE]
then
[TABLE]
and this function satisfies the Pearson equation
[TABLE]
As a consequence the orthogonal polynomials satisfy a structure relation:
Property 4.7.
The monic orthogonal polynomials for satisfy
[TABLE]
for some sequence . In fact, one has
[TABLE]
We now have two equations: the recurrence relation (4.26) and the structure relation (4.27), and we can check their compatibility. They will be compatible if the recurrence coefficients satisfy the following non-linear relation:
Theorem 4.8 (Periwal and Shevitz [35]).
The Verblunsky coefficients for the weight satisfy
[TABLE]
with initial values
[TABLE]
Let , then
[TABLE]
and this is a particular case of discrete Painlevé II () given in (4.6). We need a solution with and for , because for Verblunsky coefficients one always has . Such a solution is unique.
Theorem 4.9.
Suppose . Then there is a unique solution of (4.28) for which and . The solution corresponds to and is negative for every .
A proof of this result can be found in [46, §3.3] for ; a proof for has not been published and we invite the reader to come up with such a proof. This special solution converges to zero (fast).
4.7. The Ablowitz-Ladik lattice and Painlevé III
The lattice equations corresponding to orthogonal polynomials on the unit circle are the Ablowitz-Ladik lattice equations (or the Schur flow).
Theorem 4.10.
Let be a positive measure on the unit circle which is symmetric (the Verblunsky coefficients are real). Let be the modified measure , with . The Verblunsky coefficients for the measure then satisfy
[TABLE]
We can now combine the discrete Painlevé II equation
[TABLE]
with the Ablowitz-Ladik equation
[TABLE]
Eliminate and to find
[TABLE]
Exercise.
If one puts , then show that satisfies the Painlevé V differential equation (4.4) with .
Painlevé V with can always be transformed to Painlevé III. A direct approach was given by Hisakado [19] and Tracy and Widom [42]. They showed that the ratio satisfies Painlevé III.
4.8. Some more examples
Several more examples have been worked out in the literature over the past few years. Here is a short sample.
4.8.1. Generalized Meixner polynomials
These are discrete orthogonal polynomials
[TABLE]
which were considered in [38, 14, 8]. Put , and , then
[TABLE]
The initial values are
[TABLE]
where is Kummer’s confluent hypergeometric function. This is asymmetric discrete Painlevé IV or . If we put
[TABLE]
then
[TABLE]
with
[TABLE]
which is Painlevé V given in (4.4).
4.8.2. Modified Laguerre polynomials
Chen and Its [6] (see also [46, §4.4]) looked at orthogonal polynomials for the weight function on . This is a modification of the Laguerre weight with an exponential function that has an essential singularity at $0$. Put , , and , then
[TABLE]
This corresponds to the discrete Painlevé equation . The exponential modification is not of Toda type but belongs to a similar class of modifications (the Toda hierarchy). With some effort one can find the differential equation
[TABLE]
which is Painlevé III given in (4.2).
4.8.3. Modified Jacobi polynomials
Basor, Chen and Ehrhardt [3] (see also [46, §5.2]) considered the weight . This is a Toda modification of the weight function for Jacobi polynomials. In this case one has
[TABLE]
where and satisfy the recurrence relations
[TABLE]
and for one has the differential equation
[TABLE]
which is Painlevé V given in (4.4).
4.8.4. $q$-orthogonal polynomials
There are also examples of families of $q$-orthogonal polynomials for which one can find $q$-discrete Painlevé equations for the recurrence coefficients. In this case the structure relation uses the $q$-difference operator $D_q$ for which
[TABLE]
If we consider the weight
[TABLE]
then the recurrence coefficients (after some transformation) satisfy $q$-discrete Painlevé III
[TABLE]
For the weight
[TABLE]
one finds $q$-discrete Painlevé V
[TABLE]
and for
[TABLE]
one again finds $q$-discrete Painlevé V. Observe that sometimes the weights are on but they can also be on the discrete set . See [46, §5.4] for more details.
4.9. Wronskians and special function solutions
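There is a good explanation why these Toda modifications of orthogonal polynomials often give rise to Painlevé differential equations. In fact the solutions that we need for the recurrence coefficients are special solutions of the Painlevé equations in terms of special functions, such as the Airy functions, the Bessel functions, parabolic cylinder functions, the confluent hypergeometric function and the hypergeometric function. Such special function solutions are often in terms of Wronskians of one of these special functions. We can easily explain where these Wronskians are coming from, by using the theory of orthogonal polynomials. Indeed, we return to our Hankel determinants $\Delta_n$ given in (1.3). They contain the moments $m_k$, which for a Toda modification are
\[ m_k(t) = \int x^ke^{xt}\,d\mu(x) = \frac{d^k}{dt^k}\int e^{xt}\,d\mu(x) = \frac{d^k}{dt^k}m_0(t). \]
Hence all the moments are obtained from the moment $m_0$ by differentiation, and the Hankel determinant (1.3) becomes
\[ \Delta_n = \det\Bigl(\frac{d^{i+j}}{dt^{i+j}}m_0(t)\Bigr)_{i,j=0}^{n-1}, \]
which is the Wronskian of the functions $m_0, m_0', \ldots, m_0^{(n-1)}$,
\[ \Delta_n = \mathrm{Wr}\bigl(m_0, m_0', \ldots, m_0^{(n-1)}\bigr). \]
The recurrence coefficient $a_n^2$ can be expressed in terms of these Hankel determinants as
\[ a_n^2 = \frac{\Delta_{n+1}\Delta_{n-1}}{\Delta_n^2} = \frac{\mathrm{Wr}\bigl(m_0, \ldots, m_0^{(n)}\bigr)\,\mathrm{Wr}\bigl(m_0, \ldots, m_0^{(n-2)}\bigr)}{\mathrm{Wr}\bigl(m_0, \ldots, m_0^{(n-1)}\bigr)^2}, \]
where we used (1.5). The recurrence coefficients $b_n$ can also be found in terms of determinants. If we write $P_n(x) = x^n + \delta_nx^{n-1} + \cdots$ and compare the coefficients of $x^n$ in the recurrence relation (1.2), then $b_n = \delta_n - \delta_{n+1}$. The coefficient $\delta_n$ can be obtained from (1.4) from which we see that $\delta_n = -\Delta_n^*/\Delta_n$, where $\Delta_n^*$ is obtained from $\Delta_n$ by replacing the last column $(m_{n-1}, \ldots, m_{2n-2})^T$ by moments of one order higher $(m_n, \ldots, m_{2n-1})^T$. If we take a derivative of the Wronskian, then
\[ \frac{d}{dt}\Delta_n = \Delta_n^*, \]
so that
\[ b_n = \frac{\Delta_{n+1}^*}{\Delta_{n+1}} - \frac{\Delta_n^*}{\Delta_n} = \frac{d}{dt}\log\frac{\Delta_{n+1}}{\Delta_n}. \]
This gives explicit expressions of the recurrence coefficients $a_n^2$ and $b_n$ in terms of Wronskians generated from one seed function $m_0$.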
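As a small illustration of the seed-function idea, the following sketch evaluates these Wronskians exactly for the Gaussian seed $m_0(t) = \sqrt{\pi}\,e^{t^2/4}$, which is the zeroth moment of $e^{xt}e^{-x^2}dx$ (an example chosen for this illustration, not one from the text). It assumes the formula $a_n^2 = \Delta_{n+1}\Delta_{n-1}/\Delta_n^2$ with $\Delta_0 = 1$. Since $m_0' = (t/2)m_0$, every derivative has the form $m_0^{(k)} = p_k\,m_0$ with polynomials $p_0 = 1$, $p_{k+1} = p_k' + (t/2)p_k$, so that $\Delta_n = m_0^n\det(p_{i+j})$ and the powers of $m_0$ cancel in the ratio. All function names are hypothetical helpers.

```python
from fractions import Fraction


def seed_polys(count):
    """Coefficient lists of p_k, where the k-th derivative of m_0 equals p_k(t) * m_0(t)."""
    polys = [[Fraction(1)]]
    for _ in range(count - 1):
        p = polys[-1]
        deriv = [j * p[j] for j in range(1, len(p))]        # coefficients of p'
        shifted = [Fraction(0)] + [c / 2 for c in p]          # coefficients of (t/2) p
        deriv += [Fraction(0)] * (len(shifted) - len(deriv))
        polys.append([d + s for d, s in zip(deriv, shifted)])
    return polys


def evaluate(poly, t):
    """Evaluate a polynomial given by its coefficient list at the point t."""
    return sum(c * t**j for j, c in enumerate(poly))


def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    size, result = len(m), Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return result


def reduced_wronskian(n, t, polys):
    """det(p_{i+j}(t)) for 0 <= i, j < n, i.e. Delta_n divided by m_0(t)^n."""
    if n == 0:
        return Fraction(1)
    return det([[evaluate(polys[i + j], t) for j in range(n)] for i in range(n)])


def a_squared(n, t, polys):
    """a_n^2 = Delta_{n+1} Delta_{n-1} / Delta_n^2 (the powers of m_0 cancel)."""
    lower, mid, upper = (reduced_wronskian(k, t, polys) for k in (n - 1, n, n + 1))
    return upper * lower / mid**2


polys = seed_polys(9)            # p_0..p_8 suffice for n <= 4
t = Fraction(3, 5)
print([a_squared(n, t, polys) for n in range(1, 5)])
```

For this seed the result should not depend on $t$ and should equal $n/2$, because $e^{xt}e^{-x^2}$ is just a shifted Gaussian; seeds such as Airy or parabolic cylinder functions give genuinely $t$-dependent coefficients.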
Acknowledgement
Many thanks to Mama Foupouagnigni and Wolfram Koepf for organizing the workshop Introduction to Orthogonal Polynomials and Applications in Douala, Cameroon, and for encouraging me to write this survey. Also thanks to Arno Kuijlaars with whom I am sharing a course on Orthogonal Polynomials and Random Matrices at KU Leuven, which was very useful for the material in Section 2.
- [1] G. Anderson, A. Guionnet, O. Zeitouni, An Introduction to Random Matrices, Cambridge Studies in Advanced Mathematics 118, Cambridge University Press, 2010.
- [2] A.I. Aptekarev, Multiple orthogonal polynomials, J. Comput. Appl. Math. 99 (1998), no. 1–2, 423–447.
- [3] E. Basor, Y. Chen, T. Ehrhardt, Painlevé V and time dependent Jacobi polynomials, J. Phys. A: Math. Theor. 43 (2010), no. 1, 015204 (25 pp.).
- [4] P.M. Bleher, A.B.J. Kuijlaars, Integral representations for multiple Hermite and multiple Laguerre polynomials, Ann. Inst. Fourier, Grenoble 55 (2005), no. 6, 2001–2014.
- [5] P.M. Bleher, A.B.J. Kuijlaars, Random matrices with external source and multiple orthogonal polynomials, International Mathematics Research Notices 2004, no. 3, 109–129.
- [6] Y. Chen, A. Its, Painlevé III and a singular linear statistics in Hermitian random matrix ensembles, J. Approx. Theory 162 (2010), no. 2, 270–297.
- [7] P.A. Clarkson, Painlevé equations - nonlinear special functions, Lecture Notes in Mathematics 1883, Springer, Berlin, 2006, pp. 331–411.
- [8] P.A. Clarkson, Recurrence coefficients for discrete orthogonal polynomials and the Painlevé equations, J. Phys. A: Math. Theor. 46 (2013), no. 18, 185205 (18 pp.).
