Orthogonal and multiple orthogonal polynomials, random matrices, and Painlevé equations

Walter Van Assche

arXiv:1904.07518·math.CA·July 14, 2020

TL;DR

This paper introduces orthogonal and multiple orthogonal polynomials, explores their applications in random matrix theory, and discusses their connection to Painlevé equations, highlighting their significance in mathematical physics and related fields.

Contribution

It provides an overview of the theory of orthogonal and multiple orthogonal polynomials and elucidates their links with Painlevé equations in the context of random matrices.

Findings

01. Orthogonal polynomials are fundamental in mathematical physics and probability.

02. Multiple orthogonal polynomials extend classical theory with new applications.

03. Connections between orthogonal polynomials and Painlevé equations are established.

Abstract

Orthogonal polynomials and multiple orthogonal polynomials are interesting special functions because there is a beautiful theory for them, with many examples and useful applications in mathematical physics, numerical analysis, statistics and probability and many other disciplines. In these notes we give an introduction to the use of orthogonal polynomials in random matrix theory, we explain the notion of multiple orthogonal polynomials, and we show the link with certain non-linear difference and differential equations known as Painlevé equations.

Equations

m_{n} = \int_{\mathbb{R}} x^{n} \, d\mu(x).

\int_{\mathbb{R}} p_{n}(x) p_{m}(x) \, d\mu(x) = \delta_{m,n}, \qquad m, n \in \mathbb{N}.

x_{1,n} < x_{2,n} < \cdots < x_{n,n}.

x p_{n}(x) = a_{n+1} p_{n+1}(x) + b_{n} p_{n}(x) + a_{n} p_{n-1}(x), \qquad n \geq 1,

P_{n}(x) = \frac{1}{\gamma_{n}} p_{n}(x) = x^{n} + \cdots.

P_{n+1}(x) = (x - b_{n}) P_{n}(x) - a_{n}^{2} P_{n-1}(x),

\int_{-1}^{1} P_{n}^{(\alpha,\beta)}(x) P_{m}^{(\alpha,\beta)}(x) (1-x)^{\alpha} (1+x)^{\beta} \, dx = 0, \qquad m \neq n,

\int_{0}^{\infty} L_{n}^{(\alpha)}(x) L_{m}^{(\alpha)}(x) x^{\alpha} e^{-x} \, dx = 0, \qquad m \neq n,

\int_{-\infty}^{\infty} H_{n}(x) H_{m}(x) e^{-x^{2}} \, dx = 0, \qquad m \neq n.

H_{n}=\begin{pmatrix}m_{0}&m_{1}&m_{2}&\cdots&m_{n-1}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-2}\end{pmatrix}=\bigl(m_{i+j-2}\bigr)_{i,j=1}^{n}

D_{n}=\det\begin{pmatrix}m_{0}&m_{1}&m_{2}&\cdots&m_{n-1}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-2}\end{pmatrix}=\det\bigl(m_{i+j-2}\bigr)_{i,j=1}^{n}.

P_{n}(x)=\frac{1}{D_{n}}\det\begin{pmatrix}m_{0}&m_{1}&m_{2}&\cdots&m_{n}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n+1}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+2}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-1}\\ 1&x&x^{2}&\cdots&x^{n}\end{pmatrix},

\frac{1}{\gamma_{n}^{2}} = \int_{\mathbb{R}} P_{n}^{2}(x) \, d\mu(x) = \frac{D_{n+1}}{D_{n}}.

K_{n}(x,y) = \sum_{k=0}^{n-1} \gamma_{k}^{2} P_{k}(x) P_{k}(y) = \sum_{k=0}^{n-1} p_{k}(x) p_{k}(y).

\int K_{n}(x,y) q_{n-1}(y) \, d\mu(y) = q_{n-1}(x).

\int K_{n}(x,y) f(y) \, d\mu(y) = f_{n-1}(x)

\sum_{k=0}^{n-1} \gamma_{k}^{2} P_{k}(x) P_{k}(y) = \gamma_{n-1}^{2} \frac{P_{n}(x) P_{n-1}(y) - P_{n-1}(x) P_{n}(y)}{x - y},

\sum_{k=0}^{n-1}\gamma_{k}^{2}P_{k}^{2}(x)=\gamma_{n-1}^{2}\Bigl(P_{n}^{\prime}(x)P_{n-1}(x)-P_{n-1}^{\prime}(x)P_{n}(x)\Bigr).

\sum_{k=0}^{n-1} p_{k}(x) p_{k}(y) = a_{n} \frac{p_{n}(x) p_{n-1}(y) - p_{n-1}(x) p_{n}(y)}{x - y},

\sum_{k=0}^{n-1}p_{k}^{2}(x)=a_{n}\Bigl(p_{n}^{\prime}(x)p_{n-1}(x)-p_{n-1}^{\prime}(x)p_{n}(x)\Bigr).

\Delta_{n}(x_{1},\ldots,x_{n})=\det\begin{pmatrix}1&1&1&\cdots&1\\ x_{1}&x_{2}&x_{3}&\cdots&x_{n}\\ x_{1}^{2}&x_{2}^{2}&x_{3}^{2}&\cdots&x_{n}^{2}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ x_{1}^{n-1}&x_{2}^{n-1}&x_{3}^{n-1}&\cdots&x_{n}^{n-1}\end{pmatrix}.

\Delta_{n} = \prod_{i>j} (x_{i} - x_{j}).

D_{n} = \frac{1}{n!} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \Delta_{n}^{2}(x_{1},\ldots,x_{n}) \, d\mu(x_{1}) \cdots d\mu(x_{n}),

P_{n}(x) = \frac{1}{n!\,D_{n}} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \prod_{i=1}^{n} (x - x_{i}) \, \Delta_{n}^{2}(x_{1},\ldots,x_{n}) \, d\mu(x_{1}) \cdots d\mu(x_{n}).

D_{n}=\int_{-\infty}^{\infty}\det\begin{pmatrix}1&x_{1}&x_{1}^{2}&\cdots&x_{1}^{n-1}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-2}\end{pmatrix}d\mu(x_{1}).

D_{n}=\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\det\begin{pmatrix}1&x_{1}&x_{1}^{2}&\cdots&x_{1}^{n-1}\\ x_{2}&x_{2}^{2}&x_{2}^{3}&\cdots&x_{2}^{n}\\ x_{3}^{2}&x_{3}^{3}&x_{3}^{4}&\cdots&x_{3}^{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ x_{n}^{n-1}&x_{n}^{n}&x_{n}^{n+1}&\cdots&x_{n}^{2n-2}\end{pmatrix}d\mu(x_{1})\cdots d\mu(x_{n}).

D_{n} = \int_{\mathbb{R}^{n}} \prod_{j=1}^{n} x_{j}^{j-1} \, \Delta_{n}(x_{1},\ldots,x_{n}) \, d\mu(x_{1}) \cdots d\mu(x_{n}).

D_{n} = \sum_{\sigma \in S_{n}} \int_{x_{\sigma(1)} < \cdots < x_{\sigma(n)}} \prod_{j=1}^{n} x_{j}^{j-1} \, \Delta_{n}(x_{1},x_{2},\ldots,x_{n}) \, d\mu(x_{1}) \cdots d\mu(x_{n}).

D_{n} = \int_{y_{1} < \cdots < y_{n}} \sum_{\tau \in S_{n}} \prod_{j=1}^{n} y_{\tau(j)}^{j-1} \, \Delta_{n}(y_{\tau(1)},\ldots,y_{\tau(n)}) \, d\mu(y_{1}) \cdots d\mu(y_{n}).

D_{n} = \int_{y_{1} < \cdots < y_{n}} \Bigl( \sum_{\tau \in S_{n}} \textup{sign}(\tau) \prod_{j=1}^{n} y_{\tau(j)}^{j-1} \Bigr) \Delta_{n}(y_{1},\ldots,y_{n}) \, d\mu(y_{1}) \cdots d\mu(y_{n}).

Full text

Orthogonal and multiple orthogonal polynomials, random matrices, and Painlevé equations

Walter Van Assche

Department of Mathematics

KU Leuven

Celestijnenlaan 200B box 2400

BE 3001 Leuven, Belgium

Abstract.

Orthogonal polynomials and multiple orthogonal polynomials are interesting special functions because there is a beautiful theory for them, with many examples and useful applications in mathematical physics, numerical analysis, statistics and probability and many other disciplines. In these notes we give an introduction to the use of orthogonal polynomials in random matrix theory, we explain the notion of multiple orthogonal polynomials, and we show the link with certain non-linear difference and differential equations known as Painlevé equations.

Key words and phrases:

Orthogonal polynomials, random matrices, multiple orthogonal polynomials, Painlevé equations

1991 Mathematics Subject Classification:

Primary 33C45, 42C05, 60B20, 33E17; Secondary 15B52, 34M55, 41A21

1 Introduction
2 Orthogonal polynomials and random matrices
2.1 Point processes
2.2 Determinantal point process
2.3 Random matrices
2.4 Random matrix ensembles
3 Multiple orthogonal polynomials
3.1 Special systems
3.2 Nearest neighbor recurrence relations
3.3 Christoffel-Darboux formula
3.4 Hermite-Padé approximation
3.5 Multiple Hermite polynomials
3.6 Multiple Laguerre polynomials
3.7 Jacobi-Piñeiro polynomials
4 Orthogonal polynomials and Painlevé equations
4.1 Compatibility and Lax pairs
4.2 Discrete Painlevé I
4.3 Langmuir lattice and Painlevé IV
4.4 Singularity confinement
4.5 Generalized Charlier polynomials
4.6 Discrete Painlevé II
4.7 The Ablowitz-Ladik lattice and Painlevé III
4.8 Some more examples
4.9 Wronskians and special function solutions

1. Introduction

For these lecture notes I assume the reader is familiar with the basic theory of orthogonal polynomials, in particular the classical orthogonal polynomials (Jacobi, Laguerre, Hermite) should be known. In this introduction we will fix the notation and terminology. Let $\mu$ be a positive measure on the real line for which all the moments $m_{n}$ , $n\in\mathbb{N}=\{0,1,2,3,\ldots\}$ exist, where

$m_{n}=\int_{\mathbb{R}}x^{n}\,d\mu(x).$

The orthonormal polynomials $(p_{n})_{n\in\mathbb{N}}$ are such that $p_{n}(x)=\gamma_{n}x^{n}+\cdots$ , with $\gamma_{n}>0$ , satisfying the orthogonality condition

$\int_{\mathbb{R}}p_{n}(x)p_{m}(x)\,d\mu(x)=\delta_{m,n},\qquad m,n\in\mathbb{N}.$

It is well known that the zeros of $p_{n}$ are real and simple, and we denote them by

$x_{1,n}<x_{2,n}<\cdots<x_{n,n}.$

Orthonormal polynomials on the real line always satisfy a three-term recurrence relation

$xp_{n}(x)=a_{n+1}p_{n+1}(x)+b_{n}p_{n}(x)+a_{n}p_{n-1}(x),\qquad n\geq 1,$

with initial condition $p_{0}=1/\sqrt{m_{0}}$ and $p_{-1}=0$ , with recurrence coefficients $a_{n+1}>0$ and $b_{n}\in\mathbb{R}$ for $n\geq 0$ . Often we will also use monic orthogonal polynomials, which we denote by capital letters:

$P_{n}(x)=\frac{1}{\gamma_{n}}p_{n}(x)=x^{n}+\cdots.$

Their recurrence relation is of the form

$P_{n+1}(x)=(x-b_{n})P_{n}(x)-a_{n}^{2}P_{n-1}(x),$

with initial conditions $P_{0}=1$ and $P_{-1}=0$ . The classical families of orthogonal polynomials are

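• The Jacobi polynomials $P_{n}^{(\alpha,\beta)}$ , for which

$\int_{-1}^{1}P_{n}^{(\alpha,\beta)}(x)P_{m}^{(\alpha,\beta)}(x)(1-x)^{\alpha}(1+x)^{\beta}\,dx=0,\qquad m\neq n,$

with parameters $\alpha,\beta>-1$ .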

• The Laguerre polynomials $L_{n}^{(\alpha)}$ for which

$\int_{0}^{\infty}L_{n}^{(\alpha)}(x)L_{m}^{(\alpha)}(x)x^{\alpha}e^{-x}\,dx=0,\qquad m\neq n,$

with parameter $\alpha>-1$ .

• The Hermite polynomials $H_{n}(x)$ for which

$\int_{-\infty}^{\infty}H_{n}(x)H_{m}(x)e^{-x^{2}}\,dx=0,\qquad m\neq n.$

Usually these polynomials are neither normalized nor monic but another normalization is used (for historical reasons) and one has to be a bit careful with some of the general formulas for orthonormal or monic orthogonal polynomials.
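To make this warning concrete, here is a small Python sketch (an illustration added to these notes, with hypothetical helper names). It assumes the known recurrence coefficients $b_{n}=0$ and $a_{n}^{2}=n/2$ for the weight $e^{-x^{2}}$ and the standard recurrence $H_{n+1}(x)=2xH_{n}(x)-2nH_{n-1}(x)$ for the classical normalization. Polynomials are stored as coefficient lists in increasing degree.

```python
from fractions import Fraction

def poly_mul_x(p):
    """Multiply a polynomial (coefficients in increasing degree) by x."""
    return [Fraction(0)] + list(p)

def poly_axpy(a, p, q):
    """Return a*p + q for coefficient lists p and q."""
    n = max(len(p), len(q))
    p = list(p) + [Fraction(0)] * (n - len(p))
    q = list(q) + [Fraction(0)] * (n - len(q))
    return [a * pi + qi for pi, qi in zip(p, q)]

def monic_hermite(n):
    """Monic P_n for the weight exp(-x^2): P_{k+1} = x P_k - (k/2) P_{k-1}."""
    prev, cur = [Fraction(0)], [Fraction(1)]  # P_{-1} = 0, P_0 = 1
    for k in range(n):
        prev, cur = cur, poly_axpy(-Fraction(k, 2), prev, poly_mul_x(cur))
    return cur

def classical_hermite(n):
    """Classical H_n: H_{k+1} = 2x H_k - 2k H_{k-1}, with H_0 = 1."""
    prev, cur = [Fraction(0)], [Fraction(1)]
    for k in range(n):
        two_x_cur = [2 * c for c in poly_mul_x(cur)]
        prev, cur = cur, poly_axpy(Fraction(-2 * k), prev, two_x_cur)
    return cur

if __name__ == "__main__":
    for n in range(5):
        print(n, monic_hermite(n), classical_hermite(n))
```

Under these assumptions the two families differ only by the leading coefficient, $H_{n}=2^{n}P_{n}$, so a general formula stated for monic polynomials has to be rescaled by $2^{n}$ before it is applied to $H_{n}$.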

The matrix

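$H_{n}=\begin{pmatrix}m_{0}&m_{1}&m_{2}&\cdots&m_{n-1}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-2}\end{pmatrix}=\bigl(m_{i+j-2}\bigr)_{i,j=1}^{n}$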

is the Hankel matrix with the moments of the orthogonality measure $\mu$ . The Hankel determinant is

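$D_{n}=\det\begin{pmatrix}m_{0}&m_{1}&m_{2}&\cdots&m_{n-1}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-2}\end{pmatrix}=\det\bigl(m_{i+j-2}\bigr)_{i,j=1}^{n}.\qquad(1.3)$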

If the support of $\mu$ contains infinitely many points, then $D_{n}>0$ for all $n\in\mathbb{N}$ .

The monic orthogonal polynomials $P_{n}(x)$ are given by

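$P_{n}(x)=\frac{1}{D_{n}}\det\begin{pmatrix}m_{0}&m_{1}&m_{2}&\cdots&m_{n}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n+1}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+2}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-1}\\ 1&x&x^{2}&\cdots&x^{n}\end{pmatrix},\qquad(1.4)$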

and

$\frac{1}{\gamma_{n}^{2}}=\int_{\mathbb{R}}P_{n}^{2}(x)\,d\mu(x)=\frac{D_{n+1}}{D_{n}}.\qquad(1.5)$
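The determinant formula (1.4) and the norm relation (1.5) can be checked exactly on a small example. The Python sketch below is an illustration under stated assumptions, not part of the original notes: it takes $\mu$ to be Lebesgue measure on $[0,1]$, so that $m_{k}=1/(k+1)$, uses the convention $D_{0}=1$, and obtains the coefficients of $P_{n}$ by expanding the determinant in (1.4) along its last row $(1,x,\ldots,x^{n})$. All function names are hypothetical.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination; the empty matrix has determinant 1."""
    a = [row[:] for row in matrix]
    n, result = len(a), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def moment(k):
    """Assumed example measure: m_k = int_0^1 x^k dx = 1/(k+1)."""
    return Fraction(1, k + 1)

def hankel_det(n):
    """D_n = det(m_{i+j})_{i,j=0}^{n-1}; for n = 0 this is the empty determinant, 1."""
    return det([[moment(i + j) for j in range(n)] for i in range(n)])

def monic_coeffs(n):
    """Coefficients c_0, ..., c_n of P_n via cofactors of the last row in (1.4)."""
    d_n = hankel_det(n)
    coeffs = []
    for j in range(n + 1):
        minor = [[moment(i + c) for c in range(n + 1) if c != j] for i in range(n)]
        coeffs.append((-1) ** (n + j) * det(minor) / d_n)
    return coeffs

def squared_norm(n):
    """int P_n^2 dmu, computed from the coefficients and the moments."""
    c = monic_coeffs(n)
    return sum(c[j] * c[k] * moment(j + k) for j in range(n + 1) for k in range(n + 1))

if __name__ == "__main__":
    for n in range(4):
        print(n, monic_coeffs(n), squared_norm(n), hankel_det(n + 1) / hankel_det(n))
```

For this measure the sketch returns $P_{1}(x)=x-\tfrac12$ and $P_{2}(x)=x^{2}-x+\tfrac16$, and the last two printed columns agree, as (1.5) requires.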

The Christoffel-Darboux kernel is defined as

$K_{n}(x,y)=\sum_{k=0}^{n-1}\gamma_{k}^{2}P_{k}(x)P_{k}(y)=\sum_{k=0}^{n-1}p_{k}(x)p_{k}(y).$

This Christoffel-Darboux kernel is a reproducing kernel: for every polynomial $q_{n-1}$ of degree $\leq n-1$ one has

$\int K_{n}(x,y)q_{n-1}(y)\,d\mu(y)=q_{n-1}(x).$

If $f$ is a function in $L^{2}(\mu)$ , then

$\int K_{n}(x,y)f(y)\,d\mu(y)=f_{n-1}(x)$

gives a polynomial of degree $\leq n-1$ which is the least squares approximant of $f$ in the space of polynomials of degree $\leq n-1$ . The Christoffel-Darboux kernel is a sum of $n$ terms containing all the polynomials $p_{0},p_{1},\ldots,p_{n-1}$ , but there is a nice formula that expresses the kernel in just two terms containing the polynomials $p_{n-1}$ and $p_{n}$ only:

Property 1.1.

The Christoffel-Darboux formula is

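$\sum_{k=0}^{n-1}\gamma_{k}^{2}P_{k}(x)P_{k}(y)=\gamma_{n-1}^{2}\frac{P_{n}(x)P_{n-1}(y)-P_{n-1}(x)P_{n}(y)}{x-y},$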

and its confluent version is

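$\sum_{k=0}^{n-1}\gamma_{k}^{2}P_{k}^{2}(x)=\gamma_{n-1}^{2}\Bigl(P_{n}^{\prime}(x)P_{n-1}(x)-P_{n-1}^{\prime}(x)P_{n}(x)\Bigr).$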

The version for orthonormal polynomials is

Property 1.2.

The Christoffel-Darboux formula is

$\sum_{k=0}^{n-1}p_{k}(x)p_{k}(y)=a_{n}\frac{p_{n}(x)p_{n-1}(y)-p_{n-1}(x)p_{n}(y)}{x-y},$

and its confluent version is

$\sum_{k=0}^{n-1}p_{k}^{2}(x)=a_{n}\Bigl(p_{n}^{\prime}(x)p_{n-1}(x)-p_{n-1}^{\prime}(x)p_{n}(x)\Bigr).$
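A quick numerical check of the Christoffel-Darboux formula for orthonormal polynomials may be helpful. The following Python sketch is an added illustration, not the author's code. It assumes the Hermite weight $e^{-x^{2}}$, for which $b_{n}=0$, $a_{n}=\sqrt{n/2}$ and $m_{0}=\sqrt{\pi}$, and it compares the sum of $n$ terms with the two-term expression. The function names are hypothetical.

```python
import math

def orthonormal_hermite(n_max, x):
    """Values p_0(x), ..., p_{n_max}(x) for the weight exp(-x^2).

    Uses x p_n = a_{n+1} p_{n+1} + a_n p_{n-1} with a_n = sqrt(n/2),
    starting from p_0 = 1/sqrt(m_0) = pi^(-1/4) and p_{-1} = 0.
    """
    values = [math.pi ** -0.25]
    prev = 0.0
    for n in range(n_max):
        a_next = math.sqrt((n + 1) / 2)
        a_cur = math.sqrt(n / 2)
        nxt = (x * values[n] - a_cur * prev) / a_next
        prev = values[n]
        values.append(nxt)
    return values

def cd_kernel_sum(n, x, y):
    """K_n(x, y) as the sum of p_k(x) p_k(y) for k = 0, ..., n-1."""
    px, py = orthonormal_hermite(n - 1, x), orthonormal_hermite(n - 1, y)
    return sum(u * v for u, v in zip(px, py))

def cd_kernel_closed(n, x, y):
    """K_n(x, y) from the two-term formula; requires n >= 1 and x != y."""
    px, py = orthonormal_hermite(n, x), orthonormal_hermite(n, y)
    a_n = math.sqrt(n / 2)
    return a_n * (px[n] * py[n - 1] - px[n - 1] * py[n]) / (x - y)

if __name__ == "__main__":
    for n in range(1, 6):
        print(n, cd_kernel_sum(n, 0.3, -1.1), cd_kernel_closed(n, 0.3, -1.1))
```

The two columns agree up to rounding errors. The closed form is cheaper for large $n$ because it only needs $p_{n-1}$ and $p_{n}$, but it cannot be evaluated at $x=y$, where the confluent version is needed instead.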

2. Orthogonal polynomials and random matrices

The link between orthogonal polynomials and random matrices is via the Christoffel-Darboux kernel and Heine’s formula for orthogonal polynomials, see Property 2.1. Useful references for random matrices are Mehta’s book [31], the book by Anderson, Guionnet and Zeitouni [1], and Deift’s monograph [11]. First of all, let $x_{1},x_{2},\ldots,x_{n}$ be real or complex numbers, then we define the Vandermonde determinant as

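$\Delta_{n}(x_{1},\ldots,x_{n})=\det\begin{pmatrix}1&1&1&\cdots&1\\ x_{1}&x_{2}&x_{3}&\cdots&x_{n}\\ x_{1}^{2}&x_{2}^{2}&x_{3}^{2}&\cdots&x_{n}^{2}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ x_{1}^{n-1}&x_{2}^{n-1}&x_{3}^{n-1}&\cdots&x_{n}^{n-1}\end{pmatrix}.\qquad(2.1)$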

This Vandermonde determinant can be evaluated explicitly:

$\Delta_{n}=\prod_{i>j}(x_{i}-x_{j}).$

From this it is clear that $\Delta_{n}\neq 0$ when all the $x_{i}$ are distinct, and if $x_{1}<x_{2}<\cdots<x_{n}$ then $\Delta_{n}>0$ . Heine’s formula expresses the Hankel determinant with the moments of a measure $\mu$ as an $n$ -fold integral:

Property 2.1 (Heine).

The Hankel determinants $D_{n}$ in (1.3) can be written as

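$D_{n}=\frac{1}{n!}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\Delta_{n}^{2}(x_{1},\ldots,x_{n})\,d\mu(x_{1})\cdots d\mu(x_{n}),\qquad(2.2)$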

where $\Delta_{n}$ is the Vandermonde determinant (2.1). Furthermore, the monic orthogonal polynomial $P_{n}(x)$ is also given by an $n$ -fold integral

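$P_{n}(x)=\frac{1}{n!\,D_{n}}\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\prod_{i=1}^{n}(x-x_{i})\,\Delta_{n}^{2}(x_{1},\ldots,x_{n})\,d\mu(x_{1})\cdots d\mu(x_{n}).\qquad(2.3)$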

Proof.

If we write all the moments in the first row of (1.3) as an integral and use linearity of the determinant (for one row), then

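$D_{n}=\int_{-\infty}^{\infty}\det\begin{pmatrix}1&x_{1}&x_{1}^{2}&\cdots&x_{1}^{n-1}\\ m_{1}&m_{2}&m_{3}&\cdots&m_{n}\\ m_{2}&m_{3}&m_{4}&\cdots&m_{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ m_{n-1}&m_{n}&m_{n+1}&\cdots&m_{2n-2}\end{pmatrix}d\mu(x_{1}).$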

Repeating this for every row gives

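$D_{n}=\int_{-\infty}^{\infty}\cdots\int_{-\infty}^{\infty}\det\begin{pmatrix}1&x_{1}&x_{1}^{2}&\cdots&x_{1}^{n-1}\\ x_{2}&x_{2}^{2}&x_{2}^{3}&\cdots&x_{2}^{n}\\ x_{3}^{2}&x_{3}^{3}&x_{3}^{4}&\cdots&x_{3}^{n+1}\\ \vdots&\vdots&\vdots&\cdots&\vdots\\ x_{n}^{n-1}&x_{n}^{n}&x_{n}^{n+1}&\cdots&x_{n}^{2n-2}\end{pmatrix}d\mu(x_{1})\cdots d\mu(x_{n}).$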

In each row we can take out the common factors to find

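$D_{n}=\int_{\mathbb{R}^{n}}\prod_{j=1}^{n}x_{j}^{j-1}\,\Delta_{n}(x_{1},\ldots,x_{n})\,d\mu(x_{1})\cdots d\mu(x_{n}).$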

Now write the integral over $\mathbb{R}^{n}$ as a sum of integrals over all simplices $x_{i_{1}}<x_{i_{2}}<\cdots<x_{i_{n}}$ , where $\sigma=(i_{1},i_{2},\ldots,i_{n})$ is a permutation of $(1,2,\ldots,n)$ . Then

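$D_{n}=\sum_{\sigma\in S_{n}}\int_{x_{\sigma(1)}<\cdots<x_{\sigma(n)}}\prod_{j=1}^{n}x_{j}^{j-1}\,\Delta_{n}(x_{1},x_{2},\ldots,x_{n})\,d\mu(x_{1})\cdots d\mu(x_{n}).$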

With the change of variables $x_{\sigma(j)}=y_{j}$ one has $x_{j}=y_{\tau(j)}$ , with $\tau=\sigma^{-1}$ and

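$D_{n}=\int_{y_{1}<\cdots<y_{n}}\sum_{\tau\in S_{n}}\prod_{j=1}^{n}y_{\tau(j)}^{j-1}\,\Delta_{n}(y_{\tau(1)},\ldots,y_{\tau(n)})\,d\mu(y_{1})\cdots d\mu(y_{n}).$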

Observe that $\Delta_{n}(y_{\tau(1)},\ldots,y_{\tau(n)})=\textup{sign}(\tau)\Delta_{n}(y_{1},\ldots,y_{n})$ , so that

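$D_{n}=\int_{y_{1}<\cdots<y_{n}}\Bigl(\sum_{\tau\in S_{n}}\textup{sign}(\tau)\prod_{j=1}^{n}y_{\tau(j)}^{j-1}\Bigr)\Delta_{n}(y_{1},\ldots,y_{n})\,d\mu(y_{1})\cdots d\mu(y_{n}).$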

Now use

$\sum_{\tau\in S_{n}}\textup{sign}(\tau)\prod_{j=1}^{n}y_{\tau(j)}^{j-1}=\Delta_{n}(y_{1},\ldots,y_{n})$

to find

$D_{n}=\int_{y_{1}<\cdots<y_{n}}\Delta_{n}^{2}(y_{1},\ldots,y_{n})\,d\mu(y_{1})\cdots d\mu(y_{n}).$

This is an integral over one simplex $y_{1}<y_{2}<\cdots<y_{n}$ in $\mathbb{R}^{n}$ . This integral is the same for every simplex, and since there are $n!$ simplices (because there are $n!$ permutations of $(1,2,\ldots,n)$ ), we find the required formula (2.2).

The proof for formula (2.3) is similar, using the determinant expression (1.4) for the monic orthogonal polynomial. ∎
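Heine's formula (2.2) can be verified exactly when the measure is discrete, because the $n$-fold integral then becomes a finite sum. The Python sketch below is an illustration added here under that assumption: the four atoms and weights are arbitrary choices, and the function names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations, product
from math import factorial

# Assumed example: a discrete measure with four atoms and positive weights.
ATOMS = [Fraction(0), Fraction(1), Fraction(2), Fraction(4)]
WEIGHTS = [Fraction(1), Fraction(2), Fraction(1), Fraction(3)]

def moment(k):
    """m_k = sum_i w_i x_i^k for the discrete measure above."""
    return sum(w * x ** k for x, w in zip(ATOMS, WEIGHTS))

def det(matrix):
    """Determinant by the Leibniz formula; adequate for the small sizes used here."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = Fraction((-1) ** inversions)
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total

def hankel_det(n):
    """D_n = det(m_{i+j})_{i,j=0}^{n-1}."""
    return det([[moment(i + j) for j in range(n)] for i in range(n)])

def vandermonde(xs):
    """Delta_n = prod_{i>j} (x_i - x_j)."""
    result = Fraction(1)
    for i in range(len(xs)):
        for j in range(i):
            result *= xs[i] - xs[j]
    return result

def heine_sum(n):
    """(1/n!) times the sum over all n-tuples of atoms of Delta_n^2 and the weights."""
    total = Fraction(0)
    for idx in product(range(len(ATOMS)), repeat=n):
        weight = Fraction(1)
        for i in idx:
            weight *= WEIGHTS[i]
        total += vandermonde([ATOMS[i] for i in idx]) ** 2 * weight
    return total / factorial(n)

if __name__ == "__main__":
    for n in range(1, 5):
        print(n, hankel_det(n), heine_sum(n))
```

With only four atoms every $5$-tuple contains a repeated point, so the sum for $n=5$ vanishes and $D_{5}=0$. This is consistent with the requirement that the support of $\mu$ be infinite in order to have $D_{n}>0$ for all $n$.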

It is remarkable that Szegő writes in his book [40]:

[These] Formulas … are not suitable in general for derivation of properties of the polynomials in question. To this end we shall generally prefer the orthogonality property itself, or other representations derived by means of the orthogonality property.

Heine’s formulas have now become crucial in the theory of random matrices.

2.1. Point processes

An $n$ -point process is a stochastic process where a set of $n$ points $\{X_{1},\ldots,X_{n}\}$ is selected, and the joint distribution of the random variables $(X_{1},X_{2},\ldots,X_{n})$ is given. Since we are dealing with a set of $n$ random numbers, the order of the random variables is irrelevant and hence we use a probability distribution which is invariant under permutations. Our interest is in the $n$ -point process where the joint probability distribution has a density (with respect to the product measure $d\mu(x_{1})\ldots d\mu(x_{n})$ ) given by

$P(x_{1},\ldots,x_{n})=\frac{1}{n!\,D_{n}}\Delta_{n}^{2}(x_{1},\ldots,x_{n}),\qquad(2.4)$

where we mean that

[TABLE]

Observe that by Heine’s formula (2.2) this is indeed a probability distribution since it is positive and integrates over $\mathbb{R}^{n}$ to one. The points in this $n$ -point process are not independent and the factor $\Delta_{n}^{2}(x_{1},\ldots,x_{n})$ describes the dependence of the points. Two points are unlikely to be close together because then $\Delta_{n}^{2}(x_{1},\ldots,x_{n})=\prod_{j>i}(x_{j}-x_{i})^{2}$ is small and by the maximum likelihood principle the points will prefer to choose a position that maximizes $\Delta_{n}^{2}(x_{1},\ldots,x_{n})$ . This $n$ -point process therefore has points that repel each other.

An important property of this $n$ -point process is that it is a determinantal point process. To see this, we will express the probability density in terms of the Christoffel-Darboux kernel. We need a few important properties of that kernel.

Property 2.2.

The Christoffel-Darboux kernel satisfies

[TABLE]

and

[TABLE]

Proof.

The first property follows from the reproducing property of the Christoffel-Darboux kernel. For the second property we have

[TABLE]

∎

Property 2.3.

The density (2.4) can be written as

[TABLE]

where $K_{n}$ is the Christoffel-Darboux kernel.

Proof.

If we add rows in the Vandermonde determinant (2.1), then

[TABLE]

for any sequence $(P_{0},P_{1},P_{2},\ldots,P_{n-1})$ of monic polynomials. If we take the monic orthogonal polynomials, then

[TABLE]

where $\Gamma_{n}=\textup{diag}(\gamma_{0}^{2},\gamma_{1}^{2},\ldots,\gamma_{n-1}^{2})$ . Then use (1.5) to find that $\prod_{j=0}^{n-1}\gamma_{j}^{2}=1/D_{n}$ , so that

[TABLE]

which combined with (2.4) gives the required result. ∎
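The kernel identities used above can be tested exactly on a discrete measure, where integrals become finite sums and the kernel is only needed at the atoms. The Python sketch below is an added illustration with hypothetical names and an arbitrarily chosen measure. It checks that $\int K_{n}(x,x)\,d\mu(x)=n$, that $K_{n}$ reproduces itself, and that $\det\bigl(K_{n}(x_{i},x_{j})\bigr)_{i,j=1}^{n}=\Delta_{n}^{2}(x_{1},\ldots,x_{n})/D_{n}$, using $D_{n}=\prod_{k=0}^{n-1}\gamma_{k}^{-2}$ from (1.5).

```python
from fractions import Fraction
from itertools import permutations

# Assumed example: a discrete measure with four atoms and positive weights.
ATOMS = [Fraction(-1), Fraction(0), Fraction(1), Fraction(3)]
WEIGHTS = [Fraction(1), Fraction(2), Fraction(2), Fraction(1)]

def inner(f, g):
    """<f, g> = sum_i f(x_i) g(x_i) w_i for functions stored by their values on the atoms."""
    return sum(a * b * w for a, b, w in zip(f, g, WEIGHTS))

def monic_values(n):
    """Values on the atoms of the monic orthogonal P_0, ..., P_{n-1} (Gram-Schmidt)."""
    polys = []
    for k in range(n):
        v = [x ** k for x in ATOMS]
        for p in polys:
            coef = inner(v, p) / inner(p, p)
            v = [vi - coef * pi for vi, pi in zip(v, p)]
        polys.append(v)
    return polys

def cd_kernel(n):
    """Matrix K_n(x_i, x_j) on the atoms, together with the norms h_k = 1/gamma_k^2."""
    polys = monic_values(n)
    norms = [inner(p, p) for p in polys]
    size = len(ATOMS)
    kernel = [[sum(p[i] * p[j] / h for p, h in zip(polys, norms))
               for j in range(size)] for i in range(size)]
    return kernel, norms

def det(matrix):
    """Determinant by the Leibniz formula (small matrices only)."""
    n = len(matrix)
    total = Fraction(0)
    for perm in permutations(range(n)):
        inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
        term = Fraction((-1) ** inversions)
        for i in range(n):
            term *= matrix[i][perm[i]]
        total += term
    return total

def trace(kernel):
    """Discrete analogue of int K_n(x, x) dmu(x)."""
    return sum(kernel[i][i] * WEIGHTS[i] for i in range(len(ATOMS)))

def reproduces(kernel):
    """Check sum_s K(x_i, x_s) K(x_s, x_j) w_s = K(x_i, x_j) for all atoms."""
    size = len(ATOMS)
    return all(
        sum(kernel[i][s] * kernel[s][j] * WEIGHTS[s] for s in range(size)) == kernel[i][j]
        for i in range(size) for j in range(size))

def density_identity(n, idx):
    """Check det(K_n(x_i, x_j)) = Delta_n^2 / D_n at the n atoms with indices idx."""
    kernel, norms = cd_kernel(n)
    d_n = Fraction(1)
    for h in norms:
        d_n *= h  # D_n = h_0 h_1 ... h_{n-1} by (1.5), with D_0 = 1
    delta = Fraction(1)
    for a in range(len(idx)):
        for b in range(a):
            delta *= ATOMS[idx[a]] - ATOMS[idx[b]]
    return det([[kernel[i][j] for j in idx] for i in idx]) == delta ** 2 / d_n

if __name__ == "__main__":
    kernel, _ = cd_kernel(3)
    print(trace(kernel), reproduces(kernel), density_identity(3, [0, 1, 3]))
```

The sketch requires $n$ to be at most the number of atoms, since otherwise some norm $h_{k}$ vanishes.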

For this reason we call the $n$ -point process with density (2.4) the Christoffel-Darboux point process.

2.2. Determinantal point process

The fact that the density $P(x_{1},\ldots,x_{n})$ can be written as a determinant of a kernel function $K(x,y)$ that satisfies Property 2.2 is important: it allows one to compute the correlation functions for $k$ points ( $k\leq n$ ) of the point process, in particular the probability density of one point (for $k=1$ ).

Definition 2.4.

For $k\leq n$ the $k$ th correlation function is

[TABLE]

The interpretation of these $k$ th correlation functions is the following: if $A_{i}\cap A_{j}=\emptyset$ ( $i\neq j$ ), and $N(A)$ is the number of points in $A$ , then

[TABLE]

The $k$ th correlation function can also be seen as the density of the marginal distribution of $k$ points in the $n$ -point process, up to a normalization factor:

Property 2.5.

The $k$ th correlation function is obtained from $P(x_{1},\ldots,x_{n})$ by

[TABLE]

Proof.

For $k=n-1$ we have, by expanding the determinant along the last row,

[TABLE]

By Property 2.2 the last term is $1/(n-1)!\rho_{n-1}(x_{1},\ldots,x_{n-1})$ . Expanding the remaining determinant along the last column gives

[TABLE]

The determinant does not contain $x_{n}$ , so the remaining integration can be done using Property 2.2 and gives

[TABLE]

The sum over $\ell$ gives the $(n-1)\times(n-1)$ determinant (recall that column $k$ which contains $K_{n}(x_{i},x_{k})$ is missing since $j\neq k$ )

[TABLE]

and to get the last column in the $k$ th position, we need to interchange columns $n-1-k$ times, which gives

[TABLE]

and hence

[TABLE]

To prove the case for all $k=n-m$ one uses induction on $m$ , for which we just proved the case $m=1$ . ∎

Definition 2.6.

A point process on $\mathbb{R}$ with correlation functions $\rho_{k}$ is a determinantal point process if there exists a kernel $K(x,y)$ such that for every $k$ and every $x_{1},\ldots,x_{k}\in\mathbb{R}$

[TABLE]

The following theorem shows that Property 2.2 is indeed crucial.

Theorem 2.7.

Suppose $K:\mathbb{R}\times\mathbb{R}\to\mathbb{R}$ is a kernel such that

• $\int_{-\infty}^{\infty}K(x,x)\,dx=n\in\mathbb{N}$ ,

• for every $x_{1},\ldots,x_{k}\in\mathbb{R}$ , one has $\det\bigl(K(x_{i},x_{j})\bigr)_{i,j=1}^{k}\geq 0$ ,

• $K(x,y)=\int_{-\infty}^{\infty}K(x,s)K(s,y)\,ds$ .

Then

[TABLE]

is a probability density on $\mathbb{R}^{n}$ which is invariant under permutations of coordinates. The associated $n$ -point process is determinantal.

The most important example (at least in the context of this section) is when $d\mu(x)=w(x)\,dx$ , and then one can take

[TABLE]

2.3. Random matrices

To see the relation with random matrices, we claim that the eigenvalues of certain random matrices of order $n$ form a determinantal point process with the Christoffel-Darboux kernel for a particular family of orthogonal polynomials. The Gaussian unitary ensemble (GUE) consists of Hermitian random matrices $\mathbf{M}$ of order $n$ with random entries

[TABLE]

where all $X_{k,\ell},Y_{k,\ell},X_{k,k}$ are independent normal random variables with mean zero and variance $\frac{1}{4n}$ (if $k<\ell$ ) or $\frac{1}{2n}$ (if $k=\ell$ ). The multivariate density is

[TABLE]

where $Z_{n}$ is a normalizing constant. But this is also equal to

[TABLE]

where $M_{k,\ell}=(x_{k,\ell}+iy_{k,\ell})$ for $k<\ell$ , $M_{k,k}=x_{k,k}$ , and $M=M^{*}$ .

We are mostly interested in the eigenvalues $\lambda_{1},\ldots,\lambda_{n}$ of the random matrix $\mathbf{M}$ . To find the density of the eigenvalues, we use the change of variables: ${M}\mapsto(\Lambda,U)$ , where $U$ is a unitary matrix for which

[TABLE]

and $\Lambda=\textup{diag}(\lambda_{1},\ldots,\lambda_{n})$ , and then integrate over the unitary part $U$ , which leaves only the eigenvalues. This change of variables is done using the Weyl integration formula (see, e.g., [1, §4.1.3]):

Theorem 2.8 (Weyl integration formula).

For the change of variables $M=U\Lambda U^{*}$ one has

[TABLE]

where $c_{n}$ is a constant and $dU$ is the Haar measure on the unitary group.

We will use a simplified version of this result, for which one does not need the Haar measure on the unitary group. This works when the expression $f(M)$ that we want to integrate only depends on the eigenvalues of $M$ . Let $\mathcal{H}_{n}$ be the Hermitian matrices of order $n$ .

Definition 2.9.

A function $f:\mathcal{H}_{n}\to\mathbb{C}$ is a class function if

[TABLE]

for all unitary matrices $U$ .

Theorem 2.10 (Weyl integration formula for class functions).

For an integrable class function $f$ we have

[TABLE]

with

[TABLE]

The characteristic polynomial of a matrix $M$ only depends on the eigenvalues, hence $\det(xI-M)$ is a class function. For random matrices in GUE one finds for the average characteristic polynomial

[TABLE]

and by (2.3) this is the monic Hermite polynomial $H_{n}(\sqrt{n}x)$ . More generally, the eigenvalues of a random matrix in GUE form a determinantal point process with the Christoffel-Darboux kernel of (scaled) Hermite polynomials. The average number of eigenvalues of $\mathbf{M}$ in $[a,b]$ is given in terms of the correlation function $\rho_{1}(x)$ :

[TABLE]

2.4. Random matrix ensembles

Here we give a few more random matrix ensembles for which the eigenvalues form a determinantal point process with the Christoffel-Darboux kernel of classical orthogonal polynomials.

• We already defined GUE (Gaussian Unitary Ensemble): this contains random matrices in $\mathcal{H}_{n}$ with density

[TABLE]

The average characteristic polynomial is

[TABLE]

This suggests that on the average the eigenvalues behave like the zeros of (scaled) Hermite polynomials. This is indeed true, but for this one needs the correlation function $\rho_{1}$ and the result that

[TABLE]

where $x_{1,n},\ldots,x_{n,n}$ are the zeros of the Hermite polynomial $H_{n}$ .

• The Wishart ensemble. Let $\mathbf{M}$ be an $n\times m$ matrix $(m\geq n)$ with independent complex Gaussian entries $X_{k,\ell}+iY_{k,\ell}$ . Then $\mathbf{M}\mathbf{M}^{*}$ has the Wishart distribution with density

[TABLE]

The average characteristic polynomial is

[TABLE]

Observe that $\mathbf{M}\mathbf{M}^{*}$ is a positive definite matrix so that all the eigenvalues are positive. On the average they behave like the zeros of Laguerre polynomials.

• Truncated unitary matrices. Let $U$ be a random unitary matrix of order $(m+k)\times(m+k)$ and let $\mathbf{V}$ be the $m\times n$ upper left corner $(m\geq n)$ . Then $\mathbf{V}^{*}\mathbf{V}$ is an $n\times n$ matrix and

[TABLE]

Unitary matrices have their eigenvalues on the unit circle, and a truncated unitary matrix has its singular values (the eigenvalues of $\mathbf{V}^{*}\mathbf{V}$ ) in $[0,1]$ . These eigenvalues behave on the average like the zeros of Jacobi polynomials.

Exercise.

Let $\mathbf{M}_{n}$ be the Hermitian random matrix with entries

$(\mathbf{M}_{n})_{k,\ell}=\begin{cases}X_{k,\ell}+iY_{k,\ell},&k<\ell,\\ X_{\ell,k}-iY_{\ell,k},&k>\ell,\\ X_{k,k},&k=\ell,\end{cases}$

where $X_{k,\ell},Y_{k,\ell}$ $(k<\ell)$ and $X_{k,k}$ $(1\leq k\leq n)$ are independent random variables with means $\mathbb{E}(X_{k,\ell})=\mathbb{E}(Y_{k,\ell})=\mathbb{E}(X_{k,k})=0$ and variances $\mathbb{E}(X_{k,\ell}^{2})=\mathbb{E}(Y_{k,\ell}^{2})=\mathbb{E}(X_{k,k}^{2})=\sigma^{2}>0$ . Show that $P_{n}(x)=\mathbb{E}\det(xI_{n}-\mathbf{M}_{n})$ satisfies the three-term recurrence relation

$P_{n}(x)=xP_{n-1}(x)-2(n-1)\sigma^{2}P_{n-2}(x),$

with $P_{0}(x)=1$ and $P_{1}(x)=x$ . Identify this $P_{n}(x)$ as $\sigma^{n}H_{n}\bigl(x/(2\sigma)\bigr)$ , where $H_{n}$ is the Hermite polynomial of degree $n$ . This shows that the Hermite polynomial is the average characteristic polynomial of a large class of Hermitian random matrices, not only GUE.
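The identification asked for in the exercise can be sanity-checked by computer before attempting a proof. The Python sketch below is an added illustration with hypothetical function names. It generates $P_{n}$ from the stated recurrence and compares its coefficients with those of $\sigma^{n}H_{n}\bigl(x/(2\sigma)\bigr)$, assuming the classical normalization $H_{n+1}(x)=2xH_{n}(x)-2nH_{n-1}(x)$ with $H_{0}=1$ and $H_{1}=2x$. It does not derive the recurrence from the random matrix model, which remains the substance of the exercise.

```python
from fractions import Fraction

def average_char_poly(n, sigma):
    """P_n from P_k = x P_{k-1} - 2 (k-1) sigma^2 P_{k-2}, with P_0 = 1 and P_1 = x."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(1)]]
    for k in range(2, n + 1):
        shifted = [Fraction(0)] + polys[k - 1]  # coefficients of x * P_{k-1}
        lower = polys[k - 2] + [Fraction(0)] * (len(shifted) - len(polys[k - 2]))
        polys.append([s - 2 * (k - 1) * sigma ** 2 * l for s, l in zip(shifted, lower)])
    return polys[n]

def hermite(n):
    """Classical H_n from H_{k+1} = 2x H_k - 2k H_{k-1}, with H_0 = 1 and H_1 = 2x."""
    polys = [[Fraction(1)], [Fraction(0), Fraction(2)]]
    for k in range(1, n):
        shifted = [Fraction(0)] + [2 * c for c in polys[k]]  # 2x * H_k
        lower = polys[k - 1] + [Fraction(0)] * (len(shifted) - len(polys[k - 1]))
        polys.append([s - 2 * k * l for s, l in zip(shifted, lower)])
    return polys[n]

def scaled_hermite(n, sigma):
    """Coefficients (increasing degree) of sigma^n * H_n(x / (2 sigma))."""
    return [sigma ** n * c / (2 * sigma) ** j for j, c in enumerate(hermite(n))]

if __name__ == "__main__":
    sigma = Fraction(3, 2)  # arbitrary test value
    for n in range(6):
        print(n, average_char_poly(n, sigma) == scaled_hermite(n, sigma))
```

For $n=2$ both sides give $x^{2}-2\sigma^{2}$, which matches a direct computation of $\mathbb{E}\det(xI_{2}-\mathbf{M}_{2})=x^{2}-\mathbb{E}|X_{1,2}+iY_{1,2}|^{2}$.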

So far we found that on the average the eigenvalues of random matrices from these ensembles behave like zeros of orthogonal polynomials. To get more information about individual eigenvalues, for example the largest eigenvalue or the smallest eigenvalue, one needs a more detailed analysis of the point process. In particular one needs to investigate the asymptotic behavior of the Christoffel-Darboux kernels. For example, to understand the spacing between the eigenvalues in the neighborhood of $x^{*}$ in the bulk of the spectrum, one needs results for

[TABLE]

or, when $x^{*}$ is at the end of the spectrum,

[TABLE]

where $\gamma$ depends on the nature of the endpoint (hard or soft edge). This will give kernels of well-known point processes.

An important quantity of interest is the probability $p_{A}(m)$ that there are exactly $m$ eigenvalues in the set $A\subset\mathbb{R}$ . If there are $m$ eigenvalues in $A$ , then the number of ordered $k$ -tuples in $A$ is $\binom{m}{k}$ and thus

[TABLE]

because this is the expected number of ordered $k$ -tuples in $A$ . For $k=0$ one has

[TABLE]

therefore

[TABLE]

Changing the order of summation (we assume that this is allowed) and using

[TABLE]

we find that

[TABLE]

This is the so-called gap probability: the probability to find no eigenvalues in $A$ . For a determinantal point process, such as the eigenvalues of various random matrices, this gap probability is in fact the Fredholm determinant $\det(I-K_{A})$ of the operator $K_{A}:L^{2}(A)\to L^{2}(A)$ defined by

[TABLE]

The asymptotic behavior as the size $n$ of the random matrices increases to infinity then gives the Fredholm determinant $\det(I-K_{A})$ of the operator $K_{A}$ that uses the kernel $K(x,y)$ which is the limit of the Christoffel-Darboux kernel $K_{n}(x,y)$ as described above. The lesson to be learned from this is that the asymptotic behavior of orthogonal polynomials and their Christoffel-Darboux kernel gives important insight into the behavior of eigenvalues of random matrices.

3. Multiple orthogonal polynomials

In this section we will explain the notion of multiple orthogonal polynomials. Useful references are Ismail’s book [20, Ch. 23], Nikishin and Sorokin’s book [33, Ch. 4] and the papers [2, 29, 48]. Instead of orthogonality conditions with respect to one measure on the real line, the orthogonality will be with respect to $r$ measures, where $r\geq 1$ . For $r=1$ one has the usual orthogonal polynomials, but for $r\geq 2$ one gets two types of multiple orthogonal polynomials.

Let $r\in\mathbb{N}$ and let $\mu_{1},\ldots,\mu_{r}$ be positive measures on the real line, for which all the moments exist. We use multi-indices $\vec{n}=(n_{1},n_{2},\ldots,n_{r})\in\mathbb{N}^{r}$ and denote their length by $|\vec{n}|=n_{1}+n_{2}+\cdots+n_{r}$ .

Definition 3.1 (type I).

Type I multiple orthogonal polynomials for $\vec{n}$ consist of the vector $(A_{\vec{n},1},\ldots,A_{\vec{n},r})$ of $r$ polynomials, with $\deg A_{\vec{n},j}\leq n_{j}-1$ , for which

[TABLE]

with normalization

[TABLE]

Definition 3.2 (type II).

The type II multiple orthogonal polynomial for $\vec{n}$ is the monic polynomial $P_{\vec{n}}$ of degree $|\vec{n}|$ for which

[TABLE]

for $1\leq j\leq r$ .

The conditions for type I and type II multiple orthogonal polynomials give a system of $|\vec{n}|$ linear equations for the $|\vec{n}|$ unknown coefficients of the polynomials. This system may not have a solution, or when a solution exists it may not be unique. A multi-index $\vec{n}$ is said to be normal if the type I vector $(A_{\vec{n},1},\ldots,A_{\vec{n},r})$ exists and is unique, and this is equivalent to the existence and uniqueness of the monic type II multiple orthogonal polynomial $P_{\vec{n}}$ , because the matrix of the linear system for type II is the transpose of the matrix for the type I linear system. Hence $\vec{n}$ is a normal multi-index if and only if

[TABLE]

where

[TABLE]

are rectangular Hankel matrices containing the moments

[TABLE]

3.1. Special systems

Interesting systems of measures $(\mu_{1},\ldots,\mu_{r})$ are those for which all the multi-indices are normal. We call such systems perfect. Here we will describe two such systems.

Definition 3.3 (Angelesco system).

The measures $(\mu_{1},\ldots,\mu_{r})$ are an Angelesco system if the supports of the measures are subsets of disjoint intervals $\Delta_{j}$ , i.e., $\textup{supp}(\mu_{j})\subset\Delta_{j}$ and $\Delta_{i}\cap\Delta_{j}=\emptyset$ whenever $i\neq j$ .

Usually one allows that the intervals are touching, i.e., $\stackrel{{\scriptstyle\circ}}{{\Delta_{i}}}\cap\stackrel{{\scriptstyle\circ}}{{\Delta_{j}}}=\emptyset$ whenever $i\neq j$ .

Theorem 3.4 (Angelesco, Nikishin).

The type II multiple orthogonal polynomial $P_{\vec{n}}$ for an Angelesco system has exactly $n_{j}$ distinct zeros on $\stackrel{{\scriptstyle\circ}}{{\Delta_{j}}}$ for $1\leq j\leq r$ .

This means that the type II multiple orthogonal polynomial $P_{\vec{n}}$ can be factored as $P_{\vec{n}}(x)=\prod_{j=1}^{r}p_{\vec{n},j}(x)$ , where $p_{\vec{n},j}$ has all its zeros on $\Delta_{j}$ . In fact, $p_{\vec{n},j}$ is an ordinary orthogonal polynomial of degree $n_{j}$ on the interval $\Delta_{j}$ for the measure $\prod_{i\neq j}p_{\vec{n},i}(x)\ d\mu_{j}(x)$ :

[TABLE]

Observe that for $i\neq j$ the polynomial $p_{\vec{n},i}(x)$ has constant sign on $\Delta_{j}$ .

Corollary 3.5.

Every multi-index $\vec{n}$ is normal (an Angelesco system is perfect).
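Theorem 3.4 can be checked numerically on a small case. The following is a minimal sketch, assuming the simplest Angelesco system with touching intervals: Lebesgue measure on $\Delta_{1}=[-1,0]$ and on $\Delta_{2}=[0,1]$. The function names are ours and purely illustrative. The monic type II polynomial is obtained by solving the linear system of orthogonality conditions written in terms of the moments, and its zeros are then counted in each open interval.

```python
import numpy as np

def angelesco_type2(n1, n2):
    """Monic type II polynomial for dx on [-1, 0] and dx on [0, 1].

    Returns the coefficients in increasing powers of x (length n1 + n2 + 1).
    """
    N = n1 + n2
    intervals = (
        (n1, lambda m: (-1.0) ** m / (m + 1)),  # moments of dx on [-1, 0]
        (n2, lambda m: 1.0 / (m + 1)),          # moments of dx on [0, 1]
    )
    rows, rhs = [], []
    for nj, mom in intervals:
        for k in range(nj):
            # orthogonality: sum_i c_i * m_{k+i} = -m_{k+N} (monic leading term moved right)
            rows.append([mom(k + i) for i in range(N)])
            rhs.append(-mom(k + N))
    lower = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(lower, 1.0)

def zeros_per_interval(n1, n2):
    """Count zeros in (-1, 0) and (0, 1); also return the largest imaginary part."""
    roots = np.roots(angelesco_type2(n1, n2)[::-1])  # np.roots wants highest power first
    real = np.sort(roots.real)
    left = int(np.sum((real > -1.0) & (real < 0.0)))
    right = int(np.sum((real > 0.0) & (real < 1.0)))
    return left, right, float(np.max(np.abs(roots.imag)))

print(zeros_per_interval(2, 2))
print(zeros_per_interval(3, 1))
```

For $\vec{n}=(2,2)$ the system can also be solved by hand: symmetry forces an even polynomial, and the two conditions on $[0,1]$ give $P_{(2,2)}(x)=x^{4}-\tfrac{4}{5}x^{2}+\tfrac{1}{15}$, which has two zeros in each open interval.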

*Exercise.*

Show that every $A_{\vec{n},j}$ has $n_{j}-1$ zeros on $\stackrel{{\scriptstyle\circ}}{{\Delta_{j}}}$ .

For another system of measures, which are all supported on the same interval $[a,b]$ , we need to recall the notion of a Chebyshev system.

Definition 3.6.

The functions $\varphi_{1},\ldots,\varphi_{n}$ are a Chebyshev system on $[a,b]$ if every linear combination $\sum_{i=1}^{n}a_{i}\varphi_{i}$ with $(a_{1},\ldots,a_{n})\neq(0,\ldots,0)$ has at most $n-1$ zeros on $[a,b]$ .

We can then define an Algebraic Chebyshev system:

Definition 3.7 (AT-system).

The measures $(\mu_{1},\ldots,\mu_{r})$ are an AT-system on the interval $[a,b]$ if the measures are all absolutely continuous with respect to a positive measure $\mu$ on $[a,b]$ , i.e., $d\mu_{j}(x)=w_{j}(x)\,d\mu(x)$ $(1\leq j\leq r)$ , and for every $\vec{n}$ the functions

[TABLE]

are a Chebyshev system on $[a,b]$ .

For an AT-system we have some control of the zeros of the type I and type II multiple orthogonal polynomials.

Theorem 3.8.

For an AT-system the function

[TABLE]

has exactly $|\vec{n}|-1$ sign changes on $(a,b)$ . Furthermore, the type II multiple orthogonal polynomial $P_{\vec{n}}$ has exactly $|\vec{n}|$ distinct zeros on $(a,b)$ .

Corollary 3.9.

Every multi-index in an AT-system is normal (an AT-system is perfect).

A very special system of measures was introduced by Nikishin in 1980.

Definition 3.10 (Nikishin system for $r=2$ ).

A Nikishin system of order $r=2$ consists of two measures $(\mu_{1},\mu_{2})$ , both supported on an interval $\Delta_{2}$ , and such that

[TABLE]

where $\sigma$ is a positive measure on an interval $\Delta_{1}$ and $\Delta_{1}\cap\Delta_{2}=\emptyset$ .

Nikishin showed that the multi-indices with $n_{1}\geq n_{2}$ are normal. Driver and Stahl [12] proved the general statement.

Theorem 3.11 (Nikishin, Driver-Stahl).

A Nikishin system of order two is perfect.

In order to define a Nikishin system of order $r>2$ we need some notation. We write $\langle\sigma_{1},\sigma_{2}\rangle$ for the measure which is absolutely continuous with respect to $\sigma_{1}$ and for which the Radon-Nikodym derivative is the Stieltjes transform of $\sigma_{2}$ :

[TABLE]

Nikishin systems of order $r$ can then be defined by induction.

Definition 3.12 (Nikishin system for general $r$ ).

A Nikishin system of order $r$ on an interval $\Delta_{r}$ is a system of $r$ measures $(\mu_{1},\mu_{2},\ldots,\mu_{r})$ supported on $\Delta_{r}$ such that $\mu_{j}=\langle\mu_{1},\sigma_{j}\rangle$ , $2\leq j\leq r$ , where $(\sigma_{2},\ldots,\sigma_{r})$ is a Nikishin system of order $r-1$ on an interval $\Delta_{r-1}$ and $\Delta_{r}\cap\Delta_{r-1}=\emptyset$ .

Fidalgo Prieto and López Lagomasino proved [13]

Theorem 3.13.

Every Nikishin system is perfect.

In most cases the measures $(\mu_{1},\ldots,\mu_{r})$ are absolutely continuous with respect to one fixed measure $\mu$ :

[TABLE]

We then define the type I function

[TABLE]

The type I functions and the type II polynomials are complementary: they form a biorthogonal system for many multi-indices.

Property 3.14 (biorthogonality).

[TABLE]

3.2. Nearest neighbor recurrence relations

The usual orthogonal polynomials (the case $r=1$ ) on the real line always satisfy a three-term recurrence relation that expresses $xp_{n}(x)$ in terms of the polynomials with neighboring degrees $p_{n+1},p_{n},p_{n-1}$ . A similar result is true for multiple orthogonal polynomials, but there are more neighbors for a multi-index. Indeed, the multi-index $\vec{n}$ has $r$ neighbors from above by adding 1 to one of the components of $\vec{n}$ . We denote these neighbors from above by $\vec{n}+\vec{e}_{k}$ for $1\leq k\leq r$ , where $\vec{e}_{k}=(0,\ldots,0,1,0,\ldots,0)$ with $1$ in position $k$ . There are also $r$ neighbors from below, namely $\vec{n}-\vec{e}_{j}$ , for $1\leq j\leq r$ . The nearest neighbor recurrence relations for type II multiple orthogonal polynomials are [45]

[TABLE]

Observe that one always uses the same linear combination of the neighbors from below. The nearest neighbor recurrence relations for type I multiple orthogonal polynomials are

[TABLE]

These are using the same recurrence coefficients $a_{\vec{n},j}$ , but there is a shift for the recurrence coefficients $b_{\vec{n},k}$ . For $r\geq 2$ the recurrence coefficients $\{a_{\vec{n},j},1\leq j\leq r\}$ and $\{b_{\vec{n},k},1\leq k\leq r\}$ are connected:

Theorem 3.15 (Van Assche [45]).

The recurrence coefficients $(a_{\vec{n},1},\ldots,a_{\vec{n},r})$ and $(b_{\vec{n},1},\ldots,b_{\vec{n},r})$ satisfy the partial difference equations

[TABLE]

for all $1\leq i\neq j\leq r$ .

By combining the equations of the nearest neighbor recurrence relations, one can also find a recurrence relation of order $r+1$ for the multiple orthogonal polynomials along a path from $\vec{0}$ to $\vec{n}$ in $\mathbb{N}^{r}$ . Let $(\vec{n}_{k})_{k\geq 0}$ be a path in $\mathbb{N}^{r}$ starting from $\vec{n}_{0}=\vec{0}$ , such that $\vec{n}_{k+1}-\vec{n}_{k}=\vec{e}_{i}$ for some $1\leq i\leq r$ . Then

[TABLE]

These $\beta_{\vec{n}_{k},j}$ coefficients can be expressed in terms of the recurrence coefficients in the nearest neighbor recurrence relations, but the explicit expression is rather complicated for general $r$ . An important case is the stepline:

[TABLE]

This recurrence relation of order $r+1$ can be expressed in terms of a Hessenberg matrix with $r$ diagonals below the main diagonal:

[TABLE]

3.3. Christoffel-Darboux formula

The Christoffel-Darboux kernel, which is the important reproducing kernel for orthogonal polynomials, has a counterpart in the theory of multiple orthogonal polynomials. It uses both the type I and type II multiple orthogonal polynomials, and is a sum over a path from $\vec{0}$ to $\vec{n}$ as described before. The Christoffel-Darboux kernel is defined as

[TABLE]

where $\vec{n}_{0}=\vec{0}$ , $\vec{n}_{N}=\vec{n}$ and the path in $\mathbb{N}^{r}$ is such that $\vec{n}_{k+1}-\vec{n}_{k}=\vec{e}_{i}$ for some $i$ satisfying $1\leq i\leq r$ , i.e., in every step the multi-index is increased by 1 in one component. This definition seems to depend on the choice of the path from $\vec{0}$ to $\vec{n}$ , but surprisingly this kernel is independent of that chosen path. This is a consequence of the relations between the recurrence coefficients given by Theorem 3.15 and is best explained by the following analogue of the Christoffel-Darboux formula for orthogonal polynomials:

Theorem 3.16 (Daems and Kuijlaars).

Let $(\vec{n}_{k})_{0\leq k\leq N}$ be a path in $\mathbb{N}^{r}$ starting from $\vec{n}_{0}=\vec{0}$ and ending in $\vec{n}_{N}=\vec{n}$ (where $N=|\vec{n}|$ ), such that $\vec{n}_{k+1}-\vec{n}_{k}=\vec{e}_{i}$ for some $1\leq i\leq r$ . Then

[TABLE]

Proof.

This was first proved in [9] and a proof based on the nearest neighbor recurrence relations can be found in [45]. ∎

The sum depends only on the endpoint $\vec{n}$ of the path in $\mathbb{N}^{r}$ and not on the path from $\vec{0}$ to this point. In many cases this Christoffel-Darboux kernel can be used to generate a determinantal process by using Theorem 2.7 and the biorthogonality in Property 3.14. The only thing which is not obvious is the positivity $K_{\vec{n}}(x,x)\geq 0$ , which needs to be checked separately. See [23] for more details about such determinantal processes.
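The path independence can be observed numerically. The sketch below is an illustration under stated assumptions, not the construction used in [9] or [45]: we assume the kernel has the form $K_{\vec{n}}(x,y)=\sum_{k=0}^{N-1}P_{\vec{n}_{k}}(x)Q_{\vec{n}_{k+1}}(y)$, that the type I function is normalized by $\int x^{|\vec{n}|-1}Q_{\vec{n}}(x)\,dx=1$, and we take $r=2$ with the weights $e^{-x^{2}+c_{j}x}$ of Section 3.5 and the arbitrary choice $c=(0.5,-1)$. All function names are hypothetical. Moments are computed with a Gauss-Hermite rule, which is exact for the polynomial integrands involved.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss
from numpy.polynomial.polynomial import polyval

NODES, WEIGHTS = hermgauss(40)  # Gauss-Hermite rule for the weight exp(-y^2)

def moment(m, c):
    """Integral of x^m exp(-x^2 + c x) over the real line, via the shift x = y + c/2."""
    return np.exp(c * c / 4.0) * np.dot(WEIGHTS, (NODES + c / 2.0) ** m)

def type2(n, c):
    """Coefficients (increasing powers) of the monic type II polynomial P_n."""
    N = sum(n)
    if N == 0:
        return np.array([1.0])
    rows = [[moment(k + i, cj) for i in range(N)]
            for nj, cj in zip(n, c) for k in range(nj)]
    rhs = [-moment(k + N, cj) for nj, cj in zip(n, c) for k in range(nj)]
    return np.append(np.linalg.solve(np.array(rows), np.array(rhs)), 1.0)

def type1(n, c, y):
    """Type I function Q_n(y) = sum_j A_{n,j}(y) exp(-y^2 + c_j y), for |n| >= 1."""
    N = sum(n)
    unknowns = [(cj, i) for nj, cj in zip(n, c) for i in range(nj)]
    matrix = np.array([[moment(k + i, cj) for cj, i in unknowns] for k in range(N)])
    rhs = np.zeros(N)
    rhs[-1] = 1.0  # assumed normalization: integral of x^(N-1) Q_n(x) equals 1
    coeffs = np.linalg.solve(matrix, rhs)
    return sum(a * y ** i * np.exp(-y * y + cj * y) for a, (cj, i) in zip(coeffs, unknowns))

def kernel(path, c, x, y):
    """Sum of P_{n_k}(x) Q_{n_{k+1}}(y) along consecutive multi-indices of the path."""
    return sum(polyval(x, type2(lower, c)) * type1(upper, c, y)
               for lower, upper in zip(path[:-1], path[1:]))

c = (0.5, -1.0)
path_a = [(0, 0), (1, 0), (2, 0), (2, 1)]
path_b = [(0, 0), (0, 1), (1, 1), (2, 1)]
difference = abs(kernel(path_a, c, 0.3, -0.7) - kernel(path_b, c, 0.3, -0.7))
print(difference)
```

Both paths end at $\vec{n}=(2,1)$, so the two sums should agree up to rounding errors.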

3.4. Hermite-Padé approximation

Multiple orthogonal polynomials have their roots in Hermite-Padé approximation, which was introduced by Hermite and investigated in detail by Padé (for $r=1$ ). Hermite-Padé approximation is a method to approximate $r$ functions simultaneously by rational functions. Multiple orthogonal polynomials appear when one uses Hermite-Padé approximation near infinity. Let $(f_{1},\ldots,f_{r})$ be $r$ Markov functions, i.e.,

[TABLE]

Definition 3.17 (Type I Hermite-Padé approximation).

Type I Hermite-Padé approximation is to find $r$ polynomials $(A_{\vec{n},1},\ldots,A_{\vec{n},r})$ , with $\deg A_{\vec{n},j}\leq n_{j}-1$ , and a polynomial $B_{\vec{n}}$ such that

[TABLE]

The solution is that $(A_{\vec{n},1},\ldots,A_{\vec{n},r})$ is the type I multiple orthogonal polynomial vector, and

[TABLE]

The error in this approximation problem can also be expressed in terms of the type I multiple orthogonal polynomials. One has

[TABLE]

and the orthogonality properties of the type I multiple orthogonal polynomials indeed show that (3.1) holds.

Definition 3.18 (Type II Hermite-Padé approximation).

Type II Hermite-Padé approximation is to find a polynomial $P_{\vec{n}}$ of degree $\leq|\vec{n}|$ and polynomials $Q_{\vec{n},1},\ldots,$ $Q_{\vec{n},r}$ such that

[TABLE]

for $1\leq j\leq r$ .

The solution for this approximation problem is to take the type II multiple orthogonal polynomial $P_{\vec{n}}$ and

[TABLE]

Observe that this approximation problem is to find rational approximants to each $f_{j}$ with a common denominator, and this common denominator turns out to be the type II multiple orthogonal polynomial. The error can again be expressed in terms of the multiple orthogonal polynomial:

[TABLE]

which can be verified by using the orthogonality conditions for the type II multiple orthogonal polynomial.

Hermite-Padé approximants are used frequently in number theory to find good rational approximants for real numbers and to prove irrationality and transcendence of some important real numbers. Hermite used these approximants (but at $0$ rather than $\infty$ ) to prove that $e$ is a transcendental number.

3.5. Multiple Hermite polynomials

As an example we will describe multiple Hermite polynomials in some detail and explain some applications where they are used. The type II multiple Hermite polynomials $H_{\vec{n}}$ satisfy

$$\int_{-\infty}^{\infty}x^{k}H_{\vec{n}}(x)\,e^{-x^{2}+c_{j}x}\,dx=0,\qquad 0\leq k\leq n_{j}-1,$$

for $1\leq j\leq r$ , with $c_{i}\neq c_{j}$ whenever $i\neq j$ . This condition on the parameters $c_{1},\ldots,c_{r}$ guarantees that every multi-index $\vec{n}$ is normal, since the measures with weight function $e^{-x^{2}+c_{j}x}$ $(1\leq j\leq r)$ form an AT-system. These multiple orthogonal polynomials can be obtained by using the Rodrigues formula

[TABLE]

*Exercise.*

Show that the differential operators

$e^{-c_{j}x}\frac{d^{n_{j}}}{dx^{n_{j}}}e^{c_{j}x},\qquad 1\leq j\leq r$

are commuting. Use this (and integration by parts) to show that this indeed gives the type II multiple Hermite polynomial.

By using this Rodrigues formula (and the Leibniz rule for the $n$ th derivative of a product), one finds the explicit expression

[TABLE]

where $H_{n}$ are the usual Hermite polynomials. The nearest neighbor recurrence relations for multiple Hermite polynomials are quite simple:

[TABLE]
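As an illustration, the multiple Hermite polynomials can be computed directly from the orthogonality conditions, and the nearest neighbor recurrence can then be checked numerically. The sketch below assumes the recurrence has the form $xH_{\vec{n}}(x)=H_{\vec{n}+\vec{e}_{k}}(x)+\frac{c_{k}}{2}H_{\vec{n}}(x)+\frac{1}{2}\sum_{j=1}^{r}n_{j}H_{\vec{n}-\vec{e}_{j}}(x)$, which reduces to the familiar monic Hermite recurrence for $r=1$, $c_{1}=0$. The function names and the parameter values $c=(0.5,-1)$ are our own choices for the example.

```python
import numpy as np
from numpy.polynomial.hermite import hermgauss
from numpy.polynomial import polynomial as P

NODES, WEIGHTS = hermgauss(40)  # Gauss-Hermite rule for the weight exp(-y^2)

def moment(m, c):
    """Integral of x^m exp(-x^2 + c x) over the real line, via the shift x = y + c/2."""
    return np.exp(c * c / 4.0) * np.dot(WEIGHTS, (NODES + c / 2.0) ** m)

def multiple_hermite(n, c):
    """Coefficients (increasing powers) of the monic type II polynomial H_n."""
    N = sum(n)
    if N == 0:
        return np.array([1.0])
    rows = [[moment(k + i, cj) for i in range(N)]
            for nj, cj in zip(n, c) for k in range(nj)]
    rhs = [-moment(k + N, cj) for nj, cj in zip(n, c) for k in range(nj)]
    return np.append(np.linalg.solve(np.array(rows), np.array(rhs)), 1.0)

def recurrence_residual(n, c):
    """Largest coefficient of x H_n - H_{n+e_k} - (c_k/2) H_n - (1/2) sum_j n_j H_{n-e_j}, over all k."""
    H = multiple_hermite(n, c)
    below = np.zeros(1)
    for j, nj in enumerate(n):
        if nj > 0:  # the term with n_j = 0 has coefficient zero and is skipped
            lower = tuple(v - (i == j) for i, v in enumerate(n))
            below = P.polyadd(below, 0.5 * nj * multiple_hermite(lower, c))
    worst = 0.0
    for k, ck in enumerate(c):
        upper = tuple(v + (i == k) for i, v in enumerate(n))
        rhs = P.polyadd(P.polyadd(multiple_hermite(upper, c), 0.5 * ck * H), below)
        worst = max(worst, float(np.max(np.abs(P.polysub(P.polymulx(H), rhs)))))
    return worst

print(multiple_hermite((1, 1), (0.5, -1.0)))
print(recurrence_residual((2, 1), (0.5, -1.0)))
```

For $\vec{n}=(1,1)$ the two orthogonality conditions give $H_{(1,1)}(x)=(x-c_{1}/2)(x-c_{2}/2)-\tfrac{1}{2}$, which is a convenient hand check of the first printed result.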

They also have some useful differential properties: there are $r$ raising operators

[TABLE]

and one lowering operator

[TABLE]

By combining these raising operators and the lowering operator one finds a differential equation of order $r+1$ :

[TABLE]

where

[TABLE]

One can also find some integral representations (see [4])

[TABLE]

For the type I multiple Hermite polynomials one has

[TABLE]

where $\Gamma_{k}$ is a closed contour encircling $c_{k}/2$ once and none of the other $c_{j}/2$ , and

[TABLE]

where $\Gamma$ is a closed contour encircling all $c_{j}/2$ .

3.5.1. Random matrices

These multiple Hermite polynomials are useful for investigating random matrices with external source [5]. Let $\mathbf{M}$ be a random $N\times N$ Hermitian matrix and consider the ensemble with probability distribution

[TABLE]

where $A$ is a fixed Hermitian matrix (the external source). The average characteristic polynomial is a multiple Hermite polynomial:

Property 3.19.

Suppose $A$ has eigenvalues $c_{1},\ldots,c_{r}$ with multiplicities $n_{1},\ldots,n_{r}$ , then

[TABLE]

Furthermore, the eigenvalues form a determinantal process with the Christoffel-Darboux kernel for multiple Hermite polynomials:

Property 3.20.

The density of the eigenvalues is given by

[TABLE]

where the kernel is given by

[TABLE]

with $(\vec{n}_{k})_{0\leq k\leq N}$ a path from $\vec{0}$ to $\vec{n}$ in $\mathbb{N}^{r}$ and

[TABLE]

This means that we can also find the correlation functions:

Property 3.21.

The $m$ -point correlation function

[TABLE]

is given by

[TABLE]

where the kernel is given by

[TABLE]

3.5.2. Non-intersecting Brownian motions

Multiple Hermite polynomials also appear in the study of $n$ independent Brownian motions (in fact, $n$ Brownian bridges) that are conditioned not to intersect, see [10].

The density of the probability that the $n$ non-intersecting paths, leaving $(t=0)$ at $a_{1},\ldots,a_{n}$ and arriving $(t=1)$ at $b_{1},\ldots,b_{n}$ , are at $x_{1},\ldots,x_{n}$ at time $t\in(0,1)$ is (Karlin and McGregor [22])

[TABLE]

where

[TABLE]

When $a_{1},\ldots,a_{n}\to 0$ and $b_{1},\ldots,b_{n}\to 0$ (see Fig. 1) then

[TABLE]

where the kernel is given by

[TABLE]

This kernel is related to the Christoffel-Darboux kernel for the usual Hermite polynomials.

When $a_{1},\ldots,a_{n}\to 0$ and $b_{1},\ldots,b_{n/2}\to-b$ , $b_{n/2+1},\ldots,b_{n}\to b$ (see Fig. 2) then

[TABLE]

with

[TABLE]

with multiple orthogonal polynomials for the weights

[TABLE]

This kernel is related to the Christoffel-Darboux kernel for multiple Hermite polynomials. An interesting phenomenon appears: for small values of $t$ the points at level $t$ accumulate on one interval, but for larger values of $t$ in $(0,1)$ the points accumulate on two disjoint intervals. There is a phase transition at a critical point $t_{c}\in(0,1)$ . A detailed asymptotic analysis of the kernel near this point requires a special function satisfying a third order differential equation (the Pearcey equation), which is a limiting case of the third order differential equation of multiple Hermite polynomials. The limiting kernel is known as the Pearcey kernel.

3.6. Multiple Laguerre polynomials

The Laguerre weight is

[TABLE]

There are two easy ways to obtain multiple Laguerre polynomials:

(1) Changing the parameter $\alpha$ to $\alpha_{1},\ldots,\alpha_{r}$ . This gives multiple Laguerre polynomials of the first kind.

(2) Changing the exponential decay at infinity from $e^{-x}$ to $e^{-c_{j}x}$ with parameters $c_{1},\ldots,c_{r}$ . This gives multiple Laguerre polynomials of the second kind.

3.6.1. Multiple Laguerre polynomials of the first kind

Type II multiple Laguerre polynomials of the first kind $L_{\vec{n}}^{\vec{\alpha}}(x)$ satisfy

[TABLE]

for $1\leq j\leq r$ . In order that all multi-indices are normal we need to have parameters $\alpha_{j}>-1$ and $\alpha_{i}-\alpha_{j}\notin\mathbb{Z}$ whenever $i\neq j$ , in which case the $r$ measures form an AT-system. The multiple orthogonal polynomials can be found from the Rodrigues formula

[TABLE]

An explicit formula is

[TABLE]

Another explicit expression with hypergeometric functions is

[TABLE]

The nearest neighbor recurrence relations are

[TABLE]

with

[TABLE]

and

[TABLE]

These multiple Laguerre polynomials also have some differential properties. There are $r$ raising operators

[TABLE]

and there is one lowering operator

[TABLE]

Combining them gives the differential equation

[TABLE]

3.6.2. Multiple Laguerre polynomials of the second kind

Type II multiple Laguerre polynomials of the second kind $L_{\vec{n}}^{\alpha,\vec{c}}(x)$ satisfy

[TABLE]

for $1\leq j\leq r$ . The parameters need to satisfy $\alpha>-1$ and $c_{j}>0$ with $c_{i}\neq c_{j}$ whenever $i\neq j$ . The Rodrigues formula is

[TABLE]

which allows one to find the explicit expression

[TABLE]

The nearest neighbor recurrence relations are

[TABLE]

with

[TABLE]

The differential properties include $r$ raising operators

[TABLE]

and one lowering operator

[TABLE]

They give the differential equation

[TABLE]

where

[TABLE]

3.6.3. Random matrices: Wishart ensemble

Wishart (1928) introduced the Wishart distribution for $N\times N$ positive definite Hermitian matrices

[TABLE]

where all the columns of $\mathbf{X}$ are independent and have a multivariate Gauss distribution with covariance matrix $\Sigma$ . The density for the Wishart distribution is

[TABLE]

If $\Sigma=I_{N}$ then Laguerre polynomials (with $\alpha=p$ ) play an important role. If $\Sigma^{-1}$ has eigenvalues $c_{1},\ldots,c_{r}$ with multiplicities $n_{1},\ldots,n_{r}$ , then we need multiple Laguerre polynomials of the second kind. The average characteristic polynomial is

[TABLE]

3.7. Jacobi-Piñeiro polynomials

There are several ways to find multiple Jacobi polynomials. Here we only mention one way which uses the same differential operators as the multiple Laguerre polynomials of the first kind. The Jacobi-Piñeiro polynomials $P_{\vec{n}}^{(\vec{\alpha},\beta)}$ satisfy

[TABLE]

for $1\leq j\leq r$ . Hence we are using Jacobi weights $x^{\alpha}(1-x)^{\beta}$ on the interval $[0,1]$ , with $\alpha,\beta>-1$ but with $r$ different parameters $\alpha_{1},\ldots,\alpha_{r}$ . In order to have a perfect system we require $\alpha_{i}-\alpha_{j}\notin\mathbb{Z}$ whenever $i\neq j$ . They can be obtained using the Rodrigues formula

[TABLE]

An expression in terms of generalized hypergeometric functions is

[TABLE]

This hypergeometric function does not terminate when $\beta$ is not an integer. Another useful expression is

[TABLE]

Again there are $r$ raising differential operators and one lowering operator and the recurrence coefficients are known explicitly. These polynomials are useful for rational approximation of polylogarithms, and in particular for the zeta function $\zeta(k)$ at integers. The polylogarithms are defined by

[TABLE]

and one has

[TABLE]

Simultaneous rational approximation to $\textup{Li}_{1}(1/z),\ldots,\textup{Li}_{r}(1/z)$ can be done using Hermite-Padé approximation with a limiting case of Jacobi-Piñeiro polynomials where $\beta=0$ and $\alpha_{1}=\alpha_{2}=\cdots=\alpha_{r}=0$ , which is possible when $n_{1}\geq n_{2}\geq\cdots\geq n_{r}$ . This is particularly interesting if we let $z\to 1$ , since $\textup{Li}_{k}(1)=\zeta(k)$ . Apéry’s construction of good rational approximants for $\zeta(3)$ (proving that $\zeta(3)$ is irrational) essentially makes use of these multiple orthogonal polynomials, see, e.g. [43].

4. Orthogonal polynomials and Painlevé equations

In this section we describe how orthogonal polynomials are related to non-linear difference and differential equations, in particular to discrete Painlevé equations and the six Painlevé differential equations. For a recent discussion on this relation between orthogonal polynomials and Painlevé equations we refer to the monograph [46]. Other useful references are [8, 7, 44].

Painlevé equations (discrete and continuous) appear at various places in the theory of orthogonal polynomials, in particular

• The recurrence coefficients of some semiclassical orthogonal polynomials satisfy discrete Painlevé equations.

• The recurrence coefficients of orthogonal polynomials with a Toda-type evolution satisfy Painlevé differential equations for which special solutions depending on special functions (Airy, Bessel, (confluent) hypergeometric, parabolic cylinder functions) are relevant.

• Rational solutions of Painlevé equations can be expressed in terms of Wronskians of orthogonal polynomials.

• The local asymptotics for orthogonal polynomials at critical points often involve special transcendental solutions of Painlevé equations.

In this section we will only deal with the first two of these.

What are Painlevé (differential) equations? They are second order nonlinear differential equations

[TABLE]

that have the Painlevé property: The general solution is free from movable branch points. The only singularities which may depend on the initial conditions are poles. Painlevé and his collaborators found 50 families (up to Möbius transformations), all of which could be reduced to known equations and six new equations (new at least at the beginning of the 20th century). The six Painlevé equations are

[TABLE]

Discrete Painlevé equations are somewhat more difficult to describe. Roughly speaking they are second order nonlinear recurrence equations for which the continuous limit is a Painlevé equation. They have the singularity confinement property, but this property is not sufficient to characterize discrete Painlevé equations. A quote by Kruskal [24] is:

Anything simpler becomes trivially integrable, anything more complicated becomes hopelessly non-integrable.

A more correct description is that they are nonlinear recurrence relations with ‘nice’ symmetry and geometry. A full classification of discrete (and continuous) Painlevé equations has been found by Sakai [36]. This is based on rational surfaces associated with affine root systems. It describes the space of initial values which parametrizes all the solutions (Okamoto [34]). A fine tuning of this classification was given recently by Kajiwara, Noumi and Yamada [21]: they also include the symmetry, i.e., the group of Bäcklund transformations, which are transformations that map a solution of a Painlevé equation to another solution with different parameters. A partial list of discrete Painlevé equations is:

[TABLE]

where $z_{n}=\alpha n+\beta$ and $a,b,c,d$ are constants.

[TABLE]

where $q_{n}=q_{0}q^{n}$ and $a,b,c,d$ are constants.

[TABLE]

The latter corresponds to $\textup{d-P}(E_{6}^{(1)}/A_{2}^{(1)})$ where $E_{6}^{(1)}$ is the surface type and $A_{2}^{(1)}$ is the symmetry type. Sakai’s classification (surface type) corresponds to the following diagram:

[TABLE]

4.1. Compatibility and Lax pairs

There is a general philosophy behind the reason why Painlevé equations appear for the recurrence coefficients of orthogonal polynomials. Orthogonal polynomials $P_{n}(x)$ are really functions of two variables: a discrete variable $n$ and a continuous variable $x$ . The three term recurrence relation (1.2) gives a difference equation in the variable $n$ , and if the measure is absolutely continuous with a weight function $w$ that satisfies a Pearson equation

$$[\sigma(x)w(x)]^{\prime}=\tau(x)w(x), \tag{4.7}$$

where $\sigma$ and $\tau$ are polynomials, then the orthogonal polynomials also satisfy differential relations in the variable $x$ . If $\deg\sigma\leq 2$ and $\deg\tau=1$ then we are dealing with classical orthogonal polynomials which satisfy the second order differential equation

$$\sigma(x)y^{\prime\prime}(x)+\tau(x)y^{\prime}(x)+\lambda_{n}y(x)=0,$$

where $\lambda_{n}=-n(n-1)\sigma^{\prime\prime}/2-n\tau^{\prime}$ . In the semiclassical case we still have the Pearson equation (4.7) but we allow $\deg\sigma>2$ or $\deg\tau\neq 1$ . In that case there is a structure relation

$$\sigma(x)P_{n}^{\prime}(x)=\sum_{k=n-t}^{n+s-1}A_{n,k}P_{k}(x), \tag{4.8}$$

where $s=\deg\sigma$ and $t=\max\{\deg\tau,\deg\sigma-1\}$ . The structure relation (4.8) and the three-term recurrence relation (1.2) have to be compatible: if we differentiate the terms in the recurrence relation (1.2) and replace all the $P_{k}^{\prime}(x)$ using the structure relation (4.8), then we get a linear combination of a finite number of orthogonal polynomials that is equal to $0$. Since (orthogonal) polynomials are linearly independent in the linear space of polynomials, the coefficients in this linear combination have to be zero, and this gives relations between the recurrence coefficients $a_{n}^{2},b_{n}$ and the coefficients $A_{n,k}$ in the structure relation. Eliminating these $A_{n,k}$ gives recurrence relations for the $a_{n}^{2},b_{n}$ , which turn out to be non-linear. If they are of second order, then we can identify them as discrete Painlevé equations. In this way the three-term recurrence relation and the structure relation can be considered as a Lax pair for the obtained discrete Painlevé equation.

In order to get to the Painlevé differential equation, we need to introduce an extra continuous parameter $t$ . For this we will use an exponential modification of the measure $\mu$ and investigate orthogonal polynomials for the measure $d\mu_{t}(x)=e^{xt}\,d\mu(x)$ , whenever all the moments of this modified measure exist. We will denote the monic orthogonal polynomials by $P_{n}(x;t)$ and in this way the orthogonal polynomial is now a function of three variables $n,x,t$ . The behavior for the parameter $t$ is given by:

Theorem 4.1.

The monic orthogonal polynomials $P_{n}(x;t)$ for the measure $d\mu_{t}(x)=e^{xt}\,d\mu(x)$ satisfy

$$\frac{d}{dt}P_{n}(x;t)=C_{n}(t)P_{n-1}(x;t), \tag{4.9}$$

where $C_{n}(t)$ depends only on $t$ and $n$ .

Proof.

First of all, since $P_{n}(x;t)$ is a monic polynomial, the derivative $\frac{d}{dt}P_{n}(x;t)$ is a polynomial of degree $\leq n-1$ . We will show that it is orthogonal to $x^{k}$ for $0\leq k\leq n-2$ for the measure $e^{xt}\,d\mu(x)$ , so that it is proportional to $P_{n-1}(x;t)$ , which proves (4.9). We start from the orthogonality relations

$$\int_{\mathbb{R}}P_{n}(x;t)\,x^{k}e^{xt}\,d\mu(x)=0,\qquad 0\leq k\leq n-1,$$

and take derivatives with respect to $t$ to find

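$$\int_{\mathbb{R}}\frac{d}{dt}P_{n}(x;t)\,x^{k}e^{xt}\,d\mu(x)+\int_{\mathbb{R}}P_{n}(x;t)\,x^{k+1}e^{xt}\,d\mu(x)=0,\qquad 0\leq k\leq n-1.$$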

The second integral vanishes for $0\leq k\leq n-2$ by orthogonality, hence

$$\int_{\mathbb{R}}\frac{d}{dt}P_{n}(x;t)\,x^{k}e^{xt}\,d\mu(x)=0,\qquad 0\leq k\leq n-2,$$

which is what we needed to prove. ∎

This relation is not new, see e.g. [39, §4], but has not been sufficiently appreciated in the literature. If we now check the compatibility between (4.9) and the three-term recurrence relation (1.2), then we find differential-difference equations for the recurrence coefficients $a_{n}^{2},b_{n}$ .

Theorem 4.2 (Toda equations).

The recurrence coefficients $a_{n}^{2}(t)$ and $b_{n}(t)$ for the orthogonal polynomials $P_{n}(x;t)$ satisfy

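$$\frac{d}{dt}a_{n}^{2}=a_{n}^{2}(b_{n}-b_{n-1}),\qquad n\geq 1, \tag{4.10}$$

$$\frac{d}{dt}b_{n}=a_{n+1}^{2}-a_{n}^{2},\qquad n\geq 0, \tag{4.11}$$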

with $a_{0}^{2}=0$ .

Proof.

If we take derivatives with respect to $t$ in the three-term recurrence relations (1.2), then

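$$x\frac{d}{dt}P_{n}(x;t)=\frac{d}{dt}P_{n+1}(x;t)+b_{n}^{\prime}(t)P_{n}(x;t)+b_{n}(t)\frac{d}{dt}P_{n}(x;t)+(a_{n}^{2})^{\prime}(t)P_{n-1}(x;t)+a_{n}^{2}(t)\frac{d}{dt}P_{n-1}(x;t).$$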

Use (4.9) to find

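$$xC_{n}P_{n-1}(x;t)=\bigl(C_{n+1}+b_{n}^{\prime}\bigr)P_{n}(x;t)+\bigl(b_{n}C_{n}+(a_{n}^{2})^{\prime}\bigr)P_{n-1}(x;t)+a_{n}^{2}C_{n-1}P_{n-2}(x;t).$$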

If we compare this with (1.2) (with $n$ shifted to $n-1$ ), then we find

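$$C_{n}=C_{n+1}+b_{n}^{\prime}, \tag{4.12}$$

$$C_{n}b_{n-1}=b_{n}C_{n}+(a_{n}^{2})^{\prime}, \tag{4.13}$$

$$C_{n}a_{n-1}^{2}=a_{n}^{2}C_{n-1}. \tag{4.14}$$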

From (4.14) we find that $a_{n}^{2}/C_{n}$ does not depend on $n$ , so that $a_{n}^{2}/C_{n}=a_{1}^{2}/C_{1}$ and from (4.12) we find that $C_{1}=-b_{0}^{\prime}(t)$ . A simple exercise shows that $b_{0}^{\prime}(t)=a_{1}^{2}(t)$ so that $C_{n}(t)=-a_{n}^{2}(t)$ . If we use this in (4.13), then we find (4.10). If we use it in (4.12), then we find (4.11). ∎
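The Toda equations can be tested numerically on any measure with enough moments. The following is a sketch under stated assumptions: we take (4.10)–(4.11) in the form $\frac{d}{dt}a_{n}^{2}=a_{n}^{2}(b_{n}-b_{n-1})$ and $\frac{d}{dt}b_{n}=a_{n+1}^{2}-a_{n}^{2}$, which is what the proof above produces when (1.2) is the monic recurrence $xP_{n}=P_{n+1}+b_{n}P_{n}+a_{n}^{2}P_{n-1}$. The measure (twelve equal point masses on $[-1,1]$), the value $t=0.3$ and the function names are arbitrary choices for the example. The recurrence coefficients come from the Stieltjes procedure and the $t$-derivatives from central differences.

```python
import numpy as np

def recurrence_coefficients(nodes, weights, count):
    """Stieltjes procedure for the discrete measure sum_i weights[i] * delta(nodes[i]).

    Returns (a2, b) with a2[n] = a_n^2 (a2[0] = 0) and b[n] = b_n for n = 0..count-1,
    in the monic recurrence x P_n = P_{n+1} + b_n P_n + a_n^2 P_{n-1}.
    """
    p_prev = np.zeros_like(nodes)
    p = np.ones_like(nodes)
    h_prev = 1.0
    a2, b = [], []
    for n in range(count):
        h = np.dot(weights, p * p)                  # squared norm of P_n
        b_n = np.dot(weights, nodes * p * p) / h
        a2_n = 0.0 if n == 0 else h / h_prev
        a2.append(a2_n)
        b.append(b_n)
        p, p_prev = (nodes - b_n) * p - a2_n * p_prev, p
        h_prev = h
    return np.array(a2), np.array(b)

nodes = np.linspace(-1.0, 1.0, 12)
base_weights = np.ones_like(nodes)

def coefficients_at(t, count=5):
    """Recurrence coefficients of the modified measure exp(x t) d mu(x)."""
    return recurrence_coefficients(nodes, base_weights * np.exp(t * nodes), count)

t, step = 0.3, 1e-5
a2, b = coefficients_at(t)
a2_plus, b_plus = coefficients_at(t + step)
a2_minus, b_minus = coefficients_at(t - step)
da2 = (a2_plus - a2_minus) / (2 * step)   # central difference for d/dt a_n^2
db = (b_plus - b_minus) / (2 * step)      # central difference for d/dt b_n

error_a = np.max(np.abs(da2[1:] - a2[1:] * (b[1:] - b[:-1])))   # (4.10), n = 1..4
error_b = np.max(np.abs(db[:-1] - (a2[1:] - a2[:-1])))          # (4.11), n = 0..3
print(error_a, error_b)
```

Both errors should be at the level of the finite-difference approximation. A closed-form check is also available: for $d\mu(x)=e^{-x^{2}}dx$ one has $b_{n}(t)=t/2$ and $a_{n}^{2}(t)=n/2$, so both sides of the first equation vanish and both sides of the second equal $1/2$.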

The system (4.10)–(4.11) is closely related to a chain of interacting particles with exponential interaction with their neighbors, introduced by Toda [41] in 1967. If $x_{n}(t)$ is the position of particle $n$ , then the Toda system of equations is

$$x_{n}^{\prime\prime}(t)=\exp(x_{n-1}-x_{n})-\exp(x_{n}-x_{n+1}).$$

The relation with orthogonal polynomials was made by Flaschka [15, 16] and Manakov [28], who suggested the change of variables

$$a_{n}^{2}=\exp(x_{n-1}-x_{n}),\qquad b_{n}=-x_{n}^{\prime},$$

which gives the system (4.10)–(4.11).
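The substitution can be verified in one line each. As a sketch, assume the Toda system is written as $x_{n}^{\prime\prime}=\exp(x_{n-1}-x_{n})-\exp(x_{n}-x_{n+1})$ and the change of variables is $a_{n}^{2}=\exp(x_{n-1}-x_{n})$, $b_{n}=-x_{n}^{\prime}$ (other sign conventions occur in the literature). Then

$$\frac{d}{dt}a_{n}^{2}=(x_{n-1}^{\prime}-x_{n}^{\prime})\exp(x_{n-1}-x_{n})=a_{n}^{2}(b_{n}-b_{n-1}),$$

$$\frac{d}{dt}b_{n}=-x_{n}^{\prime\prime}=\exp(x_{n}-x_{n+1})-\exp(x_{n-1}-x_{n})=a_{n+1}^{2}-a_{n}^{2}.$$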

If we are dealing with symmetric orthogonal polynomials, i.e., when the measure is symmetric and all the odd moments are zero, then the three-term recurrence relation simplifies to

$$xP_{n}(x)=P_{n+1}(x)+a_{n}^{2}P_{n-1}(x). \tag{4.15}$$

A symmetric modification of the measure is given by $d\mu_{t}(x)=e^{tx^{2}}\,d\mu(x)$ and the relation becomes

$$\frac{d}{dt}P_{n}(x;t)=C_{n}(t)P_{n-2}(x;t). \tag{4.16}$$

The compatibility between (4.15) and (4.16) then gives:

Theorem 4.3 (Langmuir lattice).

Let $\mu$ be a symmetric positive measure on $\mathbb{R}$ for which all the moments exist and let $\mu_{t}$ be the measure for which $d\mu_{t}(x)=e^{tx^{2}}\,d\mu(x)$ , where $t\in\mathbb{R}$ is such that all the moments of $\mu_{t}$ exist. Then the recurrence coefficients of the orthogonal polynomials for $\mu_{t}$ satisfy the differential-difference equations

$$\frac{d}{dt}a_{n}^{2}=a_{n}^{2}\bigl(a_{n+1}^{2}-a_{n-1}^{2}\bigr),\qquad n\geq 1. \tag{4.17}$$

Proof.

If we differentiate (4.15) with respect to $t$ and then use (4.16), then we find

$$xC_{n}P_{n-2}(x;t)=\bigl(C_{n+1}+(a_{n}^{2})^{\prime}\bigr)P_{n-1}(x;t)+a_{n}^{2}C_{n-1}P_{n-3}(x;t).$$

Comparing with (4.15) (with $n$ replaced by $n-2$ ) gives

$$C_{n}=C_{n+1}+(a_{n}^{2})^{\prime}, \tag{4.18}$$

$$C_{n}a_{n-2}^{2}=a_{n}^{2}C_{n-1}. \tag{4.19}$$

From (4.19) it follows that $a_{n}^{2}a_{n-1}^{2}/C_{n}$ is constant and therefore equal to $a_{2}^{2}a_{1}^{2}/C_{2}$ . Now $C_{2}(t)=-(a_{1}^{2})^{\prime}$ and one can easily compute $a_{1}^{2},a_{2}^{2}$ and $(a_{1}^{2})^{\prime}$ in terms of the moments $m_{0},m_{2},m_{4}$ to find that $a_{2}^{2}a_{1}^{2}/C_{2}=-1$ , so that $a_{n}^{2}a_{n-1}^{2}=-C_{n}$ . If one uses this in (4.18), then one finds (4.17). ∎

This differential-difference equation is known as the Langmuir lattice or the Kac-van Moerbeke lattice. We will now illustrate this with a number of explicit examples.
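A quick sanity check, which is our own and assumes that (4.17) reads $\frac{d}{dt}a_{n}^{2}=a_{n}^{2}(a_{n+1}^{2}-a_{n-1}^{2})$ as the proof above yields: for the Gaussian measure $d\mu(x)=e^{-x^{2}}dx$ and $t<1$, the modified measure $e^{-(1-t)x^{2}}dx$ is a rescaled Hermite weight with $a_{n}^{2}(t)=\frac{n}{2(1-t)}$. Both sides of the lattice equation then agree:

$$\frac{d}{dt}a_{n}^{2}(t)=\frac{n}{2(1-t)^{2}},\qquad a_{n}^{2}\bigl(a_{n+1}^{2}-a_{n-1}^{2}\bigr)=\frac{n}{2(1-t)}\cdot\frac{(n+1)-(n-1)}{2(1-t)}=\frac{n}{2(1-t)^{2}}.$$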

4.2. Discrete Painlevé I

Let us consider orthogonal polynomials for the weight function $w(x)=e^{-x^{4}+tx^{2}}$ on $(-\infty,\infty)$ . The symmetry $w(-x)=w(x)$ of this weight function implies that the recurrence coefficients $b_{n}$ in (1.1) or (1.2) vanish and the three-term recurrence relation is (4.15). The orthogonal polynomials also have a nice differential property: the structure relation is

$$P_{n}^{\prime}(x)=A_{n}P_{n-1}(x)+C_{n}P_{n-3}(x) \tag{4.20}$$

for certain sequences $(A_{n})_{n}$ and $(C_{n})_{n}$ . Indeed, we can express $P_{n}^{\prime}$ in terms of the orthogonal polynomials as

$$P_{n}^{\prime}(x)=\sum_{k=0}^{n-1}c_{n,k}P_{k}(x),$$

where

[TABLE]

Using integration by parts gives

[TABLE]

and the last two integrals are zero for $0\leq k<n-3$ by orthogonality, so that only $c_{n,n-1}$ , $c_{n,n-2}$ and $c_{n,n-3}$ are left. The symmetry of $w$ implies that $P_{2n}(x)$ is an even polynomial and $P_{2n+1}(x)$ is an odd polynomial for every $n$ , hence $c_{n,n-2}=0$ . Taking $A_{n}=c_{n,n-1}$ and $C_{n}=c_{n,n-3}$ then gives the structure relation.

We now have a recurrence relation (4.15) which describes the behavior of $P_{n}(x)$ in the (discrete) variable $n$ , and a structure relation (4.20) which describes the behavior of $P_{n}(x)$ in the (continuous) variable $x$ . Both relations have to be compatible: if we differentiate (4.15) and then use (4.20) to replace all the derivatives, then comparing coefficients of the polynomials $P_{k}$ gives the compatibility relations

$$4a_{n}^{2}\Bigl(a_{n+1}^{2}+a_{n}^{2}+a_{n-1}^{2}-\frac{t}{2}\Bigr)=n. \tag{4.21}$$

This simple non-linear recurrence relation is known as discrete Painlevé I ( $\textrm{d-P}_{\scriptstyle\textrm{I}}$ ) and is a special case of (4.5) we gave earlier. This particular equation already appeared in the work of Shohat [37] in 1939, who extended earlier work of Laguerre [25] from 1885. Later it was obtained again by Freud [18] in 1976, who was unaware of the work of Shohat. The special positive solution needed to get the recurrence coefficients was analyzed by Nevai [32] and Lew and Quarles [26]. An asymptotic expansion was found by Máté-Nevai-Zaslavsky [30]. Only later (in 1991) was it recognized as a discrete Painlevé equation by Fokas, Its and Kitaev [17], who coined the name $\textrm{d-P}_{\scriptstyle\textrm{I}}$ . Magnus [27] used the extra parameter $t$ and showed that, as a function of $t$ , the recurrence coefficient $a_{n}(t)$ satisfies the differential equation Painlevé IV, as we will see later.

The discrete Painlevé equation (4.21) makes it easy to find the asymptotic behavior as $n\to\infty$ :

Theorem 4.4 (Freud).

The recurrence coefficients for the weight $w(x)=e^{-x^{4}+tx^{2}}$ on $(-\infty,\infty)$ satisfy

$$\lim_{n\to\infty}\frac{a_{n}}{n^{1/4}}=\frac{1}{\sqrt[4]{12}}.$$

Observe that (4.21) is a second order recurrence relation, so one needs two initial conditions $a_{0}$ and $a_{1}$ to generate all the recurrence coefficients. It turns out that the recurrence coefficients are a special solution with $a_{0}=0$ for which all $a_{n}$ are positive for $n\geq 1$ . This means that there is only one special initial value $a_{1}$ that gives a positive solution. Put $x_{n}=a_{n}^{2}$ , then (for $t=0$ )

$$4x_{n}(x_{n+1}+x_{n}+x_{n-1})=n. \tag{4.22}$$

Theorem 4.5 (Lew and Quarles, Nevai).

There is a unique solution of (4.22) for which $x_{0}=0$ and $x_{n}>0$ for all $n\geq 1$ .

Hence one should not use this recurrence relation (4.22) to generate the recurrence coefficients starting from $x_{0}=0$ and $x_{1}$ , because a small error in $x_{1}$ will produce a sequence for which not all the terms are positive. A small perturbation in the initial condition $x_{1}$ has a very important effect on the solution as $n\to\infty$ . This is not unusual for non-linear recurrence relations. Instead it is better to generate the positive solution by using a fixed point algorithm, because the positive solution turns out to be the fixed point of a contraction in an appropriate normed space of infinite sequences. See, e.g., [46, §2.3].
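Both phenomena are easy to observe in floating point arithmetic. The sketch below assumes that (4.22) reads $4x_{n}(x_{n+1}+x_{n}+x_{n-1})=n$ and uses the exact initial value $x_{1}=m_{2}/m_{0}=\Gamma(3/4)/\Gamma(1/4)$ for the weight $e^{-x^{4}}$. The stable computation is our own choice and not necessarily the contraction of [46, §2.3]: we truncate the system at a hypothetical size, pin $x_{N+1}$ to the asymptotic value $\sqrt{(N+1)/12}$, and sweep through the equations, solving each one for $x_{n}$ as the positive root of a quadratic.

```python
import math

def forward_orbit(x1, steps):
    """Iterate x_{n+1} = n/(4 x_n) - x_n - x_{n-1} from x_0 = 0; stop at the first nonpositive term."""
    xs = [0.0, x1]
    for n in range(1, steps):
        nxt = n / (4.0 * xs[n]) - xs[n] - xs[n - 1]
        xs.append(nxt)
        if nxt <= 0.0:
            break
    return xs

def positive_solution(size, sweeps=200):
    """Gauss-Seidel sweeps on the truncated system with x_0 = 0 and x_{size+1} = sqrt((size+1)/12)."""
    x = [math.sqrt(n / 12.0) for n in range(size + 2)]  # asymptotic guess; x[0] = 0
    for _ in range(sweeps):
        for n in range(1, size + 1):
            s = x[n - 1] + x[n + 1]
            # positive root of 4 x^2 + 4 s x - n = 0, written to avoid cancellation
            x[n] = n / (2.0 * (s + math.sqrt(s * s + n)))
    return x

x1_exact = math.gamma(0.75) / math.gamma(0.25)

orbit = forward_orbit(x1_exact, 200)
print("forward recursion first fails at n =", len(orbit) - 1)

x = positive_solution(60)
print("x_1 from the sweeps:", x[1], " exact:", x1_exact)
print("x_40 / sqrt(40/12):", x[40] / math.sqrt(40 / 12.0))
```

The forward recursion starts from the correct $x_{1}$ up to rounding, yet it produces a nonpositive term after a few dozen steps. The sweeps never use $x_{1}$ as input, so the agreement of the computed $x_{1}$ with $\Gamma(3/4)/\Gamma(1/4)$ is a genuine check.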

4.3. Langmuir lattice and Painlevé IV

We will modify the measure $\mu$ by multiplying it with the symmetric function $e^{tx^{2}}$ , where $t$ is a real parameter. This gives the Langmuir lattice (4.17). We can combine this with the discrete Painlevé equation (4.21) to find a differential equation for $a_{n}^{2}(t)$ as a function of the variable $t$ . Put $a_{n}^{2}=x_{n}$ , then

[TABLE]

where the ′ denotes the derivative with respect to $t$ . Differentiate (4.24) to find

[TABLE]

Replace $x_{n+1}^{\prime}$ and $x_{n-1}^{\prime}$ by (4.24), then

[TABLE]

Eliminate $x_{n+1}$ and $x_{n-1}$ using (4.23)–(4.24) to find

[TABLE]

This is Painlevé IV if we use the transformation $2x_{n}(t)=y(-t/2)$ . This means that Painlevé IV has a solution which can be described completely in terms of the moments of $w(x)=e^{-x^{4}+tx^{2}}$ , since $a_{n}^{2}=\gamma_{n-1}^{2}/\gamma_{n}^{2}$ and by (1.5) $\gamma_{n}^{2}=D_{n}/D_{n+1}$ , where $D_{n}$ is the Hankel determinant (1.3) containing the moments. Notice that all the odd moments $m_{2n+1}$ are zero, and for the even moments one has

[TABLE]

Hence the special solution $a_{n}^{2}(t)$ of Painlevé IV can be expressed in terms of $m_{0}(t)$ only, and this is a special function:

[TABLE]

where $D_{-1/2}$ is a parabolic cylinder function.
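The chain from moments to recurrence coefficients can be checked numerically for $t=0$. The sketch below assumes that the moments are then $m_{2k}=\Gamma((2k+1)/4)/2$ (odd moments vanish), that $D_{n}=\det(m_{i+j})_{i,j=0}^{n-1}$ with $D_{0}=1$, and that $\textrm{d-P}_{\scriptstyle\textrm{I}}$ reads $4x_{n}(x_{n+1}+x_{n}+x_{n-1})=n$; the helper names are hypothetical. It uses $a_{n}^{2}=D_{n-1}D_{n+1}/D_{n}^{2}$, which follows from $a_{n}^{2}=\gamma_{n-1}^{2}/\gamma_{n}^{2}$ and $\gamma_{n}^{2}=D_{n}/D_{n+1}$.

```python
import math
from fractions import Fraction

def moment(k):
    """m_k = integral of x^k exp(-x^4) over the real line (zero for odd k)."""
    if k % 2:
        return Fraction(0)
    return Fraction(math.gamma((k + 1) / 4) / 2)

def det(matrix):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in matrix]
    result = Fraction(1)
    for col in range(len(m)):
        pivot = next((r for r in range(col, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            result = -result
        result *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return result

def hankel(n):
    """D_n = det(m_{i+j}) for 0 <= i, j <= n-1, with D_0 = 1."""
    if n == 0:
        return Fraction(1)
    return det([[moment(i + j) for j in range(n)] for i in range(n)])

D = [hankel(n) for n in range(6)]
# x_n = a_n^2 = D_{n-1} D_{n+1} / D_n^2, with x_0 = 0
x = [Fraction(0)] + [D[n - 1] * D[n + 1] / D[n] ** 2 for n in range(1, 5)]
residuals = [float(4 * x[n] * (x[n + 1] + x[n] + x[n - 1]) - n) for n in range(1, 4)]
print([float(v) for v in x])
print(residuals)
```

The floating-point Gamma values are converted to exact rationals so that the only error comes from the moments themselves; Hankel matrices become badly conditioned quickly, so this approach is only reasonable for small $n$.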

4.4. Singularity confinement

In this section we will explain the notion of singularity confinement for the discrete Painlevé I equation

[TABLE]

From this equation one finds

[TABLE]

If $x_{n}=0$ then $x_{n+1}$ becomes infinite. This need not be a problem, but problems arise later when we have to add or subtract infinities. So we need to be careful and suppose that $x_{n}=\epsilon$ is small. Then

[TABLE]

and

[TABLE]

and

[TABLE]

and one more

[TABLE]

and for $\epsilon\to 0$ we see that $x_{n+4}$ is finite again and is determined by the value $x_{n-1}$ that we had before the singularities appeared. The singularities are confined to $x_{n+1}$ and $x_{n+2}$ and one can continue the recurrence relation from $x_{n+4}$ . This has some meaning in terms of the orthogonal polynomials for the weight $e^{-x^{4}}$ , but we have to consider this weight on the set $\mathbb{R}\cup i\mathbb{R}$ and look for orthogonal polynomials $(R_{n})_{n}$ for which

[TABLE]

with $\alpha,\beta>0$ . They satisfy the recurrence relation

[TABLE]

and the recurrence coefficients $(c_{n})_{n}$ still satisfy (4.22) but with initial condition $c_{0}=0$ and $c_{1}=\frac{(\alpha-\beta)m_{2}}{(\alpha+\beta)m_{0}}$ . If $\alpha=\beta$ then $c_{1}=0$ generates a singularity for $\textrm{d-P}_{\scriptstyle\textrm{I}}$ and gives $c_{2}=\infty$ , hence $R_{3}$ does not exist if we define it using (1.4). The singularity, however, is confined to a finite number of terms. We have

Property 4.6.

For $\alpha=\beta$ one has $D_{4n-1}=D_{4n-2}=0$ for the Hankel determinants, so that $R_{4n-1}$ and $R_{4n-2}$ as defined by (1.4) do not exist for $n\geq 1$ . Furthermore

[TABLE]

The polynomials $r_{n}$ and $s_{n}$ can be identified as Laguerre polynomials with parameter $\alpha=-3/4$ and $\alpha=1/4$ respectively. The problem with $R_{4n-1}$ and $R_{4n-2}$ is not so much that they do not exist, but rather that they are not unique.

Exercise.

Show that for every $a\in\mathbb{R}$ the polynomials $(x^{2}+ax)s_{n}(x^{4})$ are monic polynomials of degree $4n+2$ that are orthogonal to $x^{k}$ for $0\leq k\leq 4n+1$ , so that the monic orthogonal polynomial $R_{4n+2}$ is not unique. In a similar way $(x^{3}+ax^{2}+bx)s_{n}(x^{4})$ are monic polynomials of degree $4n+3$ that are orthogonal to $x^{k}$ for $0\leq k\leq 4n+2$ for every $a,b\in\mathbb{R}$ so that the monic orthogonal polynomial $R_{4n+3}$ is not unique.
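The singularity confinement computation at the start of this section can be reproduced in exact rational arithmetic. The sketch below assumes the recurrence in the form $x_{n}(x_{n+1}+x_{n}+x_{n-1})=cn$ with $c=1/4$; the starting index, the value of $x_{n-1}$ and the function names are hypothetical choices for illustration. Under this assumption a series expansion in $\epsilon$ gives the limit $x_{n+4}\to\frac{n}{n+3}\,x_{n-1}$, which the code compares against.

```python
from fractions import Fraction

def step(n, x_n, x_prev, c=Fraction(1, 4)):
    """Solve x_n (x_{n+1} + x_n + x_{n-1}) = c*n for x_{n+1}."""
    return c * n / x_n - x_n - x_prev

def confinement_orbit(n, x_prev, eps):
    """Return [x_{n-1}, x_n = eps, x_{n+1}, ..., x_{n+4}] exactly."""
    orbit = [x_prev, eps]
    for k in range(4):
        orbit.append(step(n + k, orbit[-1], orbit[-2]))
    return orbit

n, x_prev = 3, Fraction(1, 2)
limit = Fraction(n, n + 3) * x_prev  # expected value of x_{n+4} as eps -> 0

for eps in (Fraction(1, 10**3), Fraction(1, 10**6), Fraction(1, 10**9)):
    orbit = confinement_orbit(n, x_prev, eps)
    # x_{n+1} and x_{n+2} blow up, x_{n+3} is O(eps), x_{n+4} is finite
    print([f"{float(v):.3e}" for v in orbit[2:]], "limit:", float(limit))
```

Exact fractions are used on purpose: in floating point the two large terms of order $1/\epsilon$ cancel in $x_{n+4}$, and that cancellation would destroy the finite part for very small $\epsilon$.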

4.5. Generalized Charlier polynomials

Our next example is a family of discrete orthogonal polynomials $P_{n}(x)$ , which satisfy

[TABLE]

Without the factor $(\beta)_{k}$ the polynomials are the Charlier polynomials, but with the factor $(\beta)_{k}$ we have a semiclassical family of discrete orthogonal polynomials. The case $\beta=1$ was investigated in [47] and the general case in [38], see also [46, §3.2]. The structure relation for discrete orthogonal polynomials is now in terms of a difference operator instead of a differential operator. For these generalized Charlier polynomials it is

[TABLE]

where $\Delta$ is the forward difference operator acting on a function $f$ by

$$\Delta f(x)=f(x+1)-f(x),$$

and $(A_{n})_{n}$ and $(B_{n})_{n}$ are certain sequences. If one works out the compatibility of (1.2) and (4.25), then one finds

[TABLE]

This corresponds to a limiting case of discrete Painlevé with surface/symmetry $D_{4}^{(1)}$ in Sakai’s classification.
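A numerical check of such a compatibility relation is possible with a discretized Stieltjes procedure. Everything specific in the sketch below is an assumption, since the displayed formulas are not reproduced here: the weights are taken to be $w_{k}=c^{k}/((\beta)_{k}\,k!)$ on $k=0,1,2,\ldots$, the monic recurrence is $xP_{n}=P_{n+1}+b_{n}P_{n}+a_{n}^{2}P_{n-1}$, and the two relations tested are $b_{n}+b_{n-1}-n+\beta=cn/a_{n}^{2}$ and $(a_{n+1}^{2}-c)(a_{n}^{2}-c)=c(b_{n}-n)(b_{n}-n+\beta-1)$. The parameter values, the truncation `size` and the function names are hypothetical.

```python
from fractions import Fraction

def charlier_like_weights(c, beta, size):
    """w_k = c^k / ((beta)_k k!) for k = 0..size-1, built from the ratio w_{k+1}/w_k."""
    w = [Fraction(1)]
    for k in range(size - 1):
        w.append(w[-1] * c / ((beta + k) * (k + 1)))
    return w

def stieltjes(nodes, weights, count):
    """Return (a_sq, b) for the monic recurrence, n = 0..count-1, with a_0^2 = 0."""
    prev = [Fraction(0)] * len(nodes)  # P_{-1}
    cur = [Fraction(1)] * len(nodes)   # P_0
    norm_prev = None
    a_sq, b = [], []
    for n in range(count):
        norm = sum(w * p * p for w, p in zip(weights, cur))
        b_n = sum(w * x * p * p for w, x, p in zip(weights, nodes, cur)) / norm
        a_n_sq = Fraction(0) if n == 0 else norm / norm_prev
        a_sq.append(a_n_sq)
        b.append(b_n)
        # P_{n+1} = (x - b_n) P_n - a_n^2 P_{n-1}, evaluated at the nodes
        prev, cur = cur, [(x - b_n) * p - a_n_sq * q
                          for x, p, q in zip(nodes, cur, prev)]
        norm_prev = norm
    return a_sq, b

c, beta, size = Fraction(3, 2), Fraction(5, 2), 60
nodes = [Fraction(k) for k in range(size)]
a_sq, b = stieltjes(nodes, charlier_like_weights(c, beta, size), 6)

first = [b[n] + b[n - 1] - n + beta - c * n / a_sq[n] for n in range(1, 6)]
second = [(a_sq[n + 1] - c) * (a_sq[n] - c) - c * (b[n] - n) * (b[n] - n + beta - 1)
          for n in range(5)]
print([float(r) for r in first])
print([float(r) for r in second])
```

The weights decay faster than geometrically, so truncating the support at 60 points changes the low-order coefficients by far less than any tolerance of interest; the residuals are therefore not exactly zero but extremely small.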

If we put $c=c_{0}e^{t}$ , then the weights with parameter $c$ are a Toda modification of the weights with parameter $c_{0}$ ,

[TABLE]

and hence the recurrence coefficients satisfy the Toda equations given in Theorem 4.2. Put $x_{n}(t)=a_{n}^{2}$ and $y_{n}(t)=b_{n}$ , then

[TABLE]

and if $x_{n}^{\prime}=dx_{n}/dc$ , $y_{n}^{\prime}=dy_{n}/dc$ , the Toda lattice equations are

[TABLE]

Eliminate $y_{n-1}$ and $x_{n+1}$ (this requires quite a few computations) and put $x_{n}=\frac{c}{1-y}$ , then $y(c)$ satisfies (after even more computations)

[TABLE]

This is a Painlevé V differential equation as in (4.4) with $\delta=0$ . Such an equation can always be transformed to Painlevé III.

4.6. Discrete Painlevé II

We will now give an example of a family of orthogonal polynomials on the unit circle, for which the recurrence coefficients satisfy a discrete Painlevé equation. Orthogonal polynomials on the unit circle (OPUC) are defined by the orthogonality relations

[TABLE]

where $\kappa_{n}>0$ . We denote the monic polynomials by $\Phi_{n}=\varphi_{n}/\kappa_{n}$ . They satisfy a nice recurrence relation

[TABLE]

where $\Phi_{n}^{*}(z)=z^{n}\overline{\Phi}_{n}(1/z)$ is the reversed polynomial. The recurrence coefficients $\alpha_{n}=-\overline{\Phi_{n+1}(0)}$ are nowadays known as Verblunsky coefficients, but earlier they were also known as Schur parameters or reflection coefficients. Let $v(\theta)=e^{t\cos\theta}$ for $\theta\in[-\pi,\pi]$ . The trigonometric moments for this weight function are modified Bessel functions

[TABLE]

which is why Ismail [20, Example 8.4.3] calls them modified Bessel polynomials. The symmetry $v(-\theta)=v(\theta)$ implies that $\alpha_{n}(t)$ are real-valued. If we write

[TABLE]

then

[TABLE]

and this function satisfies the Pearson equation

[TABLE]

As a consequence the orthogonal polynomials satisfy a structure relation:

Property 4.7.

The monic orthogonal polynomials for $v(\theta)=e^{t\cos\theta}$ satisfy

[TABLE]

for some sequence $(B_{n})_{n}$ . In fact, one has

[TABLE]

We now have two equations: the recurrence relation (4.26) and the structure relation (4.27), and we can check their compatibility. They will be compatible if the recurrence coefficients satisfy the following non-linear relation:

Theorem 4.8 (Periwal and Shevitz [35]).

The Verblunsky coefficients for the weight $v(\theta)=e^{t\cos\theta}$ satisfy

[TABLE]

with initial values

[TABLE]

Let $x_{n}=\alpha_{n-1}$ , then

[TABLE]

and this is a particular case of discrete Painlevé II ( $\textrm{d-P}_{\scriptstyle\textrm{II}}$ ) given in (4.6). We need a solution with $x_{0}=-1$ and $|x_{n}|<1$ for $n\geq 1$ , because for Verblunsky coefficients one always has $|\alpha_{n}|<1$ . Such a solution is unique.

Theorem 4.9.

Suppose $\alpha>0$ . Then there is a unique solution of (4.28) for which $x_{0}=-1$ and $-1<x_{n}<1$ . The solution corresponds to $x_{1}=I_{1}(-2/\alpha)/I_{0}(-2/\alpha)$ and is negative for every $n\geq 0$ .

A proof of this result can be found in [46, §3.3] for $\alpha>1$ ; a proof for $0<\alpha\leq 1$ has not been published and we invite the reader to come up with such a proof. This special solution converges to zero rapidly as $n\to\infty$ .
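The statements of this section can be tested numerically. The sketch below assumes the Szegő recurrence in the real form $\Phi_{n+1}(z)=z\Phi_{n}(z)-\alpha_{n}\Phi_{n}^{*}(z)$, trigonometric moments $c_{k}=I_{k}(t)$, and (4.28) in the form $x_{n+1}+x_{n-1}=\alpha n x_{n}/(1-x_{n}^{2})$ with $\alpha=-2/t$ and $x_{n}=\alpha_{n-1}$ ; these forms are consistent with $x_{1}=I_{1}(-2/\alpha)/I_{0}(-2/\alpha)$ but are written here as assumptions. The function names and the value $t=-1.5$ are hypothetical.

```python
import math

def bessel_i(k, t, terms=40):
    """Modified Bessel function I_k(t), integer k >= 0, from its power series."""
    return sum((t / 2) ** (2 * j + k) / (math.factorial(j) * math.factorial(j + k))
               for j in range(terms))

def verblunsky(t, count):
    """Levinson-type recursion for the real coefficients alpha_0..alpha_{count-1}."""
    c = [bessel_i(k, t) for k in range(count + 1)]  # moments c_k = I_k(t)
    phi = [1.0]  # coefficients of the monic Phi_n, lowest degree first
    alphas = []
    for n in range(count):
        num = sum(phi[j] * c[j + 1] for j in range(n + 1))  # <z Phi_n, 1>
        den = sum(phi[j] * c[n - j] for j in range(n + 1))  # <Phi_n^*, 1>
        alpha = num / den
        alphas.append(alpha)
        shifted = [0.0] + phi          # z * Phi_n
        reversed_ = phi[::-1] + [0.0]  # Phi_n^*, padded to degree n+1
        phi = [s - alpha * r for s, r in zip(shifted, reversed_)]
    return alphas

t = -1.5
alpha_par = -2.0 / t               # the parameter alpha in (4.28), here 4/3
xs = [-1.0] + verblunsky(t, 8)     # x_n = alpha_{n-1}, with x_0 = -1
residuals = [xs[n + 1] + xs[n - 1] - alpha_par * n * xs[n] / (1.0 - xs[n] ** 2)
             for n in range(1, 8)]
print(xs[1:])
print(max(abs(r) for r in residuals))
```

For this choice of $t$ the computed $x_{n}$ are negative, lie in $(-1,0)$ and shrink quickly, in line with Theorem 4.9; running (4.28) forward from $x_{0}$ and $x_{1}$ instead would again be sensitive to errors in $x_{1}$ .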

4.7. The Ablowitz-Ladik lattice and Painlevé III

The lattice equations corresponding to orthogonal polynomials on the unit circle are the Ablowitz-Ladik lattice equations (or the Schur flow).

Theorem 4.10.

Let $\nu$ be a positive measure on the unit circle which is symmetric (the Verblunsky coefficients are real). Let $\nu_{t}$ be the modified measure $d\nu_{t}(\theta)=e^{t\cos\theta}\,d\nu(\theta)$ , with $t\in\mathbb{R}$ . The Verblunsky coefficients $(\alpha_{n}(t))_{n}$ for the measure $\nu_{t}$ then satisfy

[TABLE]

We can now combine the discrete Painlevé II equation

[TABLE]

with the Ablowitz-Ladik equation

[TABLE]

Eliminate $\alpha_{n+1}$ and $\alpha_{n-1}$ to find

[TABLE]

Exercise.

If one puts $\alpha_{n}=\frac{1+y}{1-y}$ , then show that $y$ satisfies the Painlevé V differential equation (4.4) with $\gamma=0$ .

Painlevé V with $\gamma=0$ can always be transformed to Painlevé III. A direct approach was given by Hisakado [19] and Tracy and Widom [42]. They showed that the ratio $w_{n}(t)=\alpha_{n}(t)/\alpha_{n-1}(t)$ satisfies Painlevé III.

4.8. Some more examples

Several more examples have been worked out in the literature in the past few years. Here is a short sample.

4.8.1. Generalized Meixner polynomials

These are discrete orthogonal polynomials

[TABLE]

which were considered in [38, 14, 8]. Put $a_{n}^{2}=na-(\gamma-1)u_{n}$ , and $b_{n}=n+\gamma-\beta+a-\frac{\gamma-1}{a}v_{n}$ , then

[TABLE]

The initial values are

[TABLE]

where $M(a,b,z)$ is Kummer’s confluent hypergeometric function. This is asymmetric discrete Painlevé IV or $\textrm{d-P}(E_{6}^{(1)}/A_{2}^{(1)})$ . If we put

[TABLE]

then

[TABLE]

with

[TABLE]

which is Painlevé V given in (4.4).

4.8.2. Modified Laguerre polynomials

Chen and Its [6] (see also [46, §4.4]) looked at orthogonal polynomials for the weight function $w(x)=x^{\alpha}e^{-x}e^{-t/x}$ on $[0,\infty)$ . This is a modification of the Laguerre weight with an exponential function that has an essential singularity at $0$ . Put $b_{n}=2n+\alpha+1+c_{n}$ , $a_{n}^{2}=n(n+\alpha)+y_{n}+\sum_{j=0}^{n-1}c_{j}$ , and $c_{n}=1/x_{n}$ , then

[TABLE]

This corresponds to the discrete Painlevé equation $\textrm{d-P}((2A_{1})^{(1)}/D_{6}^{(1)})$ . The exponential modification is not of Toda type but belongs to a similar class of modifications (the Toda hierarchy). With some effort one can find the differential equation

[TABLE]

which is Painlevé III given in (4.2).

4.8.3. Modified Jacobi polynomials

Basor, Chen and Ehrhardt [3] (see also [46, §5.2]) considered the weight $w(x)=(1-x)^{\alpha}(1+x)^{\beta}e^{-tx}$ . This is a Toda modification of the weight function for Jacobi polynomials. In this case one has

[TABLE]

where $r_{n}$ and $R_{n}$ satisfy the recurrence relations

[TABLE]

and for $y=1+t/R_{n}$ one has the differential equation

[TABLE]

which is Painlevé V given in (4.4).

4.8.4. $q$ -orthogonal polynomials

There are also examples of families of $q$ -orthogonal polynomials for which one can find $q$ -discrete Painlevé equations for the recurrence coefficients. In this case the structure relation uses the $q$ -difference operator $D_{q}$ for which

[TABLE]
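As a small worked example, assume the usual definition of the $q$ -difference operator for $x\neq 0$ (shown on the first line below; this is an assumption about the displayed formula). Applied to a monomial it produces a $q$ -analogue of the power rule, which reduces to the ordinary derivative as $q\to 1$ .

```latex
\begin{align*}
D_q f(x) &= \frac{f(x)-f(qx)}{(1-q)\,x}, \qquad x\neq 0, \\
D_q x^n &= \frac{x^n-q^n x^n}{(1-q)\,x} = \frac{1-q^n}{1-q}\,x^{n-1}, \\
\lim_{q\to 1}\frac{1-q^n}{1-q} &= n .
\end{align*}
```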

If we consider the weight

[TABLE]

then the recurrence coefficients (after some transformation) satisfy $q$ -discrete Painlevé III

[TABLE]

For the weight

[TABLE]

one finds $q$ -discrete Painlevé V

[TABLE]

and for

[TABLE]

one again finds $q$ -discrete Painlevé V. Observe that sometimes the weights are on $[0,\infty)$ but they can also be on the discrete set $\{q^{n},n\in\mathbb{N}\}$ . See [46, §5.4] for more details.

4.9. Wronskians and special function solutions

There is a good explanation why these Toda modifications of orthogonal polynomials often give rise to Painlevé differential equations. In fact the solutions that we need for the recurrence coefficients are special solutions of the Painlevé equations in terms of special functions, such as the Airy functions, the Bessel functions, parabolic cylinder functions, the confluent hypergeometric function and the hypergeometric function. Such special function solutions are often in terms of Wronskians of one of these special functions. We can easily explain where these Wronskians are coming from, by using the theory of orthogonal polynomials. Indeed, we return to our Hankel determinants $D_{n}$ given in (1.3). They contain the moments $m_{n}$ , which for a Toda modification are

[TABLE]

Hence all the moments are obtained from the moment $m_{0}(t)$ by differentiation, and the Hankel determinant (1.3) becomes

[TABLE]

which is the Wronskian of the functions $m_{0},m_{0}^{\prime},m_{0}^{\prime\prime},\ldots,m_{0}^{(n-1)}$ ,

[TABLE]

The recurrence coefficient $a_{n}^{2}$ can be expressed in terms of these Hankel determinants as

[TABLE]

where we used (1.5). The recurrence coefficients $b_{n}$ can also be found in terms of determinants. If we write $P_{n}(x)=x^{n}+\delta_{n}x^{n-1}+\cdots$ and compare the coefficients of $x^{n}$ in the recurrence relation (1.2), then $b_{n}=\delta_{n}-\delta_{n+1}$ . The coefficient $\delta_{n}$ can be obtained from (1.4), from which we see that $\delta_{n}=-D_{n}^{*}/D_{n}$ , where $D_{n}^{*}$ is obtained from $D_{n}$ by replacing the last column $(m_{n-1},m_{n},\ldots,m_{2n-2})^{T}$ by moments of one order higher $(m_{n},m_{n+1},\ldots,m_{2n-1})^{T}$ . If we take a derivative of the Wronskian, then

[TABLE]

so that

[TABLE]

This gives explicit expressions of the recurrence coefficients $a_{n}^{2}(t)$ and $b_{n}(t)$ in terms of Wronskians generated from one seed function $m_{0}(t)$ .
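A minimal worked example, assuming that the Toda modification multiplies the measure by $e^{tx}$ and taking the Gaussian weight $e^{-x^{2}}$ as the starting measure (a choice made here only for illustration): the seed function is elementary and the first coefficients follow from logarithmic derivatives.

```latex
\begin{align*}
m_0(t) &= \int_{-\infty}^{\infty} e^{-x^2+tx}\,dx = \sqrt{\pi}\,e^{t^2/4}, \\
b_0(t) &= \frac{m_1}{m_0} = \frac{m_0'}{m_0} = \frac{d}{dt}\log m_0(t) = \frac{t}{2}, \\
a_1^2(t) &= \frac{D_2 D_0}{D_1^2} = \frac{m_0 m_0''-(m_0')^2}{m_0^2}
          = \frac{d^2}{dt^2}\log m_0(t) = \frac{1}{2}.
\end{align*}
```

This agrees with the direct observation that $e^{-x^{2}+tx}=e^{t^{2}/4}e^{-(x-t/2)^{2}}$ is a shifted Hermite weight, for which $b_{n}=t/2$ and $a_{n}^{2}=n/2$ .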

Acknowledgement

Many thanks to Mama Foupouagnigni and Wolfram Koepf for organizing the workshop Introduction to Orthogonal Polynomials and Applications in Douala, Cameroon, and for encouraging me to write this survey. Also thanks to Arno Kuijlaars with whom I am sharing a course on Orthogonal Polynomials and Random Matrices at KU Leuven, which was very useful for the material in Section 2.

Bibliography


[1] G. Anderson, A. Guionnet, O. Zeitouni, An Introduction to Random Matrices, Cambridge Studies in Advanced Mathematics 118, Cambridge University Press, 2010.
[2] A.I. Aptekarev, Multiple orthogonal polynomials, J. Comput. Appl. Math. 99 (1998), no. 1–2, 423–447.
[3] E. Basor, Y. Chen, T. Ehrhardt, Painlevé V and time dependent Jacobi polynomials, J. Phys. A: Math. Theor. 43 (2010), no. 1, 015204 (25 pp.).
[4] P.M. Bleher, A.B.J. Kuijlaars, Integral representations for multiple Hermite and multiple Laguerre polynomials, Ann. Inst. Fourier, Grenoble 55 (2005), no. 6, 2001–2014.
[5] P.M. Bleher, A.B.J. Kuijlaars, Random matrices with external source and multiple orthogonal polynomials, International Mathematics Research Notices 2004, no. 3, 109–129.
[6] Y. Chen, A. Its, Painlevé III and a singular linear statistics in Hermitian random matrix ensembles, J. Approx. Theory 162 (2010), no. 2, 270–297.
[7] P.A. Clarkson, Painlevé equations — nonlinear special functions, Lecture Notes in Mathematics 1883, Springer, Berlin, 2006, pp. 331–411.
[8] P.A. Clarkson, Recurrence coefficients for discrete orthogonal polynomials and the Painlevé equations, J. Phys. A: Math. Theor. 46 (2013), no. 18, 185205 (18 pp.).

