The Gauss quadrature for general linear functionals, Lanczos algorithm, and minimal partial realization
Stefano Pozza, Miroslav S. Pranić

TL;DR
This paper reviews the generalization of Gauss quadrature to complex linear functionals, exploring its connections with formal orthogonal polynomials, non-Hermitian Lanczos algorithms, and minimal partial realization, providing new proofs of key theorems.
Contribution
It offers a comprehensive survey of Gauss quadrature for linear functionals, highlighting its links with various mathematical concepts and presenting original proofs of important theorems.
Findings
Connections between Gauss quadrature and formal orthogonal polynomials clarified.
Relationship with non-Hermitian Lanczos algorithm and minimal partial realization established.
Original proofs of the Mismatch Theorem and Matching Moment Property provided.
S. Pozza: Faculty of Mathematics and Physics, Charles University, Sokolovská 83, 186 75 Praha 8, Czech Republic. Associated member of ISTI-CNR, Pisa, Italy, and member of INdAM-GNCS group, Italy. Email: [email protected]
M. Pranić: Department of Mathematics and Informatics, University of Banja Luka, Faculty of Science, M. Stojanovića 2, 51000 Banja Luka, Bosnia and Herzegovina.
Abstract
The concept of Gauss quadrature can be generalized to approximate linear functionals with complex moments. Following the existing literature, this survey will revisit such generalization. It is well known that the (classical) Gauss quadrature for positive definite linear functionals is connected with orthogonal polynomials, and with the (Hermitian) Lanczos algorithm. Analogously, the Gauss quadrature for linear functionals is connected with formal orthogonal polynomials, and with the non-Hermitian Lanczos algorithm with look-ahead strategy; moreover, it is related to the minimal partial realization problem. We will review these connections pointing out the relationships between several results established independently in related contexts. Original proofs of the Mismatch Theorem and of the Matching Moment Property are given by using the properties of formal orthogonal polynomials and the Gauss quadrature for linear functionals.
Keywords:
Linear functionals · Matching moments · Gauss quadrature · Formal orthogonal polynomials · Minimal realization · Look-ahead Lanczos algorithm · Mismatch Theorem
1 Introduction
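Let $A$ be an Hermitian positive definite matrix and $v$ a vector so that $v^* v = 1$, where $v^*$ is the conjugate transpose of $v$. Consider the specific linear functional $\mathcal{L}$ on the space of polynomials defined by
$$\mathcal{L}(\lambda^k) = v^* A^k v = m_k, \qquad k = 0, 1, \dots, \tag{1.1}$$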
where are real numbers known as the moments of . The functional can be expressed as the Riemann-Stieltjes integral with a non-decreasing positive distribution function supported on the real axis having finitely many points of increase; see, e.g., (LieStrBook13, , Section 3.5), (GolMeuBook10, , Section 7.1), and (ChiBook78, , Chapter II, Section 3). For , the -node (classical) Gauss quadrature approximating is given by the unique -node quadrature formula which matches the first moments, i.e.,
[TABLE]
with positive weights and positive distinct nodes. Classical results of the Gauss quadrature can be found, e.g., in (SzeBook39, , Chapters III and XV), (ChiBook78, , Chapter I, Section 6), Gau81 , (GauBook04, , Section 1.4), (gautschi2011numerical, , Chapter 3.2), (LieStrBook13, , Section 3.2). The linear functional can be associated with a Jacobi matrix which is an real symmetric tridiagonal matrix. For every function defined on the spectrum of and , the matrix gives an algebraic expression for the Gauss quadrature, i.e.,
[TABLE]
where and are matrix functions, and is the first vector of the Euclidean basis (with the transpose). The matrix can be obtained by iterations of the Hermitian Lanczos algorithm with inputs and . Indeed, , where is the matrix given by the Lanczos algorithm whose columns are an orthonormal basis of the Krylov subspace . Hence the Hermitian Lanczos algorithm with input gives a matrix formulation of the Gauss quadrature for . Figure 1 (see (LieStrBook13, , Figure 3.2)) represents the connections described above. Such connections can be derived by the properties of orthogonal polynomials; a detailed explanation can be found, e.g., in (LieStrBook13, , Chapter 3) and GolMeuBook10 (note that the relationships between the Conjugate Gradient method, Lanczos algorithm, and orthogonal polynomials were already pointed out by Hestenes and Stiefel in their seminal paper published in 1952 (HesSti52, , Sections 14–17)).
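The connection between the Hermitian Lanczos algorithm and the matching of moments can be checked numerically on a small example. The following is a minimal sketch, not the authors' implementation: it assumes a real symmetric positive definite test matrix and a unit starting vector (both hypothetical), runs the textbook Lanczos three-term recurrence in floating point without reorthogonalization, and compares the moments of the original matrix with those of the resulting Jacobi matrix. The helper names `lanczos_jacobi` and `moments` are introduced here for illustration only.

```python
import math

def matvec(M, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def lanczos_jacobi(A, v, n):
    """Run n steps of the Hermitian Lanczos algorithm for a real symmetric A.

    Returns the n-by-n Jacobi (symmetric tridiagonal) matrix as a list of rows.
    Assumes no early termination, i.e. every off-diagonal entry is nonzero.
    """
    norm = math.sqrt(dot(v, v))
    q_prev, q = [0.0] * len(v), [x / norm for x in v]
    J = [[0.0] * n for _ in range(n)]
    beta = 0.0
    for k in range(n):
        w = matvec(A, q)
        alpha = dot(q, w)
        J[k][k] = alpha
        if k == n - 1:
            break
        # Three-term recurrence: remove the components along q_k and q_{k-1}.
        w = [wi - alpha * qi - beta * pi for wi, qi, pi in zip(w, q, q_prev)]
        beta = math.sqrt(dot(w, w))
        J[k][k + 1] = J[k + 1][k] = beta
        q_prev, q = q, [wi / beta for wi in w]
    return J

def moments(M, u, count):
    """Return [u^T M^k u for k = 0, ..., count - 1]."""
    out, x = [], list(u)
    for _ in range(count):
        out.append(dot(u, x))
        x = matvec(M, x)
    return out

# Hypothetical test data: a symmetric positive definite A and a unit vector v.
A = [[4.0, 1.0, 0.0, 0.0],
     [1.0, 3.0, 1.0, 0.0],
     [0.0, 1.0, 2.0, 1.0],
     [0.0, 0.0, 1.0, 1.0]]
v = [0.5, 0.5, 0.5, 0.5]
n = 2

J = lanczos_jacobi(A, v, n)
e1 = [1.0] + [0.0] * (n - 1)
m_A = moments(A, v, 2 * n + 1)   # v^T A^k v,   k = 0, ..., 2n
m_J = moments(J, e1, 2 * n + 1)  # e1^T J^k e1, k = 0, ..., 2n
for k, (a, b) in enumerate(zip(m_A, m_J)):
    print(k, a, b)
```

In this example the two moment sequences agree for the first four moments and differ at the fifth, which is the behaviour expected from a two-node Gauss quadrature.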
This survey deals with the extension of the connections summarized in Figure 1 to the case of a general linear functional defined on the space of the polynomials with generally complex coefficients, . We point out that, if not specified otherwise, we will consider linear functionals without the underlying assumption that they are determined by a matrix bilinear form analogous to (1.1). The survey will revisit the Gauss quadrature for linear functionals, its matrix formulation, its connection with the non-Hermitian Lanczos algorithm with look-ahead strategy, and its relationship with the minimal partial realization problem. Furthermore, the connections between the incurable breakdown, the exactness of the Gauss quadrature, and the minimal realization problem will be examined by giving an original proof of the Mismatch Theorem (first proved in (Tay82, , Theorem 4.2)). The proof easily follows from the properties we will present, providing a different interpretation of the Theorem in terms of the roots of formal orthogonal polynomials and the nodes of the Gauss quadrature for linear functionals.
Information about the topics mentioned above and their mutual relationships is scattered in the literature. The survey aims to describe such topics and their connections from the point of view of formal orthogonal polynomials. We hope that such a presentation will be of interest to readers working in different but related areas.
Regarding the formal orthogonal polynomials and the Gauss quadrature generalization, we will mainly follow the book DraBook83 by Draux where the Gauss quadrature is extended for the approximation of real-valued linear functionals. More precisely, a straightforward extension of Draux’s definition to the case of complex-valued linear functionals will be presented. The more recent Gauss quadrature definitions in Mil03 and in PozPraStr16 ; PozPraStr18 , obtained independently of DraBook83 , can be seen as a generalization to the complex quasi-definite case. Indeed, for a real quasi-definite linear functional the quadratures in DraBook83 ; Mil03 ; PozPraStr16 ; PozPraStr18 are equivalent. However, some results in PozPraStr16 ; PozPraStr18 do not have a counterpart in the real setting of DraBook83 (for instance, formal orthonormal polynomials may have complex coefficients). The case of a quasi-definite linear functional is simpler to treat; see, e.g., ChiBook78 ; PozPraStr16 ; PozPraStr18 . The survey will first recall the primary results associated with quasi-definite functionals and then deal with the case of a general linear functional.
This survey approaches the Lanczos algorithm in a finite dimensional setting. Hence we will not treat infinite dimensional problems. For infinite dimensional problems related to positive definite linear functionals refer, e.g., to (ChiBook78, , Chapter II, Section 3, in particular Theorem 3.1). For the relationship with infinite dimensional Krylov subspace methods refer, e.g., to VorBook65 , GunHerSac14 , and (malek2015, , Chapter 5) where many references to original works can be found.
Throughout the survey, we will consider only computations in exact arithmetic. Since rounding errors substantially affect computations with short recurrences, the results described in this survey cannot be applied to finite precision computations without a thorough analysis. Such an analysis is outside the scope of this survey. The interested reader can refer to Bai94 and Day93 ; Day97 for analysis of the non-Hermitian Lanczos algorithm in finite precision (assuming no breakdown); see also the related works BaiDayYe99 ; TonYe00 ; PaiPanZem14 . As pointed out in (LieStrBook13, , Sections 2.5.6 and 5.11), in finite precision arithmetic the short recurrences cannot preserve the biorthogonality or even the linear independence of the computed Krylov subspace basis. Therefore look-ahead techniques for the non-Hermitian Lanczos algorithm have a limited impact in computing a sufficiently well-conditioned basis when dealing with the loss of biorthogonality. The interplay of look-ahead techniques and rounding errors in practical computations is still an open issue.
The paper is organized as follows. Section 2 summarizes basic results of quasi-definite linear functionals. Section 3 recalls properties of formal orthogonal polynomials and of quasi-orthogonal polynomials with respect to a linear functional . The concept of Gauss quadrature for linear functionals and its matrix interpretation can be found respectively in Section 4 and Section 5. The Gauss quadrature connections with the minimal partial realization problem and with the look-ahead Lanczos algorithm are described respectively in Section 6 and Section 7. Section 8 concludes the survey summarizing the links between the Gauss quadrature, minimal partial realization, and look-ahead Lanczos algorithm.
2 Quasi-definite linear functionals
We start by recalling several results for quasi-definite linear functionals following the description in our previous works with Zdeněk Strakoš PozPraStr16 ; PozPraStr18 . These results will be extended to the more challenging general case in the remaining sections.
Let $\mathcal{L}$ be a linear functional with complex moments,
$$\mathcal{L}(\lambda^k) = m_k, \qquad k = 0, 1, \dots. \tag{2.1}$$
An $n$-degree polynomial $p_n$ is called a formal orthogonal polynomial (FOP) when it satisfies the orthogonality conditions with respect to $\mathcal{L}$
$$\mathcal{L}(\lambda^k p_n) = 0, \qquad k = 0, 1, \dots, n-1;$$
refer, e.g., to (DraBook83, , Introduction and Section 1.1) and (Bre02, , Chapter 2). Notice that in BreBook80 a FOP is referred to as a general orthogonal polynomial; cf. the concept of weak orthogonal polynomial in (Krall1966, , definition on p. 137) and (KwoLit97, , Section 2). The subindex in the polynomial notation will always stand for the degree of the polynomial and we will not emphasize it further on. Moreover, whenever appropriate the argument will be skipped for simplicity of notation.
Denoting with the subspace of polynomials of degree at most , the following classes of linear functionals can be defined (see, e.g., Theorem 3.1, Definition 3.2, Theorem 3.4 and the subsequent Corollary in (ChiBook78, , Chapter I), Theorem 1 and the subsequent Remark in (LorWaaBook92, , Chapter VII)).
Definition 2.1
The linear functional is said to be quasi-definite on if there exist unique FOPs (we always use the term “unique” for a polynomial in the sense of unique up to multiplication by a nonzero scalar) satisfying the conditions
[TABLE]
A linear functional is said to be positive definite on if in addition are real polynomials, and , for .
Note that the definition above is equivalent to the ones in (PozPraStr16, , Definition 2.1 and Definition 3.1) and (PozPraStr18, , Definition 1.1).
If a FOP is such that , then it is a formal orthonormal polynomial. A beautiful summary about FOPs in the quasi-definite case can be found in the book by Chihara ChiBook78 (notice that Chihara used the simplified term orthogonal polynomials instead of formal orthogonal polynomials). A sequence of formal orthonormal polynomials satisfies the three-term recurrence
[TABLE]
where , , and the coefficients , are given by
[TABLE]
see, e.g., (ChiBook78, , Chapter I, Section 4), (BreBook80, , Theorem 2.4). Notice that in order to avoid ambiguity, we always take the principal value of the complex square root, i.e., we consider . The recurrences (2.2) can be written in the compact form as
[TABLE]
where , is the th vector of the Euclidean basis, and is the th complex Jacobi matrix
[TABLE]
more information about complex Jacobi matrices and their properties can be found, e.g., in Bec01 and in (PozPraStr16, , in particular Section 4).
Given a smooth enough function , the Gauss quadrature for quasi-definite linear functionals considered in PozPraStr16 ; PozPraStr18 has the form
[TABLE]
and satisfies the following properties.
- G1: the quadrature has maximal degree of exactness , i.e., it is exact for all polynomials of degree at most ;
- G2: the quadrature is well-defined and it is unique. Moreover, Gauss quadratures with a smaller number of weights also exist and they are unique;
- G3: the quadrature can be written in the matrix form , where is the complex Jacobi matrix associated with .
A quadrature having properties G1, G2 and G3 exists if and only if the linear functional is quasi-definite on ; see (PozPraStr16, , Section 7, in particular Corollaries 7.4 and 7.5) and (PozPraStr18, , Theorem 3.1).
Property G3 corresponds to the so-called Matching Moment Property of the complex Jacobi matrix, i.e., if the complex numbers define a quasi-definite linear functional (2.1) with associated Jacobi matrix (here and in the following the simplified terms quasi-definite linear functional and positive definite linear functional will stand for linear functionals that are quasi-definite and positive definite on the space of polynomials of sufficiently large degree), then
[TABLE]
see (PozPraStr16, , Section 5). In (FreHoc93, , Theorem 2) the Matching Moment Property was proved for a quasi-definite linear functional given by
[TABLE]
where is a complex matrix and are vectors (compare also with (Cyb87, , Theorem 1)). In Str09 it was derived by the Vorobyev method of moments (see in particular Chapter III of VorBook65 ).
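The Matching Moment Property for a functional given by a matrix and two vectors can be illustrated with the non-Hermitian Lanczos algorithm without look-ahead. The sketch below is an illustration under stated assumptions, not the formulation used in the cited works: it uses real rational data (so transposes replace conjugate transposes), unnormalized (monic) recurrences so that all arithmetic is exact, and a starting pair with inner product 1. The resulting tridiagonal matrix is similar to a complex Jacobi matrix by a diagonal scaling, which does not change the moments. The names `two_sided_lanczos` and `moments`, and the test data, are hypothetical.

```python
from fractions import Fraction

def matvec(M, x):
    return [sum(a * b for a, b in zip(row, x)) for row in M]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def two_sided_lanczos(A, v, w, n):
    """n steps of the non-Hermitian Lanczos algorithm without look-ahead.

    Uses unnormalized (monic) recurrences so that all arithmetic stays rational.
    Returns the n-by-n tridiagonal matrix T; raises ArithmeticError on breakdown.
    """
    At = [list(col) for col in zip(*A)]  # transpose of A (real data assumed)
    zero = [Fraction(0)] * len(v)
    v_prev, v_cur = zero, [Fraction(x) for x in v]
    w_prev, w_cur = zero, [Fraction(x) for x in w]
    T = [[Fraction(0)] * n for _ in range(n)]
    d_prev = None
    for k in range(n):
        d = dot(w_cur, v_cur)
        if d == 0:
            raise ArithmeticError(f"breakdown at step {k}: w_k^T v_k = 0")
        Av, Atw = matvec(A, v_cur), matvec(At, w_cur)
        alpha = dot(w_cur, Av) / d
        beta = d / d_prev if k > 0 else Fraction(0)
        T[k][k] = alpha
        if k > 0:
            T[k - 1][k], T[k][k - 1] = beta, Fraction(1)
        # Biorthogonalize the new pair against the two previous pairs.
        v_next = [a - alpha * b - beta * c for a, b, c in zip(Av, v_cur, v_prev)]
        w_next = [a - alpha * b - beta * c for a, b, c in zip(Atw, w_cur, w_prev)]
        v_prev, v_cur, w_prev, w_cur, d_prev = v_cur, v_next, w_cur, w_next, d
    return T

def moments(M, left, right, count):
    """Return [left^T M^k right for k = 0, ..., count - 1]."""
    out, x = [], [Fraction(t) for t in right]
    for _ in range(count):
        out.append(dot(left, x))
        x = matvec(M, x)
    return out

# Hypothetical test data with w^T v = 1.
A = [[2, 1, 0], [0, 1, 1], [1, 0, 3]]
v = [1, 0, 0]
w = [1, 2, 0]
n = 2

T = two_sided_lanczos(A, v, w, n)
e1 = [1] + [0] * (n - 1)
m_A = moments(A, w, v, 2 * n + 1)    # w^T A^k v
m_T = moments(T, e1, e1, 2 * n + 1)  # e1^T T^k e1
print(m_A, m_T)
```

For this data the first four moments coincide and the fifth does not. If the inner product of the starting vectors were not 1, the moments of the tridiagonal matrix would have to be multiplied by that inner product before comparing.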
3 Polynomials and orthogonality
Let be a linear functional with moments . Consider the sequence of -dimensional Hankel matrices
[TABLE]
with the corresponding determinant (the notation stands for the submatrix of composed of the elements in the rows from to and in the columns from to ). Setting as the vector
[TABLE]
we are interested in the properties of the linear system
[TABLE]
The solution of Hankel systems, and many related properties of Hankel matrices, have been extensively treated in the literature; see, e.g., the seminal paper by Stieltjes (Sti1894, , Sections 8–11, pp. 624–630) (please notice that we refer to the English translation published by Springer in 1993), the monographs (ChiBook78, , Chapter I), (DraBook83, , Chapter 1), Ioh82 , (HeiRos84, , Part I), and (BulVBaBook97, , Chapter 2), and the paper (GraLin83, , Section 2). Here, we refer in particular to some results in Section 1.2 of DraBook83 ; their straightforward generalization to the complex case is equivalent to Theorems 3.1 and 3.2 given in this section; see also Theorem 7 in (GanBook59, , Chapter XV, §10) in the context of infinite Hankel matrices with finite rank. We do not report the proofs of Theorems 3.1 and 3.2 since they are based on the study of Hankel matrices, and they would lead us too far from the main point of the survey. We will use them as the starting point of our presentation.
Theorem 3.1
Assume that , then if and only if
[TABLE]
where is the unique solution of the linear system (3.2). Moreover, if and for , then if and only if
[TABLE]
As a consequence, we get the following theorem; see (DraBook83, , Property 1.6).
Theorem 3.2
Assume that and . Then the system
[TABLE]
has (infinitely many) solutions if and only if .
The following theorem gives necessary and sufficient conditions for the existence (and uniqueness) of a FOP of degree ; see (DraBook83, , Property 1.14).
Theorem 3.3
Let be a linear functional. An -degree monic FOP exists if and only if one of the following conditions is satisfied.
- (unique monic FOP);
- and (infinitely many monic FOPs);
where are the determinants of the Hankel submatrices composed of the moments of .
Proof
A monic FOP of degree
[TABLE]
exists if and only if , for , which gives the linear system (3.2) with . Therefore if , then the polynomial exists and is unique. If , then necessary and sufficient conditions for the existence of are given by Theorem 3.2: for and , there exist infinitely many if and only if . ∎
Note that by Theorem 3.3, a linear functional is quasi-definite on if and only if , for ; see, e.g., (ChiBook78, , Chapter I, Theorem 3.1).
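The characterization of quasi-definiteness through Hankel determinants is easy to test in exact arithmetic. The following sketch assumes real rational moments and indexes the determinants by the size of the Hankel matrix (an indexing chosen for this example, which may differ from the convention used in the survey). The function names and the moment sequences are illustrative.

```python
from fractions import Fraction

def hankel(m, size):
    """size-by-size Hankel matrix with entries m[i + j]."""
    return [[Fraction(m[i + j]) for j in range(size)] for i in range(size)]

def det(M):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    M = [row[:] for row in M]
    n, result = len(M), Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            result = -result  # a row swap changes the sign
        result *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            M[r] = [a - f * b for a, b in zip(M[r], M[c])]
    return result

def hankel_dets(m):
    """Determinants of the Hankel matrices of sizes 1, 2, ... that m allows."""
    return [det(hankel(m, s)) for s in range(1, (len(m) + 1) // 2 + 1)]

def all_nonzero(dets):
    return all(d != 0 for d in dets)

# Moments of the Legendre functional on [-1, 1]: positive definite case.
legendre = [2, 0, Fraction(2, 3), 0, Fraction(2, 5)]
# Hypothetical moments of an indefinite but quasi-definite functional.
indefinite = [1, 2, 6, 21, 74]
# Hypothetical moments for which a Hankel determinant vanishes.
degenerate = [1, 0, 0, 0, 1]

for name, m in [("legendre", legendre), ("indefinite", indefinite),
                ("degenerate", degenerate)]:
    dets = hankel_dets(m)
    print(name, dets, all_nonzero(dets))
```

In the first case all determinants are positive, in the second they are nonzero with mixed signs, and in the third one of them vanishes, so that functional is not quasi-definite on the corresponding polynomial space.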
The second item of Theorem 3.3 can be interpreted in the following way: consider the sequence . Let be the number of zeros in the sequence between and the first nonzero element in the sequence after , i.e., for and . Note that the parameters are known as Kronecker indices, and the differences as Euclidean indices; see BulVBaBook97 ; Kai80 . Let be the number of zeros in the sequence between and the last nonzero element in the sequence before , i.e., for and . A FOP of degree exists if and only if . Roughly speaking, there are “more consecutive zeros to the right than to the left”.
Among the formal orthogonal polynomials the following cases can be distinguished; see Definition on p. 47 of DraBook83 .
Definition 3.4
A formal orthogonal polynomial (FOP) is called regular when (i.e., when it is unique), while it is called singular when (i.e., when it is not unique).
Proposition 3.5
Let be a regular FOP and . Then for every if and only if .
Proof
Without loss of generality, let us assume to be monic. The conditions , for , lead to the system
[TABLE]
with the unique solution of the linear system (3.2). Theorem 3.1 concludes the proof.
Knowing all the integers such that allows determining all the integers for which a FOP exists.
Example 1. If the zero-nonzero pattern of the sequence of Hankel determinants is
[TABLE]
then the FOPs of degree and do not exist. There exist regular FOPs of degree and and singular FOPs of degree and .
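The existence conditions for FOPs can also be checked directly from the moments, without reference to a particular indexing of the Hankel determinants. The sketch below is one possible way to do this, assuming real rational moments: for each degree it forms the Hankel system for the coefficients of a monic FOP and compares the rank of the matrix with the rank of the augmented matrix. A nonsingular matrix gives a regular FOP, a singular but consistent system gives a singular FOP, and an inconsistent system means that no FOP of that degree exists. The moment sequence is hypothetical and is not the pattern of the example above.

```python
from fractions import Fraction

def rank(M):
    """Rank by row reduction with exact rational arithmetic."""
    M = [row[:] for row in M]
    r = 0
    ncols = len(M[0]) if M else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            f = M[i][c] / M[r][c]
            M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def classify_fop(m, n):
    """Classify the monic FOPs of degree n for the moments m[0..2n-1]."""
    if n == 0:
        return "regular"  # convention: every nonzero constant is a regular FOP
    H = [[Fraction(m[i + j]) for j in range(n)] for i in range(n)]
    augmented = [H[i] + [Fraction(-m[n + i])] for i in range(n)]
    rank_H, rank_aug = rank(H), rank(augmented)
    if rank_H == n:
        return "regular"   # unique monic FOP
    if rank_aug == rank_H:
        return "singular"  # infinitely many monic FOPs
    return "none"          # the orthogonality conditions are inconsistent

# Hypothetical moments m_0, ..., m_5.
m = [0, 0, 1, 1, 0, 2]
pattern = [classify_fop(m, n) for n in range(4)]
print(pattern)  # -> ['regular', 'singular', 'none', 'regular']
```

Degrees for which the result is "none" are exactly the gaps in the FOP sequence that the quasi-orthogonal polynomials introduced next are meant to fill.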
In order to fill the gaps in FOP sequences, we consider polynomials satisfying the following property.
Definition 3.6
The polynomial is called quasi-orthogonal of order (or -quasi-orthogonal), with , when
[TABLE]
Quasi-orthogonal polynomials of order were introduced by Riesz in Rie23 and then generalized to any order by Chihara in Chi57 ; see also (DraBook83, , Definition 1.1, p. 51), Dra90 , Dra16 , and compare the definition with the concept of inner formal orthogonal polynomials given in (hochbruck:1996, , Definition 5.2) and of left and right quasi-formally biorthogonal polynomials in (Fre93b, , Definition 3.3). Note that Definition 3.6 does not require to be minimal, i.e., it is not necessary that . Thus a -quasi-orthogonal polynomial of degree is also -quasi-orthogonal for . Also, any formal orthogonal polynomial of degree is -quasi-orthogonal for .
If , then an -quasi-orthogonal polynomial of degree exists for every larger than ; see, e.g., (Fre93b, , Lemma 3.4). The following theorem will prove it together with the characterization of such polynomials; see the discussion on pp. 47–51 of DraBook83 .
Theorem 3.7
Let be the Hankel determinants associated with the linear functional . Let , and for , and let be the regular monic FOP with respect to . Then all the monic -quasi-orthogonal polynomials for are of the form
[TABLE]
Proof
The proof is by induction on . Let . By Proposition 3.5, if and only if is orthogonal to all polynomials of degree . Therefore is a monic polynomial of degree that is orthogonal to :
[TABLE]
Moreover, any polynomial of the form , , is a monic -quasi-orthogonal polynomial. On the other side, assume that is an arbitrary monic polynomial of degree that is orthogonal to . Then the polynomial has the following two properties:
- it is of degree ,
- it is orthogonal to .
Hence the uniqueness of gives for a certain complex number , i.e.,
[TABLE]
Set between and , and assume that all the monic -quasi-orthogonal polynomials of degree are of the form (3.6). By Proposition 3.5, if and only if is orthogonal to all polynomials of degree . Therefore is a monic polynomial of degree that is orthogonal to :
[TABLE]
Clearly, is a monic -quasi-orthogonal polynomial of degree , for any complex number . It remains to prove that an arbitrary monic polynomial of degree that is orthogonal to is of the form , where is a certain complex number, and is a polynomial of the form (3.6). It can be done similarly to the case . ∎
Proposition 3.8
Let be the Hankel determinants associated with the linear functional such that and for . Then for , is a FOP if and only if it is -quasi-orthogonal.
Proof
Clearly any FOP of degree is -quasi-orthogonal. Vice versa, if is -quasi-orthogonal, then it satisfies (3.6). By Proposition 3.5, is orthogonal to . Therefore if , then for . ∎
Consider the sequence of polynomials
[TABLE]
constructed in the following way: is a regular FOP (when possible) or is a -quasi-orthogonal polynomial, where is the last regular FOP before . For later convenience, we consider every nonzero choice for as a regular FOP. Let us denote by all the indexes for which is a regular FOP, i.e., (setting , and when is the last of the regular FOPs). By Theorem 3.7, the quasi-orthogonal polynomials between two consecutive regular FOPs satisfy the recurrences
[TABLE]
for some coefficients and ; see (DraBook83, , Theorem 1.5 and Remark 1.2). Notice that any choice of and defines a -quasi-orthogonal polynomial. In particular, there exist families of such polynomials satisfying the two-term recurrences
[TABLE]
fixing gives even simpler recurrences.
Setting for some , the regular FOP satisfies (see (DraBook83, , Theorem 1.5 and Remark 1.2) and (GraLin83, , Theorem 2))
[TABLE]
with , a nonzero coefficient, ,
[TABLE]
and given by
[TABLE]
where the matrix of the system is nonsingular; see, e.g., (Fre93:num:math, , Theorem 2.3). Notice that for a quasi-definite linear functional the related (regular) formal orthonormal polynomials satisfy the three-term recurrences (2.2).
Given , the recurrences (3.8) and (3.10) can be expressed in the matrix form (see (DraBook83, , Section 1.7), (pinar:ramirez, , Section 3); cf. (Gra74, , pp. 221–222), (GraLin83, , Figure 2 and Theorem 3), and (FreGutNac93, , Equalities (3.4) and (3.5)))
[TABLE]
with and where is the block matrix
[TABLE]
with the coefficients on the first upper diagonal, the coefficients in the position , when is not regular, and
[TABLE]
for ; for simplicity. Notice that using the recurrences (3.9) with gives the sparse matrix
[TABLE]
with obtained by (3.11).
When the polynomials are regular FOPs (the linear functional is quasi-definite on ) the blocks are scalars. Therefore is an irreducible tridiagonal matrix since and are nonzero for . In particular, there exists a sequence of formal orthonormal polynomials so that the matrix is the complex Jacobi matrix (2.3).
4 The Gauss quadrature for linear functionals
Given a linear functional and a smooth enough function , consider a quadrature approximating of the form (see (DraBook83, , Chapter 5), (Mil03, , Section 2), and (PozPraStr16, , Section 7))
[TABLE]
with the weights, the distinct nodes, and the multiplicity of the node . Notice that the number of nodes can be less than . The quadrature (4.1) will be referred to as -node quadrature when for . Otherwise, the sum of the multiplicities would be smaller than . For any choice of (distinct) nodes and their multiplicities , such that , it is possible to achieve that the quadrature (4.1) is exact for any . It is necessary and sufficient to set the weights as
[TABLE]
where are polynomials from such that
[TABLE]
with , and ; see (DraBook83, , Theorem 5.1) or the proof of Theorem 7.1 in PozPraStr16 . In this case (4.1) is known as interpolatory quadrature, since it can be given by applying to the generalized (Hermite) interpolating polynomial for the function at the nodes of the multiplicities . An interpolatory quadrature is completely determined by its nodes and multiplicities. Therefore in the following a quadrature will be said to be determined by a polynomial when it is an interpolatory quadrature (4.1) with being the roots of , and the corresponding multiplicities of the roots.
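For simple nodes (all multiplicities equal to one) the interpolating polynomials reduce to the Lagrange basis, and the weights of the interpolatory quadrature are obtained by applying the functional to each basis polynomial. The following sketch illustrates this special case only, with real rational data; the moments, the nodes, and the helper names are assumptions made for the example.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def apply_functional(m, p):
    """Value of the functional with moments m on the polynomial p."""
    return sum(c * mk for c, mk in zip(p, m))

def interpolatory_weights(m, nodes):
    """Weights of the interpolatory quadrature with the given simple nodes."""
    weights = []
    for i, xi in enumerate(nodes):
        basis = [Fraction(1)]  # Lagrange polynomial: 1 at node i, 0 elsewhere
        for j, xj in enumerate(nodes):
            if j != i:
                scale = Fraction(1) / (Fraction(xi) - xj)
                basis = poly_mul(basis, [-xj * scale, scale])
        weights.append(apply_functional(m, basis))
    return weights

def quadrature_moment(weights, nodes, k):
    """Value of the quadrature on the monomial of degree k."""
    return sum(w * Fraction(x) ** k for w, x in zip(weights, nodes))

# Hypothetical moments m_0, ..., m_4 and three arbitrary distinct nodes.
m = [1, 2, 6, 21, 74]
nodes = [0, 1, 2]
weights = interpolatory_weights(m, nodes)  # equals [1, -2, 2]
approx = [quadrature_moment(weights, nodes, k) for k in range(4)]
print(weights, approx)
```

With three arbitrary nodes the quadrature reproduces the moments of degree 0, 1 and 2 but not the moment of degree 3, and one of the weights is negative. Higher exactness requires a special choice of the nodes.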
The following definition is a straightforward extension to the complex case of the Gauss quadrature introduced by Draux in (DraBook83, , Chapter 5).
Definition 4.1
The quadrature (4.1) is called the -node Gauss quadrature when it is exact on the space and for (the number of nodes, counting the multiplicities, is ).
We point out the following remarks:
- the algebraic degree of exactness of the -node Gauss quadrature is allowed to be larger than ;
- a Gauss quadrature with a smaller number of nodes may or may not exist when the -node Gauss quadrature exists.
Hence the -node Gauss quadrature generally does not satisfy properties G1–G3 in Section 2. However, when is a quasi-definite linear functional, then the -node Gauss quadrature for satisfies properties G1–G3, i.e., in this case Definition 4.1 is equivalent to the one in PozPraStr16 ; PozPraStr18 .
In order to give conditions for the existence of an -node Gauss quadrature for a linear functional the following result is needed; see (DraBook83, , Theorem 5.2), see also (GauBook04, , Theorem 1.45) for positive definite linear functionals and (PozPraStr16, , Theorem 7.1) for quasi-definite linear functionals.
Theorem 4.2
A quadrature determined by a polynomial is exact for all the polynomials in if and only if is -quasi-orthogonal.
Proof
Assume to be exact for every polynomial in . Then is -quasi-orthogonal. Indeed,
[TABLE]
since for , . Conversely, let be -quasi-orthogonal. Any can be written as for some and , giving . Since is interpolatory it is exact on and thus
[TABLE]
The proof is concluded since for , . ∎
Note that the proof is a straightforward adaptation of the classical well-known argument used for proving the same result in the positive definite case.
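A small worked example may help to see the statement at work. The moments and the symbols below are chosen only for this illustration and do not come from the survey: take a functional $\mathcal{L}$ with moments $m_0 = 1$, $m_1 = 0$, $m_2 = 1$, $m_3 = 0$, and $m_4 = c$ for an arbitrary complex number $c$.

```latex
% Illustrative moments (assumed for this example only):
%   m_0 = 1, m_1 = 0, m_2 = 1, m_3 = 0, m_4 = c.
\begin{align*}
  p_2(\lambda) &= \lambda^2 - 1, \\
  \mathcal{L}(p_2) &= m_2 - m_0 = 0, \qquad
  \mathcal{L}(\lambda p_2) = m_3 - m_1 = 0, \\
  \omega_{\pm} &= \mathcal{L}\!\left(\frac{1 \pm \lambda}{2}\right)
               = \frac{m_0 \pm m_1}{2} = \frac{1}{2}, \\
  \tfrac{1}{2}\,(1)^k + \tfrac{1}{2}\,(-1)^k &= m_k
               \quad \text{for } k = 0, 1, 2, 3, \\
  \tfrac{1}{2}\,(1)^4 + \tfrac{1}{2}\,(-1)^4 &= 1
               \quad (\text{equal to } m_4 \text{ only if } c = 1).
\end{align*}
```

Here $p_2$ is orthogonal to all polynomials of degree at most 1, and the two-node interpolatory quadrature determined by its roots $\pm 1$ is exact for all polynomials of degree at most 3. Exactness for degree 4 additionally requires $c = 1$, which is precisely the case in which the $3 \times 3$ Hankel determinant of the moments, equal to $c - 1$, vanishes.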
As discussed in Section 3, for every linear functional there exists a sequence of polynomials (3.7) so that is a regular FOP (when possible), or is -quasi-orthogonal, where is the last regular FOP before ( is assumed to be regular). We denote by the indexes of the regular FOPs (with when is the last of the regular FOPs). Theorem 4.2 implies the following corollary (see (DraBook83, , Theorem 5.2)).
Corollary 4.3
Let be a polynomial in the sequence described above.
- If is a regular FOP, then it determines a quadrature exact for every polynomial in .
- If is a -quasi-orthogonal polynomial, then it determines a quadrature exact for every polynomial in .
Notice that if , then . Thus for (see, e.g., (DraBook83, , Property 1.15)) and, consequently, for .
If for some , then the quadrature (4.1) has a smaller number of nodes (counting the multiplicities). The following lemmas deal with this issue; see (DraBook83, , Theorem 5.3).
Lemma 4.4
Consider the quadratures determined by the polynomial in the sequence described above. Given two consecutive regular FOPs and , with , then
[TABLE]
Proof
Theorem 3.7 gives
[TABLE]
for some polynomial . Let be the roots of with multiplicities . The weights of the quadrature are given by (4.2). Consider the pair so that is not a factor of , i.e., the root is not a root of or it is a root of but with greater than the multiplicity of as a root of . Then the -degree interpolatory polynomial defined in (4.3) is a multiple of , i.e.,
[TABLE]
for some polynomial . By Proposition 3.5, is orthogonal to , giving
[TABLE]
Therefore has at most nodes. Moreover, each node of is a node of and has multiplicity smaller than or equal to the one of the corresponding node of .
If is a root of with multiplicity , then there exists a polynomial of the kind of (4.3) so that is the corresponding weight of . Since has degree the weight is given by
[TABLE]
Noticing that concludes the proof. ∎
Lemma 4.5
If is a regular FOP, with , then it determines a quadrature (4.1) such that , for .
Proof
Let be such that and as in (4.3), then
[TABLE]
Proposition 3.5 gives , concluding the proof. ∎
The following theorem summarizes the previous discussion; see (DraBook83, , Theorems 5.2 and 5.3).
Theorem 4.6
The -node Gauss quadrature exists (and is unique) if and only if . Moreover, if , then has degree of exactness at least . In particular, if , then has (maximal) degree of exactness , with when is the last of the regular FOPs.
Proof
By Theorem 4.2, is exact on if and only if it is determined by a FOP with degree , i.e., a polynomial orthogonal to . By Lemma 4.4, if is a singular FOP, then does not have nodes. Therefore it is not a -node Gauss quadrature. Considering Lemma 4.5 and noticing that regular FOPs are unique, exists and is unique if and only if . The proof is concluded by noticing that Theorem 4.2 and Lemma 4.4 imply that is exact on . ∎
5 Matrix formulation of the Gauss quadrature
If is a quasi-definite linear functional, then the associated complex Jacobi matrix (2.3) satisfies the Matching Moment Property (2.4). We will give an original proof of an extension of the Matching Moment Property for a general sequence of moments using the properties of the formal orthogonal polynomials and of the Gauss quadrature for the linear functionals. The presented extension also considers the case of moments so that . The case of a linear functional of the kind , with , was treated in (GuoRen04, , Theorem 2.10). We remark that assuming real moments (with a straightforward extension to the complex case), the Matching Moment Property presented here, as well as the ones in FreHoc93 ; GuoRen04 ; PozPraStr16 , can be derived by Theorem 5 of the 1983 paper by Gragg and Lindquist GraLin83 , where such property is related to the minimal partial realization problem.
Let be a linear functional and let be the corresponding block tridiagonal matrix (3.13) associated with the sequence of polynomials . Denote by the subsequence of the regular FOPs and recall that for the polynomials are -quasi-orthogonal. Also recall that if , then for . Since the elements in the superdiagonal of are nonzero the block tridiagonal matrix is nonderogatory, i.e., its eigenvalues have geometric multiplicity . Indeed, if is an eigenvalue, then deleting the first column and the last row of gives a lower triangular nonsingular matrix (with the identity matrix). Thus the null space of has dimension . Proving the Matching Moment Property will need the following lemmas.
Lemma 5.1
Let and be as in (3.12). Then is the characteristic polynomial of (up to a nonzero rescaling).
Lemma 5.1 is a consequence of Lemma 2 in kautsky:81 ; see also (DraBook83, , Theorem 1.11).
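In the quasi-definite case, where all blocks are scalars, the statement of Lemma 5.1 can be verified on small examples through the Cayley–Hamilton Theorem: the monic FOP computed from the moments must annihilate the tridiagonal matrix. The sketch below covers only this special case, with real rational entries. It starts from a hypothetical tridiagonal matrix, computes the moments it generates, solves the Hankel system for the monic FOP, and evaluates that polynomial at the matrix. All names are illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def moments_of(T, count):
    """Return [e_1^T T^k e_1 for k = 0, ..., count - 1]."""
    x = [Fraction(1)] + [Fraction(0)] * (len(T) - 1)
    out = []
    for _ in range(count):
        out.append(x[0])
        x = [sum(a * b for a, b in zip(row, x)) for row in T]
    return out

def solve(M, b):
    """Solve M c = b by Gauss-Jordan elimination in exact arithmetic."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(M, b)]
    for c in range(n):
        pivot = next((r for r in range(c, n) if aug[r][c] != 0), None)
        if pivot is None:
            raise ValueError("singular Hankel matrix: no regular FOP")
        aug[c], aug[pivot] = aug[pivot], aug[c]
        piv = aug[c][c]
        aug[c] = [x / piv for x in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[c])]
    return [row[n] for row in aug]

def monic_fop(m, n):
    """Coefficients c_0, ..., c_{n-1} of the monic FOP of degree n."""
    H = [[m[i + j] for j in range(n)] for i in range(n)]
    return solve(H, [-m[n + i] for i in range(n)])

def poly_at_matrix(coeffs, T):
    """Evaluate the monic polynomial with lower coefficients coeffs at T."""
    n = len(T)
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for c in reversed(coeffs):  # Horner scheme: P <- P T + c I
        P = matmul(P, T)
        P = [[P[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return P

# Hypothetical tridiagonal matrices with nonzero off-diagonal entries.
T2 = [[2, 2], [1, Fraction(5, 2)]]
T3 = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
for T in (T2, T3):
    n = len(T)
    coeffs = monic_fop(moments_of(T, 2 * n), n)
    print(coeffs, poly_at_matrix(coeffs, T))
```

In both cases the result is the zero matrix, so the monic FOP coincides with the characteristic polynomial of the matrix that generated the moments.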
Lemma 5.2
Let be a sequence of block tridiagonal matrices (3.13). For the matrices and satisfy
[TABLE]
where the vectors have dimension on the left-hand side and on the right-hand side (we use the same notation for the sake of simplicity).
Proof
Consider the -dimensional vectors
[TABLE]
If the last element of is zero for , then
[TABLE]
proving the lemma. In the following, when the elements from the position to the position of a vector are possibly nonzero, we denote them by (). Similarly, when the elements from the position to the position are null, we denote them by . Direct computations show that
[TABLE]
Moreover,
[TABLE]
and
[TABLE]
Repeating the argument gives
[TABLE]
concluding the proof. ∎
Theorem 5.3 (Matching Moment Property)
Let be a linear functional with complex moments , and let be the associated block tridiagonal matrix (3.13) with the corresponding polynomials . Denote the indexes of the regular FOPs by . For every let be so that , the matrix satisfies
[TABLE]
with for , for , and when is the last regular FOP.
Proof
Consider the linear functional
[TABLE]
If the linear functionals and are identical on the space , then the theorem is proved. By Lemma 5.1 and the Cayley–Hamilton Theorem, the polynomial satisfies the orthogonality conditions
[TABLE]
Proceeding by induction on , first consider the case . Since is a Hessenberg matrix it satisfies
[TABLE]
Direct computations give . Therefore
[TABLE]
which also trivially holds for . Using property (5.1) and Theorem 4.6, determines the quadrature for so that for every . Moreover, determines the Gauss quadrature for , exact for polynomials of degree at most . The two quadratures and coincide since they have the same weights. Indeed, if is the interpolatory polynomial (4.3) for , then the weights of and are respectively given by
[TABLE]
Since has degree , equality (5.2) gives
[TABLE]
proving the theorem for .
Assume , with so that , and define the quadrature for , determined by the polynomial . By (5.1) and Theorem 4.6, for every . Furthermore, determines the quadrature for , exact for every polynomial of degree at most . As noticed above, and coincide if and only if the respective weights and coincide. Let be the interpolatory polynomials (4.3). Since has degree the weight satisfies
[TABLE]
where Lemma 5.2 and the inductive assumption were used. ∎
We recall the definition of matrix function. A function is defined on the spectrum of the given matrix when for every eigenvalue of there exist for , with the order of the largest Jordan block of in which appears. Consider the Jordan block of the size corresponding to the eigenvalue , then the matrix function is defined as
[TABLE]
Denoting
[TABLE]
the Jordan decomposition of , the matrix function is defined as
[TABLE]
We refer to HigBook08 for further information and for the equivalence to the other definitions of matrix function.
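The Jordan block formula can be checked on a small case. The following is a minimal sketch, assuming the standard convention of HigBook08: the Jordan block has ones on the superdiagonal, and the function of the block is the upper triangular Toeplitz matrix with the scaled derivatives f^(j)(λ)/j! on the j-th superdiagonal. The function is restricted to a polynomial so that the result can be compared exactly with the direct evaluation; all the helper names are hypothetical.

```python
from fractions import Fraction
from math import factorial

def jordan_block(lam, s):
    """s x s Jordan block: lam on the diagonal, ones on the superdiagonal."""
    return [[lam if i == j else (1 if j == i + 1 else 0) for j in range(s)]
            for i in range(s)]

def poly_derivative(coeffs):
    """Coefficients (lowest degree first) of the derivative of a polynomial."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, x):
    """Horner evaluation; coefficients are given lowest degree first."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def f_of_jordan_block(coeffs, lam, s):
    """Upper triangular Toeplitz matrix with f^(j)(lam)/j! on superdiagonal j."""
    taylor, d = [], list(coeffs)
    for j in range(s):
        taylor.append(Fraction(poly_eval(d, lam), factorial(j)))
        d = poly_derivative(d)
    return [[taylor[j - i] if j >= i else Fraction(0) for j in range(s)]
            for i in range(s)]

def matmul(X, Y):
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def poly_of_matrix(coeffs, M):
    """Evaluate the polynomial at the matrix M by Horner's rule."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

# f(x) = x^3 on the 3 x 3 Jordan block with eigenvalue 2.
cube = [0, 0, 0, 1]
by_formula = f_of_jordan_block(cube, 2, 3)
by_powers = poly_of_matrix(cube, jordan_block(2, 3))
print(by_formula == by_powers)  # True
```

Here the first row of both matrices is (8, 12, 6), i.e., f(2), f'(2), and f''(2)/2.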
Consider the block tridiagonal matrix of Theorem 5.3 and its Jordan decomposition . Since is nonderogatory, there are distinct eigenvalues corresponding to the Jordan blocks of the sizes respectively . If is a smooth enough function so that is well defined, then the Jordan decomposition of and some algebraic manipulations give
[TABLE]
with complex weights, and as in Theorem 5.3; see kautsky:81 and (pinar:ramirez, , Section 3) for algebraic expressions of the weights. This observation together with the proof of Theorem 5.3 shows that when the bilinear form is a matrix formulation of the -node Gauss quadrature for the linear functional . Moreover, if , then Lemma 4.4 gives ; hence and correspond to the same Gauss quadrature , despite being different.
6 The minimal partial realization and Gauss quadrature
Any triplet composed of a matrix and vectors , can be associated with a dynamical system
[TABLE]
[TABLE]
with the state vector, the scalar input (control), and the scalar output. The transfer function
[TABLE]
connects with and it is obtained applying the Laplace transform; refer, e.g., to (HoKal66, , Section 2), (Par92, , Section 4), (AntBook05, , Section 4.1, 4.2 and 11.1). The series representation holds only for large enough, and the coefficients are usually known as Markov parameters. The triplet is called a realization of . One of the questions in systems theory is to determine all the realizations that yield a given (rational) function , or equivalently, its Markov parameters. When the realization matches a finite number of Markov parameters it is said to be a partial realization. A partial realization in which has minimal dimension is called a minimal partial realization. Among the extensive literature about the realization problem we refer the reader to the papers by Kalman Kal63 ; Kal79 , Gilbert Gil63 , Ho and Kalman HoKal66 , Gragg Gra74 , Gragg and Lindquist GraLin83 , Parlett Par92 (which offers an algebraic point of view), Heinig and Jankowski HeiJan92 , and to the monographs by Kailath Kai80 , Bultheel and Van Barel (BulVBaBook97, , Chapter 6), Antoulas (AntBook05, , Section 4.4), and by Liesen and Strakoš (LieStrBook13, , Section 3.9); see also Moo81 . In the papers by Chebyshev from 1855–1859 Che1855 ; Che1859a and Christoffel from 1858 Chr1858 the concept equivalent to the minimal partial realization is present (without using the name) for a sequence of moments defining a positive definite linear functional; cf. the comment in (BulVBaBook97, , p. 23). The seminal paper by Stieltjes on continued fractions published in 1894 (Sti1894, , Sections 7–8, pp. 623–625, and Section 51, pp. 688–690) provides an instructive description; see also (LieStrBook13, , Section 3.9.1) and PozStr18 . 
The results about the Gauss quadrature for real linear functionals and about the minimal partial realization of a sequence of real numbers appeared in the same year (1983) respectively in the monograph by Draux (DraBook83, , Chapter 5) and in the paper by Gragg and Lindquist GraLin83 . Section 4 has presented the results by Draux extending them to the complex case. Here the minimal partial realization of a sequence of complex numbers will be described together with the relationships between results in DraBook83 and GraLin83 (with extension to the complex case).
In the following we offer a non-standard formulation of the realization problem in systems theory.
Problem 1: For a given finite sequence of complex numbers
[TABLE]
find all the triplets such that
[TABLE]
Notice that usually the Markov parameters are defined as .
There always exists a solution of dimension of Problem 1. For instance, take and as
[TABLE]
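As an illustration of Problem 1, the sketch below builds one trivial realization whose dimension equals the length of the sequence and checks that it reproduces the given numbers. This is only one possible choice (a down-shift matrix, the first unit vector, and the conjugated sequence) and it need not coincide with the triplet (6.2); the names `shift_realization` and `markov_parameters` are hypothetical.

```python
def shift_realization(sequence):
    """Return (A, v, w) with w^* A^j v = sequence[j] for every index j.

    A is the down-shift matrix, v is the first unit vector, and w stores
    the conjugated sequence, so that w^* A^j v = w^* e_{j+1} = sequence[j].
    """
    n = len(sequence)
    A = [[1 if i == j + 1 else 0 for j in range(n)] for i in range(n)]
    v = [1] + [0] * (n - 1)
    w = [complex(m).conjugate() for m in sequence]
    return A, v, w

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def markov_parameters(A, v, w, count):
    """Compute w^* A^j v for j = 0, ..., count - 1."""
    params, x = [], list(v)
    for _ in range(count):
        params.append(sum(wi.conjugate() * xi for wi, xi in zip(w, x)))
        x = matvec(A, x)
    return params

sequence = [1 + 2j, 0, -3j, 4]
A, v, w = shift_realization(sequence)
print(markov_parameters(A, v, w, len(sequence)) == sequence)  # True
```

Such a realization is generally far from minimal, which motivates the search for realizations of smaller dimension.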
The sequence (6.1) defines the linear functional on with moments
[TABLE]
For any solution of dimension , let be the distinct eigenvalues of and be the index of (the size of the largest Jordan block corresponding to ). Then the definition of matrix function (5.3) and algebraic manipulations give
[TABLE]
with complex weights. Therefore every realization of the sequence (6.1) defines a quadrature rule for the linear functional (6.3).
Problem 2: Among all the realizations for (6.1) find those of smallest dimension.
Let be the smallest index so that the unique -node Gauss quadrature determined by the regular FOP is exact for every polynomial of degree smaller than or equal to , i.e.,
[TABLE]
If is the block tridiagonal matrix (3.13) corresponding to , then Theorem 5.3 shows that the triplet is a minimal partial realization for (6.1). All the other minimal partial realizations can be expressed as
[TABLE]
with any invertible matrix (notice that this is a straightforward extension of the result given in (GraLin83, , Theorem 5) to complex Markov parameters). Hence any minimal partial realization of a sequence of complex numbers corresponds to a Gauss quadrature for the linear functional having as moments.
Finally, we recall the following well-known spectral result about minimal realizations, giving a proof based on the previous developments.
Theorem 6.1
Consider the matrix and the vectors . If the triplet is a minimal realization of the sequence of Markov parameters given by
[TABLE]
then the spectrum of is a subset of the spectrum of .
Proof
Let be the characteristic polynomial of the matrix and consider the linear functional defined by
[TABLE]
By Lemma 5.1 and the Cayley–Hamilton Theorem the -degree polynomial is formally orthogonal to every polynomial, i.e., for every . Consider the last regular FOP in the sequence of the FOPs with respect to . The polynomial is -quasi-orthogonal (note that ). Hence the roots of are roots of by Theorem 3.7. As discussed above, every minimal realization can be expressed as , with the block tridiagonal matrix (3.13) corresponding to and an invertible matrix. Thus Lemma 5.1 concludes the proof. ∎
We remark that the previous theorem is a consequence of the Canonical Structure Theorem of the linear system theory; see, e.g., Kal62 , (Kal63, , Theorem 5), Gil63 and the description in (Par92, , Section 7).
7 The look-ahead Lanczos algorithm and Gauss quadrature
Consider a complex matrix and a complex vector of the corresponding dimension. The th Krylov subspace generated by and is the subspace
[TABLE]
which can be equivalently expressed as
[TABLE]
The basic facts about Krylov subspaces were given by Gantmacher in Gan34 ; other results can be found, e.g., in (LieStrBook13, , Section 2.2).
Let be a complex matrix, be complex vectors, and be the linear functional defined by
[TABLE]
Denoting with the polynomial whose coefficients are the conjugates of the coefficients of and noticing that
[TABLE]
for , we obtain
[TABLE]
with and .
The non-Hermitian Lanczos algorithm (formulated by Lanczos in Lan50 and Lan52 ) gives, when possible, the vectors
[TABLE]
which are respectively bases of and satisfying the biorthogonality conditions
[TABLE]
In this case, there exist regular FOPs with respect to the linear functional (7.1) so that
[TABLE]
Hence bases satisfying (7.2) exist if and only if is quasi-definite on ; see, e.g., (PozPraStr18, , Theorem 2.1).
In the non-Hermitian Lanczos algorithm, the vectors , , are obtained by the three-term recurrences satisfied by the regular FOPs ; for details refer to (BreBook80, , Section 2.7.2), Gut92 ; Gut94b ; Gut94 , (SaaBook03, , Chapter 7), (GolMeuBook10, , Chapter 4), (LieStrBook13, , Section 2.4), also refer to the survey PozPraStr18 where the connection with the Gauss quadrature for quasi-definite linear functionals is described. Considering biorthonormal vectors, i.e., , the non-Hermitian Lanczos algorithm corresponds to the three-term recurrences (2.2) and can be given as Algorithm 7.1; see, e.g., Cull86 ; CULLUM198919 . The outputs of the first iterations of Algorithm 7.1 define the matrices
[TABLE]
which satisfy , with the identity matrix of dimension . Moreover,
[TABLE]
with the complex Jacobi matrix (2.3) associated with the linear functional (7.1), and the Jacobi matrix with conjugate elements ( and are defined in Algorithm 7.1). Therefore the non-Hermitian Lanczos algorithm can be seen as a way to compute and hence the Gauss quadrature for the functional (7.1); see (FreHoc93, , Theorem 2) and also PozPraStr18 (for the block Lanczos algorithm see, e.g., (fenu:reichel:2013, , Section 3)).
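The matching of moments by the Lanczos tridiagonal matrix can be observed on a small example. The sketch below is not Algorithm 7.1: it is an unnormalized variant based on the three-term recurrence of the monic FOPs, restricted to real rational data so that no conjugation is needed and the arithmetic is exact. The tridiagonal matrix it returns (diagonal α, unit subdiagonal, superdiagonal β) differs from the complex Jacobi matrix only by a diagonal scaling, which does not change the (1,1) entry of its powers. All the function names are hypothetical.

```python
from fractions import Fraction

def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def monic_two_sided_lanczos(A, v, w, n):
    """Run n steps of an unnormalized two-sided Lanczos recurrence (real data).

    Returns the n x n tridiagonal matrix with alpha_k on the diagonal,
    ones on the subdiagonal, and beta_k on the superdiagonal.
    Raises ArithmeticError when w_k^T v_k = 0 (breakdown).
    """
    At = [list(col) for col in zip(*A)]
    v_cur = [Fraction(x) for x in v]
    w_cur = [Fraction(x) for x in w]
    v_prev = w_prev = d_prev = None
    T = [[Fraction(0)] * n for _ in range(n)]
    for k in range(n):
        d_cur = dot(w_cur, v_cur)
        if d_cur == 0:
            raise ArithmeticError(f"breakdown at step {k}: w_k^T v_k = 0")
        Av, Atw = matvec(A, v_cur), matvec(At, w_cur)
        alpha = dot(w_cur, Av) / d_cur
        T[k][k] = alpha
        v_next = [a - alpha * c for a, c in zip(Av, v_cur)]
        w_next = [a - alpha * c for a, c in zip(Atw, w_cur)]
        if k > 0:
            beta = d_cur / d_prev
            T[k - 1][k], T[k][k - 1] = beta, Fraction(1)
            v_next = [a - beta * p for a, p in zip(v_next, v_prev)]
            w_next = [a - beta * p for a, p in zip(w_next, w_prev)]
        v_prev, w_prev, d_prev = v_cur, w_cur, d_cur
        v_cur, w_cur = v_next, w_next
    return T

def moment(A, v, w, j):
    """Moment w^T A^j v of the linear functional."""
    x = list(v)
    for _ in range(j):
        x = matvec(A, x)
    return dot(w, x)

def first_entry_of_power(T, j):
    """Entry (1,1) of T^j, computed as the first entry of T^j e_1."""
    x = [Fraction(1)] + [Fraction(0)] * (len(T) - 1)
    for _ in range(j):
        x = matvec(T, x)
    return x[0]

A = [[1, 2, 0], [3, 1, 1], [0, 1, 2]]
v, w, n = [1, 0, 0], [1, 1, 0], 2
T = monic_two_sided_lanczos(A, v, w, n)
m0 = moment(A, v, w, 0)
# The first 2n moments are matched.
print(all(moment(A, v, w, j) == m0 * first_entry_of_power(T, j)
          for j in range(2 * n)))  # True
```

For this data the 2 x 2 matrix has rows (4, -3) and (1, -3), and the moment of order 2n = 4 is no longer matched, in agreement with the degree of exactness of the 2-node Gauss quadrature.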
If the th iteration of Algorithm 7.1 gives , then the algorithm has a breakdown. Since , a breakdown arises if and only if is not quasi-definite on . In this case, the FOP is orthogonal to itself. Therefore there do not exist biorthonormal bases of the Krylov subspaces and . Moreover, there does not exist a regular FOP . There are two kinds of breakdown for Algorithm 7.1:
1. lucky breakdown (or benign breakdown), when or ;
2. serious breakdown, when and , but .
In the first case either is -invariant or is -invariant. Then the algorithm is usually stopped since an invariant subspace is often a desirable result; see, e.g., BreRedSad91 , (Par92, , Section 5) and (GolVLoBook13, , Section 10.5.5). The second case is problematic. In (WilBook65, , pp. 389–391) Wilkinson showed with some examples that well-conditioned matrices with well-conditioned eigenvectors can produce a breakdown. Hence as Wilkinson wrote, serious breakdown “is not associated with any shortcoming in the matrix . It can happen even when the eigenproblem of is very well-conditioned. We are forced to regard it as a specific weakness of the Lanczos method itself.” The interested reader can also refer to Rut53 , (HouBau59, , p. 34), (Tay82, , Chapter IV), ParTayLiu85 , (Par92, , Section 7), and Gut92 ; Gut94b ; Gut94 .
Taylor in Tay82 and Parlett, Taylor, and Liu in ParTayLiu85 first proposed the look-ahead Lanczos algorithm, a strategy able to deal with the breakdown problem. When , the idea behind their strategy is to look for a vector , with big enough, so that and for . In FreGutNac93 Freund, Gutknecht, and Nachtigal implemented a different look-ahead strategy considering sequences of FOPs and quasi-orthogonal polynomials. Their procedure is based on the work of Gutknecht published in Gut92 and later in Gut94b ; see also the thesis Nac91 by Nachtigal and the description in Fre93b by Freund. We also refer the reader to the strategy in BreRedSad91 ; Brezinski1992 and the related work Draux96 . The following part will describe the basic ideas behind the look-ahead Lanczos algorithm by Freund, Gutknecht, and Nachtigal.
Consider the linear functional (7.1) and let be the sequence (3.7) of polynomials so that are the regular FOPs and is an -quasi-orthogonal polynomial for , with when is the last of the regular FOPs. Moreover, consider the vectors
[TABLE]
and the matrices , , with and for simplicity of notation. Hence for , the columns of and of are respectively bases of and . However, instead of the biorthogonality conditions (7.2), the following block-biorthogonality conditions hold
[TABLE]
with
[TABLE]
we denote by .
By Theorem 3.7 and the recurrences (3.8) if , then for some complex coefficients and the following recurrences hold
[TABLE]
If with , then for some the recurrences (3.10) give
[TABLE]
where , , ,
[TABLE]
and the coefficients are given as the solution of the system
[TABLE]
see the linear system (3.11). The described recurrences can be expressed in the matrix form
[TABLE]
with the transpose of the block tridiagonal matrix defined in (3.13) and the conjugate transpose of . The resulting form of the look-ahead Lanczos algorithm is given as Algorithm 7.2 and corresponds to the algorithm proposed in (FreGutNac93, , Algorithm 3.1); see also (Fre93b, , Algorithm 5.1).
The first iterations of Algorithm 7.2 produce the coefficients of the block tridiagonal matrix . If , then the Gauss quadrature for the linear functional (7.1) has the matrix formulation (5.4) which is determined by the matrix . Hence Algorithm 7.2 produces Gauss quadratures for the linear functional (7.1). Notice that by Lemma 4.4 the matrix corresponds to the Gauss quadrature for . Nevertheless, the iterations of Algorithm 7.2 are informative since they show that has degree of exactness larger than . At the same time, Algorithm 7.2 also produces the triplet , i.e., the minimal partial realization (6.4) (with ) of the sequence of Markov parameters defined by
[TABLE]
Consider the case in which a benign breakdown does not arise and the determinants of the Hankel submatrices (3.1) composed of the moments of the linear functional (7.1) are such that
[TABLE]
known as incurable breakdown; see (Tay82, , page 56), (ParTayLiu85, , Section 7), (Par92, , p. 577). Then is the last of the regular FOPs (). By Theorem 4.6, the quadrature determined by is the Gauss quadrature with maximal number of nodes (counting the multiplicities) and it is exact for every polynomial. Equivalently, let be the block tridiagonal matrix obtained at the th step of the Lanczos algorithm. Theorem 5.3 gives
[TABLE]
Moreover, if is a function so that and are well defined matrix functions, then there exists a polynomial interpolating in the Hermite sense at the spectra of and (note that depends on , , and ); see, e.g., (HigBook08, , Section 1.2). Therefore
[TABLE]
Looking at the Lanczos algorithm as a method for getting the Gauss quadrature for a linear functional (7.1), the incurable breakdown corresponds to the solution of the problem as well as the lucky breakdown. Furthermore, the triplet is a minimal realization of the transfer function associated with , i.e., it matches the Markov parameters (7.4). The previous considerations together with Theorem 6.1 give a new proof for the Mismatch Theorem based on the properties of the Gauss quadrature for linear functionals. The Mismatch Theorem was first proved in (Tay82, , Theorem 4.2) by Taylor; see also (ParTayLiu85, , p. 117), and (Par92, , Section 7) where the theorem was connected with the minimal realization problem.
Theorem 7.3 (Mismatch Theorem)
Let be the block tridiagonal matrix obtained at the th step of Algorithm 7.2 with as the input matrix and as the input vectors. If the algorithm has an incurable breakdown at the th step, i.e., the Hankel determinants corresponding to the linear functional (7.1) satisfy (7.5), then each eigenvalue of (known as Ritz value) is an eigenvalue of .
Notice that the look-ahead Lanczos algorithm in Tay82 produces a block tridiagonal matrix different from the matrix (3.13). However, both matrices are minimal realizations of the same sequence of numbers and therefore they are similar.
8 Conclusion
The -node Gauss quadrature for a linear functional described in Section 4 is a straightforward extension of the quadrature introduced for real-valued linear functionals in (DraBook83, , Chapter 5) to the complex case and it satisfies the following properties:
1. the Gauss quadrature has degree of exactness at least ;
2. the Gauss quadrature exists and is unique if and only if the Hankel submatrix of moments is nonsingular, i.e., ;
3. by Theorem 5.3 the Gauss quadrature can be written in the matrix form .
Note that such properties are weaker forms of the properties G1–G3 in Section 2.
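The first two properties can be illustrated in the simplest nontrivial case. The sketch below computes a 2-node quadrature from four complex moments under two assumptions that are not required in general: the 2 x 2 Hankel matrix is nonsingular and the two nodes are distinct (a multiple node would require derivative values). The moments are generated from a hypothetical functional with nodes i and 2 and weights 1 + i and 3; the name `two_node_gauss` is also hypothetical.

```python
import cmath

def two_node_gauss(m):
    """Nodes and weights of the 2-node rule determined by moments m[0..3].

    Assumes a nonsingular 2 x 2 Hankel matrix and two distinct nodes.
    """
    m0, m1, m2, m3 = m
    det = m0 * m2 - m1 * m1
    if det == 0:
        raise ArithmeticError("singular Hankel matrix: no regular FOP of degree 2")
    # Monic p_2(x) = x^2 + c1 x + c0 with L(p_2) = L(x p_2) = 0 (Cramer's rule).
    c0 = (m1 * m3 - m2 * m2) / det
    c1 = (m1 * m2 - m0 * m3) / det
    root = cmath.sqrt(c1 * c1 - 4 * c0)
    z1, z2 = (-c1 + root) / 2, (-c1 - root) / 2
    if z1 == z2:
        raise ArithmeticError("double node: derivative values would be needed")
    # Weights from exactness on the constants and on x.
    w1 = (m1 - m0 * z2) / (z1 - z2)
    w2 = m0 - w1
    return [z1, z2], [w1, w2]

# Moments of the hypothetical functional f -> (1 + i) f(i) + 3 f(2).
moments = [4 + 1j, 5 + 1j, 11 - 1j, 25 - 1j]
nodes, weights = two_node_gauss(moments)
errors = [abs(sum(wt * z ** k for wt, z in zip(weights, nodes)) - moments[k])
          for k in range(4)]
print(max(errors) < 1e-9)  # True
```

The rule reproduces the moments up to order 2n - 1 = 3, and its nodes and weights are complex, unlike in the positive definite case.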
Figure 2 summarizes the connections between the Gauss quadrature for linear functionals, minimal partial realization, and look-ahead Lanczos algorithm. On the right-hand side, the triplet is a partial realization matching the first elements of the sequence of complex numbers . A minimal partial realization can be obtained applying the look-ahead Lanczos algorithm to the matrix and the vectors (this is also connected with the concept of model reduction, see, e.g., (LieStrBook13, , Chapter 3, in particular Section 3.9)). Notice that the Lanczos algorithm applied to the partial realization (6.2) is related to the Berlekamp-Massey algorithm Ber68 ; Mass69 (see Kun77 , GraLin83 , and BoLeLu92 ). On the left-hand side, the sequence determines the linear functional by defining its moments. The functional can be approximated by a Gauss quadrature. Among all the Gauss quadratures exact on , there is one with the minimal number of nodes (counting the multiplicities). Such quadrature can be written in the matrix form
[TABLE]
i.e., it corresponds to the minimal partial realization matching .
In Sections 6 and 7, we discussed the correspondence between the incurable breakdown in the look-ahead Lanczos algorithm and the minimal realization of an infinite sequence of complex numbers (and the unique Gauss quadrature exact for every polynomial). This connection led us to a new proof for the Mismatch Theorem 7.3.
Acknowledgements.
We would like to thank Zdeněk Strakoš for the helpful comments and improvements suggested. This work has been supported by Charles University Research program No. UNCE/SCI/023 and by the Ministry for Scientific and Technological Development, Higher Education and Information Society of R. Srpska.
References
- (1) Antoulas, A.C.: Approximation of Large-Scale Dynamical Systems, Advances in Design and Control, vol. 6. SIAM, Philadelphia, PA (2005). With a foreword by Jan C. Willems
- (2) Bai, Z.: Error analysis of the Lanczos algorithm for the nonsymmetric eigenvalue problem. Math. Comp. 62(205), 209–226 (1994). DOI 10.2307/2153404. URL https://doi.org/10.2307/2153404
- (3) Bai, Z., Day, D.M., Ye, Q.: ABLE: an adaptive block Lanczos method for non-Hermitian eigenvalue problems. SIAM J. Matrix Anal. Appl. 20(4), 1060–1082 (1999). DOI 10.1137/S0895479897317806. URL https://doi.org/10.1137/S0895479897317806
- (4) Beckermann, B.: Complex Jacobi matrices. J. Comput. Appl. Math. 127, 17–65 (2001)
- (5) Berlekamp, E.R.: Algebraic coding theory. McGraw-Hill Book Co., New York-Toronto, Ont.-London (1968)
- (6) Boley, D.L., Lee, T.J., Luk, F.T.: The Lanczos algorithm and Hankel matrix factorization. Linear Algebra Appl. 172, 109–133 (1992). DOI 10.1016/0024-3795(92)90022-3. URL https://doi.org/10.1016/0024-3795(92)90022-3. Second NIU Conference on Linear Algebra, Numerical Linear Algebra and Applications (De Kalb, IL, 1991)
- (7) Brezinski, C.: Padé-type approximation and general orthogonal polynomials. Internat. Ser. Numer. Math. Birkhäuser (1980)
- (8) Brezinski, C.: Computational aspects of linear control, Numerical Methods and Algorithms, vol. 1. Kluwer Acad. Publ., Dordrecht (2002). DOI 10.1007/978-1-4613-0261-2. URL https://doi.org/10.1007/978-1-4613-0261-2
