On Parametric Linear System Solving
Robert M. Corless, Mark Giesbrecht, Leili Rafiee Sevyeri, and B. David Saunders

TL;DR
This paper presents a polynomial-time method for solving parametric linear systems with up to three parameters, leveraging Hermite and Smith normal forms to efficiently analyze solution regimes and singularities.
Contribution
It introduces a novel approach that reduces the complexity of solving parametric linear systems by exploiting algebraic normal forms, improving over previous exponential methods.
Findings
Polynomial-time solution method for systems with up to three parameters.
Effective identification of singularities and structural changes in the system.
Reduction of regimes needed from exponential to polynomial in system size.
Abstract
Parametric linear systems are linear systems of equations in which some symbolic parameters, that is, symbols that are not considered to be candidates for elimination or solution in the course of analyzing the problem, appear in the coefficients of the system. In this work we assume that the symbolic parameters appear polynomially in the coefficients and that the only variables to be solved for are those of the linear system. The consistency of the system and expression of the solutions may vary depending on the values of the parameters. It is well-known that it is possible to specify a covering set of regimes, each of which is a semi-algebraic condition on the parameters together with a solution description valid under that condition. We provide a method of solution that requires time polynomial in the matrix dimension and the degrees of the polynomials when there are up to three…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
[2]\fnmMark \surGiesbrecht
1]\orgdivOntario Research Centre for Computer Algebra, School of Mathematical and Statistical Sciences, \orgnameUniversity of Western Ontario, \orgaddress\cityLondon, \stateON, \countryCanada
2]\orgdivCheriton School of Computer Science, \orgnameUniversity of Waterloo, \orgaddress\cityWaterloo, \stateON, \postcodeN2L 3G1, \countryCanada
3]\orgdivDepartment of Computer and Information Sciences, \orgnameUniversity of Delaware, \orgaddress\cityNewark, \stateDelaware, \countryUSA
On Parametric Linear System Solving
\fnmRobert M. \surCorless
\fnmLeili \surRafiee Sevyeri
\fnmB. David \surSaunders
[
[
[
Abstract
Parametric linear systems are linear systems of equations in which some symbolic parameters, that is, symbols that are not considered to be candidates for elimination or solution in the course of analyzing the problem, appear in the coefficients of the system. In this paper we assume that the symbolic parameters appear polynomially in the coefficients and that the only variables to be solved for are those of the linear system. The consistency of the system and expression of the solutions may vary depending on the values of the parameters. It is well-known that it is possible to specify a covering set of regimes, each of which is a Zariski-constructible condition on the parameters together with a solution description valid under that condition.
We provide a method of solution that requires time polynomial in the matrix dimension and the degrees of the polynomials when there are up to three parameters. We also discuss examples suggesting how the method may be useful beyond the formal three-parameter setting.
In previous methods the number of regimes needed is exponential in the system dimension and polynomial degree of the parameters. Our approach exploits the Hermite and Smith normal forms that may be computed when the system coefficient domain is mapped to the univariate polynomial domain over suitably constructed fields. Our method identifies intrinsic singularities and ramification points where the algebraic and geometric structure of the matrix changes.
Parametric eigenvalue problems are addressed as well: we treat as a parameter in addition to those in and solve the parametric system . The algebraic conditions on required for a nontrivial nullspace define the eigenvalues. We do not directly address the problem of computing the Jordan form, but our approach allows the construction of the algebraic and geometric eigenvalue multiplicities revealed by the Frobenius form, which is a key step in the construction of the Jordan form of a matrix.
1 Introduction
In broad generality, symbolic computation is concerned with mathematical equations that contain symbols; symbols are used both for variables, which are typically to be solved for, and parameters, which are typically carried through and appear in the solutions, which are then interpreted as formulae: that is, objects that can be further studied, perhaps by varying the parameters. One prominent early researcher said that the difference between symbolic and numeric computation was merely a matter of when numerical values were inserted into the parameters: before the computation meant you were going to do things numerically, and after the computation meant you had done symbolic computation. The words “parameters” and “variables” are therefore not precisely descriptive, and can often be used interchangeably. Indeed as a matter of practice, polynomial equations can often be taken to have one subset of its symbols taken as variables rather than any other subset in quite strategic fashion: it may be better to solve for as a function of than to solve for as a function of .
In this paper we are concerned with systems of equations containing several symbols, some of which we take to be variables, and all the rest as parameters. In addition, we restrict our attention to problems in which the variables appear only linearly. Parameters are allowed to appear polynomially, of whatever degree.
Parametric Linear Systems (PLS) arise in many contexts, for instance in the analysis of the stability of equilibria in dynamical systems models such as occur in mathematical biology and other areas. Understanding the different potential kinds of dynamical behavior can be important for model selection as well as analysis. Another important area of interest is the role of parametric linear systems in dealing with the stability of the equilibria of parametric autonomous systems of ordinary differential equations (see [26] and [12]). One particularly famous example is the Lotka-Volterra system which arises naturally from predator-prey equations. See also [24] and [25]. Other examples of the use of parametric linear system from science and engineering include their application in computing the characteristic solutions for differential equations [9], dealing with colored Petri nets [14] and in operations research and engineering [10], [18], [22], [32]. Some problems in robotics [2] and certain modelling problems in mathematical biology, see e.g., [30], also can benefit from the ability to effectively solve parametric linear systems.
After some discussion of prior comprehensive solving work in Section 2, we proceed with formal problem and solution definitions for parametric linear systems in Section 3. Our primary tool for solving these is by way of Comprehensive Triangular Smith Normal Form (CTSNF), which is introduced in Section 5, where we also reduce PLS to CTSNF. Section 6 describes the solution of CTSNF problems for the case of up to three parameters.
An application that seems at first to be of only theoretical interest is the computation of the matrix logarithm, or indeed any of several other matrix functions such as matrix square root. We briefly discuss this example in more detail with a pair of small matrices in Section 7.2. We also give other examples in Section 7.
A preliminary conference version of this work appeared in [5]. The present version expands the definitions, proofs, and treatment of constructible regimes.
2 Previous Work on Parameterized Linear Systems
Interest in computation of the solution of parameterized linear systems dates back to the beginning of symbolic computation. For instance, one of the first things users have requested of computer algebra systems is the explicit form of the inverse of a matrix containing only symbolic entries111This is merely an anecdote, but one of the present authors attests that this really has happened.: the user is then typically quite dissatisfied at the complexity of the answer if the dimension is greater than, say, three. Of course, the determinant itself, which must appear in such an answer, has a factorial number of terms in it, and thus growth in the size of the answer must be more than exponential. Therefore the complexity of any algorithm to solve parameterized linear systems must be at least exponential in the number of parameters.
An interesting pair of papers addressing the case of only one parameter is [1] and [16]. These papers assume full rank of the linear system, and thus compute the “generic” case when in fact there are isolated values of the parameter for which the rank drops, and use rational interpolation of the numerical solutions of specialized linear systems to recover this generic solution.
Many authors have sought comprehensive solutions, by which is meant complete coverage of all parametric regimes, through various means. One of the first explicit methods was the matrix-minor based approach of [26], which enables practical solution of many problems of interest. Recently, the problem of computing the Jordan form of a parametric matrix once the Frobenius form is known has been approached using Regular Chains [7], and this has been moderately successful in practice. Simple methods and heuristics for linear systems containing parameters continue to generate interest, even when Regular Chains are used, such as in [3].
Other authors such as [31], [19], [21], [17], and [20] have tackled the even more difficult problem of computing the comprehensive solution of systems of polynomial equations containing parameters, and of course their methods can be applied to the linear equations being considered here.
By restricting our attention in this paper to linear problems and to those of a constant number of parameters (e.g., three or fewer) we are able to guarantee better worst case performance (polynomially many solution regimes) and hope to provide better efficiency in many instances than is possible using those general-purpose approaches.
3 Definitions and Notation
Let be a field and a list of parameters, and fix an algebraic closure of . Then is the ring of polynomials and is the field of rational functions in . For each tuple in , evaluation at is a mapping . We will extend this mapping componentwise to polynomials, vectors, matrices, and sets thereof over (i.e., in ). Equivalently, one may interpret evaluation in arbitrary field extensions of ; we fix for notational convenience.
In many of our constructions an additional symbol will play a special role: it is a parameter of the PLS problems but, for purposes of computing Hermite/Smith forms, we treat as the univariate polynomial variable over . Accordingly, evaluation at extends to a partial evaluation map by substituting and leaving unbound. For we write for the resulting polynomial in , and for we write (equivalently ) for its value under the further specialization .
We use the Householder convention, typesetting matrices in upper case bold, e.g. , and lower case bold for vectors, e.g. .
For the most part, for such objects over , we know and from context and write rather than , but write for the partial evaluation at , and for the full evaluation at .
For a set of polynomials , we will denote by the variety of the ideal generated by in . This is the set of tuples such that , for all . For a polynomial let denote its principal Zariski-open set, . For a finite set define .
When polynomial sets also involve , we use the same notation with respect to the extended parameter space . For example, for we write for the set of pairs such that for all , and for we write and for finite . We will be concerned with basic Zariski-constructible (locally closed) sets of the form . We use Zariski-constructible rather than “semi-algebraic” since we do not assume that is an ordered field.
Our inputs are polynomial in the parameters but the output coefficients in general are rational functions. The evaluation mapping extends partially to : for a rational function in lowest terms ( and relatively prime), define ; then is well defined for . We extend componentwise to vectors and matrices over (e.g., by taking the least common multiple of entry denominators). Throughout, whenever regime data contains rational function coefficients, we explicitly adjoin the relevant denominator factors to the nonvanishing constraint set so that all expressions are well defined on the corresponding constructible set.
When a condition is imposed on an element of a rational function field such as , we use the following convention. After writing in lowest terms, the condition means together with , and the condition means together with . Thus writing in a vanishing set is shorthand for adjoining to and the factors of to , while writing in a nonvanishing set is shorthand for adjoining the factors of both and to . The same convention is applied componentwise to vectors and matrices. For regimes over an algebraic parameterized extension, these conditions are interpreted in the corresponding quotient coordinate ring; equivalently, after choosing a basis for the quotient, one may clear denominators and equate the corresponding coordinate polynomials over the base parameter ring.
Definition 3.1**.**
For a polynomial in a polynomial ring over (e.g., in or ) we write for the (finite) set of monic irreducible factors of in that ring. For rational matrices/vectors with entries in we write and .
Definition 3.2**.**
A PLS instance is well defined if all entries of and are well defined on ; equivalently, .
Definition 3.3**.**
The data for a PLS problem is a matrix and right hand side vector over , together with a Zariski-constructible constraint, , with . We are only interested in those parameter value tuples in , i.e., on which the polynomials in are nonzero and the polynomials in are zero.
For the PLS problem :
- •
a solution regime is a tuple , with entries of and in for some parameterized extension of (Definition 5.1), such that all denominator factors occurring in and are contained in , and such that, for all , is a solution vector and is a matrix whose columns form a nullspace basis for .
- •
An inconsistency regime is a triple such that, for all , the specialized system has no solution. The symbol is to remind that no solution vector is on offer in this case.
- •
A PLS solution is a set of regimes (solution regimes and inconsistency regimes) that covers , which means every parameter value assignment that satisfies the problem Zariski-constructible constraint also satisfies at least one regime Zariski-constructible constraint. In other words
[TABLE]
We call entries that must occur in any in the solution an intrinsic restriction, or singularity. We call the differing sets that may occur in covers of the ramifications of the cover.
The following examples illustrate the PLS definition and also sketch the prior approach to PLS given by [26].
Example 3.4**.**
Let be a field, , and consider the PLS instance with
[TABLE]
and with empty initial constraints . If then the second equation reads and the system is inconsistent; thus we have inconsistency at the points (i.e., we have inconsistency regimes . On the matrix has rank and the solution set is , so one solution regime is . On the matrix is zero and every vector solves; one may take .
We now sketch the minor-based regime construction from [26]. If, for of size , is , and conformally , then a solution satisfies
[TABLE]
and
[TABLE]
Under the condition that is nonzero and all larger minors of are zero, equation (3.1) can be solved with specific solution and . Provided the system is consistent (equation (3.2) holds), we have the regime
[TABLE]
where and .
Definition 3.5**.**
A solution regime obtained in this way (from a choice of nonsingular submatrix ) is called a minor-defined regime.
Since an matrix has minors, there are exponentially many minor defined regimes. However, some of these regimes may not be solutions due to inconsistency or it may be possible to combine several regimes into one. For instance if is a constant, and , then all rank solutions are covered by this one regime. [26] has made a thorough study of minor defined regimes and their simplifications.
Another approach is to base solution regimes on the pivot choices in an LU decomposition. The simplest thing to do is to leave it to the user, although one has to also inform the user through a proviso when this might be necessary [6]. That is, provide the generic answer, but also provide a description of the set . A more sophisticated approach is developed by [7, 3] using the theory of regular chains and its implementation in Maple [19] to manage the algebraic conditions. For example a given matrix entry may be used as a pivot, with validity dependent on adding the polynomial to the non-zero part, , of the Zariski-constructible set. For a comprehensive solution the case that that entry is zero must also be pursued. In the worst case, this leads to a tree of zero/nonzero choices of depth and branching factor .
4 Triangular Smith forms and degree bounds
In this paper we take a different approach, with the solution regimes arising from Hermite normal forms, of which triangular Smith forms are a special case. We give a system of solution regimes of polynomial size in the matrix dimension, , and polynomial degree, . Each regime is computed in polynomial time and the regime count is exponential only in the number of parameters. To use Hermite forms we will need to work over a principal ideal domain such as, for parameters , . We will restrict our input matrix to be polynomial in the parameters. This first lemma shows it is not a severe constraint.
Lemma 4.1**.**
Let be a well defined PLS over field , for parameter set , with and with numerator and denominator degrees bounded by in each parameter of and well defined in the sense of Definition 3.2. The problem is equivalent (same solutions) to one in which the entries of the matrix and vector are polynomial in the parameters , the dimension is the same, and the degrees are bounded by .
Proof.
Because the PLS is well defined, it is specified by that all denominator factors of are nonzero for . Let be a diagonal matrix with the -th diagonal entry being the least common multiple () of the denominators in row of , . These s also evaluate to nonzero on . It follows that if and only if . Thus the PLS is equivalent and its matrix and vector have polynomial entries of degrees bounded by . ∎
We will reduce PLS to triangular Smith normal form computations. The rest of this section concerns computation of triangular Smith normal form and bounds for the degrees of the form and its unimodular cofactor.
Definition 4.2**.**
Given field and variable , a matrix over is in (reduced) Hermite normal form if it is upper triangular, its diagonal entries are monic, and, for each column in which the diagonal entry is nonzero, the off-diagonal entries are of lower degree than the diagonal entry.
If each diagonal entry of exactly divides all those below and to the right, then is column equivalent to a diagonal matrix with the same diagonal entries (its Smith normal form). An equivalent condition is that, for each , the greatest common divisor of the minors in the leading columns equals the greatest common divisor of all minors. Following [28, Section 8, Definition 8.2] we call such a Hermite normal form a triangular Smith normal form. It will be the central tool in our PLS solution.
For notational simplicity, we’ve left out the possibility of echelon structure in a Hermite normal form. We will talk of Hermite normal forms only for matrices having leading columns independent up to the rank of the matrix. In our algorithms we assume this hypothesis (or enforce it by a generic right multiplication by a constant nonsingular matrix, as in Fact 4.4). Every such matrix over is row equivalent to a unique matrix in Hermite form as defined above. For given we have , with unimodular, i.e. , and in Hermite form. If is nonsingular, the unimodular cofactor is unique and has determinant , where is the leading coefficient of . This follows since , which is monic.
The next definition and lemma concern assurance that Hermite form computation will yield a triangular Smith form.
Definition 4.3**.**
Call a matrix well-tempered if its Hermite form is a triangular Smith form (each diagonal entry exactly divides those below and to the right). In particular, a well-tempered matrix has leading columns independent up to the rank.
There is always a column transform (unimodular matrix applied from the right) such that is well-tempered. The following fact, proven by [15] shows that a random transform over suffices with high probability.
Fact 4.4**.**
Let be a matrix over of degree in at most . Let be a unit lower triangular matrix with below diagonal elements chosen from subset of uniformly at random. Then is well-tempered over with probability at least .
Note that and, for and we also have .
We continue with analysis of degree bounds for Hermite forms of matrices, particularly degree bounds for triangular Smith forms of well-tempered matrices. The first result needed is the following fact from [11]. Through the remainder of this paper we will employ “soft O” notation, where, for functions we write if and only if for some constant .
Fact 4.5**.**
Let be a field, parameters, and let be in , nonsingular, with , . Over , let be the unique Hermite form row equivalent to and be the unique unimodular cofactor such that . The coefficients of the entries of , are rational functions of . Let be the least common multiple of the denominators of the coefficients in , , as expressed in lowest terms.
- (a)
* and * 2. (b)
* (bounds both numerator and denominator degrees).* 3. (c)
. 4. (d)
* and can be computed in polynomial time: deterministically in time and Las Vegas probabilistically (never returns incorrect result) in expected time.*
Proof.
This is [11, Summary Theorem]. The situation there is more abstract, more involved. We offer this tip to the reader: their correspond respectively to our , identity, zero.
Item (c) is not stated explicitly in a theorem of [11] but is evident from the proofs of Theorems 5.2 and 5.6 there. The common denominator is the determinant of a matrix over of dimension and with entries of degree in at most . ∎
We will generalize this fact to nonsingular and non-square matrices in Theorem 4.6. In that case the unimodular cofactor, , is not unique and may have arbitrarily large degree entries. The following algorithm is designed to produce a with bounded degrees.
Theorem 4.6**.**
Let be a field, parameters, and let be in of rank , , and . Let be a random unit lower triangular matrix chosen as in Fact 4.4, so that is well-tempered with the stated probability. Then, for the triangular Smith form computed as , = HermiteForm(), we have
- (a)
Algorithm HermiteForm is (Las Vegas) correct and runs in expected time ; 2. (b)
; 3. (c)
.
Proof.
Let be as in Fact 4.4 with and . If the field is small, an extension field can be used to provide large enough .
We apply HermiteForm to to obtain , , and use the notation of Algorithm 1 in this proof. We see by construction that is nonsingular, from which it follows that and are uniquely determined (since is the unique Hermite form of ).
By construction the first columns of equal the first columns of , hence the first columns of equal the first columns of . Since is upper triangular, all entries of below row in its first columns are zero, and therefore the same holds for . Because and the leading block is nonsingular, the top rows of are linearly independent and span the row space of . Any linear combination of these rows that is zero in the first columns must therefore be trivial (since the restriction to the first columns has nonsingular coefficient matrix ). It follows that the last rows of are zero, so has the block form .
Moreover, the first columns of coincide with those of the Hermite form of , so they satisfy the reduced degree conditions required in Hermite normal form. Thus is in Hermite normal form and is row equivalent to . By uniqueness of Hermite normal form for matrices whose leading columns are independent up to rank, is the Hermite form of . Since is well-tempered (Fact 4.4), this Hermite form is a triangular Smith form, as required. The runtime is dominated by computation of and for , so Fact 4.5 provides the bound in (a).
For the degree in , applying Fact 4.5, we have . Noting that has degree zero, we have and .
For the degree in , note first that the bounds for degrees in apply as well to . We have, by Fact 4.5, that and the same bound for . For , note that so that , and . ∎
5 Reduction of PLS to triangular Smith forms
In this section we define the Comprehensive Triangular Smith Normal form problem and solution and show that PLS can be reduced to it. The next section addresses the solution of CTSNF itself.
Definition 5.1**.**
For field and parameters , is a parameterized extension of if is the top of a tower obtained by adjoining the parameters of in some order, each either as a rational function parameter or as algebraic over the field constructed so far. Thus, at an algebraic step adjoining , the field is extended by , where is irreducible over as a polynomial in . When a solution regime to a PLS or CTSNF problem is over a parameterized extension , we record the irreducible polynomials defining the algebraic steps of the tower in the vanishing constraint set of that regime.
A comprehensive triangular Smith normal form problem (CTSNF problem) is a triple of a matrix over together with polynomial sets defining a Zariski-constructible constraint on the parameters . Here is treated as a distinguished variable: in CTSNF regimes we do not specialize , and evaluation at refers to the partial evaluation .
For CTSNF problem , a triangular Smith regime is of the form , with , over , where is a parameterized extension of and any polynomials defining algebraic extensions in the tower are in , and with , such that all denominator factors occurring in and are contained in , and on all , is in triangular Smith form over , is unimodular in , is nonsingular over , and .
A CTSNF solution is a list , of triangular Smith regimes that cover , which is to say .
The goal in this section is to reduce the PLS problem to the CTSNF problem. The first step is to show it suffices to consider PLS with a matrix already in triangular Smith form. The second step is to show each CTSNF solution regime generates a set of PLS solution regimes.
Lemma 5.2**.**
Given a parameterized field and matrix over , let be a triangular Smith form of over , with unimodular over , and nonsingular over such that = . Let be constraints such that all data below are defined and is unimodular for every . Then is a solution regime for PLS problem over if and only if is a solution regime for PLS problem , . Moreover, is an inconsistency regime for one system if and only if it is an inconsistency regime for the other.
Proof.
Let . By hypothesis, all expressions appearing below are well defined and is unimodular over . Hence is invertible. The matrix is constant and nonsingular over . Thus the following are equivalent.
2. 2.
3. 3.
The same equivalence shows that is consistent if and only if is consistent, so inconsistency regimes are preserved as well. ∎
Then we have the following algorithm to solve a PLS with the matrix already in triangular Smith form. For simplicity we assume a square matrix, the rectangular case being a straightforward extension.
Lemma 5.3**.**
Algorithm TriangularSmithPLS is correct: it outputs a list of solution regimes and inconsistency regimes whose constructible sets cover . Let where are the diagonal entries of (so has rank over ), and let . Then the algorithm outputs at most solution regimes, and at most regimes total.
Proof.
For , the algorithm outputs a solution regime . Using the numerator/denominator convention of Section 3, imposes and imposes wherever the relevant rational functions are defined. Thus for any we have . If then , so implies while , hence . If then is identically zero, so holds automatically. In either case, the divisibility properties of triangular Smith form imply that all entries in rows of are zero.
Since in a triangular Smith form, the conditions above also imply . Therefore the leading block is invertible. The regime adjoins to , so for the last equations reduce to and are consistent. The vector satisfies the leading equations by construction, and the additional condition (enforced by ) guarantees that is well defined.
For the nullspace, write after evaluation at . Then
[TABLE]
and the bottom block shows that the columns of are linearly independent. The additional condition guarantees that is well defined. Thus is a correct solution regime.
The algorithm also outputs inconsistency regimes for . For any we have the same rank conditions as above (since and ), so the last rows of are zero, while (since ). Hence is inconsistent, so is a correct inconsistency regime.
To see that these regimes cover , take and let . By the divisibility chain on the diagonal entries, and , which implies either or . Thus and the algorithm generates the corresponding regimes. If is consistent, then necessarily , so and the solution regime applies. If it is inconsistent, then there exists with , so and an inconsistency regime applies.
For the regime count, let be the indices with . Each has a square-free irreducible factor of degree at least that divides , hence contributes at least to . Therefore
[TABLE]
so . Since the algorithm outputs at most one additional solution regime coming from , it outputs at most solution regimes. Each such produces at most inconsistency regimes, so the total number of regimes is at most . ∎
Theorem 5.4**.**
Algorithm 3 is correct.
Proof.
Let . By construction satisfies the -only constraints , so at least one triangular Smith regime of in step is valid at . For that regime, step applies Algorithm 2 and produces either a solution regime or an inconsistency regime whose constructible set contains (Lemma 5.3). Lemma 5.2 transfers solution data back through without changing the underlying constraint set, so the corresponding regime for the original PLS problem is correct and covers . ∎
6 Solving Comprehensive Triangular Smith Normal Form
In view of the reductions of the preceding section, to solve a parametric linear system it remains only to solve a comprehensive triangular Smith form problem. This is difficult in general but we give a method to give a comprehensive solution with polynomially many regimes in the bivariate and trivariate cases.
Theorem 6.1**.**
Let , and let have degree in and degree in . Let be polynomial sets defining a Zariski-constructible constraint on . Then the CTSNF problem has a solution of at most triangular Smith regimes.
Proof.
We first solve the unconstrained case . Compute a triangular Smith form over such that . This regime is valid for evaluations that do not zero the denominators (polynomials in ) of and . Let and set , with , to complete the generic regime.
Then, for each , adjoin the regime , obtained by computing the triangular Smith form over . Together with the generic regime indexed by [math], these regimes cover all specializations of : at any point either no factor of vanishes, or at least one irreducible factor of vanishes. From the bounds of Theorem 4.6, , so the number of irreducible factors, and hence the number of regimes, is .
For general input constraints , intersect each produced regime with , i.e., replace each output by . This changes neither the validity of the regimes nor their number. ∎
We can proceed in a similar way when there are three parameters, but must address an additional complication that arises.
Before proving the trivariate bound, we use the following degree-bound observation. The denominator degree bounds used in Theorem 4.6 apply, with the same proof, when the single coefficient parameter is replaced by a pair of coefficient parameters . In particular, for with degree at most in and at most in each of , the common denominator of the computed generic triangular Smith regime over has degree in each of and . The same bound applies to the one-parameter denominators arising after restricting to an irreducible curve factor and computing over the corresponding parameterized coefficient field. This follows from the determinant expressions controlling the common denominators in Fact 4.5 and Theorem 4.6, with the remaining coefficient parameters carried through in the coefficient field.
Theorem 6.2**.**
Let , and let have degree in and degree in . Let be polynomial sets defining a Zariski-constructible constraint on and . Then the CTSNF problem has a solution of at most triangular Smith regimes.
Proof.
As in the bivariate case above, we solve the unconstrained case and just adjoin , if nontrivial, to the Zariski-constructible condition of each solution regime.
First compute a triangular Smith form over such that . This will be valid for evaluations that don’t zero the denominators (polynomials in ) of and . Let and set , and set to complete the first regime.
Next, for each , if occurs in compute a triangular Smith form over . If does not occur in , interchange the roles of and compute over . In either case the resulting coefficients lie in a parameterized extension in the sense of Definition 5.1.
Let . Here arises from divisions by elements of the coefficient field (respectively ), hence after clearing denominators it may be taken in (respectively ). We therefore adjoin the regime
[TABLE]
which is valid on .
The regimes indexed by [math] and by single factors cover all specializations except for those in which either (i) at least two distinct factors from vanish simultaneously, or (ii) for some the denominator vanishes on . Both situations give rise only to zero-dimensional exceptional sets. Indeed, for distinct irreducible the intersection is finite, and by Bézout’s theorem [8] it contains at most points. Similarly, since is irreducible and depends on (respectively ) while (respectively ), we have and again Bézout’s theorem bounds by .
These finitely many points can be enumerated (for example by a resultant computation or a triangular decomposition of the appropriate ideals). For each such point we produce a separate regime by evaluating at that point and computing a triangular Smith form over the corresponding (possibly algebraic) extension field, as allowed in Definition 5.1.
The degree bounds and follow from the degree-bound observation above. Since , the total number of point regimes arising from the intersections above is bounded by , yielding the stated overall regime bound. ∎
Corollary 6.3**.**
For a PLS with matrix , an -vector, let , and suppose . Counting both solution regimes and inconsistency regimes, we have
* regimes in the PLS solution for the univariate case (domain of , is ).* 2. 2.
* regimes in the PLS solution for the bivariate case (domain of , is ).* 3. 3.
* regimes in the PLS solution for the trivariate case (domain of , is ).*
Proof.
Consider one CTSNF regime for , so that is in triangular Smith form with diagonal entries . Let and , as in Lemma 5.3. The product of the nonzero diagonal entries of a triangular Smith form is the rank- determinantal divisor. Hence it divides every nonzero minor of the input matrix (after the constant right transformation ). Since some such minor has degree at most , we have .
Lemma 5.3 therefore implies that applying Algorithm TriangularSmithPLS to this regime produces at most PLS regimes, including inconsistency regimes. In the univariate case there is a single CTSNF regime. In the bivariate case we multiply this per-regime bound by the CTSNF-regime bound of Theorem 6.1, and in the trivariate case by the bound of Theorem 6.2. This gives the three stated estimates. ∎
7 Normal forms and Eigenproblems
Comprehensive Hermite Normal form and comprehensive Smith Normal form are immediate corollaries of our comprehensive triangular Smith form. For Hermite form, use the same comprehensive construction but take the right hand cofactor to be the identity, , and omit the well-tempered/triangular-Smith divisibility requirement in the Hermite-form computation. For Smith form one can convert each regime of CTSNF to a Smith regime. Where with a triangular Smith form, perform column operations to obtain with the diagonal of . In the diagonal entries divide the off diagonal entries in the same row. Subtract multiples of the -th column from the subsequent columns to eliminate the off diagonal entries. Because the diagonal entries are monic, no new denominator factors arise and . Thus when is a valid regime in a CTSNF solution for , then is a valid regime for Smith normal form.
It is well known that if for field (that may involve parameters) and is an additional variable, then the Smith invariants of are the Frobenius invariants of and is similar to its Frobenius normal form, , where denotes the companion matrix of polynomial . Thus we have comprehensive Frobenius normal form as a corollary of CTSNF, however it is without the similarity transform. It would be interesting to develop a comprehensive Frobenius form in which each regime also includes a similarity transform.
Parametric eigenvalue problems for correspond to PLS for with zero right hand side. Often eigenvalue multiplicity is the concern. The geometric multiplicity is available from the Smith invariants, as for example on the diagonal of a triangular Smith form. Common roots of or more of the invariants expose geometric multiplicity and square-free factorization of the individual invariants exposes algebraic multiplicity. Note that square-free factorization may impose further restrictions on the parameters. Comprehensive treatment of square-free factorization is considered in [19].
7.1 Eigenvalue multiplicity example
The following matrix, due originally to a question on sci.math.num-analysis in 1990 by Kenton K. Yee, is discussed in [4]. We change the notation used there to avoid a clash with other notation used here. The matrix is
[TABLE]
One of the original questions was to compute its eigenvectors. Since it contains a symbolic parameter , this is a parametric eigenvalue problem which we can turn into a parametric linear system, namely to present the nullspace regimes for .
Over , after preconditioning, we get as the triangular Smith form diagonal , where .
Remark 7.1*.*
Without preconditioning, the Hermite form diagonal is instead .
The denominator of , is a power of , so the only constraint is which is already a constraint for the input matrix. We get regimes of rank for , rank for being a root of , and rank for all other . In terms of the eigenvalue problem, we get eigenspaces of dimension for each of , and of dimension for the two roots of .
To explore algebraic multiplicity, we can examine when has or as a root. When is a root of , factors as and when we have . These factorizations may be discovered by taking resultants of with or .
7.2 Matrix Logarithm
Theorem 1.28 of [13] states conditions under which the matrix equation has so-called primary matrix logarithm solutions, and under which conditions there are more. If the number of distinct eigenvalues of is strictly less than the number of distinct Jordan blocks of (that is, the matrix is derogatory), then the equation also has so-called nonprimary solutions as well, where the branches of logarithms of an eigenvalue may be chosen differently in each instance it occurs.
As a simple example of what this means, consider
[TABLE]
When we compute its matrix logarithm (for instance using the MatrixFunction command in Maple), we find
[TABLE]
This is what we expect, and taking the matrix exponential (a single-valued matrix function) gets us back to , as expected. However, if instead we consider the derogatory matrix
[TABLE]
then its matrix logarithm as computed by MatrixFunction is also derogatory, namely
[TABLE]
Yet there are other solutions as well: if we add to the first entry and to the second logarithm, we unsurprisingly find another matrix which also satisfies . But adding to the first entry of while adding to its second logarithm, we get another matrix
[TABLE]
which has the (somewhat surprising) property that , not .
This example demonstrates in a minimal way that the detailed Jordan structure of strongly affects the nature of the solutions to the matrix equation . This motivates the ability of code to detect automatically the differing values of the parameters in a matrix that make it derogatory. To explicitly connect this example to CTSNF, consider
[TABLE]
so that above is and . The CTSNF applied to produces two regimes, with forms
[TABLE]
exposing when the logarithms will be linked or distinct. Note that in this case the Frobenius structure equals the Jordan structure.
7.3 Model of infectious disease vaccine effect
[23] have made a model of vaccine effect when there are two subpopulations with differing disease susceptibility and vaccination rates. Within this study stability of the model is a function of the eigenvalues of a Jacobian . Thus we are interested in cases where the following matrix is singular.
[TABLE]
Here are vaccination rates for the two populations, are death rates, are within population transmission rates, and are the between population transmission rates. We have simplified somewhat: for instance are transmission rates multiplied by other parameters concerning population counts. Stability depends on the positivity of the largest real part of an eigenvalue. For the sake of reducing expression sizes in this example we will arbitrarily set . For the same reason we will skip right multiplication by an R to achieve triangular Smith form. Hermite form of will suffice, revealing the eigenvalues that are wanted.
[TABLE]
The discriminant of the last entry gives the desired information for the application subject to the denominator validity: . When the matrix is already in Hermite form, so again the desired information is provided.
This example illustrates that often more than three parameters can be easily handled. In experiments with this model not reported here, we did encounter cases demanding solution beyond the methods of this paper. On a more positive note, we feel that comprehensive normal form tools could help analyze models like this when larger in scope, for instance modeling or more subpopulations.
7.4 The Kac-Murdock-Szegö example
[7] report times for computation of the comprehensive Jordan form for matrices of the following form, taken from [29], of dimensions to about :
[TABLE]
This is, apart from the entry and the entry, a Toeplitz matrix containing one parameter, . The reported times to compute the Jordan form were plotted in [7] on a log scale, and looked as though they were exponentially growing with the dimension, and were reported in that paper as growing exponentially.
The results of this paper show instead that polynomial time is possible for this family, because there are only two parameters ( and the eigenvalue parameter, say ). The Hermite forms for these matrices are all (as far as we have computed) trivial, with diagonal all except the final entry which contains the determinant. Thus all the action for the Jordan form must happen with the discriminant of the determinant. Experimentally, the discriminant with respect to has degree for KMS matrices of dimension (this formula was deduced experimentally by giving a sequence of these degrees to the Online Encyclopedia of Integer Sequences [27]) and each discriminant has a factor , leaving a nontrivial factor of degree growing only linearly with dimension. The case does indeed give a derogatory KMS matrix (the identity matrix). The other factor has at most a linearly-growing number of roots for each of which we expect the Jordan form of the corresponding KMS matrix to have one block of size two and the rest of size one. We therefore see only polynomial cost necessary to compute comprehensive Jordan forms for these matrices, in accord with our theorem.
8 Conclusions
We have shown that using the CTSNF to solve parametric linear systems is of cost polynomial in the dimension of the linear system and polynomial in parameter degree, for problems containing up to three parameters. This shows that polynomially many regimes suffice for problems of this type. To the best of our knowledge, this is the first method to prove a polynomial regime bound for parametric linear systems with at most three parameters, certainly in the constructible-regime model considered here.
It remains an open question whether, for linear systems with a fixed number of parameters greater than three, a number of regimes suffices that is polynomial in the input matrix dimension and polynomial degree of the parameters, being exponential only in the number of parameters.
Through experiments with random matrices we have indications that the worst case bounds we give are sharp, though we haven’t proven this point. As the examples indicated, many problems will have fewer regimes, and sometimes substantially fewer regimes. We have not investigated the effects of further restrictions of the type of problem, such as to sparse matrices.
Acknowledgements
This work was supported by the Natural Sciences and Engineering Research Council of Canada and by the Ontario Research Centre for Computer Algebra. The third author, L. Rafiee Sevyeri, would like to thank the Symbolic Computation Group (SCG) at the David R. Cheriton School of Computer Science of the University of Waterloo for their support while she was a visiting researcher there.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Boyer and Kaltofen [2014] Boyer B, Kaltofen EL. Numerical linear system solving with parametric entries by error correction. In: Proceedings of the 2014 Symposium on Symbolic-Numeric Computation; 2014. p. 33–38. 10.1145/2631948.2631956 . · doi ↗
- 2Buchberger [1988] Buchberger B. Applications of Gröbner bases in non-linear computational geometry. In: Janßen R, editor. Trends in Computer Algebra Berlin, Heidelberg: Springer Berlin Heidelberg; 1988. p. 52–80. 10.1007/3-540-18928-9_5 . · doi ↗
- 3Camargos Couto et al. [2020] Camargos Couto AC, Moreno Maza M, Linder D, Jeffrey DJ, Corless RM. Comprehensive LU Factors of Polynomial Matrices. In: Slamanig D, Tsigaridas E, Zafeirakopoulos Z, editors. Mathematical Aspects of Computer and Information Sciences Cham: Springer International Publishing; 2020. p. 80–88. 10.1007/978-3-030-43120-4_8 . · doi ↗
- 4Corless [2002] Corless RM. Essential Maple 7: an introduction for scientific programmers. Springer Science & Business Media; 2002. 10.1007/b 97270 . · doi ↗
- 5Corless et al. [2020] Corless RM, Giesbrecht M, Rafiee Sevyeri L, Saunders BD. On Parametric Linear System Solving. In: Computer Algebra in Scientific Computing Springer; 2020. p. 188–205. ”10.1007/978-3-030-60026-6_11” . · doi ↗
- 6Corless and Jeffrey [1997] Corless RM, Jeffrey DJ. The Turing Factorization of a Rectangular Matrix. SIGSAM Bull. 1997 Sep;31(3):20–30. 10.1145/271130.271135 . · doi ↗
- 7Corless et al. [2017] Corless RM, Moreno Maza M, Thornton SE. Jordan Canonical Form with Parameters from Frobenius Form with Parameters. In: Blömer J, Kotsireas IS, Kutsia T, Simos DE, editors. Mathematical Aspects of Computer and Information Sciences Springer International Publishing; 2017. p. 179–194. 10.1007/978-3-319-72453-9_13 . · doi ↗
- 8Cox et al. [2013] Cox D, Little J, O’Shea D. Ideals, varieties, and algorithms: an introduction to computational algebraic geometry and commutative algebra. Springer Science & Business Media; 2013. 10.1007/978-0-387-35651-8 . · doi ↗
