Computing Canonical Bases of Modules of Univariate Relations
Vincent Neiger
Technical University of DenmarkKgs. LyngbyDenmark
[email protected]
and
Vu Thi Xuan
ENS de Lyon, LIP (CNRS, Inria, ENSL, UCBL)LyonFrance
[email protected]
( 2017)
Abstract.
We study the computation of canonical bases of sets of univariate relations
(p1,…,pm)∈K[x]m such that p1f1+⋯+pmfm=0; here, the input elements f1,…,fm are from a quotient
K[x]n/M, where M is a K[x]-module
of rank n given by a basis M∈K[x]n×n in
Hermite form. We exploit the triangular shape of M to generalize a
divide-and-conquer approach which originates from fast minimal approximant
basis algorithms. Besides recent techniques for this approach, we rely on
high-order lifting to perform fast modular products of polynomial matrices of
the form PFmodM.
Our algorithm uses O\leavevmode ~(mω−1D+nωD/m) operations in
K, where D=deg(det(M)) is the K-vector space
dimension of K[x]n/M, O\leavevmode ~(⋅) indicates that
logarithmic factors are omitted, and ω is the exponent of matrix
multiplication. This had previously only been achieved for a diagonal matrix
M. Furthermore, our algorithm can be used to compute the shifted
Popov form of a nonsingular matrix within the same cost bound, up to
logarithmic factors, as the previously fastest known algorithm, which is
randomized.
Polynomial matrix; shifted Popov form; division with remainder;
univariate equations; syzygy module.
††journalyear: 2017††copyright: acmlicensed††conference: ISSAC ’17; July 25-28, 2017; Kaiserslautern, Germany††price: 15.00††doi: 10.1145/3087604.3087656††isbn: 978-1-4503-5064-8/17/07
1. Introduction
In what follows, K is a field, K[x] denotes the set of univariate
polynomials in x over K, and K[x]m×n denotes
the set of m×n (univariate) polynomial matrices.
Univariate relations
Let us consider a (free) K[x]-submodule M⊆K[x]n of rank n, specified by one of its bases, represented as
the rows of a nonsingular matrix M∈K[x]n×n. Besides, let some
elements f1,…,fm∈K[x]n/M be represented as a
matrix F∈K[x]m×n. Then, the kernel of the module morphism
[TABLE]
consists of relations between the fi’s, and is known as a syzygy
module (Eisenbud, 2005). From the matrix viewpoint above, we write it as
[TABLE]
where the notation A=0modM stands for “A=QM for some Q”, which means that the rows of A
are in the module M. Hereafter, the elements of R(M,F) are called
relations of R(M,F).
Examples of such relations are the following.
Hermite-Padé approximants are relations for n=1 and M=xDK[x]. That is, given polynomials f1,…,fm, the
corresponding approximants are all (p1,…,pm)∈K[x]m such that p1f1+⋯+pmfm=0modxD. Fast algorithms for finding such approximants include
(Van Barel and
Bultheel, 1991; Beckermann and
Labahn, 1994; Giorgi
et al., 2003; Zhou and Labahn, 2012; Jeannerod et al., 2016).
Multipoint Padé approximants: the fast computation of
relations when M is a product of ideals, corresponding to a
diagonal basis M=diag(M1,…,Mn), was studied in
(Beckermann, 1992; Van Barel and
Bultheel, 1992; Beckermann and
Labahn, 1997; Jeannerod et al., 2017, 2016; Neiger, 2016b). Many
of these references focus on M1,…,Mn which split over K
with known roots and multiplicities; then, relations are known as
multipoint Padé approximants (Baker and
Graves-Morris, 1996), or also
interpolants (Beckermann and
Labahn, 1997; Jeannerod et al., 2017). In this case, a relation
can be thought of as a solution to a linear system over K[x] in which
the jth equation is modulo Mj.
Canonical bases
Since det(M)K[x]m⊆R(M,F)⊆K[x]m, the module R(M,F) is free of rank
m (Dummit and Foote, 2004, Sec. 12.1, Thm. 4). Hence, any of its bases can be
represented as the rows of a nonsingular matrix in K[x]m×m, which we
call a relation basis for R(M,F).
Here, we are interested in computing relation bases in shifted Popov
form (Popov, 1972; Beckermann
et al., 1999). Such bases are canonical in terms of the module
R(M,F) and of a shift, the latter being a tuple s∈Zn used as column weights in the notion of degree for row vectors.
Furthermore, the degrees in shifted Popov bases are well controlled, which
helps to compute them faster than less constrained types of bases (see
(Jeannerod et al., 2016) and (Neiger, 2016a, Sec. 1.2.2)) and then, once obtained,
to exploit them for other purposes (see for example (Rosenkilde and
Storjohann, 2016, Thm. 12)).
Having a shifted Popov basis of a submodule M⊆K[x]n
is particularly useful for efficient computations in the quotient
K[x]n/M (see Section 3).
In fact, shifted Popov bases coincide with Gröbner bases for
K[x]-submodules of K[x]n (Eisenbud, 1995, Chap. 15), for a
term-over-position monomial order weighted by the entries of the shift. For
more details about this link, we refer to (Middeke, 2011, Chap. 6) and
(Neiger, 2016a, Chap. 1).
For a shift s=(s1,…,sn)∈Zn, the s-degree of a row vector p=[p1,…,pn]∈K[x]1×n is max1⩽j⩽n(deg(pj)+sj); the s-row degree of a matrix P∈K[x]m×n is rdegs(P)=(d1,…,dm) with di the s-degree of the ith row of
P. Then, the s-leading matrix of P=[pi,j]ij is the matrix lms(P)∈Km×n whose entry (i,j) is the coefficient of degree di−sj of pi,j. Similarly, the list of column degrees of a matrix
P is denoted by cdeg(P).
Definition 1.0 ((Kailath, 1980; Beckermann
et al., 1999)).
Let P∈K[x]m×m be nonsingular, and let s∈Zm. Then, P is said to be in
s-reduced form if lms(P) is
invertible;
s-Popov form if lms(P)
is unit lower triangular and lm0(PT)
is the identity matrix.
Hereafter, when we introduce a matrix by saying that it is reduced, it is
understood that it is nonsingular. Similar forms can be defined for modules
generated by the columns of a matrix rather than by its rows; in the context of
polynomial matrix division with remainder, we will use the notion of P
in column reduced form, meaning that
lm0(PT) is invertible. In particular, we
remark that any matrix in shifted Popov form is also column reduced.
Considering relation bases P for R(M,F) in shifted Popov form
offers a strong control over the degrees of their entries. As shifted (row)
reduced bases, they satisfy the predictable degree property
(Forney, Jr., 1975), which is at the core of the correctness of a
divide-and-conquer approach behind most algorithms for the two specific
situations described above, for example
(Beckermann and
Labahn, 1994; Giorgi
et al., 2003; Giorgi and
Lebreton, 2014; Jeannerod et al., 2017). Furthermore, as column reduced
matrices they have small average column degree, which is central in the
efficiency of fast algorithms for non-uniform shifts
(Jeannerod et al., 2016; Neiger, 2016b). Indeed, we will see in Corollary 2.0 that
[TABLE]
where ∣⋅∣ denotes the sum of the entries of a tuple.
Below, triangular canonical bases will play an important role. A matrix
M∈K[x]n×n is in Hermite form if M is upper
triangular and lm0(MT) is the identity matrix;
or, equivalently, if M is in (dn,d(n−1),…,d)-Popov
form for any d⩾deg(det(M)).
Relations modulo Hermite forms
Our main focus is on the case where M is in Hermite form and F is
already reduced modulo M. In this article, all comparisons of tuples
are componentwise.
Theorem 1.0.
If M is in Hermite form and cdeg(F)<cdeg(M), there is
a deterministic algorithm which solves Problem 1 using
[TABLE]
operations in K, where D=deg(det(M))=∣cdeg(M)∣.
Here, the exponent ω is so that we can multiply m×m
matrices over K in O(mω) operations in K, the
best known bound being ω<2.38 (Coppersmith and
Winograd, 1990; Le Gall, 2014). The
notation O\leavevmode ~(⋅) means that we have omitted the logarithmic factors in
the asymptotic bound.
To put this cost bound in perspective, we note that the representation of the
input F and M requires at most (m+n)D field
elements, while that of the output basis uses at most mD elements.
In many applications we have n∈O(m), in which case the cost
bound becomes O\leavevmode ~(mω−1D), which is satisfactory.
To the best of our knowledge, previous algorithms with a comparable cost bound
focus on the case of a diagonal matrix M.
The case of minimal approximant bases M=xdIn has
concentrated a lot of attention. A first algorithm with cost quasi-linear in
d was given (Beckermann and
Labahn, 1994). It was then improved in
(Giorgi
et al., 2003; Storjohann, 2006; Zhou and Labahn, 2012), obtaining the cost bound
O\leavevmode ~(mω−1nd)=O\leavevmode ~(mω−1D)
under assumptions on the dimensions m and n or on the shift.
In (Jeannerod et al., 2017), the divide-and-conquer approach of (Beckermann and
Labahn, 1994) was
carried over and made efficient in the more general case M=diag(M1,…,Mn), where the polynomials Mi split over K
with known linear factors. This approach was then augmented in
(Jeannerod et al., 2016) with a strategy focusing on degree information to efficiently
compute the shifted Popov bases for arbitrary shifts, achieving the cost bound
O\leavevmode ~(mω−1D).
Then, the case of a diagonal matrix M, with no assumption on the
diagonal entries, was solved within O\leavevmode ~(mω−1D+nωD/m) (Neiger, 2016b). The main new ingredient
developed in (Neiger, 2016b) was an efficient algorithm for the case n=1,
that is, when solving a single linear equation modulo a polynomial; we will
also make use of this algorithm here.
In this paper we obtain the same cost bound as (Neiger, 2016b) for any matrix
M in Hermite form. For a more detailed comparison with earlier
algorithms focusing on diagonal matrices M, we refer the reader to
(Neiger, 2016b, Sec. 1.2) and in particular Table 2 therein.
Our algorithm essentially follows the approach of (Neiger, 2016b). In
particular, it uses the algorithm developed there for n=1. However,
working modulo Hermite forms instead of diagonal matrices makes the computation
of residuals much more involved. The residual is a modular product
PFmodM which is computed after the first recursive call
and is to be used as an input replacing F for the second recursive call.
When M is diagonal, its computation boils down to the multiplication of
P and F, although care has to be taken to account for their
possibly unbalanced column degrees. However, when M is triangular,
computing PFmodM becomes a much greater challenge: we
want to compute a matrix remainder instead of simply taking polynomial
remainders for each column separately. We handle this, while still taking
unbalanced degrees into account, by resorting to high-order lifting
(Storjohann, 2003).
Shifted Popov forms of matrices
A specific instance of Problem 1 yields the following problem: given a
shift s∈Zn and a nonsingular matrix M∈K[x]n×n, compute the s-Popov form of M. Indeed,
the latter is the s-Popov relation basis for
R(M,In) (see Lemma 2.0).
To compute this relation basis efficiently, we start by computing the Hermite
form H of M, which can be done deterministically in
O\leavevmode ~(nω⌈DM/n⌉) operations
(Labahn
et al., 2017). Here, DM is the generic determinant bound
(Gupta et al., 2012); writing M=[aij], it is defined as
[TABLE]
where Sn is the set of permutations of {1,…,n}. In
particular, DM/n is bounded from above by both the average of the
degrees of the columns of M and that of its rows. For more details
about this quantity, we refer to (Gupta et al., 2012, Sec. 6) and
(Labahn
et al., 2017, Sec. 2.3).
Since the rows of H generate the same module as M, we have
R(M,In)=R(H,In)
(see Lemma 2.0). Then, applying our algorithm for relations
modulo H has a cost of O\leavevmode ~(nω−1deg(det(H))) operations, according to Theorem 1.0. This
yields the next result.
Theorem 1.0.
Given a shift s∈Zn and a nonsingular matrix M∈K[x]n×n, there is a deterministic algorithm which computes the
s-Popov form of M using
[TABLE]
operations in K.
A similar cost bound was obtained in (Neiger, 2016b), yet with a randomized
algorithm. The latter follows the approach of (Gupta and
Storjohann, 2011) for computing
Hermite forms, whose first step determines the Smith form S of M
along with a matrix F such that the sought matrix is the s-Popov
relation basis for R(S,F), with S being therefore
a diagonal matrix. Here, relying on the deterministic computation of the
Hermite form of M, our algorithm for relation bases modulo Hermite
forms allows us to circumvent the computation of S, for which the
currently fastest known algorithm is Las Vegas randomized (Storjohann, 2003).
For a more detailed comparison with earlier row reduction and Popov forms
algorithms, we refer to (Neiger, 2016b, Sec. 1.1) and Table 1 therein.
General relation bases
To solve the general case of Problem 1, one can proceed as follows:
find the Hermite form H
of M, using (Labahn
et al., 2017, Algo. 1 and 3);
reduce F modulo H, for example using
Algorithm 1;
apply Algorithm 5 for relations modulo a Hermite form.
Outline
We first give basic properties about matrix division and relation bases
(Section 2). We then focus on the fast computation of residuals
(Section 3). After that, we discuss three situations which have
already been solved efficiently in the literature (Section 4):
when n=1, when information on the output degrees is available, and when
D⩽m. Finally, we present our algorithm for relations modulo
Hermite forms (Section 5).
2. Preliminaries on polynomial matrix division and modules of relations
Division with remainder
Polynomial matrix division is a central notion in this paper, since we aim at
solving equations modulo M.
Theorem 2.0 ((Gantmacher, 1959, IV.§2),(Kailath, 1980, Thm. 6.3-15)).
For any F∈K[x]m×n and any column reduced M∈K[x]n×n,
there exist unique matrices Q,R∈K[x]m×n such that F=QM+R and cdeg(R)<cdeg(M).
Hereafter, we write Quo(F,M) and Rem(F,M) for the
quotient Q and the remainder R. We have the following
properties.
Lemma 2.0.
We have Rem(PRem(F,M),M)=Rem(PF,M) and Rem([FG],M)=[Rem(F,M)Rem(G,M)]
for any F∈K[x]m×n, G∈K[x]∗×n, P∈K[x]∗×m and any column reduced M∈K[x]n×n.
Degree control for relation bases
We first relate the vector
space dimension of quotients and the degree of determinant of bases.
Lemma 2.0.
Let M be a K[x]-submodule of K[x]n of rank n.
Then, the dimension of K[x]n/M as a K-vector space
is deg(det(M)), for any matrix M∈K[x]n×n whose rows
form a basis of M.
Proof.
Since the degree of the determinant is the same for all bases of M,
we may assume that M is column reduced. Then, Theorem 2.0
implies that there is a K-vector space isomorphism
K[x]n/M≅K[x]/(xd1)×⋯×K[x]/(xdn), where (d1,…,dn)=cdeg(M). Thus, the dimension of
K[x]n/M is d1+⋯+dn, which is
equal to deg(det(M)) according to (Kailath, 1980, Sec. 6.3.2).
∎
This allows us to bound the sum of column degrees of any column reduced
relation basis; for example, a shifted Popov relation basis.
Corollary 2.0.
Let F∈K[x]m×n, and let M∈K[x]n×n be nonsingular.
Then, any relation basis P∈K[x]m×m for R(M,F) is such
that deg(det(P))⩽deg(det(M)). In particular, if
P is column reduced, then ∣cdeg(P)∣⩽deg(det(M)).
Proof.
Let M be the row space of M. By definition, R(M,F) is
the kernel of φM,f (see Section 1), hence K[x]m/R(M,F) is isomorphic to a submodule of K[x]n/M.
Since, by Lemma 2.0, the dimensions of K[x]m/R(M,F) and K[x]m/M are deg(det(P)) and
deg(det(M)), we obtain deg(det(P))⩽deg(det(M)).
∎
Properties of relation bases
We now formalize the facts that R(M,F) is not changed if M is
replaced by another basis of the module generated by its rows; or if F and
M are right-multiplied by the same nonsingular matrix; or yet if F
is considered modulo M.
Lemma 2.0.
Let F∈K[x]m×n, and let M∈K[x]n×n be nonsingular.
Then, for any nonsingular A∈K[x]n×n, any matrix B∈K[x]m×n, and any unimodular U∈K[x]m×m, we have
[TABLE]
A first consequence is that we may discard identity columns in M.
Corollary 2.0.
Let F∈K[x]m×n, and let M∈K[x]n×n be nonsingular.
Suppose that M has at least k∈Z>0 identity columns, and that the
corresponding columns of F are zero. Then, let π1,π2
be n×n permutation matrices such that
[TABLE]
where G∈K[x]m×(n−k). Then, R(M,F)=R(N,G).
Another consequence concerns the transformation of a matrix into shifted Popov
form. Indeed, Lemma 2.0 together with the next lemma imply in
particular that the s-Popov form of M is the s-Popov
relation basis for R(H,In), where H
is the Hermite form of M.
Lemma 2.0.
Let M∈K[x]n×n be nonsingular. Then, M is a relation
basis for R(M,In). It follows that the
s-Popov form of M is the s-Popov relation basis for
R(M,Im), for any s∈Zn.
Proof.
Let P∈K[x]n×n be a relation basis for
R(M,In). Then, PIn=QM for some Q∈K[x]n×n; since the rows of M
belong to R(M,In), we also have M=RP for some R∈K[x]n×n. Since P is
nonsingular, P=QRP implies that QR=In, and therefore R is unimodular. Thus, M=RP is a relation basis for
R(M,In).
∎
Divide and conquer approach
Here we give properties in the case of a block triangular matrix M.
They imply, if M is in Hermite form, that Problem 1 can be
solved recursively by splitting the instance in dimension n into two
instances in dimension n/2.
Lemma 2.0.
Let M1∈K[x]n1×n1, M2∈K[x]n2×n2,
and A∈K[x]n1×n2 be such that
\mathbf{{M}}=\big{[}\begin{smallmatrix}\mathbf{{M}}_{1}&\mathbf{{A}}\\
\mathbf{{0}}&\mathbf{{M}}_{2}\end{smallmatrix}\big{]} is column reduced. For any F1∈K[x]m×n1 and F2∈K[x]m×n2,
we have Rem([F1F2],M)=[Rem(F1,M1)Rem(F2−Quo(F1,M1)A,M2)].
Proof.
Writing [F1F2]=[Q1Q2]M+[R1R2] where
cdeg([R1R2])<cdeg(M), we obtain F1=Q1M1+R1 as well as
cdeg(R1)<cdeg(M1), and therefore R1=Rem(F1,M1) and Q1=Quo(F1,M1). The result
follows from F2=Q1A+Q2M2+R2.
∎
Theorem 2.0.
Let \mathbf{{M}}=\big{[}\begin{smallmatrix}\mathbf{{M}}_{1}&\boldsymbol{\ast}\\
\mathbf{{0}}&\mathbf{{M}}_{2}\end{smallmatrix}\big{]} be column reduced, where
M1∈K[x]n1×n1 and M2∈K[x]n2×n2,
and let F1∈K[x]m×n1 and F2∈K[x]m×n2. If P1 is a basis for
R(M1,F1), then
Rem(P1[F1F2],M) has the form [0G]
for some G∈K[x]m×n2; if furthermore P2 is
a basis for R(M2,G), then P2P1 is a
basis for R(M,[F1F2]).
Proof.
It follows from Lemma 2.0 that the first n1 columns of
Rem(P1[F1F2],M) are
Rem(P1F1,M1), which is zero, and that
Rem([0G],M)=[0Rem(G,M2)]. Then,
the first identity in Lemma 2.0 implies both that
R(M,[0G])=R(M2,G)
and that the rows of P2P1 are in
R(M,[F1F2]). Now let
p∈R(M,[F1F2]).
Lemma 2.0 implies that
p∈R(M1,F1), hence p=λP1 for some λ. Then, the first
identity in Lemma 2.0 shows that 0=Rem(λP1[F1F2],M)=Rem(λ[0G],M), and therefore
λ∈R(M2,G). Thus λ=μP2 for some μ, and
p=μP2P1.
∎
3. Computing modular products
In this section, we aim at designing a fast algorithm for the modular products
that arise in our relation basis algorithm.
3.1. Fast division with remainder
For univariate polynomials, fast Euclidean division can be achieved by first
computing the reversed quotient via Newton iteration, and then deducing the
remainder (Gathen and
Gerhard, 2013, Chap. 9). This directly translates into the
context of polynomial matrices, as was noted for example in the proof of
(Giorgi
et al., 2003, Lem. 3.4) or in (Zhou, 2012, Chap. 10).
In the latter reference, it is showed how to efficiently compute remainders
Rem(E,M) for a matrix E as in
Eq. 1 below; this is not general enough for our purpose.
Algorithms for the general case have been studied
(Favati and Lotti, 1991; Zhang and Chen, 1983; Wolovich, 1984; Codenotti and
Lotti, 1989; Wang and Zhou, 1986), but we are not aware of
any that achieves the speed we desire. Thus, as a preliminary to the
computation of residuals in Section 3.2, we now detail
this extension of fast polynomial division to fast polynomial matrix division.
As mentioned above, we will start by computing the quotient. The degrees of its
entries are controlled thanks to the reducedness of the divisor, which ensures
that no high-degree cancellation can occur when multiplying the quotient and
the divisor.
Lemma 3.0.
Let M∈K[x]n×n, F∈K[x]m×n, and δ∈Z>0 be
such that M is column reduced and cdeg(F)<cdeg(M)+(δ,…,δ). Then, deg(Quo(F,M))<δ.
Proof.
First, lm0(MT)T=lm−d(M) where d=cdeg(M)∈Z⩾0n:
the 0-column leading matrix of M is equal to its
−d-row leading matrix. Since M is 0-column reduced,
it is also −d-row reduced.
Thus, by the predictable degree property (Kailath, 1980, Thm. 6.3-13) and
since since rdeg−d(M)=0, we have
rdeg−d(QM)=rdeg0(Q). Here, we
write Q=Quo(F,M) and R=Rem(F,M).
Now, our assumption cdeg(F)<d+(δ,…,δ) and
the fact that cdeg(R)<d imply that cdeg(F−R)<d+(d,…,d), and thus rdeg−d(F−R)<(δ,…,δ). Since F−R=QM, from the
previous paragraph we obtain rdeg0(Q)<(δ,…,δ), hence deg(Q)<δ.
∎
Corollary 3.0.
Let M∈K[x]n×n and F∈K[x]m×n be such that M
is column reduced and cdeg(F)<cdeg(M), and let P∈K[x]k×m. Then, rdeg(Quo(PF,M))<rdeg(P).
Proof.
For the case k=1, the inequality follows from Lemma 3.0
since cdeg(PF)⩽(δ,…,δ)+cdeg(F)<(δ,…,δ)+cdeg(M), where δ=deg(P).
Then, the general case k∈Z>0 follows by considering separately each
row of P.
∎
Going back to the division F=QM+R, to obtain the
reversed quotient we will right-multiply the reversed F by an expansion of
the inverse of the reversed M. This operation is performed efficiently
by means of high-order lifting; we will use the next result.
Lemma 3.0.
Let M∈K[x]n×n with M(0) nonsingular, and let F∈K[x]m×n. Then, defining d=⌈∣cdeg(M)∣/n⌉, the truncated x-adic expansion FM−1modxkd can be computed deterministically using O\leavevmode ~(⌈mk/n⌉nωd) operations in K.
Proof.
This is a minor extension of (Storjohann, 2003, Prop. 15), incorporating the
average column degree of the matrix M instead of the largest degree of
its entries. This can be done by means of partial column linearization
(Gupta et al., 2012, Sec. 6), as follows. One first expands the high-degree
columns of M and inserts elementary rows to obtain a matrix
M∈K[x]n×n such that n⩽n<2n, deg(M)⩽d, and
M−1 is the n×n principal leading submatrix of
M−1 (Gupta et al., 2012, Thm. 10 and Cor. 2). Then,
defining F=[F0]∈K[x]m×n, we have that FM−1 is the
submatrix of FM−1 formed by its first
n columns. Thus, the sought truncated expansion is obtained by computing
FM−1modxkd, which is
done efficiently by (Storjohann, 2003, Alg. 4) with the choice X=xd; this is valid since this polynomial is coprime to
det(M)=det(M) and its degree is at least the degree
of M.
∎
Proposition 3.0.
Algorithm 1* is correct. Assuming that both mδ and
n are in O(D), where D=∣cdeg(M)∣,
this algorithm uses O\leavevmode ~(⌈m/n⌉nω−1D) operations in K.
*
Proof.
Let Q=Quo(F,M), R=Rem(F,M), and
(d1,…,dn)=cdeg(M). We have the bounds
cdeg(F)<(δ+d1,…,δ+dn),
cdeg(R)<(d1,…,dn), and
Lemma 3.0 gives deg(Q)<δ. Thus, we can
define the reversals of these polynomial matrices as
[TABLE]
for which the same degree bounds hold. Then, right-multiplying both sides of
the identity F(x−1)=Q(x−1)M(x−1)+R(x−1) by
diag(xδ+d1−1,…,xδ+dn−1), we
obtain Frev=QrevMrev+xδRrev.
Now, note that the constant term Mrev(0)∈Kn×n is
equal to the column leading matrix of M, which is invertible since
M is column reduced, hence Mrev is invertible (over the
fractions). Thus, since deg(Qrev)<δ, this reversed
quotient matrix can be determined as the truncated expansion Qrev=FrevMrev−1modxδ. This proves the
correctness of the algorithm.
Concerning the cost bound, Step 2 uses O\leavevmode ~(⌈(mδ)/(nd)⌉nωd) operations
according to Lemma 3.0, where d=⌈D/n⌉. We
have by assumption d∈Θ(D/n) as well as
mδ/(nd)∈O(1), so that this cost bound is in
O\leavevmode ~(nω−1D).
In Step 3, we multiply the m×n matrix Q of
degree less than δ with the n×n matrix M such
that ∣cdeg(M)∣=D. First consider the case m⩽n. To perform this product efficiently, we expand the rows of Q
so as to obtain a O(n)×n matrix Q of
degree in O(⌈mδ/n⌉) and such that QM is easily retrieved from QM (see
Section 3.2 for more details about how such row
expansions are carried out). Thus, this product is done in
O\leavevmode ~(nω−1D), since ⌈mδ/n⌉∈O(D/n). On the other hand, if m>n,
we have δ∈O(D/m)⊆O(D/n).
Then, we can compute the product QM via
⌈m/n⌉ products of n×n matrices of degree
O(D/n), which cost each O\leavevmode ~(nω−1D)
operations; hence the total cost O\leavevmode ~(mnω−2D)
when m>n.
∎
3.2. Fast residual computation
Here, we focus on performing modular products Rem(PF,M),
where F∈K[x]m×n and P∈K[x]m×m are such that
cdeg(F)<cdeg(M) and ∣cdeg(P)∣⩽∣cdeg(M)∣, and M∈K[x]n×n is column reduced. The
difficulty in designing a fast algorithm for this operation comes from the
non-uniformity of cdeg(P): in particular, the product PF
cannot be computed within the target cost bound.
To start with, we use the same strategy as in (Jeannerod et al., 2016; Neiger, 2016b): we
make the column degrees of P uniform, at the price of introducing
another, simpler matrix E for which we want to compute
Rem(EF,M).
Let (δ1,…,δm)=cdeg(P), δ=⌈(δ1+⋯+δm)/m⌉⩾1, and for
i∈{1,…,m} write δi=(αi−1)δ+βi
with αi=⌈δi/δ⌉ and 1⩽βi⩽δ if δi>0, and with αi=1 and βi=0 if
δi=0. Then, let m=α1+⋯+αm, and define E∈K[x]m×m as the transpose of
[TABLE]
Define also the expanded column degrees δ∈Z⩾0m as
[TABLE]
Then, we expand the columns of P by considering P∈K[x]m×m such that P=PE and deg(P)⩽δ. (Note that
P can be made unique by specifying more constraints on
cdeg(P).) The aim of this construction is that the dimension
is at most doubled while the degree of the expanded matrix becomes the average
column degree of P. Precisely, m⩽m<2m and
max(δ)=δ=⌈∣cdeg(P)∣/m⌉.
Now, we have Rem(PF,M)=Rem(PEF,M)=Rem(PF,M) by
Lemma 2.0, where F=Rem(EF,M). Thus, Rem(PF,M) can be
obtained by computing first F and then
Rem(PF,M). For the latter, since
P has small degree, one can compute the product and then
perform the division (Steps 3 and 4 of
Algorithm 3). Step 2 of Algorithm 3 efficiently
computes F, relying on Algorithm 2.
Proposition 3.0.
Algorithm 2* is correct. Assuming that both 2kmδ
and n are in O(D), where D=∣cdeg(M)∣,
this algorithm uses O\leavevmode ~((2kmnω−2+knω−1)D) operations in K.*
Proof.
The correctness is a consequence of the two properties in
Lemma 2.0.
Now, if 2kmδ and n are in O(D), the
assumptions in Proposition 3.0 about the input parameters for
PM-QuoRem are always satisfied in recursive calls, since the row
dimension m is doubled while the exponent 2kδ is halved. From
the same proposition, we deduce the cost bound O\leavevmode ~((∑0⩽r⩽k−1⌈2rm/n⌉)nω−1D).
∎
Proposition 3.0.
Algorithm 3* is correct. Assuming that all of
∣cdeg(P)∣, m, and n are in O(D), where
D=∣cdeg(M)∣, this algorithm uses
O\leavevmode ~((mω−1+nω−1)D) operations in
K.*
Proof.
Let us consider E∈K[x]m×m defined
as in Eq. 1 from the parameters δ and
α1,…,αm in Step 1. We claim that the
matrix F computed at Step 2 is equal to
Rem(EF,M). Then, having
cdeg(PF)<cdeg(M)+(δ,…,δ), the correctness of PM-QuoRem implies
R=Rem(PF,M), which is
Rem(PF,M) by Lemma 2.0.
To prove our claim, it is enough to show that, for 1⩽i⩽m, the
ith block Fi of F is the matrix formed by
stacking the remainders involving the row i of F, that is,
(Rem(xrδFi,∗,M))0⩽r<αi. This is clear from the first For loop if αi=1. Otherwise, let k∈Z>0 be such that 2k−1<αi⩽2k.
Then, at the kth iteration of the second loop, we have ij=i for some
1⩽j⩽ℓ. Thus, the correctness of RemOfShifts implies
that, for 0⩽r<2k, the row j of Rr is
Rem(xrδGj,∗,M)=Rem(xrδFi,∗,M). Since 2k⩾αi, this contains the
wanted remainders and the claim follows.
Let us show the cost bound, assuming that ∣cdeg(P)∣, m,
and n are in O(D). Note that this implies mδ∈O(D).
We first study the cost of the iteration k of the second loop of
Step 2. We have that 2k−1ℓ⩽α1+⋯+αm=m⩽2m, the row dimension of G is
ℓ, and k⩽⌈log(maxi(αi))⌉∈O(log(m)). Thus, the call to RemOfShifts costs
O\leavevmode ~((mnω−2+nω−1)D) operations
according to Proposition 3.0, and the same cost bound holds
for the whole Step 2. Concerning Step 4, the cost bound
O\leavevmode ~(⌈m/n⌉nω−1D) follows
directly from Proposition 3.0.
The product at Step 3 involves the m×m
matrix P whose degree is at most δ and the
m×n matrix F such that
cdeg(F)<cdeg(M); we recall that m⩽2m. If n⩾m, we expand the columns of F
similarly to how P was obtained from P: this yields
a m×(⩽2n) matrix of degree at most
⌈D/n⌉, whose left-multiplication by P
directly yields PF by compressing back the
columns. Thus, this product is done in O\leavevmode ~(mω−2nD) operations since both δ and D/n are in
O(D/m) when n⩾m. If m⩾n, we do a
similar column expansion of F, yet into a matrix with
O(m) columns and degree O(D/m); thus, the product
can be performed in O\leavevmode ~(mω−1D) operations in this
case.
∎
4. Fast algorithms in specific cases
Here, we discuss fast solutions to specific instances of Problem 1.
This will be important ingredients of our main algorithm for relations modulo
Hermite forms (Algorithm 5).
4.1. When the input module is an ideal
We first focus on Problem 1 when n=1; this is one of the two base
cases of the recursion in Algorithm 5 (Step 2). In this case,
the input matrix M is a nonzero polynomial M∈K[x]. In other
words, the input module is the ideal (M) of K[x], and we are
looking for the s-Popov basis for the set of relations between m
elements of K[x]/(M). A fast algorithm for this task was given
in (Neiger, 2016b, Sec. 2.2); precisely, the following result is achieved by
running (Neiger, 2016b, Alg. 2) on input M,F,s,2D.
Proposition 4.0.
Assuming n=1 and deg(F)<D=deg(M), there is an
algorithm which solves Problem 1 using
O\leavevmode ~(mω−1D) operations in K.
4.2. When the s-minimal degree is known
Now, we consider Problem 1 with an additional input: the
s-minimal degree of R(M,F), which is the column degree of its
s-Popov basis. This is motivated by a technique from (Jeannerod et al., 2016)
and used in Algorithm 5 to control the degrees of all the bases computed
in the process. Namely, we find this s-minimal degree recursively, and
then we compute the s-Popov relation basis using this knowledge.
The same question was tackled in (Gupta and
Storjohann, 2011, Sec. 3) and
(Neiger, 2016b, Sec. 2.1) for a diagonal matrix M. Here, we extend
this to the case of a column reduced M, relying in particular on the
fast computation of Rem(EF,M) designed in
Section 3.2. We first extend (Neiger, 2016b, Lem. 2.1)
to this more general setting (Lemma 4.0), and then we give the
slightly modified version of (Neiger, 2016b, Alg. 1)
(Algorithm 4).
Lemma 4.0.
Let M∈K[x]n×n be column reduced, let F∈K[x]m×n be
such that cdeg(F)<cdeg(M), let s∈Zm.
Furthermore, let P∈K[x]m×m, and let w∈Zn be such that max(w)⩽min(s). Then,
P is the s-Popov relation basis for R(M,F) if and only if
[PQ] is the u-Popov kernel basis of [FTM]T for some Q∈K[x]m×n and u=(s,w)∈Zm+n. In this case, deg(Q)<deg(P)
and [PQ] has u-pivot index
(1,2,…,m).
Proof.
Let N=[FTM]T. It is easily verified
that P is a relation basis for R(M,F) if and only if there is
some Q∈K[x]m×n such that [PQ] is a kernel basis of N.
Then, for any matrix [PQ]∈K[x]m×(m+n) in the kernel of N, we have
PF=−QM and therefore Corollary 3.0
shows that rdeg(Q)<rdeg(P); since max(w)⩽min(s), this implies rdegw(Q)<rdegs(P). Thus, we have
lmu([PQ])=[lms(P)0], and therefore P is in
s-Popov form if and only if [PQ] is in
u-Popov form with u-pivot index (1,…,m).
∎
Proposition 4.0.
Algorithm 4* is correct, and assuming that m and n
are in O(D), where D=∣cdeg(M)∣, it uses
O\leavevmode ~(mω−1D+nωD/m) operations in
K.*
Proof.
The correctness follows from the material in (Neiger, 2016b, Sec. 2.1) and
(Jeannerod et al., 2016, Sec. 4). Concerning the cost bound, we first note that we
have δ1+⋯+δm⩽D according to
Corollary 2.0. Thus, the cost analysis in Proposition 3.0
shows that Step 2 uses O\leavevmode ~((mnω−2+nω−1)D) operations. (Jeannerod et al., 2016, Thm. 1.4) states
that the approximant basis computation at Step 3 uses
O\leavevmode ~((m+n)ω−1(1+n/m)D) operations,
since the row dimension of the input matrix is m+n⩽2m+n and the sum of the orders is ∣τ∣=∣cdeg(M)∣+n(δ+1)⩽(1+n/m)D.
∎
4.3. Solution based on fast linear algebra
Here, we detail how previous work can be used to handle a base case of the
recursion in Algorithm 5 (Step 1): when the vector space
dimension deg(det(M)) of the input module is small compared to the
number m of input elements. Then, we rely on an interpretation of
Problem 1 as a question of dense linear algebra over K, which is
solved efficiently by (Jeannerod et al., 2017, Alg. 9). This yields the following
result.
Proposition 4.0.
Assuming that M is in shifted Popov form, and that cdeg(F)<cdeg(M), there is an algorithm which solves Problem 1 using
O\leavevmode ~(Dω⌈m/D⌉) operations in
K, where D=deg(det(M)).
This cost bound is O\leavevmode ~(Dω−1m)⊆O\leavevmode ~(mω−1D) when D∈O(m). To see why
relying on fast linear algebra is sufficient to obtain a fast algorithm when
D∈O(m), we note that this implies that the average column
degree of the s-Popov relation basis P is
[TABLE]
For example, if D⩽m, most entries in this basis have degree [math]:
we are essentially dealing with matrices over K. On the other hand, when
m∈O(D), this approach based on linear algebra uses
O\leavevmode ~(Dω) operations, which largely exceeds our target cost.
We now describe how to translate our problem into the K-linear algebra
framework in (Jeannerod et al., 2017). Let M denote the row space of
M; we assume that M has no identity column. In order to
compute in the quotient K[x]n/M, which has finite dimension
D, it is customary to make use of the multiplication matrix of
x with respect to a given monomial basis. Here, since the basis M
of M is in shifted Popov form with column degree
(d1,…,dn)∈Z>0n, Lemma 2.0
suggests to use the monomial basis
[TABLE]
Above, we have represented an element in K[x]n/M by a
polynomial vector f∈K[x]1×n such that
cdeg(f)<(d1,…,dn). In the linear algebra
viewpoint, we rather represent it by a constant vector e∈K1×D, which is formed by the concatenations of the coefficient
vectors of the entries of f. Applying this to each row of the input
matrix F yields a constant matrix E∈Km×D, which is
another representation of the same m elements in the quotient.
Besides, the multiplication matrix X∈KD×D is the matrix
such that eX∈K1×D corresponds to the
remainder in the division of xf by M. Since the basis
M is in shifted Popov form, the computation of X is
straightforward. Indeed, writing M=diag(xd1,…,xdn)−A where A∈K[x]n×n is such that cdeg(A)<(d1,…,dn), then
the row d1+⋯+di−1+j of X is the
unit vector with 1 at index d1+⋯+di−1+j+1, for 1⩽j<di and 1⩽i⩽n,
the row d1+⋯+di of X is the
concatenation of the coefficient vectors of the row i of A, for
1⩽i⩽n.
That is, writing A=[aij]1⩽i,j⩽n and denoting by
{aij(k),0⩽k<dj} the coefficients of aij, the
multiplication matrix X∈KD×D is
[TABLE]
5. Relations modulo Hermite forms
In this section, we give a fast algorithm for solving Problem 1 when
M is in Hermite form; this matrix is denoted by H in what
follows. The cost bound is given under the assumption that H has no
identity column; how to reduce to this case by discarding columns of H
and F was discussed in Corollary 2.0. We recall that
Steps 1, 2, and 3.i have been discussed in
Section 4.
Proposition 5.0.
Algorithm 5* is correct and, assuming the entries cdeg(H) are
positive, it uses O\leavevmode ~(mω−1D+nωD/m) operations in K, where D=∣cdeg(H)∣=deg(det(H)).*
Proof.
Following the recursion in the algorithm, our proof is by induction on
n, with two base cases (Steps 1 and 2).
The correctness and the cost bound for Step 1 follows from the
discussion in Section 4.3, as summarized in
Proposition 4.0. From Section 4.1,
Step 2 correctly computes the s-Popov relation basis and
uses O\leavevmode ~(mω−1D) operations in K.
Now, we focus on the correctness of Step 3, assuming that the two
recursive calls at Steps 3.d and 3.g correctly compute
the shifted Popov relation bases. Since KnownDegreeRelations is
correct, it is enough to prove that the s-minimal degree of R(H,F)
is δ1+δ2; for this, we will show that P2P1 is a relation basis for R(H,F) whose s-Popov form has
column degree δ1+δ2.
From Theorem 2.0, P2P1 is a relation basis
for R(H,F). Furthermore, the fact that the s-Popov form of
P2P1 has column degree δ1+δ2 follows
from (Jeannerod et al., 2016, Sec. 3), since P1 is in s-Popov form
and P2 is in t-Popov form, where t=s+δ1=rdegs(P1).
Concerning the cost of Step 3, we remark that m<D,
that n⩽D is ensured by cdeg(H)>0, and that
δ1+δ2=deg(det(P2P1))⩽D according
to Corollary 2.0. Furthermore, there are two recursive calls with
dimension about n/2, and with H1 and H2 that are in
Hermite form and have determinant degrees D1=deg(det(H1))
and D2=deg(det(H2)) such that D=D1+D2. Besides, the entries of both cdeg(H1) and
cdeg(H2) are all positive.
In particular, the assumptions on the parameters in
Propositions 3.0 and 4.0, concerning the computation of
the residual at Step 3.f and of the relation basis when the degrees
are known at Step 3.i, are satisfied. Thus, these steps use
O\leavevmode ~((mω−1+nω−1)D) and
O\leavevmode ~(mω−1D+nωD/m)
operations, respectively. The announced cost bound follows.
∎
Acknowledgements.
The authors thank Claude-Pierre Jeannerod for interesting discussions, Arne
Storjohann for his helpful comments on high-order lifting, and the reviewers
whose remarks helped to prepare the final version of this paper. The research
leading to these results has received funding from the People Programme
(Marie Curie Actions) of the European Union’s Seventh Framework Programme
(FP7/2007-2013) under REA grant agreement number 609405 (COFUNDPostdocDTU).
Vu Thi Xuan acknowledges financial support provided by the scholarship
Explora Doc from Région Rhône-Alpes, France, and by the
LABEX MILYON (ANR-10-LABX-0070) of Université de Lyon, within the program
Investissements d’Avenir (ANR-11-IDEX-0007) operated by the French
National Research Agency.