Computing Canonical Bases of Modules of Univariate Relations

Vincent Neiger; Thi Xuan Vu

arXiv:1705.10649·cs.SC·May 31, 2017

TL;DR

This paper presents an efficient algorithm for computing canonical bases of modules of univariate relations, generalizing previous methods to non-diagonal modules and improving computational complexity.

Contribution

It introduces a new divide-and-conquer algorithm leveraging high-order lifting for modules with Hermite form matrices, extending prior diagonal matrix results.

Findings

01

Achieves a cost of $\tilde{O}(m^{\omega-1}D + n^{\omega}D/m)$ operations in $\mathbb{K}$ for relations modulo a Hermite form, where $D = \deg(\det(\mathbf{M}))$.

02

Extends previous diagonal matrix algorithms to general Hermite form matrices.

03

Provides a method to compute shifted Popov forms within the same complexity bounds.

Abstract

We study the computation of canonical bases of sets of univariate relations $(p_{1}, \dots, p_{m}) \in \mathbb{K}[x]^{m}$ such that $p_{1} f_{1} + \dots + p_{m} f_{m} = 0$; here, the input elements $f_{1}, \dots, f_{m}$ are from a quotient $\mathbb{K}[x]^{n} / \mathcal{M}$, where $\mathcal{M}$ is a $\mathbb{K}[x]$-module of rank $n$ given by a basis $\mathbf{M} \in \mathbb{K}[x]^{n \times n}$ in Hermite form. We exploit the triangular shape of $\mathbf{M}$ to generalize a divide-and-conquer approach which originates from fast minimal approximant basis algorithms. Besides recent techniques for this approach, we rely on high-order lifting to perform fast modular products of polynomial matrices of the form $\mathbf{P}\mathbf{F} \bmod \mathbf{M}$. Our algorithm uses $\tilde{O}(m^{\omega - 1} D + n^{\omega} D / m)$ operations in $\mathbb{K}$, where $D = \deg(\det(\mathbf{M}))$ is the $\mathbb{K}$-vector space…

Equations

$\begin{array}{rrcl}\varphi_{\mathcal{M},\boldsymbol{f}}:&\mathbb{K}[x]^{m}&\to&\mathbb{K}[x]^{n}/\mathcal{M}\\ &(p_{1},\ldots,p_{m})&\mapsto&p_{1}f_{1}+\cdots+p_{m}f_{m}\end{array}$

$\mathcal{R}(\mathbf{M},\mathbf{F})=\{\mathbf{p}\in\mathbb{K}[x]^{1\times m}\mid\mathbf{p}\mathbf{F}=\mathbf{0}\bmod\mathbf{M}\},$

$|\mathrm{cdeg}(\mathbf{P})|=\deg(\det(\mathbf{P}))\leqslant\deg(\det(\mathbf{M})),$

$\tilde{O}(m^{\omega-1}D+n^{\omega}D/m)$

$D_{\mathbf{M}}=\max_{\pi\in S_{n}}\sum_{1\leqslant i\leqslant n}\max(0,\deg(a_{i,\pi_{i}}))$

$\tilde{O}(n^{\omega}\lceil D_{\mathbf{M}}/n\rceil)\subseteq\tilde{O}(n^{\omega}\deg(\mathbf{M}))$

$\mathcal{R}(\mathbf{M},\mathbf{F})=\mathcal{R}(\mathbf{U}\mathbf{M},\mathbf{F})=\mathcal{R}(\mathbf{M}\mathbf{A},\mathbf{F}\mathbf{A})=\mathcal{R}(\mathbf{M},\mathbf{F}+\mathbf{B}\mathbf{M}).$

$\pi_{1}\mathbf{M}\pi_{2}=\begin{bmatrix}\mathbf{I}_{k}&\mathbf{B}\\ \mathbf{0}&\mathbf{N}\end{bmatrix}\quad\text{and}\quad\mathbf{F}\pi_{2}=\begin{bmatrix}\mathbf{0}&\mathbf{G}\end{bmatrix},$

$M_{\mathrm{rev}}$

$F_{\mathrm{rev}}$

$Q_{\mathrm{rev}}$

$R_{\mathrm{rev}}$

$E^{\mathsf{T}} = \begin{bmatrix} 1 & x^{\delta} & \cdots & x^{(\alpha_{1}-1)\delta} & & & & & \\ & & & & \ddots & & & & \\ & & & & & 1 & x^{\delta} & \cdots & x^{(\alpha_{m}-1)\delta} \end{bmatrix}.$

\overline{δ} = (α_{1} δ, \dots, δ, β_{1}, \dots, α_{m} δ, \dots, δ, β_{m}) .

$|\mathrm{cdeg}(\mathbf{P})| / m = \deg(\det(\mathbf{P})) / m \leqslant D / m \in O(1).$

$\{(x^{i}, 0, \dots, 0),\ 0 \leqslant i < d_{1}\} \cup \dots \cup \{(0, \dots, 0, x^{i}),\ 0 \leqslant i < d_{n}\}.$

a_{11}^{(0)} a_{n 1}^{(0)} 1 a_{11}^{(1)} a_{n 1}^{(1)} ⋱ \dots \dots 1 a_{11}^{(d_{1} - 1)} a_{n 1}^{(d_{1} - 1)} \dots ⋱ \dots a_{1 n}^{(0)} a_{nn}^{(0)} a_{1 n}^{(1)} 1 a_{nn}^{(1)} \dots ⋱ \dots a_{1 n}^{(d_{n} - 1)} 1 a_{nn}^{(d_{n} - 1)} .

Full text

Computing Canonical Bases of Modules of Univariate Relations

Vincent Neiger

Technical University of Denmark, Kgs. Lyngby, Denmark

and

Vu Thi Xuan

ENS de Lyon, LIP (CNRS, Inria, ENSL, UCBL), Lyon, France

(2017)

Abstract.

We study the computation of canonical bases of sets of univariate relations $(p_{1},\ldots,p_{m})\in\mathbb{K}[x]^{m}$ such that $p_{1}f_{1}+\cdots+p_{m}f_{m}=0$ ; here, the input elements $f_{1},\ldots,f_{m}$ are from a quotient $\mathbb{K}[x]^{n}/\mathcal{M}$ , where $\mathcal{M}$ is a $\mathbb{K}[x]$ -module of rank $n$ given by a basis $\mathbf{M}\in\mathbb{K}[x]^{n\times n}$ in Hermite form. We exploit the triangular shape of $\mathbf{M}$ to generalize a divide-and-conquer approach which originates from fast minimal approximant basis algorithms. Besides recent techniques for this approach, we rely on high-order lifting to perform fast modular products of polynomial matrices of the form $\mathbf{P}\mathbf{F}\bmod\mathbf{M}$ .

Our algorithm uses $\tilde{O}(m^{\omega-1}D+n^{\omega}D/m)$ operations in $\mathbb{K}$, where $D=\deg(\det(\mathbf{M}))$ is the $\mathbb{K}$-vector space dimension of $\mathbb{K}[x]^{n}/\mathcal{M}$, $\tilde{O}(\cdot)$ indicates that logarithmic factors are omitted, and $\omega$ is the exponent of matrix multiplication. This had previously only been achieved for a diagonal matrix $\mathbf{M}$. Furthermore, our algorithm can be used to compute the shifted Popov form of a nonsingular matrix within the same cost bound, up to logarithmic factors, as the previously fastest known algorithm, which is randomized.

Polynomial matrix; shifted Popov form; division with remainder; univariate equations; syzygy module.

Published in ISSAC ’17, July 25-28, 2017, Kaiserslautern, Germany (ACM licensed). DOI: 10.1145/3087604.3087656. ISBN: 978-1-4503-5064-8/17/07.

1. Introduction

In what follows, $\mathbb{K}$ is a field, $\mathbb{K}[x]$ denotes the set of univariate polynomials in $x$ over $\mathbb{K}$ , and $\mathbb{K}[x]^{m\times n}$ denotes the set of $m\times n$ (univariate) polynomial matrices.

Univariate relations

Let us consider a (free) $\mathbb{K}[x]$ -submodule $\mathcal{M}\subseteq\mathbb{K}[x]^{n}$ of rank $n$ , specified by one of its bases, represented as the rows of a nonsingular matrix $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ . Besides, let some elements $f_{1},\ldots,f_{m}\in\mathbb{K}[x]^{n}/\mathcal{M}$ be represented as a matrix $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ . Then, the kernel of the module morphism

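$\begin{array}{rrcl}\varphi_{\mathcal{M},\boldsymbol{f}}:&\mathbb{K}[x]^{m}&\to&\mathbb{K}[x]^{n}/\mathcal{M}\\ &(p_{1},\ldots,p_{m})&\mapsto&p_{1}f_{1}+\cdots+p_{m}f_{m}\end{array}$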

consists of relations between the $f_{i}$ ’s, and is known as a syzygy module (Eisenbud, 2005). From the matrix viewpoint above, we write it as

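$\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})=\{\mathbf{p}\in\mathbb{K}[x]^{1\times m}\mid\mathbf{p}\mathbf{F}=\mathbf{0}\bmod\mathbf{M}\},$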

where the notation $\mathbf{{A}}=\mathbf{{0}}\bmod\mathbf{{M}}$ stands for “ $\mathbf{{A}}=\mathbf{{Q}}\mathbf{{M}}$ for some $\mathbf{{Q}}$ ”, which means that the rows of $\mathbf{{A}}$ are in the module $\mathcal{M}$ . Hereafter, the elements of $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ are called relations of $\,\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ .

Examples of such relations are the following.

• Hermite-Padé approximants are relations for $n=1$ and $\mathcal{M}=x^{D}\mathbb{K}[x]$. That is, given polynomials $f_{1},\ldots,f_{m}$, the corresponding approximants are all $(p_{1},\ldots,p_{m})\in\mathbb{K}[x]^{m}$ such that $p_{1}f_{1}+\cdots+p_{m}f_{m}=0\bmod x^{D}$. Fast algorithms for finding such approximants include (Van Barel and Bultheel, 1991; Beckermann and Labahn, 1994; Giorgi et al., 2003; Zhou and Labahn, 2012; Jeannerod et al., 2016).

• Multipoint Padé approximants: the fast computation of relations when $\mathcal{M}$ is a product of ideals, corresponding to a diagonal basis $\mathbf{M}=\mathrm{diag}(M_{1},\ldots,M_{n})$, was studied in (Beckermann, 1992; Van Barel and Bultheel, 1992; Beckermann and Labahn, 1997; Jeannerod et al., 2017, 2016; Neiger, 2016b). Many of these references focus on $M_{1},\ldots,M_{n}$ which split over $\mathbb{K}$ with known roots and multiplicities; then, relations are known as multipoint Padé approximants (Baker and Graves-Morris, 1996), or also interpolants (Beckermann and Labahn, 1997; Jeannerod et al., 2017). In this case, a relation can be thought of as a solution to a linear system over $\mathbb{K}[x]$ in which the $j$th equation is modulo $M_{j}$.
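To make the Hermite-Padé case concrete, the following minimal sketch checks whether a tuple $(p_{1},\ldots,p_{m})$ is a relation modulo $x^{D}$. It is an illustration only: polynomials are assumed to be stored as coefficient lists (index = degree) with integer coefficients, and the helper names `poly_mul`, `poly_add` and `is_relation_mod_xD` are hypothetical, not taken from the paper.

```python
def poly_mul(a, b):
    """Product of two polynomials given as coefficient lists (index = degree)."""
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out


def poly_add(a, b):
    """Sum of two polynomials given as coefficient lists."""
    n = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0) + (b[i] if i < len(b) else 0)
            for i in range(n)]


def is_relation_mod_xD(p, f, D):
    """Check that p_1 f_1 + ... + p_m f_m = 0 mod x^D (n = 1, M = x^D K[x])."""
    total = []
    for pi, fi in zip(p, f):
        total = poly_add(total, poly_mul(pi, fi))
    # Only the coefficients of degree < D matter modulo x^D.
    return all(c == 0 for c in total[:D])


# Illustrative instance: f1 = 1 + x + x^2 + x^3, f2 = 1, and the candidate
# relation (p1, p2) = (1 - x, -1), since (1 - x) f1 - 1 = -x^4.
f = [[1, 1, 1, 1], [1]]
p = [[1, -1], [-1]]
print(is_relation_mod_xD(p, f, 4))
print(is_relation_mod_xD(p, f, 5))
```

The same tuple stops being a relation once the precision $D$ exceeds 4, which is why the module of relations depends on $\mathcal{M}$ and not only on the $f_{i}$.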

Canonical bases

Since $\det(\mathbf{{M}})\mathbb{K}[x]^{m}\subseteq\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})\subseteq\mathbb{K}[x]^{m}$ , the module $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ is free of rank $m$ (Dummit and Foote, 2004, Sec. 12.1, Thm. 4). Hence, any of its bases can be represented as the rows of a nonsingular matrix in $\mathbb{K}[x]^{m\times m}$ , which we call a relation basis for $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ .

Here, we are interested in computing relation bases in shifted Popov form (Popov, 1972; Beckermann et al., 1999). Such bases are canonical in terms of the module $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ and of a shift, the latter being a tuple $\mathbf{s}\in\mathbb{Z}^{n}$ used as column weights in the notion of degree for row vectors. Furthermore, the degrees in shifted Popov bases are well controlled, which helps to compute them faster than less constrained types of bases (see (Jeannerod et al., 2016) and (Neiger, 2016a, Sec. 1.2.2)) and then, once obtained, to exploit them for other purposes (see for example (Rosenkilde and Storjohann, 2016, Thm. 12)). Having a shifted Popov basis of a submodule $\mathcal{M}\subseteq\mathbb{K}[x]^{n}$ is particularly useful for efficient computations in the quotient $\mathbb{K}[x]^{n}/\mathcal{M}$ (see Section 3).

In fact, shifted Popov bases coincide with Gröbner bases for $\mathbb{K}[x]$ -submodules of $\mathbb{K}[x]^{n}$ (Eisenbud, 1995, Chap. 15), for a term-over-position monomial order weighted by the entries of the shift. For more details about this link, we refer to (Middeke, 2011, Chap. 6) and (Neiger, 2016a, Chap. 1).

For a shift $\mathbf{s}=(s_{1},\ldots,s_{n})\in\mathbb{Z}^{n}$ , the $\mathbf{s}$ -degree of a row vector $\mathbf{{p}}=[p_{1},\ldots,p_{n}]\in\mathbb{K}[x]^{1\times n}$ is $\max_{1\leqslant j\leqslant n}(\deg(p_{j})+s_{j})$ ; the $\mathbf{s}$ -row degree of a matrix $\mathbf{{P}}\in\mathbb{K}[x]^{m\times n}$ is $\mathrm{rdeg}_{{\mathbf{s}}}(\mathbf{{P}})=(d_{1},\ldots,d_{m})$ with $d_{i}$ the $\mathbf{s}$ -degree of the $i$ th row of $\mathbf{{P}}$ . Then, the $\mathbf{s}$ -leading matrix of $\mathbf{{P}}=[p_{i,j}]_{ij}$ is the matrix $\mathrm{lm}_{\mathbf{s}}(\mathbf{{P}})\in\mathbb{K}^{m\times n}$ whose entry $(i,j)$ is the coefficient of degree $d_{i}-s_{j}$ of $p_{i,j}$ . Similarly, the list of column degrees of a matrix $\mathbf{{P}}$ is denoted by $\mathrm{cdeg}(\mathbf{{P}})$ .
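The definitions of the $\mathbf{s}$-row degree and of the $\mathbf{s}$-leading matrix can be sketched directly in code. This is a naive illustration under assumptions that are not in the paper: entries are coefficient lists (index = degree), the zero polynomial is the empty list with degree $-\infty$, and the function names are hypothetical.

```python
NEG_INF = float("-inf")


def deg(p):
    """Degree of a coefficient list; the zero polynomial has degree -inf."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return NEG_INF


def shifted_row_degree(P, s):
    """s-row degree: for each row, the max over j of deg(p_j) + s_j."""
    return [max(deg(pij) + sj for pij, sj in zip(row, s)) for row in P]


def shifted_leading_matrix(P, s):
    """s-leading matrix: entry (i, j) is the coefficient of degree d_i - s_j of p_ij."""
    d = shifted_row_degree(P, s)
    lm = []
    for row, di in zip(P, d):
        lm_row = []
        for pij, sj in zip(row, s):
            k = di - sj
            lm_row.append(pij[k] if 0 <= k < len(pij) else 0)
        lm.append(lm_row)
    return lm


# P = [[x, -1], [0, x]] as coefficient lists.
P = [[[0, 1], [-1]], [[], [0, 1]]]
print(shifted_row_degree(P, [0, 0]), shifted_leading_matrix(P, [0, 0]))
print(shifted_row_degree(P, [0, 2]), shifted_leading_matrix(P, [0, 2]))
```

For the shift $(0,0)$ the leading matrix of this $\mathbf{P}$ is the identity, whereas for the shift $(0,2)$ it is singular, so the same matrix can be reduced for one shift and not for another.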

Definition 1.0 ((Kailath, 1980; Beckermann et al., 1999)).

Let $\mathbf{{P}}\in\mathbb{K}[x]^{m\times m}$ be nonsingular, and let $\mathbf{s}\in\mathbb{Z}^{m}$ . Then, $\mathbf{{P}}$ is said to be in

• $\mathbf{s}$-reduced form if $\mathrm{lm}_{\mathbf{s}}(\mathbf{P})$ is invertible;

• $\mathbf{s}$-Popov form if $\mathrm{lm}_{\mathbf{s}}(\mathbf{P})$ is unit lower triangular and $\mathrm{lm}_{\mathbf{0}}(\mathbf{P}^{\mathsf{T}})$ is the identity matrix.

Hereafter, when we introduce a matrix by saying that it is reduced, it is understood that it is nonsingular. Similar forms can be defined for modules generated by the columns of a matrix rather than by its rows; in the context of polynomial matrix division with remainder, we will use the notion of $\mathbf{{P}}$ in column reduced form, meaning that $\mathrm{lm}_{\mathbf{0}}(\mathbf{{P}}^{\mathsf{T}})$ is invertible. In particular, we remark that any matrix in shifted Popov form is also column reduced.

Considering relation bases $\mathbf{{P}}$ for $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ in shifted Popov form offers a strong control over the degrees of their entries. As shifted (row) reduced bases, they satisfy the predictable degree property (Forney, Jr., 1975), which is at the core of the correctness of a divide-and-conquer approach behind most algorithms for the two specific situations described above, for example (Beckermann and Labahn, 1994; Giorgi et al., 2003; Giorgi and Lebreton, 2014; Jeannerod et al., 2017). Furthermore, as column reduced matrices they have small average column degree, which is central in the efficiency of fast algorithms for non-uniform shifts (Jeannerod et al., 2016; Neiger, 2016b). Indeed, we will see in Corollary 2.0 that

$|\mathrm{cdeg}(\mathbf{P})|=\deg(\det(\mathbf{P}))\leqslant\deg(\det(\mathbf{M})),$

where $|\cdot|$ denotes the sum of the entries of a tuple.
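As a small hand-checked illustration of this degree bound, consider the following instance with $n=1$, $m=2$ and the zero shift. The instance is chosen for this note and does not appear in the paper; here both $\mathrm{lm}_{\mathbf{0}}(\mathbf{P})$ and $\mathrm{lm}_{\mathbf{0}}(\mathbf{P}^{\mathsf{T}})$ are the identity, so $\mathbf{P}$ is in $\mathbf{0}$-Popov form, and the bound is attained.

```latex
% Illustrative instance (not from the paper): n = 1, m = 2, shift s = (0, 0).
\[
  \mathbf{M} = \begin{bmatrix} x^{2} \end{bmatrix}, \qquad
  \mathbf{F} = \begin{bmatrix} 1 \\ x \end{bmatrix}, \qquad
  \mathbf{P} = \begin{bmatrix} x & -1 \\ 0 & x \end{bmatrix}.
\]
% The rows of P are relations:
\[
  \mathbf{P}\mathbf{F}
  = \begin{bmatrix} x - x \\ x \cdot x \end{bmatrix}
  = \begin{bmatrix} 0 \\ x^{2} \end{bmatrix}
  = \mathbf{0} \bmod \mathbf{M}.
\]
% Degree control: the sum of the column degrees equals deg(det(P)).
\[
  |\mathrm{cdeg}(\mathbf{P})| = 1 + 1 = 2
  = \deg(\det(\mathbf{P})) = \deg(x^{2})
  = \deg(\det(\mathbf{M})).
\]
```

Since the map $(p_{1},p_{2})\mapsto p_{1}+p_{2}x \bmod x^{2}$ is onto $\mathbb{K}[x]/(x^{2})$, the quotient of $\mathbb{K}[x]^{2}$ by the relation module has dimension 2, which matches $\deg(\det(\mathbf{P}))$ and confirms that the rows of $\mathbf{P}$ generate all relations in this example.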

Below, triangular canonical bases will play an important role. A matrix $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ is in Hermite form if $\mathbf{{M}}$ is upper triangular and $\mathrm{lm}_{\mathbf{0}}(\mathbf{{M}}^{\mathsf{T}})$ is the identity matrix; or, equivalently, if $\mathbf{{M}}$ is in $(dn,d(n-1),\ldots,d)$ -Popov form for any $d\geqslant\deg(\det(\mathbf{{M}}))$ .
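The first characterization of the Hermite form can be turned into a direct test: the matrix must be upper triangular, each diagonal entry must be monic, and every entry above the diagonal must have degree strictly smaller than the diagonal entry of its column. The sketch below assumes entries stored as coefficient lists (index = degree); the name `is_hermite` is hypothetical.

```python
NEG_INF = float("-inf")


def deg(p):
    """Degree of a coefficient list; the zero polynomial has degree -inf."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return NEG_INF


def is_hermite(M):
    """Test: upper triangular and lm_0(M^T) equal to the identity matrix."""
    n = len(M)
    for j in range(n):
        dj = deg(M[j][j])
        # The diagonal entry must be nonzero and monic.
        if dj < 0 or M[j][j][dj] != 1:
            return False
        for i in range(n):
            # Entries below the diagonal must be zero.
            if i > j and deg(M[i][j]) >= 0:
                return False
            # Entries above the diagonal must have degree < deg(M[j][j]).
            if i < j and deg(M[i][j]) >= dj:
                return False
    return True


# [[x, 1], [0, x]] is in Hermite form; [[x, x], [0, x]] is not, because the
# entry above the diagonal in column 2 has the same degree as the diagonal.
M_ok = [[[0, 1], [1]], [[], [0, 1]]]
M_bad = [[[0, 1], [0, 1]], [[], [0, 1]]]
print(is_hermite(M_ok), is_hermite(M_bad))
```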

Relations modulo Hermite forms

Our main focus is on the case where $\mathbf{{M}}$ is in Hermite form and $\mathbf{{F}}$ is already reduced modulo $\mathbf{{M}}$ . In this article, all comparisons of tuples are componentwise.

Theorem 1.0.

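If $\mathbf{M}$ is in Hermite form and $\mathrm{cdeg}(\mathbf{F})<\mathrm{cdeg}(\mathbf{M})$, there is a deterministic algorithm which solves Problem 1, that is, computes the $\mathbf{s}$-Popov relation basis for $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})$ for a given shift $\mathbf{s}\in\mathbb{Z}^{m}$, using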

$\tilde{O}(m^{\omega-1}D+n^{\omega}D/m)$

operations in $\mathbb{K}$ , where $D=\deg(\det(\mathbf{{M}}))=|\mathrm{cdeg}(\mathbf{{M}})|$ .

Here, the exponent $\omega$ is such that we can multiply $m\times m$ matrices over $\mathbb{K}$ in $O(m^{\omega})$ operations in $\mathbb{K}$, the best known bound being $\omega<2.38$ (Coppersmith and Winograd, 1990; Le Gall, 2014). The notation $\tilde{O}(\cdot)$ means that we have omitted the logarithmic factors in the asymptotic bound.

To put this cost bound in perspective, we note that the representation of the input $\mathbf{F}$ and $\mathbf{M}$ requires at most $(m+n)D$ field elements, while that of the output basis uses at most $mD$ elements. In many applications we have $n\in O(m)$, in which case the cost bound becomes $\tilde{O}(m^{\omega-1}D)$, which is satisfactory: it exceeds the output size bound $mD$ only by a factor $m^{\omega-2}$, up to logarithmic factors.

To the best of our knowledge, previous algorithms with a comparable cost bound focus on the case of a diagonal matrix $\mathbf{{M}}$ .

The case of minimal approximant bases $\mathbf{M}=x^{d}\mathbf{I}_{n}$ has received a lot of attention. A first algorithm with cost quasi-linear in $d$ was given in (Beckermann and Labahn, 1994). It was then improved in (Giorgi et al., 2003; Storjohann, 2006; Zhou and Labahn, 2012), obtaining the cost bound $\tilde{O}(m^{\omega-1}nd)=\tilde{O}(m^{\omega-1}D)$ under assumptions on the dimensions $m$ and $n$ or on the shift.

In (Jeannerod et al., 2017), the divide-and-conquer approach of (Beckermann and Labahn, 1994) was carried over and made efficient in the more general case $\mathbf{M}=\mathrm{diag}(M_{1},\ldots,M_{n})$, where the polynomials $M_{i}$ split over $\mathbb{K}$ with known linear factors. This approach was then augmented in (Jeannerod et al., 2016) with a strategy focusing on degree information to efficiently compute the shifted Popov bases for arbitrary shifts, achieving the cost bound $\tilde{O}(m^{\omega-1}D)$.

Then, the case of a diagonal matrix $\mathbf{M}$, with no assumption on the diagonal entries, was solved within $\tilde{O}(m^{\omega-1}D+n^{\omega}D/m)$ (Neiger, 2016b). The main new ingredient developed in (Neiger, 2016b) was an efficient algorithm for the case $n=1$, that is, when solving a single linear equation modulo a polynomial; we will also make use of this algorithm here.

In this paper we obtain the same cost bound as (Neiger, 2016b) for any matrix $\mathbf{{M}}$ in Hermite form. For a more detailed comparison with earlier algorithms focusing on diagonal matrices $\mathbf{{M}}$ , we refer the reader to (Neiger, 2016b, Sec. 1.2) and in particular Table 2 therein.

Our algorithm essentially follows the approach of (Neiger, 2016b). In particular, it uses the algorithm developed there for $n=1$ . However, working modulo Hermite forms instead of diagonal matrices makes the computation of residuals much more involved. The residual is a modular product $\mathbf{{P}}\mathbf{{F}}\bmod\mathbf{{M}}$ which is computed after the first recursive call and is to be used as an input replacing $\mathbf{{F}}$ for the second recursive call. When $\mathbf{{M}}$ is diagonal, its computation boils down to the multiplication of $\mathbf{{P}}$ and $\mathbf{{F}}$ , although care has to be taken to account for their possibly unbalanced column degrees. However, when $\mathbf{{M}}$ is triangular, computing $\mathbf{{P}}\mathbf{{F}}\bmod\mathbf{{M}}$ becomes a much greater challenge: we want to compute a matrix remainder instead of simply taking polynomial remainders for each column separately. We handle this, while still taking unbalanced degrees into account, by resorting to high-order lifting (Storjohann, 2003).

Shifted Popov forms of matrices

A specific instance of Problem 1 yields the following problem: given a shift $\mathbf{s}\in\mathbb{Z}^{n}$ and a nonsingular matrix $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ , compute the $\mathbf{s}$ -Popov form of $\mathbf{{M}}$ . Indeed, the latter is the $\mathbf{s}$ -Popov relation basis for $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{I}}_{n})$ (see Lemma 2.0).

To compute this relation basis efficiently, we start by computing the Hermite form $\mathbf{H}$ of $\mathbf{M}$, which can be done deterministically in $\tilde{O}(n^{\omega}\lceil D_{\mathbf{M}}/n\rceil)$ operations (Labahn et al., 2017). Here, $D_{\mathbf{M}}$ is the generic determinant bound (Gupta et al., 2012); writing $\mathbf{M}=[a_{ij}]$, it is defined as

$D_{\mathbf{M}}=\max_{\pi\in S_{n}}\sum_{1\leqslant i\leqslant n}\max(0,\deg(a_{i,\pi_{i}})),$

where $S_{n}$ is the set of permutations of $\{1,\ldots,n\}$ . In particular, $D_{\mathbf{{M}}}/n$ is bounded from above by both the average of the degrees of the columns of $\mathbf{{M}}$ and that of its rows. For more details about this quantity, we refer to (Gupta et al., 2012, Sec. 6) and (Labahn et al., 2017, Sec. 2.3).
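The definition of the generic determinant bound can be evaluated by brute force on small matrices. The sketch below enumerates all permutations, so it is only meant to illustrate the formula (its cost is factorial in $n$); the coefficient-list representation and the name `generic_determinant_bound` are assumptions made for this example.

```python
from itertools import permutations

NEG_INF = float("-inf")


def deg(p):
    """Degree of a coefficient list; the zero polynomial has degree -inf."""
    for k in range(len(p) - 1, -1, -1):
        if p[k] != 0:
            return k
    return NEG_INF


def generic_determinant_bound(M):
    """D_M = max over permutations pi of sum_i max(0, deg(a_{i, pi(i)}))."""
    n = len(M)
    return max(
        sum(max(0, deg(M[i][pi[i]])) for i in range(n))
        for pi in permutations(range(n))
    )


# M = [[x^3, 1], [x, 0]]: the identity permutation gives 3 + 0 and the swap
# gives 0 + 1, so D_M = 3, while det(M) = -x has degree 1.
M = [[[0, 0, 0, 1], [1]], [[0, 1], []]]
print(generic_determinant_bound(M))
```

In this example $D_{\mathbf{M}}/n = 3/2$, which agrees with the average column degree $(3+0)/2$ and is below the average row degree $(3+1)/2$, consistent with the bounds stated above.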

Since the rows of $\mathbf{H}$ generate the same module as $\mathbf{M}$, we have $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{I}_{n})=\operatorname{\mathcal{R}}(\mathbf{H},\mathbf{I}_{n})$ (see Lemma 2.0). Then, applying our algorithm for relations modulo $\mathbf{H}$ has a cost of $\tilde{O}(n^{\omega-1}\deg(\det(\mathbf{H})))$ operations, according to Theorem 1.0. This yields the next result.

Theorem 1.0.

Given a shift $\mathbf{s}\in\mathbb{Z}^{n}$ and a nonsingular matrix $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ , there is a deterministic algorithm which computes the $\mathbf{s}$ -Popov form of $\mathbf{{M}}$ using

$\tilde{O}(n^{\omega}\lceil D_{\mathbf{M}}/n\rceil)\subseteq\tilde{O}(n^{\omega}\deg(\mathbf{M}))$

operations in $\mathbb{K}$ .

A similar cost bound was obtained in (Neiger, 2016b), yet with a randomized algorithm. The latter follows the approach of (Gupta and Storjohann, 2011) for computing Hermite forms, whose first step determines the Smith form $\mathbf{{S}}$ of $\mathbf{{M}}$ along with a matrix $\mathbf{{F}}$ such that the sought matrix is the $\mathbf{s}$ -Popov relation basis for $\operatorname{\mathcal{R}}(\mathbf{{S}},\mathbf{{F}})$ , with $\mathbf{{S}}$ being therefore a diagonal matrix. Here, relying on the deterministic computation of the Hermite form of $\mathbf{{M}}$ , our algorithm for relation bases modulo Hermite forms allows us to circumvent the computation of $\mathbf{{S}}$ , for which the currently fastest known algorithm is Las Vegas randomized (Storjohann, 2003). For a more detailed comparison with earlier row reduction and Popov forms algorithms, we refer to (Neiger, 2016b, Sec. 1.1) and Table 1 therein.

General relation bases

To solve the general case of Problem 1, one can proceed as follows:

• find the Hermite form $\mathbf{H}$ of $\mathbf{M}$, using (Labahn et al., 2017, Algo. 1 and 3);

• reduce $\mathbf{F}$ modulo $\mathbf{H}$, for example using Algorithm 1;

• apply Algorithm 5 for relations modulo a Hermite form.

Outline

We first give basic properties about matrix division and relation bases (Section 2). We then focus on the fast computation of residuals (Section 3). After that, we discuss three situations which have already been solved efficiently in the literature (Section 4): when $n=1$ , when information on the output degrees is available, and when $D\leqslant m$ . Finally, we present our algorithm for relations modulo Hermite forms (Section 5).

2. Preliminaries on polynomial matrix division and modules of relations

Division with remainder

Polynomial matrix division is a central notion in this paper, since we aim at solving equations modulo $\mathbf{{M}}$ .

Theorem 2.0 ((Gantmacher, 1959, IV.§2),(Kailath, 1980, Thm. 6.3-15)).

For any $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ and any column reduced $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ , there exist unique matrices $\mathbf{{Q}},\mathbf{{R}}\in\mathbb{K}[x]^{m\times n}$ such that $\mathbf{{F}}=\mathbf{{Q}}\mathbf{{M}}+\mathbf{{R}}$ and $\mathrm{cdeg}(\mathbf{{R}})<\mathrm{cdeg}(\mathbf{{M}})$ .

Hereafter, we write $\operatorname{Quo}(\mathbf{F},\mathbf{M})$ and $\operatorname{Rem}(\mathbf{F},\mathbf{M})$ for the quotient $\mathbf{Q}$ and the remainder $\mathbf{R}$. We have the following properties.

Lemma 2.0.

We have $\operatorname{Rem}(\mathbf{P}\operatorname{Rem}(\mathbf{F},\mathbf{M}),\mathbf{M})=\operatorname{Rem}(\mathbf{P}\mathbf{F},\mathbf{M})$ and $\displaystyle\operatorname{Rem}\left(\begin{bmatrix}\mathbf{F}\\ \mathbf{G}\end{bmatrix},\mathbf{M}\right)=\begin{bmatrix}\operatorname{Rem}(\mathbf{F},\mathbf{M})\\ \operatorname{Rem}(\mathbf{G},\mathbf{M})\end{bmatrix}$ for any $\mathbf{F}\in\mathbb{K}[x]^{m\times n}$, $\mathbf{G}\in\mathbb{K}[x]^{\ast\times n}$, $\mathbf{P}\in\mathbb{K}[x]^{\ast\times m}$ and any column reduced $\mathbf{M}\in\mathbb{K}[x]^{n\times n}$.

Degree control for relation bases

We first relate the vector space dimension of quotients and the degree of determinant of bases.

Lemma 2.0.

Let $\mathcal{M}$ be a $\mathbb{K}[x]$ -submodule of $\mathbb{K}[x]^{n}$ of rank $n$ . Then, the dimension of $\mathbb{K}[x]^{n}/\mathcal{M}$ as a $\mathbb{K}$ -vector space is $\deg(\det(\mathbf{{M}}))$ , for any matrix $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ whose rows form a basis of $\mathcal{M}$ .

Proof.

Since the degree of the determinant is the same for all bases of $\mathcal{M}$ , we may assume that $\mathbf{{M}}$ is column reduced. Then, Theorem 2.0 implies that there is a $\mathbb{K}$ -vector space isomorphism $\mathbb{K}[x]^{n}/\mathcal{M}\cong\mathbb{K}[x]/(x^{d_{1}})\times\cdots\times\mathbb{K}[x]/(x^{d_{n}})$ , where $(d_{1},\ldots,d_{n})=\mathrm{cdeg}(\mathbf{{M}})$ . Thus, the dimension of $\mathbb{K}[x]^{n}/\mathcal{M}$ is $d_{1}+\cdots+d_{n}$ , which is equal to $\deg(\det(\mathbf{{M}}))$ according to (Kailath, 1980, Sec. 6.3.2). ∎

This allows us to bound the sum of column degrees of any column reduced relation basis; for example, a shifted Popov relation basis.

Corollary 2.0.

Let $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ , and let $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ be nonsingular. Then, any relation basis $\mathbf{{P}}\in\mathbb{K}[x]^{m\times m}$ for $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ is such that $\deg(\det(\mathbf{{P}}))\leqslant\deg(\det(\mathbf{{M}}))$ . In particular, if $\mathbf{{P}}$ is column reduced, then $|\mathrm{cdeg}(\mathbf{{P}})|\leqslant\deg(\det(\mathbf{{M}}))$ .

Proof.

Let $\mathcal{M}$ be the row space of $\mathbf{M}$. By definition, $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})$ is the kernel of $\varphi_{\mathcal{M},\boldsymbol{f}}$ (see Section 1), hence $\mathbb{K}[x]^{m}/\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})$ is isomorphic to a submodule of $\mathbb{K}[x]^{n}/\mathcal{M}$. Since, by Lemma 2.0, the dimensions of $\mathbb{K}[x]^{m}/\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})$ and $\mathbb{K}[x]^{n}/\mathcal{M}$ as $\mathbb{K}$-vector spaces are $\deg(\det(\mathbf{P}))$ and $\deg(\det(\mathbf{M}))$, we obtain $\deg(\det(\mathbf{P}))\leqslant\deg(\det(\mathbf{M}))$. ∎

Properties of relation bases

We now formalize the facts that $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ is not changed if $\mathbf{{M}}$ is replaced by another basis of the module generated by its rows; or if $\mathbf{{F}}$ and $\mathbf{{M}}$ are right-multiplied by the same nonsingular matrix; or yet if $\mathbf{{F}}$ is considered modulo $\mathbf{{M}}$ .

Lemma 2.0.

Let $\mathbf{F}\in\mathbb{K}[x]^{m\times n}$, and let $\mathbf{M}\in\mathbb{K}[x]^{n\times n}$ be nonsingular. Then, for any nonsingular $\mathbf{A}\in\mathbb{K}[x]^{n\times n}$, any matrix $\mathbf{B}\in\mathbb{K}[x]^{m\times n}$, and any unimodular $\mathbf{U}\in\mathbb{K}[x]^{n\times n}$, we have

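$\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})=\operatorname{\mathcal{R}}(\mathbf{U}\mathbf{M},\mathbf{F})=\operatorname{\mathcal{R}}(\mathbf{M}\mathbf{A},\mathbf{F}\mathbf{A})=\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F}+\mathbf{B}\mathbf{M}).$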

A first consequence is that we may discard identity columns in $\mathbf{{M}}$ .

Corollary 2.0.

Let $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ , and let $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ be nonsingular. Suppose that $\mathbf{{M}}$ has at least $k\in\mathbb{Z}_{>0}$ identity columns, and that the corresponding columns of $\mathbf{{F}}$ are zero. Then, let $\pi_{1},\pi_{2}$ be $n\times n$ permutation matrices such that

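$\pi_{1}\mathbf{M}\pi_{2}=\begin{bmatrix}\mathbf{I}_{k}&\mathbf{B}\\ \mathbf{0}&\mathbf{N}\end{bmatrix}\quad\text{and}\quad\mathbf{F}\pi_{2}=\begin{bmatrix}\mathbf{0}&\mathbf{G}\end{bmatrix},$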

where $\mathbf{{G}}\in\mathbb{K}[x]^{m\times(n-k)}$ . Then, $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})=\operatorname{\mathcal{R}}(\mathbf{{N}},\mathbf{{G}})$ .

Another consequence concerns the transformation of a matrix into shifted Popov form. Indeed, Lemma 2.0 together with the next lemma imply in particular that the $\mathbf{s}$ -Popov form of $\mathbf{{M}}$ is the $\mathbf{s}$ -Popov relation basis for $\operatorname{\mathcal{R}}(\mathbf{{H}},\mathbf{{I}}_{n})$ , where $\mathbf{{H}}$ is the Hermite form of $\mathbf{{M}}$ .

Lemma 2.0.

Let $\mathbf{M}\in\mathbb{K}[x]^{n\times n}$ be nonsingular. Then, $\mathbf{M}$ is a relation basis for $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{I}_{n})$. It follows that the $\mathbf{s}$-Popov form of $\mathbf{M}$ is the $\mathbf{s}$-Popov relation basis for $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{I}_{n})$, for any $\mathbf{s}\in\mathbb{Z}^{n}$.

Proof.

Let $\mathbf{{P}}\in\mathbb{K}[x]^{n\times n}$ be a relation basis for $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{I}}_{n})$ . Then, $\mathbf{{P}}\mathbf{{I}}_{n}=\mathbf{{Q}}\mathbf{{M}}$ for some $\mathbf{{Q}}\in\mathbb{K}[x]^{n\times n}$ ; since the rows of $\mathbf{{M}}$ belong to $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{I}}_{n})$ , we also have $\mathbf{{M}}=\mathbf{{R}}\mathbf{{P}}$ for some $\mathbf{{R}}\in\mathbb{K}[x]^{n\times n}$ . Since $\mathbf{{P}}$ is nonsingular, $\mathbf{{P}}=\mathbf{{Q}}\mathbf{{R}}\mathbf{{P}}$ implies that $\mathbf{{Q}}\mathbf{{R}}={\mathbf{{I}}_{n}}$ , and therefore $\mathbf{{R}}$ is unimodular. Thus, $\mathbf{{M}}=\mathbf{{R}}\mathbf{{P}}$ is a relation basis for $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{I}}_{n})$ . ∎

Divide and conquer approach

Here we give properties in the case of a block triangular matrix $\mathbf{{M}}$ . They imply, if $\mathbf{{M}}$ is in Hermite form, that Problem 1 can be solved recursively by splitting the instance in dimension $n$ into two instances in dimension $n/2$ .

Lemma 2.0.

Let $\mathbf{M}_{1}\in\mathbb{K}[x]^{n_{1}\times n_{1}}$, $\mathbf{M}_{2}\in\mathbb{K}[x]^{n_{2}\times n_{2}}$, and $\mathbf{A}\in\mathbb{K}[x]^{n_{1}\times n_{2}}$ be such that $\mathbf{M}=\big[\begin{smallmatrix}\mathbf{M}_{1}&\mathbf{A}\\ \mathbf{0}&\mathbf{M}_{2}\end{smallmatrix}\big]$ is column reduced. For any $\mathbf{F}_{1}\in\mathbb{K}[x]^{m\times n_{1}}$ and $\mathbf{F}_{2}\in\mathbb{K}[x]^{m\times n_{2}}$, we have $\operatorname{Rem}([\mathbf{F}_{1}\;\;\mathbf{F}_{2}],\mathbf{M})=[\operatorname{Rem}(\mathbf{F}_{1},\mathbf{M}_{1})\;\;\operatorname{Rem}(\mathbf{F}_{2}-\operatorname{Quo}(\mathbf{F}_{1},\mathbf{M}_{1})\mathbf{A},\mathbf{M}_{2})]$.

Proof.

Writing $[\mathbf{F}_{1}\;\;\mathbf{F}_{2}]=[\mathbf{Q}_{1}\;\;\mathbf{Q}_{2}]\mathbf{M}+[\mathbf{R}_{1}\;\;\mathbf{R}_{2}]$ where $\mathrm{cdeg}([\mathbf{R}_{1}\;\;\mathbf{R}_{2}])<\mathrm{cdeg}(\mathbf{M})$, we obtain $\mathbf{F}_{1}=\mathbf{Q}_{1}\mathbf{M}_{1}+\mathbf{R}_{1}$ as well as $\mathrm{cdeg}(\mathbf{R}_{1})<\mathrm{cdeg}(\mathbf{M}_{1})$, and therefore $\mathbf{R}_{1}=\operatorname{Rem}(\mathbf{F}_{1},\mathbf{M}_{1})$ and $\mathbf{Q}_{1}=\operatorname{Quo}(\mathbf{F}_{1},\mathbf{M}_{1})$. The result follows from $\mathbf{F}_{2}=\mathbf{Q}_{1}\mathbf{A}+\mathbf{Q}_{2}\mathbf{M}_{2}+\mathbf{R}_{2}$. ∎
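As a small sanity check of this identity, the following sketch treats the case $m=n_{1}=n_{2}=1$ over the prime field $\mathbb{F}_{101}$ and computes the remainder block by block. The field, the coefficient-list representation (low degree first), and the helper names `divmod_poly` and `rem_block` are assumptions made for this illustration only; the off-diagonal entry is chosen with $\deg(A)<\deg(M_{2})$ so that $\mathbf{M}$ is column reduced.

```python
P = 101  # prime modulus: coefficients live in GF(P) (illustrative choice)

def trim(a):
    """Reduce coefficients mod P and drop trailing zeros (low degree first)."""
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def add(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])

def sub(a, b):
    return add(a, [-c for c in b])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def divmod_poly(a, b):
    """Schoolbook Euclidean division: returns (quotient, remainder)."""
    a, b = trim(a), trim(b)
    inv = pow(b[-1], -1, P)
    q = [0] * max(len(a) - len(b) + 1, 0)
    r = a[:]
    while len(r) >= len(b):
        c, s = r[-1] * inv % P, len(r) - len(b)
        q[s] = c
        for i, y in enumerate(b):
            r[s + i] = (r[s + i] - c * y) % P
        r = trim(r)
    return trim(q), r

def rem_block(F1, F2, M1, A, M2):
    """Quotient and remainder of the row [F1 F2] modulo [[M1, A], [0, M2]]."""
    Q1, R1 = divmod_poly(F1, M1)                    # first block: plain division
    Q2, R2 = divmod_poly(sub(F2, mul(Q1, A)), M2)   # second block: corrected by Q1*A
    return (Q1, Q2), (R1, R2)

M1, A, M2 = [1, 0, 1], [3, 1], [5, 2, 0, 1]   # x^2+1, x+3, x^3+2x+5
F1, F2 = [1, 2, 3, 4, 5], [7, 0, 1, 0, 0, 2]
(Q1, Q2), (R1, R2) = rem_block(F1, F2, M1, A, M2)
```

By construction $[F_{1}\;\;F_{2}]-[R_{1}\;\;R_{2}]=[Q_{1}\;\;Q_{2}]\mathbf{M}$, and each remainder has degree below that of the corresponding diagonal entry.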

Theorem 2.0.

Let $\mathbf{M}=\big[\begin{smallmatrix}\mathbf{M}_{1}&\boldsymbol{\ast}\\ \mathbf{0}&\mathbf{M}_{2}\end{smallmatrix}\big]$ be column reduced, where $\mathbf{M}_{1}\in\mathbb{K}[x]^{n_{1}\times n_{1}}$ and $\mathbf{M}_{2}\in\mathbb{K}[x]^{n_{2}\times n_{2}}$, and let $\mathbf{F}_{1}\in\mathbb{K}[x]^{m\times n_{1}}$ and $\mathbf{F}_{2}\in\mathbb{K}[x]^{m\times n_{2}}$. If $\mathbf{P}_{1}$ is a basis for $\operatorname{\mathcal{R}}(\mathbf{M}_{1},\mathbf{F}_{1})$, then $\operatorname{Rem}(\mathbf{P}_{1}[\mathbf{F}_{1}\;\;\mathbf{F}_{2}],\mathbf{M})$ has the form $[\mathbf{0}\;\;\mathbf{G}]$ for some $\mathbf{G}\in\mathbb{K}[x]^{m\times n_{2}}$; if furthermore $\mathbf{P}_{2}$ is a basis for $\operatorname{\mathcal{R}}(\mathbf{M}_{2},\mathbf{G})$, then $\mathbf{P}_{2}\mathbf{P}_{1}$ is a basis for $\operatorname{\mathcal{R}}(\mathbf{M},[\mathbf{F}_{1}\;\;\mathbf{F}_{2}])$.

Proof.

It follows from Lemma 2.0 that the first $n_{1}$ columns of $\operatorname{Rem}(\mathbf{P}_{1}[\mathbf{F}_{1}\;\;\mathbf{F}_{2}],\mathbf{M})$ are $\operatorname{Rem}(\mathbf{P}_{1}\mathbf{F}_{1},\mathbf{M}_{1})$, which is zero, and that $\operatorname{Rem}([\mathbf{0}\;\;\mathbf{G}],\mathbf{M})=[\mathbf{0}\;\;\operatorname{Rem}(\mathbf{G},\mathbf{M}_{2})]$. Then, the first identity in Lemma 2.0 implies both that $\operatorname{\mathcal{R}}(\mathbf{M},[\mathbf{0}\;\;\mathbf{G}])=\operatorname{\mathcal{R}}(\mathbf{M}_{2},\mathbf{G})$ and that the rows of $\mathbf{P}_{2}\mathbf{P}_{1}$ are in $\operatorname{\mathcal{R}}(\mathbf{M},[\mathbf{F}_{1}\;\;\mathbf{F}_{2}])$. Now let $\mathbf{p}\in\operatorname{\mathcal{R}}(\mathbf{M},[\mathbf{F}_{1}\;\;\mathbf{F}_{2}])$. Lemma 2.0 implies that $\mathbf{p}\in\operatorname{\mathcal{R}}(\mathbf{M}_{1},\mathbf{F}_{1})$, hence $\mathbf{p}=\boldsymbol{\lambda}\mathbf{P}_{1}$ for some $\boldsymbol{\lambda}$. Then, the first identity in Lemma 2.0 shows that $\mathbf{0}=\operatorname{Rem}(\boldsymbol{\lambda}\mathbf{P}_{1}[\mathbf{F}_{1}\;\;\mathbf{F}_{2}],\mathbf{M})=\operatorname{Rem}(\boldsymbol{\lambda}[\mathbf{0}\;\;\mathbf{G}],\mathbf{M})$, and therefore $\boldsymbol{\lambda}\in\operatorname{\mathcal{R}}(\mathbf{M}_{2},\mathbf{G})$. Thus $\boldsymbol{\lambda}=\boldsymbol{\mu}\mathbf{P}_{2}$ for some $\boldsymbol{\mu}$, and $\mathbf{p}=\boldsymbol{\mu}\mathbf{P}_{2}\mathbf{P}_{1}$. ∎
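The two-stage construction of this theorem can be illustrated in the smallest case $m=n_{1}=n_{2}=1$, where a relation basis for a single element $f$ modulo a polynomial $M$ is the $1\times 1$ matrix $M/\gcd(M,f)$. The sketch below works over $\mathbb{F}_{101}$; this field, the list representation, and all function names are assumptions of the illustration, and the code only checks that the product $P_{2}P_{1}$ is a relation, not that it generates all of them.

```python
P = 101  # prime modulus (illustrative choice); lists hold coefficients, low degree first

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def sub(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def divmod_poly(a, b):
    a, b = trim(a), trim(b)
    inv = pow(b[-1], -1, P)
    q = [0] * max(len(a) - len(b) + 1, 0)
    r = a[:]
    while len(r) >= len(b):
        c, s = r[-1] * inv % P, len(r) - len(b)
        q[s] = c
        for i, y in enumerate(b):
            r[s + i] = (r[s + i] - c * y) % P
        r = trim(r)
    return trim(q), r

def gcd_poly(a, b):
    """Monic gcd by the Euclidean algorithm."""
    a, b = trim(a), trim(b)
    while b:
        a, b = b, divmod_poly(a, b)[1]
    inv = pow(a[-1], -1, P)
    return [c * inv % P for c in a]

def relation_generator(M, f):
    """Generator of {p : p*f = 0 mod M} for a single polynomial modulus M."""
    return divmod_poly(M, gcd_poly(M, f))[0]

def residual(p, F1, F2, M1, A, M2):
    """Rem(p*[F1 F2], [[M1, A], [0, M2]]), computed block by block."""
    Q1, R1 = divmod_poly(mul(p, F1), M1)
    R2 = divmod_poly(sub(mul(p, F2), mul(Q1, A)), M2)[1]
    return R1, R2

M1, A, M2 = [1, 0, 1], [3, 1], [5, 2, 0, 1]   # x^2+1 = (x-10)(x+10) over GF(101)
F1, F2 = [91, 1], [2, 0, 1]                   # F1 = x-10 shares a factor with M1
P1 = relation_generator(M1, F1)               # relations for the first block
zero, G = residual(P1, F1, F2, M1, A, M2)     # residual has the form [0  G]
P2 = relation_generator(M2, G)                # relations for the residual
basis = mul(P2, P1)                           # candidate basis P2*P1
```

Here `P1` is $x+10$, the first residual entry vanishes, and multiplying by `P2` cancels the remaining entry $G$ modulo $M_{2}$.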

3. Computing modular products

In this section, we aim at designing a fast algorithm for the modular products that arise in our relation basis algorithm.

3.1. Fast division with remainder

For univariate polynomials, fast Euclidean division can be achieved by first computing the reversed quotient via Newton iteration, and then deducing the remainder (Gathen and Gerhard, 2013, Chap. 9). This directly translates into the context of polynomial matrices, as was noted for example in the proof of (Giorgi et al., 2003, Lem. 3.4) or in (Zhou, 2012, Chap. 10).
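For reference, the univariate scheme recalled above can be sketched as follows: reverse the dividend and divisor, invert the reversed divisor as a power series by Newton iteration, read off the reversed quotient, and deduce the remainder. This is a minimal sketch over $\mathbb{F}_{101}$ with quadratic-time multiplication; the field, the list representation (low degree first), and the function names are assumptions of the illustration, and a fast implementation would plug in fast multiplication.

```python
P = 101  # prime modulus (illustrative choice)

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def sub(a, b):
    n = max(len(a), len(b))
    a, b = a + [0] * (n - len(a)), b + [0] * (n - len(b))
    return trim([x - y for x, y in zip(a, b)])

def mul(a, b):
    if not a or not b:
        return []
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return trim(out)

def inv_series(g, n):
    """Inverse of g modulo x^n by Newton iteration; requires g[0] != 0."""
    h = [pow(g[0], -1, P)]
    prec = 1
    while prec < n:
        prec = min(2 * prec, n)
        gh = mul(g[:prec], h)[:prec]
        corr = [(-c) % P for c in gh] + [0] * (prec - len(gh))
        corr[0] = (corr[0] + 2) % P          # corr = 2 - g*h mod x^prec
        h = mul(h, corr)[:prec]              # h <- h*(2 - g*h) mod x^prec
    return h

def fast_divmod(f, m):
    """Quotient and remainder of f by m via the reversed quotient."""
    f, m = trim(f), trim(m)
    k = len(f) - len(m) + 1                  # number of quotient coefficients
    if k <= 0:
        return [], f
    f_rev, m_rev = f[::-1], m[::-1]          # m_rev[0] is the leading coefficient of m
    q_rev = mul(f_rev[:k], inv_series(m_rev, k))[:k]
    q_rev += [0] * (k - len(q_rev))
    q = trim(q_rev[::-1])
    return q, sub(f, mul(q, m))

f, m = [3, 1, 4, 1, 5, 9, 2, 6], [2, 7, 1, 8]
q, r = fast_divmod(f, m)
```

The only nontrivial property to check is the degree of `r`: since `r` is computed as `f - q*m`, it is the remainder exactly when its degree is below that of `m`.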

In the latter reference, it is shown how to efficiently compute remainders $\operatorname{Rem}(\mathcal{E},\mathbf{M})$ for a matrix $\mathcal{E}$ as in Eq. 1 below; this is not general enough for our purpose. Algorithms for the general case have been studied (Favati and Lotti, 1991; Zhang and Chen, 1983; Wolovich, 1984; Codenotti and Lotti, 1989; Wang and Zhou, 1986), but we are not aware of any that achieves the speed we desire. Thus, as a preliminary to the computation of residuals in Section 3.2, we now detail this extension of fast polynomial division to fast polynomial matrix division.

As mentioned above, we will start by computing the quotient. The degrees of its entries are controlled thanks to the reducedness of the divisor, which ensures that no high-degree cancellation can occur when multiplying the quotient and the divisor.

Lemma 3.0.

Let $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ , $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ , and $\delta\in\mathbb{Z}_{>0}$ be such that $\mathbf{{M}}$ is column reduced and $\mathrm{cdeg}(\mathbf{{F}})<\mathrm{cdeg}(\mathbf{{M}})+(\delta,\ldots,\delta)$ . Then, $\deg(\mathchoice{\operatorname{Quo}\left(\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Quo}(\mathbf{{F}},\mathbf{{M}})}{\operatorname{Quo}(\mathbf{{F}},\mathbf{{M}})}{\operatorname{Quo}(\mathbf{{F}},\mathbf{{M}})})<\delta$ .

Proof.

First, $\mathrm{lm}_{\mathbf{0}}(\mathbf{{M}}^{\mathsf{T}})^{\mathsf{T}}=\mathrm{lm}_{-\mathbf{d}}(\mathbf{{M}})$ where $\mathbf{d}=\mathrm{cdeg}(\mathbf{{M}})\in\mathbb{Z}_{\geqslant 0}^{n}$ : the $\mathbf{0}$ -column leading matrix of $\mathbf{{M}}$ is equal to its $-\mathbf{d}$ -row leading matrix. Since $\mathbf{{M}}$ is $\mathbf{0}$ -column reduced, it is also $-\mathbf{d}$ -row reduced.

Thus, by the predictable degree property (Kailath, 1980, Thm. 6.3-13) and since $\mathrm{rdeg}_{-\mathbf{d}}(\mathbf{M})=\mathbf{0}$, we have $\mathrm{rdeg}_{-\mathbf{d}}(\mathbf{Q}\mathbf{M})=\mathrm{rdeg}_{\mathbf{0}}(\mathbf{Q})$. Here, we write $\mathbf{Q}=\operatorname{Quo}(\mathbf{F},\mathbf{M})$ and $\mathbf{R}=\operatorname{Rem}(\mathbf{F},\mathbf{M})$.

Now, our assumption $\mathrm{cdeg}(\mathbf{F})<\mathbf{d}+(\delta,\ldots,\delta)$ and the fact that $\mathrm{cdeg}(\mathbf{R})<\mathbf{d}$ imply that $\mathrm{cdeg}(\mathbf{F}-\mathbf{R})<\mathbf{d}+(\delta,\ldots,\delta)$, and thus $\mathrm{rdeg}_{-\mathbf{d}}(\mathbf{F}-\mathbf{R})<(\delta,\ldots,\delta)$. Since $\mathbf{F}-\mathbf{R}=\mathbf{Q}\mathbf{M}$, from the previous paragraph we obtain $\mathrm{rdeg}_{\mathbf{0}}(\mathbf{Q})<(\delta,\ldots,\delta)$, hence $\deg(\mathbf{Q})<\delta$. ∎

Corollary 3.0.

Let $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ and $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ be such that $\mathbf{{M}}$ is column reduced and $\mathrm{cdeg}(\mathbf{{F}})<\mathrm{cdeg}(\mathbf{{M}})$ , and let $\mathbf{{P}}\in\mathbb{K}[x]^{k\times m}$ . Then, $\mathrm{rdeg}(\mathchoice{\operatorname{Quo}\left(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Quo}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Quo}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Quo}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})})<\mathrm{rdeg}(\mathbf{{P}})$ .

Proof.

For the case $k=1$ , the inequality follows from Lemma 3.0 since $\mathrm{cdeg}(\mathbf{{P}}\mathbf{{F}})\leqslant(\delta,\ldots,\delta)+\mathrm{cdeg}(\mathbf{{F}})<(\delta,\ldots,\delta)+\mathrm{cdeg}(\mathbf{{M}})$ , where $\delta=\deg(\mathbf{{P}})$ . Then, the general case $k\in\mathbb{Z}_{>0}$ follows by considering separately each row of $\mathbf{{P}}$ . ∎

Going back to the division $\mathbf{{F}}=\mathbf{{Q}}\mathbf{{M}}+\mathbf{{R}}$ , to obtain the reversed quotient we will right-multiply the reversed $\mathbf{{F}}$ by an expansion of the inverse of the reversed $\mathbf{{M}}$ . This operation is performed efficiently by means of high-order lifting; we will use the next result.

Lemma 3.0.

Let $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ with $\mathbf{{M}}(0)$ nonsingular, and let $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ . Then, defining $d=\lceil|\mathrm{cdeg}(\mathbf{{M}})|/n\rceil$ , the truncated $x$ -adic expansion $\mathbf{{F}}\mathbf{{M}}^{-1}\bmod x^{kd}$ can be computed deterministically using $\mathchoice{\tilde{O}\left(\lceil mk/n\rceil n^{\omega}d\right)}{O\tilde{\leavevmode\nobreak\ }(\lceil mk/n\rceil n^{\omega}d)}{O\tilde{\leavevmode\nobreak\ }(\lceil mk/n\rceil n^{\omega}d)}{O\tilde{\leavevmode\nobreak\ }(\lceil mk/n\rceil n^{\omega}d)}$ operations in $\mathbb{K}$ .

Proof.

This is a minor extension of (Storjohann, 2003, Prop. 15), incorporating the average column degree of the matrix $\mathbf{{M}}$ instead of the largest degree of its entries. This can be done by means of partial column linearization (Gupta et al., 2012, Sec. 6), as follows. One first expands the high-degree columns of $\mathbf{{M}}$ and inserts elementary rows to obtain a matrix $\overline{\mathbf{{M}}}\in\mathbb{K}[x]^{\overline{n}\times\overline{n}}$ such that $n\leqslant\overline{n}<2n$ , $\deg(\overline{\mathbf{{M}}})\leqslant d$ , and $\mathbf{{M}}^{-1}$ is the $n\times n$ principal leading submatrix of $\overline{\mathbf{{M}}}{}^{-1}$ (Gupta et al., 2012, Thm. 10 and Cor. 2). Then, defining $\overline{\mathbf{{F}}}=[\mathbf{{F}}\;\;\mathbf{{0}}]\in\mathbb{K}[x]^{m\times\overline{n}}$ , we have that $\mathbf{{F}}\mathbf{{M}}^{-1}$ is the submatrix of $\overline{\mathbf{{F}}}\,\overline{\mathbf{{M}}}{}^{-1}$ formed by its first $n$ columns. Thus, the sought truncated expansion is obtained by computing $\overline{\mathbf{{F}}}\,\overline{\mathbf{{M}}}{}^{-1}\bmod x^{kd}$ , which is done efficiently by (Storjohann, 2003, Alg. 4) with the choice $X=x^{d}$ ; this is valid since this polynomial is coprime to $\det(\overline{\mathbf{{M}}})=\det(\mathbf{{M}})$ and its degree is at least the degree of $\overline{\mathbf{{M}}}$ . ∎

Proposition 3.0.

Algorithm 1 is correct. Assuming that both $m\delta$ and $n$ are in $O(D)$, where $D=|\mathrm{cdeg}(\mathbf{M})|$, this algorithm uses $\tilde{O}(\lceil m/n\rceil n^{\omega-1}D)$ operations in $\mathbb{K}$.

Proof.

Let $\mathbf{{Q}}=\mathchoice{\operatorname{Quo}\left(\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Quo}(\mathbf{{F}},\mathbf{{M}})}{\operatorname{Quo}(\mathbf{{F}},\mathbf{{M}})}{\operatorname{Quo}(\mathbf{{F}},\mathbf{{M}})}$ , $\mathbf{{R}}=\mathchoice{\operatorname{Rem}\left(\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Rem}(\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathbf{{F}},\mathbf{{M}})}$ , and $(d_{1},\ldots,d_{n})=\mathrm{cdeg}(\mathbf{{M}})$ . We have the bounds $\mathrm{cdeg}(\mathbf{{F}})<(\delta+d_{1},\ldots,\delta+d_{n})$ , $\mathrm{cdeg}(\mathbf{{R}})<(d_{1},\ldots,d_{n})$ , and Lemma 3.0 gives $\deg(\mathbf{{Q}})<\delta$ . Thus, we can define the reversals of these polynomial matrices as

[TABLE]

for which the same degree bounds hold. Then, right-multiplying both sides of the identity $\mathbf{{F}}(x^{-1})=\mathbf{{Q}}(x^{-1})\mathbf{{M}}(x^{-1})+\mathbf{{R}}(x^{-1})$ by $\mathrm{diag}(x^{\delta+d_{1}-1},\ldots,x^{\delta+d_{n}-1})$ , we obtain ${\mathbf{{F}}}_{\mathrm{rev}}={\mathbf{{Q}}}_{\mathrm{rev}}{\mathbf{{M}}}_{\mathrm{rev}}+x^{\delta}{\mathbf{{R}}}_{\mathrm{rev}}$ .
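The displayed definitions of the reversals did not survive in this copy. One choice that is consistent with the degree bounds above, with the identity just derived, and with the later remark that $\mathbf{M}_{\mathrm{rev}}(0)$ is the column leading matrix of $\mathbf{M}$ would be the following; it is a reconstruction under these constraints, not a quotation of the original display.

$$
\begin{aligned}
\mathbf{F}_{\mathrm{rev}} &= \mathbf{F}(x^{-1})\,\mathrm{diag}(x^{\delta+d_{1}-1},\ldots,x^{\delta+d_{n}-1}), &
\mathbf{Q}_{\mathrm{rev}} &= x^{\delta-1}\,\mathbf{Q}(x^{-1}),\\
\mathbf{M}_{\mathrm{rev}} &= \mathbf{M}(x^{-1})\,\mathrm{diag}(x^{d_{1}},\ldots,x^{d_{n}}), &
\mathbf{R}_{\mathrm{rev}} &= \mathbf{R}(x^{-1})\,\mathrm{diag}(x^{d_{1}-1},\ldots,x^{d_{n}-1}).
\end{aligned}
$$

With these definitions, $\mathbf{Q}(x^{-1})\mathbf{M}(x^{-1})\,\mathrm{diag}(x^{\delta+d_{j}-1})_{j}=\mathbf{Q}_{\mathrm{rev}}\mathbf{M}_{\mathrm{rev}}$ and $\mathbf{R}(x^{-1})\,\mathrm{diag}(x^{\delta+d_{j}-1})_{j}=x^{\delta}\mathbf{R}_{\mathrm{rev}}$, which gives the stated identity term by term.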

Now, note that the constant term ${\mathbf{{M}}}_{\mathrm{rev}}(0)\in\mathbb{K}^{n\times n}$ is equal to the column leading matrix of $\mathbf{{M}}$ , which is invertible since $\mathbf{{M}}$ is column reduced, hence ${\mathbf{{M}}}_{\mathrm{rev}}$ is invertible (over the fractions). Thus, since $\deg({\mathbf{{Q}}}_{\mathrm{rev}})<\delta$ , this reversed quotient matrix can be determined as the truncated expansion ${\mathbf{{Q}}}_{\mathrm{rev}}={\mathbf{{F}}}_{\mathrm{rev}}{\mathbf{{M}}}_{\mathrm{rev}}^{-1}\bmod x^{\delta}$ . This proves the correctness of the algorithm.

Concerning the cost bound, Step 2 uses $\mathchoice{\tilde{O}\left(\lceil(m\delta)/(nd)\rceil n^{\omega}d\right)}{O\tilde{\leavevmode\nobreak\ }(\lceil(m\delta)/(nd)\rceil n^{\omega}d)}{O\tilde{\leavevmode\nobreak\ }(\lceil(m\delta)/(nd)\rceil n^{\omega}d)}{O\tilde{\leavevmode\nobreak\ }(\lceil(m\delta)/(nd)\rceil n^{\omega}d)}$ operations according to Lemma 3.0, where $d=\lceil D/n\rceil$ . We have by assumption $d\in\Theta(D/n)$ as well as $m\delta/(nd)\in O(1)$ , so that this cost bound is in $\mathchoice{\tilde{O}\left(n^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}$ .

In Step 3, we multiply the $m\times n$ matrix $\mathbf{{Q}}$ of degree less than $\delta$ with the $n\times n$ matrix $\mathbf{{M}}$ such that $|\mathrm{cdeg}(\mathbf{{M}})|=D$ . First consider the case $m\leqslant n$ . To perform this product efficiently, we expand the rows of $\mathbf{{Q}}$ so as to obtain a $O(n)\times n$ matrix $\overline{\mathbf{{Q}}}$ of degree in $O(\lceil m\delta/n\rceil)$ and such that $\mathbf{{Q}}\mathbf{{M}}$ is easily retrieved from $\overline{\mathbf{{Q}}}\mathbf{{M}}$ (see Section 3.2 for more details about how such row expansions are carried out). Thus, this product is done in $\mathchoice{\tilde{O}\left(n^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}$ , since $\lceil m\delta/n\rceil\in O(D/n)$ . On the other hand, if $m>n$ , we have $\delta\in O(D/m)\subseteq O(D/n)$ . Then, we can compute the product $\mathbf{{Q}}\mathbf{{M}}$ via $\lceil m/n\rceil$ products of $n\times n$ matrices of degree $O(D/n)$ , which cost each $\mathchoice{\tilde{O}\left(n^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(n^{\omega-1}D)}$ operations; hence the total cost $\mathchoice{\tilde{O}\left(mn^{\omega-2}D\right)}{O\tilde{\leavevmode\nobreak\ }(mn^{\omega-2}D)}{O\tilde{\leavevmode\nobreak\ }(mn^{\omega-2}D)}{O\tilde{\leavevmode\nobreak\ }(mn^{\omega-2}D)}$ when $m>n$ . ∎

3.2. Fast residual computation

Here, we focus on performing modular products $\mathchoice{\operatorname{Rem}\left(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Rem}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}$ , where $\mathbf{{F}}\in\mathbb{K}[x]^{m\times n}$ and $\mathbf{{P}}\in\mathbb{K}[x]^{m\times m}$ are such that $\mathrm{cdeg}(\mathbf{{F}})<\mathrm{cdeg}(\mathbf{{M}})$ and $|\mathrm{cdeg}(\mathbf{{P}})|\leqslant|\mathrm{cdeg}(\mathbf{{M}})|$ , and $\mathbf{{M}}\in\mathbb{K}[x]^{n\times n}$ is column reduced. The difficulty in designing a fast algorithm for this operation comes from the non-uniformity of $\mathrm{cdeg}(\mathbf{{P}})$ : in particular, the product $\mathbf{{P}}\mathbf{{F}}$ cannot be computed within the target cost bound.

To start with, we use the same strategy as in (Jeannerod et al., 2016; Neiger, 2016b): we make the column degrees of $\mathbf{{P}}$ uniform, at the price of introducing another, simpler matrix $\mathcal{E}$ for which we want to compute $\mathchoice{\operatorname{Rem}\left(\mathcal{E}\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}$ .

Let $(\delta_{1},\ldots,\delta_{m})=\mathrm{cdeg}(\mathbf{{P}})$ , $\delta=\lceil(\delta_{1}+\cdots+\delta_{m})/m\rceil\geqslant 1$ , and for $i\in\{1,\ldots,m\}$ write $\delta_{i}=(\alpha_{i}-1)\delta+\beta_{i}$ with $\alpha_{i}=\lceil\delta_{i}/\delta\rceil$ and $1\leqslant\beta_{i}\leqslant\delta$ if $\delta_{i}>0$ , and with $\alpha_{i}=1$ and $\beta_{i}=0$ if $\delta_{i}=0$ . Then, let $\overline{m}=\alpha_{1}+\cdots+\alpha_{m}$ , and define $\mathcal{E}\in\mathbb{K}[x]^{\overline{m}\times m}$ as the transpose of

[TABLE]

Define also the expanded column degrees $\overline{\boldsymbol{\delta}}\in\mathbb{Z}_{\geqslant 0}^{\overline{m}}$ as

[TABLE]

Then, we expand the columns of $\mathbf{{P}}$ by considering $\overline{\mathbf{{P}}}\in\mathbb{K}[x]^{m\times\overline{m}}$ such that $\mathbf{{P}}=\overline{\mathbf{{P}}}\mathcal{E}$ and $\deg(\overline{\mathbf{{P}}})\leqslant\delta$ . (Note that $\overline{\mathbf{{P}}}$ can be made unique by specifying more constraints on $\mathrm{cdeg}(\overline{\mathbf{{P}}})$ .) The aim of this construction is that the dimension is at most doubled while the degree of the expanded matrix becomes the average column degree of $\mathbf{{P}}$ . Precisely, $m\leqslant\overline{m}<2m$ and $\max(\overline{\boldsymbol{\delta}})=\delta=\lceil|\mathrm{cdeg}(\mathbf{{P}})|/m\rceil$ .
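The parameters of this expansion are easy to compute explicitly. The sketch below derives $\delta$, the $\alpha_{i}$ and the $\beta_{i}$ from a column degree vector, and splits one entry of column $i$ into $\alpha_{i}$ chunks so that it equals $\sum_{r}(\text{chunk}_{r})\,x^{r\delta}$. The function names and the coefficient-list representation (low degree first) are assumptions of this illustration, and the rule for which coefficients go into the last chunk is one of several admissible conventions.

```python
def expansion_parameters(cdeg):
    """Return (delta, alphas, betas) with cdeg[i] = (alphas[i]-1)*delta + betas[i]."""
    m = len(cdeg)
    delta = -(-sum(cdeg) // m)               # ceiling of the average column degree
    if delta < 1:
        raise ValueError("expects at least one positive column degree")
    alphas, betas = [], []
    for d in cdeg:
        alpha = max(1, -(-d // delta))       # ceil(d/delta), or 1 when d == 0
        alphas.append(alpha)
        betas.append(d - (alpha - 1) * delta)
    return delta, alphas, betas

def split_entry(coeffs, delta, alpha):
    """Split coefficients into alpha chunks; the last chunk keeps what is left."""
    chunks = [coeffs[r * delta:(r + 1) * delta] for r in range(alpha - 1)]
    chunks.append(coeffs[(alpha - 1) * delta:])
    return chunks

def recombine(chunks, delta):
    """Inverse of split_entry: sum of chunk_r * x^(r*delta)."""
    out = []
    for r, chunk in enumerate(chunks):
        out += [0] * (r * delta - len(out)) + list(chunk)
    return out

cdeg = [7, 0, 2, 3]                          # column degrees of a 4-column matrix
delta, alphas, betas = expansion_parameters(cdeg)
m_bar = sum(alphas)                          # number of columns after expansion
entry = list(range(1, 9))                    # an entry of degree 7 in column 1
chunks = split_entry(entry, delta, alphas[0])
```

For this input the average column degree is $\delta=3$, the first column is split into three columns, and the expanded matrix has $\overline{m}=6$ columns, each of degree at most $\delta$.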

Now, we have $\operatorname{Rem}(\mathbf{P}\mathbf{F},\mathbf{M})=\operatorname{Rem}(\overline{\mathbf{P}}\mathcal{E}\mathbf{F},\mathbf{M})=\operatorname{Rem}(\overline{\mathbf{P}}\,\overline{\mathbf{F}},\mathbf{M})$ by Lemma 2.0, where $\overline{\mathbf{F}}=\operatorname{Rem}(\mathcal{E}\mathbf{F},\mathbf{M})$. Thus, $\operatorname{Rem}(\mathbf{P}\mathbf{F},\mathbf{M})$ can be obtained by computing first $\overline{\mathbf{F}}$ and then $\operatorname{Rem}(\overline{\mathbf{P}}\,\overline{\mathbf{F}},\mathbf{M})$. For the latter, since $\overline{\mathbf{P}}$ has small degree, one can compute the product and then perform the division (Steps 3 and 4 of Algorithm 3). Step 2 of Algorithm 3 efficiently computes $\overline{\mathbf{F}}$, relying on Algorithm 2.

Proposition 3.0.

Algorithm 2 is correct. Assuming that both $2^{k}m\delta$ and $n$ are in $O(D)$, where $D=|\mathrm{cdeg}(\mathbf{M})|$, this algorithm uses $\tilde{O}((2^{k}mn^{\omega-2}+kn^{\omega-1})D)$ operations in $\mathbb{K}$.

Proof.

The correctness is a consequence of the two properties in Lemma 2.0. Now, if $2^{k}m\delta$ and $n$ are in $O(D)$ , the assumptions in Proposition 3.0 about the input parameters for PM-QuoRem are always satisfied in recursive calls, since the row dimension $m$ is doubled while the exponent $2^{k}\delta$ is halved. From the same proposition, we deduce the cost bound $\mathchoice{\tilde{O}\left((\sum_{0\leqslant r\leqslant k-1}\lceil 2^{r}m/n\rceil)n^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }((\sum_{0\leqslant r\leqslant k-1}\lceil 2^{r}m/n\rceil)n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }((\sum_{0\leqslant r\leqslant k-1}\lceil 2^{r}m/n\rceil)n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }((\sum_{0\leqslant r\leqslant k-1}\lceil 2^{r}m/n\rceil)n^{\omega-1}D)}$ . ∎
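The recursion pattern used in this proof (row dimension doubled, exponent halved) can be sketched in the scalar case $n=1$. The code below assumes that the algorithm returns $\operatorname{Rem}(x^{r\delta}\mathbf{F},\mathbf{M})$ for $0\leqslant r<2^{k}$ and that the input rows are already reduced modulo $M$; the function `rem_of_shifts`, the field $\mathbb{F}_{101}$, and the schoolbook division standing in for PM-QuoRem are assumptions of this illustration rather than the paper's pseudocode.

```python
P = 101  # prime modulus (illustrative choice); coefficient lists, low degree first

def trim(a):
    a = [c % P for c in a]
    while a and a[-1] == 0:
        a.pop()
    return a

def divmod_poly(a, b):
    a, b = trim(a), trim(b)
    inv = pow(b[-1], -1, P)
    q = [0] * max(len(a) - len(b) + 1, 0)
    r = a[:]
    while len(r) >= len(b):
        c, s = r[-1] * inv % P, len(r) - len(b)
        q[s] = c
        for i, y in enumerate(b):
            r[s + i] = (r[s + i] - c * y) % P
        r = trim(r)
    return trim(q), r

def rem_of_shifts(rows, M, delta, k):
    """Return blocks B_0, ..., B_{2^k - 1} with B_r[i] = rem(x^(r*delta) * rows[i], M).

    Assumes every polynomial in rows is already reduced modulo M.
    """
    if k == 0:
        return [[trim(f) for f in rows]]
    shift = 2 ** (k - 1) * delta
    # One division handles all current rows at once: rem(x^shift * f, M).
    shifted = [divmod_poly([0] * shift + trim(f), M)[1] for f in rows]
    # Recurse on twice as many rows with half the largest exponent.
    rec = rem_of_shifts(rows + shifted, M, delta, k - 1)
    m = len(rows)
    low = [block[:m] for block in rec]    # shifts r < 2^(k-1)
    high = [block[m:] for block in rec]   # shifts r + 2^(k-1)
    return low + high

M = [5, 2, 0, 1]                          # x^3 + 2x + 5
rows = [[1, 2, 3], [4, 0, 1]]             # both of degree < 3
delta, k = 2, 2
blocks = rem_of_shifts(rows, M, delta, k)
```

Only $k$ division steps are performed, the $r$th of which acts on $2^{r}m$ rows; this is the source of the sum $\sum_{0\leqslant r\leqslant k-1}\lceil 2^{r}m/n\rceil$ in the cost bound.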

Proposition 3.0.

Algorithm 3 is correct. Assuming that all of $|\mathrm{cdeg}(\mathbf{P})|$, $m$, and $n$ are in $O(D)$, where $D=|\mathrm{cdeg}(\mathbf{M})|$, this algorithm uses $\tilde{O}((m^{\omega-1}+n^{\omega-1})D)$ operations in $\mathbb{K}$.

Proof.

Let us consider $\mathcal{E}\in\mathbb{K}[x]^{\overline{m}\times m}$ defined as in Eq. 1 from the parameters $\delta$ and $\alpha_{1},\ldots,\alpha_{m}$ in Step 1. We claim that the matrix $\overline{\mathbf{{F}}}$ computed at Step 2 is equal to $\mathchoice{\operatorname{Rem}\left(\mathcal{E}\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}$ . Then, having $\mathrm{cdeg}(\overline{\mathbf{{P}}}\,\overline{\mathbf{{F}}})<\mathrm{cdeg}(\mathbf{{M}})+(\delta,\ldots,\delta)$ , the correctness of PM-QuoRem implies $\mathbf{{R}}=\mathchoice{\operatorname{Rem}\left(\overline{\mathbf{{P}}}\,\overline{\mathbf{{F}}},\mathbf{{M}}\right)}{\operatorname{Rem}(\overline{\mathbf{{P}}}\,\overline{\mathbf{{F}}},\mathbf{{M}})}{\operatorname{Rem}(\overline{\mathbf{{P}}}\,\overline{\mathbf{{F}}},\mathbf{{M}})}{\operatorname{Rem}(\overline{\mathbf{{P}}}\,\overline{\mathbf{{F}}},\mathbf{{M}})}$ , which is $\mathchoice{\operatorname{Rem}\left(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Rem}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathbf{{P}}\mathbf{{F}},\mathbf{{M}})}$ by Lemma 2.0.

To prove our claim, it is enough to show that, for $1\leqslant i\leqslant m$ , the $i$ th block $\overline{\mathbf{{F}}}_{i}$ of $\overline{\mathbf{{F}}}$ is the matrix formed by stacking the remainders involving the row $i$ of $\mathbf{{F}}$ , that is, $(\mathchoice{\operatorname{Rem}\left(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}}\right)}{\operatorname{Rem}(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}})}{\operatorname{Rem}(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}})}{\operatorname{Rem}(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}})})_{0\leqslant r<\alpha_{i}}$ . This is clear from the first For loop if $\alpha_{i}=1$ . Otherwise, let $k\in\mathbb{Z}_{>0}$ be such that $2^{k-1}<\alpha_{i}\leqslant 2^{k}$ . Then, at the $k$ th iteration of the second loop, we have $i_{j}=i$ for some $1\leqslant j\leqslant\ell$ . Thus, the correctness of RemOfShifts implies that, for $0\leqslant r<2^{k}$ , the row $j$ of $\mathbf{{R}}_{r}$ is $\mathchoice{\operatorname{Rem}\left(x^{r\delta}{\mathbf{{G}}}_{j,*},\mathbf{{M}}\right)}{\operatorname{Rem}(x^{r\delta}{\mathbf{{G}}}_{j,*},\mathbf{{M}})}{\operatorname{Rem}(x^{r\delta}{\mathbf{{G}}}_{j,*},\mathbf{{M}})}{\operatorname{Rem}(x^{r\delta}{\mathbf{{G}}}_{j,*},\mathbf{{M}})}=\mathchoice{\operatorname{Rem}\left(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}}\right)}{\operatorname{Rem}(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}})}{\operatorname{Rem}(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}})}{\operatorname{Rem}(x^{r\delta}{\mathbf{{F}}}_{i,*},\mathbf{{M}})}$ . Since $2^{k}\geqslant\alpha_{i}$ , this contains the wanted remainders and the claim follows.

Let us show the cost bound, assuming that $|\mathrm{cdeg}(\mathbf{{P}})|$ , $m$ , and $n$ are in $O(D)$ . Note that this implies $m\delta\in O(D)$ .

We first study the cost of the iteration $k$ of the second loop of Step 2. We have that $2^{k-1}\ell\leqslant\alpha_{1}+\cdots+\alpha_{m}=\overline{m}\leqslant 2m$ , the row dimension of $\mathbf{{G}}$ is $\ell$ , and $k\leqslant\lceil\log(\max_{i}(\alpha_{i}))\rceil\in O(\log(m))$ . Thus, the call to RemOfShifts costs $\mathchoice{\tilde{O}\left((mn^{\omega-2}+n^{\omega-1})D\right)}{O\tilde{\leavevmode\nobreak\ }((mn^{\omega-2}+n^{\omega-1})D)}{O\tilde{\leavevmode\nobreak\ }((mn^{\omega-2}+n^{\omega-1})D)}{O\tilde{\leavevmode\nobreak\ }((mn^{\omega-2}+n^{\omega-1})D)}$ operations according to Proposition 3.0, and the same cost bound holds for the whole Step 2. Concerning Step 4, the cost bound $\mathchoice{\tilde{O}\left(\lceil m/n\rceil n^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }(\lceil m/n\rceil n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(\lceil m/n\rceil n^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(\lceil m/n\rceil n^{\omega-1}D)}$ follows directly from Proposition 3.0.

The product at Step 3 involves the $m\times\overline{m}$ matrix $\overline{\mathbf{{P}}}$ whose degree is at most $\delta$ and the $\overline{m}\times n$ matrix $\overline{\mathbf{{F}}}$ such that $\mathrm{cdeg}(\overline{\mathbf{{F}}})<\mathrm{cdeg}(\mathbf{{M}})$ ; we recall that $\overline{m}\leqslant 2m$ . If $n\geqslant m$ , we expand the columns of $\overline{\mathbf{{F}}}$ similarly to how $\overline{\mathbf{{P}}}$ was obtained from $\mathbf{{P}}$ : this yields a $\overline{m}\times(\leqslant 2n)$ matrix of degree at most $\lceil D/n\rceil$ , whose left-multiplication by $\overline{\mathbf{{P}}}$ directly yields $\overline{\mathbf{{P}}}\,\overline{\mathbf{{F}}}$ by compressing back the columns. Thus, this product is done in $\mathchoice{\tilde{O}\left(m^{\omega-2}nD\right)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-2}nD)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-2}nD)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-2}nD)}$ operations since both $\delta$ and $D/n$ are in $O(D/m)$ when $n\geqslant m$ . If $m\geqslant n$ , we do a similar column expansion of $\overline{\mathbf{{F}}}$ , yet into a matrix with $O(m)$ columns and degree $O(D/m)$ ; thus, the product can be performed in $\mathchoice{\tilde{O}\left(m^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-1}D)}$ operations in this case. ∎

4. Fast algorithms in specific cases

Here, we discuss fast solutions to specific instances of Problem 1. These will be important ingredients of our main algorithm for relations modulo Hermite forms (Algorithm 5).

4.1. When the input module is an ideal

We first focus on Problem 1 when $n=1$ ; this is one of the two base cases of the recursion in Algorithm 5 (Step 2). In this case, the input matrix $\mathbf{{M}}$ is a nonzero polynomial $M\in\mathbb{K}[x]$ . In other words, the input module is the ideal $(M)$ of $\mathbb{K}[x]$ , and we are looking for the $\mathbf{s}$ -Popov basis for the set of relations between $m$ elements of $\mathbb{K}[x]/(M)$ . A fast algorithm for this task was given in (Neiger, 2016b, Sec. 2.2); precisely, the following result is achieved by running (Neiger, 2016b, Alg. 2) on input $\mathbf{{M}},\mathbf{{F}},\mathbf{s},2D$ .

Proposition 4.0.

Assuming $n=1$ and $\deg(\mathbf{{F}})<D=\deg(\mathbf{{M}})$ , there is an algorithm which solves Problem 1 using $\mathchoice{\tilde{O}\left(m^{\omega-1}D\right)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-1}D)}{O\tilde{\leavevmode\nobreak\ }(m^{\omega-1}D)}$ operations in $\mathbb{K}$ .

4.2. When the $\mathbf{s}$ -minimal degree is known

Now, we consider Problem 1 with an additional input: the $\mathbf{s}$ -minimal degree of $\operatorname{\mathcal{R}}(\mathbf{{M}},\mathbf{{F}})$ , which is the column degree of its $\mathbf{s}$ -Popov basis. This is motivated by a technique from (Jeannerod et al., 2016) and used in Algorithm 5 to control the degrees of all the bases computed in the process. Namely, we find this $\mathbf{s}$ -minimal degree recursively, and then we compute the $\mathbf{s}$ -Popov relation basis using this knowledge.

The same question was tackled in (Gupta and Storjohann, 2011, Sec. 3) and (Neiger, 2016b, Sec. 2.1) for a diagonal matrix $\mathbf{{M}}$ . Here, we extend this to the case of a column reduced $\mathbf{{M}}$ , relying in particular on the fast computation of $\mathchoice{\operatorname{Rem}\left(\mathcal{E}\mathbf{{F}},\mathbf{{M}}\right)}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}{\operatorname{Rem}(\mathcal{E}\mathbf{{F}},\mathbf{{M}})}$ designed in Section 3.2. We first extend (Neiger, 2016b, Lem. 2.1) to this more general setting (Lemma 4.0), and then we give the slightly modified version of (Neiger, 2016b, Alg. 1) (Algorithm 4).

Lemma 4.0.

Let $\mathbf{M}\in\mathbb{K}[x]^{n\times n}$ be column reduced, let $\mathbf{F}\in\mathbb{K}[x]^{m\times n}$ be such that $\mathrm{cdeg}(\mathbf{F})<\mathrm{cdeg}(\mathbf{M})$, and let $\mathbf{s}\in\mathbb{Z}^{m}$. Furthermore, let $\mathbf{P}\in\mathbb{K}[x]^{m\times m}$, and let $\mathbf{w}\in\mathbb{Z}^{n}$ be such that $\max(\mathbf{w})\leqslant\min(\mathbf{s})$. Then, $\mathbf{P}$ is the $\mathbf{s}$-Popov relation basis for $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})$ if and only if $[\mathbf{P}\;\;\mathbf{Q}]$ is the $\mathbf{u}$-Popov kernel basis of $[\mathbf{F}^{\mathsf{T}}\;\;\mathbf{M}^{\mathsf{T}}]^{\mathsf{T}}$ for some $\mathbf{Q}\in\mathbb{K}[x]^{m\times n}$ and $\mathbf{u}=(\mathbf{s},\mathbf{w})\in\mathbb{Z}^{m+n}$. In this case, $\deg(\mathbf{Q})<\deg(\mathbf{P})$ and $[\mathbf{P}\;\;\mathbf{Q}]$ has $\mathbf{u}$-pivot index $(1,2,\ldots,m)$.

Proof.

Let $\mathbf{N}=[\mathbf{F}^{\mathsf{T}}\;\;\mathbf{M}^{\mathsf{T}}]^{\mathsf{T}}$, that is, the $(m+n)\times n$ matrix obtained by stacking $\mathbf{F}$ above $\mathbf{M}$. It is easily verified that $\mathbf{P}$ is a relation basis for $\operatorname{\mathcal{R}}(\mathbf{M},\mathbf{F})$ if and only if there is some $\mathbf{Q}\in\mathbb{K}[x]^{m\times n}$ such that $[\mathbf{P}\;\;\mathbf{Q}]$ is a kernel basis of $\mathbf{N}$.

Then, for any matrix $[\mathbf{{P}}\;\;\mathbf{{Q}}]\in\mathbb{K}[x]^{m\times(m+n)}$ in the kernel of $\mathbf{{N}}$ , we have $\mathbf{{P}}\mathbf{{F}}=-\mathbf{{Q}}\mathbf{{M}}$ and therefore Corollary 3.0 shows that $\mathrm{rdeg}(\mathbf{{Q}})<\mathrm{rdeg}(\mathbf{{P}})$ ; since $\max(\mathbf{w})\leqslant\min(\mathbf{s})$ , this implies $\mathrm{rdeg}_{{\mathbf{w}}}(\mathbf{{Q}})<\mathrm{rdeg}_{{\mathbf{s}}}(\mathbf{{P}})$ . Thus, we have $\mathrm{lm}_{\mathbf{u}}([\mathbf{{P}}\;\;\mathbf{{Q}}])=[\mathrm{lm}_{\mathbf{s}}(\mathbf{{P}})\;\;\mathbf{{0}}]$ , and therefore $\mathbf{{P}}$ is in $\mathbf{s}$ -Popov form if and only if $[\mathbf{{P}}\;\;\mathbf{{Q}}]$ is in $\mathbf{u}$ -Popov form with $\mathbf{u}$ -pivot index $(1,\ldots,m)$ . ∎
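As a sanity check of the two facts used in this proof, here is a minimal Python sketch on a toy instance with $n=1$ and $m=2$. The matrices `M`, `F`, `P`, `Q` and all helper names are hand-picked for illustration and do not come from the paper; the code only verifies the kernel identity $\mathbf{P}\mathbf{F}+\mathbf{Q}\mathbf{M}=\mathbf{0}$ and the row degree inequality, not the Popov property itself.

```python
# Polynomials are coefficient lists, lowest degree first; [] is the zero polynomial.

def trim(p):
    """Drop trailing zero coefficients."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p

def poly_add(p, q):
    size = max(len(p), len(q))
    p = list(p) + [0] * (size - len(p))
    q = list(q) + [0] * (size - len(q))
    return trim([a + b for a, b in zip(p, q)])

def poly_mul(p, q):
    if not p or not q:
        return []
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return trim(out)

def mat_mul(A, B):
    """Product of polynomial matrices given as lists of rows."""
    result = []
    for row in A:
        new_row = []
        for j in range(len(B[0])):
            acc = []
            for k, entry in enumerate(row):
                acc = poly_add(acc, poly_mul(entry, B[k][j]))
            new_row.append(acc)
        result.append(new_row)
    return result

def rdeg(A):
    """Row degrees, with -1 standing in for the degree of a zero row."""
    return [max(len(trim(e)) - 1 for e in row) for row in A]

x = [0, 1]
M = [[[0, 0, 1]]]          # M = [x^2], so n = 1 and D = 2
F = [[[1]], [x]]           # rows f_1 = 1 and f_2 = x, so cdeg(F) < cdeg(M)
P = [[x, [-1]], [[], x]]   # hand-picked candidate relation basis
Q = [[[]], [[-1]]]         # chosen so that P F = -Q M

N = F + M                  # F stacked on top of M, an (m + n) x n matrix
PQ = [p_row + q_row for p_row, q_row in zip(P, Q)]
residual = mat_mul(PQ, N)  # equals P F + Q M

is_kernel = all(entry == [] for row in residual for entry in row)
degrees_ok = all(q < p for p, q in zip(rdeg(P), rdeg(Q)))
print(is_kernel, degrees_ok)
```

In this instance the rows of `P` have degree 1 while the rows of `Q` have degree at most 0, which is the inequality $\mathrm{rdeg}(\mathbf{Q})<\mathrm{rdeg}(\mathbf{P})$ that drives the argument.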

Proposition 4.0.

Algorithm 4 is correct, and assuming that $m$ and $n$ are in $O(D)$, where $D=|\mathrm{cdeg}(\mathbf{{M}})|$, it uses $\tilde{O}\left(m^{\omega-1}D+n^{\omega}D/m\right)$ operations in $\mathbb{K}$.

Proof.

The correctness follows from the material in (Neiger, 2016b, Sec. 2.1) and (Jeannerod et al., 2016, Sec. 4). Concerning the cost bound, we first note that we have $\delta_{1}+\cdots+\delta_{m}\leqslant D$ according to Corollary 2.0. Thus, the cost analysis in Proposition 3.0 shows that Step 2 uses $\tilde{O}\left((mn^{\omega-2}+n^{\omega-1})D\right)$ operations. By (Jeannerod et al., 2016, Thm. 1.4), the approximant basis computation at Step 3 uses $\tilde{O}\left((m+n)^{\omega-1}(1+n/m)D\right)$ operations, since the row dimension of the input matrix is $\overline{m}+n\leqslant 2m+n$ and the sum of the orders is $|\boldsymbol{\tau}|=|\mathrm{cdeg}(\mathbf{{M}})|+n(\delta+1)\leqslant(1+n/m)D$. ∎

4.3. Solution based on fast linear algebra

Here, we detail how previous work can be used to handle a base case of the recursion in Algorithm 5 (Step 1): when the vector space dimension $\deg(\det(\mathbf{{M}}))$ of the input module is small compared to the number $m$ of input elements. Then, we rely on an interpretation of Problem 1 as a question of dense linear algebra over $\mathbb{K}$ , which is solved efficiently by (Jeannerod et al., 2017, Alg. 9). This yields the following result.

Proposition 4.0.

Assuming that $\mathbf{{M}}$ is in shifted Popov form, and that $\mathrm{cdeg}(\mathbf{{F}})<\mathrm{cdeg}(\mathbf{{M}})$, there is an algorithm which solves Problem 1 using $\tilde{O}\left(D^{\omega}\lceil m/D\rceil\right)$ operations in $\mathbb{K}$, where $D=\deg(\det(\mathbf{{M}}))$.

This cost bound is $\tilde{O}\left(D^{\omega-1}m\right)\subseteq\tilde{O}\left(m^{\omega-1}D\right)$ when $D\in O(m)$. To see why relying on fast linear algebra is sufficient to obtain a fast algorithm when $D\in O(m)$, we note that this implies that the average column degree of the $\mathbf{s}$-Popov relation basis $\mathbf{{P}}$ is

$$\frac{|\mathrm{cdeg}(\mathbf{{P}})|}{m}\leqslant\frac{D}{m}\in O(1).$$

For example, if $D\leqslant m$, most entries in this basis have degree [math]: we are essentially dealing with matrices over $\mathbb{K}$. On the other hand, when $m\in O(D)$, this approach based on linear algebra uses $\tilde{O}\left(D^{\omega}\right)$ operations, which largely exceeds our target cost.
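The inclusion $\tilde{O}(D^{\omega}\lceil m/D\rceil)\subseteq\tilde{O}(m^{\omega-1}D)$ for $D\in O(m)$ can be checked directly. The following derivation is a sketch which assumes $D\leqslant cm$ for some constant $c\geqslant 1$ (a name introduced here only for illustration) and uses $\omega\geqslant 2$:

```latex
\begin{align*}
D^{\omega}\lceil m/D\rceil
  &\leqslant D^{\omega}\,(m/D+1) = D^{\omega-1}m + D^{\omega},\\
D^{\omega-1}m
  &= D^{\omega-2}\cdot Dm \leqslant (cm)^{\omega-2}\,Dm = c^{\omega-2}\,m^{\omega-1}D,\\
D^{\omega}
  &= D^{\omega-1}\cdot D \leqslant (cm)^{\omega-1}\,D = c^{\omega-1}\,m^{\omega-1}D.
\end{align*}
```

Both terms are therefore in $O(m^{\omega-1}D)$, which is the first term of the target cost.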

We now describe how to translate our problem into the $\mathbb{K}$-linear algebra framework in (Jeannerod et al., 2017). Let $\mathcal{M}$ denote the row space of $\mathbf{{M}}$; we assume that $\mathbf{{M}}$ has no identity column. In order to compute in the quotient $\mathbb{K}[x]^{n}/\mathcal{M}$, which has finite dimension $D$, it is customary to make use of the multiplication matrix of $x$ with respect to a given monomial basis. Here, since the basis $\mathbf{{M}}$ of $\mathcal{M}$ is in shifted Popov form with column degree $(d_{1},\ldots,d_{n})\in\mathbb{Z}_{>0}^{n}$, Lemma 2.0 suggests using the monomial basis

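$$\{(x^{i},0,\ldots,0),\;0\leqslant i<d_{1}\}\;\cup\;\cdots\;\cup\;\{(0,\ldots,0,x^{i}),\;0\leqslant i<d_{n}\}.$$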

Above, we have represented an element in $\mathbb{K}[x]^{n}/\mathcal{M}$ by a polynomial vector $\mathbf{{f}}\in\mathbb{K}[x]^{1\times n}$ such that $\mathrm{cdeg}(\mathbf{{f}})<(d_{1},\ldots,d_{n})$ . In the linear algebra viewpoint, we rather represent it by a constant vector $\mathbf{{e}}\in\mathbb{K}^{1\times D}$ , which is formed by the concatenations of the coefficient vectors of the entries of $\mathbf{{f}}$ . Applying this to each row of the input matrix $\mathbf{{F}}$ yields a constant matrix $\mathbf{{E}}\in\mathbb{K}^{m\times D}$ , which is another representation of the same $m$ elements in the quotient.
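The passage from $\mathbf{F}$ to $\mathbf{E}$ can be sketched in a few lines of Python. This is only an illustration: the function names and the sample data are hypothetical, polynomials are stored as coefficient lists with the lowest degree first, and each entry is padded to the length given by the corresponding column degree.

```python
def coefficient_row(f, degrees):
    """Concatenate the coefficient vectors of the entries of the row f.

    Entry i is padded with zeros to length degrees[i], so the result
    always has length D = sum(degrees).
    """
    row = []
    for poly, d in zip(f, degrees):
        if len(poly) > d:
            raise ValueError("entry degree must be below the column degree")
        row.extend(list(poly) + [0] * (d - len(poly)))
    return row

def linearize(F, degrees):
    """Apply coefficient_row to each row of the polynomial matrix F."""
    return [coefficient_row(f, degrees) for f in F]

# Illustrative data: n = 2, (d_1, d_2) = (2, 3), so D = 5, and m = 2.
degrees = [2, 3]
F = [
    [[1, 4], [0, 0, 5]],   # (1 + 4x, 5x^2)
    [[], [7]],             # (0, 7)
]
E = linearize(F, degrees)
print(E)
```

Each row of `E` has length $D$, and its entries are grouped column by column in the same order as the monomial basis.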

Besides, the multiplication matrix $\mathbf{{X}}\in\mathbb{K}^{D\times D}$ is the matrix such that $\mathbf{{e}}\mathbf{{X}}\in\mathbb{K}^{1\times D}$ corresponds to the remainder in the division of $x\mathbf{{f}}$ by $\mathbf{{M}}$ . Since the basis $\mathbf{{M}}$ is in shifted Popov form, the computation of $\mathbf{{X}}$ is straightforward. Indeed, writing $\mathbf{{M}}=\mathrm{diag}(x^{d_{1}},\ldots,x^{d_{n}})-\mathbf{{A}}$ where $\mathbf{{A}}\in\mathbb{K}[x]^{n\times n}$ is such that $\mathrm{cdeg}(\mathbf{{A}})<(d_{1},\ldots,d_{n})$ , then

• the row $d_{1}+\cdots+d_{i-1}+j$ of $\mathbf{{X}}$ is the unit vector with $1$ at index $d_{1}+\cdots+d_{i-1}+j+1$, for $1\leqslant j<d_{i}$ and $1\leqslant i\leqslant n$,

• the row $d_{1}+\cdots+d_{i}$ of $\mathbf{{X}}$ is the concatenation of the coefficient vectors of the row $i$ of $\mathbf{{A}}$, for $1\leqslant i\leqslant n$.

That is, writing $\mathbf{{A}}=[a_{ij}]_{1\leqslant i,j\leqslant n}$ and denoting by $\{a_{ij}^{(k)},0\leqslant k<d_{j}\}$ the coefficients of $a_{ij}$ , the multiplication matrix $\mathbf{{X}}\in\mathbb{K}^{D\times D}$ is

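$$\mathbf{{X}}=\begin{bmatrix}\mathbf{{X}}_{11}&\cdots&\mathbf{{X}}_{1n}\\ \vdots&\ddots&\vdots\\ \mathbf{{X}}_{n1}&\cdots&\mathbf{{X}}_{nn}\end{bmatrix},\quad \mathbf{{X}}_{ii}=\begin{bmatrix}0&1&&\\ &\ddots&\ddots&\\ &&0&1\\ a_{ii}^{(0)}&a_{ii}^{(1)}&\cdots&a_{ii}^{(d_{i}-1)}\end{bmatrix}\in\mathbb{K}^{d_{i}\times d_{i}},\quad \mathbf{{X}}_{ij}=\begin{bmatrix}0&\cdots&0\\ \vdots&&\vdots\\ 0&\cdots&0\\ a_{ij}^{(0)}&\cdots&a_{ij}^{(d_{j}-1)}\end{bmatrix}\in\mathbb{K}^{d_{i}\times d_{j}}\;\;(i\neq j).$$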
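The two rules above for the rows of $\mathbf{X}$ translate directly into code. The following Python sketch is illustrative only: the function names and the sample matrix `A` are hypothetical, polynomials are coefficient lists with the lowest degree first, and all $d_i$ are assumed positive, in line with the assumption that $\mathbf{M}$ has no identity column.

```python
def multiplication_matrix(A, degrees):
    """Build X from A, where M = diag(x^d_1, ..., x^d_n) - A and cdeg(A) < degrees."""
    n = len(degrees)
    D = sum(degrees)
    offsets = [sum(degrees[:i]) for i in range(n)]  # start of block i
    X = [[0] * D for _ in range(D)]
    for i, d in enumerate(degrees):
        # Shift rows: x * (x^j e_i) = x^(j+1) e_i is already reduced when j + 1 < d_i.
        for j in range(d - 1):
            X[offsets[i] + j][offsets[i] + j + 1] = 1
        # Last row of block i: x^(d_i) e_i reduces to the row i of A modulo M.
        last = offsets[i] + d - 1
        for k in range(n):
            for c, coeff in enumerate(A[i][k]):
                X[last][offsets[k] + c] = coeff
    return X

def vec_mat(e, X):
    """Row vector times matrix over the base field."""
    return [sum(e[r] * X[r][c] for r in range(len(e))) for c in range(len(X[0]))]

# Illustrative data: (d_1, d_2) = (2, 1), A = [[3 + 2x, 5], [1 + 4x, 6]].
degrees = [2, 1]
A = [[[3, 2], [5]], [[1, 4], [6]]]
X = multiplication_matrix(A, degrees)

# f = (1 + x, 2) is represented by e = (1, 1, 2); e X represents x f modulo M.
e = [1, 1, 2]
print(X, vec_mat(e, X))
```

Here $x\mathbf{f}=(x+x^{2},2x)$, and replacing $(x^{2},0)$ by $(3+2x,5)$ and $(0,x)$ by $(1+4x,6)$ gives $(5+11x,17)$, which agrees with the vector returned by `vec_mat`.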

5. Relations modulo Hermite forms

In this section, we give a fast algorithm for solving Problem 1 when $\mathbf{{M}}$ is in Hermite form; this matrix is denoted by $\mathbf{{H}}$ in what follows. The cost bound is given under the assumption that $\mathbf{{H}}$ has no identity column; how to reduce to this case by discarding columns of $\mathbf{{H}}$ and $\mathbf{{F}}$ was discussed in Corollary 2.0. We recall that Steps 1, 2, and 3.i have been discussed in Section 4.

Proposition 5.0.

Algorithm 5 is correct and, assuming the entries of $\mathrm{cdeg}(\mathbf{{H}})$ are positive, it uses $\tilde{O}\left(m^{\omega-1}D+n^{\omega}D/m\right)$ operations in $\mathbb{K}$, where $D=|\mathrm{cdeg}(\mathbf{{H}})|=\deg(\det(\mathbf{{H}}))$.

Proof.

Following the recursion in the algorithm, our proof is by induction on $n$ , with two base cases (Steps 1 and 2).

The correctness and the cost bound for Step 1 follow from the discussion in Section 4.3, as summarized in Proposition 4.0. From Section 4.1, Step 2 correctly computes the $\mathbf{s}$-Popov relation basis and uses $\tilde{O}\left(m^{\omega-1}D\right)$ operations in $\mathbb{K}$.

Now, we focus on the correctness of Step 3, assuming that the two recursive calls at Steps 3.d and 3.g correctly compute the shifted Popov relation bases. Since KnownDegreeRelations is correct, it is enough to prove that the $\mathbf{s}$ -minimal degree of $\operatorname{\mathcal{R}}(\mathbf{{H}},\mathbf{{F}})$ is $\boldsymbol{\delta}_{1}+\boldsymbol{\delta}_{2}$ ; for this, we will show that $\mathbf{{P}}_{2}\mathbf{{P}}_{1}$ is a relation basis for $\operatorname{\mathcal{R}}(\mathbf{{H}},\mathbf{{F}})$ whose $\mathbf{s}$ -Popov form has column degree $\boldsymbol{\delta}_{1}+\boldsymbol{\delta}_{2}$ .

From Theorem 2.0, $\mathbf{{P}}_{2}\mathbf{{P}}_{1}$ is a relation basis for $\operatorname{\mathcal{R}}(\mathbf{{H}},\mathbf{{F}})$ . Furthermore, the fact that the $\mathbf{s}$ -Popov form of $\mathbf{{P}}_{2}\mathbf{{P}}_{1}$ has column degree $\boldsymbol{\delta}_{1}+\boldsymbol{\delta}_{2}$ follows from (Jeannerod et al., 2016, Sec. 3), since $\mathbf{{P}}_{1}$ is in $\mathbf{s}$ -Popov form and $\mathbf{{P}}_{2}$ is in $\mathbf{t}$ -Popov form, where $\mathbf{t}=\mathbf{s}+\boldsymbol{\delta}_{1}=\mathrm{rdeg}_{{\mathbf{s}}}(\mathbf{{P}}_{1})$ .

Concerning the cost of Step 3, we remark that $m<D$ , that $n\leqslant D$ is ensured by $\mathrm{cdeg}(\mathbf{{H}})>\mathbf{0}$ , and that $\boldsymbol{\delta}_{1}+\boldsymbol{\delta}_{2}=\deg(\det(\mathbf{{P}}_{2}\mathbf{{P}}_{1}))\leqslant D$ according to Corollary 2.0. Furthermore, there are two recursive calls with dimension about $n/2$ , and with $\mathbf{{H}}_{1}$ and $\mathbf{{H}}_{2}$ that are in Hermite form and have determinant degrees $D_{1}=\deg(\det(\mathbf{{H}}_{1}))$ and $D_{2}=\deg(\det(\mathbf{{H}}_{2}))$ such that $D=D_{1}+D_{2}$ . Besides, the entries of both $\mathrm{cdeg}(\mathbf{{H}}_{1})$ and $\mathrm{cdeg}(\mathbf{{H}}_{2})$ are all positive.

In particular, the assumptions on the parameters in Propositions 3.0 and 4.0, concerning the computation of the residual at Step 3.f and of the relation basis when the degrees are known at Step 3.i, are satisfied. Thus, these steps use $\tilde{O}\left((m^{\omega-1}+n^{\omega-1})D\right)$ and $\tilde{O}\left(m^{\omega-1}D+n^{\omega}D/m\right)$ operations, respectively. The announced cost bound follows. ∎

Acknowledgements.

The authors thank Claude-Pierre Jeannerod for interesting discussions, Arne Storjohann for his helpful comments on high-order lifting, and the reviewers whose remarks helped to prepare the final version of this paper. The research leading to these results has received funding from the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007-2013) under REA grant agreement number 609405 (COFUNDPostdocDTU). Vu Thi Xuan acknowledges financial support provided by the scholarship Explora Doc from Région Rhône-Alpes, France, and by the LABEX MILYON (ANR-10-LABX-0070) of Université de Lyon, within the program Investissements d’Avenir (ANR-11-IDEX-0007) operated by the French National Research Agency.

Bibliography

Baker and Graves-Morris (1996) G. A. Baker and P. R. Graves-Morris. 1996. Padé Approximants. Cambridge University Press.
Beckermann (1992) B. Beckermann. 1992. A reliable method for computing M-Padé approximants on arbitrary staircases. J. Comput. Appl. Math. 40, 1 (1992), 19–42.
Beckermann and Labahn (1994) B. Beckermann and G. Labahn. 1994. A Uniform Approach for the Fast Computation of Matrix-Type Padé Approximants. SIAM J. Matrix Anal. Appl. 15, 3 (July 1994), 804–823.
Beckermann and Labahn (1997) B. Beckermann and G. Labahn. 1997. Recursiveness in matrix rational interpolation problems. J. Comput. Appl. Math. 77, 1–2 (1997), 5–34.
Beckermann et al. (1999) B. Beckermann, G. Labahn, and G. Villard. 1999. Shifted Normal Forms of Polynomial Matrices. In ISSAC’99. ACM, 189–196.
Codenotti and Lotti (1989) B. Codenotti and G. Lotti. 1989. A fast algorithm for the division of two polynomial matrices. IEEE Trans. Automat. Control 34, 4 (Apr 1989), 446–448.
Coppersmith and Winograd (1990) D. Coppersmith and S. Winograd. 1990. Matrix multiplication via arithmetic progressions. J. Symbolic Comput. 9, 3 (1990), 251–280.

4.2. When the $\mathbf{s}$ -minimal degree is known