Generalized algorithms for the approximate matrix polynomial GCD of reducing data uncertainties with application to MIMO system and control

A. Fazzi; N. Guglielmi; I. Markovsky

arXiv:1907.13101·math.NA·June 2, 2021


TL;DR

This paper extends algorithms for approximate polynomial GCD to matrix polynomials, enabling better handling of data uncertainties in control systems and signal processing applications.

Contribution

It generalizes two scalar polynomial GCD algorithms to matrix polynomials, including a fast and a more accurate method, with application to MIMO systems.

Findings

1. Both algorithms perform similarly to scalar cases.

2. The methods effectively handle data uncertainties.

3. Application demonstrated on MIMO control systems.

Abstract

Computing (approximate) common factors of polynomials is an important problem in several fields of science, such as control theory and signal processing. While the problem has been widely studied for scalar polynomials, the scientific literature on matrix polynomials seems to be limited to the computation of exact greatest common divisors. In this paper, we generalize two algorithms from scalar to matrix polynomials. The first one is fast and simple. The second one is more accurate but computationally more expensive. We test the performance of the two algorithms and observe behavior similar to that in the scalar case. Finally, we describe an application to multi-input multi-output linear time-invariant dynamical systems.

Equations

$$A(\lambda)=C(\lambda)\bar{A}(\lambda),\qquad B(\lambda)=C(\lambda)\bar{B}(\lambda)$$

$$A(\lambda)=A_{0}+A_{1}\lambda+\dots+A_{n}\lambda^{n}\quad\text{with } A_{n}\neq 0,$$

$$B(\lambda)=B_{0}+B_{1}\lambda+\dots+B_{n}\lambda^{n}\quad\text{with } B_{n}\neq 0.$$

$$S(A,B)=\begin{pmatrix}A_{n}&\cdots&\cdots&A_{0}&&&\\ &A_{n}&\cdots&\cdots&A_{0}&&\\ &&\ddots&&&\ddots&\\ &&&A_{n}&\cdots&\cdots&A_{0}\\ B_{n}&\cdots&\cdots&B_{0}&&&\\ &B_{n}&\cdots&\cdots&B_{0}&&\\ &&\ddots&&&\ddots&\\ &&&B_{n}&\cdots&\cdots&B_{0}\end{pmatrix}$$

(the coefficients of $A$ fill the first $n$ block rows, those of $B$ the last $n$ block rows)

$$\dim\ker(S(A,B))\geq\nu(A,B),$$

$$A(\lambda)=\begin{pmatrix}-1+\lambda&0\\ 1&-1+\lambda\end{pmatrix},\qquad B(\lambda)=\begin{pmatrix}\lambda&1\\ 0&\lambda-2\end{pmatrix}.$$

$$A_{0}=\begin{pmatrix}-1&0\\ 1&-1\end{pmatrix},\quad A_{1}=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\quad B_{0}=\begin{pmatrix}0&1\\ 0&-2\end{pmatrix},\quad B_{1}=\begin{pmatrix}1&0\\ 0&1\end{pmatrix}.$$

$$S(A,B)\begin{pmatrix}1\\ -2\\ 1\\ -1\end{pmatrix}=\begin{pmatrix}A_{1}&A_{0}\\ B_{1}&B_{0}\end{pmatrix}\begin{pmatrix}1\\ -2\\ 1\\ -1\end{pmatrix}=\begin{pmatrix}1&0&-1&0\\ 0&1&1&-1\\ 1&0&0&1\\ 0&1&0&-2\end{pmatrix}\begin{pmatrix}1\\ -2\\ 1\\ -1\end{pmatrix}=\begin{pmatrix}0\\ 0\\ 0\\ 0\end{pmatrix},$$

$$S_{\ell}(A,B)=\begin{pmatrix}A_{n}&\cdots&\cdots&A_{0}&&&\\ &A_{n}&\cdots&\cdots&A_{0}&&\\ &&\ddots&&&\ddots&\\ &&&A_{n}&\cdots&\cdots&A_{0}\\ B_{n}&\cdots&\cdots&B_{0}&&&\\ &B_{n}&\cdots&\cdots&B_{0}&&\\ &&\ddots&&&\ddots&\\ &&&B_{n}&\cdots&\cdots&B_{0}\end{pmatrix}$$

(the coefficients of $A$ and of $B$ each fill $\ell-n$ block rows)

$$S_{3}(A,B)=\begin{pmatrix}A_{1}&A_{0}&\\ &A_{1}&A_{0}\\ B_{1}&B_{0}&\\ &B_{1}&B_{0}\end{pmatrix}=\begin{pmatrix}1&0&-1&0&0&0\\ 0&1&1&-1&0&0\\ 0&0&1&0&-1&0\\ 0&0&0&1&1&-1\\ 1&0&0&1&0&0\\ 0&1&0&-2&0&0\\ 0&0&1&0&0&1\\ 0&0&0&1&0&-2\end{pmatrix}$$

$$\operatorname{dist}(\{A,B\},\{\hat{A},\hat{B}\})=\sum_{j=0}^{n}\|A_{j}-\hat{A}_{j}\|_{F}^{2}+\sum_{j=0}^{n}\|B_{j}-\hat{B}_{j}\|_{F}^{2}$$

$$\inf_{\substack{\{\hat{A},\hat{B}\}:\ \exists\, C \text{ such that}\\ \hat{A}=C\bar{A},\ \hat{B}=C\bar{B},\\ C \text{ has degree } d}}\operatorname{dist}(\{A,B\},\{\hat{A},\hat{B}\})$$

$$V_{0}=(V_{d},\dots,V_{1}),$$

$$H_{i}=\begin{pmatrix}V_{i}(1)&V_{i}(2)&\dots&V_{i}(d+1)\\ V_{i}(2)&V_{i}(3)&\dots&V_{i}(d+2)\\ \vdots&\vdots&&\vdots\\ V_{i}(r)&V_{i}(r+1)&\dots&V_{i}(2n+1)\end{pmatrix},\qquad i=1,\dots,d.$$

$$R=\sum_{i=1}^{d}H_{i}^{T}H_{i}$$

$$V_{0}=(V_{md},\dots,V_{1}),$$

$$\bar{V}_{i}=\begin{pmatrix}V_{i}(1)&\dots&\dots&V_{i}(mn(1+m)-m+1)\\ \vdots&&&\vdots\\ V_{i}(m)&\dots&\dots&V_{i}(mn(1+m))\end{pmatrix}$$

$$\mathcal{K}=[H(V_{md}),\dots,H(V_{1})]$$

$$S_{\ell}(A,B)=\begin{pmatrix}U_{r}&U_{0}\end{pmatrix}\begin{pmatrix}\Sigma_{r}&0\\ 0&0\end{pmatrix}\begin{pmatrix}V_{r}^{\top}\\ V_{0}^{\top}\end{pmatrix},$$

$$\tau(C)V_{0}=0.$$

$$\sum_{i=1}^{md}\|\tau(C)V_{i}\|^{2}=0,\qquad V_{i}\in V_{0}.$$

$$\sum_{i=1}^{md}\|C\,H(V_{i})\|^{2}=0,\qquad V_{i}\in V_{0},$$

$$\min_{\hat{A},\hat{B}}\|A-\hat{A}\|_{2}^{2}+\|B-\hat{B}\|_{2}^{2}=\min_{\bar{A},\bar{B}}\|A-C\bar{A}\|_{2}^{2}+\|B-C\bar{B}\|_{2}^{2}$$

$$\dot{\lambda}=x_{0}^{\top}\dot{D}x_{0}$$

$$\frac{d}{dt}\sigma^{2}$$

$$\dot{\sigma}_{k}$$

$$\langle uv^{\top},\dot{E}\rangle=\langle P_{\mathcal{S}}(uv^{\top}),\dot{E}\rangle$$

$$P_{\mathcal{S}}(H)=S_{\ell}(P^{1},P^{2}),$$

$$P_{n-i}^{1}$$

$$P_{n-i}^{2}$$

$m(j-1)+1+mi : m(j+i)$ (index range; the rest of this formula is missing)

for $i=0,\dots,n$.


Full text


Antonio Fazzi

Nicola Guglielmi

Ivan Markovsky

Gran Sasso Science Institute (GSSI), Viale F. Crispi 7, 67100 L’Aquila, Italy

Vrije Universiteit Brussel (VUB), Department ELEC, Pleinlaan 2, 1050 Brussels, Belgium


Keywords: Matrix polynomials, Approximate common factor, Subspace method, Matrix ODEs

1 Introduction

Computing common factors of polynomials is an important problem in several scientific fields due to its applications [1]. In this paper we deal with common factors of matrix polynomials, which are matrices whose elements are polynomials or, equivalently, polynomials with matrix coefficients. Readers not familiar with matrix polynomials can refer, for example, to [2, 3].

The computation of a Greatest Common Divisor (GCD) $C(\lambda)$ of two matrix polynomials $A(\lambda)$ and $B(\lambda)$ appears in several problems in multivariable control [4, 5, 6]. The problem has been studied by many authors and through different techniques. Some authors find the GCD as a combination of polynomials [7] or transform the block matrix $[A(\lambda)\ B(\lambda)]$ into $[C(\lambda)\ \ 0]$ [8]. Other methods use the generalized Sylvester matrix [9, 10].

The most popular references study the properties of the resultant for matrix polynomials, e.g. [9, 11, 12, 13], or deal with exact common factor computations for matrix polynomials [6, 10, 14]. However, some applications (see Section 6) require approximate common factors, because of measurement noise or other perturbations of the data.

The approximate GCD problem has been extensively studied for scalar polynomials. In the framework of multivariable control systems, however, we deal with matrix polynomials and, to the best of our knowledge, there is no algorithm for solving the problem in the matrix case. The goal of this paper is to generalize the algorithms proposed in [15] and in [16, 17] from scalar to matrix polynomials.

This paper is organized as follows: Section 2 deals with exact GCD computation in the matrix case and with the properties of the generalized resultant; Section 3 generalizes the subspace method (Algorithm 1) of [15], while Section 4 generalizes the ODE-based method (Algorithm 3) of [16, 17]. Section 5 shows the performance of the algorithms. Finally, applications in the framework of linear time-invariant systems are considered in Section 6.

Notations

• $A(\lambda)$, $B(\lambda)$ are two (square) coprime matrix polynomials; $\hat{A}(\lambda),\hat{B}(\lambda)$ are perturbations of $A(\lambda),B(\lambda)$ having a common factor (the outputs of the proposed algorithms). They can be factored as $\hat{A}=C\bar{A}$, $\hat{B}=C\bar{B}$, where $C$ denotes the (monic) common factor;

• $m$ is the dimension of the matrices $A,B$, $n$ is the degree of the polynomials (we assume they have the same degree), $d$ is the degree of the sought common factor;

• $S_{\ell}$ denotes a structured Sylvester matrix whose dimensions depend on the parameter $\ell$ (see Section 2.1); $A\in\mathcal{S}$ means that the matrix $A$ has the Sylvester structure, and $P_{\mathcal{S}}(\cdot)$ is the operator that orthogonally projects its argument onto the set $\mathcal{S}$;

• we denote by $\|\cdot\|_{F}$ the Frobenius norm of a matrix, induced by the Frobenius inner product $\langle A,B\rangle=\operatorname{tr}(A^{\top}B)$;

• $\tau(C)$ denotes the Toeplitz matrix built from the coefficients of the matrix polynomial $C(z)$;

• a dot on a function denotes its time derivative (we deal with univariate functions only).

In the following we restrict ourselves to the case of two matrix polynomials, and we assume both matrices $A,B$ to be square in order to simplify the notation. However, the proposed algorithms also work if one of the two matrices is rectangular (as pointed out in Remark 2.1, we only need two matrices having the same number of rows or columns), and they could be extended to more than two polynomials. Throughout the paper we use the terms GCD and common factor interchangeably.

2 Matrix polynomial GCD approximation

In this section we analyze how to approach the computation of common factors in the case of matrix polynomials, emphasizing the main differences with respect to the scalar case. The first difference arising when we consider matrices instead of scalars is the loss of commutativity. Hence, we need to distinguish between right and left divisors. In the following we focus on left divisors, but right divisors have obvious counterparts.

Definition 2.1 (Left divisor of two matrix polynomials).

An (exact) common left divisor of two matrix polynomials $A(\lambda)$ and $B(\lambda)$ having the same number of rows is any matrix polynomial $C(\lambda)$ such that

$$A(\lambda)=C(\lambda)\bar{A}(\lambda),\qquad B(\lambda)=C(\lambda)\bar{B}(\lambda)$$

for some matrix polynomials $\bar{A}(\lambda)$, $\bar{B}(\lambda)$.

Remark 2.1.

The definition of left (right) divisor is meaningful only in the case the two matrices have the same number of rows (columns). If we transpose the matrix polynomials, we can switch between left and right common factors.

In the framework of scalar polynomials, two common factors (or, in general, two polynomials) are equivalent up to a constant factor. A similar property holds true in the matrix case: two matrix polynomials are equivalent up to multiplication with unimodular matrices.

Definition 2.2 (Unimodular matrix polynomials).

Let $U(\lambda)$ be a square matrix polynomial of dimension $m$. Then $U(\lambda)$ is a unimodular matrix polynomial if there exists an $m\times m$ matrix polynomial $V(\lambda)$ such that $V(\lambda)U(\lambda)=I$. Equivalently, $U(\lambda)$ is unimodular if $\det(U(\lambda))$ is a non-zero constant.
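As a small worked illustration of Definition 2.2 (the matrices below are our own choice, not taken from the paper), consider

$$U(\lambda)=\begin{pmatrix}1&\lambda\\ 0&1\end{pmatrix},\qquad V(\lambda)=\begin{pmatrix}1&-\lambda\\ 0&1\end{pmatrix},\qquad V(\lambda)U(\lambda)=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\qquad \det(U(\lambda))=1.$$

Both criteria of the definition are satisfied, so $U(\lambda)$ is unimodular. By contrast, $\operatorname{diag}(\lambda,1)$ has determinant $\lambda$, which is not constant, so it is not unimodular: its inverse $\operatorname{diag}(\lambda^{-1},1)$ is not a matrix polynomial.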

Definition 2.3 (Matrix polynomials equivalence).

Given two matrix polynomials $C_{1}(\lambda)$ and $C_{2}(\lambda)$ , they are equivalent if and only if there exist unimodular matrix polynomials $U(\lambda)$ , $V(\lambda)$ such that $C_{1}(\lambda)=U(\lambda)C_{2}(\lambda)V(\lambda)$ .

The following statement helps to decide whether a given matrix polynomial is unimodular: $U(\lambda)$ is a unimodular matrix polynomial if and only if it corresponds to a finite sequence of the following transformations:

1. interchange two columns: this is equivalent to multiplication with a permutation matrix;

2. multiply a column by a nonzero constant: this is equivalent to multiplication with a constant diagonal matrix;

3. replace the $i$-th column $c_{i}(\lambda)$ by $c_{i}(\lambda)+\lambda^{d}c_{j}(\lambda)$: this is equivalent to multiplication with a matrix polynomial equal to the identity except for the presence of $\lambda^{d}$ in position $(j,i)$;

4. all the previous transformations can be applied to the rows, and they correspond to a premultiplication with a suitable unimodular matrix.

Remark 2.2.

The set of equivalent common factors, according to Definition 2.3 and the last statement, is large, and it can be difficult to decide whether two given matrix polynomials are equivalent, even for small dimensions. To mitigate this problem we restrict, in the following, to the case of monic common factors (with some loss of generality, since we restrict to polynomials whose leading coefficient has full rank). This assumption is not fundamental, though; by removing it we can compute approximate common factors of given degree whose leading coefficient is not full rank.

2.1 Sylvester matrices for matrix polynomials

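Let $A$ and $B$ be $m\times m$ matrix polynomials of degree $n$, that is,

$$A(\lambda)=A_{0}+A_{1}\lambda+\dots+A_{n}\lambda^{n}\quad\text{with } A_{n}\neq 0,$$

$$B(\lambda)=B_{0}+B_{1}\lambda+\dots+B_{n}\lambda^{n}\quad\text{with } B_{n}\neq 0.$$

We assume $n>0$ and that the leading matrix coefficients $A_{n}$ and $B_{n}$ are invertible, so the determinants of $A(\lambda)$ and $B(\lambda)$ are not identically zero.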

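A useful tool for testing the coprimeness of polynomials is the Sylvester resultant. Its straightforward generalization to the matrix case is the following $2mn\times 2mn$ structured matrix

$$S(A,B)=\begin{pmatrix}A_{n}&\cdots&\cdots&A_{0}&&&\\ &A_{n}&\cdots&\cdots&A_{0}&&\\ &&\ddots&&&\ddots&\\ &&&A_{n}&\cdots&\cdots&A_{0}\\ B_{n}&\cdots&\cdots&B_{0}&&&\\ &B_{n}&\cdots&\cdots&B_{0}&&\\ &&\ddots&&&\ddots&\\ &&&B_{n}&\cdots&\cdots&B_{0}\end{pmatrix},$$

where the coefficients of $A$ fill the first $n$ block rows and those of $B$ the last $n$ block rows.

In [11] it has been shown that the key property of the classical Sylvester resultant does not carry over to matrix polynomials; in particular

$$\dim\ker(S(A,B))\geq\nu(A,B),\tag{2.1}$$

where $\nu(A,B)$ denotes the total common multiplicity of the common eigenvalues of $A$ and $B$. Example 2.1 shows that the inequality (2.1) can be strict.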

Example 2.1.

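Consider the two $2\times 2$ matrix polynomials of degree $1$

$$A(\lambda)=\begin{pmatrix}-1+\lambda&0\\ 1&-1+\lambda\end{pmatrix},\qquad B(\lambda)=\begin{pmatrix}\lambda&1\\ 0&\lambda-2\end{pmatrix}.$$

We deduce easily that $A(\lambda)=A_{0}+A_{1}\lambda$ and $B(\lambda)=B_{0}+B_{1}\lambda$ where

$$A_{0}=\begin{pmatrix}-1&0\\ 1&-1\end{pmatrix},\quad A_{1}=\begin{pmatrix}1&0\\ 0&1\end{pmatrix},\quad B_{0}=\begin{pmatrix}0&1\\ 0&-2\end{pmatrix},\quad B_{1}=\begin{pmatrix}1&0\\ 0&1\end{pmatrix}.$$

We have

$$S(A,B)\begin{pmatrix}1\\ -2\\ 1\\ -1\end{pmatrix}=\begin{pmatrix}A_{1}&A_{0}\\ B_{1}&B_{0}\end{pmatrix}\begin{pmatrix}1\\ -2\\ 1\\ -1\end{pmatrix}=\begin{pmatrix}1&0&-1&0\\ 0&1&1&-1\\ 1&0&0&1\\ 0&1&0&-2\end{pmatrix}\begin{pmatrix}1\\ -2\\ 1\\ -1\end{pmatrix}=\begin{pmatrix}0\\ 0\\ 0\\ 0\end{pmatrix},$$

so the kernel of the resultant has dimension (at least) $1$, but $\det(A(\lambda))=(\lambda-1)^{2}$ and $\det(B(\lambda))=\lambda(\lambda-2)$ have no common zeros, hence the matrices have no common eigenvalues.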
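The claim of Example 2.1 can be checked numerically. The sketch below is only illustrative: it assumes that the example polynomials are $A(\lambda)=\begin{pmatrix}\lambda-1&0\\ 1&\lambda-1\end{pmatrix}$ and $B(\lambda)=\begin{pmatrix}\lambda&1\\ 0&\lambda-2\end{pmatrix}$ and that the kernel vector is $(1,-2,1,-1)^{\top}$. The helper `det_roots` is a hypothetical name introduced here.

```python
import numpy as np

# Coefficients of A(lambda) = A0 + A1*lambda and B(lambda) = B0 + B1*lambda
# (assumed reading of Example 2.1, see the lead-in).
A0 = np.array([[-1.0, 0.0], [1.0, -1.0]])
A1 = np.eye(2)
B0 = np.array([[0.0, 1.0], [0.0, -2.0]])
B1 = np.eye(2)

# Classical Sylvester matrix for n = 1: one block row per polynomial.
S = np.block([[A1, A0], [B1, B0]])

x = np.array([1.0, -2.0, 1.0, -1.0])
residual = S @ x                              # vanishes if x is in the kernel
corank = S.shape[1] - np.linalg.matrix_rank(S)

def det_roots(P0, P1):
    """Roots of det(P0 + P1*lam) for 2x2 blocks, via quadratic interpolation."""
    lam = np.array([-1.0, 0.0, 1.0])
    vals = [np.linalg.det(P0 + t * P1) for t in lam]
    return np.roots(np.polyfit(lam, vals, 2))

roots_A = det_roots(A0, A1)                   # eigenvalues of A(lambda)
roots_B = det_roots(B0, B1)                   # eigenvalues of B(lambda)
gap = min(abs(a - b) for a in roots_A for b in roots_B)

print("S @ x =", residual)
print("corank of S:", corank)
print("smallest distance between eigenvalues of A and B:", gap)
```

A nonzero corank together with a clearly positive eigenvalue gap is exactly the situation in which the inequality (2.1) is strict.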

On the other hand, given $A(\lambda),B(\lambda)$ and $\lambda_{0}\in\mathbb{C}$, if there exists a vector $x_{0}\neq 0$ such that $A(\lambda_{0})x_{0}=0$ and $B(\lambda_{0})x_{0}=0$, then $\det(A(\lambda_{0}))=0$ and $\det(B(\lambda_{0}))=0$; but the converse is not true. Consequently, the common factors are not associated only with the common roots of the determinants of the matrix polynomials.

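In order to get equality in (2.1) we can consider a bigger Sylvester matrix [11]. Defining the resultant

$$S_{\ell}(A,B)=\begin{pmatrix}A_{n}&\cdots&\cdots&A_{0}&&&\\ &A_{n}&\cdots&\cdots&A_{0}&&\\ &&\ddots&&&\ddots&\\ &&&A_{n}&\cdots&\cdots&A_{0}\\ B_{n}&\cdots&\cdots&B_{0}&&&\\ &B_{n}&\cdots&\cdots&B_{0}&&\\ &&\ddots&&&\ddots&\\ &&&B_{n}&\cdots&\cdots&B_{0}\end{pmatrix},\tag{2.4}$$

in which the coefficients of $A$ and of $B$ each fill $\ell-n$ block rows, we have equality in (2.1) if $\ell\geq n(m+1)$; in the following we set $\ell=n(m+1)$.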

Example 2.2.

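Using (2.2), we deduce that, for the matrix polynomials of Example 2.1 and $\ell=n(m+1)=3$,

$$S_{3}(A,B)=\begin{pmatrix}A_{1}&A_{0}&\\ &A_{1}&A_{0}\\ B_{1}&B_{0}&\\ &B_{1}&B_{0}\end{pmatrix}=\begin{pmatrix}1&0&-1&0&0&0\\ 0&1&1&-1&0&0\\ 0&0&1&0&-1&0\\ 0&0&0&1&1&-1\\ 1&0&0&1&0&0\\ 0&1&0&-2&0&0\\ 0&0&1&0&0&1\\ 0&0&0&1&0&-2\end{pmatrix}$$

is full rank.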
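The construction of the generalized resultant can be sketched in a few lines. This is a minimal illustration, not the authors' code. The function name `sylvester_ell` and the convention of passing coefficients with the leading one first are our assumptions, and the coefficients are our reading of Example 2.1. With $\ell=2n$ the function returns the classical matrix $S(A,B)$.

```python
import numpy as np

def sylvester_ell(A, B, ell):
    """Generalized Sylvester matrix S_ell(A, B).

    A, B: lists [P_n, ..., P_0] of m x m coefficient arrays (leading first).
    Returns an array of shape (2*m*(ell-n), m*ell).
    """
    n = len(A) - 1
    m = A[0].shape[0]
    p = ell - n                           # block rows per polynomial
    S = np.zeros((2 * m * p, m * ell))
    for half, P in enumerate((A, B)):     # top half: A, bottom half: B
        for j in range(p):                # j-th shifted copy of [P_n ... P_0]
            for i, Pi in enumerate(P):
                r0 = m * (half * p + j)
                c0 = m * (j + i)
                S[r0:r0 + m, c0:c0 + m] = Pi
    return S

# Coefficients assumed for the polynomials of Example 2.1 (leading first).
A = [np.eye(2), np.array([[-1.0, 0.0], [1.0, -1.0]])]
B = [np.eye(2), np.array([[0.0, 1.0], [0.0, -2.0]])]

S2 = sylvester_ell(A, B, 2)               # classical resultant, ell = 2n
S3 = sylvester_ell(A, B, 3)               # ell = n(m+1) = 3

print("S2:", S2.shape, "rank", np.linalg.matrix_rank(S2))
print("S3:", S3.shape, "rank", np.linalg.matrix_rank(S3))
```

The smaller matrix is rank deficient although the polynomials are coprime, while the larger one has full column rank, in agreement with the text.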

Remark 2.3.

The definition of resultant in (2.4) refers to right common factors. If we deal with left common factors we need to transpose it.

2.2 Common factor approximation

In the past years several authors have proposed algorithms for the computation of an exact GCD of matrix polynomials. In practical applications, however, the coefficients can be inexact due to several sources of error. Given coprime matrix polynomials, we are interested in computing the smallest perturbation that makes them have a common factor of given degree.

Consider two coprime matrix polynomials $A(\lambda)$ and $B(\lambda)$. The problem is to compute a closest pair of matrix polynomials $\hat{A}(\lambda)$, $\hat{B}(\lambda)$ which has a nontrivial (exact) common factor of specified degree $d$. Such a common factor is called an approximate common factor for the matrices $A(\lambda)$ and $B(\lambda)$. In the following we assume that the coefficient matrices are real. The distance between two pairs of matrix polynomials is defined as follows:

$$\operatorname{dist}(\{A,B\},\{\hat{A},\hat{B}\})=\sum_{j=0}^{n}\|A_{j}-\hat{A}_{j}\|_{F}^{2}+\sum_{j=0}^{n}\|B_{j}-\hat{B}_{j}\|_{F}^{2},\tag{2.5}$$

where $A_{j}$ and $B_{j}$ denote the $j$-th (matrix) coefficient of the corresponding matrix polynomial. The formulation of the problem is the following:

Problem 2.1.

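Approximate left common factor problem for matrix polynomials.

Given two left coprime matrix polynomials $A=A(\lambda)$ and $B=B(\lambda)$ and a number $d\in\mathbb{N}$, compute

$$\inf_{\substack{\{\hat{A},\hat{B}\}:\ \exists\, C \text{ such that}\\ \hat{A}=C\bar{A},\ \hat{B}=C\bar{B},\\ C \text{ has degree } d}}\operatorname{dist}(\{A,B\},\{\hat{A},\hat{B}\}),$$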

where $A,B$ denote (with an abuse of notation) a matrix collecting the coefficients of the corresponding matrix polynomial, while the distance is the one defined in (2.5). The left common factor $C$ is an approximate common factor for the matrix polynomials $A$ and $B$ . The problem involving approximate right common factor is analogous.
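The objective of Problem 2.1 is simple to evaluate. The following sketch computes the distance (2.5), assuming it is the sum of squared Frobenius norms of the coefficient differences; the function name `dist` and the list-of-arrays representation are our own choices.

```python
import numpy as np

def dist(A, B, A_hat, B_hat):
    """Sum of squared Frobenius norms of the coefficient differences.

    Each argument is a list of coefficient matrices of one matrix polynomial.
    """
    pairs = zip(A + B, A_hat + B_hat)     # list concatenation, not addition
    return sum(np.linalg.norm(P - Q, "fro") ** 2 for P, Q in pairs)

# Toy data: perturb one coefficient of A by 0.1 * I and leave B unchanged.
A = [np.eye(2), np.zeros((2, 2))]
B = [np.eye(2), np.ones((2, 2))]
A_hat = [A[0] + 0.1 * np.eye(2), A[1]]
B_hat = [b.copy() for b in B]

value = dist(A, B, A_hat, B_hat)
print("dist =", value)
```

Note that, as written in (2.5), the quantity is a squared distance, so a perturbation of Frobenius norm $\delta$ contributes $\delta^{2}$.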

In the following sections we propose two algorithms for solving the nonconvex optimization Problem 2.1 by local optimization approaches. To the best of our knowledge there is no algorithm in the literature to compute its solution. Our proposals come from the generalization of two algorithms proposed in the scalar case: the subspace method [15] and an ODE-based algorithm [17]. We list for each algorithm the main points and properties, and we test their performance on some numerical examples.

3 Generalized subspace method for matrix polynomials

In this section we describe how we generalize the subspace method [15] to the computation of approximate common factors of matrix polynomials. The original algorithm for scalar polynomials is a powerful tool in the framework of GCD computation since it is simple to develop, easy to understand and convenient to implement. Moreover it is one of the first algorithms capable of dealing with noise-corrupted data. However, as shown in [17], the performance of the subspace method can be improved in terms of accuracy of the solution by other optimization methods. The basic idea of the algorithm is the fact that the information on the (approximate) common factors of a set of polynomials is in the null space of the associated resultant.

We briefly recall how the algorithm works for scalar polynomials, as described in [15]:

1. Build $S$, the Sylvester matrix of dimension $N(n+1)\times(2n+1)$ associated with the given data polynomials, where $N$ is the number of polynomials and $n$ is their degree.

2. (a) Compute

$$V_{0}=(V_{d},\dots,V_{1}),$$

the null space of $S$ ($d$ is the degree of the sought GCD). $V_{0}$ has $d$ columns.

(b) In order to extract the information about the GCD, reshape each column of $V_{0}$ into a Hankel matrix with $r=2n+1-d$ rows:

$$H_{i}=\begin{pmatrix}V_{i}(1)&V_{i}(2)&\dots&V_{i}(d+1)\\ V_{i}(2)&V_{i}(3)&\dots&V_{i}(d+2)\\ \vdots&\vdots&&\vdots\\ V_{i}(r)&V_{i}(r+1)&\dots&V_{i}(2n+1)\end{pmatrix},\qquad i=1,\dots,d.$$

3. Build the matrix

$$R=\sum_{i=1}^{d}H_{i}^{T}H_{i}$$

and extract the GCD from the eigenvector of $R$ corresponding to the smallest eigenvalue. The entries of this eigenvector are the coefficients of the common factor.
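The three steps above can be sketched as follows for scalar polynomials. This is a minimal illustration under our own conventions, not the implementation of [15]: coefficients are stored with the leading one first, each polynomial contributes $n+1$ shifted rows to $S$, and the names `shifted_rows` and `subspace_gcd` are hypothetical.

```python
import numpy as np

def shifted_rows(p, n_rows, n_cols):
    """Toeplitz block whose k-th row is the coefficient vector p shifted by k."""
    T = np.zeros((n_rows, n_cols))
    for k in range(n_rows):
        T[k, k:k + len(p)] = p
    return T

def subspace_gcd(polys, d):
    """Degree-d (approximate) GCD of polynomials of equal degree n.

    polys: list of coefficient arrays, leading coefficient first.
    Returns the GCD coefficients scaled to be monic.
    """
    n = len(polys[0]) - 1
    # Step 1: Sylvester matrix of size N(n+1) x (2n+1).
    S = np.vstack([shifted_rows(p, n + 1, 2 * n + 1) for p in polys])
    # Step 2a: the d right singular vectors of the smallest singular values.
    _, _, Vt = np.linalg.svd(S)
    V0 = Vt[-d:].T
    # Step 2b and 3: Hankel matrices with r rows, accumulated into R.
    r = 2 * n + 1 - d
    R = np.zeros((d + 1, d + 1))
    for v in V0.T:
        H = np.array([v[k:k + d + 1] for k in range(r)])
        R += H.T @ H
    _, Q = np.linalg.eigh(R)              # eigenvalues in ascending order
    c = Q[:, 0]                           # eigenvector of the smallest one
    return c / c[0]

# Toy data with exact common factor (x - 1)(x - 2) = x^2 - 3x + 2.
c_true = np.array([1.0, -3.0, 2.0])
a = np.polymul(c_true, [1.0, 1.0])        # times (x + 1)
b = np.polymul(c_true, [1.0, -5.0])       # times (x - 5)
c_est = subspace_gcd([a, b], d=2)
print("recovered common factor:", c_est)
```

With noisy coefficients the same code returns an approximate common factor, because the smallest singular vectors and the smallest eigenvector are still well defined.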

To generalize the method to matrix polynomials, we replace the scalar coefficients by matrices of dimension $m$, manipulating and reshaping the data in a suitable way. As in the scalar case, the algorithm works in the same way for exact and for approximate common factors. This leads to high computational speed but less accurate solutions. The main points of the algorithm are summarized in Algorithm 1.

The following theorem shows how the proposed algorithm works.

Theorem 3.1.

If the matrix $\mathcal{K}$ (3.1) is rank deficient, the subspace method computes a common factor between the data matrix polynomials. Otherwise, it computes an approximate common factor.

Proof.

We show the result about the computation of exact common factors only; the if statement follows from the possible presence of noise but the algorithm is exactly the same.

If $A(z)$ and $B(z)$ have a (right) common factor $C(z)$, the resultant $S_{\ell}(A,B)$ can be factored as $S_{\ell}(\bar{A},\bar{B})\tau(C)$. Moreover, we know that the resultant $S_{\ell}(A,B)$ has a non-trivial kernel (see Section 2.1), so we can write the following SVD factorization

$$S_{\ell}(A,B)=\begin{pmatrix}U_{r}&U_{0}\end{pmatrix}\begin{pmatrix}\Sigma_{r}&0\\ 0&0\end{pmatrix}\begin{pmatrix}V_{r}^{\top}\\ V_{0}^{\top}\end{pmatrix},$$

where $U_{r},\Sigma_{r},V_{r}$ correspond to the non-zero singular values/vectors. We notice that the rows of $\tau(C)$ and the rows of $V_{r}^{\top}$ span the same subspace. Then, because of the orthogonality between $V_{r}$ and $V_{0}$, the following equality holds true:

$$\tau(C)V_{0}=0.\tag{3.2}$$

Equation (3.2) has a unique solution for the common factor $C$ (up to multiplication by unimodular matrices, see Definition 2.2) [18]. Equation (3.2) can be written as

$$\sum_{i=1}^{md}\|\tau(C)V_{i}\|^{2}=0,\qquad V_{i}\in V_{0}.\tag{3.3}$$

Exploiting the Toeplitz structure of $\tau(C)$, equation (3.3) can be written as

$$\sum_{i=1}^{md}\|C\,H(V_{i})\|^{2}=0,\qquad V_{i}\in V_{0},$$

where $C$ is a matrix collecting the coefficients of the common factor (with an abuse of notation we use the same letter $C$), while $H(V_{i})$ is a mosaic Hankel matrix built from the entries of the vector $V_{i}$. Hence the entries of the matrix $C$, i.e. the coefficients of the sought common factor, can be recovered from the left null space of the matrix (3.1) $\mathcal{K}=[H(V_{md}),\dots,H(V_{1})]$.

∎
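The argument of the proof can be turned into a short numerical sketch for a right common factor. This is our reading of the construction, under stated assumptions, and not the authors' Algorithm 1: coefficients are stored leading first, $H(v)$ is built so that its $k$-th column stacks the $m$-blocks $v_{k},\dots,v_{k+d}$ of $v$, and the factor is normalized to be monic. All function names are hypothetical.

```python
import numpy as np

def sylvester_ell(A, B, ell):
    """S_ell(A, B) for coefficient lists [P_n, ..., P_0] (leading first)."""
    n, m = len(A) - 1, A[0].shape[0]
    p = ell - n
    S = np.zeros((2 * m * p, m * ell))
    for half, P in enumerate((A, B)):
        for j in range(p):
            for i, Pi in enumerate(P):
                r0, c0 = m * (half * p + j), m * (j + i)
                S[r0:r0 + m, c0:c0 + m] = Pi
    return S

def polymat_mul(P, Q):
    """Coefficients (leading first) of the product P(z) Q(z)."""
    m = P[0].shape[0]
    out = [np.zeros((m, m)) for _ in range(len(P) + len(Q) - 1)]
    for i, Pi in enumerate(P):
        for j, Qj in enumerate(Q):
            out[i + j] += Pi @ Qj
    return out

def subspace_right_factor(A, B, d, ell):
    """Monic right common factor of degree d from the kernel of S_ell."""
    m = A[0].shape[0]
    _, _, Vt = np.linalg.svd(sylvester_ell(A, B, ell))
    V0 = Vt[-m * d:].T                    # (approximate) kernel, m*d columns
    # K = [H(V_md), ..., H(V_1)]; column k of H(v) stacks blocks v_k..v_{k+d}.
    K = np.hstack([
        np.column_stack([v[m * k: m * (k + d + 1)] for k in range(ell - d)])
        for v in V0.T
    ])
    U, _, _ = np.linalg.svd(K)
    N = U[:, -m:].T                       # basis of the left null space of K
    C = np.linalg.solve(N[:, :m], N)      # rescale so the leading block is I
    return [C[:, m * i: m * (i + 1)] for i in range(d + 1)]

# Toy data: A = Abar * C and B = Bbar * C with a monic C of degree 1.
rng = np.random.default_rng(0)
m, n, d = 2, 2, 1
C_true = [np.eye(m), rng.standard_normal((m, m))]
Abar = [rng.standard_normal((m, m)) for _ in range(n - d + 1)]
Bbar = [rng.standard_normal((m, m)) for _ in range(n - d + 1)]
A, B = polymat_mul(Abar, C_true), polymat_mul(Bbar, C_true)

C_est = subspace_right_factor(A, B, d, ell=n * (m + 1))
print("recovered C_0:\n", C_est[1])
```

The example relies on the randomly generated cofactors being right coprime, which holds generically but is not checked by the code.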

Remark 3.1.

Given the matrices $A(z)$ and $B(z)$, the subspace method computes only an (approximate) common factor $C(z)$, not the polynomials $\hat{A}(z),\hat{B}(z)$ having $C(z)$ as common factor. To compute these polynomials we need to solve the least squares problem

$$\min_{\hat{A},\hat{B}}\|A-\hat{A}\|_{2}^{2}+\|B-\hat{B}\|_{2}^{2}=\min_{\bar{A},\bar{B}}\|A-C\bar{A}\|_{2}^{2}+\|B-C\bar{B}\|_{2}^{2},$$

where $C$ is the common factor computed by the algorithm.
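Since the objective of Remark 3.1 separates into one term for $A$ and one for $B$, each cofactor can be found by an ordinary linear least squares solve. The sketch below treats the left-factor form $A\approx C\bar{A}$ used in the remark. It assumes coefficients stored leading first, and the names `polymat_mul` and `cofactor_ls` are hypothetical.

```python
import numpy as np

def polymat_mul(P, Q):
    """Coefficients (leading first) of the product P(z) Q(z)."""
    out = [np.zeros((P[0].shape[0], Q[0].shape[1]))
           for _ in range(len(P) + len(Q) - 1)]
    for i, Pi in enumerate(P):
        for j, Qj in enumerate(Q):
            out[i + j] += Pi @ Qj
    return out

def cofactor_ls(A, C):
    """Least squares cofactor Abar with A ~ C * Abar, plus the fitted A_hat."""
    n, d = len(A) - 1, len(C) - 1
    m = C[0].shape[0]
    # Block Toeplitz matrix of C: stacked coefficients of C*Abar equal T @ X,
    # where X stacks the coefficients of Abar.
    T = np.zeros((m * (n + 1), m * (n - d + 1)))
    for i in range(n - d + 1):
        for j, Cj in enumerate(C):
            T[m * (i + j): m * (i + j + 1), m * i: m * (i + 1)] = Cj
    X, *_ = np.linalg.lstsq(T, np.vstack(A), rcond=None)
    Abar = [X[m * i: m * (i + 1)] for i in range(n - d + 1)]
    A_hat = np.split(T @ X, n + 1)
    return Abar, A_hat

# Toy data with an exact left factor, so the residual should vanish.
rng = np.random.default_rng(2)
C = [np.eye(2), rng.standard_normal((2, 2))]
Abar_true = [rng.standard_normal((2, 2)) for _ in range(2)]
A = polymat_mul(C, Abar_true)

Abar, A_hat = cofactor_ls(A, C)
residual = sum(np.linalg.norm(P - Q) ** 2 for P, Q in zip(A, A_hat))
print("squared residual:", residual)
```

The same call with $B$ in place of $A$ gives $\bar{B}$ and $\hat{B}$; for noisy data the residual is the contribution of that polynomial to the distance from the data.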

Remark 3.2.

(Computational cost) The advantage of the subspace method is that it is fast and cheap: its main computational cost consists of two SVDs.

Remark 3.3.

The proposed algorithm computes an (exact) common factor between $A(z)$ and $B(z)$ whenever it exists. If the data do not admit a common factor, the algorithm automatically computes an approximate common factor, but there are no differences from the computational point of view.

4 Generalized ODE-based method for matrix polynomials

The goal of this section is to generalize the algorithm proposed in [17] for scalar polynomials to the case of matrix polynomials. Even if some of the results stated in this section may look like small variations of those proposed in [17, 16], we remark that there are no algorithms in the literature which solve the considered problem. Moreover, by removing the assumption in Remark 2.2, we can change the objective functional in order to compute approximate common factors whose leading coefficient is rank deficient. A further difference with respect to the case of scalar polynomials is the computational strategy in the outer iteration.

4.1 General aspects

We first describe some tools and ideas that are useful for understanding how the algorithm works. As in the scalar case, the Sylvester resultant is a useful tool when we deal with coprimeness of matrix polynomials. We showed in Section 2.1 that, when the scalar coefficients are replaced by matrices, the corank (the dimension of the kernel) of the resultant no longer equals the degree of the common factor of the polynomials, as it does in the scalar case [19]. To overcome this issue, it is worth working with the modified resultant $S_{\ell}$ (2.4), since in this way we preserve the equality in (2.1).

We start with a full rank Sylvester matrix $S_{\ell}(A,B)$ and we want to perturb the coefficients of the polynomials (in a minimal way) so that the kernel of the associated resultant $S_{\ell}(\hat{A},\hat{B})$ has dimension $k=md$. This is done by iteratively adding to the matrix $S_{\ell}(A,B)$ a structured perturbation which minimizes the singular values of interest (the $k$ smallest singular values). The rank test on the Sylvester matrix is done by computing its SVD: a matrix has corank $k$ if and only if it has $k$ zero singular values. Since the singular values are ordered nonnegative real numbers, we can focus on minimizing only the $k$-th singular value $\sigma_{k}$ (the largest among the $k$ smallest). In particular, we write the perturbed matrix as $\hat{S}_{\ell}=S_{\ell}+\epsilon E$, where $\epsilon$ is a scalar measuring the norm of the perturbation, while $E$ is a matrix of unit Frobenius norm chosen so that $\epsilon E$ is the minimizer of $\sigma_{k}$ over the ball of matrices whose norm is at most $\epsilon$. In this way we can move $E$ and $\epsilon$ independently, minimizing the $k$-th singular value at one step and the norm of the perturbation at the other, until $\sigma_{k}=0$.

These ideas give rise to the following two-level algorithm: we iteratively consider a matrix of the form $S_{\ell}+\epsilon E$ and we update it on two independent levels:

• at the inner level we fix the value of $\epsilon$, and we minimize the functional $\sigma_{k}$ by looking for the stationary points of a system of ODEs for the matrix $E$;

• at the outer level, we vary the value of $\epsilon$ in order to compute the best possible solution.

Remark 4.1.

From the numerical point of view the functional $\sigma_{k}$ does not vanish, but it only reaches a fixed small tolerance.

4.2 Inner iteration

We now analyze the inner iteration of the algorithm, where the value of $\epsilon$ is fixed. The goal is to compute an optimal perturbation $E$ that minimizes the singular value $\sigma_{k}$ of the matrix $S_{\ell}+\epsilon E$ over the set of matrices $E$ of unit Frobenius norm. To do this we consider a smooth path of matrices $E(t)$ of unit Frobenius norm along which the singular value $\sigma_{k}$ of $S_{\ell}+\epsilon E(t)$ decreases. We exploit the following result about derivatives of eigenvalues for symmetric matrices [20].

Lemma 4.1.

Let $D(t)$ be a differentiable real symmetric matrix function for $t$ in a neighborhood of $0$, and let $\lambda(t)$ be an eigenvalue of $D(t)$ converging to a simple eigenvalue $\lambda_{0}$ of $D(0)$ as $t\rightarrow 0$. Let $x_{0}$ be a normalized eigenvector (s.t. $x_{0}^{\top}x_{0}=1$) of $D(0)$ associated with $\lambda_{0}$. Then the function $\lambda(t)$ is differentiable near $t=0$ with

$$\dot{\lambda}=x_{0}^{\top}\dot{D}x_{0}.$$
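A quick numerical sanity check of Lemma 4.1 compares the first-order eigenvalue derivative $x_{0}^{\top}\dot{D}x_{0}$ with a central finite difference. The sketch uses a randomly generated symmetric path $D(t)=D_{0}+tD_{1}$, which is our own test data, and tracks the smallest eigenvalue, assumed simple.

```python
import numpy as np

rng = np.random.default_rng(1)
D0 = rng.standard_normal((4, 4))
D0 = D0 + D0.T                            # symmetric D(0)
D1 = rng.standard_normal((4, 4))
D1 = D1 + D1.T                            # symmetric derivative dD/dt

def D(t):
    """Smooth symmetric matrix path with constant derivative D1."""
    return D0 + t * D1

# Normalized eigenvector of the smallest eigenvalue of D(0).
_, X = np.linalg.eigh(D(0.0))
x0 = X[:, 0]
predicted = x0 @ D1 @ x0                  # derivative given by Lemma 4.1

# Central finite difference of the smallest eigenvalue around t = 0.
h = 1e-6
finite_diff = (np.linalg.eigvalsh(D(h))[0] - np.linalg.eigvalsh(D(-h))[0]) / (2 * h)

print("lemma:", predicted, " finite difference:", finite_diff)
```

The two numbers agree up to discretization and rounding error as long as the tracked eigenvalue stays simple, which is the assumption of the lemma.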

Assuming that $E(t)$ is smooth we can apply Lemma 4.1 to the eigenvalues of the matrix $\hat{S}_{\ell}^{\top}(t)\hat{S}_{\ell}(t)=(S_{\ell}+\epsilon E(t))^{\top}(S_{\ell}+\epsilon E(t))$ , and we observe that the eigenvalues of $\hat{S}_{\ell}^{\top}\hat{S}_{\ell}$ are the squares of the singular values of $\hat{S}_{\ell}$ (we can assume the singular values are differentiable functions since, from the numerical point of view, we do not observe any coalescence among them). Omitting the time dependence, we find the following expression for the derivative of $\sigma_{k}$ :

[TABLE]

where $u,v$ are the singular vectors of $\hat{S}_{\ell}$ associated with $\sigma_{k}$; so the steepest descent direction for the functional $\sigma_{k}$ over the admissible set for $\dot{E}$ is attained by minimizing $u^{\top}\dot{E}v=\langle uv^{\top},\dot{E}\rangle$. We notice that $E\in\mathcal{S}$, and consequently $\dot{E}\in\mathcal{S}$, hence

$$\langle uv^{\top},\dot{E}\rangle=\langle P_{\mathcal{S}}(uv^{\top}),\dot{E}\rangle,$$

where the formula for the operator $P_{\mathcal{S}}$ (the projection of the argument onto the Sylvester structure) is given in the following lemma:

Lemma 4.2.

Let $\mathcal{S}$ be the set of generalized Sylvester matrices of dimension $m\ell\times 2m(\ell-n)$ , and let $H\in\mathbb{R}^{m\ell\times 2m(\ell-n)}$ be an arbitrary matrix. The orthogonal projection with respect to the Frobenius norm of $H$ onto $\mathcal{S}$ is given by (using Matlab notation for the rows/columns of the matrices)

$$P_{\mathcal{S}}(H)=S_{\ell}(P^{1},P^{2}),$$

where

[TABLE]

Proof.

Since the considered structured Sylvester matrices form a linear subspace and the basis matrices are orthogonal, the closest Sylvester matrix to a given matrix (in the Frobenius norm) is obtained by taking the inner products with the basis matrices (or, equivalently, by taking the average along the diagonals). ∎
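The averaging description in the proof can be sketched directly. The code below is an illustration under our own conventions: it uses the orientation of $S_{\ell}$ from Section 2.1 (shape $2m(\ell-n)\times m\ell$, so the matrix in Lemma 4.2 would be its transpose), stores coefficients leading first, and uses the hypothetical names `sylvester_ell` and `project_sylvester`.

```python
import numpy as np

def sylvester_ell(A, B, ell):
    """S_ell(A, B) for coefficient lists [P_n, ..., P_0] (leading first)."""
    n, m = len(A) - 1, A[0].shape[0]
    p = ell - n
    S = np.zeros((2 * m * p, m * ell))
    for half, P in enumerate((A, B)):
        for j in range(p):
            for i, Pi in enumerate(P):
                r0, c0 = m * (half * p + j), m * (j + i)
                S[r0:r0 + m, c0:c0 + m] = Pi
    return S

def project_sylvester(H, m, n, ell):
    """Closest matrix with Sylvester structure in the Frobenius norm.

    Each coefficient block is the average of the ell-n blocks of H that
    occupy its positions along the corresponding block diagonal.
    """
    p = ell - n
    P = [[np.zeros((m, m)) for _ in range(n + 1)] for _ in range(2)]
    for half in range(2):                 # 0: blocks of P^1, 1: blocks of P^2
        for i in range(n + 1):            # block diagonal offset
            for j in range(p):            # position along the diagonal
                r0, c0 = m * (half * p + j), m * (j + i)
                P[half][i] += H[r0:r0 + m, c0:c0 + m] / p
    return sylvester_ell(P[0], P[1], ell)

m, n, ell = 2, 1, 3
rng = np.random.default_rng(3)
H = rng.standard_normal((2 * m * (ell - n), m * ell))
PH = project_sylvester(H, m, n, ell)

# An orthogonal projection is idempotent and leaves a residual orthogonal
# to its image.
print("idempotent:", np.allclose(project_sylvester(PH, m, n, ell), PH))
print("<H - P(H), P(H)> =", np.sum((H - PH) * PH))
```

Entries of $H$ that lie outside the block band of the structure are simply set to zero by the projection.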

We underline that the projection $P_{\mathcal{S}}(uv^{\top})$ is different from zero for any pair of singular vectors $u,v$ associated to a non-zero singular value.

Lemma 4.3.

If $\sigma>0$ is a simple singular value of a matrix $\hat{S}_{\ell}$ with associated singular vectors $u$ and $v$ , we have

[TABLE]

Proof.

Assume, by contradiction, that we have $P_{\mathcal{S}}(uv^{\top})=0$ . Doing some computations, we get

[TABLE]

since $\sigma>0$ by assumption. Consequently (4.2) is a contradiction, and the claim follows. ∎

4.2.1 Minimization problem

We found in (4.1) the expression for the derivative of the singular value $\sigma_{k}$ of the Sylvester matrix $\hat{S}_{\ell}=S_{\ell}+\epsilon E$ . In order to compute the steepest descent direction for $\sigma_{k}$ we need to compute

[TABLE]

where the constraint on the norm is added in order to select a unique solution, since we look for a direction. The solution of (4.3) is given by:

[TABLE]

(the proof [16, Section 4.2] is based on the projection of an element of a Euclidean space onto the intersection of two linear subspaces). Consequently (4.4) is the key point of the inner iteration of the proposed algorithm. The following result shows its importance:

Theorem 4.1.

Let $E(t)\in\mathcal{S}$ be a matrix of unit Frobenius norm, which is a solution of (4.4). If $\sigma$ is the singular value of $\hat{S}_{\ell}=S_{\ell}+\epsilon E$ associated to the singular vectors $u,v$ , then $\sigma(t)$ is decreasing, i.e.

[TABLE]

Proof.

In the proof we show that $\dot{\sigma}\leq 0$. Recall that $\dot{\sigma}=u^{\top}\dot{E}v$ (up to constant factors). Exploiting (4.4) to replace $\dot{E}$, we have two terms: the first is

[TABLE]

which follows from the structure of $P_{\mathcal{S}}(uv^{\top})$ . The second is

[TABLE]

which follows from the Sylvester structure of $E$ . Summing the two terms with the correct signs we have

[TABLE]

since $\|E\|_{F}=1$ . ∎
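As a worked version of this argument, suppose (this is an assumption, modelled on the usual norm-constrained gradient flow) that (4.4) has the form $\dot{E}=-P_{\mathcal{S}}(uv^{\top})+\langle E,P_{\mathcal{S}}(uv^{\top})\rangle E$ with $\langle X,Y\rangle=\mathrm{trace}(X^{\top}Y)$ . Since $\dot{E}\in\mathcal{S}$ and $P_{\mathcal{S}}$ is an orthogonal projection, the computation would then read

$$
u^{\top}\dot{E}v=\langle uv^{\top},\dot{E}\rangle=\langle P_{\mathcal{S}}(uv^{\top}),\dot{E}\rangle=-\|P_{\mathcal{S}}(uv^{\top})\|_{F}^{2}+\langle E,P_{\mathcal{S}}(uv^{\top})\rangle^{2}\leq 0,
$$

where the inequality is the Cauchy-Schwarz inequality combined with $\|E\|_{F}=1$ . Under the same assumption, equality holds exactly when $E$ is a scalar multiple of $P_{\mathcal{S}}(uv^{\top})$ .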

Theorem 4.1 and Lemma 4.3 guarantee that the function $\sigma(t)$ is monotonically decreasing (for a fixed value of $\epsilon$ ). Therefore the stationary points of the ODE (corresponding to the zeros of $\dot{\sigma}$ ) are the candidate local minima for the functional under the considered constraints. The following corollary provides a rigorous characterization of minimizers.

Corollary 4.1.

Consider a solution of (4.4), and assume the corresponding singular value $\sigma>0$ . The following statements are equivalent:

1. $\dot{\sigma}=0$ ;

2. $\dot{E}=0$ ;

3. $E$ is a scalar multiple of $P_{\mathcal{S}}(uv^{\top})$ .

4.2.2 ODE integration

We discuss here how to compute the solution of the ODE (4.4). Since (4.4) is a constrained gradient system, the value of $\sigma_{k}$ is monotonically decreasing, as we can see in Figure 1.

The function evaluation required in the integration of the equation is expensive because it involves the computation of an SVD at each step (we need both the singular value and the corresponding singular vectors), so a suitable choice is the explicit Euler scheme. We summarize the pseudo-code in Algorithm 2. We remark that the performance of the code can change depending on the values of some parameters (which depend on the starting data).
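A minimal sketch of such an explicit Euler integration is given below. It is not Algorithm 2: it assumes that the constrained gradient system has the form $\dot{E}=-P_{\mathcal{S}}(uv^{\top})+\langle E,P_{\mathcal{S}}(uv^{\top})\rangle E$ , it uses Toeplitz matrices as a stand-in for the Sylvester structure, and the step-halving rule and all names (`euler_inner`, `smallest_triplet`, `h`, `tol`) are hypothetical choices made for illustration.

```python
import numpy as np

def project_toeplitz(M):
    """Stand-in for P_S: average along diagonals (Toeplitz structure, an assumption)."""
    rows, cols = M.shape
    P = np.empty_like(M, dtype=float)
    for k in range(-(rows - 1), cols):
        diag = np.diagonal(M, offset=k)
        i = np.arange(diag.size) + max(0, -k)
        P[i, i + k] = diag.mean()
    return P

def smallest_triplet(M):
    """Smallest singular value of M with its left and right singular vectors."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return s[-1], U[:, -1], Vt[-1, :]

def euler_inner(S, E0, eps, h=0.5, max_steps=200, tol=1e-9):
    """Explicit Euler steps on the assumed flow, keeping ||E||_F = 1."""
    E = project_toeplitz(E0)
    E /= np.linalg.norm(E)                      # Frobenius norm for 2-D arrays
    sigma, u, v = smallest_triplet(S + eps * E)
    history = [sigma]
    for _ in range(max_steps):
        G = project_toeplitz(np.outer(u, v))    # structured gradient direction
        E_dot = -G + np.sum(E * G) * E          # assumed right-hand side
        if np.linalg.norm(E_dot) < tol:         # (near) stationary point
            break
        accepted = False
        while h > 1e-12 and not accepted:       # halve the step until sigma decreases
            trial = E + h * E_dot
            trial /= np.linalg.norm(trial)
            s_new, u_new, v_new = smallest_triplet(S + eps * trial)
            if s_new < sigma:
                accepted = True
            else:
                h /= 2
        if not accepted:
            break
        E, sigma, u, v = trial, s_new, u_new, v_new
        history.append(sigma)
    return E, history

rng = np.random.default_rng(0)
S = project_toeplitz(rng.standard_normal((5, 5)))
E, history = euler_inner(S, rng.standard_normal((5, 5)), eps=0.1)
```

Each accepted step costs one SVD, which is the dominant expense discussed above; the recorded `history` is decreasing by construction because a step is accepted only when the singular value drops.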

4.3 Outer iteration

Once we integrate the ODE (4.4), we find its stationary point $E$ and the corresponding $\sigma_{k}$ . Since these quantities depend on a (fixed) value of $\epsilon$ we denote them as $\sigma_{k}(\epsilon),E(\epsilon)$ . The next step is to find the minimal value $\varepsilon$ (the norm of the perturbation to the original Sylvester matrix) which solves the problem $\sigma_{k}(\varepsilon)=0$ . Observe that the distance between the two matrices is given by $\varepsilon$ because of the relation $\hat{S}_{\ell}-S_{\ell}=\varepsilon E$ .

Depending on the initial value chosen for the matrix $E$ in the gradient system (4.4), increasing $\varepsilon$ can lead to unexpected trajectories of the function $\sigma_{k}(\varepsilon)$ , that is, $\sigma_{k}(\varepsilon)$ may fail to decrease. This behavior can be caused by a poor initialization of the ODE: if the value of $\varepsilon$ is increased without changing the perturbation $E(\varepsilon)$ in the initial datum, the equation can reach a stationary point before the objective functional decreases. In order to have a global decreasing property with respect to both the inner and the outer iteration, we can iteratively alternate the following dynamics:

1. Starting from the matrix $S_{\ell}+\hat{\varepsilon}\hat{E}$ , we integrate (for a given $\varepsilon>\hat{\varepsilon}$ ) the equation

[TABLE]

where $\hat{E}$ is the computed equilibrium of the ODE (4.4) corresponding to the value $\hat{\varepsilon}$ and $u,v$ are the singular vectors associated with $\sigma_{k}$ .

This equation is still a gradient system for the objective functional obtained from (4.4) by removing the constraints on the norm of $E$ . The solution is expected to increase in norm while the objective functional decreases, so we stop the integration of the equation when the norm of the perturbation $E$ reaches the level

[TABLE]

2. Starting from the solution computed in point 1 (applying a normalization $\|E\|_{F}=1$ ), we integrate (4.4) with initial datum $\varepsilon E$ (using Algorithm 2).

The idea behind this computational strategy is to start each iteration at the endpoint of the previous one, in a way that $\sigma_{k}(\epsilon)$ is continuous and monotonically decreasing with respect to $\epsilon$ . This is obtained by integrating the ODE (4.5) between two consecutive values of $\epsilon$ . The main body of this computational method is in Algorithm 3.

Remark 4.2.

(Computational cost) First of all we remark that the update of $\epsilon$ does not affect the computational cost, since it requires only one flop per iteration, and the two iteration levels (inner and outer) are independent. All the computations are performed at the inner level, i.e., during the integration of the gradient system. As described in this section, there are two different (alternating) dynamics: the unconstrained dynamic (4.5) and the constrained one (4.4). The integration of each equation is an iterative algorithm which performs one SVD per iteration until the stopping criterion is reached (see Algorithm 2). This decomposition is computed through the full factorization of the matrix, hence the number of flops is expected to be cubic in the dimension of the data matrix (a possible improvement is the subject of future work). However, it is not easy to estimate a priori the number of iterations needed by the integrator to reach convergence, and hence to predict the computational cost of the algorithm. As stated, the two iterations (inner and outer) are independent; however, poor accuracy in the inner iteration can also cause an inaccurate update of $\epsilon$ , and therefore a slowdown of the process.

4.4 GCD Computation

In this paragraph we discuss how to extract the GCD from the perturbed polynomials computed by the ODE-based algorithm proposed in Section 4. We saw in Remark 3.1 that given the GCD, we can obtain the polynomials $\hat{A}$ , $\hat{B}$ by solving a least squares problem, but here the problem is more difficult.

The first idea to compute the sought common factor from the non-coprime polynomials $\hat{A},\hat{B}$ is to apply a fast and computationally cheap algorithm (e.g. the subspace method proposed in Section 3).

Alternatively we can make use of an external function for (exact) GCD computation for matrix polynomials. A suitable function comes from the Polyx Toolbox (www.polyx.com), referring to the function grd.m or gld.m depending on the interest in computing a right or a left common factor, respectively.

The functions grd.m and gld.m

We briefly explain here how the two functions grd.m and gld.m from the Polyx Toolbox (www.polyx.com) work. We state the idea of the algorithm for right common factors computation, but dealing with left common factors has analogous counterparts.

Consider the matrix polynomials

[TABLE]

having the same number of columns $m_{N}$ , and define $N=\begin{bmatrix}N_{1}\\ N_{2}\end{bmatrix}$ . Consider the resultants $S_{w+\ell}(N_{1},N_{2})$ (as defined in (2.4)) for increasing $\ell=1,2,\dots$ . Each Sylvester matrix is then reduced to its shifted row echelon form by a Gaussian elimination algorithm without row permutations. According to [21] the last $m_{N}$ nonzero rows of $S_{\bar{\ell}}$ yield the coefficients of a greatest common right divisor of $N_{1},N_{2}$ , where $\bar{\ell}$ is defined as the smallest integer such that

[TABLE]
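The idea of reading a common divisor off the last nonzero rows of a row-reduced resultant can be seen in the scalar case. The sketch below is not the Polyx implementation: it works with scalar polynomials, uses exact rational arithmetic, and allows row swaps in the elimination (unlike the procedure described above). The function names are hypothetical.

```python
from fractions import Fraction

def sylvester(f, g):
    """Sylvester matrix of scalar polynomials f, g (coefficients, highest degree first)."""
    m, n = len(f) - 1, len(g) - 1
    size = m + n
    rows = []
    for i in range(n):   # n shifted copies of f
        rows.append([Fraction(0)] * i + [Fraction(c) for c in f]
                    + [Fraction(0)] * (size - m - 1 - i))
    for i in range(m):   # m shifted copies of g
        rows.append([Fraction(0)] * i + [Fraction(c) for c in g]
                    + [Fraction(0)] * (size - n - 1 - i))
    return rows

def rref(rows):
    """Reduced row echelon form by Gauss-Jordan elimination (with row swaps)."""
    A = [r[:] for r in rows]
    pivot_row = 0
    for col in range(len(A[0])):
        p = next((r for r in range(pivot_row, len(A)) if A[r][col] != 0), None)
        if p is None:
            continue
        A[pivot_row], A[p] = A[p], A[pivot_row]
        piv = A[pivot_row][col]
        A[pivot_row] = [x / piv for x in A[pivot_row]]
        for r in range(len(A)):
            if r != pivot_row and A[r][col] != 0:
                factor = A[r][col]
                A[r] = [x - factor * y for x, y in zip(A[r], A[pivot_row])]
        pivot_row += 1
        if pivot_row == len(A):
            break
    return A

def gcd_from_sylvester(f, g):
    """Monic GCD coefficients, read from the last nonzero row of the reduced matrix."""
    reduced = rref(sylvester(f, g))
    last = [r for r in reduced if any(x != 0 for x in r)][-1]
    start = next(i for i, x in enumerate(last) if x != 0)
    return last[start:]

# (x - 1)(x - 2) and (x - 1)(x + 3) share the factor x - 1.
common = gcd_from_sylvester([1, -3, 2], [1, 2, -3])
```

With exact data the last nonzero row is the lowest-degree polynomial in the row space, which is the monic GCD. With inexact data the rank decision inside the elimination becomes fragile, which is the source of the difficulties discussed next.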

However, recall that these functions are designed for exact GCD computation, whereas the polynomials computed by the proposed ODE-based algorithm do not have an exact GCD (the singular values of the resultant decrease to a small tolerance but do not reach zero). In particular, we can observe some of the following issues:

1. the computed GCD equals the identity (so the functions are not able to reveal the presence of a common factor);

2. the leading coefficient of the GCD is singular, while we always assume the common factors are monic (in particular the leading coefficient is full rank);

3. the computed GCD has degree higher than expected.

Most of the time none of these issues occurs, and in these cases the common factors computed by the function grd.m match the ones computed by the subspace method.

5 Numerical experiments

In this section, we consider the performance of the proposed Algorithms 1 and 3. As stated before, to our knowledge there is no term of comparison in the scientific literature, so the results of our algorithms are compared with the solutions obtained through the Matlab function fminsearch.

First of all we show a numerical example which highlights how the two Algorithms 1 and 3 work. We consider the following $2\times 2$ matrix polynomials of degree $2$

[TABLE]

We then generate the data $A(\lambda),B(\lambda)$ by perturbing all the coefficients with normally distributed random noise with zero mean and standard deviation $0.1$ . Starting from the noisy data, Algorithm 1 computes the following polynomials and the associated common factor:

[TABLE]

Starting from the same data, Algorithm 3 computes the following numerical solution:

[TABLE]

We observe that both the polynomials and the common factor computed by Algorithm 3 are closer to the noiseless data than the ones computed by Algorithm 1.

The outcome of the previous experiment is quite general, as we observe by running more examples with random data; for these we do not report the numerical values of the solutions.

We generate data polynomials having an exact common factor, and we add normally distributed perturbations multiplied by a constant (called noise level) in the interval $[0,1]$ in order to analyze the solution computed by the different approaches. We focus only on the values of the computed distances. In the following experiments we generate fifty perturbations (for a given value of standard deviation) and we plot the average distance computed by the different algorithms.
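The data generation described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' test script: the coefficient layout (an array of shape degree+1 by rows by columns, lowest degree first), the use of a left common factor $C(\lambda)$ with $A=C\bar{A}$ and $B=C\bar{B}$ , and the names `polymat_mul`, `polymat_eval` and `noisy_pair` are all choices made here.

```python
import numpy as np

def polymat_mul(C, A):
    """Coefficients of C(lam) @ A(lam); C[i] and A[j] multiply lam**i and lam**j."""
    out = np.zeros((C.shape[0] + A.shape[0] - 1, C.shape[1], A.shape[2]))
    for i in range(C.shape[0]):
        for j in range(A.shape[0]):
            out[i + j] += C[i] @ A[j]
    return out

def polymat_eval(P, lam):
    """Evaluate the matrix polynomial with coefficients P at the scalar lam."""
    return sum(P[k] * lam**k for k in range(P.shape[0]))

def noisy_pair(rng, size=2, deg_factor=1, deg_cofactor=2, noise_level=0.1):
    """Two matrix polynomials with an exact monic left factor, plus Gaussian noise."""
    C = rng.standard_normal((deg_factor + 1, size, size))
    C[-1] = np.eye(size)                                  # monic common factor
    A_bar = rng.standard_normal((deg_cofactor + 1, size, size))
    B_bar = rng.standard_normal((deg_cofactor + 1, size, size))
    A_exact, B_exact = polymat_mul(C, A_bar), polymat_mul(C, B_bar)
    A = A_exact + noise_level * rng.standard_normal(A_exact.shape)
    B = B_exact + noise_level * rng.standard_normal(B_exact.shape)
    return {"C": C, "A_exact": A_exact, "B_exact": B_exact, "A": A, "B": B}

data = noisy_pair(np.random.default_rng(0))
```

With the default arguments this produces $2\times 2$ polynomials of degree $3$ sharing a monic factor of degree one; averaging a computed distance over fifty calls with different random draws would mimic the experimental protocol.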

In Figure 2 we have two $2\times 2$ matrix polynomials of degree $3$ and we compute an approximate (monic) common factor of degree one.

From the graph we observe that the proposed ODE-based algorithm obtains more accurate solutions than the subspace method, as in the case of scalar polynomials [17]. Some comments on the minimization through the Matlab function fminsearch are in order. This function requires an initial approximation as input, so one may ask whether the performance observed in Figure 2 depends on a (possibly poor) initialization. In Figure 2 the initial estimate is the solution computed by the subspace method: it is not a bad choice, but not the best one either, since the (average) computed distances are larger than the ones computed by the ODE algorithm. If we initialize the function with the GCD computed by the proposed ODE-based algorithm, the solution computed by the Matlab minimization improves on the one obtained by the proposed method. In Figure 3 we show a similar numerical example, where we add the distances computed by the function fminsearch with different initializations (random, solution of the subspace method, solution of the ODE algorithm). We notice how the different initial estimates for the function fminsearch influence the accuracy of the obtained solution.

Remark 5.1.

(Computational time) The subspace method is very fast due to its low number of arithmetic operations. The proposed ODE-based algorithm is (on average) faster than the function fminsearch, whose performance depends on the initial estimate.

6 Applications in system and control theory

We show in this section an application of the proposed algorithms. It extends the computation of the distance to uncontrollability from Single-Input Single-Output (SISO) systems (presented in [1]) to Multi-Input Multi-Output (MIMO) systems. We recall, however, that any problem involving exact GCD computation for matrix polynomials can be seen as an approximate GCD computation problem whenever the coefficients are inexact, e.g. because they come from measurements or computations, or are affected by perturbations [22].

Controllability for LTI systems

Consider the linear time invariant system $\mathcal{B}$ defined by its state space representation

[TABLE]

where $A\in\mathbb{R}^{n\times n}$ , $B\in\mathbb{R}^{n\times m}$ , $C\in\mathbb{R}^{p\times n}$ , $D\in\mathbb{R}^{p\times m}$ . The classical notion of controllability for (6.1) is a property of the matrices $A,B$ and it is related to the rank of the matrix

[TABLE]

In particular the system (6.1) is state controllable if and only if the matrix $\mathcal{C}$ in (6.2) is full rank. This definition of controllability is not a property of the system, but of the matrices $A$ and $B$ ; consequently distance problems associated to the matrix $\mathcal{C}$ may not have a well-defined solution since the same system (6.1) can be represented by different parameters $(\hat{A},\hat{B},\hat{C},\hat{D})$ (for example choosing a different basis or considering a bigger state dimension).
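The dependence on the chosen parameters can be seen numerically. The sketch below assumes that $\mathcal{C}$ in (6.2) is the classical Kalman controllability matrix $[B\ AB\ \dots\ A^{n-1}B]$ ; the function names and the example system are hypothetical. A change of state basis leaves the rank unchanged but alters the smallest singular value of $\mathcal{C}$ , i.e. its distance to rank deficiency.

```python
import numpy as np

def controllability_matrix(A, B):
    """Kalman matrix [B, AB, ..., A^(n-1) B] (assumed form of the matrix in (6.2))."""
    blocks = [B]
    for _ in range(A.shape[0] - 1):
        blocks.append(A @ blocks[-1])
    return np.hstack(blocks)

def is_state_controllable(A, B):
    """Full row rank test on the Kalman matrix."""
    return np.linalg.matrix_rank(controllability_matrix(A, B)) == A.shape[0]

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

# Same system in another state basis: x_new = T x.
T = np.diag([1.0, 0.1])
A_new = T @ A @ np.linalg.inv(T)
B_new = T @ B

sigma_min = np.linalg.svd(controllability_matrix(A, B), compute_uv=False)[-1]
sigma_min_new = np.linalg.svd(controllability_matrix(A_new, B_new), compute_uv=False)[-1]
```

Here both representations are controllable, yet the smallest singular value changes from $1$ to $0.1$ , so a distance measured on $\mathcal{C}$ is a property of the representation rather than of the system.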

In order to avoid these issues we use the behavioral setting [23, 24, 25], where the notion of controllability is a property of the system and not of the parameters we choose for its representation. In this framework, the system (6.1) is viewed as the set of its trajectories. The controllability property is the possibility of concatenating any two trajectories, up to a delay of time.

Definition 6.1.

Let $\mathcal{B}$ be a time invariant dynamical system, which is a set of trajectories (vector valued functions of time). $\mathcal{B}$ is said to be controllable if for all $w_{1},w_{2}\in\mathcal{B}$ there exists a $T>0$ and a $w\in\mathcal{B}$ such that

[TABLE]

A system is uncontrollable if it is not controllable.

Any linear time invariant system admits a kernel representation [26]; hence given the system $\mathcal{B}$ , there is a polynomial matrix $R(z)=(P(z)\ Q(z))\in\mathbb{R}^{p\times(m+p)}$ such that

[TABLE]

where $\sigma$ is the shift operator (in the discrete case). The controllability property is related to the rank of the matrix polynomial $R(z)$ , and in particular we have the following Lemma [23]:

Lemma 6.1.

The system $\mathcal{B}$ is controllable (according to Definition 6.1) if and only if the polynomial matrix

[TABLE]

is left prime, i.e., $R(z)$ is full row rank for all $z$ .

Distance to uncontrollability

Alternatively to (6.3), a MIMO linear time invariant system can be represented by its input/output representation

[TABLE]

where we split the vector $w$ in (6.3) into two blocks (the inputs $u$ and the outputs $y$ ) and we partition the matrix $R=(Q\ \ -P)$ accordingly. As a consequence of Lemma 6.1 we have

Corollary 6.1.

[27] The presence of left common factors in $P$ and $Q$ leads to loss of controllability.

Let $\mathcal{L}_{uc}$ be the set of uncontrollable linear time invariant systems with $m\geq 1$ inputs and $p\geq 1$ outputs,

[TABLE]

and define the distance between two arbitrary systems by

[TABLE]

where the matrix polynomials are identified by a vector whose entries are their coefficients (footnote 1: the parameters $P$ and $Q$ which identify the system are not unique; in order to have a well-posed definition of distance we can assume $P$ to be monic, which however involves some loss of generality). The problem of computing the distance to uncontrollability is the following:

Problem 6.1.

Given a controllable system $\mathcal{B}(P,Q)$ , find

[TABLE]

In order to solve the non convex optimization Problem 6.1, we aim at perturbing the (left) coprime matrix polynomials $P$ and $Q$ in a minimal way till they have a (left) common factor of degree $1$ . The solution can be computed by the algorithm proposed in Section 4.

A detailed description supported by some numerical experiments is presented in [28].

7 Conclusions

We generalized two algorithms for computing approximate common factors from scalar to matrix polynomials. The first is a fast and computationally cheap algorithm which extracts the information about the common divisor from the resultant, while the second is a more accurate algorithm based on a two-level iteration, which looks for the stationary points of a gradient system associated with a suitable functional. We showed that the performance is similar to the scalar case, and we described how to use the algorithms for computing the distance to uncontrollability for a Multi-Input Multi-Output linear time-invariant system.

Acknowledgments

N. G. thanks the Italian INdAM GNCS (Gruppo Nazionale di Calcolo Scientifico) for financial support. I. M. received funding from the European Research Council (ERC) under the European Union’s Seventh Framework Programme (FP7/2007–2013) / ERC Grant agreement number 258581 “Structured low-rank approximation: Theory, algorithms, and applications” and Fund for Scientific Research Vlaanderen (FWO) projects G028015N “Decoupling multivariate polynomials in nonlinear system identification” and G090117N “Block-oriented nonlinear identification using Volterra series”; and Fonds de la Recherche Scientifique (FNRS) – FWO Vlaanderen under Excellence of Science (EOS) Project no 30468160 “Structured low-rank matrix / tensor approximation: numerical optimization-based algorithms and applications”. All the authors thank the anonymous reviewers and the Principal Editor for their comments and suggestions, which led to an improvement of the paper.

Bibliography


[1] I. Markovsky, A. Fazzi, N. Guglielmi, Applications of polynomial common factor computation in signal processing, in: Latent Variable Analysis and Signal Separation, Lecture Notes in Computer Science, Springer, 2018, pp. 99–106.
[2] I. Gohberg, P. Lancaster, L. Rodman, Matrix Polynomials, SIAM, 2009.
[3] E. N. Rosenwasser, B. P. Lampe, Multivariable Computer-controlled Systems, Communications and Control Engineering, Springer London, London, 2006.
[4] E. Emre, Nonsingular factors of polynomial matrices and $(A,B)$-invariant subspaces, SIAM J. Control Optim. 18 (1980) 288–296. doi:10.1137/0318020.
[5] G. D. Forney Jr., Minimal bases of rational vector spaces, with applications to multivariable linear systems, SIAM J. Control 13 (1975) 493–520. doi:10.1137/0313029.
[6] J. C. Basilio, B. Kouvaritakis, An algorithm for coprime matrix fraction description using Sylvester matrices, Linear Algebra Its Appl. 266 (1997) 107–125. doi:10.1016/S0024-3795(96)00636-2.
[7] C. C. Mac Duffee, The Theory of Matrices, Chelsea Publishing Company, New York, 1946.
[8] W. A. Wolovich, Linear Multivariable Systems, Vol. 11 of Applied Mathematical Sciences, Springer New York, New York, NY, 1974.
