Linearizing the Word Problem in (some) Free Fields

Konrad Schrempf

arXiv:1701.03378·math.RA·August 6, 2018·Int. J. Algebra Comput.

TL;DR

This paper presents a linear algebra-based method for solving the word problem in free fields using minimal linear representations, enabling rational identity testing and inverse construction.

Contribution

It introduces a linear algebra approach to the word problem that builds on the normal form of Cohn and Reutenauer for elements of free fields, together with a construction of minimal linear representations for the inverses of non-zero elements.

Findings

01. Effective solution to the word problem in free fields.

02. Method for testing rational identities.

03. Construction of minimal linear representations for inverses.

Abstract

We describe a solution of the word problem in free fields (coming from non-commutative polynomials over a commutative field) using elementary linear algebra, provided that the elements are given by minimal linear representations. It relies on the normal form of Cohn and Reutenauer and can be used more generally to (positively) test rational identities. Moreover we provide a construction of minimal linear representations for the inverse of non-zero elements.

Equations

$$\begin{bmatrix}1&x-1\\ .&1\end{bmatrix}s=\begin{bmatrix}.\\ 1\end{bmatrix},\quad s=\begin{bmatrix}1-x\\ 1\end{bmatrix}$$

$$\begin{bmatrix}1&x\\ .&1\end{bmatrix}s=\begin{bmatrix}.\\ 1\end{bmatrix},\quad s=\begin{bmatrix}-x\\ 1\end{bmatrix}$$

$$\mu\mathcal{A}_{f}=\bigl(u_{f},A_{f},\mu v_{f}\bigr).$$

$$\mathcal{A}_{f}+\mathcal{A}_{g}=\left(\begin{bmatrix}u_{f}&.\end{bmatrix},\begin{bmatrix}A_{f}&-A_{f}u_{f}^{\!\top}u_{g}\\ .&A_{g}\end{bmatrix},\begin{bmatrix}v_{f}\\ v_{g}\end{bmatrix}\right).$$

$$\mathcal{A}_{f}\cdot\mathcal{A}_{g}=\left(\begin{bmatrix}u_{f}&.\end{bmatrix},\begin{bmatrix}A_{f}&-v_{f}u_{g}\\ .&A_{g}\end{bmatrix},\begin{bmatrix}.\\ v_{g}\end{bmatrix}\right).$$

$$\mathcal{A}_{h}^{-1}=\left(\begin{bmatrix}1&.\end{bmatrix},\begin{bmatrix}-v_{h}&A_{h}\\ .&u_{h}\end{bmatrix},\begin{bmatrix}.\\ 1\end{bmatrix}\right).$$

$$\mu s_{f},\quad\begin{bmatrix}s_{f}+u_{f}^{\!\top}g\\ s_{g}\end{bmatrix},\quad\begin{bmatrix}s_{f}g\\ s_{g}\end{bmatrix}\quad\text{and}\quad\begin{bmatrix}h^{-1}\\ s_{h}h^{-1}\end{bmatrix}$$

$L(\mu\mathcal{A}_{f})$

$L(\mathcal{A}_{f}+\mathcal{A}_{g})$

$L(\mathcal{A}_{g})$

$R(\mu\mathcal{A}_{f})$

$R(\mathcal{A}_{f}+\mathcal{A}_{g})$

$R(\mathcal{A}_{f})$

$$L=\begin{bmatrix}c&u\\ v&A\end{bmatrix}\in\mathbb{K}\langle X\rangle^{m\times m}$$

$$\begin{bmatrix}A&B\\ C&D\end{bmatrix}^{-1}=\begin{bmatrix}.&.\\ .&D^{-1}\end{bmatrix}+\begin{bmatrix}I_{k}\\ -D^{-1}C\end{bmatrix}\bigl(A-BD^{-1}C\bigr)^{-1}\begin{bmatrix}I_{k}&-BD^{-1}\end{bmatrix}.$$

$$L=\begin{bmatrix}c&u\\ v&A\end{bmatrix}$$

$$\tilde{L}=\begin{bmatrix}.&\tilde{u}\\ \tilde{v}&\tilde{A}\end{bmatrix}\quad\text{with}\quad\tilde{A}=\begin{bmatrix}c&u&-1\\ v&A&.\\ -1&.&.\end{bmatrix},\quad\tilde{u}=[0,\dots,0,1],\quad\tilde{v}=\tilde{u}^{\!\top}$$

$$\tilde{A}^{-1}=\begin{bmatrix}L^{-1}&.\\ .&.\end{bmatrix}-\begin{bmatrix}-L^{-1}b^{\!\top}\\ 1\end{bmatrix}\bigl(bL^{-1}b^{\!\top}\bigr)^{-1}\begin{bmatrix}-bL^{-1}&1\end{bmatrix}.$$

$$\begin{aligned}-\tilde{u}\tilde{A}^{-1}\tilde{v}&=\bigl(bL^{-1}b^{\!\top}\bigr)^{-1}\\ &=\Biggl(b\left(\begin{bmatrix}.&.\\ .&A^{-1}\end{bmatrix}+\begin{bmatrix}1\\ -A^{-1}v\end{bmatrix}\bigl(c-uA^{-1}v\bigr)^{-1}\begin{bmatrix}1&-uA^{-1}\end{bmatrix}\right)b^{\!\top}\Biggr)^{-1}\\ &=\Biggl(\left(\begin{bmatrix}.&.\end{bmatrix}-\bigl(c-uA^{-1}v\bigr)^{-1}\begin{bmatrix}1&-uA^{-1}\end{bmatrix}\right)\begin{bmatrix}-1\\ .\end{bmatrix}\Biggr)^{-1}\\ &=c-uA^{-1}v.\end{aligned}$$

$$\begin{bmatrix}1&.&.&.\end{bmatrix},\quad\begin{bmatrix}1&-x&-y&.\\ .&1&.&-y\\ .&.&1&-x\\ .&.&.&1\end{bmatrix},\quad\begin{bmatrix}.\\ .\\ .\\ 1\end{bmatrix}.$$

$$L_{xy+yx}^{\prime}=\begin{bmatrix}.&.&.&.&1\\ .&.&y&x&-1\\ .&y&.&-1&.\\ .&x&-1&.&.\\ 1&-1&.&.&.\end{bmatrix}$$

$$L_{xy+yx}=\begin{bmatrix}.&y&x\\ y&.&-1\\ x&-1&.\end{bmatrix}.$$

$$L_{\mathcal{R}}=\begin{bmatrix}D&C\\ B&A\end{bmatrix}\in\mathbb{K}\langle X\rangle^{(p+n)\times(q+n)}.$$

$$L=\begin{bmatrix}.&u_{f}&u_{g}\\ v_{f}&A_{f}&.\\ v_{g}&.&-A_{g}\end{bmatrix},$$

$$\mathbb{K}[\alpha,\beta]=\mathbb{K}[\alpha_{1,1},\dots,\alpha_{1,n},\alpha_{2,1},\dots,\alpha_{2,n},\dots,\alpha_{n,1},\dots,\alpha_{n,n},\beta_{1,1},\dots,\beta_{1,n},\beta_{2,1},\dots,\beta_{2,n},\dots,\beta_{n,1},\dots,\beta_{n,n}].$$

$$u=\begin{bmatrix}*&u^{\prime}&.\end{bmatrix},\quad A=\begin{bmatrix}*&.&.\\ *&A^{\prime}&.\\ *&*&*\end{bmatrix}\quad\text{and}\quad v=\begin{bmatrix}.\\ v^{\prime}\\ *\end{bmatrix}.$$

$$A=\begin{bmatrix}A_{f}&-A_{f}u_{f}^{\!\top}u_{g}\\ .&A_{g}\end{bmatrix},\quad s=\begin{bmatrix}s_{f}-u_{f}^{\!\top}g\\ -s_{g}\end{bmatrix}\quad\text{and}\quad v=\begin{bmatrix}v_{f}\\ -v_{g}\end{bmatrix}.$$

$$P=\begin{bmatrix}I_{n_{f}}&T\\ .&I_{n_{g}}\end{bmatrix}\quad\text{and}\quad Q=\begin{bmatrix}I_{n_{f}}&-U\\ .&I_{n_{g}}\end{bmatrix}$$

$$A^{\prime}=\begin{bmatrix}A_{f}&-A_{f}u_{f}^{\!\top}u_{g}+TA_{g}\\ .&A_{g}\end{bmatrix}\begin{bmatrix}I_{n_{f}}&-U\\ .&I_{n_{g}}\end{bmatrix}$$

Full text

Linearizing the Word Problem in (some) Free Fields

Konrad Schrempf

Contact: [email protected], Universität Wien, Fakultät für Mathematik, Oskar-Morgenstern-Platz 1, 1090 Wien, Austria. Supported by the Austrian FWF Project P25510-N26 “Spectra on Lamplighter groups and Free Probability”.

Abstract

We describe a solution of the word problem in free fields (coming from non-commutative polynomials over a commutative field) using elementary linear algebra, provided that the elements are given by minimal linear representations. It relies on the normal form of Cohn and Reutenauer and can be used more generally to (positively) test rational identities. Moreover we provide a construction of minimal linear representations for the inverse of non-zero elements.

Keywords: word problem, minimal linear representation, linearization, realization, admissible linear system, rational series

AMS Classification: 16K40, 03B25, 16S10, 15A22

Introduction

Free (skew) fields arise as universal objects when it comes to embedding the ring of non-commutative polynomials, that is, polynomials in (a finite number of) non-commuting variables, into a skew field [Coh85, Chapter 7]. The notion of “free fields” goes back to Amitsur [Ami66]. A brief introduction can be found in [Coh03, Section 9.3]; for details we refer to [Coh95, Section 6.4]. In the present paper we restrict the setting to commutative ground fields, as a special case. See also [Rob84]. In [CR94], Cohn and Reutenauer introduced a normal form for elements in free fields in order to extend results from the theory of formal languages. In particular, they characterize minimality of linear representations in terms of linear independence of the entries of a column and a row vector, generalizing the concept of “controllability and observability matrix” (Section 3).

It is difficult to solve the word problem, that is, given linear representations (LRs for short) of two elements $f$ and $g$, to decide whether $f=g$. In [CR99] the authors describe an answer to this question (for free fields coming from non-commutative polynomials over a commutative field). In practice, however, this technique (using Gröbner bases) can be impractical even for representations of small dimensions.

Fortunately, it turns out that the word problem is equivalent to the solvability of a linear system of equations if both elements are given by minimal linear representations. Constructions of the latter are known for regular elements (non-commutative rational series), but in general non-linear techniques are necessary. This is considered in future work. Here we present a simple construction of minimal LRs for the inverses of arbitrary non-zero elements given by minimal LRs. In particular this applies to the inverses of non-zero polynomials with vanishing constant coefficient (which are not regular anymore). This is of interest especially for those polynomials which are identities [AL50], for example $xy-yx$ (which vanishes identically on commutative rings).

In any case, positive testing of rational identities becomes easy. Furthermore, the implementation in computer (algebra) software needs only a basic data structure for matrices (linear matrix pencil) and an exact solver for linear systems.

Section 1 introduces the required notation concerning linear representations and admissible linear systems in free fields. Rational operations on representation level are formulated and the related concepts of linearization and realization are briefly discussed. Section 2 describes the word problem. Theorem 2.4 shows that the (in general non-linear) problem of finding appropriate transformation matrices can be reduced to a linear system of equations if the given LRs are minimal. Examples can be constructed for regular elements (rational series) as special cases (of elements in the free field), which are summarized in Section 3. Here algorithms for obtaining minimal LRs are already known. Section 4 provides a first step in the construction of minimal LRs (with linear techniques), namely for the inverses of non-zero elements that are themselves given by minimal linear representations. This is formulated in Theorem 4.20.

The main result is Theorem 2.4, the “linear” word problem. Although it is rather elementary, it opens the possibility to work directly on linear representations (instead of the spaces they “span”). Or, using Bergman’s words [Ber78]: “The main results in this paper are trivial. But what is trivial when described in the abstract can be far from clear in the context of a complicated situation where it is needed.”

1 Representing Elements

Although there are several ways for representing elements in (a subset of) the free field (linear representation [CR99], linearization [Coh85], realization [HMV06], proper linear system [SS78], etc.) the concept of a linear representation seems to be the most convenient. It has the advantage (among others) that in the special case of regular elements, the general definition of the rank coincides with the Hankel rank [Fli74], [SS78, Section II.3].

Closely related to LRs are admissible linear systems (ALS for short) [Coh85], which could be seen as a special case. Both notions will be used synonymously. Depending on the context an ALS will be written as a triple, for example $\mathcal{A}=(u,A,v)$, or as a linear system $As=v$, sometimes as $u=tA$. Like the rational operations defined on linear representation level [CR99], similar constructions can be done easily on ALS level. Thus, starting from systems for monomials (Proposition 4.1) only, a representation for each element in the free field can be constructed recursively.

Notation. Zero entries in matrices are usually replaced by (lower) dots to stress the structure of the non-zero entries unless they result from transformations where there were possibly non-zero entries before. We denote by $I_{n}$ the identity matrix and $\Sigma_{n}$ the permutation matrix that reverses the order of rows/columns of size $n$ . If the size is clear from the context, $I$ and $\Sigma$ are used respectively.

Let $\mathbb{K}$ be a commutative field and $X=\{x_{1},x_{2},\ldots,x_{d}\}$ be a finite alphabet. $\mathbb{K}\langle X\rangle$ denotes the free associative algebra (or “algebra of non-commutative polynomials”) and $\mathbb{K}(\!\langle X\rangle\!)$ denotes the universal field of fractions (or “free field”) of $\mathbb{K}\langle X\rangle$ [Coh95], [CR99]. In the examples the alphabet is usually $X=\{x,y,z\}$ .

Definition 1.1 (Inner Rank, Full Matrix, Hollow Matrix [Coh85], [CR99]).

Given a matrix $A\in\mathbb{K}\langle X\rangle^{n\times n}$ , the inner rank of $A$ is the smallest number $m\in\mathbb{N}$ such that there exists a factorization $A=TU$ with $T\in\mathbb{K}\langle X\rangle^{n\times m}$ and $U\in\mathbb{K}\langle X\rangle^{m\times n}$ . The matrix $A$ is called full if $m=n$ , non-full otherwise. It is called hollow if it contains a zero submatrix of size $k\times l$ with $k+l>n$ .
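A small illustration of these notions may help; the matrix below is an example chosen here and is not taken from the paper. Over $\mathbb{K}\langle x,y\rangle$ it contains a zero block of size $1\times 2$ with $1+2>2$, so it is hollow, and the displayed factorization through a $2\times 1$ and a $1\times 2$ matrix shows that its inner rank is $1$, so it is not full.

```latex
\begin{bmatrix} . & . \\ x & y \end{bmatrix}
  = \begin{bmatrix} . \\ 1 \end{bmatrix}
    \begin{bmatrix} x & y \end{bmatrix}
```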

Definition 1.2 (Associated and Stably Associated Matrices [Coh95]).

Two matrices $A$ and $B$ over $\mathbb{K}\langle X\rangle$ (of the same size) are called associated over a subring $R\subseteq\mathbb{K}\langle X\rangle$ if there exist invertible matrices $P,Q$ over $R$ such that $A=PBQ$ . $A$ and $B$ (not necessarily of the same size) are called stably associated if $A\oplus I_{p}$ and $B\oplus I_{q}$ are associated for some unit matrices $I_{p}$ and $I_{q}$ . Here by $C\oplus D$ we denote the diagonal sum $\bigl{[}\begin{smallmatrix}C&.\\ .&D\end{smallmatrix}\bigr{]}$ .

In general it is hard to decide whether a matrix is full or not. For a linear matrix, that is, a matrix of the form $A=A_{0}\otimes 1+A_{1}\otimes x_{1}+\ldots+A_{d}\otimes x_{d}$ with $A_{\ell}$ over $\mathbb{K}$ , the following criterion is known, which is used in (the proof of) Theorem 2.1. If a matrix over $\mathbb{K}\langle X\rangle$ is not linear, then Higman’s trick [Coh85, Section 5.8] can be used to linearize it by enlargement. The inner rank is also discussed in [FR04].

Lemma 1.3 ([Coh95, Corollary 6.3.6]).

A linear square matrix over $\mathbb{K}\langle X\rangle$ which is not full is associated over $\mathbb{K}$ to a linear hollow matrix.

Definition 1.4 (Linear Representations [CR94]).

Let $f\in\mathbb{K}(\!\langle X\rangle\!)$ . A linear representation of $f$ is a triple $(u,A,v)$ with $u\in\mathbb{K}^{1\times n}$ , $A=A_{0}\otimes 1+A_{1}\otimes x_{1}+\ldots+A_{d}\otimes x_{d}$ , $A_{\ell}\in\mathbb{K}^{n\times n}$ and $v\in\mathbb{K}^{n\times 1}$ such that $A$ is full, that is, $A$ is invertible over the free field $\mathbb{K}(\!\langle X\rangle\!)$ , and $f=uA^{-1}v$ . The dimension of the representation is $\dim\,(u,A,v)=n$ . It is called minimal if $A$ has the smallest possible dimension among all linear representations of $f$ . The “empty” representation $\pi=(,,)$ is the minimal representation for $0\in\mathbb{K}(\!\langle X\rangle\!)$ with $\dim\pi=0$ .

Remark. In Definition 1.15 it can be seen that $f=uA^{-1}v$ is (up to sign) the Schur complement of the linearization $\bigl{[}\begin{smallmatrix}0&u\\ v&A\end{smallmatrix}\bigr{]}$ with respect to the upper left $1\times 1$ block.

Definition 1.5 ([CR99]).

Two linear representations are called equivalent if they represent the same element.

Definition 1.6 (Rank [CR99]).

Let $f\in\mathbb{K}(\!\langle X\rangle\!)$ and $\pi$ be a minimal representation of $f$ . Then the rank of $f$ is defined as $\operatorname{rank}f=\dim\pi$ .

Remark. The connection to the related concepts of inversion height and depth can be found in [Reu96], namely inversion height $\leq$ depth $\leq$ rank. Additional discussion about the depth appears in [Coh06, Section 7.7].

Definition 1.7.

Let $M=M_{1}\otimes x_{1}+\ldots+M_{d}\otimes x_{d}$ . An element in $\mathbb{K}(\!\langle X\rangle\!)$ is called regular, if it has a linear representation $(u,A,v)$ with $A=I-M$ , that is, $A_{0}=I$ in Definition 1.4, or equivalently, if $A_{0}$ is regular (invertible).

Definition 1.8 (Left and Right Families [CR94]).

Let $\pi=(u,A,v)$ be a linear representation of $f\in\mathbb{K}(\!\langle X\rangle\!)$ of dimension $n$ . The families $(s_{1},s_{2},\ldots,s_{n})\subseteq\mathbb{K}(\!\langle X\rangle\!)$ with $s_{i}=(A^{-1}v)_{i}$ and $(t_{1},t_{2},\ldots,t_{n})\subseteq\mathbb{K}(\!\langle X\rangle\!)$ with $t_{j}=(uA^{-1})_{j}$ are called left family and right family respectively. $L(\pi)=\operatorname{span}\{s_{1},s_{2},\ldots,s_{n}\}$ and $R(\pi)=\operatorname{span}\{t_{1},t_{2},\ldots,t_{n}\}$ denote their linear spans.

Remark. The left family $(A^{-1}v)_{i}$ (respectively the right family $(uA^{-1})_{j}$ ) and the solution vector $s$ of $As=v$ (respectively $t$ of $u=tA$ ) will be used synonymously.

Proposition 1.9 ([CR94], Proposition 4.7).

A representation $\pi=(u,A,v)$ of an element $f\in\mathbb{K}(\!\langle X\rangle\!)$ is minimal if and only if both, the left family and the right family, are $\mathbb{K}$ -linearly independent.

Definition 1.10 (Admissible Linear Systems [Coh72]).

A linear representation $\mathcal{A}=(u,A,v)$ of $f\in\mathbb{K}(\!\langle X\rangle\!)$ is called admissible linear system (for $f$ ), denoted by $As=v$ , if $u=e_{1}=[1,0,\ldots,0]$ . The element $f$ is then the first component of the (unique) solution vector $s$ .

Remark. In [Coh85], Cohn defines admissible linear systems with $v=v_{0}\otimes 1+v_{1}\otimes x_{1}+\ldots+v_{d}\otimes x_{d}$ with $v_{i}\in\mathbb{K}^{n\times 1}$ , and $u=[0,\ldots,0,1]$ . Writing $B=[-v,A]$ as block of size $n\times(n+1)$ the first $n$ columns of $B$ serve as numerator and the last $n$ columns of $B$ as denominator. However, in this setting, for regular elements, the dimension of such a minimal system could differ from the Hankel rank [Fli74], [SS78, Section II.3].

Definition 1.11 (Admissible Transformations).

Given a linear representation $\mathcal{A}=(u,A,v)$ of dimension $n$ of $f\in\mathbb{K}(\!\langle X\rangle\!)$ and invertible matrices $P,Q\in\mathbb{K}^{n\times n}$ , the transformed $P\mathcal{A}Q=(uQ,PAQ,Pv)$ is again a linear representation (of $f$ ). If $\mathcal{A}$ is an ALS, the transformation $(P,Q)$ is called admissible if the first row of $Q$ is $e_{1}=[1,0,\ldots,0]$ .

Remark 1.12 (Elementary Transformations).

In practice, transformations can be done by elementary row- and column operations (with respect to the system matrix $A$ ). If we add $\alpha$ -times row $i$ to row $j\neq i$ in $A$ , we also have to do this in $v$ . If we add $\beta$ -times column $i$ to column $j\neq i$ we have to subtract $\beta$ -times row $j$ from row $i$ in $s$ . Since it is not allowed to change the first entry of $s$ , column 1 cannot be used to eliminate entries in other columns! As an example consider the ALS

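$$\begin{bmatrix}1&x-1\\ .&1\end{bmatrix}s=\begin{bmatrix}.\\ 1\end{bmatrix},\quad s=\begin{bmatrix}1-x\\ 1\end{bmatrix}$$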

for the element $1-x\in\mathbb{K}(\!\langle X\rangle\!)$ . Adding column 1 to column 2, that is, $Q=\bigl{[}\begin{smallmatrix}1&1\\ .&1\end{smallmatrix}\bigr{]}$ (and $P=I$ ), yields the ALS

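$$\begin{bmatrix}1&x\\ .&1\end{bmatrix}s=\begin{bmatrix}.\\ 1\end{bmatrix},\quad s=\begin{bmatrix}-x\\ 1\end{bmatrix}$$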

for the element $-x\neq 1-x$ .
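The effect of this inadmissible column operation can be checked numerically. The following Python sketch is an illustration only and not part of the paper: it stores the linear pencil $A=A_{0}+A_{1}x$ as a list of coefficient matrices, substitutes a rational number for $x$, and solves the resulting scalar system exactly. The helper names `solve`, `evaluate` and `right_multiply` are hypothetical.

```python
from fractions import Fraction

def solve(M, b):
    """Solve the square system M s = b exactly (Gauss-Jordan).
    Assumes M is invertible."""
    n = len(b)
    rows = [[Fraction(x) for x in M[i]] + [Fraction(b[i])] for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if rows[i][c] != 0)  # pivot row
        rows[c], rows[p] = rows[p], rows[c]
        piv = rows[c][c]
        rows[c] = [x / piv for x in rows[c]]
        for i in range(n):
            if i != c and rows[i][c] != 0:
                fac = rows[i][c]
                rows[i] = [x - fac * y for x, y in zip(rows[i], rows[c])]
    return [row[-1] for row in rows]

def evaluate(pencil, v, x0):
    """First component of the solution of A(x0) s = v, with A = A0 + A1*x."""
    A0, A1 = pencil
    n = len(v)
    M = [[A0[i][j] + x0 * A1[i][j] for j in range(n)] for i in range(n)]
    return solve(M, v)[0]

def right_multiply(pencil, Q):
    """Apply the column transformation A -> A Q to each coefficient matrix."""
    n = len(Q)
    return [[[sum(Al[i][k] * Q[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)] for Al in pencil]

# ALS for 1 - x: A = [[1, x - 1], [0, 1]], v = (0, 1)
A = [[[1, -1], [0, 1]], [[0, 1], [0, 0]]]
v = [0, 1]
Q = [[1, 1], [0, 1]]      # adds column 1 to column 2 (not admissible)
x0 = Fraction(3)

before = evaluate(A, v, x0)                     # value of 1 - x at x = 3
after = evaluate(right_multiply(A, Q), v, x0)   # value of -x at x = 3
print(before, after)
```

Evaluation at a commuting scalar only gives a sanity check, but here it already separates the two elements, since $1-x$ and $-x$ take different values at $x=3$.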

Proposition 1.13 (Rational Operations [CR99, Section 1]).

Let $f,g,h\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the admissible linear systems $\mathcal{A}_{f}=(u_{f},A_{f},v_{f})$ , $\mathcal{A}_{g}=(u_{g},A_{g},v_{g})$ and $\mathcal{A}_{h}=(u_{h},A_{h},v_{h})$ respectively, with $h\neq 0$ and let $\mu\in\mathbb{K}$ . Then admissible linear systems for the rational operations can be obtained as follows:

The scalar multiplication $\mu f$ is given by

$$\mu\mathcal{A}_{f}=\bigl(u_{f},A_{f},\mu v_{f}\bigr).$$

The sum $f+g$ is given by

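$$\mathcal{A}_{f}+\mathcal{A}_{g}=\left(\begin{bmatrix}u_{f}&.\end{bmatrix},\begin{bmatrix}A_{f}&-A_{f}u_{f}^{\!\top}u_{g}\\ .&A_{g}\end{bmatrix},\begin{bmatrix}v_{f}\\ v_{g}\end{bmatrix}\right).$$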

The product $fg$ is given by

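$$\mathcal{A}_{f}\cdot\mathcal{A}_{g}=\left(\begin{bmatrix}u_{f}&.\end{bmatrix},\begin{bmatrix}A_{f}&-v_{f}u_{g}\\ .&A_{g}\end{bmatrix},\begin{bmatrix}.\\ v_{g}\end{bmatrix}\right).$$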

And the inverse $h^{-1}$ is given by

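$$\mathcal{A}_{h}^{-1}=\left(\begin{bmatrix}1&.\end{bmatrix},\begin{bmatrix}-v_{h}&A_{h}\\ .&u_{h}\end{bmatrix},\begin{bmatrix}.\\ 1\end{bmatrix}\right).$$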

Remark. One can easily verify that the solution vectors for the admissible linear systems defined above are

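$$\mu s_{f},\quad\begin{bmatrix}s_{f}+u_{f}^{\!\top}g\\ s_{g}\end{bmatrix},\quad\begin{bmatrix}s_{f}g\\ s_{g}\end{bmatrix}\quad\text{and}\quad\begin{bmatrix}h^{-1}\\ s_{h}h^{-1}\end{bmatrix}$$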

respectively. It remains to check that the system matrices are full. For the sum and the product this is clear from the fact that the free associative algebra —being a free ideal ring (FIR)— has unbounded generating number (UGN) and therefore the diagonal sum of full matrices is full, see [Coh85, Section 7.3]. The system matrix for the inverse is full because $h\neq 0$ and therefore the linearization of $\mathcal{A}_{h}$ is full, compare [Coh95, Section 4.5].
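The block constructions for sum, product and inverse can be sketched in Python as follows. This is an illustration under stated assumptions rather than the author's implementation: a pencil is stored as a list of coefficient matrices (constant term first, then one matrix per letter), $u=e_{1}$ is implicit, and the result is only sanity-checked by substituting commuting rational numbers for the letters. All function names are hypothetical.

```python
from fractions import Fraction

def solve(M, b):
    """Solve the square system M s = b exactly; assumes M is invertible."""
    n = len(b)
    rows = [[Fraction(x) for x in M[i]] + [Fraction(b[i])] for i in range(n)]
    for c in range(n):
        p = next(i for i in range(c, n) if rows[i][c] != 0)
        rows[c], rows[p] = rows[p], rows[c]
        piv = rows[c][c]
        rows[c] = [x / piv for x in rows[c]]
        for i in range(n):
            if i != c and rows[i][c] != 0:
                fac = rows[i][c]
                rows[i] = [x - fac * y for x, y in zip(rows[i], rows[c])]
    return [row[-1] for row in rows]

def zeros(m, n):
    return [[0] * n for _ in range(m)]

def block(tl, tr, bl, br):
    """Assemble the 2x2 block matrix [[tl, tr], [bl, br]]."""
    return [p + q for p, q in zip(tl, tr)] + [p + q for p, q in zip(bl, br)]

def als_sum(Af, vf, Ag, vg):
    """[[A_f, -A_f u_f^T u_g], [0, A_g]] with right-hand side (v_f, v_g)."""
    nf, ng = len(vf), len(vg)
    A = []
    for Afl, Agl in zip(Af, Ag):
        tr = [[-row[0]] + [0] * (ng - 1) for row in Afl]  # -A_f u_f^T u_g
        A.append(block(Afl, tr, zeros(ng, nf), Agl))
    return A, vf + vg

def als_product(Af, vf, Ag, vg):
    """[[A_f, -v_f u_g], [0, A_g]] with right-hand side (0, v_g)."""
    nf, ng = len(vf), len(vg)
    A = []
    for l, (Afl, Agl) in enumerate(zip(Af, Ag)):
        # v_f u_g is scalar, so it only enters the constant coefficient
        tr = [[-x] + [0] * (ng - 1) for x in vf] if l == 0 else zeros(nf, ng)
        A.append(block(Afl, tr, zeros(ng, nf), Agl))
    return A, [0] * nf + vg

def als_inverse(Ah, vh):
    """[[-v_h, A_h], [0, u_h]] with right-hand side (0, ..., 0, 1)."""
    n = len(vh)
    A = []
    for l, Ahl in enumerate(Ah):
        col = [[-x] for x in vh] if l == 0 else zeros(n, 1)
        u = [1] + [0] * (n - 1) if l == 0 else [0] * n
        A.append([c + row for c, row in zip(col, Ahl)] + [[0] + u])
    return A, [0] * n + [1]

def evaluate(A, v, point):
    """First solution component after substituting commuting scalars."""
    vals = [1] + list(point)
    n = len(v)
    M = [[sum(c * Al[i][j] for c, Al in zip(vals, A)) for j in range(n)]
         for i in range(n)]
    return solve(M, v)[0]

I2, N, Z = [[1, 0], [0, 1]], [[0, -1], [0, 0]], [[0, 0], [0, 0]]
X = ([I2, N, Z], [0, 1])      # ALS for x: [[1, -x], [0, 1]] s = (0, 1)
Y = ([I2, Z, N], [0, 1])      # ALS for y
point = (Fraction(2), Fraction(5))

val_sum = evaluate(*als_sum(*X, *Y), point)
val_prod = evaluate(*als_product(*X, *Y), point)
val_inv = evaluate(*als_inverse(*X), point)
print(val_sum, val_prod, val_inv)
```

Note that the system produced by `als_inverse` for $x$ has dimension $3$, while $x^{-1}$ can be represented in dimension $1$; the constructions of the proposition do not preserve minimality.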

Remark 1.14.

For the rational operations from Proposition 1.13 we observe that the left families satisfy the relations

[TABLE]

And similarly for the right families we have

[TABLE]

Definition 1.15 (Linearization [BMS17], [CR99]).

Let $f\in\mathbb{K}(\!\langle X\rangle\!)$ . A linearization of $f$ is a matrix $L=L_{0}\otimes 1+L_{1}\otimes x_{1}+\ldots+L_{d}\otimes x_{d}$ , with $L_{\ell}\in\mathbb{K}^{m\times m}$ , of the form

$$L=\begin{bmatrix}c&u\\ v&A\end{bmatrix}\in\mathbb{K}\langle X\rangle^{m\times m}$$

such that $A$ is invertible over the free field and $f$ is the Schur complement, that is, $f=c-uA^{-1}v$ . If $c=0$ then $L$ is called a pure linearization. The size of the linearization is $\operatorname{size}L=m$ , the dimension is $\dim L=m-1$ .

Proposition 1.16 ([BMS17, Proposition 3.2]).

Let $\mathbb{F}=\mathbb{K}(\!\langle X\rangle\!)$ and $A\in\mathbb{F}^{k\times k}$ , $B\in\mathbb{F}^{k\times l}$ , $C\in\mathbb{F}^{l\times k}$ and $D\in\mathbb{F}^{l\times l}$ be given and assume that $D$ is invertible in $\mathbb{F}^{l\times l}$ . Then the matrix $\bigl{[}\begin{smallmatrix}A&B\\ C&D\end{smallmatrix}\bigr{]}$ is invertible in $\mathbb{F}^{(k+l)\times(k+l)}$ if and only if the Schur complement $A-BD^{-1}C$ is invertible in $\mathbb{F}^{k\times k}$ . In this case

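$$\begin{bmatrix}A&B\\ C&D\end{bmatrix}^{-1}=\begin{bmatrix}.&.\\ .&D^{-1}\end{bmatrix}+\begin{bmatrix}I_{k}\\ -D^{-1}C\end{bmatrix}\bigl(A-BD^{-1}C\bigr)^{-1}\begin{bmatrix}I_{k}&-BD^{-1}\end{bmatrix}.$$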

Remark 1.17.

(i) Let $f\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the linearization $L$ . Then $f=|L|_{1,1}$ is the $(1,1)$ -quasideterminant [GGRW05] of $L$ .

(ii) Given a linear representation $(u,A,v)$ of $f\in\mathbb{K}(\!\langle X\rangle\!)$ , then $L=\bigl{[}\begin{smallmatrix}.&u\\ -v&A\end{smallmatrix}\bigr{]}$ is a pure linearization of $f$ .

(iii) Talking about a minimal linearization, one has to specify which class of matrices is considered: Scalar entries in the first row and column? Pure? And, if applicable, selfadjoint?

Proposition 1.18.

Let

$$L=\begin{bmatrix}c&u\\ v&A\end{bmatrix}$$

be a linearization of size $n$ for some element $f\in\mathbb{K}(\!\langle X\rangle\!)$ and define another element $g\in\mathbb{K}(\!\langle X\rangle\!)$ by the pure linearization

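$$\tilde{L}=\begin{bmatrix}.&\tilde{u}\\ \tilde{v}&\tilde{A}\end{bmatrix}\quad\text{with}\quad\tilde{A}=\begin{bmatrix}c&u&-1\\ v&A&.\\ -1&.&.\end{bmatrix},\quad\tilde{u}=[0,\dots,0,1],\quad\tilde{v}=\tilde{u}^{\!\top}$$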

of size $n+2$ . Then $g=f$ .

Proof.

Using Proposition 1.16 —taking the Schur complement with respect to the block entry $(2,2)$ — and $b=[-1,0,\ldots,0]$ , the inverse of $\tilde{A}=\bigl{[}\begin{smallmatrix}L&b^{\!\top}\\ b&.\end{smallmatrix}\bigr{]}$ can be written as

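$$\tilde{A}^{-1}=\begin{bmatrix}L^{-1}&.\\ .&.\end{bmatrix}-\begin{bmatrix}-L^{-1}b^{\!\top}\\ 1\end{bmatrix}\bigl(bL^{-1}b^{\!\top}\bigr)^{-1}\begin{bmatrix}-bL^{-1}&1\end{bmatrix}.$$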

Hence

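$$\begin{aligned}-\tilde{u}\tilde{A}^{-1}\tilde{v}&=\bigl(bL^{-1}b^{\!\top}\bigr)^{-1}\\ &=\Biggl(b\left(\begin{bmatrix}.&.\\ .&A^{-1}\end{bmatrix}+\begin{bmatrix}1\\ -A^{-1}v\end{bmatrix}\bigl(c-uA^{-1}v\bigr)^{-1}\begin{bmatrix}1&-uA^{-1}\end{bmatrix}\right)b^{\!\top}\Biggr)^{-1}\\ &=\Biggl(\left(\begin{bmatrix}.&.\end{bmatrix}-\bigl(c-uA^{-1}v\bigr)^{-1}\begin{bmatrix}1&-uA^{-1}\end{bmatrix}\right)\begin{bmatrix}-1\\ .\end{bmatrix}\Biggr)^{-1}\\ &=c-uA^{-1}v.\end{aligned}$$

∎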

If the first row or column of a linearization for some $f\in\mathbb{K}(\!\langle X\rangle\!)$ contains non-scalar entries, then Proposition 1.18 can be used to construct a linear representation of $f$ . On the other hand, given a linear representation of dimension $n$ (of $f$ ) which can be brought to such a form, a linearization of size $n-1$ can be obtained. The characterization of minimality for linearizations will be considered in future work.

Example 1.19.

For the anticommutator $xy+yx$ a minimal ALS is given by

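$$\begin{bmatrix}1&-x&-y&.\\ .&1&.&-y\\ .&.&1&-x\\ .&.&.&1\end{bmatrix}s=\begin{bmatrix}.\\ .\\ .\\ 1\end{bmatrix}.$$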

Permuting the columns and multiplying the system matrix by $-1$ we get the linearization

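$$L_{xy+yx}^{\prime}=\begin{bmatrix}.&.&.&.&1\\ .&.&y&x&-1\\ .&y&.&-1&.\\ .&x&-1&.&.\\ 1&-1&.&.&.\end{bmatrix}$$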

which is of the form in Proposition 1.18 and yields a minimal (pure) linearization of the anticommutator

$$L_{xy+yx}=\begin{bmatrix}.&y&x\\ y&.&-1\\ x&-1&.\end{bmatrix}.$$
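Both representations in this example can be verified by hand. The computation below is a worked check added for illustration; it assumes the minimal ALS has the upper triangular system matrix shown in its first line and that the $3\times 3$ linearization has the blocks $c=0$, $u=[y,x]$, $v=u^{\!\top}$ and $A=\bigl[\begin{smallmatrix}.&-1\\ -1&.\end{smallmatrix}\bigr]$, for which $A^{-1}=A$.

```latex
% ALS for xy + yx, solved by back-substitution from the last row
\begin{bmatrix} 1 & -x & -y & . \\ . & 1 & . & -y \\ . & . & 1 & -x \\ . & . & . & 1 \end{bmatrix} s
  = \begin{bmatrix} . \\ . \\ . \\ 1 \end{bmatrix}
\quad\Longrightarrow\quad
s_4 = 1,\; s_3 = x,\; s_2 = y,\; s_1 = x s_2 + y s_3 = xy + yx .

% Schur complement of the 3x3 linearization (using A^{-1} = A)
c - u A^{-1} v
  = - \begin{bmatrix} y & x \end{bmatrix}
      \begin{bmatrix} . & -1 \\ -1 & . \end{bmatrix}
      \begin{bmatrix} y \\ x \end{bmatrix}
  = - \begin{bmatrix} y & x \end{bmatrix} \begin{bmatrix} -x \\ -y \end{bmatrix}
  = xy + yx .
```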

Definition 1.20 (Realization [HMV06]).

A realization of a matrix $F\in\mathbb{K}(\!\langle X\rangle\!)^{p\times q}$ is a quadruple $(A,B,C,D)$ with $A=A_{0}\otimes 1+A_{1}\otimes x_{1}+\ldots+A_{d}\otimes x_{d}$ , $A_{\ell}\in\mathbb{K}^{n\times n}$ , $B\in\mathbb{K}^{n\times q}$ , $C\in\mathbb{K}^{p\times n}$ and $D\in\mathbb{K}^{p\times q}$ such that $A$ is invertible over the free field and $F=D-CA^{-1}B$ . The dimension of the realization is $\dim\,(A,B,C,D)=n$ .

Remark. A realization $\mathcal{R}=(A,B,C,D)$ could be written in block form

$$L_{\mathcal{R}}=\begin{bmatrix}D&C\\ B&A\end{bmatrix}\in\mathbb{K}\langle X\rangle^{(p+n)\times(q+n)}.$$

Here, the definition is such that $F=|L_{\mathcal{R}}|_{1^{\prime},1^{\prime}}$ is the $(1,1)$ -block-quasideterminant [GGRW05] with respect to block $D$ . For $A=-J+L_{A}(X)$ we obtain the descriptor realization in [HMV06]. Realizations where $B$ and/or $C$ contain non-scalar entries are sometimes called “butterfly realizations” [HMV06]. Minimality with respect to realizations is investigated in [Vol18].

2 The Word Problem

Let $f,g\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the linear representations $\pi_{f}=(u_{f},A_{f},v_{f})$ and $\pi_{g}=(u_{g},A_{g},v_{g})$ of dimension $n_{f}$ and $n_{g}$ respectively and define the matrix

$$L=\begin{bmatrix}.&u_{f}&u_{g}\\ v_{f}&A_{f}&.\\ v_{g}&.&-A_{g}\end{bmatrix},$$

which is a linearization of $f-g$ , of size $n=n_{f}+n_{g}+1$ . Then $f=g$ if and only if $L$ is not full [Coh95, Section 4.5]. For the word problem see also [Coh95, Section 6.6]. Whether $L$ is full or not can be decided by the following theorem. For $P=(\alpha_{ij})$ and $Q=(\beta_{ij})$ the commutative polynomial ring is

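$$\mathbb{K}[\alpha,\beta]=\mathbb{K}[\alpha_{1,1},\dots,\alpha_{1,n},\alpha_{2,1},\dots,\alpha_{2,n},\dots,\alpha_{n,1},\dots,\alpha_{n,n},\beta_{1,1},\dots,\beta_{1,n},\beta_{2,1},\dots,\beta_{2,n},\dots,\beta_{n,1},\dots,\beta_{n,n}].$$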

Theorem 2.1 ([CR99, Theorem 4.1]).

For each $r\in\{1,2,\ldots,n\}$ , denote by $I_{r}$ the ideal of $\mathbb{K}[\alpha,\beta]$ generated by the polynomials $\det(P)-1$ , $\det(Q)-1$ and the coefficients of each $x\in\{1\}\cup X$ in the $(i,j)$ entries of the matrix $PLQ$ for $1\leq i\leq r$ , $r\leq j\leq n$ . Then the linear matrix $L$ is full if and only if for each $r\in\{1,2,\ldots,n\}$ , the ideal $I_{r}$ is trivial.

Remark. Notice that there is a misprint in [CR99] and the coefficients of $L_{0}$ are omitted.

So far we were not able to apply this theorem practically for $n\geq 5$, where $50$ or more unknowns are involved. However, if we have any ALS (or linear representation) for $f-g$, say from Proposition 1.13, then we could check whether it can be (admissibly) transformed into a smaller system, for example $A^{\prime}s^{\prime}=0$. For polynomials (with $A=I-Q$ and $Q$ upper triangular and nilpotent) this could be done row by row. In general the pivot blocks (the blocks in the diagonal) can be arbitrarily large. Therefore this elimination has to be done blockwise by setting up a single linear system for row and column operations. This idea is used in the following lemma. Note that the existence of a solution for this linear system is invariant under admissible transformations (on the subsystems). This is a key requirement since the normal form [CR94] is defined modulo similarity transformations (more generally, up to stable association, Definition 1.2).

Theorem 2.2 ([CR99, Theorem 1.4]).

If $\pi^{\prime}=(u^{\prime},A^{\prime},v^{\prime})$ and $\pi^{\prime\prime}=(u^{\prime\prime},A^{\prime\prime},v^{\prime\prime})$ are equivalent (pure) linear representations, of which the first is minimal, then the second is isomorphic to a representation $(u,A,v)$ which has the block decomposition

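$$u=\begin{bmatrix}*&u^{\prime}&.\end{bmatrix},\quad A=\begin{bmatrix}*&.&.\\ *&A^{\prime}&.\\ *&*&*\end{bmatrix}\quad\text{and}\quad v=\begin{bmatrix}.\\ v^{\prime}\\ *\end{bmatrix}.$$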

Lemma 2.3.

Let $f,g\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the admissible linear systems $\mathcal{A}_{f}=(u_{f},A_{f},v_{f})$ and $\mathcal{A}_{g}=(u_{g},A_{g},v_{g})$ of dimension $n_{f}$ and $n_{g}$ respectively. If there exist matrices $T,U\in\mathbb{K}^{n_{f}\times n_{g}}$ such that $u_{f}U=0$ , $TA_{g}-A_{f}U=A_{f}u_{f}^{\!\top}u_{g}$ and $Tv_{g}=v_{f}$ , then $f=g$ .

Proof.

The difference $f-g$ can be represented by the admissible linear system $As=v$ with

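$$A=\begin{bmatrix}A_{f}&-A_{f}u_{f}^{\!\top}u_{g}\\ .&A_{g}\end{bmatrix},\quad s=\begin{bmatrix}s_{f}-u_{f}^{\!\top}g\\ -s_{g}\end{bmatrix}\quad\text{and}\quad v=\begin{bmatrix}v_{f}\\ -v_{g}\end{bmatrix}.$$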

Defining the (invertible) transformations

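$$P=\begin{bmatrix}I_{n_{f}}&T\\ .&I_{n_{g}}\end{bmatrix}\quad\text{and}\quad Q=\begin{bmatrix}I_{n_{f}}&-U\\ .&I_{n_{g}}\end{bmatrix}$$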

and $A^{\prime}=PAQ$ , $s^{\prime}=Q^{-1}s$ and $v^{\prime}=Pv$ we get a new system $A^{\prime}s^{\prime}=v^{\prime}$ :

[TABLE]

Invertibility of $A^{\prime}$ over the free field implies $s_{f}-u_{f}^{\!\top}g-Us_{g}=0$ , in particular

[TABLE]

because $u_{f}U=0$ . ∎
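The steps of this proof can be written out explicitly. The following lines are a reconstruction from the hypotheses of the lemma, not a quotation of the displayed equations. They assume $A=\bigl[\begin{smallmatrix}A_{f}&-A_{f}u_{f}^{\top}u_{g}\\ .&A_{g}\end{smallmatrix}\bigr]$, $s=\bigl[\begin{smallmatrix}s_{f}-u_{f}^{\top}g\\ -s_{g}\end{smallmatrix}\bigr]$, $v=\bigl[\begin{smallmatrix}v_{f}\\ -v_{g}\end{smallmatrix}\bigr]$, $P=\bigl[\begin{smallmatrix}I&T\\ .&I\end{smallmatrix}\bigr]$ and $Q=\bigl[\begin{smallmatrix}I&-U\\ .&I\end{smallmatrix}\bigr]$, and they use $u_{f}u_{f}^{\top}=1$, which holds because $u_{f}=e_{1}$.

```latex
\begin{aligned}
A' &= PAQ
    = \begin{bmatrix} A_f & TA_g - A_f U - A_f u_f^{\top} u_g \\ . & A_g \end{bmatrix}
    = \begin{bmatrix} A_f & . \\ . & A_g \end{bmatrix}, \\
s' &= Q^{-1} s
    = \begin{bmatrix} s_f - u_f^{\top} g - U s_g \\ -s_g \end{bmatrix},
\qquad
v' = Pv
    = \begin{bmatrix} v_f - T v_g \\ -v_g \end{bmatrix}
    = \begin{bmatrix} . \\ -v_g \end{bmatrix}, \\
0 &= u_f \bigl( s_f - u_f^{\top} g - U s_g \bigr)
    = f - g - u_f U s_g
    = f - g .
\end{aligned}
```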

Let $d$ be the number of letters in the alphabet $X$, $\dim\mathcal{A}_{f}=n_{f}$ and $\dim\mathcal{A}_{g}=n_{g}$. To determine the transformation matrices $T,U\in\mathbb{K}^{n_{f}\times n_{g}}$ from the lemma we just have to solve a linear system of $(d+1)n_{f}(n_{g}+1)$ equations in $2n_{f}n_{g}$ unknowns. If there is a solution then $f=g$. Neither $\mathcal{A}_{f}$ nor $\mathcal{A}_{g}$ has to be minimal. Computer experiments show that Hua’s identity [Ami66]

$$x-\bigl(x^{-1}+(y^{-1}-x)^{-1}\bigr)^{-1}=xyx$$

can be tested positively by Lemma 2.3 when the left hand side is constructed by the rational operations from Proposition 1.13. However, without assuming minimality, the fact that there is no solution does not imply that $f\neq g$, see Example 2.5 below.

Theorem 2.4 (“Linear solution” of the Word Problem).

Let $f,g\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the minimal admissible linear systems $\mathcal{A}_{f}=(u_{f},A_{f},v_{f})$ and $\mathcal{A}_{g}=(u_{g},A_{g},v_{g})$ of dimension $n$ respectively. Then $f=g$ if and only if there exist matrices $T,U\in\mathbb{K}^{n\times n}$ such that $u_{f}U=0$ , $TA_{g}-A_{f}U=A_{f}u_{f}^{\!\top}u_{g}$ and $Tv_{g}=v_{f}$ .

Proof.

If $f=g$ then, since admissible linear systems correspond to (pure) linear representations, by Theorem 2.2 there exist invertible matrices $P,Q\in\mathbb{K}^{n\times n}$ such that $A_{f}=PA_{g}Q$ and $v_{f}=Pv_{g}$ . Let $T=P$ and $U=Q^{-1}-u_{f}^{\!\top}u_{g}$ . The admissible linear systems are minimal. Hence, the left family $s_{f}$ is $\mathbb{K}$ -linearly independent. Since the first component of $s_{g}$ is equal to the first component of $s_{f}=Q^{-1}s_{g}$ and the left family $s_{g}$ is $\mathbb{K}$ -linearly independent, the first row of $Q^{-1}$ must be $[1,0,\ldots,0]$ . Therefore $u_{f}U=u_{f}(Q^{-1}-u_{f}^{\!\top}u_{g})=0$ . Clearly $v_{f}=Tv_{g}$ and

$$TA_{g}-A_{f}U=PA_{g}-A_{f}Q^{-1}+A_{f}u_{f}^{\!\top}u_{g}=A_{f}u_{f}^{\!\top}u_{g}.$$

The other implication follows from Lemma 2.3. ∎
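The linear system of Theorem 2.4 can be set up mechanically. The Python sketch below is one possible illustration and not the author's implementation: pencils are stored as lists of coefficient matrices (constant term first, then one matrix per letter), $u_{f}$ and $u_{g}$ are assumed to be $e_{1}$, and the names `solve_any` and `word_problem` are hypothetical. The three conditions $TA_{g}-A_{f}U=A_{f}u_{f}^{\top}u_{g}$, $Tv_{g}=v_{f}$ and $u_{f}U=0$ are compared coefficientwise and handed to an exact solver that reports inconsistency.

```python
from fractions import Fraction

def solve_any(M, b):
    """Return one exact solution of M z = b, or None if inconsistent."""
    rows = [[Fraction(x) for x in r] + [Fraction(y)] for r, y in zip(M, b)]
    ncols = len(M[0])
    pivots, r = [], 0
    for c in range(ncols):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue                      # free column
        rows[r], rows[p] = rows[p], rows[r]
        piv = rows[r][c]
        rows[r] = [x / piv for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                fac = rows[i][c]
                rows[i] = [x - fac * y for x, y in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if any(row[-1] != 0 for row in rows[r:]):
        return None                       # a row reads 0 = nonzero
    z = [Fraction(0)] * ncols
    for i, c in enumerate(pivots):
        z[c] = rows[i][-1]                # free variables are set to 0
    return z

def word_problem(Af, vf, Ag, vg):
    """Solve for T, U (flattened) or return None; u_f = u_g = e_1 assumed."""
    nf, ng = len(vf), len(vg)
    nvar = 2 * nf * ng
    t_idx = lambda i, j: i * ng + j             # position of T[i][j]
    u_idx = lambda i, j: nf * ng + i * ng + j   # position of U[i][j]
    M, b = [], []
    for Afl, Agl in zip(Af, Ag):          # T A_g - A_f U = A_f u_f^T u_g
        for i in range(nf):
            for j in range(ng):
                row = [0] * nvar
                for k in range(ng):
                    row[t_idx(i, k)] += Agl[k][j]
                for k in range(nf):
                    row[u_idx(k, j)] -= Afl[i][k]
                M.append(row)
                b.append(Afl[i][0] if j == 0 else 0)
    for i in range(nf):                   # T v_g = v_f
        row = [0] * nvar
        for k in range(ng):
            row[t_idx(i, k)] += vg[k]
        M.append(row)
        b.append(vf[i])
    for j in range(ng):                   # u_f U = 0: first row of U vanishes
        row = [0] * nvar
        row[u_idx(0, j)] = 1
        M.append(row)
        b.append(0)
    return solve_any(M, b)

I2, Z = [[1, 0], [0, 1]], [[0, 0], [0, 0]]
N = [[0, -1], [0, 0]]
X = ([I2, N, Z], [0, 1])                    # minimal ALS for x
X2 = ([[[1, 0], [0, 2]], N, Z], [0, 2])     # same element, second row doubled
Y = ([I2, Z, N], [0, 1])                    # minimal ALS for y

same = word_problem(*X, *X2)
different = word_problem(*X, *Y)
print(same is not None, different is None)
```

Since all three systems in the example are minimal, the missing solution in the second call shows $x\neq y$; for non-minimal input only a positive answer is conclusive, as Lemma 2.3 states.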

Example 2.5.

Let $f=x^{-1}$ and $g=x^{-1}$ be given by the admissible linear systems

[TABLE]

respectively. Then the ALS

[TABLE]

represents $f-g=0$ .

While it is obvious here that the second component of the solution vector $s_{g}$ is zero, it is not clear in general how one can exclude such “pathological” LRs without minimality assumption. One might ask, for which class of constructions (rational operations [CR99], Higman’s trick [Hig40, Coh85], selfadjoint linearization trick [And13], etc.) there are sufficient conditions for the existence of matrices $T,U$ (over $\mathbb{K}$ ) in Lemma 2.3 if $f=g$ . Unfortunately this seems to be impossible except for some specific examples as the following ALS (constructed by the rational operations from Proposition 1.13) for $x-xyy^{-1}=0$ suggests (some zeros are kept to emphasize the block structure):

[TABLE]

There are no $T,U$ and therefore no $P,Q$ (admissible, with blocks $T,U$ ) such that $PAQ$ has a $2\times 7$ upper right block of zeros and the first two components of $Pv$ are zero. Therefore we would like to construct minimal linear representations directly. For regular elements, algorithms are known, see Section 3. How to proceed in general is open except for the inverse. This is discussed in Section 4.

3 Regular Elements

For regular elements (Definition 1.7) in the free field, minimal linear representations can be obtained via the Extended Ho-Algorithm [FM80] from the Hankel matrix or by minimizing a given linear representation via the algorithm in [CC80] by detecting linearly dependent rows in the controllability matrix and linearly dependent columns in the observability matrix, see Definition 3.3. The basic idea goes back to Schützenberger [Sch61]. Controllability and observability are discussed in [KFA69, Chapter 10].

For an alphabet $X=\{x_{1},x_{2},\ldots,x_{d}\}$ (finite, non-empty set), the free monoid generated by $X$ is denoted by $X^{*}$ . A formal power series (in non-commuting variables) is a mapping $f$ from $X^{*}$ to a commutative field $\mathbb{K}$ , written as formal sum

$$f=\sum_{w\in X^{*}}(f,w)\,w$$

with coefficients $(f,w)\in\mathbb{K}$ . In general, $\mathbb{K}$ could be replaced by a ring or a skew field. [SS78] and [BR11] contain detailed introductions. On the set of formal power series $\mathbb{K}\langle\!\langle X\rangle\!\rangle$ the following rational operations are defined for $f,g,h\in\mathbb{K}\langle\!\langle X\rangle\!\rangle$ with $(h,1)=0$ , and $\mu\in\mathbb{K}$ :

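The scalar multiplication

$$\mu f=\sum_{w\in X^{*}}\mu\,(f,w)\,w.$$

The sum

$$f+g=\sum_{w\in X^{*}}\bigl((f,w)+(g,w)\bigr)\,w.$$

The product

$$fg=\sum_{w\in X^{*}}\Bigl(\sum_{w_{1}w_{2}=w}(f,w_{1})(g,w_{2})\Bigr)\,w.$$

And the quasiinverse

$$h^{+}=\sum_{k\geq 1}h^{k}.$$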

The set of non-commutative (nc) rational series $\mathbb{K}^{\text{rat}}\langle\!\langle X\rangle\!\rangle$ is the smallest rationally closed (that is, closed under scalar multiplication, sum, product and quasiinverse) subset of $\mathbb{K}\langle\!\langle X\rangle\!\rangle$ containing the nc polynomials $\mathbb{K}\langle X\rangle$ . A series $f\in\mathbb{K}\langle\!\langle X\rangle\!\rangle$ is called recognizable if there exists a natural number $n$ , a monoid homomorphism $\mu:X^{*}\to\mathbb{K}^{n\times n}$ and two vectors $\alpha\in\mathbb{K}^{1\times n}$ , $\beta\in\mathbb{K}^{n\times 1}$ such that $f$ can be written as

[TABLE]

The triple $(\alpha,\mu,\beta)$ is called a linear representation [SS78, Section II.2].

Theorem 3.1 ([Sch61]).

A series $f\in\mathbb{K}\langle\!\langle X\rangle\!\rangle$ is rational if and only if it is recognizable.

A rational series $f$ can be represented by a proper linear system (PLS for short) $s=v+Qs$ where $f$ is the first component of the unique solution vector $s$ (with $v\in\mathbb{K}^{n\times 1}$ , $Q=Q_{1}\otimes x_{1}+Q_{2}\otimes x_{2}+\ldots+Q_{d}\otimes x_{d}$ , $Q_{i}\in\mathbb{K}^{n\times n}$ for some $n\in\mathbb{N}$ ). Rational operations are then formulated on this level [SS78, Section II.1]. Clearly, every proper linear system gives rise to an admissible linear system $\mathcal{A}=(u,I-Q,v)$ with $u=e_{1}$ . When $(\alpha,\mu,\beta)$ is a linear representation of a recognizable series, then $\pi=(\alpha,I-Q,\beta)$ with

[TABLE]

is a linear representation of $f\in\mathbb{K}(\!\langle X\rangle\!)$ . For a PLS the solution vector $s$ can be computed by the quasiinverse $Q^{+}$ :

[TABLE]
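Since the solution of a PLS is $s=(I+Q+Q^{2}+\ldots)v$, the coefficient of a word $w=x_{i_{1}}\cdots x_{i_{k}}$ in the first component of $s$ is the first entry of $Q_{i_{1}}\cdots Q_{i_{k}}v$. The sketch below illustrates this for a small system; the dictionary layout (letter to coefficient matrix) and the function names are assumptions for illustration only.

```python
def mat_vec(M, x):
    """Matrix times column vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in M]


def coefficient(Q, v, word):
    """Coefficient of `word` in the first component of the solution of s = v + Q s.

    Q maps each letter x_i to its coefficient matrix Q_i.
    """
    x = list(v)
    for letter in reversed(word):   # Q_{i_1} ... Q_{i_k} v: rightmost factor first
        x = mat_vec(Q[letter], x)
    return x[0]


# s_1 = 1 + x s_2 and s_2 = y s_1, hence s_1 = 1 + xy + xyxy + ...
Q = {"x": [[0, 1], [0, 0]], "y": [[0, 0], [1, 0]]}
v = [1, 0]

assert coefficient(Q, v, "") == 1
assert coefficient(Q, v, "xy") == 1
assert coefficient(Q, v, "yx") == 0
```

Here the first component of the solution is the series of $(1-xy)^{-1}$, so exactly the words $(xy)^{k}$ have coefficient 1.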

Definition 3.3 (Controllability and Observability Matrix).

Let $\mathcal{P}=(u,I-Q,v)$ be a proper linear system of dimension $n$ (for some nc rational series). Then the controllability matrix and the observability matrix are defined as

[TABLE]

respectively.

Remark. Note that the monomials (in the polynomials) in $Q^{k}$ have length $k$ . The matrices $\mathcal{V}$ and $\mathcal{U}$ are over $\mathbb{K}\langle X\rangle$ . A priori these matrices would have an infinite number of columns and rows respectively. However, by [Coh95, Lemma 6.6.3], it suffices to use the columns of $\mathcal{V}$ and rows of $\mathcal{U}$ only. This gives the connection to [BR11, Section I.2] and could be used for minimizing proper linear systems [SS78]. In other words: Instead of identifying $\mathbb{K}$ -linear dependence of the left family $s=(I-Q)^{-1}v=(I+Q+Q^{2}+\ldots)v$ , we can restrict to the “approximated” left family $\tilde{s}=(I+Q+\ldots+Q^{n-1})v$ .

Now let $X_{k}^{*}\subseteq X^{*}$ denote the set of words of length $k$ and use $\mu:X^{*}\to\mathbb{K}^{n\times n}$ from (3.2) to define $V_{k}\in\mathbb{K}^{n\times d^{k}}$ with columns $\mu(w)v$ for $w\in X_{k}^{*}$ and $U_{k}\in\mathbb{K}^{d^{k}\times n}$ with rows $u\mu(w)$ for $w\in X_{k}^{*}$ . Then the controllability matrix and the observability matrix can be defined alternatively as

[TABLE]

with entries in $\mathbb{K}$. Note that the rank of $\mathcal{V}^{\prime}$ is at most $n$ while the number of columns of the blocks $V_{k}$ is $d^{k}$. So, for an alphabet with more than one letter, most of the columns are not needed. For $X=\{x\}$ and $Q=Q_{x}\otimes x$ we have $V_{k}=Q_{x}^{k}v$ and $U_{k}=uQ_{x}^{k}$. Hence $\mathcal{V}^{\prime}$ and $\mathcal{U}^{\prime}$ can be written as

[TABLE]

Compare with [KFA69, Section 6.3]. For controllability and observability in connection with realizations (Definition 1.20) see also [HMV06].
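The scalar versions of the two matrices are easy to compute for small systems. The following sketch assumes, in line with the remark above, that the blocks $V_{k}$ and $U_{k}$ are collected for $k=0,1,\ldots,n-1$; it enumerates all words naively (so it is only meant for tiny examples) and computes ranks over the rationals. Function names and the data layout are illustrative assumptions, and the rank test is only our reading of the dependency check used for minimization.

```python
from fractions import Fraction
from itertools import product


def rank(rows):
    """Rank of a rational matrix (given as a list of rows) by Gaussian elimination."""
    M = [[Fraction(a) for a in row] for row in rows]
    r = 0
    for c in range(len(M[0])):
        p = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if p is None:
            continue
        M[r], M[p] = M[p], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
    return r


def controllability(Q, v):
    """The columns mu(w) v for all words w of length < n, as a list of vectors."""
    n, cols = len(v), []
    for k in range(n):
        for word in product(sorted(Q), repeat=k):
            x = list(v)
            for letter in reversed(word):   # mu(w) v = Q_{w_1} ... Q_{w_k} v
                x = [sum(a * b for a, b in zip(row, x)) for row in Q[letter]]
            cols.append(x)
    return cols


def observability(Q, u):
    """The rows u mu(w) for all words w of length < n."""
    n, rows = len(u), []
    for k in range(n):
        for word in product(sorted(Q), repeat=k):
            y = list(u)
            for letter in word:             # u mu(w) = u Q_{w_1} ... Q_{w_k}
                y = [sum(y[i] * Q[letter][i][j] for i in range(n)) for j in range(n)]
            rows.append(y)
    return rows


def has_full_ranks(u, Q, v):
    """True if both matrices have rank n, i.e. no dependent rows/columns are detected."""
    n = len(v)
    return rank(controllability(Q, v)) == n and rank(observability(Q, u)) == n


# PLS for (1 - xy)^{-1}: s_1 = 1 + x s_2, s_2 = y s_1.
Q2 = {"x": [[0, 1], [0, 0]], "y": [[0, 0], [1, 0]]}
# The same system with a superfluous third component that is never reached.
Q3 = {"x": [[0, 1, 0], [0, 0, 0], [0, 0, 0]], "y": [[0, 0, 0], [1, 0, 0], [0, 0, 0]]}
```

For `Q2` both ranks equal 2, while for `Q3` the controllability matrix has rank 2 in dimension 3, which signals a removable component.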

4 Minimizing the Inverse

Having solved the word problem, our next goal is “minimal arithmetics” in the free field $\mathbb{K}(\!\langle X\rangle\!)$ . That is, given elements by minimal admissible linear systems, to compute minimal ones for the rational operations. For the scalar multiplication this is trivial. For the inverse some preparation is necessary. The result is presented in Theorem 4.20. The “minimal sum” and the “minimal product” are considered in future work. The main difficulty is not minimality but the restriction to linear techniques.

Proposition 4.1.

Let $k\in\mathbb{N}$ and $f=x_{i_{1}}x_{i_{2}}\cdots x_{i_{k}}$ be a monomial in $\mathbb{K}\langle X\rangle\subseteq\mathbb{K}(\!\langle X\rangle\!)$ . Then

[TABLE]

is a minimal ALS of dimension $\dim\mathcal{A}=k+1$ .

Proof.

The system matrix of $\mathcal{A}$ is full. For row indices $[1,x_{i_{1}},x_{i_{1}}x_{i_{2}},\ldots,x_{i_{1}}\cdots x_{i_{k}}]$ and column indices $[1,x_{i_{k}},x_{i_{k-1}}x_{i_{k}},\ldots,x_{i_{1}}\cdots x_{i_{k}}]$ the Hankel matrix [Fli74], [SS78, Section II.3] of $f$ is

[TABLE]

and has rank $k+1$ . ∎
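The rank argument in this proof can be checked mechanically. The sketch below uses the usual convention that the Hankel matrix entry for row index $p$ and column index $q$ is the coefficient $(f,pq)$, with prefixes of the monomial as row indices and suffixes as column indices, as in the proof. The function name is an assumption for illustration.

```python
def hankel_of_monomial(word):
    """Hankel block of the monomial `word` for prefix rows and suffix columns.

    Rows:    1, x_{i_1}, x_{i_1} x_{i_2}, ..., the full word.
    Columns: 1, x_{i_k}, x_{i_{k-1}} x_{i_k}, ..., the full word.
    The entry for (p, q) is the coefficient of p q in the monomial, i.e. 1 if p q == word.
    """
    k = len(word)
    prefixes = [word[:j] for j in range(k + 1)]
    suffixes = [word[k - j:] for j in range(k + 1)]
    return [[1 if p + q == word else 0 for q in suffixes] for p in prefixes]


H = hankel_of_monomial("xyz")
# Ones exactly on the anti-diagonal: a permutation matrix of size k + 1 = 4.
assert H == [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 0, 0]]
```

A prefix of length $j$ and a suffix of length $l$ concatenate to the monomial exactly when $j+l=k$, so the block is a permutation matrix and its rank is $k+1$, also when letters repeat.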

Remark. Trivially, $\mathcal{A}=([1],[1],[1])$ is a minimal ALS for the unit element (empty word).

The following proposition is a variant of the inverse in Proposition 1.13 and is motivated by inverting the inverse of a monomial, for example, $f=(xyz)^{-1}$ . A minimal ALS for $f$ is given by

[TABLE]

Minimality is clear immediately by also checking the $\mathbb{K}$ -linear independence of the right family. Using the construction of the inverse from Proposition 1.13 we get the system

[TABLE]

for $f^{-1}=xyz$ . To obtain the form of Proposition 4.1 we have to reverse the rows $1,2,3$ and the columns $2,3,4$ and multiply the rows $1,2,3$ by $-1$ .

Proposition 4.2 (Standard Inverse).

Let $0\neq f\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the admissible linear system $\mathcal{A}=(u,A,v)$ of dimension $n$ . Then an admissible linear system of dimension $n+1$ for $f^{-1}$ is given by

[TABLE]

(Recall that the permutation matrix $\Sigma=\Sigma_{n}$ reverses the order of rows/columns.)

Definition 4.4 (Standard Inverse).

Let $\mathcal{A}$ be an ALS for a non-zero element. We call the ALS (4.3) the standard inverse of $\mathcal{A}$ , denoted by $\mathcal{A}^{-1}$ .

Proof.

The reader can easily verify that the solution vector of $\mathcal{A}^{-1}$ is

[TABLE]

Compare with Proposition 1.13. ∎
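The standard inverse can be sanity-checked numerically. The following sketch rests on two assumptions that are not taken verbatim from the text: the block form of (4.3) is reconstructed from the row and column reversal described before Proposition 4.2 as $\bigl[\begin{smallmatrix}\Sigma v&-\Sigma A\Sigma\\ .&u\Sigma\end{smallmatrix}\bigr]$ with $u=e_{1}$ and right hand side $e_{n+1}$, and the monomial system for $xy$ is written in the bidiagonal form suggested by Proposition 4.1. Substituting scalars for the letters is only a weak (commutative) test, not a proof.

```python
from fractions import Fraction


def standard_inverse(u, A, v):
    """Assumed block form [[S v, -S A S], [0, u S]], where S reverses the order.

    A maps "1" (constant part) and each letter to an n x n coefficient matrix.
    """
    n = len(u)
    B = {}
    for key, M in A.items():
        blk = [[0] * (n + 1) for _ in range(n + 1)]
        for i in range(n):
            for j in range(n):
                blk[i][j + 1] = -M[n - 1 - i][n - 1 - j]   # -S A S
        B[key] = blk
    const = B.setdefault("1", [[0] * (n + 1) for _ in range(n + 1)])
    for i in range(n):
        const[i][0] = v[n - 1 - i]        # S v in the first column
        const[n][i + 1] = u[n - 1 - i]    # u S in the last row
    return [1] + [0] * n, B, [0] * n + [1]


def solve(M, b):
    """Solve M s = b over the rationals (M is assumed to be invertible)."""
    n = len(b)
    aug = [[Fraction(a) for a in row] + [Fraction(c)] for row, c in zip(M, b)]
    for c in range(n):
        p = next(i for i in range(c, n) if aug[i][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [a / pivot for a in aug[c]]
        for i in range(n):
            if i != c and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [a - factor * t for a, t in zip(aug[i], aug[c])]
    return [row[n] for row in aug]


def evaluate(u, A, v, point):
    """Value u A(point)^{-1} v after substituting scalars for the letters."""
    n = len(u)
    M = [[sum((Fraction(1) if key == "1" else Fraction(point[key])) * C[i][j]
              for key, C in A.items()) for j in range(n)] for i in range(n)]
    return sum(a * b for a, b in zip(u, solve(M, v)))


# Assumed ALS for f = xy: [[1, -x, 0], [0, 1, -y], [0, 0, 1]] s = [0, 0, 1].
u = [1, 0, 0]
v = [0, 0, 1]
A = {"1": [[1, 0, 0], [0, 1, 0], [0, 0, 1]],
     "x": [[0, -1, 0], [0, 0, 0], [0, 0, 0]],
     "y": [[0, 0, 0], [0, 0, -1], [0, 0, 0]]}
point = {"x": 2, "y": 3}
```

With these data, `evaluate(u, A, v, point)` gives 6 and `evaluate(*standard_inverse(u, A, v), point)` gives 1/6, in agreement with the solution vector stated in the proof.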

We proceed with the calculation of minimal admissible linear systems for the inverse. We distinguish four types of ALS according to the form of the system matrix. Later, in the remark following Lemmas 4.18 and 4.19, we will see how to bring a system matrix to one of these forms depending on the left and right families.

Lemma 4.5 (Inverse Type $(1,1)$ ).

Assume that $0\neq f\in\mathbb{K}(\!\langle X\rangle\!)$ has a minimal admissible linear system of dimension $n$ of the form

[TABLE]

Then a minimal ALS for $f^{-1}$ of dimension $n-1$ is given by

[TABLE]

with $1\notin R(\mathcal{A}^{\prime})$ and $1\notin L(\mathcal{A}^{\prime})$ .

Proof.

The standard inverse of $\mathcal{A}$ is

[TABLE]

Adding row 4 to row 3 and $\lambda$ -times column 2 to column 1 gives

[TABLE]

It follows that $s_{2}^{\prime}=0$ and that $s_{n+1}^{\prime}$ does not contribute to the solution $s_{1}^{\prime}$. Thus the first and the last row as well as the second and the last column can be removed. If $\mathcal{A}^{\prime}$ were not minimal, then there would exist a system $\mathcal{A}^{\prime\prime}$ of dimension $m<n-1$ for $f^{-1}$. The standard inverse $(\mathcal{A}^{\prime\prime})^{-1}$ would give a system of dimension $m+1<n$ for $f$, contradicting minimality of $\mathcal{A}$. It remains to show that $1\notin R(\mathcal{A}^{\prime})$ and $1\notin L(\mathcal{A}^{\prime})$. Let $t=(t_{1},t_{2},\ldots,t_{n})$ be the right family of $\mathcal{A}$, which is (due to minimality) $\mathbb{K}$-linearly independent. Then the right family of $\mathcal{A}^{-1}$ is $(f^{-1}t_{n},\ldots,f^{-1}t_{2},f^{-1}t_{1},f^{-1})$, which after the row operation becomes $f^{-1}(t_{n},\ldots,t_{2},t_{1},1-t_{1})$. Removing the first and the last component (corresponding to the first and the last row) yields the right family $f^{-1}(t_{n-1},\ldots,t_{2},t_{1})$. Therefore $1\notin R(\mathcal{A}^{\prime})$, because otherwise $f\in\operatorname{span}\{t_{n-1},\ldots,t_{2},t_{1}\}$, contradicting $\mathbb{K}$-linear independence of $t$. Similar arguments show that $1\notin L(\mathcal{A}^{\prime})$. ∎

Lemma 4.8 (Inverse Type $(1,0)$ ).

Assume that $0\neq f\in\mathbb{K}(\!\langle X\rangle\!)$ has a minimal admissible linear system of dimension $n$ of the form

[TABLE]

with $1\not\in L(\mathcal{A})$ . Then a minimal ALS for $f^{-1}$ of dimension $n$ is given by

[TABLE]

with $1\notin L(\mathcal{A}^{\prime})$ .

Proof.

The standard inverse of $\mathcal{A}$ is

[TABLE]

and has dimension $n+1$. After adding row $n+1$ to row $n$ we can remove row and column $n+1$ because $\tilde{s}_{n+1}$ does not contribute to the solution $\tilde{s}_{1}=f^{-1}$. Then we divide the first row by $\lambda$ and obtain (4.10). It remains to show that $\mathcal{A}^{\prime}$ is minimal and $1\notin L(\mathcal{A}^{\prime})$. Let $(s_{1},s_{2},\ldots,s_{n})$ be the left family of $\mathcal{A}$, which is (due to minimality) $\mathbb{K}$-linearly independent. Then the left family of $\mathcal{A}^{-1}$ is $(f^{-1},s_{n}f^{-1},\ldots,s_{2}f^{-1},1)$. Note that (admissible) row operations do not affect the left family. Since we removed the last entry $s_{1}f^{-1}=\tilde{s}_{n+1}=1$, the left family of $\mathcal{A}^{\prime}$ is $(1,s_{n},\ldots,s_{2})f^{-1}$. By assumption $1\notin L(\mathcal{A})$. Therefore $1\notin\operatorname{span}\{s_{2},s_{3},\ldots,s_{n}\}$, hence $(1,s_{n},\ldots,s_{2})$ is $\mathbb{K}$-linearly independent. Clearly, $1\notin L(\mathcal{A}^{\prime})$ because $f\notin\operatorname{span}\{1,s_{n},\ldots,s_{2}\}$. Similarly, let $(t_{1},t_{2},\ldots,t_{n})$ be the right family of $\mathcal{A}$, which is $\mathbb{K}$-linearly independent. Then the right family of $\mathcal{A}^{-1}$ is $(f^{-1}t_{n},\ldots,f^{-1}t_{2},f^{-1}t_{1},f^{-1})$, which after the row operation is $(f^{-1}t_{n},\ldots,f^{-1}t_{2},f^{-1}t_{1},f^{-1}-f^{-1}t_{1})$. Since we removed the last entry, the right family of $\mathcal{A}^{\prime}$ is $f^{-1}(t_{n},\ldots,t_{2},t_{1})$, which is clearly $\mathbb{K}$-linearly independent. Proposition 1.9 gives minimality of $\mathcal{A}^{\prime}$. ∎

Lemma 4.11 (Inverse Type $(0,1)$ ).

Assume that $0\neq f\in\mathbb{K}(\!\langle X\rangle\!)$ has a minimal admissible linear system of dimension $n$ of the form

[TABLE]

with $1\not\in R(\mathcal{A})$ . Then a minimal ALS for $f^{-1}$ of dimension $n$ is given by

[TABLE]

with $1\notin R(\mathcal{A}^{\prime})$.

Proof.

The standard inverse of $\mathcal{A}$ is

[TABLE]

After adding $\lambda$ -times column 2 to column 1 we can remove row 1 and column 2, because $\tilde{s}_{2}=0$ . Showing minimality and $1\notin R(\mathcal{A}^{\prime})$ is similar to the proof of Lemma 4.8 (column operations affect the left family). ∎

Lemma 4.14 (Inverse Type $(0,0)$ ).

Let $\mathcal{A}=(u,A,v)$ be a minimal admissible linear system of dimension $n$ for $0\neq f\in\mathbb{K}(\!\langle X\rangle\!)$ with $1\notin R(\mathcal{A})$ and $1\notin L(\mathcal{A})$ . Then the standard inverse $\mathcal{A}^{-1}$ is a minimal ALS of dimension $n+1$ for $f^{-1}$ .

Proof.

If $\mathcal{A}^{-1}$ were not minimal, then there would be a system $\mathcal{A}^{\prime}$ of dimension $m<n+1$ for $f^{-1}$ . Applying Lemma 4.5 (Inverse Type $(1,1)$ ), we would get a system of dimension $m-1<n$ for $f$ , contradicting minimality of $\mathcal{A}$ . ∎

Example 4.15.

Taking the minimal ALS (for the anticommutator) from Example 1.19, we get, by Lemma 4.5 (Inverse Type $(1,1)$ ), a minimal ALS for $(xy+yx)^{-1}$ :

[TABLE]

Lemma 4.14 (Inverse Type $(0,0)$ ) gives again a minimal system for $xy+yx$ .

Since it is possible to construct minimal linear representations for regular elements (see Section 3), this is in particular true for polynomials. These are of the form (4.6), since $A=I-Q$ with nilpotent $Q$, which can be chosen upper triangular. This can be seen either by looking at proper linear systems (Section 3) where admissible transformations are conjugations (of the system matrix such that the first component of the solution vector is left untouched) or by the following proposition.

Proposition 4.16 ([CR99, Proposition 2.1]).

Let $f\in\mathbb{K}(\!\langle X\rangle\!)$ .

(i) $f$ is a power series if and only if in any minimal representation, the constant term of its system matrix is invertible. There is then a minimal representation which is unital, that is, $A_{0}=I$ .

(ii) $f$ is a polynomial if and only if in any unital minimal representation, the matrix $A=A_{1}\otimes x_{1}+\ldots+A_{d}\otimes x_{d}$ is nilpotent. There is then a minimal representation with a unitriangular (ones on and zeros below the diagonal) system matrix.

Example 4.17.

The element $xyz^{-1}$ admits the following minimal ALS

[TABLE]

with right family $t=(1,x,xyz^{-1})$ and left family $s=(xyz^{-1},yz^{-1},z^{-1})$ . Now Lemma 4.8 (Inverse Type $(1,0)$ ) can be applied to get the minimal ALS

[TABLE]

for $zy^{-1}x^{-1}$ . Note that we can apply Lemma 4.8 again, because $1\notin L(\mathcal{A}^{\prime})$ .

If an admissible linear system $\mathcal{A}$ is of the form (4.12) in Lemma 4.11 (Inverse Type $(0,1)$ ), then it follows immediately that $1\in L(\mathcal{A})$ . Conversely, assuming $1\in L(\mathcal{A})$ , the proof of the existence of an admissible transformation $(P,Q)$ such that $P\mathcal{A}Q$ is of the form (4.12) is a bit more involved and requires minimality.

Lemma 4.18 (for Inverse Type $(0,1)$ ).

Let $\mathcal{A}=(u,A,v)$ be a minimal admissible linear system with $\dim\mathcal{A}=n\geq 2$ and $1\in L(\mathcal{A})$ . Then there exists an admissible transformation $(P,Q)$ such that $(uQ,PAQ,Pv)$ is of the form (4.12).

Proof.

Without loss of generality, assume that $v=[0,\ldots,0,1]^{\!\top}$ and the left family $s=A^{-1}v$ is $(s_{1},s_{2},\ldots,s_{n-1},1)$. Otherwise it can be brought to this form by some admissible transformation $(P^{\circ},Q^{\circ})$. Now let $\bar{A}$ denote the upper left $(n-1)\times(n-1)$ block of $A$, let $\bar{s}=(s_{1},\ldots,s_{n-1})$ and write $As=v$ as

[TABLE]

This system is equivalent to

[TABLE]

Since the left family is $\mathbb{K}$-linearly independent (by minimality of $\mathcal{A}$), the matrix $\tilde{A}=\bigl[\begin{smallmatrix}\bar{A}&b\\ c&d-1\end{smallmatrix}\bigr]$ cannot be full. We claim that there is only one possibility to transform $\tilde{A}$ to a hollow matrix, namely with zero last row. If we cannot produce an $(n-i)\times i$ block of zeros (by invertible transformations) in the first $n-1$ rows of $\tilde{A}$, then we cannot get blocks of zeros of size $(n-i+1)\times i$ and we are done.

Now assume that there are invertible matrices $P^{\prime}\in\mathbb{K}^{(n-1)\times(n-1)}$ and (admissible) $Q\in\mathbb{K}^{n\times n}$ with $(Q^{-1}s)_{1}=s_{1}$ , such that $P^{\prime}[\bar{A},b]Q$ contains a zero block of size $(n-i)\times i$ for some $i=1,\ldots,n-1$ . There are two cases. If the first $n-i$ entries in column 1 cannot be made zero, we construct an upper right zero block:

[TABLE]

where $A_{11}$ has size $(n-i)\times(n-i)$ . If $A_{11}$ were not full, then $A$ would not be full (the last row is not involved in the transformation). Hence this pivot block is invertible over the free field. Therefore $\hat{s}_{1}=\hat{s}_{2}=\ldots=\hat{s}_{n-i}=0$ . Otherwise we construct an upper left zero block in $PAQ$ . But then $\hat{s}_{i+1}=\hat{s}_{i+2}=\ldots=\hat{s}_{n}=0$ . Both contradict $\mathbb{K}$ -linear independence of the left family.

So there is only one block left, which can make $\tilde{A}$ non-full. Hence, by Lemma 1.3, the modified (system) matrix $\tilde{A}$ is associated over $\mathbb{K}$ to a linear hollow matrix with a $1\times n$ block of zeros, say in the last row (the columns and the first $n-1$ rows are left untouched):

[TABLE]

Hence we have $T\bar{A}+c=0$ and therefore

[TABLE]

Clearly $Tb+d=1$ . The transformation

[TABLE]

does the job. ∎

Lemma 4.19 (for Inverse Type $(1,0)$ ).

Let $\mathcal{A}=(u,A,v)$ be a minimal admissible linear system with $\dim\mathcal{A}=n\geq 2$ and $1\in R(\mathcal{A})$ . Then there exists an admissible transformation $(P,Q)$ such that $(uQ,PAQ,Pv)$ is of the form (4.9).

Proof.

The proof is similar to the previous one switching the role of left and right family. ∎

Remark. If $1\in R(\mathcal{A})$ for some minimal ALS $\mathcal{A}=(u,A,v)$, say $\dim\mathcal{A}=n$, then, by Lemma 4.19, there is an admissible transformation $(P,Q)$ such that the first column in $PAQ$ is $[1,0,\ldots,0]^{\!\top}$. So, if the first column of $A=(a_{ij})$ is not in this form, an admissible transformation can be found in two steps: Firstly, we can set up a linear system to determine an $(n-1)$-tuple of scalars $(\mu_{2},\mu_{3},\ldots,\mu_{n})$ such that $a_{i1}+\mu_{2}a_{i2}+\mu_{3}a_{i3}+\ldots+\mu_{n}a_{in}$ is in $\mathbb{K}$ for $i=1,2,\ldots,n$. Secondly, we use elementary row transformations (Gaussian elimination in the first column) and, if necessary, permutations to get the desired form of the first column. Together, these transformations give some (admissible) transformation $(P^{\prime},Q^{\prime})$.

An analogous procedure can be applied if $1\in L(\mathcal{A})$ . And it can be combined for the case as in Lemma 4.5 (Inverse Type $(1,1)$ ). It works more generally for non-minimal systems, but can fail in “pathological” cases. Compare with Example 2.5.
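The first of the two steps in this remark is a plain linear system over $\mathbb{K}$: for every letter and every row, the coefficient of that letter in $a_{i1}+\mu_{2}a_{i2}+\ldots+\mu_{n}a_{in}$ has to vanish. A minimal sketch under illustrative assumptions (the system matrix is stored as a dictionary from `"1"` and the letters to coefficient matrices, the function names are hypothetical, and $n\geq 2$ with at least one letter present); the second step, Gaussian elimination in the first column, is not shown.

```python
from fractions import Fraction


def solve_any(rows, rhs):
    """One solution of rows * mu = rhs over the rationals, or None if inconsistent."""
    m = len(rows[0])
    aug = [[Fraction(a) for a in r] + [Fraction(b)] for r, b in zip(rows, rhs)]
    pivots, r = [], 0
    for c in range(m):
        p = next((i for i in range(r, len(aug)) if aug[i][c] != 0), None)
        if p is None:
            continue
        aug[r], aug[p] = aug[p], aug[r]
        pivot = aug[r][c]
        aug[r] = [a / pivot for a in aug[r]]
        for i in range(len(aug)):
            if i != r and aug[i][c] != 0:
                factor = aug[i][c]
                aug[i] = [a - factor * t for a, t in zip(aug[i], aug[r])]
        pivots.append(c)
        r += 1
    if any(row[m] != 0 for row in aug[r:]):
        return None                       # some equation reads 0 = nonzero
    mu = [Fraction(0)] * m                # free variables are set to zero
    for i, c in enumerate(pivots):
        mu[c] = aug[i][m]
    return mu


def scalar_first_column(A):
    """Scalars [mu_2, ..., mu_n] making column 1 + sum mu_j column j constant, or None."""
    rows, rhs = [], []
    for key, C in A.items():
        if key == "1":
            continue                      # constant parts impose no condition
        for row in C:
            rows.append(row[1:])          # coefficients of mu_2, ..., mu_n
            rhs.append(-row[0])
    return solve_any(rows, rhs)


def new_first_column(A, mu):
    """Constant first column after adding mu_j times column j to column 1."""
    return [row[0] + sum(m * a for m, a in zip(mu, row[1:])) for row in A["1"]]


# System matrix [[1 + x, -x], [-1, 1]] (an ALS for x with right hand side [0, 1]).
A = {"1": [[1, 0], [-1, 1]], "x": [[1, -1], [0, 0]]}
mu = scalar_first_column(A)               # here mu_2 = 1
assert new_first_column(A, mu) == [1, 0]
```

In this example the first column is already $[1,0]^{\!\top}$ after the column step, so no row operations are needed. When `scalar_first_column` returns `None`, no such scalars exist for the given system.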

Theorem 4.20 (Minimal Inverse).

Let $0\neq f\in\mathbb{K}(\!\langle X\rangle\!)$ be given by the minimal system $\mathcal{A}=(u,A,v)$ of dimension $n$ . Then a minimal admissible linear system for $f^{-1}$ is given by

[TABLE]

provided that the necessary transformations according to Lemmas 4.18 and 4.19 are done before.

Proof.

See Lemmas 4.5, 4.8, 4.11 and 4.14. ∎
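The dimensions stated in the four lemmas can be summarized in a small lookup. The sketch below is our reading, not a formula quoted from the theorem: it assumes that type $(1,1)$ corresponds to $1\in R(\mathcal{A})$ and $1\in L(\mathcal{A})$, type $(1,0)$ to $1\in R(\mathcal{A})$ only, type $(0,1)$ to $1\in L(\mathcal{A})$ only, and type $(0,0)$ to neither, as suggested by Lemmas 4.18 and 4.19. The function name and the flags are hypothetical.

```python
def rank_of_inverse(n, one_in_right, one_in_left):
    """Dimension of a minimal ALS for f^{-1}, given a minimal ALS of dimension n for f.

    one_in_right / one_in_left say whether 1 lies in R(A) / L(A).
    """
    if n < 1:
        raise ValueError("the dimension must be positive")
    if n == 1:
        if one_in_right != one_in_left:
            # Our own observation: in dimension 1 both families span the same line.
            raise ValueError("in dimension 1 the two flags must agree")
        # Scalars have rank 1 (Corollary 4.22); otherwise Lemma 4.14 applies.
        return 1 if one_in_right else 2
    if one_in_right and one_in_left:
        return n - 1        # Lemma 4.5, type (1,1)
    if one_in_right or one_in_left:
        return n            # Lemma 4.8 or Lemma 4.11, types (1,0) and (0,1)
    return n + 1            # Lemma 4.14, type (0,0)


# Example 4.17: x y z^{-1} has rank 3, 1 is in the right but not in the left family.
assert rank_of_inverse(3, True, False) == 3
```

Applied twice to Example 4.17 the result stays 3, matching the remark there that Lemma 4.8 can be applied again.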

There are two immediate consequences: The first follows from Proposition 4.16, that is, the inverse type $(1,1)$ applies in particular to polynomials. The second can be used to distinguish between “trivial” units (in the ground field $\mathbb{K}$ ) and “non-trivial” units, that is, elements in $\mathbb{K}(\!\langle X\rangle\!)\setminus\mathbb{K}$ .

Corollary 4.21.

Let $p\in\mathbb{K}\langle X\rangle$ with $\operatorname{rank}p=n\geq 2$ . Then $\operatorname{rank}(p^{-1})=n-1$ .

Corollary 4.22.

Let $0\neq f\in\mathbb{F}$ . Then $f\in\mathbb{K}$ if and only if $\operatorname{rank}(f)=\operatorname{rank}(f^{-1})=1$ .

Acknowledgement

Special thanks go to Marek Bożejko and Victor Vinnikov for the hospitality and all the motivating discussions in Wrocław and Beer-Sheva, respectively. Above all, I am very grateful to Franz Lehner, who gave me the chance to enter the world of non-commutative mathematics. Without the freedom, support and advice he offered this work would not have been possible. Additionally I thank the anonymous referees for the valuable comments.

Bibliography

[AL50] A. S. Amitsur and J. Levitzki. Minimal identities for algebras. Proc. Amer. Math. Soc., 1:449–463, 1950.
[Ami66] S. A. Amitsur. Rational identities and applications to algebra and geometry. J. Algebra, 3:304–359, 1966.
[And13] G. W. Anderson. Convergence of the largest singular value of a polynomial in independent Wigner matrices. Ann. Probab., 41(3B):2103–2181, 2013.
[Ber78] G. M. Bergman. The diamond lemma for ring theory. Adv. in Math., 29(2):178–218, 1978.
[BMS17] S. T. Belinschi, T. Mai, and R. Speicher. Analytic subordination theory of operator-valued free additive convolution and the solution of a general random matrix problem. J. Reine Angew. Math., 732:21–53, 2017.
[BR11] J. Berstel and C. Reutenauer. Noncommutative rational series with applications, volume 137 of Encyclopedia of Mathematics and its Applications. Cambridge University Press, Cambridge, 2011.
[CC80] A. Cardon and M. Crochemore. Détermination de la représentation standard d’une série reconnaissable. RAIRO Inform. Théor., 14(4):371–379, 1980.
[Coh72] P. M. Cohn. Generalized rational identities. In Ring theory (Proc. Conf., Park City, Utah, 1971), pages 107–115. Academic Press, New York, 1972.
