Computational Limitations of Affine Automata

Mika Hirvensalo; Etienne Moutot; Abuzer Yakaryılmaz

arXiv:1904.02428·cs.FL·April 5, 2019


TL;DR

This paper investigates the computational limits of affine automata, demonstrating their simulation in logarithmic space for certain cases and establishing impossibility results for algebraic-valued affine automata, thus delineating their recognition capabilities.

Contribution

It provides new theoretical bounds on affine automata, including their simulation complexity and limitations in recognizing specific unary languages.

Findings

01. Bounded-error rational-valued affine automata are simulated in logarithmic space.

02. Algebraic-valued affine automata cannot recognize certain unary languages.

03. Identifies limitations of affine automata with respect to recognition power.

Abstract

We present two new results on the computational limitations of affine automata. First, we show that the computation of bounded-error rational-valued affine automata can be simulated in logarithmic space. Second, we give an impossibility result for algebraic-valued affine automata. As a result, we identify some unary languages (in logarithmic space) that are not recognized by algebraic-valued affine automata with cutpoints.

Equations

$P=(\vec{x},\{M_{i}\mid i\in\Sigma\},\vec{y})$

$f_{P}(w)=\vec{y}^{T}M_{w^{R}}\vec{x}.$

$A=(\vec{x},\{M_{i}\mid i\in\Sigma\},F)$

$f_{A}(w)=\frac{\left|FM_{w}\vec{v}_{0}\right|}{\left|M_{w}\vec{v}_{0}\right|}.$

$L=\{w\in\Sigma^{*}\mid f_{A}(w)>\lambda\}.$

$L=\{w\in\Sigma^{*}\mid f_{A}(w)\neq\lambda\}.$

$x\equiv a_{1}\pmod{n_{1}},\ \ldots,\ x\equiv a_{r}\pmod{n_{r}},$

$L=\{w\in\Sigma^{*}\mid f_{P}(w)=\frac{1}{2}\}.$

$f_{P}(w)=\vec{y}^{T}M_{w_{n}}\cdots M_{w_{1}}\vec{x}.$

$f_{P}(w)=\frac{1}{D^{n}}\underbrace{\vec{y}^{T}M_{w_{n}}^{\prime}\cdots M_{w_{1}}^{\prime}\vec{x}}_{f_{P^{\prime}}(w)},$

$L=\{w\in\Sigma^{*}\mid 2f_{P^{\prime}}(w)=D^{n}\}.$

$(2f_{P^{\prime}}(w)\bmod p)=\vec{y}^{T}M_{w_{n}}^{(p)}\cdots M_{w_{1}}^{(p)}\vec{x}$

$(x-y\bmod N)=\begin{cases}x-y&\text{if }x\geq y,\\ N+x-y&\text{if }x<y,\end{cases}$

$L=\{w\in\Sigma^{*}\mid f_{A}(w)>\frac{1}{2}\}.$

$C_{i}=B_{i}+m\mathbb{E},$

$C_{w}=B_{w}+m^{|w|}(k+2)^{|w|-1}\mathbb{E},$

$C_{w}\vec{x}^{\prime}=B_{w}\vec{x}^{\prime}+m^{|w|}(k+2)^{|w|-1}\mathbb{E}\vec{x}^{\prime}.$

$F^{\prime}C_{w}\vec{x}^{\prime}=F^{\prime}B_{w}\vec{x}^{\prime}+m^{|w|}(k+2)^{|w|-1}F^{\prime}\mathbb{E}\vec{x}^{\prime}.$

$\frac{\left|FM_{w}\vec{v}_{0}\right|}{\left|M_{w}\vec{v}_{0}\right|}=\frac{\left|F^{\prime}B_{w}\vec{v}_{0}^{\prime}\right|}{\left|B_{w}\vec{v}_{0}^{\prime}\right|}=\frac{\left|F^{\prime}C_{w}\vec{v}_{0}^{\prime}-m^{|w|}(k+2)^{|w|-1}F^{\prime}\mathbb{E}\vec{x}^{\prime}\right|}{\left|C_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}\mathbb{E}\vec{x}^{\prime}\right|}$

$\frac{\left|FM_{w}\vec{x}\right|}{\left|M_{w}\vec{x}\right|}=\frac{\left|F^{\prime}B_{w}\vec{x}^{\prime}\right|}{\left|B_{w}\vec{x}^{\prime}\right|}=\frac{\left|F^{\prime}D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}F^{\prime}\mathbb{E}\vec{x}^{\prime}\right|}{\left|D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}\mathbb{E}\vec{x}^{\prime}\right|}.$

$\frac{\left|FM_{w}\vec{x}\right|}{\left|M_{w}\vec{x}\right|}\geq\frac{1}{2}$

$2\left|F^{\prime}D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}F^{\prime}\mathbb{E}\vec{x}^{\prime}\right|\geq\left|D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}\mathbb{E}\vec{x}^{\prime}\right|.$

$\left|\vec{a}-\vec{b}\right|=\left|a_{1}-b_{1}\right|+\cdots+\left|a_{k}-b_{k}\right|$

$\underline{\operatorname{dens}}(L)=\liminf_{n\to\infty}\frac{\left|\{a^{k}\in L\mid k\leq n\}\right|}{n+1}.$

$\lim_{n\to\infty}\frac{C(I,n)}{n}=(b_{1}-a_{1})\cdots(b_{k}-a_{k}).$

$f_{A}(a^{n})=\frac{\left|PM^{n}\vec{v}\right|}{\left|M^{n}\vec{v}\right|}.$

$(M^{n}\vec{v})_{j}=\sum_{k=1}^{s}p_{jk}(n)\lambda_{k}^{n},$

$\frac{\left|P(\alpha M)^{n}\vec{v}\right|}{\left|(\alpha M)^{n}\vec{v}\right|}=\frac{\left|\alpha^{n}PM^{n}\vec{v}\right|}{\left|\alpha^{n}M^{n}\vec{v}\right|}=\frac{\left|PM^{n}\vec{v}\right|}{\left|M^{n}\vec{v}\right|}.$

$(M^{r+mN}\vec{v})_{j}=\sum_{k=1}^{s}p_{jk}(r+mN)\lambda_{k}^{r}(\lambda_{k})^{Nm}=\sum_{k=1}^{s^{\prime}}q_{jk}(m)\mu_{k}^{m},$

$f_{A}(a^{n})>\frac{1}{2}\Leftrightarrow 2\left|PM^{n}\vec{v}\right|>\left|M^{n}\vec{v}\right|$


Full text

Mika Hirvensalo (1), Etienne Moutot (1, 2; ORCID 0000-0003-2073-4709), Abuzer Yakaryılmaz (3; ORCID 0000-0002-2372-252X)

(1) Department of Mathematics and Statistics, University of Turku, FI-20014 Turku, Finland

(2) LIP, ENS de Lyon – CNRS – UCBL – Université de Lyon, École Normale Supérieure de Lyon, Lyon, France

(3) Center for Quantum Computer Science, Faculty of Computing, University of Latvia, Rīga, Latvia


1 Introduction

Finite automata are an interesting model to study since they express the very natural limitation of finite memory. They are also good computational models, since they are simpler than many other machines such as pushdown automata or Turing machines. Due to this simplicity, there exist many different models of finite automata, each expressing a different computational setting. Deterministic [16], probabilistic [14] and quantum [3] finite automata (DFAs, PFAs, and QFAs, respectively) have been studied in order to better understand the computational limitations inherent to each of these settings.

Recently, Díaz-Caro and Yakaryılmaz introduced a new model, called affine computation [5]. As a non-physical model, the goal of affine computation is to investigate the power of interference caused by negative amplitudes in the computation, as in the quantum case. But unlike QFAs, affine finite automata (AfAs) have an unbounded state set, and the final operation corresponding to quantum measurement cannot be interpreted as linear. The final operation in AfAs is analogous to renormalization in the Kondacs-Watrous [11] or Latvian [2] quantum automata models.

AfAs and certain generalizations of them have been investigated in a series of works [5, 21, 9, 8]. In most cases, affine models (e.g., bounded-error and unbounded-error AfAs, zero-error affine OBDDs, zero-error affine counter automata, etc.) have been shown to be more powerful than their classical or quantum counterparts. On the other hand, little is known about the computational limitations of AfAs. In this direction, we present two new results. First, we show that the computation of bounded-error rational-valued affine automata can be simulated in logarithmic space, which answers positively one of the open problems in [5]. Second, we give an impossibility result for algebraic-valued AfAs, and, as a result, we identify some unary languages (in logarithmic space) that are not recognized by algebraic-valued AfAs with cutpoints.

2 Preliminaries

For a given word $w$, $w_{i}$ represents its $i$-th letter. For any given class $\mathsf{C}$, $\mathsf{C}_{\mathbb{Q}}$ and $\mathsf{C}_{\mathbb{A}}$ denote the classes defined by the machines restricted to have rational-valued and algebraic-valued components, respectively. The logarithmic and polynomial space classes are denoted by $\mathsf{L}$ and $\mathsf{PSPACE}$, respectively. We assume that the reader is familiar with the basics of automata theory.

2.1 Models

By a probability distribution (also known as a stochastic vector) we mean a (column) vector with nonnegative entries summing to one, and a stochastic matrix (also known as a Markov matrix) is a square matrix all of whose columns are probability distributions.

Definition 1 (PFA)

A $k$ -state probabilistic finite automaton (PFA) $P$ over alphabet $\Sigma$ is a triplet

$$P=(\vec{x},\{M_{i}\mid i\in\Sigma\},\vec{y}),$$

where $\vec{x}\in\mathbb{R}^{k}$ is a stochastic vector called the initial distribution, each $M_{i}\in\mathbb{R}^{k\times k}$ is a stochastic matrix, and $\vec{y}\in\{0,1\}^{k}$ is the final vector (each 1 in $\vec{y}$ represents an accepting state).

For any input word $w\in\Sigma^{*}$ with length $n$ , $P$ has a probability distribution of states as follows: $M_{w}\vec{x}=M_{w_{n}}\cdots M_{w_{1}}\vec{x}.$ The accepting probability corresponds to the probability of $P$ being in an accepting state after reading $w$ , which is given by

$$f_{P}(w)=\vec{y}^{T}M_{w^{R}}\vec{x}. \tag{1}$$
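As a minimal sketch of this definition, the accepting probability can be evaluated by applying the matrix of $w_{1}$ first and the matrix of $w_{n}$ last, then summing the entries of the resulting distribution at the accepting states. The two-state automaton below is a hypothetical example, not one taken from the paper; exact rationals avoid rounding.

```python
from fractions import Fraction

def mat_vec(M, v):
    """Multiply a square matrix (given as a list of rows) by a column vector."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

def pfa_accept_probability(x, matrices, y, word):
    """Return y^T M_{w_n} ... M_{w_1} x for the given word."""
    v = list(x)
    for letter in word:              # the matrix of w_1 is applied first
        v = mat_vec(matrices[letter], v)
    return sum(yi * vi for yi, vi in zip(y, v))

# Hypothetical 2-state PFA over {a, b}; every column sums to 1.
half = Fraction(1, 2)
x = [Fraction(1), Fraction(0)]
matrices = {
    "a": [[half, Fraction(0)],
          [half, Fraction(1)]],
    "b": [[Fraction(1), Fraction(0)],
          [Fraction(0), Fraction(1)]],
}
y = [0, 1]                           # the second state is accepting

print(pfa_accept_probability(x, matrices, y, "aab"))  # 3/4
```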

Affine finite automaton (AfA) is a generalization of PFA allowing negative transition values. Only allowing negative values in the transition matrices does not add any power (generalized PFAs are equivalent to PFAs, see [19]), but affine automata introduce also a non-linear behaviour. The automaton acts like a generalized probabilistic automaton until the last operation, which is a non-linear operation called a weighting operation.

Definition 2

A vector $\vec{v}\in\mathbb{R}^{k}$ is an affine vector if and only if its coordinates sum to $1$. A matrix $M$ is an affine matrix if and only if all its columns are affine vectors.

The following property is straightforward to verify, and it will ensure that affine automata are well defined.

Property 2.1

If $M$ and $N$ are affine matrices, then $MN$ is also an affine matrix. In particular, if $\vec{v}$ is an affine vector, then $M\vec{v}$ is also an affine vector.
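One way to carry out the verification, writing $\vec{1}$ for the all-ones column vector (a symbol introduced here only for this check): a matrix $M$ is affine exactly when $\vec{1}^{T}M=\vec{1}^{T}$, and a vector $\vec{v}$ is affine exactly when $\vec{1}^{T}\vec{v}=1$. Both claims then follow from associativity:

```latex
\vec{1}^{T}(MN) = (\vec{1}^{T}M)N = \vec{1}^{T}N = \vec{1}^{T},
\qquad
\vec{1}^{T}(M\vec{v}) = (\vec{1}^{T}M)\vec{v} = \vec{1}^{T}\vec{v} = 1 .
```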

Definition 3 (AfA)

A $k$ -state AfA $A$ over alphabet $\Sigma$ is a triplet

$$A=(\vec{x},\{M_{i}\mid i\in\Sigma\},F),$$

where $\vec{x}$ is an initial affine vector, each $M_{i}$ is an affine transition matrix, and $F=\operatorname{diag}(\delta_{1},\ldots,\delta_{k})$ is the final projection matrix, where each $\delta_{i}\in\{0,1\}$ for $1\leq i\leq k$.

The value computed by an affine automaton is most conveniently defined via the following notion:

Definition 4

Notation $|\vec{v}|=\sum_{i}|v_{i}|$ stands for the usual $L^{1}$ norm.

Now, the final value of the affine automaton $A$ of Definition 3 is

$$f_{A}(w)=\frac{\left|FM_{w}\vec{v}_{0}\right|}{\left|M_{w}\vec{v}_{0}\right|}. \tag{2}$$

Clearly $f_{A}(w)\in[0,1]$ for any input word $w\in\Sigma^{*}$ .
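The definitions above can be sketched in a few lines: check that the matrices are affine, apply them letter by letter, then take the ratio of the $L^{1}$ norm of the accepting coordinates to the $L^{1}$ norm of the whole vector. The one-letter automaton below is a hypothetical example chosen only to show a negative entry at work.

```python
from fractions import Fraction

def is_affine_matrix(M):
    """Every column of M sums to 1."""
    return all(sum(M[i][j] for i in range(len(M))) == 1 for j in range(len(M[0])))

def l1_norm(v):
    return sum(abs(c) for c in v)

def afa_value(x, matrices, accepting, word):
    """Return |F M_w x| / |M_w x|, where F keeps the coordinates flagged in `accepting`."""
    v = list(x)
    for letter in word:
        M = matrices[letter]
        v = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    projected = [c if delta else 0 for c, delta in zip(v, accepting)]
    # The denominator is never zero: the coordinates of an affine vector sum to 1.
    return Fraction(l1_norm(projected), l1_norm(v))

# Hypothetical 2-state AfA over {a}; note the negative transition value.
x = [Fraction(1), Fraction(0)]
matrices = {"a": [[Fraction(2), Fraction(0)],
                  [Fraction(-1), Fraction(1)]]}
accepting = [1, 0]

print(afa_value(x, matrices, accepting, "a"))   # 2/3
print(afa_value(x, matrices, accepting, "aa"))  # 4/7
```

After reading `a` the state vector is $(2,-1)$, which still sums to $1$ but has $L^{1}$ norm $3$, so the value is $2/3$ rather than $2$.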

Remark 1

Notice that the final value for PFAs (1) is defined as matrix product $\vec{v}_{f}\mapsto\vec{y}^{T}\vec{v}_{f}$ , which is a linear operation on $\vec{v}_{f}$ . On the other hand, computing final value from $\vec{v}_{f}$ as in (2) involves nonlinear operations $\displaystyle\vec{v}_{f}\mapsto\frac{|F\vec{v}_{f}|}{|\vec{v}_{f}|}$ such as $L^{1}$ -norm and normalization (division).

2.2 Cutpoint languages

Given a function $f:\Sigma^{*}\to[0,1]$ computed by an automaton (stochastic or affine), there are different ways of defining the language recognized by this automaton.

Definition 5 (Cutpoint languages)

A language $L\subseteq\Sigma^{*}$ is recognized by an automaton $A$ with cutpoint $\lambda\in[0,1)$ if and only if

$$L=\{w\in\Sigma^{*}\mid f_{A}(w)>\lambda\}.$$

These languages are called cutpoint languages. In the case of probabilistic (resp., affine) automata, the class of cutpoint languages is called the class of stochastic languages (resp., affine languages) and is denoted by $\mathsf{SL}$ (resp., $\mathsf{AfL}$).

We remark that fixing the cutpoint in the interval $(0,1)$ does not change the classes $\mathsf{SL}$ and $\mathsf{AfL}$ [14, 5].

Definition 6 (Exclusive cutpoint languages)

A language $L\subseteq\Sigma^{*}$ is recognized by an automaton $A$ with exclusive cutpoint $\lambda\in[0,1]$ if and only if

$$L=\{w\in\Sigma^{*}\mid f_{A}(w)\neq\lambda\}.$$

These languages are called exclusive cutpoint languages. In the case of probabilistic (resp., affine) automata, the class of exclusive cutpoint languages is called the class of exclusive stochastic languages (resp., exclusive affine languages) and is denoted by $\mathsf{SL}^{\neq}$ (resp., $\mathsf{AfL}^{\neq}$). The complement of $\mathsf{SL}^{\neq}$ (resp., $\mathsf{AfL}^{\neq}$) is $\mathsf{SL}^{=}$ (resp., $\mathsf{AfL}^{=}$).

Again, we remark that fixing the cutpoint in the interval $(0,1)$ does not change the classes $\mathsf{SL}^{\neq}$ , $\mathsf{SL}^{=}$ , $\mathsf{AfL}^{\neq}$ , and $\mathsf{AfL}^{=}$ [14, 13, 5].

A stronger condition is to impose that accepted and rejected words are separated by a gap: the cutpoint is said to be isolated.

Definition 7 (Isolated cutpoint or bounded error)

A language $L$ is recognized by an automaton $A$ with isolated cutpoint $\lambda$ if and only if there exists $\delta>0$ such that $\forall w\in L,f_{A}(w)\geq\lambda+\delta$, and $\forall w\notin L,f_{A}(w)\leq\lambda-\delta$. The set of languages recognized by affine automata with bounded error (or isolated cutpoint) is denoted by $\mathsf{BAfL}$.

A classical result by Rabin [15] shows that isolated cutpoint stochastic languages are regular. Rabin’s proof essentially relies on two facts: 1) the function mapping the final vector into $[0,1]$ is a contraction, and 2) the state vector set is bounded. By modifying Rabin’s proof, it is possible to show that also many quantum variants of stochastic automata obey the same principle [3]: bounded-error property implies the regularity of the accepted languages. In fact, E. Jeandel generalized Rabin’s proof by demonstrating that the compactness of the state vector set together with the continuity of the final function are sufficient to guarantee the regularity of the accepted language if the cutpoint is isolated [10].

3 Logarithmic simulation

Macarie [12] proved that $\mathsf{SL}^{=}_{\mathbb{Q}}\subseteq\mathsf{L}$ and $\mathsf{SL}_{\mathbb{Q}}\subseteq\mathsf{L}$ . That is, the computation of any rational-valued probabilistic automaton can be simulated by an algorithm using only logarithmic space. However, this logarithmic simulation cannot be directly generalized for rational-valued affine automata due to the non-linearity of their last operation. In order to understand why, we will first reproduce the proof.

Before that, let us introduce the most important space-saving technique:

Definition 8

Notation $(b\mod c)$ stands for the least nonnegative integer $a$ satisfying $a\equiv b\pmod{c}$ . If $\vec{x}=(x_{1},\ldots,x_{r})$ and $\vec{n}=(n_{1},\ldots,n_{r})\in\mathbb{Z}^{r}$ , we define $\vec{x}\pmod{\vec{n}}=((x_{1}\!\!\!\mod n_{1}),\ldots,(x_{r}\!\!\!\mod n_{r}))$ . Analogously, for any matrix $A\in\mathbb{Z}^{k\times k}$ , we define $(A\pmod{n})_{ij}=(A_{ij}\mod n)$ .

The problem of recovering $x$ from the residue representation $((x \bmod n_{1}),\ldots,(x \bmod n_{r}))$ is practically resolved by the following well-known theorem.

Theorem 3.1 (The Chinese Remainder Theorem)

Let $n_{1},\ldots,n_{r}$ be pairwise coprime integers, $a_{1},\ldots,a_{r}$ be arbitrary integers, and $N=n_{1}\cdots n_{r}$ . Then there exists an integer $x$ such that

$$x\equiv a_{1}\pmod{n_{1}},\ \ldots,\ x\equiv a_{r}\pmod{n_{r}}, \tag{3}$$

and any two integers $x_{1}$ and $x_{2}$ satisfying (3) satisfy also $x_{1}\equiv x_{2}\pmod{N}$ .

Remark 2

The above remarks and the Chinese Remainder Theorem imply that the integer ring operations $(+,\cdot)$ can be implemented using the residue representation, and that the integers can be recovered from the residue representations provided that 1) $\vec{n}=(n_{1},\ldots,n_{r})$ consists of pairwise coprime integers and 2) the integers stay in an interval of length $N-1$, where $N=n_{1}\cdots n_{r}$.
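The remark can be illustrated with a small sketch: arithmetic is done coordinate-wise on residues, and the integer is recovered at the end by the standard constructive form of the Chinese Remainder Theorem. The moduli $(3,5,7)$ and all function names are choices made for this example only.

```python
from math import prod

MODULI = (3, 5, 7)                   # pairwise coprime, N = 105

def to_residues(x, moduli=MODULI):
    return tuple(x % n for n in moduli)

def add_residues(a, b, moduli=MODULI):
    return tuple((ai + bi) % n for ai, bi, n in zip(a, b, moduli))

def mul_residues(a, b, moduli=MODULI):
    return tuple((ai * bi) % n for ai, bi, n in zip(a, b, moduli))

def crt_reconstruct(residues, moduli=MODULI):
    """Return the unique x in [0, N) with x = residues[i] (mod moduli[i])."""
    N = prod(moduli)
    x = 0
    for a, n in zip(residues, moduli):
        cofactor = N // n
        # pow(c, -1, n) is the modular inverse of c (Python 3.8+).
        x += a * cofactor * pow(cofactor, -1, n)
    return x % N

a, b = to_residues(17), to_residues(4)
print(crt_reconstruct(mul_residues(a, b)))   # 68
print(crt_reconstruct(add_residues(a, b)))   # 21
```

Both results are recovered correctly because they lie in $[0,104]$; a product such as $17\cdot 17=289$ would come back reduced modulo $105$, which is why condition 2) of the remark matters.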

Remark 3

In order to ensure that $\vec{n}=(n_{1},\ldots,n_{r})$ consists of pairwise coprime integers, we select numbers $n_{i}$ from the set of prime numbers. For the reasons that will become obvious later, we will however omit the first prime $2$ .

Definition 9

$\vec{p}_{r}$ is the $r$-tuple $\vec{p}_{r}=(3,5,7,\ldots,p_{r})$ consisting of the first $r$ primes after excluding $2$. For this selection, a consequence of the prime number theorem is that, asymptotically, $P_{r}=3\cdot 5\cdot 7\cdots p_{r}=\frac{1}{2}e^{(1+o(1))r\ln r}$.
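For concreteness, the tuple $\vec{p}_{r}$ and the product $P_{r}$ can be computed directly for small $r$. The sketch below uses plain trial division and hypothetical helper names; it says nothing about the asymptotic estimate, it only shows how quickly $P_{r}$ outgrows a given bound.

```python
from math import prod

def first_odd_primes(r):
    """Return the first r odd primes [3, 5, 7, ...] by trial division."""
    primes = []
    candidate = 3
    while len(primes) < r:
        # Candidates are odd, so dividing by the odd primes found so far suffices.
        if all(candidate % p for p in primes if p * p <= candidate):
            primes.append(candidate)
        candidate += 2
    return primes

def odd_primorial(r):
    """P_r = 3 * 5 * 7 * ... * p_r (the empty product is 1)."""
    return prod(first_odd_primes(r))

def smallest_r_exceeding(bound):
    """Smallest r with P_r > bound."""
    r = 0
    while odd_primorial(r) <= bound:
        r += 1
    return r

print(first_odd_primes(5))            # [3, 5, 7, 11, 13]
print(odd_primorial(5))               # 15015
print(smallest_r_exceeding(2 * 2**3)) # 3
```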

Theorem 3.2 (Macarie [12])

$\mathsf{SL}^{=}_{\mathbb{Q}}\subseteq\mathsf{L}$.

Proof

For a given alphabet $\Sigma$, let $L\subseteq\Sigma^{*}$ be a language in $\mathsf{SL}^{=}_{\mathbb{Q}}$ and $P=(\vec{x},\{M_{i}\mid i\in\Sigma\},\vec{y})$ be a $k$-state rational-valued PFA over $\Sigma$ such that

$$L=\{w\in\Sigma^{*}\mid f_{P}(w)=\tfrac{1}{2}\}.$$

We recall that, for any input word $w=w_{1}\cdots w_{n}\in\Sigma^{*}$, we have

$$f_{P}(w)=\vec{y}^{T}M_{w_{n}}\cdots M_{w_{1}}\vec{x}. \tag{4}$$

Since each $M_{i}\in\mathbb{Q}^{k\times k}$, there exists a number $D\in\mathbb{N}$ such that each matrix $M_{i}^{\prime}=DM_{i}\in\mathbb{Z}^{k\times k}$, and (4) can be rewritten as

$$f_{P}(w)=\frac{1}{D^{n}}\underbrace{\vec{y}^{T}M_{w_{n}}^{\prime}\cdots M_{w_{1}}^{\prime}\vec{x}}_{f_{P^{\prime}}(w)},$$

and the language $L$ can be characterized as

$$L=\{w\in\Sigma^{*}\mid 2f_{P^{\prime}}(w)=D^{n}\}.$$

Since the original matrices $M_{i}$ are stochastic, meaning that their entries are in $[0,1]$ , it follows that each matrix $M_{i}^{\prime}=DM_{i}$ has integer entries in $[0,D]$ . Moreover, $f_{P}(w)\in[0,1]$ implies that $f_{P^{\prime}}(w)\in[0,D^{n}]$ for every input word $w\in\Sigma^{n}$ . As now $f_{P^{\prime}}(w)$ can be computed by multiplying $k\times k$ integer matrices, the residue representation will serve as a space-saving technique.

We will fix $r$ later, but the description of the algorithm is as follows: For each entry $p$ of $\vec{p}_{r}=(3,5,7,\ldots,p_{r})$ , we let $M_{i}^{(p)}=M_{i}^{\prime}\mod p$ , and compute

$$(2f_{P^{\prime}}(w)\bmod p)=\vec{y}^{T}M_{w_{n}}^{(p)}\cdots M_{w_{1}}^{(p)}\vec{x}. \tag{6}$$

As all the products are computed modulo $p$, $k^{2}\log p$ bits are needed to compute (6). Likewise, $(D^{n} \bmod p)$ can be computed in space $O(\log p)$ for each coordinate $p$ of $\vec{p}_{r}$. The comparison $2f_{P^{\prime}}(w)\equiv D^{n}\pmod{p}$ can hence be done in $O(\log p)$ space.

Reusing the space, the comparison can be made sequentially for each coordinate of $\vec{p}_{r}$, and if any comparison gives a negative outcome, we can conclude that $2f_{P^{\prime}}(w)\neq D^{n}$.

To conclude the proof, it remains to fix $r$ so that both $2f_{P^{\prime}}(w)$ and $D^{n}$ are smaller than $P_{r}=3\cdot 5\cdot 7\cdots p_{r}$. If no congruence test is negative, then the Chinese Remainder Theorem ensures that $2f_{P^{\prime}}(w)=D^{n}$. Since $2f_{P^{\prime}}(w)\leq 2D^{n}$, we need to select $r$ so that $\frac{1}{2}e^{(1+o(1))r\ln r}>2D^{n},$ which is equivalent to $\log\frac{1}{2}+(1+o(1))r\ln r>\log 2+n\log D.$ This inequality is clearly satisfied with $r=n$ for large enough $n$, and for each $n\geq 1$ by choosing $r=c\cdot n$, where $c$ is a positive constant (depending on $D$).

As a final remark, let us note that $p_{\lfloor cn\rfloor}$, the $\lfloor cn\rfloor$-th prime, can be generated in logarithmic space, and the prime number theorem implies that $O(\log n)$ bits are enough to represent $p_{\lfloor cn\rfloor}$, since $c$ is a constant. ∎
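The structure of the proof can be sketched as follows. This is an illustration only, under two assumptions that are not in the text: the initial vector has integer entries (for instance a single start state), and the running product of the primes is stored explicitly to decide when to stop. The second shortcut means the sketch does not itself run in logarithmic space; the proof instead fixes $r=c\cdot n$ in advance.

```python
from fractions import Fraction
from math import lcm                 # Python 3.9+

def odd_primes():
    """Yield 3, 5, 7, 11, ... by trial division."""
    found = []
    candidate = 3
    while True:
        if all(candidate % p for p in found if p * p <= candidate):
            found.append(candidate)
            yield candidate
        candidate += 2

def accepts_with_probability_half(x, matrices, y, word):
    """Decide f_P(word) == 1/2 using only computations modulo small odd primes."""
    # D clears every denominator, so each D * M_i is an integer matrix.
    D = lcm(*(e.denominator for M in matrices.values() for row in M for e in row))
    scaled = {a: [[int(e * D) for e in row] for row in M] for a, M in matrices.items()}
    n, k = len(word), len(x)
    bound = 2 * D ** n               # 2 f_{P'}(w) and D^n are both at most this
    product_of_primes = 1
    for p in odd_primes():
        v = [xi % p for xi in x]
        for letter in word:
            M = scaled[letter]
            v = [sum(M[i][j] * v[j] for j in range(k)) % p for i in range(k)]
        lhs = 2 * sum(yi * vi for yi, vi in zip(y, v)) % p
        if lhs != pow(D, n, p):
            return False             # one failed congruence proves inequality
        product_of_primes *= p
        if product_of_primes > bound:
            return True              # CRT: all congruences hold, so the integers agree

half = Fraction(1, 2)
x = [1, 0]                           # integer initial vector (assumption of this sketch)
matrices = {
    "a": [[half, Fraction(0)], [half, Fraction(1)]],
    "b": [[Fraction(1), Fraction(0)], [Fraction(0), Fraction(1)]],
}
y = [0, 1]

print(accepts_with_probability_half(x, matrices, y, "ab"))  # True
print(accepts_with_probability_half(x, matrices, y, "aa"))  # False
```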

To extend the above theorem to cover $\mathsf{SL}_{\mathbb{Q}}$ as well, auxiliary results are used.

Lemma 1 (Macarie [12])

If $N$ is an odd integer and $x$ , $y\in[0,N-1]$ are also integers, then $x\geq y$ iff $x-y$ has the same parity as $((x-y)\!\!\!\mod N)$ .

Proof

As $x$ , $y\in[0,N-1]$ , it follows that

$$(x-y \bmod N)=\begin{cases}x-y&\text{if }x\geq y,\\ N+x-y&\text{if }x<y,\end{cases}$$

which shows that the parity changes in the latter case since $N$ is odd. ∎
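The lemma is easy to check exhaustively for small moduli. The sketch below, with a hypothetical function name, also shows why $N$ must be odd: for even $N$ adding $N$ does not change the parity, so the test fails.

```python
def parity_lemma_holds(N):
    """Check 'x >= y iff x - y and ((x - y) mod N) have equal parity' on [0, N-1]."""
    for x in range(N):
        for y in range(N):
            # Python's % returns the least nonnegative residue, also for negative x - y.
            same_parity = (x - y) % 2 == ((x - y) % N) % 2
            if (x >= y) != same_parity:
                return False
    return True

print(parity_lemma_holds(9))   # True
print(parity_lemma_holds(4))   # False
```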

The problem with using the above lemma is that, in modular computing, numbers $x$ and $y$ are usually known only by their residue representations $\operatorname{Res}_{\vec{p}_{r}}(x)$ and $\operatorname{Res}_{\vec{p}_{r}}(y)$, and it is not straightforward to compute the parity from the modular representation in logarithmic space. Macarie solved this problem not only for parity but also for a more general modulus (not necessarily equal to $2$).

Lemma 2 (Claim modified from [12])

For any integer $x$ and modulus $\vec{p}_{r}=(3,5,7,\ldots,p_{r})$, there is a deterministic algorithm that, given $\operatorname{Res}_{\vec{p}_{r}}(x)$ and $M\in\mathbb{Z}$ as input, produces the output $x\pmod{M}$ in space $O(\log p_{r}+\log M)$.

As a corollary of the previous lemma, Macarie presented a conclusion which implies the logarithmic space simulation of rational stochastic automata.

Lemma 3 (Claim modified from [12])

Let $\vec{p}_{r}=(3,5,7,\ldots,p_{r})$ and $P_{r}=3\cdot 5\cdot 7\cdots p_{r}$. Given the residue representations of integers $x$, $y\in[0,P_{r}-1]$, the decisions $x>y$, $x=y$ or $x<y$ can be made in $O(\log p_{r})$ space.

Proof

The equality test can be done as in the proof of Theorem 3.2, testing the congruence sequentially for each prime. Testing $x\geq y$ is possible by Lemmata 1 and 2: first compute $\operatorname{Res}_{\vec{p}_{r}}(z)=\operatorname{Res}_{\vec{p}_{r}}(x)-\operatorname{Res}_{\vec{p}_{r}}(y)\pmod{\vec{p}_{r}}$, then compute the parities of $x$, $y$, $z$ using Lemma 2 with $M=2$. ∎
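The comparison procedure can be sketched as below. One caveat: the parity routine here simply reconstructs the integer by the Chinese Remainder Theorem, which stands in for the logarithmic-space algorithm of Lemma 2 and is not itself space-efficient. The moduli $(3,5,7)$ and all names are choices made for this example.

```python
from math import prod

MODULI = (3, 5, 7)                   # odd primes, P_r = 105

def to_residues(x):
    return tuple(x % p for p in MODULI)

def parity_from_residues(residues):
    """Stand-in for Lemma 2 with M = 2 (reconstructs the integer, so NOT logspace)."""
    N = prod(MODULI)
    value = sum(a * (N // p) * pow(N // p, -1, p) for a, p in zip(residues, MODULI)) % N
    return value % 2

def compare(res_x, res_y):
    """Return 1, 0 or -1 according to x > y, x == y or x < y, for x, y in [0, P_r - 1]."""
    if res_x == res_y:
        return 0                     # equal residues for every prime means x == y
    res_z = tuple((a - b) % p for a, b, p in zip(res_x, res_y, MODULI))
    # x - y has parity parity(x) XOR parity(y); by Lemma 1, x >= y holds
    # exactly when this equals the parity of ((x - y) mod P_r).
    diff_parity = parity_from_residues(res_x) ^ parity_from_residues(res_y)
    return 1 if diff_parity == parity_from_residues(res_z) else -1

print(compare(to_residues(17), to_residues(4)))   # 1
print(compare(to_residues(4), to_residues(17)))   # -1
```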

The following theorem is a straightforward corollary from the above:

Theorem 3.3

$\mathsf{SL}_{\mathbb{Q}}\subseteq\mathsf{L}$ .

When attempting to prove an analogous result for affine automata, there is at least one obstacle: computing the final value involves absolute values, but the absolute value is not even a well-defined operation in modular arithmetic. For example, $2\equiv-3\pmod{5}$, but $\left|2\right|\not\equiv\left|-3\right|\pmod{5}$. This is another way of pointing out that, in finite fields, there is no order relation compatible with the algebraic structure.

Hence for affine automata with matrix entries of both signs, another approach must be adopted. One obvious approach is to represent an integer $n$ as a pair $(\left|n\right|,\operatorname{sgn}(n))$, and apply modular arithmetic to $\left|n\right|$. The signum function and the absolute value indeed behave smoothly with respect to the product, but not with respect to the sum. This is a major problem with this approach: deciding the sign of a sum requires a comparison of the absolute values, which seems impossible without having the whole residue representation. Storing the latter, in turn, seems to cost too much space to fit the simulation in logarithmic space.

Hence the logspace simulation for automata with matrices having both positive and negative entries seems to need another approach. It turns out that we can use the procedure introduced by Turakainen already in 1969 [17, 19].

Theorem 3.4

$\mathsf{AfL}_{\mathbb{Q}}\subseteq\mathsf{L}$ .

Proof

For a given alphabet $\Sigma$, let $L\subseteq\Sigma^{*}$ be a language in $\mathsf{AfL}_{\mathbb{Q}}$ and $A=(\vec{x},\{M_{i}\mid i\in\Sigma\},F)$ be a $k$-state rational-valued AfA over $\Sigma$ such that

$$L=\{w\in\Sigma^{*}\mid f_{A}(w)>\tfrac{1}{2}\}.$$

For each $M_{i}\in\mathbb{Q}^{k\times k}$, we define a new matrix

$$B_{i}=\begin{pmatrix}0&\vec{0}^{T}&0\\ \vec{c}_{i}&M_{i}&\vec{0}\\ e_{i}&\vec{d}_{i}^{T}&0\end{pmatrix},$$

where $\vec{c}_{i}$, $\vec{d}_{i}$, and $e_{i}$ are chosen so that the column and row sums of $B_{i}$ are zero. We define

$$\vec{x}^{\prime}=\begin{pmatrix}0\\ \vec{x}\\ 0\end{pmatrix}$$

as the new initial state. For the projection matrix $F$, we define the extension

$$F^{\prime}=\begin{pmatrix}0&0&0\\ 0&F&0\\ 0&0&0\end{pmatrix}.$$

It is straightforward to see that $\left|B_{w}\vec{v}_{0}^{\prime}\right|=\left|M_{w}v_{0}\right|$ as well as $\left|F^{\prime}B_{w}\vec{v}_{0}^{\prime}\right|=\left|FM_{w}v_{0}\right|$.

For the next step, we introduce a $(k+2)\times(k+2)$ matrix $\mathbb{E}$ , whose each element is $1$ . It is then clear that $\mathbb{E}^{n}=(k+2)^{n-1}\mathbb{E}$ and $B_{i}\mathbb{E}=\mathbb{E}B_{i}=\mathbf{0}$ . Now we define

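$$C_{i}=B_{i}+m\mathbb{E},$$

where $m\in\mathbb{Z}$ is selected large enough to ensure the nonnegativity of the matrix entries of each $C_{i}$. It follows that

$$C_{w}=B_{w}+m^{|w|}(k+2)^{|w|-1}\mathbb{E},$$

and

$$C_{w}\vec{x}^{\prime}=B_{w}\vec{x}^{\prime}+m^{|w|}(k+2)^{|w|-1}\mathbb{E}\vec{x}^{\prime}.$$

Similarly,

$$F^{\prime}C_{w}\vec{x}^{\prime}=F^{\prime}B_{w}\vec{x}^{\prime}+m^{|w|}(k+2)^{|w|-1}F^{\prime}\mathbb{E}\vec{x}^{\prime}.$$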
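The algebra of this step can be checked numerically. The sketch below builds the bordered matrices from two hypothetical integer affine matrices, confirms that all their row and column sums vanish, and verifies the product identity for $C_{w}$ on a few nonempty words. It rests on the expansion $(B_{1}+m\mathbb{E})(B_{2}+m\mathbb{E})=B_{1}B_{2}+m^{2}(k+2)\mathbb{E}$, where the cross terms vanish because $B_{i}\mathbb{E}=\mathbb{E}B_{i}=\mathbf{0}$. The helper names and the choice $m=2$ are assumptions of this example.

```python
def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][t] * B[t][j] for t in range(n)) for j in range(n)] for i in range(n)]

def embed(M):
    """Border a k x k matrix M so that every row and column of the result sums to zero."""
    k = len(M)
    c = [-sum(row) for row in M]                              # fixes the middle rows
    d = [-sum(M[i][j] for i in range(k)) for j in range(k)]   # fixes the middle columns
    e = -sum(d)                                               # fixes the last row
    return [[0] * (k + 2)] + [[c[i]] + list(M[i]) + [0] for i in range(k)] + [[e] + d + [0]]

def word_product(mats, word):
    """Return mats[w_n] ... mats[w_1] for a nonempty word."""
    result = mats[word[0]]
    for letter in word[1:]:
        result = mat_mul(mats[letter], result)   # later letters multiply on the left
    return result

# Hypothetical affine matrices (every column sums to 1), k = 2.
M = {"a": [[2, 0], [-1, 1]], "b": [[0, 1], [1, 0]]}
k = 2
B = {a: embed(Ma) for a, Ma in M.items()}
m = 2    # large enough here that every entry of B_i + m E is nonnegative
C = {a: [[B[a][i][j] + m for j in range(k + 2)] for i in range(k + 2)] for a in B}

def identity_holds(word):
    """Check C_w == B_w + m^|w| (k+2)^(|w|-1) E entry by entry."""
    n = len(word)
    scale = m ** n * (k + 2) ** (n - 1)
    Bw, Cw = word_product(B, word), word_product(C, word)
    return all(Cw[i][j] == Bw[i][j] + scale for i in range(k + 2) for j in range(k + 2))

print(identity_holds("ab"), identity_holds("aab"))   # True True
```

The middle $k\times k$ block of $B_{w}$ equals $M_{w}$, which is what makes $F^{\prime}B_{w}\vec{x}^{\prime}$ carry the same accepting coordinates as $FM_{w}\vec{x}$.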

Now

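$$\frac{\left|FM_{w}\vec{v}_{0}\right|}{\left|M_{w}\vec{v}_{0}\right|}=\frac{\left|F^{\prime}B_{w}\vec{v}_{0}^{\prime}\right|}{\left|B_{w}\vec{v}_{0}^{\prime}\right|}=\frac{\left|F^{\prime}C_{w}\vec{v}_{0}^{\prime}-m^{|w|}(k+2)^{|w|-1}F^{\prime}\mathbb{E}\vec{x}^{\prime}\right|}{\left|C_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}\mathbb{E}\vec{x}^{\prime}\right|},$$

which can further be modified by clearing the denominators: for an integer $g$ large enough, all matrices $D_{i}=gC_{i}$ will be integer matrices and the former equation becomes

$$\frac{\left|FM_{w}\vec{x}\right|}{\left|M_{w}\vec{x}\right|}=\frac{\left|F^{\prime}B_{w}\vec{x}^{\prime}\right|}{\left|B_{w}\vec{x}^{\prime}\right|}=\frac{\left|F^{\prime}D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}F^{\prime}\mathbb{E}\vec{x}^{\prime}\right|}{\left|D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}\mathbb{E}\vec{x}^{\prime}\right|}.$$

Hence the inequality

$$\frac{\left|FM_{w}\vec{x}\right|}{\left|M_{w}\vec{x}\right|}\geq\frac{1}{2}$$

is equivalent to

$$2\left|F^{\prime}D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}F^{\prime}\mathbb{E}\vec{x}^{\prime}\right|\geq\left|D_{w}\vec{x}^{\prime}-m^{|w|}(k+2)^{|w|-1}g^{|w|+1}\mathbb{E}\vec{x}^{\prime}\right|. \tag{8}$$

In order to verify inequality (8) in logarithmic space, it is sufficient to demonstrate that the residue representations of both sides can be obtained in logarithmic space.

To that end, the residue representation of the vector $\vec{a}=F^{\prime}D_{w}\vec{x}^{\prime}\in\mathbb{R}^{k+2}$ can be obtained in logarithmic space as in the proof of Theorem 3.2.

Trivially, the residue representation of $\vec{b}=m^{\left|w\right|}(k+2)^{\left|w\right|-1}g^{\left|w\right|+1}F^{\prime}{\mathbb{E}}\vec{x}^{\prime}\in\mathbb{R}^{k+2}$ can be found in logarithmic space as well. In order to compute the residue representation of

$$\left|\vec{a}-\vec{b}\right|=\left|a_{1}-b_{1}\right|+\cdots+\left|a_{k}-b_{k}\right|,$$

it is sufficient to decide, for each $i$, whether $\vec{a}_{i}\geq\vec{b}_{i}$ holds. As the residue representations of each $\vec{a}_{i}$ and $\vec{b}_{i}$ are known, all the decisions can be made in logarithmic space, according to Lemma 3. The same conclusion can be made for the right-hand side of (8). ∎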

4 A Non-affine Language

As we saw in the previous section, $\mathsf{AfL}_{\mathbb{Q}}\subseteq\mathsf{L}$, and hence languages beyond $\mathsf{L}$ are good candidates for non-affine languages. (It is known that $\mathsf{L}\subsetneq\mathsf{PSPACE}$, so it is plausible that $\mathsf{PSPACE}$-complete languages are not in $\mathsf{AfL}_{\mathbb{Q}}$.) In this section, we will however demonstrate that the border of non-affinity may lie considerably lower: there are languages in $\mathsf{L}$ which are not affine.

In an earlier work [8], we applied the method of Turakainen [20] to show that there are languages in $\mathsf{L}$ which however are not contained in $\mathsf{BAfL}$ . Here we will extend the previous result to show that those languages are not contained even in $\mathsf{AfL}_{\mathbb{A}}$ . (We leave open whether a similar technique can be applied for $\mathsf{AfL}$ .)

Definition 10 (Lower density)

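Let $L\subseteq a^{*}$ be a unary language. The lower density of $L$ is the limit inferior

$$\underline{\operatorname{dens}}(L)=\liminf_{n\to\infty}\frac{\left|\{a^{k}\in L\mid k\leq n\}\right|}{n+1}.$$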
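A finite computation cannot produce a limit inferior, but the ratios inside it are easy to tabulate. The sketch below evaluates the ratio for a given $n$, with a unary language represented by a membership predicate on the exponent $k$; the two example languages are hypothetical illustrations.

```python
from fractions import Fraction
from math import isqrt

def prefix_density(in_language, n):
    """Return |{a^k in L : k <= n}| / (n + 1), the ratio inside the lim inf."""
    members = sum(1 for k in range(n + 1) if in_language(k))
    return Fraction(members, n + 1)

def is_square(k):
    return isqrt(k) ** 2 == k

def is_even(k):
    return k % 2 == 0

print(prefix_density(is_square, 99))     # 1/10
print(prefix_density(is_square, 9999))   # 1/100
print(prefix_density(is_even, 9))        # 1/2
```

For the squares the ratio is roughly $1/\sqrt{n}$, which tends to $0$, while for even lengths it stays near $1/2$.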

Definition 11 (Uniformly distributed sequence)

Let $(\textbf{x}_{n})$ be a sequence of vectors in $\mathbb{R}^{k}$ and $I=[a_{1},b_{1})\times\dots\times[a_{k},b_{k})$ be an interval in $\mathbb{R}^{k}$. We define $C(I,n)$ as $C(I,n)=\left|\{\textbf{x}_{i}\bmod 1\in I\mid 1\leq i\leq n\}\right|$.

We say that $(\textbf{x}_{n})$ is uniformly distributed mod 1 if and only if for any $I$ of such type,

$$\lim_{n\to\infty}\frac{C(I,n)}{n}=(b_{1}-a_{1})\cdots(b_{k}-a_{k}).$$
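For intuition, the counting function can be evaluated numerically in the one-dimensional case $k=1$. The sketch below compares the sequence $(n\sqrt{2})$, which is uniformly distributed mod 1 by the classical equidistribution theorem for irrational multiples, with the sequence $(n/2)$, which is not. It uses floating point and a finite $n$, so it illustrates the definition rather than proving anything.

```python
from math import sqrt

def count_in_interval(seq, n, a, b):
    """C(I, n) for I = [a, b): the number of i in 1..n with seq(i) mod 1 in I."""
    return sum(1 for i in range(1, n + 1) if a <= seq(i) % 1.0 < b)

n = 10_000
irrational_ratio = count_in_interval(lambda i: i * sqrt(2), n, 0.2, 0.5) / n
rational_ratio = count_in_interval(lambda i: i * 0.5, n, 0.2, 0.5) / n

print(irrational_ratio)   # close to the interval length 0.3
print(rational_ratio)     # 0.0, since i/2 mod 1 only takes the values 0 and 0.5
```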

Theorem 4.1

If $L\subseteq a^{*}$ satisfies the following conditions:

1. $\underline{\operatorname{dens}}(L)=0$.

2. For all $N\in\mathbb{N}$, there exist $r\in\mathbb{N}$ and an ascending sequence $(m_{i})$ in $\mathbb{N}$ such that $\{a^{r+m_{i}N}\mid i\in\mathbb{N}\}\subseteq L$ and, for any irrational number $\alpha$, the sequence $\left((r+m_{i}N)\alpha\right)$ is uniformly distributed mod 1.

Then $L$ is not in $\mathsf{AfL}_{\mathbb{A}}$.

Proof

Assume for contradiction that $L\in\mathsf{AfL}_{\mathbb{A}}$. Then there exists an AfA $A$ with $s$ states, matrix $M$ and initial vector $\vec{v}$ such that the acceptance value of $A$ is

$$f_{A}(a^{n})=\frac{\left|PM^{n}\vec{v}\right|}{\left|M^{n}\vec{v}\right|}. \tag{9}$$

Without loss of generality, we can assume that the cutpoint equals $\frac{1}{2}$, and hence $w\in L\Leftrightarrow f_{A}(w)>\frac{1}{2}.$

Using the Jordan decomposition $M=PJP^{-1}$ , one has $M^{n}=PJ^{n}P^{-1}$ . So the coordinates of $M^{n}\vec{v}$ have the form

$$(M^{n}\vec{v})_{j}=\sum_{k=1}^{s}p_{jk}(n)\lambda_{k}^{n}, \tag{10}$$

where $\lambda_{k}$ are the eigenvalues of $M$ and $p_{jk}$ are polynomials of degree less than the degree of the corresponding eigenvalue. For short, we denote $F(n)=f_{A}(a^{n})$ , and let $\lambda_{k}=\left|\lambda_{k}\right|e^{2i\pi\theta_{k}}$ .

When studying expression (9), we can assume without loss of generality, that all numbers $\theta_{k}$ are irrational. In fact, replacing matrix $M$ with $\alpha M$ , where $\alpha\neq 0$ does not change (9), since

$$\frac{\left|P(\alpha M)^{n}\vec{v}\right|}{\left|(\alpha M)^{n}\vec{v}\right|}=\frac{\left|\alpha^{n}PM^{n}\vec{v}\right|}{\left|\alpha^{n}M^{n}\vec{v}\right|}=\frac{\left|PM^{n}\vec{v}\right|}{\left|M^{n}\vec{v}\right|}.$$

Selecting now $\alpha=e^{2\pi i\theta}$ (where $\theta\in\mathbb{R}$) implies that the eigenvalues of $\alpha M$ are $\left|\lambda_{k}\right|e^{2i\pi(\theta_{k}+\theta)}$. The field extension $\mathbb{Q}(\theta_{1},\ldots,\theta_{s})$ is finite, and hence there is always an irrational number $\theta\notin\mathbb{Q}(\theta_{1},\ldots,\theta_{s})$. It follows directly that all numbers $\theta_{k}+\theta$ are irrational. Hence we can assume that all the numbers $\theta_{k}$ are irrational in the first place. (Note that the new matrix obtained may not be affine, so it would be wrong to assume that every AfA admits an equivalent one with only irrational eigenvalues. However, this does not affect the proof: we do not require the new matrix to be affine, we only study the values that the fraction $\frac{\left|P(\alpha M)^{n}\vec{v}\right|}{\left|(\alpha M)^{n}\vec{v}\right|}=\frac{\left|PM^{n}\vec{v}\right|}{\left|M^{n}\vec{v}\right|}$ takes.)
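The invariance under scaling is easy to confirm numerically. In the sketch below the $L^{1}$ norm of a complex vector is taken to be the sum of the moduli of its entries, which is the natural reading once $\alpha$ is allowed to be complex; the matrix, vector and value of $\theta$ are hypothetical.

```python
import cmath

def acceptance_ratio(M, v, accepting, n, alpha=1):
    """Return |P (alpha M)^n v| / |(alpha M)^n v| with |.| the sum of moduli."""
    k = len(v)
    state = [complex(c) for c in v]
    for _ in range(n):
        state = [alpha * sum(M[i][j] * state[j] for j in range(k)) for i in range(k)]
    accepted = sum(abs(c) for c, delta in zip(state, accepting) if delta)
    return accepted / sum(abs(c) for c in state)

M = [[2, 0], [-1, 1]]                      # affine: every column sums to 1
v = [1, 0]
accepting = [1, 0]
alpha = cmath.exp(2j * cmath.pi * 0.123)   # a point on the unit circle

plain = acceptance_ratio(M, v, accepting, 5)
rotated = acceptance_ratio(M, v, accepting, 5, alpha)
print(plain, rotated)                      # both approximately 32/63
```

The matrix $\alpha M$ is no longer affine, yet the ratio is unchanged, which is all the proof uses.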

By restricting to an arithmetic progression $n=r+mN$ ($m\in\mathbb{N}$) we can also assume that no $\lambda_{i}/\lambda_{j}$ is a root of unity for $i\neq j$. In fact, selecting $N=\operatorname{lcm}\{\operatorname{ord}(\lambda_{i}/\lambda_{j})\mid i\neq j\text{ and }\lambda_{i}/\lambda_{j}\text{ is a root of unity}\}$, expression (10) becomes

$$(M^{r+mN}\vec{v})_{j}=\sum_{k=1}^{s}p_{jk}(r+mN)\lambda_{k}^{r}(\lambda_{k})^{Nm}=\sum_{k=1}^{s^{\prime}}q_{jk}(m)\mu_{k}^{m},$$

where $\{\mu_{1},\ldots,\mu_{s^{\prime}}\}$ are the distinct elements of the set $\{\lambda_{1}^{N},\ldots,\lambda_{s}^{N}\}$. Now for $i\neq j$, $\mu_{i}/\mu_{j}$ cannot be a root of unity, since $(\mu_{i}/\mu_{j})^{t}=1$ would imply $(\lambda_{i^{\prime}}/\lambda_{j^{\prime}})^{Nt}=1$, which in turn implies $(\lambda_{i^{\prime}}/\lambda_{j^{\prime}})^{N}=1$ and hence $\mu_{i}=\lambda_{i^{\prime}}^{N}=\lambda_{j^{\prime}}^{N}=\mu_{j}$, which contradicts the assumption $\mu_{i}\neq\mu_{j}$.

We can now write the acceptance condition $f_{A}(a^{n})>\frac{1}{2}$ equivalently as

[TABLE]

Where $E$ is the set of states of $A$ , $E_{a}\subseteq E$ its set of accepting states, and $\overline{E_{a}}$ the complement of $E_{a}$ . According to (10), $g(n):=\sum_{j\in E_{a}}\left|(M^{n}\vec{v})_{j}\right|-\sum_{j\in\overline{E_{a}}}\left|(M^{n}\vec{v})_{j}\right|$ consists of combinations of absolute values of linear combination of functions of type $n^{d}\lambda^{n}$ .

We say that $n^{d_{1}}\lambda_{1}^{n}$ is of larger order than $n^{d_{2}}\lambda_{2}^{n}$ if $\left|\lambda_{1}\right|>\left|\lambda_{2}\right|$, or if $\left|\lambda_{1}\right|=\left|\lambda_{2}\right|$ and $d_{1}>d_{2}$. If $\left|\lambda_{1}\right|=\left|\lambda_{2}\right|$, we say that $n^{d}\lambda_{1}^{n}$ and $n^{d}\lambda_{2}^{n}$ are of the same order. It is clear that if a term $t_{1}(n)$ is of larger order than $t_{2}(n)$, then $\displaystyle\lim_{n\to\infty}\frac{t_{2}(n)}{t_{1}(n)}=0$.
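The ordering of terms can be made concrete with a small Python sketch. It is only an illustration under stated assumptions: a term $n^{d}\lambda^{n}$ is encoded as a pair `(d, lam)`, the helper names `order_key`, `is_larger_order` and `ratio` are hypothetical, and comparing moduli of floating-point numbers for equality is only reliable for simple inputs such as the ones used here.

```python
def order_key(d, lam):
    """Sort key of the term n**d * lam**n: modulus first, then polynomial degree."""
    return (abs(lam), d)

def is_larger_order(t1, t2):
    """True if term t1 = (d1, lam1) is of larger order than t2 = (d2, lam2)."""
    return order_key(*t1) > order_key(*t2)

def ratio(t_small, t_big, n):
    """|t_small(n) / t_big(n)|, computed as n**(d2 - d1) * (|lam2| / |lam1|)**n."""
    (d2, lam2), (d1, lam1) = t_small, t_big
    return float(n) ** (d2 - d1) * (abs(lam2) / abs(lam1)) ** n

# n**0 * 2**n dominates n**5 * 1.5**n even though the latter has higher degree.
big, small = (0, 2.0), (5, 1.5)
values = [ratio(small, big, n) for n in (10, 100, 200, 400)]
```

The list `values` decreases towards zero once $n$ is large enough, in line with the limit stated above; for small $n$ the polynomial factor can still dominate.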

We can organize the terms in expression (10) as

[TABLE]

where each $\Lambda^{(m)}_{j}(n)$ consists of terms with equal order multiplier:

[TABLE]

(for notational simplicity, we mostly omit the dependency on $j$ in the right-hand side of (13)). Here $\lambda_{m}\in\mathbb{R}_{+}$ is the common absolute value of all eigenvalues $\lambda_{mk}=\lambda_{m}e^{2\pi i\theta_{mk}}$, and expression (12) is organized in descending order: $\Lambda^{(N)}_{j}$ is the sum of terms of the highest order multiplier, $\Lambda^{(N-1)}_{j}$ contains the terms of the second highest order multiplier, etc. We say that $\Lambda^{(k_{2})}_{j}$ is lower than $\Lambda^{(k_{1})}_{j}$ if $k_{2}<k_{1}$.

We will then fix a representation

[TABLE]

where $A_{j}(n)+B_{j}(n)+C_{j}(n)$ is a grouping of all $\Lambda$ -terms in (12) defined as follows:

1. $\displaystyle A_{j}(n)=\sum_{k=0}^{m}\Lambda_{j}^{(N-k)}(n)$, where $m\in[-1,N]\cap\mathbb{Z}$ is chosen as the maximal number so that

[TABLE]

is a constant function $\mathbb{N}\to\mathbb{R}$. Such an $m$ exists, since for $m=-1$ the sum is regarded as empty and $A_{j}(n)=0$, whereas for $m=N$ all $\Lambda$-terms are included, and then (15) becomes $f_{A}(a^{n})$, which is not constant (otherwise condition 1 or 2 of the theorem would be false).

2. $B_{j}(n)$ consists of a single $\Lambda$-term immediately lower than those in $A_{j}(n)$, and

3. $C_{j}(n)$ contains the rest of the $\Lambda$-terms, lower than $B_{j}(n)$.

Lemma 4

If $A\neq 0$, then for all $z\in\mathbb{C}$, $\left|A+z\right|=\left|A\right|+\operatorname{Re}\left(\dfrac{\left|A\right|}{A}z\right)+O\left(\dfrac{\left|z\right|^{2}}{\left|A\right|}\right).$

Proof

Denote $z=x+iy$ . Because $\left|\operatorname{Re}z\right|\leq\left|z\right|$ , we have

[TABLE]

Now

[TABLE]

∎
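Lemma 4 can be sanity-checked numerically. The Python sketch below is not from the paper; the names `first_order_abs`, `remainder` and `worst_ratio` are hypothetical. It compares $\left|A+z\right|$ with the first-order expression $\left|A\right|+\operatorname{Re}(\left|A\right|z/A)$ on random inputs, under the additional assumption $\left|z\right|\leq\left|A\right|/2$, for which a short calculation suggests that the remainder lies between $0$ and $\left|z\right|^{2}/\left|A\right|$.

```python
import cmath
import math
import random

def first_order_abs(A, z):
    """First-order approximation |A| + Re(|A| / A * z) of |A + z|."""
    return abs(A) + (abs(A) / A * z).real

def remainder(A, z):
    """Difference between |A + z| and its first-order approximation."""
    return abs(A + z) - first_order_abs(A, z)

def worst_ratio(samples=1000, seed=0):
    """Largest observed remainder / (|z|**2 / |A|) over random pairs
    with 0.01 * |A| <= |z| <= 0.5 * |A| (a restriction chosen for this sketch)."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(samples):
        A = complex(rng.uniform(-10, 10), rng.uniform(-10, 10))
        if abs(A) < 1e-3:
            continue  # skip values too close to the excluded case A = 0
        z = cmath.rect(rng.uniform(0.01, 0.5) * abs(A),
                       rng.uniform(0.0, 2.0 * math.pi))
        worst = max(worst, remainder(A, z) * abs(A) / abs(z) ** 2)
    return worst
```

A value of `worst_ratio()` that stays bounded as the sample size grows is consistent with the $O(\left|z\right|^{2}/\left|A\right|)$ remainder claimed in the lemma; it is of course not a proof.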

We choose $\lambda\in\mathbb{R}_{+}$ and $d$ so that the highest $\Lambda$-term in $B(n)$ is of order $n^{d}\lambda^{n}$ and define $A^{\prime}_{j}(n)=n^{-d}\lambda^{-n}A_{j}(n)$, $B^{\prime}_{j}(n)=n^{-d}\lambda^{-n}B_{j}(n)$, $g^{\prime}(n)=g(n)n^{-d}\lambda^{-n}$. Then clearly $g^{\prime}(n)>0$ if and only if $g(n)>0$, and each $B^{\prime}_{j}(n)$ remains bounded as $n\to\infty$. To simplify the notation, we omit the primes and reuse the same symbols, obtaining a new version of $g(n)$ of (14) in which the $A_{j}$-terms may tend to infinity but the $B_{j}$-terms remain bounded.

Recall that we may assume (by restricting to an arithmetic progression) that no $\lambda_{i}/\lambda_{j}$ is a root of unity. By the Skolem-Mahler-Lech theorem [7], this implies that the functions $A_{j}$ can have only a finite number of zeros, and in what follows we assume that $n$ is chosen so large that no function $A_{j}$ becomes zero. Furthermore, by the main theorem of [6], $\left|A_{j}(n)\right|=\Omega(n^{d}\lambda^{n-\epsilon})$ for each $\epsilon>0$. (This is the only point where we need the assumption that the matrix entries are algebraic.) As each $B_{j}$ remains bounded, we find that $B_{j}^{2}/A_{j}$ tends to zero as $n\to\infty$, and hence by Lemma 4, defining

[TABLE]

we have a function $g_{1}(n)$ with the property $g_{1}(n)-g(n)\to 0$ as $n\to\infty$ (the $C$-terms are lower than the $B$-terms, so they can be dropped without violating this property). Also by the construction it is clear that $h(n)=C\cdot n^{d}\lambda^{n}$, where $C$ is a constant, and by the conditions of the theorem, this is possible only if $C=0$.

Notice that $g_{1}(n)$ is not a constant function by construction. Also, each $B_{j}$ is a linear combination of functions of the form $e^{2\pi i\theta_{k}n}$, each $\theta_{k}$ can be assumed irrational, and $\left|\frac{\left|A_{j}(n)\right|}{A_{j}(n)}\right|=1$, so we can conclude that $g_{1}(n)$ is a continuous function formed of terms of the form $ce^{2\pi i\theta_{k}n}$ and of ratios $\left|A_{j}\right|/A_{j}$. In these terms, however, the behaviour is asymptotically determined by the highest $\Lambda$-terms, so the conclusion remains valid even if we drop the lower terms.

By assumption, for all $k$, the sequence $(r+mN)\theta_{k}$ is uniformly distributed modulo 1. It follows that the values $e^{2i\pi(r+mN)\theta_{k}}$ are dense in the unit circle. If $g_{1}(r+mN)<0$ for some $m$, then $g_{1}(r+mN)\leq-\varepsilon$ for some $\varepsilon>0$. Then, because of the density argument, there are arbitrarily large values of $i$ for which $g_{1}(r+m_{i}N)\leq 0$, contradicting condition 2 of the statement. Hence $g_{1}(r+mN)\geq 0$ for each $m$ large enough. As $g_{1}$ is not a constant, there must be some $m_{0}$ so that $g_{1}(m_{0})\geq\varepsilon>0$.
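The uniform distribution used here can be observed numerically. The following Python sketch is only an illustration: it fixes the irrational number $\theta=\sqrt{2}$ and arbitrary values of $r$ and $N$ (all assumptions made for the example, not taken from the paper) and measures how often the fractional part of $(r+mN)\theta$ falls into a given interval. The function name `hit_fraction` is hypothetical.

```python
from math import sqrt

def hit_fraction(theta, r, N, interval, terms):
    """Fraction of m in [0, terms) whose value (r + m*N) * theta, reduced
    modulo 1, lies in the half-open interval [lo, hi)."""
    lo, hi = interval
    hits = sum(1 for m in range(terms)
               if lo <= ((r + m * N) * theta) % 1.0 < hi)
    return hits / terms

# Example parameters (assumed for illustration only).
frac = hit_fraction(sqrt(2), r=3, N=12, interval=(0.2, 0.5), terms=100_000)
```

For a uniformly distributed sequence, `frac` should approach the length of the interval (here 0.3) as `terms` grows, which is the property that the proof exploits to obtain a positive lower density.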

Next, let $R(x_{1},\ldots,x_{s})$ be the function obtained from $g_{1}$ by replacing each occurrence of $e^{2\pi i\theta_{k}n}$ by a variable $x_{k}$, so that each $x_{k}$ takes its values on the unit circle. Moreover, by the assumptions of the theorem, the values of $x_{k}$ will be uniformly distributed on the unit circle.

Note that $g_{1}(r+m_{i}N)=R((e^{2i\pi(r+m_{i}N)\theta_{k}})_{k\in A})$. Then, because the sequences $((r+m_{i}N)\theta_{k})_{i}$ are uniformly distributed modulo 1, any value attained by the function $R((e^{2i\pi y_{k}})_{k\in A})$ can be approximated by some $g_{1}(r+m_{i}N)$ with arbitrary precision. The function $R$ is continuous, therefore there exists an interval $I=((x_{k},y_{k}))_{k\in A}$ on which $R((x_{k}))>\frac{\varepsilon}{2}$. So, if $m_{i}$ is large enough and satisfies

[TABLE]

then $g_{1}(r+m_{i}N)>\frac{\varepsilon}{2}$, which implies $f_{A}(a^{r+m_{i}N})>\frac{1}{2}$ and hence $a^{r+m_{i}N}\in L$. It remains to prove that the sequence $(r+m_{i}N)$ is "dense enough" to have $\underline{dens}(L)>0$, contradicting again condition 1.

Then, because of uniform distribution imposed by condition 2, one has

[TABLE]

And so for $i$ large enough, $\frac{C(I,r+m_{i}N)}{r+m_{i}N}\geq\frac{d}{2}$, with $a^{r+m_{i}N}\in L$, implying $\underline{dens}(L)>0$, a contradiction. ∎

Corollary 1

Let $P$ be any polynomial with nonnegative coefficients and $\deg(P)>2$. The language $\{a^{P(n)}\mid n\in\mathbb{N}\}$ is not in $\mathsf{AfL}_{\mathbb{A}}$.

Corollary 2

The language $\{a^{p}\mid p\text{ prime}\}$ is not in $\mathsf{AfL}_{\mathbb{A}}$.

Proof (of Corollary 1 and Corollary 2)

Turakainen proved [20] that these two languages satisfy the two conditions of Theorem 4.1. Therefore, these two languages are not in $\mathsf{AfL}_{\mathbb{A}}$. ∎

Acknowledgments

Yakaryılmaz was partially supported by Akadēmiskā personāla atjaunotne un kompetenču pilnveide Latvijas Universitātē līg Nr. 8.2.2.0/18/A/010 LU reģistrācijas Nr. ESS2018/289 and ERC Advanced Grant MQC. Hirvensalo was partially supported by the Väisälä Foundation and Moutot by ANR project CoCoGro (ANR-16-CE40-0005).

Bibliography

[1] Andris Ambainis and John Watrous. Two-way finite automata with quantum and classical states. Theoretical Computer Science, 287(1):299–311, Sep. 2002.
[2] A. Ambainis, M. Beaudry, M. Golovkins, A. Ķikusts, M. Mercer, and D. Thérien. Algebraic results on quantum automata. Theory of Computing Systems, 39(1):165–188, 2006.
[3] Andris Ambainis and Abuzer Yakaryılmaz. Automata and Quantum Computing. CoRR, abs/1507.0:1–32, 2015.
[4] Aleksandrs Belovs, Juan Andrés Montoya, and Abuzer Yakaryılmaz. Can one quantum bit separate any pair of words with zero-error? Tech. Rep. 1602.07967, arXiv, 2016.
[5] Alejandro Díaz-Caro and Abuzer Yakaryılmaz. Affine computation and affine automaton. In Computer Science - Theory and Applications - 11th International Computer Science Symposium in Russia, CSR 2016, St. Petersburg, Russia, June 9-13, 2016, Proceedings, pages 146–160, 2016.
[6] J.-H. Evertse. On sums of S-units and linear recurrences. Compositio Math., 53(2):225–244, 1984.
[7] Georges Hansel. A simple proof of the Skolem-Mahler-Lech theorem. Theoretical Computer Science, 43(1):91–98, 1986.
[8] Mika Hirvensalo, Etienne Moutot, and Abuzer Yakaryılmaz. On the computational power of affine automata. Lecture Notes in Computer Science 10168 (Proceedings of LATA 2017), pp. 405–417, 2017.