Computational Limitations of Affine Automata
Mika Hirvensalo, Etienne Moutot, Abuzer Yakaryılmaz

TL;DR
This paper investigates the computational limits of affine automata, demonstrating their simulation in logarithmic space for certain cases and establishing impossibility results for algebraic-valued affine automata, thus delineating their recognition capabilities.
Contribution
It provides new theoretical bounds on affine automata, including their simulation complexity and limitations in recognizing specific unary languages.
Findings
Bounded-error rational-valued affine automata are simulated in logarithmic space.
Algebraic-valued affine automata cannot recognize certain unary languages.
Identifies limitations of affine automata with respect to recognition power.
Abstract
We present two new results on the computational limitations of affine automata. First, we show that the computation of bounded-error rational-valued affine automata can be simulated in logarithmic space. Second, we give an impossibility result for algebraic-valued affine automata. As a result, we identify some unary languages (in logarithmic space) that are not recognized by algebraic-valued affine automata with cutpoints.
Mika Hirvensalo (1), Etienne Moutot (1, 2; ORCID 0000-0003-2073-4709), Abuzer Yakaryılmaz (3; ORCID 0000-0002-2372-252X)
(1) Department of Mathematics and Statistics, University of Turku, FI-20014 Turku, Finland
(2) LIP, ENS de Lyon – CNRS – UCBL – Université de Lyon, École Normale Supérieure de Lyon, Lyon, France
(3) Center for Quantum Computer Science, Faculty of Computing, University of Latvia, Rīga, Latvia
1 Introduction
Finite automata are an interesting model to study since they express the very natural limitation of finite memory. They are also good computational models, since they are simpler than many other machines such as pushdown automata or Turing machines. Due to this simplicity, there exist many different models of finite automata, each expressing a different computational setting. Deterministic [16], probabilistic [14] and quantum [3] finite automata (DFAs, PFAs, and QFAs, respectively) have been studied to better understand the computational limitations inherent to each of these settings.
Recently, Díaz-Caro and Yakaryılmaz introduced a new model, called affine computation [5]. As a non-physical model, the goal of affine computation is to investigate the power of interference caused by negative amplitudes in the computation, like in the quantum case. But unlike QFAs, affine finite automata (AfAs) have unbounded state set and the final operation corresponding to quantum measurement cannot be interpreted as linear. The final operation in AfAs is analogous to renormalization in Kondacs-Watrous [11] or Latvian [2] quantum automata models.
AfAs and certain generalizations of them have been investigated in a series of works [5, 21, 9, 8]. In most cases, affine models (e.g., bounded-error and unbounded-error AfAs, zero-error affine OBDDs, zero-error affine counter automata, etc.) have been shown to be more powerful than their classical or quantum counterparts. On the other hand, we still know little about the computational limitations of AfAs. In this direction, we present two new results. First, we show that the computation of bounded-error rational-valued affine automata can be simulated in logarithmic space, which answers positively one of the open problems in [5]. Second, we give an impossibility result for algebraic-valued AfAs, and, as a result, we identify some unary languages (in logarithmic space) that are not recognized by algebraic-valued AfAs with cutpoints.
2 Preliminaries
For a given word $w$, $w_i$ represents its $i$-th letter. For any given class $\mathsf{C}$, $\mathsf{C}_{\mathbb{Q}}$ and $\mathsf{C}_{\mathbb{A}}$ denote the classes defined by the machines restricted to have rational-valued and algebraic-valued components, respectively. The logarithmic and polynomial space classes are denoted as $\mathsf{L}$ and $\mathsf{PSPACE}$, respectively. We assume that the reader is familiar with the basics of automata theory.
2.1 Models
By a probability distribution (also known as a stochastic vector) we understand a (column) vector with nonnegative entries summing up to one, and a stochastic matrix (also known as a Markov matrix) here stands for a square matrix all of whose columns are probability distributions.
Definition 1 (PFA)
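A $k$-state probabilistic finite automaton (PFA) $P$ over alphabet $\Sigma$ is a triplet
$$P = (\vec{x}, \{M_i \mid i \in \Sigma\}, \vec{y}),$$
where $\vec{x} \in \mathbb{R}^k$ is a stochastic vector called the initial distribution, each $M_i \in \mathbb{R}^{k \times k}$ is a stochastic matrix, and $\vec{y} \in \{0,1\}^k$ is the final vector (each 1 in $\vec{y}$ represents an accepting state).
For any input word $w = w_1 \cdots w_n \in \Sigma^*$ with length $n$, $P$ has a probability distribution of states as follows: $\vec{v}_f = M_w \vec{x} = M_{w_n} \cdots M_{w_1} \vec{x}$. The accepting probability corresponds to the probability of $P$ being in an accepting state after reading $w$, which is given by
$$f_P(w) = \vec{y}^T \vec{v}_f. \tag{1}$$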
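As a minimal sketch of how the accepting probability of a PFA is evaluated, the following Python snippet multiplies the transition matrices onto the initial distribution letter by letter and then takes the inner product with the final vector. The function name `pfa_accept_prob` and the 2-state unary automaton are hypothetical and chosen only for illustration; exact rational arithmetic is used so that no rounding occurs.

```python
from fractions import Fraction as Fr

def pfa_accept_prob(x, mats, y, word):
    """Return y^T M_{w_n} ... M_{w_1} x, the acceptance probability of `word`."""
    v = list(x)
    for letter in word:  # M_{w_1} is applied first, M_{w_n} last
        M = mats[letter]
        v = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    return sum(yi * vi for yi, vi in zip(y, v))

# Hypothetical 2-state PFA over the unary alphabet {a}.
x = [Fr(1), Fr(0)]                      # start in state 1
mats = {"a": [[Fr(1, 2), Fr(0)],        # each column is a probability distribution
              [Fr(1, 2), Fr(1)]]}
y = [0, 1]                              # state 2 is the only accepting state

print(pfa_accept_prob(x, mats, y, "aaa"))   # 7/8
```

For this example the acceptance probability of $a^n$ is $1 - 2^{-n}$, so the whole computation is linear in the state vector.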
An affine finite automaton (AfA) is a generalization of a PFA allowing negative transition values. Only allowing negative values in the transition matrices does not add any power (generalized PFAs are equivalent to PFAs, see [19]), but affine automata also introduce a non-linear behaviour. The automaton acts like a generalized probabilistic automaton until the last operation, which is a non-linear operation called a weighting operation.
Definition 2
A vector $\vec{v} \in \mathbb{R}^k$ is an affine vector if and only if its coordinates sum up to $1$. A matrix $M$ is an affine matrix if and only if all its columns are affine vectors.
The following property is straightforward to verify, and it will ensure that affine automata are well defined.
Property 2.1
If $M$ and $N$ are affine matrices, then $MN$ is also an affine matrix. In particular, if $\vec{v}$ is an affine vector, then $M\vec{v}$ is also an affine vector.
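The closure property can be checked on a small instance. The sketch below assumes matrices are given as lists of rows with rational entries; the helper names `is_affine` and `mat_mul` and the two example matrices are illustrative and not taken from the paper.

```python
from fractions import Fraction as Fr

def is_affine(M):
    """True if every column of M (a list of rows) sums to 1."""
    return all(sum(row[j] for row in M) == 1 for j in range(len(M[0])))

def mat_mul(A, B):
    """Plain matrix product of two matrices given as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

# Two affine matrices with entries of both signs.
M = [[Fr(2), Fr(-1)], [Fr(-1), Fr(2)]]
N = [[Fr(3), Fr(1, 2)], [Fr(-2), Fr(1, 2)]]

print(is_affine(mat_mul(M, N)))                      # True
print(is_affine(mat_mul(M, [[Fr(3)], [Fr(-2)]])))    # True: affine vector as a column
```

The reason is that the all-ones row vector is a left fixed point of every affine matrix, so it is also a left fixed point of any product of affine matrices.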
Definition 3 (AfA)
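A $k$-state AfA $A$ over alphabet $\Sigma$ is a triplet
$$A = (\vec{x}, \{M_i \mid i \in \Sigma\}, F),$$
where $\vec{x}$ is an initial affine vector, each $M_i$ is an affine transition matrix, and $F = \operatorname{diag}(\delta_1, \ldots, \delta_k)$ is the final projection matrix, where each $\delta_i \in \{0, 1\}$ for $1 \leq i \leq k$.
The value computed by an affine automaton can most conveniently be defined via the following notion:
Definition 4
Notation $|\vec{v}| = \sum_i |v_i|$ stands for the usual $\ell_1$ norm.
Now, the final value of the affine automaton of Definition 3 is
$$f_A(w) = \frac{|F M_w \vec{x}|}{|M_w \vec{x}|}, \tag{2}$$
where $M_w = M_{w_n} \cdots M_{w_1}$ for $w = w_1 \cdots w_n$.
Clearly $f_A(w) \in [0, 1]$ for any input word $w \in \Sigma^*$.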
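To make the weighting operation concrete, here is a small sketch that evaluates the final value of an AfA as the ratio of two $\ell_1$ norms. It assumes the final value has the form $|F M_w \vec{x}| / |M_w \vec{x}|$ with $F$ a 0/1 diagonal projection; the function name `afa_value`, the encoding of $F$ as a set of accepting indices, and the example automaton are hypothetical.

```python
from fractions import Fraction as Fr

def l1(v):
    """l1 norm: sum of absolute values of the coordinates."""
    return sum(abs(c) for c in v)

def afa_value(x, mats, accepting, word):
    """Ratio |F M_w x| / |M_w x|, where F keeps the coordinates in `accepting`."""
    v = list(x)
    for letter in word:
        M = mats[letter]
        v = [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]
    projected = [c if i in accepting else 0 for i, c in enumerate(v)]
    # l1(v) >= 1 because the coordinates of an affine vector sum to 1.
    return Fr(l1(projected), l1(v))

# Hypothetical 2-state AfA over {a}; both columns of the matrix sum to 1.
x = [1, 0]
mats = {"a": [[2, 0],
              [-1, 1]]}

print(afa_value(x, mats, {0}, "aa"))   # 4/7
```

After reading $a^n$ the state vector of this example is $(2^n, 1 - 2^n)$, so the value is $2^n / (2^{n+1} - 1)$. The absolute values and the division are exactly the steps that a purely linear simulation cannot reproduce.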
Remark 1
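Notice that the final value for PFAs (1) is defined as the matrix product $\vec{v}_f \mapsto \vec{y}^T \cdot \vec{v}_f$, where $\vec{v}_f = M_w \vec{x}$ is the final state vector, which is a linear operation on $\vec{v}_f$. On the other hand, computing the final value from $\vec{v}_f$ as in (2) involves nonlinear operations such as the $\ell_1$-norm and normalization (division).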
2.2 Cutpoint languages
Given a function $f_A$ computed by an automaton $A$ (stochastic or affine), there are different ways of defining the language recognized by this automaton.
Definition 5 (Cutpoint languages)
A language $L \subseteq \Sigma^*$ is recognized by an automaton $A$ with cutpoint $\lambda$ if and only if
$$L = \{w \in \Sigma^* \mid f_A(w) > \lambda\}.$$
These languages are called cutpoint languages. In the case of probabilistic (resp., affine) automata, the set of cutpoint languages is called stochastic languages (resp., affine languages) and denoted by $\mathsf{SL}$ (resp., $\mathsf{AfL}$).
We remark that fixing the cutpoint in the interval $(0, 1)$ does not change the classes $\mathsf{SL}$ and $\mathsf{AfL}$ [14, 5].
Definition 6 (Exclusive cutpoint languages)
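A language $L \subseteq \Sigma^*$ is recognized by an automaton $A$ with exclusive cutpoint $\lambda$ if and only if
$$L = \{w \in \Sigma^* \mid f_A(w) \neq \lambda\}.$$
These languages are called exclusive cutpoint languages. In the case of probabilistic (resp., affine) automata, the set of exclusive cutpoint languages is called exclusive stochastic languages (resp., exclusive affine languages) and denoted by $\mathsf{SL}^{\neq}$ (resp., $\mathsf{AfL}^{\neq}$). The complement of $\mathsf{SL}^{\neq}$ (resp., $\mathsf{AfL}^{\neq}$) is $\mathsf{SL}^{=}$ (resp., $\mathsf{AfL}^{=}$).
Again, we remark that fixing the cutpoint in the interval $(0, 1)$ does not change the classes $\mathsf{SL}^{\neq}$, $\mathsf{SL}^{=}$, $\mathsf{AfL}^{\neq}$, and $\mathsf{AfL}^{=}$ [14, 13, 5].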
A stronger condition is to impose that accepted and rejected words are separated by a gap: the cutpoint is said to be isolated.
Definition 7 (Isolated cutpoint or bounded error)
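A language $L$ is recognized by an automaton $A$ with isolated cutpoint $\lambda$ if and only if there exists $\delta > 0$ such that $\forall w \in L$, $f_A(w) \geq \lambda + \delta$, and $\forall w \notin L$, $f_A(w) \leq \lambda - \delta$. The set of languages recognized with bounded error (or isolated cutpoint) affine automata is denoted by $\mathsf{BAfL}$.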
A classical result by Rabin [15] shows that isolated cutpoint stochastic languages are regular. Rabin’s proof essentially relies on two facts: 1) the function mapping the final vector into $[0, 1]$ is a contraction, and 2) the state vector set is bounded. By modifying Rabin’s proof, it is possible to show that many quantum variants of stochastic automata also obey the same principle [3]: the bounded-error property implies the regularity of the accepted languages. In fact, E. Jeandel generalized Rabin’s proof by demonstrating that the compactness of the state vector set together with the continuity of the final function are sufficient to guarantee the regularity of the accepted language if the cutpoint is isolated [10].
3 Logarithmic simulation
Macarie [12] proved that $\mathsf{SL}^{=}_{\mathbb{Q}} \subseteq \mathsf{L}$ and $\mathsf{SL}_{\mathbb{Q}} \subseteq \mathsf{L}$. That is, the computation of any rational-valued probabilistic automaton can be simulated by an algorithm using only logarithmic space. However, this logarithmic simulation cannot be directly generalized to rational-valued affine automata due to the non-linearity of their last operation. In order to understand why, we will first reproduce the proof.
Before that, let us introduce the most important space-saving technique:
Definition 8
Notation stands for the least nonnegative integer satisfying . If and , we define . Analogously, for any matrix , we define .
The problem of recovering from the residue representation is practically resolved by the following well-known theorem.
Theorem 3.1 (The Chinese Remainder Theorem)
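Let $n_1, \ldots, n_r$ be pairwise coprime integers, $a_1, \ldots, a_r$ be arbitrary integers, and $N = n_1 \cdots n_r$. Then there exists an integer $x$ such that
$$x \equiv a_1 \pmod{n_1}, \quad \ldots, \quad x \equiv a_r \pmod{n_r}, \tag{3}$$
and any two integers $x_1$ and $x_2$ satisfying (3) satisfy also $x_1 \equiv x_2 \pmod{N}$.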
Remark 2
The above remarks and the Chinese Remainder Theorem imply that the integer ring operations can be implemented using the residue representation, and that the integers can be uncovered from the residue representations provided that 1) consists of pairwise coprime integers and 2) the integers stay in interval of length , where .
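The remark can be illustrated by a short sketch of residue arithmetic. The moduli tuple `PRIMES` and the helper names below are illustrative choices, not notation from the paper; the reconstruction uses the standard constructive form of the Chinese Remainder Theorem and relies on `pow(q, -1, p)` for modular inverses (available from Python 3.8).

```python
from math import prod

PRIMES = (3, 5, 7, 11)   # pairwise coprime moduli; the prime 2 is left out

def to_residues(n, moduli=PRIMES):
    """Residue representation of the integer n."""
    return tuple(n % p for p in moduli)

def add(a, b, moduli=PRIMES):
    """Componentwise sum of two residue representations."""
    return tuple((s + t) % p for s, t, p in zip(a, b, moduli))

def mul(a, b, moduli=PRIMES):
    """Componentwise product of two residue representations."""
    return tuple((s * t) % p for s, t, p in zip(a, b, moduli))

def from_residues(res, moduli=PRIMES):
    """Recover the unique integer in [0, prod(moduli)) with the given residues."""
    N = prod(moduli)
    total = 0
    for r, p in zip(res, moduli):
        q = N // p
        total += r * q * pow(q, -1, p)   # pow(q, -1, p) is the inverse of q mod p
    return total % N

print(from_residues(mul(to_residues(23), to_residues(45))))   # 1035
```

The product $23 \cdot 45 = 1035$ is recovered correctly because it is smaller than $3 \cdot 5 \cdot 7 \cdot 11 = 1155$. Each component only ever stores a number smaller than its modulus, which is the source of the space savings.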
Remark 3
In order to ensure that the moduli are pairwise coprime, we select them from the set of prime numbers. For reasons that will become obvious later, we will however omit the first prime $2$.
Definition 9
is an -tuple consisting of first primes by excluding . For this selection, a consequence of the prime number theorem is that, asymptotically, .
Theorem 3.2 (Macarie [12])
$\mathsf{SL}^{=}_{\mathbb{Q}} \subseteq \mathsf{L}$.
Proof
For a given alphabet , let be a language in and be a -state rational-valued PFA over such that
[TABLE]
We remind that, for any input word , we have
[TABLE]
Since each , there exists a number providing that each matrix , and (4) can be rewritten as
[TABLE]
and the language can be characterized as
[TABLE]
Since the original matrices are stochastic, meaning that their entries are in , it follows that each matrix has integer entries in . Moreover, implies that for every input word . As now can be computed by multiplying integer matrices, the residue representation will serve as a space-saving technique.
We will fix later, but the description of the algorithm is as follows: For each entry of , we let , and compute
[TABLE]
As all the products are computed modulo , bits are needed to compute (6). Likewise, can be computed in space for each coordinate of . The comparison can hence be done in space.
Reusing the space, the comparison can be made sequentially for each coordinate of , and if any comparison gives a negative outcome, we can conclude that .
To conclude the proof, it remains to fix so that both and are smaller than . If no congruence test is negative, then the Chinese Remainder Theorem ensures that . Since , we need to select so that which is equivalent to This inequality is clearly satisfied with for large enough , and for each by choosing , where is a positive constant (depending on ).
As a final remark let us note that , the -th prime, can be generated in logarithmic space and the prime number theorem implies that bits are enough to present , since is a constant. ∎
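The structure of the algorithm in the proof can be sketched as follows: the integer quantity of interest is never formed explicitly, and instead the word is processed once per odd prime with all intermediate values reduced modulo that prime. The sketch assumes the transition matrices have already been scaled to integer matrices and that the target value is supplied by the caller; the names `odd_primes`, `value_mod_p`, and `equals_by_residues`, as well as the example automaton, are hypothetical. The number `r` of primes must be chosen large enough for the product of the primes to exceed both compared integers, as the proof requires.

```python
def odd_primes(r):
    """First r odd primes, found by trial division against earlier odd primes."""
    found, cand = [], 3
    while len(found) < r:
        if all(cand % q for q in found):
            found.append(cand)
        cand += 2
    return found

def value_mod_p(x, mats, y, word, p):
    """y^T M_{w_n} ... M_{w_1} x computed with every entry reduced modulo p."""
    v = [c % p for c in x]
    for letter in word:
        M = mats[letter]
        v = [sum(M[i][j] * v[j] for j in range(len(v))) % p for i in range(len(M))]
    return sum(yi * vi for yi, vi in zip(y, v)) % p

def equals_by_residues(x, mats, y, word, target, r):
    """Congruence test prime by prime; the work space can be reused between primes."""
    return all(value_mod_p(x, mats, y, word, p) == target % p
               for p in odd_primes(r))

# Integer-scaled version of a hypothetical unary PFA (common denominator 2).
x = [1, 0]
mats = {"a": [[1, 0],
              [1, 2]]}
y = [0, 1]

print(equals_by_residues(x, mats, y, "aaa", 7, 4))   # True
print(equals_by_residues(x, mats, y, "aaa", 8, 4))   # False
```

In this example the scaled value after reading $a^n$ is $2^n - 1$, so the first test succeeds for every prime while the second already fails modulo 3.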
To extend the above theorem to cover $\mathsf{SL}_{\mathbb{Q}}$ as well, auxiliary results are used.
Lemma 1 (Macarie [12])
If $N$ is an odd integer and $x, y \in [0, N-1]$ are also integers, then $x \geq y$ iff $x - y$ has the same parity as $(x - y \bmod N)$.
Proof
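As $x, y \in [0, N-1]$, it follows that
$$(x - y \bmod N) = \begin{cases} x - y & \text{if } x \geq y, \\ N + x - y & \text{if } x < y, \end{cases}$$
which shows that the parity changes in the latter case since $N$ is odd. ∎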
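The parity criterion can be checked exhaustively for a small odd modulus. The sketch below assumes the lemma in the form "for odd $N$ and $0 \leq x, y \leq N - 1$, we have $x \geq y$ exactly when $x - y$ and $(x - y) \bmod N$ have the same parity"; the function name `geq_by_parity` is hypothetical. Here the residue is computed directly, whereas in the logarithmic-space setting its parity would have to be obtained from the residue representation.

```python
def geq_by_parity(x, y, N):
    """Decide x >= y using parities only, for odd N and 0 <= x, y <= N - 1."""
    diff_mod = (x - y) % N               # least nonnegative residue of x - y
    parity_of_diff = (x % 2) ^ (y % 2)   # parity of the true difference x - y
    return diff_mod % 2 == parity_of_diff

N = 15
print(all(geq_by_parity(x, y, N) == (x >= y)
          for x in range(N) for y in range(N)))   # True
```

The check fails for even moduli, which is one reason for omitting the prime 2 from the list of moduli.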
The problem with using the above lemma is that, in modular computing, the numbers $x$ and $y$ are usually known only by their residue representations, and it is not straightforward to compute the parity from the modular representation in logarithmic space. Macarie solved this problem not only for parity but also for a more general modulus (not necessarily equal to $2$).
Lemma 2 (Claim modified from [12])
For any integer and modulus , there is a deterministic algorithm that given and as input, produces the output in space
As a corollary of the previous lemma, Macarie presented a conclusion which implies the logarithmic space simulation of rational stochastic automata.
Lemma 3 (Claim modified from [12])
Let and . Given the residue representations of integers , , the decisions , or can be made in space.
Proof
The equality test can be done as in the proof of Theorem 3.2, testing the congruence sequentially for each prime. Testing is possible by Lemmata 1 and 2: First compute , then compute the parities of , , using Lemma 2 with . ∎
The following theorem is a straightforward corollary from the above:
Theorem 3.3
$\mathsf{SL}_{\mathbb{Q}} \subseteq \mathsf{L}$.
When attempting to prove an analogous result for affine automata, there is at least one obstacle: computing the final value includes the absolute values, but the absolute value is not even a well-defined operation in modular arithmetic. For example, two integers can be congruent modulo a prime while their absolute values are not. This is actually another way to point out that, in finite fields, there is no order relation compatible with the algebraic structure.
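A tiny numerical instance shows the obstacle. The integers 2 and -3 and the modulus 5 are our own illustrative choice, not necessarily the example used in the paper.

```python
p = 5
a, b = 2, -3

same_class = (a - b) % p == 0                  # 2 and -3 are congruent modulo 5
abs_same_class = (abs(a) - abs(b)) % p == 0    # but 2 and 3 are not

print(same_class, abs_same_class)              # True False
```

So the residue of $|n|$ modulo $p$ is not a function of the residue of $n$ modulo $p$, and the absolute value cannot be computed inside the residue representation alone.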
Hence for affine automata with matrix entries of both signs, another approach must be adopted. One obvious approach is to present an integer $n$ as a pair $(|n|, \operatorname{sgn}(n))$, and apply modular arithmetic to $|n|$. The signum function and the absolute value indeed behave smoothly with respect to the product, but not with respect to the sum, which is a major problem with this approach: deciding the sign of a sum requires a comparison of the absolute values, which seems impossible without having the whole residue representation. The latter, in turn, seems to cost too much space to fit the simulation in logarithmic space.
Hence the logspace simulation for automata with matrices having both positive and negative entries seems to need another approach. It turns out that we can use the procedure introduced by Turakainen already in 1969 [17, 19].
Theorem 3.4
$\mathsf{AfL}_{\mathbb{Q}} \subseteq \mathsf{L}$.
Proof
For a given alphabet $\Sigma$, let $L \subseteq \Sigma^*$ be a language in $\mathsf{AfL}_{\mathbb{Q}}$ and $A = (\vec{x}, \{M_i \mid i \in \Sigma\}, F)$ be a $k$-state rational-valued AfA over $\Sigma$ such that
$$L = \left\{w \in \Sigma^* \mid f_A(w) > \tfrac{1}{2}\right\}.$$
For each $M_i$, we define a new matrix as
$$B_i = \begin{pmatrix} 0 & \vec{0}^T & 0 \\ \vec{c}_i & M_i & \vec{0} \\ e_i & \vec{d}_i^T & 0 \end{pmatrix},$$
where $\vec{c}_i$, $\vec{d}_i$, and $e_i$ are chosen so that the column and row sums of $B_i$ are zero. We define
$$\vec{x}' = \begin{pmatrix} 0 \\ \vec{x} \\ 0 \end{pmatrix}$$
as the new initial state. For the projection matrix $F$, we define an extension
$$F' = \begin{pmatrix} 0 & 0 & 0 \\ 0 & F & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$
It is straightforward to see that as well as .
For the next step, we introduce a matrix , whose each element is . It is then clear that and . Now we define
[TABLE]
where is selected large enough to ensure the nonnegativity of the matrix entries of each . It follows that
[TABLE]
and
[TABLE]
Similarly,
[TABLE]
Now
[TABLE]
which can further be modified by expanding the denominators away: For an integer large enough all matrices will be integer matrices and the former equation becomes
[TABLE]
Hence the inequality
[TABLE]
is equivalent to
[TABLE]
In order to verify inequality (8) in logarithmic space, it is sufficient to demonstrate that the residue representations of both sides can be obtained in logarithmic space.
To that end, the residue representation of vector can be obtained in logarithmic space as in the proof of Theorem 3.2.
Trivially, the residue representation of can be found in logarithmic space, as well. In order to compute the residue representation of
[TABLE]
it is sufficient to decide whether holds. As the residue representations for each and are known, all the decisions can be made in logspace, according to Lemma 3. The same conclusion can be made for the right hand side of (8). ∎
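The bordering step at the start of the proof can be sketched numerically. The snippet below assumes the block layout of $B_i$ described in the proof (a zero first row, a zero last column, and border entries fixed by the requirement that all row and column sums vanish); the function names and the example matrix are hypothetical. It only checks that the middle block of the extended state vector follows the original affine computation and makes no claim about the last coordinate.

```python
def extend(M):
    """Border a k x k matrix M so that every row sum and column sum is zero."""
    k = len(M)
    c = [-sum(M[i]) for i in range(k)]                        # cancels the row sums
    d = [-sum(M[i][j] for i in range(k)) for j in range(k)]   # cancels the column sums
    e = -sum(d)                                               # cancels the last row
    top = [0] * (k + 2)
    middle = [[c[i]] + list(M[i]) + [0] for i in range(k)]
    bottom = [e] + d + [0]
    return [top] + middle + [bottom]

def mat_vec(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in M]

M = [[2, 0],
     [-1, 1]]            # affine: both columns sum to 1
B = extend(M)

x_ext = [0, 1, 0, 0]     # the initial vector (1, 0) padded with zeros
v = mat_vec(B, mat_vec(B, x_ext))
print(v[1:3])            # [4, -3], which equals M M (1, 0)^T
```

Because all row and column sums of $B_i$ are zero, adding a constant multiple of the all-ones matrix makes the entries nonnegative while the products remain easy to track, which is what allows the techniques for nonnegative integer matrices to be reused.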
4 A Non-affine Language
As we saw in the previous section, $\mathsf{AfL}_{\mathbb{Q}} \subseteq \mathsf{L}$, and hence languages beyond $\mathsf{L}$ are good candidates for non-affine languages. (Footnote 1: It is known that $\mathsf{L} \subsetneq \mathsf{PSPACE}$, so it is plausible that $\mathsf{PSPACE}$-complete languages are not in $\mathsf{AfL}_{\mathbb{Q}}$.) In this section, we will however demonstrate that the border of non-affinity may lie considerably lower: There are languages in $\mathsf{L}$ which are not affine.
In an earlier work [8], we applied the method of Turakainen [20] to show that there are languages in $\mathsf{L}$ which however are not contained in $\mathsf{BAfL}$. Here we will extend the previous result to show that those languages are not contained even in $\mathsf{AfL}_{\mathbb{A}}$. (We leave open whether a similar technique can be applied to $\mathsf{AfL}$ without the restriction to algebraic entries.)
Definition 10 (Lower density)
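Let $L \subseteq a^*$ be a unary language. We call lower density of $L$ the limit
$$\underline{\mathrm{dens}}(L) = \liminf_{n \to \infty} \frac{\left|\{a^k \in L \mid k \leq n\}\right|}{n + 1}.$$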
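As a finite illustration of lower density, the sketch below computes the proportion of words of length at most $n$ that belong to a unary language. It assumes the density is the limit inferior of $|\{a^k \in L \mid k \leq n\}| / (n + 1)$; the function names are hypothetical, and the language of perfect squares is used only as a simple example of a sparse set.

```python
from fractions import Fraction
from math import isqrt

def prefix_density(member, n):
    """|{k <= n : a^k in L}| / (n + 1), where member(k) decides whether a^k is in L."""
    count = sum(1 for k in range(n + 1) if member(k))
    return Fraction(count, n + 1)

def is_square(k):
    """True if k is a perfect square."""
    return isqrt(k) ** 2 == k

print(prefix_density(is_square, 99))      # 1/10
print(prefix_density(is_square, 9999))    # 1/100
```

The ratios tend to zero as $n$ grows, which is the behaviour required by the first condition of the theorem below.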
Definition 11 (Uniformly distributed sequence)
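Let $(\vec{x}_n)$ be a sequence of vectors in $\mathbb{R}^k$ and $I = [a_1, b_1) \times \cdots \times [a_k, b_k)$ be an interval in $[0, 1)^k$. We define $C(I, n)$ as $\left|\{\vec{x}_i \bmod 1 \in I \mid 1 \leq i \leq n\}\right|$.
We say that $(\vec{x}_n)$ is uniformly distributed mod 1 if and only if for any $I$ of such type,
$$\lim_{n \to \infty} \frac{C(I, n)}{n} = (b_1 - a_1) \cdots (b_k - a_k).$$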
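The one-dimensional case of this definition can be explored empirically. The sketch below counts how often the fractional part of $i\alpha$ falls into a half-open interval; the function name and the choice $\alpha = \sqrt{2}$ are illustrative, floating-point arithmetic is used, and a finite experiment of this kind is of course not a proof of uniform distribution.

```python
import math

def fraction_in_interval(alpha, n, lo, hi):
    """Share of i in 1..n whose fractional part of i * alpha lies in [lo, hi)."""
    hits = sum(1 for i in range(1, n + 1) if lo <= (i * alpha) % 1.0 < hi)
    return hits / n

share = fraction_in_interval(math.sqrt(2), 10000, 0.25, 0.5)
print(abs(share - 0.25) < 0.01)   # the observed share is close to the interval length
```

For a rational multiplier such as $\alpha = 1/2$ the fractional parts take only two values, so intervals avoiding both values are never hit and the sequence is not uniformly distributed.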
Theorem 4.1
If $L \subseteq a^*$ satisfies the following conditions:
1. $\underline{\mathrm{dens}}(L) = 0$.
2. For all , there exists and an ascending sequence such that and for any irrational number , the sequence is uniformly distributed mod 1.
Then $L$ is not in $\mathsf{AfL}_{\mathbb{A}}$.
Proof
Let’s assume for contradiction that . Then there exists an AfA with states, matrix and initial vector such that the acceptance value of is
[TABLE]
Without loss of generality, we can assume that the cutpoint equals to , and hence
Using the Jordan decomposition , one has . So the coordinates of have the form
[TABLE]
where are the eigenvalues of and are polynomials of degree less than the degree of the corresponding eigenvalue. For short, we denote , and let .
When studying expression (9), we can assume without loss of generality, that all numbers are irrational. In fact, replacing matrix with , where does not change (9), since
[TABLE]
Selecting now (where ) implies that the eigenvalues of are . The field extension is finite, and hence there is always an irrational number . It follows directly that all numbers are irrational. Hence we can assume that all the numbers are irrational in the first place. (Footnote 2: Note that the new matrix obtained may not be affine, so it would be wrong to assume that all AfAs have to admit an equivalent one with only irrational eigenvalues. However, this does not affect this proof, since we do not require the new matrix to be affine; we only study the values that the fraction takes.)
By restricting to an arithmetic progression we can also assume that no $\lambda_i/\lambda_j$ is a root of unity for $i \neq j$. In fact, selecting
$$N = \operatorname{lcm}\{\operatorname{ord}(\lambda_i/\lambda_j) \mid i \neq j \text{ and } \lambda_i/\lambda_j \text{ is a root of unity}\},$$
(10) becomes
[TABLE]
where are the distinct elements of set Now for cannot be a root of unity, since would imply , which in turn implies and hence , which contradicts the assumption .
We can now write the acceptance condition equivalently as
[TABLE]
Where is the set of states of , its set of accepting states, and the complement of . According to (10), consists of combinations of absolute values of linear combination of functions of type .
We say that is of larger order than , if ; and in the case , if . If , we say that and are of the same order. It is clear that if term is of larger order than , then .
We can organize the terms in expression (10) as
[TABLE]
where each consists of terms with equal order multiplier:
[TABLE]
(for notational simplicity, we mostly omit the dependency on in the right hand side of (13)). Here is the common absolute value of all eigenvalues , and expression (12) is organized in descending order: is the sum of terms of the highest order multiplier, contains the terms of the second highest order multiplier, etc. We say that is lower than if
We will then fix a representation
[TABLE]
where is a grouping of all -terms in (12) defined as follows:
1. , where is chosen as the maximal number so that
[TABLE]
is a constant function . Such an exists, since for , the sum is regarded empty and , but for , all -terms are included, and then (15) becomes , which is not constant (otherwise condition 1 or 2 of the theorem would be false).
2. consists of a single -term immediately lower than those in , and
3. contains the rest of the -terms, lower than
Lemma 4
If , then
Proof
Denote . Because , we have
[TABLE]
Now
[TABLE]
∎
We choose and so that the highest -term in is of order and define , , . Then clearly if and only if and each remains bounded as . To simplify the notations, we omit the primes and recycle the notations to have a new version of of (14) where -terms may tend to infinity but -terms remain bounded.
Recall that we may assume (by restricting to an arithmetic progression) that no is a root of unity. By the Skolem-Mahler-Lech theorem [7], this implies that functions can have only a finite number of zeros, and in the continuation we assume that is chosen so large that no function becomes zero. Furthermore, by the main theorem of [6], then for each . (Footnote 3: This is the only point where we need the assumption that the matrix entries are algebraic.) As each remains bounded, we find that tend to zero as , and hence by Lemma 4, defining
[TABLE]
we have a function with the property (-terms are lower than -terms, so they can be dropped without violating this property), when . Also by the construction it is clear that , where is a constant, and by the conditions of the theorem, this is possible only if .
Notice that is not a constant function by construction. Also, each is a linear combination of functions of form , each can be assumed irrational, and , so we can conclude that is a continuous function formed of terms of form and of ratios . In these terms, however, the behaviour is asymptotically determined by the highest -terms, so the conclusion remains even if we drop the lower terms.
By assumption, for all , the sequence is uniformly distributed modulo 1. It follows that the values are dense in the unit circle. If for some , , then for some . Then, because of the density argument, there are arbitrarily large values of for which contradicting condition 2 of the statement. Hence for each large enough. As is not a constant, there must be some so that .
Next, let be a function obtained from by replacing each occurrence of by a variable , hence each will assume its value in the unit circle. Moreover, by the assumptions of the theorem, the values of will be uniformly distributed in the unit circle.
Note that . Then, because the sequences are uniformly distributed modulo 1, it follows that any value obtained by the function can be approximated by some with arbitrary precision. The function is continuous, therefore there exists an interval on which . So, if is large enough and satisfies
[TABLE]
then , which implies and hence . Now we just have to prove that the sequence is "dense enough" to have , contradicting again condition 1.
Then, because of uniform distribution imposed by condition 2, one has
[TABLE]
And so for large enough, , with , implying , a contradiction. ∎
Corollary 1
Let $P$ be any polynomial with nonnegative coefficients and $\deg(P) > 2$. The language $\{a^{P(n)} \mid n \in \mathbb{N}\}$ is not in $\mathsf{AfL}_{\mathbb{A}}$.
Corollary 2
The language $\{a^p \mid p \text{ is prime}\}$ is not in $\mathsf{AfL}_{\mathbb{A}}$.
Proof (Proof of Corollary 1 and Corollary 2.)
Turakainen proved that these two languages satisfy the two conditions of Theorem 4.1 [20]. Therefore, these two languages are not in $\mathsf{AfL}_{\mathbb{A}}$. ∎
Acknowledgments
Yakaryılmaz was partially supported by Akadēmiskā personāla atjaunotne un kompetenču pilnveide Latvijas Universitātē līg Nr. 8.2.2.0/18/A/010 LU reģistrācijas Nr. ESS2018/289 and ERC Advanced Grant MQC. Hirvensalo was partially supported by the Väisälä Foundation and Moutot by ANR project CoCoGro (ANR-16-CE40-0005).
References
- [1] Andris Ambainis and John Watrous. Two-way finite automata with quantum and classical states. Theoretical Computer Science, 287(1):299–311, Sep. 2002.
- [2] A. Ambainis, M. Beaudry, M. Golovkins, A. Ķikusts, M. Mercer, and D. Thérien. Algebraic results on quantum automata. Theory of Computing Systems, 39(1):165–188, 2006.
- [3] Andris Ambainis and Abuzer Yakaryılmaz. Automata and Quantum Computing. CoRR, abs/1507.0:1–32, 2015.
- [4] Aleksandrs Belovs, Juan Andrés Montoya, and Abuzer Yakaryılmaz. Can one quantum bit separate any pair of words with zero-error? Tech. Rep. 1602.07967, arXiv, 2016.
- [5] Alejandro Díaz-Caro and Abuzer Yakaryılmaz. Affine computation and affine automaton. In Computer Science - Theory and Applications - 11th International Computer Science Symposium in Russia, CSR 2016, St. Petersburg, Russia, June 9-13, 2016, Proceedings, pages 146–160, 2016.
- [6] J.-H. Evertse. On sums of S-units and linear recurrences. Compositio Math., 53(2):225–244, 1984.
- [7] Georges Hansel. A simple proof of the Skolem-Mahler-Lech theorem. Theoretical Computer Science, 43(1):91–98, 1986.
- [8] Mika Hirvensalo, Etienne Moutot, and Abuzer Yakaryılmaz. On the computational power of affine automata. Lecture Notes in Computer Science 10168 (Proceedings of LATA 2017), pages 405–417, 2017.
