The abelian complexity of infinite words and the Frobenius problem
Ian Kaye, Narad Rampersad

TL;DR
This paper investigates conditions under which the abelian complexity of infinite words ensures that a semigroup homomorphism applied to their factors covers all but finitely many natural numbers, linking combinatorics and number theory.
Contribution
It introduces new conditions connecting abelian complexity of infinite words with the Frobenius problem, expanding understanding of factor sets in combinatorics on words.
Findings
Identifies specific conditions on S and abelian complexity for coverage of N
Analyzes various infinite words with different abelian complexity functions
Establishes links between combinatorics on words and number theory
Abstract
We study the following problem, first introduced by Dekking. Consider an infinite word x over an alphabet {0,1,...,k-1} and a semigroup homomorphism S:{0,1,...,k-1}* -> N. Let L_x denote the set of factors of x. What conditions on S and the abelian complexity of x guarantee that S(L_x) contains all but finitely many elements of N? We examine this question for some specific infinite words x having different abelian complexity functions.
The abelian complexity of infinite words and the Frobenius problem
Ian Kaye and Narad Rampersad
Department of Mathematics and Statistics
University of Winnipeg
[email protected]
The first author was supported by an NSERC USRA. The second author was supported by NSERC Discovery Grants 418646-2012 and RGPIN-2019-04111.
Abstract
We study the following problem, first introduced by Dekking. Consider an infinite word $x$ over an alphabet $\{0,1,\ldots,k-1\}$ and a semigroup homomorphism $S:\{0,1,\ldots,k-1\}^* \to \mathbb{N}$. Let $L_x$ denote the set of factors of $x$. What conditions on $S$ and the abelian complexity of $x$ guarantee that $S(L_x)$ contains all but finitely many elements of $\mathbb{N}$? We examine this question for some specific infinite words $x$ having different abelian complexity functions.
1 Introduction
It is well-known that if $p$ and $q$ are two co-prime positive integers then all sufficiently large positive integers $n$ can be written as a linear combination $n = ap + bq$, where $a$ and $b$ are non-negative integers. Frobenius posed the problem of determining the largest positive integer that cannot be so represented; Sylvester [12] was the first to give a solution to Frobenius’ problem: he showed that the largest non-representable number is
$$pq - p - q. \qquad (1)$$
Ramírez Alfonsín [10] has written a monograph devoted entirely to this problem.
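Sylvester's formula for two co-prime integers is easy to check by brute force. The following is a minimal sketch; the function names `nonrepresentable` and `frobenius_number` and the symbols `p`, `q` are illustrative choices, not notation taken from the paper.

```python
from math import gcd


def nonrepresentable(p, q, bound):
    """Integers in [0, bound] that are NOT of the form a*p + b*q with a, b >= 0."""
    reachable = [False] * (bound + 1)
    reachable[0] = True  # a = b = 0
    for n in range(1, bound + 1):
        # n is representable iff removing one p or one q leaves a representable number
        reachable[n] = (n >= p and reachable[n - p]) or (n >= q and reachable[n - q])
    return [n for n in range(bound + 1) if not reachable[n]]


def frobenius_number(p, q):
    """Largest non-representable integer for co-prime p, q > 1, found by search."""
    assert p > 1 and q > 1 and gcd(p, q) == 1
    # Searching up to p*q is enough, since the answer is below p*q.
    return max(nonrepresentable(p, q, p * q))


# Compare the search with Sylvester's closed form pq - p - q.
for p, q in [(3, 5), (4, 7), (5, 9)]:
    assert frobenius_number(p, q) == p * q - p - q
```

For `p = 3`, `q = 5` the non-representable integers are 1, 2, 4 and 7, so the largest one is 7 = 15 - 3 - 5.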
Dekking [6] studied the following variation of this problem. Let $S:\{0,1\}^* \to \mathbb{N}$ be a semigroup homomorphism: i.e., there are non-negative integers $p$ and $q$ such that $S$ is defined by $S(0) = p$, $S(1) = q$, and $S(uv) = S(u) + S(v)$ for any words $u$ and $v$ over the binary alphabet $\{0,1\}$. Given an infinite word $x$ over the alphabet $\{0,1\}$, let $L_x$ denote the set of all factors of $x$ and let $L_x^n$ denote the set of all length-$n$ factors of $x$. Define
$$S(L_x) = \{S(w) : w \in L_x\}.$$
What conditions on $S$ and $x$ ensure that $S(L_x)$ is co-finite (contains all but finitely many elements of $\mathbb{N}$)?
Certainly and must be co-prime (and so we will assume this to be the case for the remainder of the paper). The set is closely related to the abelian complexity [14] of (as well as the additive complexity [2] of ). For any word over an alphabet , we write to denote the number of occurrences of a letter in the word and to denote the length of . If , the Parikh vector of is the vector whose -th entry equals . Let . For any , we have exactly when there is a factor of such that and . The abelian complexity function of is the function that maps to the cardinality of the set . If for all (i.e., ), then has maximal abelian complexity and it is clear that in this case is co-finite. Indeed, in this case the problem is the classical one stated by Frobenius. On the other hand, for words with lower abelian complexity functions, this may not be the case.
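The notions in this paragraph can be explored on a finite prefix of a word. The sketch below is only an illustration: the function names are hypothetical, the homomorphism is passed as a dictionary of letter weights, and a finite prefix yields only the Parikh vectors of the factors it happens to contain, so the counts are lower bounds for the infinite word.

```python
def parikh_vectors(word, n, alphabet="01"):
    """Set of Parikh vectors of the length-n factors occurring in the string `word`."""
    return {
        tuple(word[i:i + n].count(a) for a in alphabet)
        for i in range(len(word) - n + 1)
    }


def abelian_complexity(word, n, alphabet="01"):
    """Number of distinct Parikh vectors among the length-n factors of `word`."""
    return len(parikh_vectors(word, n, alphabet))


def image_set(word, weights, max_len, alphabet="01"):
    """Values of the homomorphism on non-empty factors of `word` of length <= max_len.

    `weights[a]` is the image of the letter a; the image of a factor depends only
    on its Parikh vector, so it suffices to range over Parikh vectors.
    """
    values = set()
    for n in range(1, max_len + 1):
        for vec in parikh_vectors(word, n, alphabet):
            values.add(sum(count * weights[a] for count, a in zip(vec, alphabet)))
    return values


example = "0010"
print(abelian_complexity(example, 2))          # factors 00, 01, 10
print(sorted(image_set(example, {"0": 2, "1": 3}, 4)))
```

Working with Parikh vectors rather than factors is the point of the paragraph above: two factors with the same Parikh vector always have the same image.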
Dekking studied the case where is a Sturmian word. Sturmian words are the aperiodic words with the smallest possible abelian complexity; i.e., if is an aperiodic binary word then is Sturmian if and only if for all [5]. Dekking gave an explicit formula for for any Sturmian word ; this formula implies that for any given there are only finitely many maps such that is co-finite. For the Fibonacci word, Dekking characterized exactly the set of such maps . Given the close relationship between Sturmian words and Beatty sequences, we also mention the work of Steuding and Stumpf [11] concerning the Frobenius problem and Beatty sequences.
The general question we are interested in then is, “What conditions on the abelian complexity of are sufficient to ensure that is co-finite for all maps ?” (Remember, we are assuming that and are relatively prime.) If is co-finite for all maps , we say that has the Frobenius property. As previously noted, if has maximal abelian complexity, then has the Frobenius property, and if is Sturmian, then does not have the Frobenius property. In this paper we analyze some examples of words whose abelian complexity is intermediate between these two extremes.
Finally, we note that the Frobenius problem can be extended from two given positive integers and to any number of given positive integers. Similarly, we can extend the notions defined above to words over larger alphabets. Recall that Dekking studied for Sturmian words , which are infinite binary words with constant abelian complexity . We examine for a certain infinite ternary word with constant abelian complexity .
To summarize, in the next sections we study:
- the paperfolding word , which has abelian complexity ; this word does not have the Frobenius property.
- a pure morphic binary word with abelian complexity ; this word has the Frobenius property.
- a balanced ternary word with abelian complexity for all ; this word does not have the Frobenius property.
2 The paperfolding word
In this section we examine whether the (ordinary) paperfolding word $f$ has the Frobenius property. This is a word whose abelian complexity function is unbounded, unlike that of the Sturmian words. For a nice introduction to the paperfolding words and their properties, see the series of papers by Dekking, Mendès France, and van der Poorten [7]. There are a number of equivalent definitions of the paperfolding word $f$. If $w = w_1 w_2 \cdots w_n$ is a word over $\{0,1\}$ then the complement of $w$ is the word $\overline{w} = (1-w_1)(1-w_2)\cdots(1-w_n)$ and the reversal of $w$ is the word $w^R = w_n \cdots w_2 w_1$. The word $f$ may be constructed as the limit of the following process: Let $F_0 = 0$. Having constructed $F_n$, we define $F_{n+1} = F_n\,0\,\overline{F_n^R}$. Then $f = \lim_{n\to\infty} F_n$.
The next construction of the paperfolding word is known as the Toeplitz construction. We begin with a sequence of empty spaces and fill every second space with the alternating sequence . After infinitely many repetitions of this process, we obtain the ordinary paperfolding word . Beginning with , the first few steps in this process are
_ _ _ _ _ _ _ _ _ _ _ _ …
0 _ 1 _ 0 _ 1 _ 0 _ 1 _ …
0 0 1 _ 0 1 1 _ 0 0 1 _ …
0 0 1 0 0 1 1 _ 0 0 1 1 …
0 0 1 0 0 1 1 0 0 0 1 1 …
This construction implies the following recursive definition of :
[TABLE]
We may also define the -th term of from the binary representation of . Let be given, where is odd. Then define
[TABLE]
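The three descriptions of the paperfolding word given so far (folding, Toeplitz filling, and the rule based on the odd part of the index) can be compared on a finite prefix. This is a sketch under stated assumptions: positions are numbered from 1, the folding step appends a 0 followed by the complemented reversal, and the odd-part rule assigns 0 when the odd part of the index is 1 mod 4 and 1 when it is 3 mod 4. These conventions were chosen because they reproduce the Toeplitz rows displayed above; the paper's own indexing may differ.

```python
def paperfolding_by_folding(steps):
    """Start from '0' and repeatedly append '0' plus the complemented reversal."""
    word = "0"
    for _ in range(steps):
        flipped = "".join("1" if c == "0" else "0" for c in reversed(word))
        word = word + "0" + flipped
    return word


def paperfolding_by_toeplitz(length):
    """Fill every second empty cell with 0, 1, 0, 1, ... until no cell is empty."""
    cells = [None] * length
    while None in cells:
        empty = [i for i, c in enumerate(cells) if c is None]
        # Take the 1st, 3rd, 5th, ... empty cells and write the alternating sequence.
        for j, i in enumerate(empty[::2]):
            cells[i] = "01"[j % 2]
    return "".join(cells)


def paperfolding_term(n):
    """Term at position n >= 1, read off from the odd part of n."""
    while n % 2 == 0:
        n //= 2
    return "0" if n % 4 == 1 else "1"


folded = paperfolding_by_folding(4)  # length 2**5 - 1 = 31
assert folded == paperfolding_by_toeplitz(len(folded))
assert folded == "".join(paperfolding_term(n) for n in range(1, len(folded) + 1))
print(folded[:12])
```

The printed prefix agrees with the last Toeplitz row shown above.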
Madill and Rampersad [9] studied the abelian complexity of . They proved that ; however, it is also the case that takes the value infinitely often. In particular, we have
[TABLE]
This can be proved by induction on , using [9, Claim 5] (which states that ). As we will see, these low values of the abelian complexity function prevent from having the Frobenius property.
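The recurrence of small abelian complexity values can be observed numerically. The sketch below is an illustration, not the computation of [9]: it generates a prefix with the odd-part rule (positions numbered from 1, an assumption of this sketch) and counts the distinct numbers of zeros among factors of each length, which for a binary word is the same as counting Parikh vectors. A finite prefix can only undercount.

```python
def paperfolding_prefix(length):
    """Prefix of the paperfolding word, positions 1..length, via the odd-part rule."""
    def term(n):
        while n % 2 == 0:
            n //= 2
        return "0" if n % 4 == 1 else "1"
    return "".join(term(n) for n in range(1, length + 1))


def abelian_complexity(word, n):
    """Distinct zero counts among length-n factors (binary alphabet only)."""
    return len({word[i:i + n].count("0") for i in range(len(word) - n + 1)})


prefix = paperfolding_prefix(4096)
values = {n: abelian_complexity(prefix, n) for n in range(1, 17)}
for n in (1, 2, 3, 4, 8, 16):
    print(n, values[n])
```

On this prefix the lengths 2, 4, 8 and 16 each show exactly three Parikh vectors, while length 3 shows four.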
We define by and by .
Example 1.
For we have , , , and .
Note that for any we have , so . We need the following two facts [9, Claims 3 and 4 (and their proofs)]:
[TABLE]
Lemma 2.
For , the Parikh vectors do not occur in .
Proof.
Since , , are all elements of , we can apply the recursive definition (2) inductively to show that and are elements of . From (3), we see that these three vectors are the only vectors in , which establishes the claim. ∎
Theorem 3.
If and and then is an infinite set. In particular, the word does not have the Frobenius property.
Proof.
Suppose that and consider a positive integer with representation for some (large) . By Lemma 2, does not contain any factor with Parikh vector , so we must look for another representation for some non-zero integer . This representation will correspond to a factor of length with Parikh vector . Then . Now by (4), we have
[TABLE]
Furthermore, by (4)–(5), we have , which implies
[TABLE]
The inequalities (6) and (7) give
[TABLE]
If we get a contradiction immediately, since and (8) becomes , which is impossible. If we have (since ), and (8) becomes . Since we find that . We conclude that if , there are infinitely many . ∎
3 A binary word with abelian complexity
In the last section we saw that the ordinary paperfolding word does not have the Frobenius property, and that in this case this is due to the fact that is bounded. This suggests that to produce an (interesting) example of an infinite word with the Frobenius property, we should consider a word with less than maximal abelian complexity but for which
[TABLE]
Let be the morphism that sends and . Let be the fixed point of that starts with [math]: that is, let
For a general morphism we define the incidence matrix of as the matrix whose column is the Parikh vector of . Blanchet-Sadri et al. [4] conducted an extensive study of the asymptotic abelian complexities of binary words generated by iterating morphisms. We will make use of several ideas from their paper in this section. Following the notation of [4], we will use to denote the number of zeroes that appear in the factor . Let and . We will also use (resp. ) to denote the maximum (resp. minimum) number of zeroes among factors of length in . The difference and delta functions are defined in [4] for a general -uniform morphism; for our morphism we have and .
Example 4.
For as defined above, we have , , , , , , , and
[TABLE]
From [4, Theorem 7] we get that , which is certainly not maximal. The following is the main result of this section.
Theorem 5.
The word has the Frobenius property.
We need a preliminary result. In the proof of this result, and again later in this section, we will need to determine, by computer search, the Parikh vectors of all factors of of length for up to some specified bound. In order to perform this computation we make use of the following fact:
If for some , then each factor of of length appears in some , where .
We also note that when performing such a computation there is no need to save all Parikh vectors for factors of length : indeed, by [14, Lemma 2.1], the Parikh vectors of factors of length in are completely determined by the pair .
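A computer search of this kind can be organised as in the following sketch. It is a generic illustration under loud assumptions: the morphism used here is the period-doubling morphism (0 to 01, 1 to 00), chosen only as a stand-in and not the morphism studied in this section; all function names are hypothetical; and the search runs over a long prefix instead of using the fact about images of two-letter words stated above.

```python
def iterate(morphism, seed, steps):
    """Apply the morphism `steps` times to the word `seed`."""
    word = seed
    for _ in range(steps):
        word = "".join(morphism[c] for c in word)
    return word


def incidence_matrix(morphism, alphabet="01"):
    """Matrix whose column for letter b is the Parikh vector of morphism[b]."""
    return [[morphism[b].count(a) for b in alphabet] for a in alphabet]


def zero_range(word, n):
    """(max, min) number of zeros among the length-n factors occurring in `word`."""
    counts = [word[i:i + n].count("0") for i in range(len(word) - n + 1)]
    return max(counts), min(counts)


def parikh_vectors_from_range(n, max_zeros, min_zeros):
    """For a binary word every zero count between min and max occurs,
    so the pair (max, min) determines all Parikh vectors of length n."""
    return [(z, n - z) for z in range(min_zeros, max_zeros + 1)]


# Stand-in morphism (period doubling), NOT the morphism of this section.
stand_in = {"0": "01", "1": "00"}
prefix = iterate(stand_in, "0", 4)
print(prefix)
print(incidence_matrix(stand_in))
hi, lo = zero_range(prefix, 2)
print(parikh_vectors_from_range(2, hi, lo))
```

Storing only the pair returned by `zero_range` for each length keeps the memory use linear in the number of lengths examined.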
Proposition 6.
For each integer , define . Then
1. whenever and
2. whenever .
Proof.
We prove part 1; part 2 is proven similarly with . For clarity, we parametrize the property
[TABLE]
Clearly, if holds for a given and for all then our proposition holds for that . Thus, we proceed by double-induction on and .
We fix and verify by computer that and thus is satisfied for . Suppose that holds for some and let . We may write for some integers with . Then and we have two cases: either or .
If then and by we have . One of the inequalities (for an -uniform morphism) in the proof of [4, Proposition 18] is
[TABLE]
which, after substituting the appropriate values for the constants for , becomes
[TABLE]
since . Thus, we have
[TABLE]
as required.
If then by [4, Lemma 13] we get
[TABLE]
as required, and so in either case, holds and by induction we have for all .
Suppose that there exist and with . Now if we may write where and . Then we have
[TABLE]
so and the result holds by induction. ∎
Corollary 7.
For each and , we have
[TABLE]
for all .
We will use Corollary 7 to show that, given and , every sufficiently large integer has a representation where . Theorem 5 therefore follows from the next lemma.
Lemma 8.
Let . Then every integer
[TABLE]
has a representation where for some .
Proof.
Suppose that is given and let for some non-negative integers (note that is larger than the quantity from (1), so such a representation exists). For each we have . Our aim is to show that there is a choice of for which . Note that, from Corollary 7, if we look at large enough factors of we eventually obtain a factor that is roughly one third 0’s. Thus, if we define , then we seek a such that and thus let . However, is not necessarily an integer, so we will use either the floor or ceiling and show the existence of a subword with length and zeroes.
We first claim that and are nonnegative (and thus it is possible to speak of a factor with length and zeroes). We have
[TABLE]
and so , , and each have the same sign. As well,
[TABLE]
so the three integers are nonnegative. Now note that replacing with only changes each expression by a small amount: and . Thus if then we have
[TABLE]
and
[TABLE]
and thus both and are nonnegative as required.
We now show that the corresponding factor exists within . We have two cases:
Case 1: . Then we have
[TABLE]
and
[TABLE]
so
[TABLE]
Case 2: . Then we have
[TABLE]
and
[TABLE]
so
[TABLE]
In either case, we may take and since
[TABLE]
by Corollary 7 we have that there exists a subword of such that and .
∎
As noted, Theorem 5 follows directly from Lemma 8. However, the bound on described in Lemma 8 is certainly not optimal; the maximum non-representable integer may be much smaller than . We therefore now compute exactly the largest value of for several small values of .
We compute the complement of based on the Parikh vectors of factors of length up to
[TABLE]
and thus for any integer , if it is representable then its representation should appear among the Parikh vectors of factors up to length . For convenience, we collected the Parikh vectors of factors up to length and then computed and its complement only using the Parikh vectors of factors of the appropriate lengths. The results are reported in Table 1.
[TABLE]
4 A ternary word with constant abelian complexity
Dekking [6] proved that Sturmian words do not have the Frobenius property. If is a Sturmian word, then is balanced: i.e., for all letters , we have whenever and are factors of of the same length. Furthermore, as noted in the introduction, we have for all , and indeed, the aperiodic words with this abelian complexity function are exactly the Sturmian words. Dekking also performed a detailed analysis of for the Fibonacci word defined as follows.
Definition 9 (Fibonacci Word).
Let and let . We define
[TABLE]
We also note that
[TABLE]
is the sequence obtained from by applying the map .
Dekking showed that is co-finite except when . If one wished to extend Dekking’s analysis to ternary words, then in this setting, the natural ternary analogues of Sturmian words are aperiodic ternary words with abelian complexity for . Currently there is no complete characterization of such words; however, Richomme, Saari, and Zamboni [14] proved that if is aperiodic, ternary, and balanced, then for .
Hubert [8] gave a useful characterization of aperiodic balanced words. The reader may consult Hubert’s paper for more details. Here, we will use his characterization to construct a word from the Fibonacci word with abelian complexity for all lengths. For ease of notation, let be the operation that sends and every second , starting with the second [math]. Similarly, let be the operation that sends and every second , starting with the first [math].
Example 10.
Let . Then and .
We define
[TABLE]
and we immediately have the following.
Lemma 11.
for all .
Proof.
By [8] (and its English explanation in [13, Section 4]), the word is an aperiodic, uniformly recurrent, balanced word on , so the result follows from [14, Theorem 4.2]. ∎
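The construction and the lemma can be checked on a finite prefix. The sketch below rests on assumptions that are not confirmed by the text: the Fibonacci word is taken to be the fixed point of the morphism 0 to 01, 1 to 0 (a standard convention that may differ from the paper's by a renaming of letters), and the two operations are modelled as replacing every second 0 by 2, beginning with either the first or the second 0. The names `fibonacci_prefix`, `alternate_zeros` and `parikh_vectors` are hypothetical.

```python
def fibonacci_prefix(length):
    """Prefix of the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def alternate_zeros(word, start):
    """Replace every second '0' by '2'.

    start=1 replaces the 1st, 3rd, 5th, ... zero; start=2 replaces the 2nd, 4th, ...
    """
    out, seen = [], 0
    for c in word:
        if c == "0":
            seen += 1
            out.append("2" if (seen - start) % 2 == 0 else "0")
        else:
            out.append(c)
    return "".join(out)


def parikh_vectors(word, n, alphabet="012"):
    """Set of Parikh vectors of the length-n factors occurring in `word`."""
    return {
        tuple(word[i:i + n].count(a) for a in alphabet)
        for i in range(len(word) - n + 1)
    }


ternary = alternate_zeros(fibonacci_prefix(3000), 2)
for n in range(1, 11):
    vectors = parikh_vectors(ternary, n)
    # Three Parikh vectors per length, and the set is unchanged by swapping 0 and 2.
    assert len(vectors) == 3
    assert vectors == {(c, b, a) for (a, b, c) in vectors}
print(ternary[:12])
```

The symmetry assertion reflects the fact that both parities of preceding zeros occur for every factor, which is what the WELLDOC property is used for later in this section.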
We will also make use of the following property.
Definition 12 (WELLDOC Property [3]).
We say that an infinite aperiodic word on has well distributed occurrences (WELLDOC) if for every and every subword of we have
[TABLE]
Sturmian words have the WELLDOC property [3, Theorem 3.3].
Definition 13.
For a subset and a constant we define .
Lemma 14.
.
Proof.
Certainly , since any factor of is obtained by taking a factor of and and replacing every other [math] with a . Let . Without loss of generality, say for some . Then by the WELLDOC property (with ), there is an occurrence of in where it is preceded by an even number of 0’s and an occurrence where it is preceded by an odd number of 0’s. Then and both occur as subwords of . ∎
It is well-known that and . Thus we have , , , and in . We will refer to these as the generating prefixes later on. Since we only have 3 possible Parikh vectors for each , exactly two of these must be equal. This equality depends on the parity of .
Theorem 15.
For define . If is odd then
[TABLE]
If is even then
[TABLE]
Proof.
First note that
[TABLE]
If is odd, it is clear that
[TABLE]
since exactly half of the 0’s in will become 2’s after we apply . For , we have
[TABLE]
By Lemma 14, we get the third Parikh vector by swapping the first and last components.
If is even, we apply a similar line of reasoning to , , and , which gives the above. ∎
Let be a morphism with , and . As always, we assume that . Define
[TABLE]
(Note that is a generalized Beatty sequence, in the sense of Allouche and Dekking [1].) Using the fact that for , we see that . Using this identity and the fact that , we obtain (after some algebra) the following corollary of Theorem 15.
Corollary 16.
If is odd then
[TABLE]
If is even then
[TABLE]
Define
[TABLE]
We will refer to the ’s as main terms and the ’s as offsets.
Theorem 17.
Define . Then , where
[TABLE]
Proof.
Note that is when is odd and when is even. We therefore obtain the equations
[TABLE]
from Corollary 16. ∎
Theorem 18.
The word does not have the Frobenius property.
Proof.
From Theorem 17 we see that among the first natural numbers, at most are in . From (12) and Theorem 17 we find that there is a constant such that for , we have
[TABLE]
Let
[TABLE]
denote the natural density of . Then
[TABLE]
The denominator of this last expression is approximately . Since each is at least , we see that if any is at least , this denominator is larger than and hence . It follows that if for some , then has an infinite complement. Thus does not have the Frobenius property. ∎
Next, we determine the maps for which is co-finite. We only have to consider those for which for . We will show that it is possible to determine if is co-finite by checking (by computer) a finite initial segment of the sequence . We begin with an analysis of the first difference sequence
[TABLE]
Recalling that is equal to the Fibonacci sequence over , we see that is equal to the Fibonacci sequence over . Let ; i.e., if and if . There is one degenerate case to consider here, namely, the case where . In this case is constant with each term equal to . However, the analysis below is not affected by this degenerate situation.
Let
[TABLE]
and for a given factor of , let
[TABLE]
Definition 19 (Semi-image).
We define the even semi-image of as
[TABLE]
and the odd semi-image of as where
[TABLE]
These formulas are analogous to the ones from Theorem 17, but instead of using the generating prefixes we can use any factor of . Since, by the WELLDOC property, each factor of appears with either parity of -steps prior to it, we must have two semi-images; the even (resp. odd) semi-image represents the image of the factor with an even (resp. odd) number of -steps before it. The odd semi-image is shifted by to account for non-integral but the same lines of reasoning will apply.
Definition 20 (Semi-complement).
We define the even semi-complement as
[TABLE]
and the odd semi-complement as
[TABLE]
Example 21.
Consider the triple . The odd offsets are {0.5, -0.5, 0.5}, the even offsets are {0,1,0},
[TABLE]
and
[TABLE]
Let . Then we have , , , and . We also have , and .
Theorem 22.
Fix and let . Then the complement of is finite if and only if for all .
We need two preliminary lemmas.
Lemma 23.
Let . Then
[TABLE]
Proof.
It suffices to show that
[TABLE]
which happens if and only if
[TABLE]
Since we have , we are done. ∎
Lemma 24.
If then
[TABLE]
for every and .
Proof.
For any and we have
[TABLE]
as required. ∎
Proof of Theorem 22.
We begin with the converse. First note that if is a factor of and then so is nonempty. If every semi-complement is empty, then there exists a sequence on such that
[TABLE]
By Lemma 23, we get that is co-finite.
Now suppose that for some the set (resp. ) is non-empty, and so one of the semi-images ‘misses’ an integer . By the WELLDOC property, there exist infinitely many indices where and is even (resp. odd). Thus, for each there exists an integer such that (resp. ). By Lemma 24, . Thus the complement of is infinite. ∎
Note that by Lemma 14, our results are symmetric with respect to and and if then all of the results in [6] hold. As well, any triple with a greatest common divisor greater than one will have infinitely many elements in the complement of . Thus, in all of the following calculations we skip any triple where , , or where has already been evaluated.
For each triple, we first calculate and then calculate all factors of length in . (In the cases where , i.e., is constant, we merely check the semi-image for the single factor . Different letters may follow different occurrences of each factor; the extra term at the end allows us to account for all possible values of when calculating .) We then calculate the semi-complements of each factor of , and by Theorem 22, if we find a non-empty semi-complement we know that the complement of is infinite; otherwise, the complement of is finite. We found 13 triples with finite complements. These are listed in Table 2.
[TABLE]
5 Further work
We have just given some examples of infinite words that either have or do not have the Frobenius property. In general, we would like to have a theorem that classifies an infinite word as either having or not having the Frobenius property based on its abelian complexity. For instance, is it true that if has abelian complexity for some , or perhaps even , then has the Frobenius property? What happens when we move to ternary or larger alphabets?
References
[1] J.-P. Allouche, M. Dekking. Generalized Beatty sequences and complementary triples. Preprint. https://arxiv.org/abs/1809.03424
[2] H. Ardal, T. Brown, V. Jungić, and J. Sahasrabudhe. On abelian and additive complexity in infinite words. Integers, 12:#A 21, 2012.
[3] L. Balková, M. Bucci, A. De Luca, J. Hladký, S. Puzynina. Aperiodic pseudorandom number generators based on infinite words. Theoret. Comput. Sci. 647:85–100, 2016.
[4] F. Blanchet-Sadri, N. Fox, N. Rampersad. On the asymptotic abelian complexity of morphic words. Adv. Appl. Math. 61:46–84, 2014.
[5] E. M. Coven and G. A. Hedlund. Sequences with minimal block growth. Mathematical Systems Theory, 7:138–153, 1973.
[6] M. Dekking. The Frobenius problem for homomorphic embeddings of languages into the integers. Theoret. Comput. Sci. 732:73–79, 2018.
[7] M. Dekking, M. Mendès France, A. van der Poorten. Folds I–III. Math. Intelligencer 4:130–138,172–181,190–195, 1982.
[8] P. Hubert. Suites équilibrées. Theoret. Comput. Sci. 242:91–108, 2000.
