The abelian complexity of infinite words and the Frobenius problem

Ian Kaye; Narad Rampersad

arXiv:1907.08247·math.CO·July 22, 2019


TL;DR

This paper investigates conditions under which the abelian complexity of infinite words ensures that a semigroup homomorphism applied to their factors covers all but finitely many natural numbers, linking combinatorics and number theory.

Contribution

It introduces new conditions connecting abelian complexity of infinite words with the Frobenius problem, expanding understanding of factor sets in combinatorics on words.

Findings

01. Identifies specific conditions on S and abelian complexity for coverage of N

02. Analyzes various infinite words with different abelian complexity functions

03. Establishes links between combinatorics on words and number theory

Abstract

We study the following problem, first introduced by Dekking. Consider an infinite word x over an alphabet {0,1,...,k-1} and a semigroup homomorphism S:{0,1,...,k-1}* -> N. Let L_x denote the set of factors of x. What conditions on S and the abelian complexity of x guarantee that S(L_x) contains all but finitely many elements of N? We examine this question for some specific infinite words x having different abelian complexity functions.

Equations

$$ab-a-b.$$

$$S(\mathcal{L}_{\bf w})=\{S(u):u\in\mathcal{L}_{\bf w}\}.$$

$$(f_{2n-1})_{n\geq 1}=(01)^{\omega}\quad\text{and}\quad(f_{2n})_{n\geq 1}={\bf pf}.$$

$$f_{n}=\begin{cases}0&\text{if }m\equiv 1\pmod{4}\\1&\text{if }m\equiv 3\pmod{4}.\end{cases}$$

$$\rho_{\bf pf}(2^{n})=3\text{ for }n\geq 1.$$

$$\rho(n)$$

$$M(n+1)$$

$$|t(a+b)-4|+1=|\Delta(w)|+1\leq M(2^{n}+t(b-a))+1=\rho(2^{n}+t(b-a))$$

$$\rho(2^{n}+t(b-a))\leq\rho(2^{n})+|t|(b-a)=3+|t|(b-a).$$

$$|t(a+b)-4|+1\leq 3+|t|(b-a).$$

$$\liminf_{n\to\infty}\rho_{\Phi}(n)=\infty.$$

$$M_{\phi}=\begin{pmatrix}3&1\\2&4\end{pmatrix}.$$

$$P(j,C):\quad\left[5^{j}\cdot N_{C}\leq n\leq 5^{j+1}\cdot N_{C}\Rightarrow z_{M}(n)\geq\frac{n}{3}+C\right]$$

$$z_{M}(\ell k+r)\geq d\,z_{M}(k+1)+z_{1}(k+1)+\Delta-z_{M}(\ell-r),$$

$$z_{M}(\ell k+r)\geq 2z_{M}(k+1)+k+1-z_{M}(5-r)\geq 2z_{M}(k+1)+k-2,$$

$$\begin{aligned}z_{M}(n)&\geq 2z_{M}(k+1)+k-2\\&\geq 2\left(\frac{k+1}{3}+4\right)+k-2\\&=\frac{1}{3}(5k+4)+\frac{16}{3}\\&\geq\frac{1}{3}(5k+r)+4\\&=\frac{n}{3}+4,\end{aligned}$$

$$\begin{aligned}z_{M}(n)&=2z_{M}(5^{j+1}N_{4})+5^{j+1}N_{4}\\&\geq 2\left(\frac{5^{j+1}N_{4}}{3}+4\right)+5^{j+1}N_{4}\\&=\frac{5}{3}(5^{j+1}N_{4})+8\\&=\frac{n}{3}+8\\&\geq\frac{n}{3}+4,\end{aligned}$$

$$\begin{aligned}z_{M}(n)&\geq 2z_{M}(k+1)+k-2\\&\geq 2\left(\frac{k+1}{3}+C\right)+k-2\\&=\frac{1}{3}(5k+4)+2C-\frac{8}{3}\\&\geq\frac{1}{3}(5k+r)+C+1\\&=\frac{n}{3}+(C+1),\end{aligned}$$

$$\left\{\left(\left\lfloor\frac{n}{3}\right\rfloor+D,\;n-\left\lfloor\frac{n}{3}\right\rfloor-D\right):-C\leq D\leq C\right\}\subseteq\psi(\mathcal{L}_{n,\Phi})$$

$$M\geq M_{a,b}:=\max\left\{(a+2b)\cdot\max\{a,b\},\;\frac{a+2b}{3}\left(132\cdot 5^{C-4}+|a-b|\right)\right\}$$

$$x-t_{0}b=\frac{1}{3}n(t_{0})=\frac{1}{2}(y+t_{0}a)$$

$$x-t_{0}b=\frac{ax+by}{2b+a}=\frac{M}{2b+a}\geq 0$$

$$x-\lfloor t_{0}\rceil b>(x-t_{0}b)-b=\frac{M}{2b+a}-b>\left(\frac{2b+a}{2b+a}\right)\max\{a,b\}-b\geq 0$$


Taxonomy

Topics: semigroups and automata theory · Coding theory and cryptography · Computability, Logic, AI Algorithms

Full text

The abelian complexity of infinite words and the Frobenius problem

Ian Kaye and Narad Rampersad

Department of Mathematics and Statistics

University of Winnipeg

[email protected]

One author was supported by an NSERC USRA; the other was supported by NSERC Discovery Grants 418646-2012 and RGPIN-2019-04111.

Abstract

We study the following problem, first introduced by Dekking. Consider an infinite word ${\bf x}$ over an alphabet $\{0,1,\ldots,k-1\}$ and a semigroup homomorphism $S:\{0,1,\ldots,k-1\}^{*}\to\mathbb{N}$ . Let $\mathcal{L}_{\bf x}$ denote the set of factors of ${\bf x}$ . What conditions on $S$ and the abelian complexity of ${\bf x}$ guarantee that $S(\mathcal{L}_{\bf x})$ contains all but finitely many elements of $\mathbb{N}$ ? We examine this question for some specific infinite words ${\bf x}$ having different abelian complexity functions.

1 Introduction

It is well-known that if $a$ and $b$ are two co-prime positive integers then all sufficiently large positive integers $n$ can be written as a linear combination $n=xa+yb$ , where $x$ and $y$ are non-negative integers. Frobenius posed the problem of determining the largest positive integer that cannot be so represented; Sylvester [12] was the first to give a solution to Frobenius’ problem: he showed that the largest non-representable number is

$$ab-a-b.\tag{1}$$

Ramírez Alfonsín [10] has written a monograph devoted entirely to this problem.
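Sylvester's formula is easy to confirm numerically for small inputs. The following Python sketch finds the largest non-representable integer by brute force and prints it next to $ab-a-b$. The function names `representable` and `frobenius_number` are illustrative choices, not notation from the paper.

```python
from math import gcd


def representable(n: int, a: int, b: int) -> bool:
    """Return True if n = x*a + y*b for some non-negative integers x, y."""
    return any((n - x * a) % b == 0 for x in range(n // a + 1))


def frobenius_number(a: int, b: int) -> int:
    """Largest integer not representable as x*a + y*b (a, b co-prime, both >= 2)."""
    if gcd(a, b) != 1 or min(a, b) < 2:
        raise ValueError("a and b must be co-prime and at least 2")
    # Every integer >= a*b is representable, so the search range is finite.
    return max(n for n in range(a * b) if not representable(n, a, b))


for a, b in [(3, 5), (4, 7), (5, 9)]:
    print(a, b, frobenius_number(a, b), a * b - a - b)
```

The search stops at $ab$ because every integer above $ab-a-b$ is representable, so nothing is missed beyond that point.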

Dekking [6] studied the following variation of this problem. Let $S:\{0,1\}^{*}\to\mathbb{N}$ be a semigroup homomorphism: i.e., there are non-negative integers $a$ and $b$ such that $S$ is defined by $S(0)=a$ , $S(1)=b$ , and $S(uv)=S(u)+S(v)$ for any words $u$ and $v$ over the binary alphabet $\{0,1\}$ . Given an infinite word ${\bf w}$ over the alphabet $\{0,1\}$ , let $\mathcal{L}_{\bf w}$ denote the set of all factors of ${\bf w}$ and let $\mathcal{L}_{n,\mathbf{w}}$ denote the set of all length- $n$ factors of ${\bf w}$ . Define

$$S(\mathcal{L}_{\bf w})=\{S(u):u\in\mathcal{L}_{\bf w}\}.$$

What conditions on ${\bf w}$ and $S$ ensure that $S(\mathcal{L}_{\bf w})$ is co-finite (contains all but finitely many elements of $\mathbb{N}$ )?

Certainly $a$ and $b$ must be co-prime (and so we will assume this to be the case for the remainder of the paper). The set $S(\mathcal{L}_{\bf w})$ is closely related to the abelian complexity [14] of ${\bf w}$ (as well as the additive complexity [2] of ${\bf w}$ ). For any word $u$ over an alphabet $A$ , we write $|u|_{a}$ to denote the number of occurrences of a letter $a\in A$ in the word $u$ and $|u|$ to denote the length of $u$ . If $A=\{a_{1},\ldots,a_{k}\}$ , the Parikh vector of $u$ is the vector $\psi(u)$ whose $i$ -th entry equals $|u|_{a_{i}}$ . Let $A=\{0,1\}$ . For any $n$ , we have $n\in S(\mathcal{L}_{\bf w})$ exactly when there is a factor $u$ of ${\bf w}$ such that $n=xa+yb$ and $\psi(u)=(x,y)$ . The abelian complexity function of ${\bf w}$ is the function $\rho_{\bf w}(n)$ that maps $n$ to the cardinality of the set $\psi(\mathcal{L}_{n,\mathbf{w}})$ . If $\psi(\mathcal{L}_{n,\mathbf{w}})=\{(0,n),(1,n-1),\ldots,(n-1,1),(n,0)\}$ for all $n$ (i.e., $\rho_{\bf w}(n)=n+1$ ), then ${\bf w}$ has maximal abelian complexity and it is clear that in this case $S(\mathcal{L}_{\bf w})$ is co-finite. Indeed, in this case the problem is the classical one stated by Frobenius. On the other hand, for words with lower abelian complexity functions, this may not be the case.

Dekking studied the case where ${\bf w}$ is a Sturmian word. Sturmian words are the aperiodic words with the smallest possible abelian complexity; i.e., if ${\bf w}$ is an aperiodic binary word then ${\bf w}$ is Sturmian if and only if $\rho_{\bf w}(n)=2$ for all $n\geq 1$ [5]. Dekking gave an explicit formula for $S(\mathcal{L}_{\bf w})$ for any Sturmian word ${\bf w}$ ; this formula implies that for any given ${\bf w}$ there are only finitely many maps $S$ such that $S(\mathcal{L}_{\bf w})$ is co-finite. For the Fibonacci word, Dekking characterized exactly the set of such maps $S$ . Given the close relationship between Sturmian words and Beatty sequences, we also mention the work of Steuding and Stumpf [11] concerning the Frobenius problem and Beatty sequences.
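To make the Sturmian case concrete, the following Python sketch counts Parikh vectors of short factors in a long prefix of the Fibonacci word. It assumes the standard presentation of the Fibonacci word as the fixed point of the morphism $0\to 01$, $1\to 0$; the helper names and the prefix length are illustrative choices. A finite prefix can only undercount, so the printed values are lower bounds on $\rho_{\bf w}(n)$ that happen to be attained for these small lengths.

```python
def fibonacci_word(length: int) -> str:
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def parikh_vectors(word: str, n: int, alphabet: str = "01") -> set:
    """Parikh vectors of all length-n factors occurring in the finite word."""
    return {
        tuple(word[i:i + n].count(c) for c in alphabet)
        for i in range(len(word) - n + 1)
    }


f = fibonacci_word(2000)
print([len(parikh_vectors(f, n)) for n in range(1, 16)])
```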

The general question we are interested in then is, “What conditions on the abelian complexity of ${\bf w}$ are sufficient to ensure that $S(\mathcal{L}_{\bf w})$ is co-finite for all maps $S$ ?” (Remember, we are assuming that $S(0)$ and $S(1)$ are relatively prime.) If $S(\mathcal{L}_{\bf w})$ is co-finite for all maps $S$ , we say that ${\bf w}$ has the Frobenius property. As previously noted, if ${\bf w}$ has maximal abelian complexity, then ${\bf w}$ has the Frobenius property, and if ${\bf w}$ is Sturmian, then ${\bf w}$ does not have the Frobenius property. In this paper we analyze some examples of words ${\bf w}$ whose abelian complexity is intermediate between these two extremes.

Finally, we note that the Frobenius problem can be extended from two given positive integers $a$ and $b$ to any number of given positive integers. Similarly, we can extend the notions defined above to words over larger alphabets. Recall that Dekking studied $S(\mathcal{L}_{\bf w})$ for Sturmian words ${\bf w}$ , which are infinite binary words with constant abelian complexity $\rho_{\bf w}(n)=2$ . We examine $S(\mathcal{L}_{\bf t})$ for a certain infinite ternary word ${\bf t}$ with constant abelian complexity $\rho_{\bf t}(n)=3$ .

To summarize, in the next sections we study:

• the paperfolding word ${\bf pf}$ , which has abelian complexity $\rho_{\bf pf}(n)=O(\log n)$ ; this word does not have the Frobenius property.

• a pure morphic binary word $\Phi$ with abelian complexity $\rho_{\Phi}(n)=\Theta(n^{\log_{5}2})$ ; this word has the Frobenius property.

• a balanced ternary word ${\bf t}$ with abelian complexity $\rho_{\bf t}(n)=3$ for all $n\geq 1$ ; this word does not have the Frobenius property.

2 The paperfolding word

In this section we examine whether the (ordinary) paperfolding word has the Frobenius property. This is a word whose abelian complexity function is unbounded, unlike that of the Sturmian words. For a nice introduction to the paperfolding words and their properties, see the series of papers by Dekking, Mendès France, and van der Poorten [7]. There are a number of equivalent definitions of the paperfolding word ${\bf pf}$ . If $w=w_{1}w_{2}\ldots w_{k}$ is a word over $\{0,1\}$ then the complement of $w$ is the word $\overline{w}=(1-w_{1})(1-w_{2})\ldots(1-w_{k})$ and the reversal of $w$ is the word $w^{R}=w_{k}w_{k-1}\ldots w_{1}$ . The word ${\bf pf}$ may be constructed as the limit of the following process: Let $f^{(1)}=0$ . Having constructed $f^{(n)}$ , we define $f^{(n+1)}:=f^{(n)}\;0\;\overline{f^{(n)}}^{R}$ . Then ${\bf pf}=\lim_{n\to\infty}f^{(n)}$ .

The next construction of the paperfolding word is known as the Toeplitz construction. We begin with a sequence of empty spaces and fill every second space with the alternating sequence $(01)^{\omega}$ . After infinitely many repetitions of this process, we obtain the ordinary paperfolding word ${\bf pf}$ . Beginning with $\_\ \_\ \_\;\_\;\_\;\_\;\ldots$ , the first few steps in this process are

_ _ _ _ _ _ _ _ _ _ _ _ …

0 _ 1 _ 0 _ 1 _ 0 _ 1 _ …

0 0 1 _ 0 1 1 _ 0 0 1 _ …

0 0 1 0 0 1 1 _ 0 0 1 1 …

0 0 1 0 0 1 1 0 0 0 1 1 …

This construction implies the following recursive definition of ${\bf pf}=(f_{n})_{n\geq 1}$ :

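$$(f_{2n-1})_{n\geq 1}=(01)^{\omega}\quad\text{and}\quad(f_{2n})_{n\geq 1}={\bf pf}.\tag{2}$$

We may also define the $n$ -th term $f_{n}$ of ${\bf pf}$ from the binary representation of $n$ . Let $n=m\cdot 2^{j}$ be given, where $m$ is odd. Then define

$$f_{n}=\begin{cases}0&\text{if }m\equiv 1\pmod{4}\\1&\text{if }m\equiv 3\pmod{4}.\end{cases}$$

Madill and Rampersad [9] studied the abelian complexity of ${\bf pf}$ . They proved that $\rho_{\bf pf}(n)=O(\log n)$ ; however, it is also the case that $\rho_{\bf pf}$ takes the value $3$ infinitely often. In particular, we have

$$\rho_{\bf pf}(2^{n})=3\text{ for }n\geq 1.\tag{3}$$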

This can be proved by induction on $n$ , using [9, Claim 5] (which states that $\rho_{\bf pf}(4m)=\rho_{\bf pf}(2m)$ ). As we will see, these low values of the abelian complexity function prevent ${\bf pf}$ from having the Frobenius property.
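The definitions of ${\bf pf}$ above can be compared directly on a finite prefix. The following Python sketch builds the word by the folding recursion, checks it against the formula based on $n=m\cdot 2^{j}$, and counts Parikh vectors at lengths $2^{n}$. The function names and the prefix length $2^{12}-1$ are illustrative choices; counts taken from a finite prefix are lower bounds on $\rho_{\bf pf}$.

```python
def pf_by_folding(steps: int) -> str:
    """f^(1) = 0 and f^(n+1) = f^(n) 0 followed by the reversed complement of f^(n)."""
    word = "0"
    for _ in range(steps - 1):
        flipped = "".join("1" if c == "0" else "0" for c in reversed(word))
        word = word + "0" + flipped
    return word


def pf_term(n: int) -> int:
    """f_n from n = m * 2^j with m odd: 0 if m = 1 (mod 4), 1 if m = 3 (mod 4)."""
    while n % 2 == 0:
        n //= 2
    return 0 if n % 4 == 1 else 1


def abelian_complexity(word: str, n: int) -> int:
    """Number of distinct zero-counts among length-n factors of a binary word."""
    zeros = word[:n].count("0")
    seen = {zeros}
    for i in range(n, len(word)):
        # Slide the window one step: word[i] enters, word[i - n] leaves.
        zeros += (word[i] == "0") - (word[i - n] == "0")
        seen.add(zeros)
    return len(seen)


pf = pf_by_folding(12)  # length 2^12 - 1
assert pf == "".join(str(pf_term(n)) for n in range(1, len(pf) + 1))
print([abelian_complexity(pf, 2 ** n) for n in range(1, 6)])
```

For a binary word the Parikh vector of a length-$n$ factor is determined by its number of zeroes, which is why the sketch tracks only that count.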

We define $\Delta:\mathcal{L}_{\bf pf}\to\mathbb{Z}$ by $\Delta(w)=|w|_{0}-|w|_{1}$ and $M:\mathbb{N}\to\mathbb{Z}$ by $M(n)=\max\{\Delta(\mathcal{L}_{n,\bf pf})\}$ .

Example 1.

For $n=2$ we have $\mathcal{L}_{n,\bf pf}=\left\{00,01,10,11\right\}$ , $\psi(\mathcal{L}_{n,\bf pf})=\left\{(2,0),(1,1),(0,2)\right\}$ , $\Delta(\mathcal{L}_{n,\bf pf})=\left\{2,0,-2\right\}$ , and $M(n)=2$ .

Note that for any $w\in\mathcal{L}_{n,\bf pf}$ we have $\overline{w}^{R}\in\mathcal{L}_{n,\bf pf}$ , so $-M(n)\leq\Delta(w)\leq M(n)$ . We need the following two facts [9, Claims 3 and 4 (and their proofs)]:

[TABLE]

Lemma 2.

For $n\geq 2$ , the Parikh vectors $\left(2^{n-1}\pm 2,2^{n-1}\mp 2\right)$ do not occur in $\psi(\mathcal{L}_{2^{n},\bf pf})$ .

Proof.

Since $(1,3)$ , $(2,2)$ , $(3,1)$ are all elements of $\psi(\mathcal{L}_{4,\bf pf})$ , we can apply the recursive definition (2) inductively to show that $\left(2^{n-1}\pm 1,2^{n-1}\mp 1\right)$ and $(2^{n-1},2^{n-1})$ are elements of $\psi(\mathcal{L}_{2^{n},\bf pf})$ . From (3), we see that these three vectors are the only vectors in $\psi(\mathcal{L}_{2^{n},\bf pf})$ , which establishes the claim. ∎

Theorem 3.

If $S(0)=a$ and $S(1)=b$ and $4\leq a<b$ then $\mathbb{N}\setminus S(\mathcal{L}_{\bf pf})$ is an infinite set. In particular, the word ${\bf pf}$ does not have the Frobenius property.

Proof.

Suppose that $2\leq a<b$ and consider a positive integer $m$ with representation $m=a\cdot(2^{n-1}-2)+b\cdot(2^{n-1}+2)$ for some (large) $n$ . By Lemma 2, ${\bf pf}$ does not contain any factor with Parikh vector $(2^{n-1}-2,2^{n-1}+2)$ , so we must look for another representation $m=a\cdot(2^{n-1}-2+tb)+b\cdot(2^{n-1}+2-ta)$ for some non-zero integer $t$ . This representation will correspond to a factor $w$ of length $|w|=2^{n}+t(b-a)$ with Parikh vector $(u,v)=(2^{n-1}-2+tb,2^{n-1}+2-ta)$ . Then $\Delta(w)=u-v=t(b+a)-4$ . Now by (4), we have

$$|t(a+b)-4|+1=|\Delta(w)|+1\leq M(2^{n}+t(b-a))+1=\rho(2^{n}+t(b-a)).\tag{6}$$

Furthermore, by (4)–(5), we have $\rho(|w|+1)\leq\rho(|w|)+1$ , which implies

$$\rho(2^{n}+t(b-a))\leq\rho(2^{n})+|t|(b-a)=3+|t|(b-a).\tag{7}$$

The inequalities (6) and (7) give

$$|t(a+b)-4|+1\leq 3+|t|(b-a).\tag{8}$$

If $t<0$ we get a contradiction immediately, since $|t(a+b)-4|=|t|(a+b)+4$ and (8) becomes $a|t|\leq-1$ , which is impossible. If $t>0$ we have $|t(a+b)-4|=|t|(a+b)-4$ (since $a+b\geq 4$ ), and (8) becomes $a|t|\leq 3$ . Since $t\geq 1$ we find that $a\leq 3$ . We conclude that if $a\geq 4$ , there are infinitely many $m\notin S(\mathcal{L}_{\bf pf})$ . ∎
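The integers $m$ used in this proof can be looked for experimentally. The following Python sketch, with the illustrative choice $a=4$, $b=5$, collects every value $S(u)$ up to a bound over the factors of a prefix of ${\bf pf}$ and reports whether the candidates $m=a(2^{n-1}-2)+b(2^{n-1}+2)$ show up. The helper names and the prefix length are assumptions of this sketch. A search in a finite prefix can confirm that a value is attained but cannot by itself prove that it is missing; here the absence agrees with the theorem.

```python
def pf_prefix(length: int) -> list:
    """First `length` terms of pf: f_n = 0 if m = 1 (mod 4), else 1, where n = m * 2^j."""
    terms = []
    for n in range(1, length + 1):
        while n % 2 == 0:
            n //= 2
        terms.append(0 if n % 4 == 1 else 1)
    return terms


def images_of_factors(word: list, a: int, b: int, bound: int) -> set:
    """All values S(u) <= bound over non-empty factors u of the finite word."""
    values = set()
    for start in range(len(word)):
        total = 0
        for end in range(start, len(word)):
            total += a if word[end] == 0 else b
            if total > bound:
                break
            values.add(total)
    return values


a, b = 4, 5
pf = pf_prefix(4096)
candidates = [a * (2 ** (n - 1) - 2) + b * (2 ** (n - 1) + 2) for n in (3, 4, 5)]
found = images_of_factors(pf, a, b, max(candidates))
print([(m, m in found) for m in candidates])
```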

3 A binary word with abelian complexity $\Theta(n^{\log_{5}2})$

In the last section we saw that the ordinary paperfolding word ${\bf pf}$ does not have the Frobenius property, and that in this case this is due to the fact that $\lim\inf_{n\to\infty}\rho_{\bf pf}(n)$ is bounded. This suggests that to produce an (interesting) example of an infinite word with the Frobenius property, we should consider a word $\Phi$ with less than maximal abelian complexity but for which

$$\liminf_{n\to\infty}\rho_{\Phi}(n)=\infty.$$

Let $\phi:\left\{0,1\right\}^{*}\to\left\{0,1\right\}^{*}$ be the morphism that sends $0\to 00101$ and $1\to 11011$ . Let $\Phi$ be the fixed point of $\phi$ that starts with $0$ : that is, let $\Phi=\phi^{\omega}(0)=\lim_{n\to\infty}\phi^{n}(0).$

For a general morphism $h:\{0,1,\ldots,k-1\}^{*}\to\{0,1,\ldots,k-1\}^{*}$ we define the incidence matrix of $h$ as the matrix $M_{h}$ whose $i^{th}$ column is the Parikh vector of $h(i)$ . Blanchet-Sadri et al. [4] conducted an extensive study of the asymptotic abelian complexities of binary words generated by iterating morphisms. We will make use of several ideas from their paper in this section. Following the notation of [4], we will use $z(u)$ to denote the number of zeroes that appear in the factor $u$ . Let $z_{0}=z(\phi(0))$ and $z_{1}=z(\phi(1))$ . We will also use $z_{M}(n)$ (resp. $z_{m}(n)$ ) to denote the maximum (resp. minimum) number of zeroes among factors of length $n$ in $\Phi$ . The difference and delta functions are defined in [4] for a general $\ell$ -uniform morphism; for our morphism $\phi$ we have $d=|z_{0}-z_{1}|=2$ and $\Delta=z_{M}(\ell)-\max\left\{z_{0},z_{1}\right\}=3-3=0$ .

Example 4.

For $\phi$ as defined above, we have $\Phi=0010100101110110010111011\cdots$ , $z_{0}=3$ , $z_{1}=1$ , $d=2$ , $\Delta=0$ , $z_{m}(2)=0$ , $z_{M}(2)=2$ , and

$$M_{\phi}=\begin{pmatrix}3&1\\2&4\end{pmatrix}.$$
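The quantities in this example can be reproduced mechanically. The following Python sketch iterates $\phi$ twice from $0$ and computes $z_{0}$, $z_{1}$, $d$ and the incidence matrix, following the convention stated above that column $i$ is the Parikh vector of $\phi(i)$. The names `PHI`, `iterate` and `incidence_matrix` are illustrative.

```python
PHI = {"0": "00101", "1": "11011"}


def iterate(morphism: dict, seed: str, times: int) -> str:
    """Apply the morphism `times` times to `seed`."""
    word = seed
    for _ in range(times):
        word = "".join(morphism[c] for c in word)
    return word


def incidence_matrix(morphism: dict, alphabet: str = "01") -> list:
    """Matrix (as a list of rows) whose column i is the Parikh vector of the image of letter i."""
    return [[morphism[col].count(row) for col in alphabet] for row in alphabet]


prefix = iterate(PHI, "0", 2)
z0, z1 = PHI["0"].count("0"), PHI["1"].count("0")
print(prefix, z0, z1, abs(z0 - z1), incidence_matrix(PHI))
```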

From [4, Theorem 7] we get that $\rho_{\Phi}(n)=\Theta(n^{\log_{5}2})$ , which is certainly not maximal. The following is the main result of this section.

Theorem 5.

The word $\Phi$ has the Frobenius property.

We need a preliminary result. In the proof of this result, and again later in this section, we will need to determine, by computer search, the Parikh vectors of all factors of $\Phi$ of length $r$ for $r$ up to some specified bound. In order to perform this computation we make use of the following fact:

If $r\leq 5^{t}$ for some $t\in\mathbb{N}$ , then each factor of $\Phi$ of length $r$ appears in some $\phi^{t}(x)$ , where $|x|=2$ .

We also note that when performing such a computation there is no need to save all Parikh vectors for factors of length $r$ : indeed, by [14, Lemma 2.1], the Parikh vectors of factors of length $r$ in $\Phi$ are completely determined by the pair $(z_{m}(r),z_{M}(r))$ .
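A minimal sketch of such a computation, in Python, is given below. It relies on the fact stated above (every factor of length $r\leq 5^{t}$ appears in some $\phi^{t}(x)$ with $|x|=2$) and on the observation that all four binary words of length 2 occur in the first 25 letters of $\Phi$. The function names are illustrative, and this is not the authors' program.

```python
PHI = {"0": "00101", "1": "11011"}


def iterate(seed: str, times: int) -> str:
    """Apply the morphism phi `times` times to `seed`."""
    word = seed
    for _ in range(times):
        word = "".join(PHI[c] for c in word)
    return word


def zero_range(r: int) -> tuple:
    """Return (z_m(r), z_M(r)): min and max number of zeroes in length-r factors."""
    t = 0
    while 5 ** t < r:
        t += 1
    lo, hi = r, 0
    for x in ("00", "01", "10", "11"):  # the length-2 factors of the fixed point
        word = iterate(x, t)
        zeros = word[:r].count("0")
        lo, hi = min(lo, zeros), max(hi, zeros)
        for i in range(r, len(word)):
            # Slide the window one step: word[i] enters, word[i - r] leaves.
            zeros += (word[i] == "0") - (word[i - r] == "0")
            lo, hi = min(lo, zeros), max(hi, zeros)
    return lo, hi


print([zero_range(r) for r in range(1, 11)])
```

Only the pair of extremes is stored for each length, in line with the remark that this pair determines all Parikh vectors of that length.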

Proposition 6.

For each integer $C\geq 4$ , define $N_{C}=132\cdot 5^{C-4}$ . Then

1. $z_{M}(n)\geq\frac{n}{3}+C$ whenever $n\geq N_{C}$ , and

2. $z_{m}(n)\leq\frac{n}{3}-C$ whenever $n\geq N_{C}$ .

Proof.

We prove part 1; part 2 is proven similarly with $N_{4}=132$ . For clarity, we parametrize the property

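$$P(j,C):\quad\left[5^{j}\cdot N_{C}\leq n\leq 5^{j+1}\cdot N_{C}\Rightarrow z_{M}(n)\geq\frac{n}{3}+C\right]$$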

Clearly, if $P(j,C)$ holds for a given $C$ and for all $j\in\mathbb{N}$ then our proposition holds for that $C$ . Thus, we proceed by double-induction on $j$ and $C$ .

We fix $N_{4}=29$ and verify by computer that $29\leq n\leq 145\Rightarrow z_{M}(n)\geq\frac{n}{3}+4$ and thus $P(0,4)$ is satisfied for $N_{4}=29$ . Suppose that $P(j,4)$ holds for some $j\in\mathbb{N}$ and let $5^{j+1}\cdot N_{4}\leq n\leq 5^{j+2}\cdot N_{4}$ . We may write $n=5k+r$ for some integers $k,r$ with $0\leq r\leq 4$ . Then $5^{j}\cdot N_{4}\leq k+\frac{r}{5}\leq 5^{j+1}\cdot N_{4}$ and we have two cases: either $k<5^{j+1}\cdot N_{4}$ or $k=5^{j+1}\cdot N_{4}$ .

If $k<5^{j+1}\cdot N_{4}$ then $5^{j}\cdot N_{4}\leq k+1\leq 5^{j+1}\cdot N_{4}$ and by $P(j,4)$ we have $z_{M}(k+1)\geq\frac{k+1}{3}+4$ . One of the inequalities (for an $\ell$ -uniform morphism) in the proof of [4, Proposition 18] is

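$$z_{M}(\ell k+r)\geq d\,z_{M}(k+1)+z_{1}(k+1)+\Delta-z_{M}(\ell-r),$$

which, after substituting the appropriate values for the constants for $\phi$ , becomes

$$z_{M}(\ell k+r)\geq 2z_{M}(k+1)+k+1-z_{M}(5-r)\geq 2z_{M}(k+1)+k-2,$$

since $z_{M}(1)\leq\cdots\leq z_{M}(5)=3$ . Thus, we have

$$\begin{aligned}z_{M}(n)&\geq 2z_{M}(k+1)+k-2\\&\geq 2\left(\frac{k+1}{3}+4\right)+k-2\\&=\frac{1}{3}(5k+4)+\frac{16}{3}\\&\geq\frac{1}{3}(5k+r)+4\\&=\frac{n}{3}+4,\end{aligned}$$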

as required.

If $k=5^{j+1}\cdot N_{4}$ then by [4, Lemma 13] we get

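$$\begin{aligned}z_{M}(n)&=2z_{M}(5^{j+1}N_{4})+5^{j+1}N_{4}\\&\geq 2\left(\frac{5^{j+1}N_{4}}{3}+4\right)+5^{j+1}N_{4}\\&=\frac{5}{3}(5^{j+1}N_{4})+8\\&=\frac{n}{3}+8\\&\geq\frac{n}{3}+4,\end{aligned}$$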

as required, and so in either case, $P(j+1,4)$ holds and by induction we have $P(j,4)$ for all $j\in\mathbb{N}$ .

Suppose that there exist $C\geq 4$ and $N_{C}$ with $(\forall j\in\mathbb{N})[P(j,C)]$ . Now if $n\geq 5N_{C}$ we may write $n=5k+r$ where $k\geq N_{C}$ and $0\leq r\leq 4$ . Then we have

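$$\begin{aligned}z_{M}(n)&\geq 2z_{M}(k+1)+k-2\\&\geq 2\left(\frac{k+1}{3}+C\right)+k-2\\&=\frac{1}{3}(5k+4)+2C-\frac{8}{3}\\&\geq\frac{1}{3}(5k+r)+C+1\\&=\frac{n}{3}+(C+1),\end{aligned}$$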

so $N_{C+1}=5N_{C}$ and the result holds by induction. ∎

Corollary 7.

For each $C\geq 4$ and $N_{C}=132\cdot 5^{C-4}$ , we have

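$$\left\{\left(\left\lfloor\frac{n}{3}\right\rfloor+D,\;n-\left\lfloor\frac{n}{3}\right\rfloor-D\right):-C\leq D\leq C\right\}\subseteq\psi(\mathcal{L}_{n,\Phi})$$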

for all $n\geq N_{C}$ .

We will use Corollary 7 to show that, given $a$ and $b$ , every sufficiently large integer has a representation $ax+by$ where $(x,y)\in\psi(\mathcal{L}_{\Phi})$ . Theorem 5 therefore follows from the next lemma.

Lemma 8.

Let $C=\left\lceil\max\left\{1+\frac{a+2b}{3},b,\frac{b-a}{3},4\right\}\right\rceil$ . Then every integer

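$$M\geq M_{a,b}:=\max\left\{(a+2b)\cdot\max\{a,b\},\;\frac{a+2b}{3}\left(132\cdot 5^{C-4}+|a-b|\right)\right\}$$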

has a representation $M=a(x-tb)+b(y+ta)$ where $(x-tb,y+ta)\in\psi(\mathcal{L}_{\Phi})$ for some $t\in\mathbb{Z}$ .

Proof.

Suppose that $(a,b)=1$ is given and let $M=ax+by$ for some non-negative integers $x,y$ (note that $M$ is larger than the quantity from (1), so such a representation exists). For each $t\in\mathbb{Z}$ we have $M=a(x-tb)+b(y+ta)$ . Our aim is to show that there is a choice of $t$ for which $(x-tb,y+ta)\in\psi(\mathcal{L}_{\Phi})$ . Note that, from Corollary 7, if we look at large enough factors of $\Phi$ we eventually obtain a factor that is roughly one third 0’s. Thus, if we define $n(t)=x+y+t(a-b)$ , then we seek a $t_{0}$ such that $x-t_{0}b=\frac{1}{3}n(t_{0})$ and thus let $t_{0}=\frac{2x-y}{2b+a}$ . However, $t_{0}$ is not necessarily an integer, so we will use either the floor or ceiling $\lfloor t_{0}\rceil$ and show the existence of a subword with length $n(\lfloor t_{0}\rceil)$ and $x-\lfloor t_{0}\rceil b$ zeroes.
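The choice of $t_{0}$ can be followed on a concrete instance. The following Python sketch starts from an arbitrary representation $M=ax+by$, computes $t_{0}=\frac{2x-y}{2b+a}$ exactly, and shifts by its floor (the proof allows the floor or the ceiling; the floor is an arbitrary choice here). The function name and the sample values $M=1000$, $a=3$, $b=5$ are illustrative, and the sketch only produces the target Parikh vector; it does not check that a matching factor occurs in $\Phi$.

```python
from fractions import Fraction
from math import floor


def shifted_representation(M: int, a: int, b: int) -> tuple:
    """Pick t near t0 so that (x - t*b, y + t*a) has about one third zeroes."""
    # Any starting representation M = a*x + b*y with x, y >= 0 will do.
    x = next(x for x in range(M // a + 1) if (M - a * x) % b == 0)
    y = (M - a * x) // b
    t0 = Fraction(2 * x - y, 2 * b + a)
    t = floor(t0)  # the proof allows either the floor or the ceiling
    return x - t * b, y + t * a


zeros, ones = shifted_representation(1000, 3, 5)
print(zeros, ones, zeros / (zeros + ones))
```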

We first claim that $x-\lfloor t_{0}\rceil b$ and $y+\lfloor t_{0}\rceil a$ are nonnegative (and thus it is possible to speak of a factor with length $n(\lfloor t_{0}\rceil)$ and $x-\lfloor t_{0}\rceil b$ zeroes). We have

$$x-t_{0}b=\frac{1}{3}n(t_{0})=\frac{1}{2}(y+t_{0}a)$$

and so $x-t_{0}b$ , $n(t_{0})$ , and $y+t_{0}a$ each have the same sign. As well,

$$x-t_{0}b=\frac{ax+by}{2b+a}=\frac{M}{2b+a}\geq 0$$

so the three quantities are nonnegative. Now note that replacing $t_{0}$ with $\lfloor t_{0}\rceil$ only changes each expression by a small amount: $\left|(x-t_{0}b)-(x-\lfloor t_{0}\rceil b)\right|<b$ and $\left|(y+t_{0}a)-(y+\lfloor t_{0}\rceil a)\right|<a$ . Thus if $M>(2b+a)\cdot\max\{a,b\}$ then we have

$$x-\lfloor t_{0}\rceil b>(x-t_{0}b)-b=\frac{M}{2b+a}-b>\left(\frac{2b+a}{2b+a}\right)\max\{a,b\}-b\geq 0$$

and

[TABLE]

and thus both $x-\lfloor t_{0}\rceil b$ and $y+\lfloor t_{0}\rceil a$ are nonnegative as required.

We now show that the corresponding factor exists within $\Phi$ . We have two cases:

Case 1: $a>b$ . Then we have

[TABLE]

and

[TABLE]

so

[TABLE]

Case 2: $a<b$ . Then we have

[TABLE]

and

[TABLE]

so

[TABLE]

In either case, we may take $C=\left\lceil\max\left\{1+\frac{a+2b}{3},b,\frac{b-a}{3},4\right\}\right\rceil$ and since

[TABLE]

by Corollary 7 we have that there exists a subword $w$ of $\Phi$ such that $|w|=n(\lfloor t_{0}\rceil)$ and $\psi(w)=(x-\lfloor t_{0}\rceil b,y+\lfloor t_{0}\rceil a)$ .

∎

As noted, Theorem 5 follows directly from Lemma 8. However, the bound on $M$ described in Lemma 8 is certainly not optimal; the maximum non-representable integer may be much smaller than $M_{a,b}$ . We therefore now compute exactly the largest value of $\mathbb{N}\setminus S(\mathcal{L}_{\Phi})$ for several small values of $a,b$ .

We compute the complement of $S(\mathcal{L}_{\Phi})$ based on the Parikh vectors of factors of length up to

[TABLE]

and thus for any integer $M<M_{a,b}$ , if it is representable then its representation should appear among the Parikh vectors of factors up to length $r_{a,b}$ . For convenience, we collected the Parikh vectors of factors up to length $r_{0}=\max\{r_{a,b}:1\leq a,b\leq 6\}=16500<5^{7}$ and then computed $S(\mathcal{L}_{\Phi})$ and its complement only using the Parikh vectors of factors of the appropriate lengths. The results are reported in Table 1.

[TABLE]

4 A ternary word with constant abelian complexity

Dekking [6] proved that Sturmian words do not have the Frobenius property. If ${\bf s}$ is a Sturmian word, then ${\bf s}$ is balanced: i.e., for all letters $a\in\{0,1\}$ , we have $||u|_{a}-|v|_{a}|\leq 1$ whenever $u$ and $v$ are factors of ${\bf s}$ of the same length. Furthermore, as noted in the introduction, we have $\rho_{\bf s}(n)=2$ for all $n\geq 1$ , and indeed, the aperiodic words with this abelian complexity function are exactly the Sturmian words. Dekking also performed a detailed analysis of $S(\mathcal{L}_{\bf f})$ for the Fibonacci word ${\bf f}$ defined as follows.

Definition 9 (Fibonacci Word).

Let $\phi=\frac{1}{2}({1+\sqrt{5}})=1.618\cdots$ and let $\alpha=2-\phi=0.38196\cdots$ . We define

[TABLE]

We also note that

[TABLE]

is the sequence obtained from ${\bf f}$ by applying the map $0\to 2$ .

Dekking showed that $S(\mathcal{L}_{\bf f})$ is co-finite except when $(S(0),S(1))\in\{(1,1),(1,2),(1,3),(2,1)\}$ . If one wished to extend Dekking’s analysis to ternary words, then in this setting, the natural ternary analogues of Sturmian words are aperiodic ternary words ${\bf x}$ with abelian complexity $\rho_{\bf x}(n)=3$ for $n\geq 1$ . Currently there is no complete characterization of such words; however, Richomme, Saari, and Zamboni [14] proved that if ${\bf x}$ is aperiodic, ternary, and balanced, then $\rho_{\bf x}(n)=3$ for $n\geq 1$ .

Hubert [8] gave a useful characterization of aperiodic balanced words. The reader may consult Hubert’s paper for more details. Here, we will use his characterization to construct a word ${\bf t}$ from the Fibonacci word ${\bf f}$ with abelian complexity $3$ for all lengths. For ease of notation, let $T$ be the operation that sends $1\to 1$ and every second $0\to 2$ , starting with the second $0$ . Similarly, let $\overline{T}$ be the operation that sends $1\to 1$ and every second $0\to{2}$ , starting with the first $0$ .

Example 10.

Let $\chi=01010101\cdots$ . Then $T(\chi)=01210121\cdots$ and $\overline{T}(\chi)=21012101\cdots$ .
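The two operations are simple to implement on finite words. The following Python sketch reproduces this example on the first eight letters of $\chi$; the helper `alternate_zeros` and the names `T` and `T_bar` are illustrative.

```python
def alternate_zeros(word: str, start_with_first: bool) -> str:
    """Replace every second 0 by 2; T starts with the second 0, T-bar with the first."""
    out, replace_next = [], start_with_first
    for c in word:
        if c == "0":
            out.append("2" if replace_next else "0")
            replace_next = not replace_next
        else:
            out.append(c)
    return "".join(out)


def T(word: str) -> str:
    return alternate_zeros(word, start_with_first=False)


def T_bar(word: str) -> str:
    return alternate_zeros(word, start_with_first=True)


chi = "01" * 4
print(T(chi), T_bar(chi))
```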

We define

[TABLE]

and we immediately have the following.

Lemma 11.

$\rho^{ab}_{\bf t}(n)=3$ for all $n\geq 1$ .

Proof.

By [8] (and its English explanation in [13, Section 4]), the word ${\bf t}$ is an aperiodic, uniformly recurrent, balanced word on $\{0,1,2\}$ , so the result follows from [14, Theorem 4.2]. ∎
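The lemma can be checked experimentally on a prefix. The following Python sketch assumes that ${\bf t}$ is obtained by applying $T$ to ${\bf f}$ (the displayed definition is not reproduced in this copy of the text, so this is an assumption) and that ${\bf f}$ can be generated by the standard morphism $0\to 01$, $1\to 0$. It counts the Parikh vectors of short factors in a prefix of length 5000; the names and the prefix length are illustrative, and counts from a finite prefix are lower bounds.

```python
def fibonacci_word(length: int) -> str:
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def T(word: str) -> str:
    """Replace every second 0 by 2, starting with the second 0."""
    out, seen_zeros = [], 0
    for c in word:
        if c == "0":
            seen_zeros += 1
            out.append("2" if seen_zeros % 2 == 0 else "0")
        else:
            out.append(c)
    return "".join(out)


def parikh_vectors(word: str, n: int) -> set:
    """Parikh vectors over {0, 1, 2} of all length-n factors of the finite word."""
    return {
        tuple(word[i:i + n].count(c) for c in "012")
        for i in range(len(word) - n + 1)
    }


t = T(fibonacci_word(5000))  # assumed definition of t, see the lead-in
print(t[:16], [len(parikh_vectors(t, n)) for n in range(1, 13)])
```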

We will also make use of the following property.

Definition 12 (WELLDOC Property [3]).

We say that an infinite aperiodic word $\lambda$ on $A=\{0,1,\ldots,d-1\}$ has well distributed occurrences (WELLDOC) if for every $m\in\mathbb{N}$ and every subword $w$ of $\lambda$ we have

[TABLE]

Sturmian words have the WELLDOC property [3, Theorem 3.3].

Definition 13.

For a subset $A\subseteq\mathbb{R}$ and a constant $c\in\mathbb{R}$ we define $c+A:=\{c+a:a\in A\}$ .

Lemma 14.

$\mathcal{L}_{\bf t}=T(\mathcal{L}_{\bf f})\cup\overline{T}(\mathcal{L}_{\bf f})$ .

Proof.

Certainly $\mathcal{L}_{\bf t}\subseteq T(\mathcal{L}_{\bf f})\cup\overline{T}(\mathcal{L}_{\bf f})$ , since any factor of ${\bf t}$ is obtained by taking a factor of ${\bf f}$ and replacing every other $0$ with a $2$ . Let $t_{0}\in T(\mathcal{L}_{\bf f})\cup\overline{T}(\mathcal{L}_{\bf f})$ . Without loss of generality, say $t_{0}=T(w)$ for some $w\in\mathcal{L}_{\bf f}$ . Then by the WELLDOC property (with $m=2$ ), there is an occurrence of $w$ in ${\bf f}$ where it is preceded by an even number of 0’s and an occurrence where it is preceded by an odd number of 0’s. Then $T(w)$ and $\overline{T}(w)$ both occur as subwords of ${\bf t}$ . ∎

It is well-known that $0{\bf f}[1,n]\in\mathcal{L}_{\bf f}$ and $1{\bf f}[1,n]\in\mathcal{L}_{\bf f}$ . Thus we have $T(0{\bf f}[1,n])$ , $T(1{\bf f}[1,n])$ , $\overline{T}(0{\bf f}[1,n])$ , and $\overline{T}(1{\bf f}[1,n])$ in $\mathcal{L}_{\bf t}$ . We will refer to these as the generating prefixes later on. Since we only have 3 possible Parikh vectors for each $n$ , exactly two of these must be equal. This equality depends on the parity of $|{\bf f}[1,n]|_{0}$ .

Theorem 15.

For $n\geq 1$ define $h(n)=\lfloor(n+1)\alpha\rfloor-\lfloor\alpha\rfloor$ . If $|{\bf f}[1,n]|_{0}$ is odd then

[TABLE]

If $|{\bf f}[1,n]|_{0}$ is even then

[TABLE]

Proof.

First note that

[TABLE]

If $|{\bf f}[1,n]|_{0}$ is odd, it is clear that

[TABLE]

since exactly half of the 0’s in $0{\bf f}[1,n]$ will become 2’s after we apply $T$ . For $1{\bf f}[1,n]$ , we have

[TABLE]

By Lemma 14, we get the third Parikh vector by swapping the first and last components.

If $|{\bf f}[1,n]|_{0}$ is even, we apply a similar line of reasoning to $\psi(T(1{\bf f}[1,n]))=\psi(\overline{T}(1{\bf f}[1,n]))$ , $\psi(T(0{\bf f}[1,n]))$ , and $\psi(\overline{T}(0{\bf f}[1,n]))$ , which gives the above. ∎

Let $S:\mathcal{L}_{\bf t}\to\mathbb{N}$ be a morphism with ${S(0)=S_{0}},\ {S(1)=S_{1}}$ , and ${S(2)=S_{2}}$ . As always, we assume that $\gcd(S_{0},S_{1},S_{2})=1$ . Define

[TABLE]

(Note that $2m(n)$ is a generalized Beatty sequence, in the sense of Allouche and Dekking [1].) Using the fact that $\lfloor-x\rfloor=-\lfloor x\rfloor-1$ for $x\notin\mathbb{Z}$ , we see that $\lfloor n\alpha\rfloor=2n-\lfloor n\phi\rfloor-1$ . Using this identity and the fact that $S(w)=S_{0}|w|_{0}+S_{1}|w|_{1}+S_{2}|w|_{2}$ , we obtain (after some algebra) the following corollary of Theorem 15.

Corollary 16.

If $|{\bf f}[1,n-1]|_{0}$ is odd then

[TABLE]

If $|{\bf f}[1,n-1]|_{0}$ is even then

[TABLE]

Define

[TABLE]

We will refer to the $m(n)$ ’s as main terms and the $k_{i}$ ’s as offsets.

Theorem 17.

Define $\mu(n)=[(n-1-\lfloor(n-1)\alpha\rfloor)\bmod 2]$ . Then $S(\mathcal{L}_{n,{\bf t}})=\{g_{1}(n),g_{2}(n),g_{3}(n)\}$ , where

[TABLE]

Proof.

Note that $e_{i}+(o_{i}-e_{i})\mu(n)$ is $o_{i}$ when $|{\bf f}[1,n-1]|_{0}$ is odd and $e_{i}$ when $|{\bf f}[1,n-1]|_{0}$ is even. We therefore obtain the equations

[TABLE]

from Corollary 16. ∎

Theorem 18.

The word ${\bf t}$ does not have the Frobenius property.

Proof.

From Theorem 17 we see that among the first $\max\{g_{1}(n),g_{2}(n),g_{3}(n)\}$ natural numbers, at most $3n$ are in $S(\mathcal{L}_{\bf t})$ . From (12) and Theorem 17 we find that there is a constant $C$ such that for $n\geq 1$ , we have

[TABLE]

Let

[TABLE]

denote the natural density of $S(\mathcal{L}_{\bf t})$ . Then

[TABLE]

The denominator of this last expression is approximately $0.618(S_{0}+S_{2})+0.764S_{1}$ . Since each $S_{i}$ is at least $1$ , we see that if any $S_{i}$ is at least $8$ , this denominator is larger than $6$ and hence $\delta<1$ . It follows that if $S_{i}\geq 8$ for some $i$ , then $S(\mathcal{L}_{\bf t})$ has an infinite complement. Thus ${\bf t}$ does not have the Frobenius property. ∎
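The numerical claim about the denominator can be checked by enumeration. The following Python sketch assumes that the rounded coefficients $0.618$ and $0.764$ stand for $(\sqrt{5}-1)/2$ and $3-\sqrt{5}$ (an assumption, since the exact expression is not reproduced in this copy of the text). Because the denominator increases with each $S_{i}$, it is enough to look at triples whose largest entry is exactly $8$.

```python
from itertools import product
from math import sqrt

GOLDEN_MINUS_ONE = (sqrt(5) - 1) / 2  # about 0.618 (assumed exact value)
TWO_ALPHA = 3 - sqrt(5)               # about 0.764 (assumed exact value)


def denominator(s0: int, s1: int, s2: int) -> float:
    """Approximate denominator 0.618 * (S0 + S2) + 0.764 * S1 from the proof."""
    return GOLDEN_MINUS_ONE * (s0 + s2) + TWO_ALPHA * s1


# Smallest denominator over all triples in {1..8}^3 whose largest entry is 8.
smallest = min(
    (denominator(*s), s)
    for s in product(range(1, 9), repeat=3)
    if max(s) == 8
)
print(smallest)
```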

Next, we determine the maps $S$ for which $S(\mathcal{L}_{\bf t})$ is co-finite. We only have to consider those $S$ for which $S_{i}\leq 7$ for $i=0,1,2$ . We will show that it is possible to determine if $S(\mathcal{L}_{\bf t})$ is co-finite by checking (by computer) a finite initial segment of the sequence $m(n)$ . We begin with an analysis of the first difference sequence

[TABLE]

Recalling that $(\lfloor(n+1)\phi\rfloor-\lfloor n\phi\rfloor)_{n\geq 1}$ is equal to the Fibonacci sequence over $\{2,1\}$ , we see that $\Delta m(n)$ is equal to the Fibonacci sequence over $\{k_{1}+S_{1},S_{1}\}$ . Let $F=(\Delta m(n))_{n\geq 1}$ ; i.e., $F[n]=k_{1}+S_{1}$ if ${\bf f}[n]=0$ and $F[n]=S_{1}$ if ${\bf f}[n]=1$ . There is one degenerate case to consider here, namely, the case where $k_{1}=0$ . In this case $F$ is constant with each term equal to $S_{1}$ . However, the analysis below is not affected by this degenerate situation.

Let

[TABLE]

and for a given factor $F[i,j]$ of $F$ , let

[TABLE]

Definition 19 (Semi-image).

We define the even semi-image of $F[i,j]$ as

[TABLE]

and the odd semi-image of $F[i,j]$ as $(k_{1}+S_{1})+\mathbb{S}^{1}(F[i,j])$ where

[TABLE]

These formulas are analogous to the ones from Theorem 17, but instead of using the generating prefixes we can use any factor of $F$ . Since, by the WELLDOC property, each factor of $F$ appears with either parity of $(k_{1}+S_{1})$ -steps prior to it, we must have two semi-images; the even (resp. odd) semi-image represents the image of the factor with an even (resp. odd) number of $(k_{1}+S_{1})$ -steps before it. The odd semi-image is shifted by $k_{1}+S_{1}$ to account for non-integral $k_{1}$ but the same lines of reasoning will apply.

Definition 20 (Semi-complement).

We define the even semi-complement as

[TABLE]

and the odd semi-complement as

[TABLE]

Example 21.

Consider the triple $(1,1,2)$ . The odd offsets are $\{0.5,-0.5,0.5\}$ , the even offsets are $\{0,1,0\}$ ,

[TABLE]

and

[TABLE]

Let $w=F[1,4]=(1.5,1,1.5,1.5)$ . Then we have $k=1$ , $I(w)=[2,4.5]$ , $\mathbb{S}^{0}(w)=\{1,2,3,4,5,6\}$ , and $\mathbb{K}^{0}(w)=\{2,3,4\}\setminus\{1,2,3,4,5,6\}=\emptyset$ . We also have $\mathbb{S}^{1}(w)=\{1.5,2.5,3.5,4.5,5.5,6.5\}$ , and $\mathbb{K}^{1}(w)=\{4,5,6\}\setminus\{3,4,5,6,7,8\}=\emptyset$ .

Theorem 22.

Fix $(S_{0},S_{1},S_{2})$ and let $l=\left\lceil\frac{2(k+1)}{\min\{S_{1},k_{1}+S_{1}\}}\right\rceil$ . Then the complement of $S(\mathcal{L}_{\bf t})$ is finite if and only if $\mathbb{K}^{0}(F[i,i+l-1])=\mathbb{K}^{1}(F[i,i+l-1])=\emptyset$ for all $i\geq 1$ .

We need two preliminary lemmas.

Lemma 23.

Let $R(F[i,i+l-1])=\sum_{q=1}^{i-1}F[q]+I(F[i,i+l-1])$ . Then

[TABLE]

Proof.

It suffices to show that

[TABLE]

which happens if and only if

[TABLE]

Since we have $\sum_{q=i+1}^{i+l}F[q]\geq l\min\{S_{1},k_{1}+S_{1}\}\geq 2(k+1)$ , we are done. ∎
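The final chain of inequalities uses only the definition of $l$ from Theorem 22. Writing $m=\min\{S_{1},k_{1}+S_{1}\}$ (a shorthand introduced here, with $m>0$ assumed), every term of $F$ is at least $m$, so the step can be spelled out as

$$\sum_{q=i+1}^{i+l}F[q]\;\geq\;l\,m\;=\;\left\lceil\frac{2(k+1)}{m}\right\rceil m\;\geq\;\frac{2(k+1)}{m}\,m\;=\;2(k+1).$$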

Lemma 24.

If $x\in R(F[i,i+l-1])$ then

[TABLE]

for every $s\geq 0$ and $j=1,2,\ldots,6$ .

Proof.

For any $s\geq 0$ and $j=1,2,\ldots,6$ we have

[TABLE]

as required. ∎

Proof of Theorem 22.

We begin with the converse. First note that if $x$ is a factor of $F$ and $|x|=l$ then $\sum x_{i}>2(k+1)$ so $I(x)$ is nonempty. If every semi-complement is empty, then there exists a sequence $(r(i))_{i\geq 1}$ on $\{0,1\}$ such that

[TABLE]

By Lemma 23, we get that $S(\mathcal{L}_{\bf t})$ is co-finite.

Now suppose that for some $i$ the set $\mathbb{K}^{0}(F[i,i+l-1])$ (resp. $\mathbb{K}^{1}(F[i,i+l-1])$ ) is non-empty, and so one of the semi-images ‘misses’ an integer $x_{i}$ . By the WELLDOC property, there exist infinitely many indices $\left\{i_{r}:r\in\mathbb{N}\right\}$ where $F[i_{r},i_{r}+l-1]=F[i,i+l-1]$ and $|F[1,i_{r}-1]|_{0}$ is even (resp. odd). Thus, for each $r$ there exists an integer $x_{i_{r}}\in R(F[i_{r},i_{r}+l-1])$ such that $x_{i_{r}}\notin\sum_{q=1}^{i_{r}-1}F[q]+\mathbb{S}^{0}(F[i_{r},i_{r}+l-1])$ (resp. $x_{i_{r}}\notin\sum_{q=1}^{i_{r}-1}F[q]+\mathbb{S}^{1}(F[i_{r},i_{r}+l-1])$ ). By Lemma 24, $x_{i_{r}}\notin S(\mathcal{L}_{\bf t})$ . Thus the complement of $S(\mathcal{L}_{\bf t})$ is infinite. ∎

Note that by Lemma 14, our results are symmetric with respect to $S_{0}$ and $S_{2}$ and if $S_{0}=S_{2}$ then all of the results in [6] hold. As well, any triple with a greatest common divisor greater than one will have infinitely many elements in the complement of $S(\mathcal{L}_{\bf t})$ . Thus, in all of the following calculations we skip any triple $(x,y,z)$ where $\gcd(x,y,z)>1$ , $x=z$ , or where $(z,y,x)$ has already been evaluated.
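The three skip rules can be sketched in Python as follows. This is only an illustration of the filtering step, not the authors' program; the function name `candidate_triples` and the lexicographic enumeration order are assumptions, and the default bound $7$ comes from the restriction $S_{i}\leq 7$ established earlier.

```python
from math import gcd

def candidate_triples(bound=7):
    """Triples (S0, S1, S2) with entries in 1..bound that survive the skip rules."""
    seen = set()
    kept = []
    for x in range(1, bound + 1):
        for y in range(1, bound + 1):
            for z in range(1, bound + 1):
                if gcd(gcd(x, y), z) > 1:   # complement is automatically infinite
                    continue
                if x == z:                  # case S0 = S2, handled by [6]
                    continue
                if (z, y, x) in seen:       # mirror image already evaluated
                    continue
                seen.add((x, y, z))
                kept.append((x, y, z))
    return kept

triples = candidate_triples()
print(len(triples), triples[:5])
```

With this enumeration order every retained triple has $x<z$, since $(x,y,z)$ with $x<z$ is always reached before its mirror image $(z,y,x)$.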

For each triple, we first calculate $l=\left\lceil\frac{2(k+1)}{\min\{k_{1}+S_{1},S_{1}\}}\right\rceil$ and then calculate all $l+2$ factors$^{1}$ of length $l+1$ in$^{2}$ $F$ . We then calculate the semi-complements of each factor of $F$ . By Theorem 22, if we find a non-empty semi-complement, then the complement of $S(\mathcal{L}_{\bf t})$ is infinite; otherwise, the complement of $S(\mathcal{L}_{\bf t})$ is finite. We found 13 triples with finite complements. These are listed in Table 2.

Footnote 1. In the cases where $S_{0}+S_{2}=2S_{1}$ , i.e., $F$ is constant, we merely check the semi-image for the single factor $F[1,l+1]$ .

Footnote 2. Different letters may follow different occurrences of each factor. The extra term at the end allows us to account for all possible values of $F[j+1]$ when calculating $I(F[i,j])$ .

[TABLE]
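The enumeration step of this procedure (computing $l$ and listing the $l+2$ factors of length $l+1$) can be sketched in Python. The sketch is illustrative only: the names are ours, $k$ is passed in as a parameter because its definition is not restated here, and the prefix length of 500 is an assumption that is ample for small $l$ but is not a proven bound. The semi-complement test itself is not reproduced.

```python
from fractions import Fraction
from math import ceil

def fibonacci_word(length):
    """Prefix of the Fibonacci word, the fixed point of 0 -> 01, 1 -> 0."""
    word = [0]
    while len(word) < length:
        word = [c for a in word for c in ((0, 1) if a == 0 else (0,))]
    return word[:length]

def window_length(k, s1, k1):
    """l = ceil(2(k+1) / min{S1, k1+S1}), with k supplied by the caller."""
    smallest_step = min(Fraction(s1), Fraction(k1) + s1)
    return ceil(Fraction(2 * (k + 1)) / smallest_step)

def factors(word, n):
    """Distinct length-n factors occurring in the given finite prefix."""
    return {tuple(word[i:i + n]) for i in range(len(word) - n + 1)}

# Values taken from Example 21: triple (1,1,2), k = 1, steps 1.5 and 1.
l = window_length(k=1, s1=1, k1=Fraction(1, 2))
found = factors(fibonacci_word(500), l + 1)
print(l, len(found))  # 4 6
```

The factors are enumerated on the underlying Fibonacci word; when $F$ is non-constant, relabelling $0\mapsto k_{1}+S_{1}$ and $1\mapsto S_{1}$ gives the corresponding factors of $F$. The count $l+2$ reflects the fact that the Fibonacci word has exactly $n+1$ distinct factors of each length $n$.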

5 Further work

We have just given some examples of infinite words that either have or do not have the Frobenius property. In general, we would like to have a theorem that classifies an infinite word as either having or not having the Frobenius property based on its abelian complexity. For instance, is it true that if ${\bf w}$ has abelian complexity $\rho_{\bf w}(n)=\Omega(n^{r})$ for some $r>0$ , or perhaps even $\rho_{\bf w}(n)=\Omega(\log n)$ , then ${\bf w}$ has the Frobenius property? What happens when we move to ternary or larger alphabets?

Bibliography


[1] J.-P. Allouche, M. Dekking. Generalized Beatty sequences and complementary triples. Preprint. https://arxiv.org/abs/1809.03424
[2] H. Ardal, T. Brown, V. Jungić, J. Sahasrabudhe. On abelian and additive complexity in infinite words. Integers 12:#A21, 2012.
[3] L. Balková, M. Bucci, A. De Luca, J. Hladký, S. Puzynina. Aperiodic pseudorandom number generators based on infinite words. Theoret. Comput. Sci. 647:85–100, 2016.
[4] F. Blanchet-Sadri, N. Fox, N. Rampersad. On the asymptotic abelian complexity of morphic words. Adv. Appl. Math. 61:46–84, 2014.
[5] E. M. Coven, G. A. Hedlund. Sequences with minimal block growth. Mathematical Systems Theory 7:138–153, 1973.
[6] M. Dekking. The Frobenius problem for homomorphic embeddings of languages into the integers. Theoret. Comput. Sci. 732:73–79, 2018.
[7] M. Dekking, M. Mendès France, A. van der Poorten. Folds I–III. Math. Intelligencer 4:130–138, 172–181, 190–195, 1982.
[8] P. Hubert. Suites équilibrées. Theoret. Comput. Sci. 242:91–108, 2000.
