Palindromic Ziv-Lempel and Crochemore Factorizations of $m$-Bonacci Infinite Words
Marieh Jahannia, Morteza Mohammad-noori, Narad Rampersad, Manon Stipulanti

TL;DR
This paper introduces palindromic variants of Ziv-Lempel and Crochemore factorizations, computes them for Fibonacci and m-bonacci words, and explores their structural properties.
Contribution
It presents a novel palindromic factorization approach and extends it to Fibonacci and m-bonacci words, enriching combinatorial word analysis.
Findings
Palindromic factorizations are explicitly computed for Fibonacci words.
The method generalizes to all m-bonacci words.
Structural properties of these factorizations are analyzed.
Palindromic Ziv–Lempel and Crochemore Factorizations of $m$-Bonacci Infinite Words
Marieh Jahannia
School of Mathematics, Statistics and Computer Science, College of Science
University of Tehran, Tehran, Iran
Morteza Mohammad-noori
School of Mathematics, Statistics and Computer Science, College of Science
University of Tehran, Tehran, Iran
School of Mathematics, Institute for Research in Fundamental Sciences (IPM)
Tehran, Iran
Narad Rampersad
Department of Mathematics and Statistics
University of Winnipeg, Winnipeg, Canada
Manon Stipulanti
Department of Mathematics
University of Liège, Liège, Belgium
Abstract
We introduce a variation of the Ziv–Lempel and Crochemore factorizations of words by requiring each factor to be a palindrome. We compute these factorizations for the Fibonacci word, and more generally, for all $m$-bonacci words.
2010 Mathematics Subject Classification: 68R15.
Keywords: Ziv–Lempel factorization; Crochemore factorization; palindrome; Fibonacci word; $m$-bonacci words; singular words; episturmian words.
1 Introduction
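The Ziv–Lempel [9] and Crochemore [4] factorizations are two well-known factorizations of words used in text compression and other text algorithms. Here we apply them to infinite words. Let $|u|$ denote the length of a finite word $u$. In this paper, we start indexing words at $0$, i.e., if $w$ is a finite word over the alphabet $A$, then we write $w = w_0 w_1 \cdots w_{|w|-1}$, where $w_i \in A$ for all $i \in \{0, 1, \ldots, |w|-1\}$. If $\mathbf{w}$ is an infinite word and $u$ is a finite word, we say there is an occurrence of $u$ at position $i$ in $\mathbf{w}$ if $\mathbf{w} = x u \mathbf{y}$ for some word $x$ of length $i$ and some infinite word $\mathbf{y}$. Given an infinite word $\mathbf{w}$, the Ziv–Lempel or $z$-factorization of $\mathbf{w}$ is the factorization
$$z(\mathbf{w}) = (z_1, z_2, z_3, \ldots),$$
where $z_i$ is the shortest prefix of $z_i z_{i+1} z_{i+2} \cdots$ such that there is no occurrence of $z_i$ in $\mathbf{w}$ at any position $j < |z_1 z_2 \cdots z_{i-1}|$. The Crochemore or $c$-factorization of $\mathbf{w}$ is the factorization
$$c(\mathbf{w}) = (c_1, c_2, c_3, \ldots),$$
where $c_i$ is the longest prefix of $c_i c_{i+1} c_{i+2} \cdots$ such that there is an occurrence of $c_i$ in $\mathbf{w}$ at some position $j < |c_1 c_2 \cdots c_{i-1}|$, or, if this prefix does not exist, the factor $c_i$ is just a single letter.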
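For instance, if $\mathbf{f} = 0100101001001\cdots$ is the Fibonacci word, we have
$$z(\mathbf{f}) = (0, 1, 00, 101, 00100, 10100101, \ldots)$$
and
$$c(\mathbf{f}) = (0, 1, 0, 010, 10010, 01010010, \ldots).$$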
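The two definitions above can be tried on a finite prefix of the Fibonacci word. The following is a minimal sketch, not taken from the paper: the function names are illustrative, the Fibonacci word is assumed to be generated by the morphism $0 \mapsto 01$, $1 \mapsto 0$, and only factors that are fully determined by the chosen prefix are returned.

```python
def fibonacci_prefix(length):
    """Return a prefix of the Fibonacci word by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def occurs_before(word, factor, start):
    """True if `factor` occurs in `word` at some position j < start."""
    return 0 <= word.find(factor) < start


def z_factorization(word):
    """z-factors that are fully determined by the given finite prefix."""
    factors, start = [], 0
    while start < len(word):
        end = start + 1
        # Grow the candidate until it has no earlier occurrence.
        while end <= len(word) and occurs_before(word, word[start:end], start):
            end += 1
        if end > len(word):
            break  # the next factor runs past the known prefix
        factors.append(word[start:end])
        start = end
    return factors


def c_factorization(word):
    """c-factors that are fully determined by the given finite prefix."""
    factors, start = [], 0
    while start < len(word):
        end = start
        # Extend while the longer prefix still occurs at an earlier position.
        while end < len(word) and occurs_before(word, word[start:end + 1], start):
            end += 1
        if end == len(word):
            break  # the next factor may run past the known prefix
        end = max(end, start + 1)  # fall back to a single letter
        factors.append(word[start:end])
        start = end
    return factors


prefix = fibonacci_prefix(60)
print(z_factorization(prefix)[:6])
print(c_factorization(prefix)[:6])
```

The quadratic-time search via `str.find` is chosen for readability only; it is not meant as an efficient implementation of either factorization.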
Note that if $\mathbf{w}$ is ultimately periodic the $z$-factorization is not well-defined, since eventually there will be no factors that do not occur previously in $\mathbf{w}$. Similarly, if $\mathbf{w}$ is ultimately periodic the definition of the $c$-factorization will result in some factor being an infinite word. We are not interested in ultimately periodic words in this paper and will therefore ignore this possibility and assume that any infinite word considered in this paper is aperiodic.
In the context of combinatorics on words, these factorizations have been computed for certain important families of words. Berstel and Savelli [2] computed the $c$-factorizations of all standard Sturmian words. They also observed that the $z$-factorization of the Fibonacci word coincides with the singular factorization of the Fibonacci word introduced by Wen and Wen [13]. Fici [5] has given an excellent survey of these and other factorizations of the Fibonacci word. Ghareghani, Mohammad-noori, and Sharifani [6] determined the $z$- and $c$-factorizations of standard episturmian words. Constantinescu and Ilie [3] used the $z$-factorization to define the Lempel–Ziv complexity of an infinite word.
We introduce the palindromic $z$-factorization and palindromic $c$-factorization by requiring that each of the factors in the previous definitions be palindromes. That is, the palindromic $z$-factorization of $\mathbf{w}$ is the factorization
$$pz(\mathbf{w}) = (z_1, z_2, z_3, \ldots),$$
where $z_i$ is the shortest palindromic prefix of $z_i z_{i+1} z_{i+2} \cdots$ such that there is no occurrence of $z_i$ in $\mathbf{w}$ at any position $j < |z_1 z_2 \cdots z_{i-1}|$. The palindromic $z$-factorization may not exist for certain infinite words $\mathbf{w}$. For instance, if $\mathbf{w}$ only contains palindromes of bounded length, then the palindromic $z$-factorization will not exist. This type of factorization is therefore only interesting when applied to infinite words with arbitrarily long palindromic factors. The palindromic $c$-factorization of $\mathbf{w}$ is the factorization
$$pc(\mathbf{w}) = (c_1, c_2, c_3, \ldots),$$
where $c_i$ is the longest palindromic prefix of $c_i c_{i+1} c_{i+2} \cdots$ such that there is an occurrence of $c_i$ in $\mathbf{w}$ at some position $j < |c_1 c_2 \cdots c_{i-1}|$, or, if this prefix does not exist, the factor $c_i$ is just a single letter.
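For instance, if $\mathbf{f}$ is the Fibonacci word, we have
$$pz(\mathbf{f}) = (0, 1, 00, 101, 00100, 10100101, \ldots)$$
and
$$pc(\mathbf{f}) = (0, 1, 0, 010, 1001, 0010100, 10100100101, \ldots).$$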
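The palindromic variants can be computed in the same brute-force way. The sketch below is an illustration under stated assumptions rather than the authors' procedure: the helper names are hypothetical, the Fibonacci word is generated by $0 \mapsto 01$, $1 \mapsto 0$, and the computation stops as soon as a factor could depend on letters beyond the chosen prefix.

```python
def fibonacci_prefix(length):
    """Return a prefix of the Fibonacci word by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def occurs_before(word, factor, start):
    """True if `factor` occurs in `word` at some position j < start."""
    return 0 <= word.find(factor) < start


def is_palindrome(word):
    return word == word[::-1]


def pz_factorization(word):
    """Palindromic z-factors determined by the given finite prefix."""
    factors, start = [], 0
    while start < len(word):
        for end in range(start + 1, len(word) + 1):
            candidate = word[start:end]
            # Shortest palindromic prefix with no earlier occurrence.
            if is_palindrome(candidate) and not occurs_before(word, candidate, start):
                factors.append(candidate)
                start = end
                break
        else:
            break  # no such prefix inside the known part of the word
    return factors


def pc_factorization(word):
    """Palindromic c-factors determined by the given finite prefix."""
    factors, start = [], 0
    while start < len(word):
        end = start
        # Longest prefix (palindromic or not) that occurs earlier.
        while end < len(word) and occurs_before(word, word[start:end + 1], start):
            end += 1
        if end == len(word):
            break  # a longer palindrome might exist beyond the known prefix
        best = start + 1  # fall back to a single letter
        for stop in range(end, start, -1):
            if is_palindrome(word[start:stop]):
                best = stop  # longest palindrome among the repeated prefixes
                break
        factors.append(word[start:best])
        start = best
    return factors


prefix = fibonacci_prefix(200)
print(pz_factorization(prefix)[:6])
print(pc_factorization(prefix)[:7])
```

The function `pc_factorization` first bounds the search by the longest repeated prefix, because a palindromic prefix can only occur earlier if every shorter prefix does too.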
It turns out that $z(\mathbf{f})$ and $pz(\mathbf{f})$ are the same, and in fact are equal to the singular factorization of $\mathbf{f}$ (which we define later). However, the factorizations $c(\mathbf{f})$ and $pc(\mathbf{f})$ are not the same. We show that the factors of $pc(\mathbf{f})$ can also be written in terms of the singular words and the factorization $pc(\mathbf{f})$ (except for the first few factors) coincides with a nice factorization of $\mathbf{f}$ that appears in [5].
We believe that it could be of interest to compare the ordinary $z$- and $c$-factorizations of certain infinite words with their palindromic $z$- and $c$-factorizations, in the same way that one can compare the ordinary complexity function of an infinite word with its palindromic complexity function (see [1]).
The main results of this paper give a description of the palindromic $z$- and $c$-factorizations of the Fibonacci word and, more generally, the $m$-bonacci word for $m \geq 2$.
2 Basics from combinatorics on words
Let $A$ be a finite alphabet, i.e., a finite set made of letters. A (finite) word $w$ over $A$ is a finite sequence of letters belonging to $A$. If $w = w_0 w_1 \cdots w_{n-1}$ with $n \geq 0$ and $w_i \in A$ for all $i \in \{0, 1, \ldots, n-1\}$, then the length $|w|$ of $w$ is $n$, i.e., it is the number of letters that $w$ contains. We let $\varepsilon$ denote the empty word. This special word is the neutral element for concatenation of words, and its length is set to be $0$. The set of all finite words over $A$ is denoted by $A^*$, and we let $A^+ = A^* \setminus \{\varepsilon\}$ denote the set of non-empty finite words over $A$. An infinite word $\mathbf{w}$ over $A$ is any infinite sequence over $A$. The set of all infinite words over $A$ is denoted by $A^\omega$. Note that in this paper infinite words are written in bold.
A finite word $w$ is a prefix (resp., suffix) of another finite word $z$ if there exists $u \in A^*$ such that $z = wu$ (resp., $z = uw$). The word $w$ is said to be a factor of $z$ if there exist $u, v \in A^*$ such that $z = uwv$. If is a finite word over , we write and . Observe that if with , then and . In particular, for any words , we have .
In the same way, a finite word $w$ is a prefix of an infinite word $\mathbf{z}$ if there exists $\mathbf{u} \in A^\omega$ such that $\mathbf{z} = w\mathbf{u}$. The word $w$ is said to be a factor of $\mathbf{z}$ if there exist $u \in A^*$ and $\mathbf{v} \in A^\omega$ such that $\mathbf{z} = uw\mathbf{v}$.
Let $w = w_0 w_1 \cdots w_{n-1}$ with $n \geq 0$ and $w_i \in A$ for all $i \in \{0, 1, \ldots, n-1\}$. The mirror image, or reversal, of $w$ is the word $\widetilde{w} = w_{n-1} \cdots w_1 w_0$ over $A$, i.e., the word obtained by reading $w$ from right to left. We say that a word $w$ over $A$ is a palindrome if $w = \widetilde{w}$.
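A factorization of a finite word $w$ is a finite sequence $(x_1, x_2, \ldots, x_k)$ of finite words over $A$ such that
$$w = x_1 x_2 \cdots x_k.$$
Similarly, a factorization of an infinite word $\mathbf{w}$ is a sequence $(x_n)_{n \geq 1}$ of finite words over $A$ such that
$$\mathbf{w} = x_1 x_2 x_3 \cdots.$$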
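A morphism on $A$ is a map $\sigma \colon A^* \to A^*$ such that for all $u, v \in A^*$, we have $\sigma(uv) = \sigma(u)\sigma(v)$. In order to define a morphism, it suffices to provide the image of letters belonging to $A$. A morphism $\sigma$ is said to be prolongable on a letter $a \in A$ if $\sigma(a) = au$ with $u \in A^+$ and $\sigma$ is non-erasing, i.e., the image of no letter is the empty word. If $\sigma$ is prolongable on $a$, then $\sigma^n(a)$ is a proper prefix of $\sigma^{n+1}(a)$ for all $n \geq 0$. Therefore, the sequence $(\sigma^n(a))_{n \geq 0}$ of finite words defines an infinite word $\mathbf{w}$ that is a fixed point of $\sigma$.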
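As an illustration of prolongability, the short sketch below iterates a morphism given by the images of its letters and checks that each iterate is a proper prefix of the next one. The function name `iterate_morphism` and the dictionary encoding are choices made here for the example; the morphism used is the Fibonacci morphism $0 \mapsto 01$, $1 \mapsto 0$.

```python
def iterate_morphism(images, letter, steps):
    """Return [letter, s(letter), s^2(letter), ...] for the morphism s given by `images`."""
    word = letter
    iterates = [word]
    for _ in range(steps):
        # A morphism is applied letter by letter and the images are concatenated.
        word = "".join(images[a] for a in word)
        iterates.append(word)
    return iterates


fibonacci_images = {"0": "01", "1": "0"}  # prolongable on the letter "0"
iterates = iterate_morphism(fibonacci_images, "0", 6)

# Each iterate is a proper prefix of the next, so the iterates converge to a fixed point.
for shorter, longer in zip(iterates, iterates[1:]):
    assert longer.startswith(shorter) and len(shorter) < len(longer)

print(iterates[:5])
```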
In combinatorics on words, given an alphabet $A$, a set $X \subseteq A^+$ of non-empty words is a code on $A$ if any word $w \in A^*$ has at most one factorization using words of $X$. For more on this topic, see, for instance, [10, Chapter 6]. The following result can be found in [10, Chapter 6].
Proposition 1.
Let $A, B$ be two finite alphabets, and let $\sigma \colon A^* \to B^*$ be an injective morphism. If $X$ is a code on $A$, then $\sigma(X)$ is a code on $B$.
In the following definition, we introduce two new factorizations of interest.
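Definition 2.
Let $\mathbf{w}$ be an infinite word over $A$. The palindromic Ziv–Lempel or palindromic $z$-factorization of $\mathbf{w}$ is the factorization
$$pz(\mathbf{w}) = (z_1, z_2, z_3, \ldots),$$
where $z_i$ is the shortest palindromic prefix of $z_i z_{i+1} z_{i+2} \cdots$ such that there is no occurrence of $z_i$ in $\mathbf{w}$ at any position $j < |z_1 z_2 \cdots z_{i-1}|$. The palindromic Crochemore or palindromic $c$-factorization of $\mathbf{w}$ is the factorization
$$pc(\mathbf{w}) = (c_1, c_2, c_3, \ldots),$$
where $c_i$ is the longest palindromic prefix of $c_i c_{i+1} c_{i+2} \cdots$ such that there is an occurrence of $c_i$ in $\mathbf{w}$ at some position $j < |c_1 c_2 \cdots c_{i-1}|$, or, if this prefix does not exist, the factor $c_i$ is just a single letter.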
3 The Fibonacci case
3.1 Some known results and preliminaries
Before establishing the two palindromic factorizations of the Fibonacci word, we gather some definitions and necessary results. Some of them are well known and can be found in [5, 13]. In the following definition, we follow the lines of [5].
Definition 3.
Let $\mathbf{f}$ be the (infinite) Fibonacci word, i.e., the fixed point of the morphism $\varphi \colon 0 \mapsto 01,\ 1 \mapsto 0$, starting with $0$. For all , define the finite word to be the th iteration of $\varphi$ on $0$. The first few words of the sequence are . It is well known that the Fibonacci word is the limit of . Let be the sequence of the palindromic prefixes of $\mathbf{f}$, which are also called central words. The first few terms of this sequence are . The singular words satisfy , and, for all , and . The first few singular words are .
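The palindromic prefixes (central words) mentioned in this definition can be listed by direct search on a prefix of the Fibonacci word. This is a small sketch with illustrative function names, assuming the word is generated by $0 \mapsto 01$, $1 \mapsto 0$; it does not attempt to reproduce the indexing used in the definition.

```python
def fibonacci_prefix(length):
    """Return a prefix of the Fibonacci word by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def palindromic_prefixes(word):
    """All prefixes of `word` (including the empty word) that are palindromes."""
    return [word[:k] for k in range(len(word) + 1) if word[:k] == word[:k][::-1]]


for prefix in palindromic_prefixes(fibonacci_prefix(12)):
    print(repr(prefix), len(prefix))
```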
The following properties of the singular words can be found in [13].
Proposition 4.
Let be the sequence of Fibonacci numbers with initial conditions and .
- (1)
For all , is a palindrome.
- (2)
For all , .
- (3)
For all , .
- (4)
For all , is not a factor of .
- (5)
For all , is not a factor of .
- (6)
Let and let where and . If with , then .
- (7)
Let and define to be [math] if is odd, or if is even. Then .
The following result can be found in [5]. Note that the first factorization of the Fibonacci word also appears in [13].
Proposition 5.
We have the following two factorizations of the Fibonacci word
[TABLE]
Moreover, the Ziv–Lempel factorization of the Fibonacci word is given by the sequence of singular words, i.e.,
[TABLE]
As a matter of fact, the palindromic $z$-factorization of $\mathbf{f}$ is easily deduced from the previous result, as shown in the next section. However, the palindromic $c$-factorization of $\mathbf{f}$ cannot be obtained from already known results, and, to that aim, we define a sequence of specific prefixes of $\mathbf{f}$.
Definition 6.
For all , define
[TABLE]
From (5), observe that, for all , we have
[TABLE]
Interestingly, the prefix of can be factorized as a particular product of singular words.
Proposition 7.
For all , we have
[TABLE]
Proof.
Proceed by induction on . The result holds for because . For , we get by Definition 6 and therefore
[TABLE]
as desired. Assume that . Now we suppose the result holds up to and we show it still holds for . Using Definition 6, we have
[TABLE]
By the induction hypothesis, we get
[TABLE]
Since , Proposition 4 implies that , and we deduce that
[TABLE]
which ends the proof. ∎
3.2 The palindromic $z$-factorization of the Fibonacci word
In this (very) short section, we obtain the palindromic $z$-factorization of the Fibonacci word, which easily follows from already known results.
Theorem 8.
The palindromic $z$-factorization of the Fibonacci word is
[TABLE]
Proof.
From Proposition 5, the $z$-factorization $z(\mathbf{f})$ is the sequence of singular words. Since these factors are all palindromes by Proposition 4, this factorization is also $pz(\mathbf{f})$. ∎
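The argument of this proof can be sanity-checked on a finite prefix: every computed $z$-factor is a palindrome, so the palindrome requirement does not change the factorization. The sketch below uses illustrative helper names and assumes the Fibonacci word is generated by $0 \mapsto 01$, $1 \mapsto 0$; a check on a prefix is of course not a proof.

```python
def fibonacci_prefix(length):
    """Return a prefix of the Fibonacci word by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def z_factorization(word):
    """z-factors that are fully determined by the given finite prefix."""
    factors, start = [], 0
    while start < len(word):
        end = start + 1
        # word.find returns the first occurrence, which is < start
        # exactly when the candidate occurs at an earlier position.
        while end <= len(word) and word.find(word[start:end]) < start:
            end += 1
        if end > len(word):
            break  # the next factor runs past the known prefix
        factors.append(word[start:end])
        start = end
    return factors


factors = z_factorization(fibonacci_prefix(100))
print(all(factor == factor[::-1] for factor in factors))
print([len(factor) for factor in factors])
```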
3.3 The palindromic $c$-factorization of the Fibonacci word
In this section, we show that, after the prefix of length , the factorization (5) coincides with the factorization $pc(\mathbf{f})$. Note that in this case $c(\mathbf{f})$ and $pc(\mathbf{f})$ are not the same, since the factors in $c(\mathbf{f})$ are not palindromes.
Lemma 9.
For all , the only suffix of that is also a prefix of is the empty word.
Proof.
We proceed by induction on . From Definition 3, the first two singular words are and , so the result can be checked by hand for .
Now suppose that , and that the only suffix of that is also a prefix of is the empty word, for all . We show that the result still holds for . Proceed by contradiction and suppose there exists a word which is a non-empty suffix of and a non-empty prefix of . We have . Using Proposition 4(6), starts and ends with .
If , then is a prefix of (recall that is a prefix of ). Consequently, is a non-empty suffix of and a non-empty prefix of . This contradicts the inductive assumption.
If , then is a prefix of (recall that is a prefix of ). In particular, is a factor of , and also a factor of (recall that is a suffix of ). This contradicts Proposition 4(4). ∎
In the following lemma, recall that we start indexing words at $0$.
Lemma 10.
Let . There are exactly two occurrences of the factor inside the word : one at position , the other at position .
Proof.
If , then and the factor occurs in at positions [math] and . If , then with for all . There are exactly two occurrences of in starting either at position or .
Suppose that . Using (3), let us write with . Thanks to this factorization, we immediately see that occurs at least twice as a factor of : one starting at position , the other beginning at position . We now show that there are no other occurrences of as a factor of . There are several cases to consider.
Case 1. The word cannot be a factor of , otherwise it contradicts Proposition 4(5).
Case 2. The word cannot be a factor of , otherwise it contradicts Proposition 4(4).
Case 3. The word cannot be a factor of since by Proposition 4(2) (note that ).
Case 4. Suppose that is a factor of , overlapping and . Using Proposition 4(2) (), we know that
[TABLE]
Consequently, is a factor of . If starts somewhere within , or if starts with the first letter of , then is a factor of , which contradicts Proposition 4(4). Therefore must be a factor of , i.e., there exist a non-empty suffix of and a non-empty prefix of such that . Then is also a non-empty prefix of , which contradicts Lemma 9.
Case 5. Suppose that is a factor of , overlapping and . This case is similar to the fourth case above. Indeed, observe that, since , Proposition 4(3) gives
[TABLE]
Using Proposition 4 again, we know that . Consequently, is a factor of , so is a factor of , which is impossible due to the fourth case.
Case 6. Suppose that is a factor of , overlapping and . In this case, is a factor of since the singular words are palindromes. As in the fifth case, we raise a contradiction.
Case 7. Suppose that is a factor of , overlapping and . In this case, is a factor of since the singular words are palindromes. As in the fourth case, we reach a contradiction. ∎
We prove a technical result before getting the palindromic $c$-factorization of $\mathbf{f}$.
Proposition 11.
Let . Let be a non-empty common finite prefix of the infinite words
[TABLE]
and
[TABLE]
Then is not a palindrome.
Proof.
Let us define
[TABLE]
where is taken as in the statement. Using Proposition 4, since and , we know that . Now proceed by contradiction and suppose that is a palindrome. Then we have
[TABLE]
The bounds on the length of lead to an overlap between the occurrence of at position (in the leftmost word in (4)), and the occurrence at position (in the rightmost word in (4)). This is impossible due to either Proposition 4(4), or Lemma 9. ∎
Theorem 12.
Let denote the palindromic -factorization of the Fibonacci word . Then, we have , , and, for all ,
[TABLE]
Proof.
By definition of the palindromic -factorization of the Fibonacci word , we clearly have , and . For the second part of the result, proceed by induction on . Suppose . Let us find the factor of the palindromic -factorization of . We have
[TABLE]
and the longest palindrome starting with [math] and occurring before is
[TABLE]
as expected.
For the induction step, suppose and assume that, for all , we have . We show it is still true for . On the one hand, by the induction hypothesis, we have
[TABLE]
and the goal is to find the next factor of the palindromic -factorization of , i.e., the word . On the other hand, using (5) first and then (3) since is large enough, we get
[TABLE]
Using (6), it is clear that since is a palindrome occurring before. Therefore, there exists a word such that . We claim that is in fact the empty word and proceed by contradiction.
By Lemma 10, we know that there are exactly two occurrences of in : one starts at position , and the other at position .
Case 1. Let us deal with the occurrence of in at position . In this case, must be a common prefix of the infinite words
[TABLE]
and
[TABLE]
By Proposition 11, we know that is not a palindrome if is non-empty, a contradiction.
Case 2. Let us consider the occurrence of in at position . In this case, must be a common prefix of the infinite words
[TABLE]
and
[TABLE]
Using Proposition 4, we know that
[TABLE]
Consequently, , which violates Proposition 4 (items (4) or (6)).
As a conclusion, the longest palindrome starting with the first letter of and occurring before is
[TABLE]
as required. ∎
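The shape of the factors of the palindromic $c$-factorization can be explored numerically. The sketch below is an experiment on a finite prefix, not a restatement of the theorem: the helper names are hypothetical, the word is assumed to be generated by $0 \mapsto 01$, $1 \mapsto 0$, and the computed $z$-factors are used as stand-ins for the singular words (Proposition 5). It compares each palindromic $c$-factor from the fifth one on with a product $u v u$ of two consecutive $z$-factors $u, v$; the list indices below are Python list positions, not the indices of the paper.

```python
def fibonacci_prefix(length):
    """Return a prefix of the Fibonacci word by iterating 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if letter == "0" else "0" for letter in word)
    return word[:length]


def occurs_before(word, factor, start):
    """True if `factor` occurs in `word` at some position j < start."""
    return 0 <= word.find(factor) < start


def z_factorization(word):
    factors, start = [], 0
    while start < len(word):
        end = start + 1
        while end <= len(word) and occurs_before(word, word[start:end], start):
            end += 1
        if end > len(word):
            break  # the next factor runs past the known prefix
        factors.append(word[start:end])
        start = end
    return factors


def pc_factorization(word):
    factors, start = [], 0
    while start < len(word):
        end = start
        while end < len(word) and occurs_before(word, word[start:end + 1], start):
            end += 1
        if end == len(word):
            break  # a longer palindrome might exist beyond the known prefix
        best = start + 1  # fall back to a single letter
        for stop in range(end, start, -1):
            if word[start:stop] == word[start:stop][::-1]:
                best = stop
                break
        factors.append(word[start:best])
        start = best
    return factors


word = fibonacci_prefix(300)
z_factors = z_factorization(word)
pc_factors = pc_factorization(word)

# Compare pc_factors[n] with u v u, where u = z_factors[n - 3], v = z_factors[n - 2].
for n in range(4, 8):
    u, v = z_factors[n - 3], z_factors[n - 2]
    print(n, pc_factors[n] == u + v + u)
```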
4 The $m$-bonacci case
In this section, we extend the results obtained for the Fibonacci word to any $m$-bonacci word, namely we get the palindromic $z$- and $c$-factorizations of any $m$-bonacci word. The strategy is similar to the one adopted in the previous case: we define a particular sequence of finite words that we will call p-singular words, and we write the palindromic $z$- and $c$-factorizations of any $m$-bonacci word in terms of this sequence. In the case $m = 2$, these words turn out to be the singular words (see Proposition 22).
4.1 Preliminaries
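Definition 13.
Let $m \geq 2$. We define the morphism $\varphi_m$ on $\{0, 1, \ldots, m-1\}$ by
$$\varphi_m(i) = 0\,(i+1) \ \text{ for } 0 \leq i \leq m-2, \qquad \varphi_m(m-1) = 0,$$
where $0\,(i+1)$ denotes the two-letter word made of the letters $0$ and $i+1$.
When $m = 2$, the morphism $\varphi_2$ is the Fibonacci morphism of Definition 3, and we fall into the Fibonacci case above.
Let $\mathbf{f}^{(m)}$ be the (infinite) $m$-bonacci word, i.e., the fixed point of the morphism $\varphi_m$, starting with $0$. For all $n \geq 0$, define $f_n^{(m)}$ to be the $n$th iteration of $\varphi_m$ on $0$. It is well known that the $m$-bonacci word is the limit of $(f_n^{(m)})_{n \geq 0}$. For the sake of simplicity, when the context is clear, we write $f_n$ instead of $f_n^{(m)}$.
From now on, $m$ is a fixed integer greater than $1$ unless otherwise specified.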
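Example 14.
If $m = 2$, then $\mathbf{f}^{(2)}$ is the Fibonacci word. See also Definition 3. If $m = 3$, then $\varphi_3 \colon 0 \mapsto 01,\ 1 \mapsto 02,\ 2 \mapsto 0$, and the infinite word $\mathbf{f}^{(3)} = 0102010010201\cdots$ is called the Tribonacci word. If $m = 4$, then $\varphi_4 \colon 0 \mapsto 01,\ 1 \mapsto 02,\ 2 \mapsto 03,\ 3 \mapsto 0$, and the infinite word $\mathbf{f}^{(4)} = 010201030102010\cdots$ is called the Quadribonacci word. In Table 1, the first few words of the sequences are given for .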
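Prefixes of the $m$-bonacci words in this example can be generated by iterating the morphism. The following is a minimal sketch, assuming the standard $m$-bonacci morphism that sends $i$ to the word $0(i+1)$ for $i < m-1$ and $m-1$ to $0$; the function names and the encoding of words as tuples of integers are choices made here.

```python
def mbonacci_morphism(m):
    """Images of the letters 0..m-1: i -> 0(i+1) for i < m-1, and m-1 -> 0."""
    images = {i: (0, i + 1) for i in range(m - 1)}
    images[m - 1] = (0,)
    return images


def mbonacci_prefix(m, length):
    """Prefix of the m-bonacci word, as a tuple of letters in {0, ..., m-1}."""
    images = mbonacci_morphism(m)
    word = (0,)
    while len(word) < length:
        # Apply the morphism letter by letter and concatenate the images.
        word = tuple(b for a in word for b in images[a])
    return word[:length]


for m in (2, 3, 4):
    print(m, "".join(map(str, mbonacci_prefix(m, 15))))
```

Tuples of integers are used instead of strings so that the same code also works for $m > 10$, where letters are no longer single digits.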
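Lemma 15.
The set of non-empty words $\{\varphi_m(0), \varphi_m(1), \ldots, \varphi_m(m-1)\}$ is a code on the finite alphabet $\{0, 1, \ldots, m-1\}$.
Proof.
It directly follows from Proposition 1, the fact that $\varphi_m$ is an injective morphism, and the fact that $\{0, 1, \ldots, m-1\}$ is a code on $\{0, 1, \ldots, m-1\}$. ∎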
Remark 16.
Observe that not all words over have a factorization in terms of blocks of . For instance, the word does not have any such factorization. However if a word has such a factorization, then it is unique.
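The uniqueness of the block factorization can be made concrete with a small decoder. The sketch below assumes that the blocks are the images $01, 02, \ldots, 0(m-1), 0$ of the letters under the standard $m$-bonacci morphism (an assumption, since the set is not legible in the text here); the function name `decode_blocks` and the test words are illustrative and are not the examples of the remark.

```python
def decode_blocks(word, m):
    """Return the unique preimage of `word` under i -> 0(i+1), m-1 -> 0, or None.

    `word` is a sequence of integers in {0, ..., m-1}.
    """
    preimage, i = [], 0
    while i < len(word):
        if word[i] != 0:
            return None  # every block starts with the letter 0
        if i + 1 < len(word) and word[i + 1] != 0:
            if not 1 <= word[i + 1] <= m - 1:
                return None  # not a letter of the alphabet
            preimage.append(word[i + 1] - 1)  # block 0a is the image of a - 1
            i += 2
        else:
            preimage.append(m - 1)  # block 0 is the image of m - 1
            i += 1
    return preimage


print(decode_blocks([0, 1, 0, 2, 0, 1, 0], 3))  # blocks 01, 02, 01, 0
print(decode_blocks([1, 1], 3))                 # does not start with 0
```

Parsing is deterministic because each block contains exactly one letter $0$, in first position, so a word can be cut in at most one way.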
The following lemma will be useful to prove properties similar to those given in Proposition 4.
Lemma 17.
Let be two finite words.
- (1)
If is a factor of , then is a factor of .
- (2)
If is a factor of and does not end with the letter , then is a factor of .
Proof.
If is the empty word, then both items are true. Now assume that is non-empty, so is . From Lemma 15, the set of words is a code, and thus the words and respectively admit a unique factorization in terms of blocks belonging to . There exist positive integers and words such that and .
Let us prove item . Note that and . By assumption, there exist words such that . Using the form of the blocks in , there exist such that and (since no words in start with a letter different from [math] and since implies that the first letter [math] is a block in ). By uniqueness of the factorization, we also have for all and . Consequently, and . From the form of the words and , we deduce that there exist words such that and . Thus,
[TABLE]
By injectivity of , , and is a factor of , as desired.
Let us show that item also holds. By hypothesis, is a factor of and the last letter of is not (i.e., the block is of length and ends with a letter different from [math]). By an analogous reasoning, there exists such that for all . Now let and . We have
[TABLE]
As before, there exist such that and . By injectivity of , is again a factor of . ∎
Example 18.
Let and be words in ( here). We see that is a factor of while is not a factor of . This is due to the fact that ends with .
4.2 Properties of p-singular words
In [5, 13], the Fibonacci word is factorized into singular words (see Proposition 5). In [11], this notion of singular words is extended to cover the case of characteristic Sturmian words. In particular, any characteristic Sturmian word has a singular decomposition and those singular words are useful to find the palindromic factors of . Leaving the framework of a two-letter alphabet, it is shown in [12] that there are two kinds of singular words in the Tribonacci case (), and that the Tribonacci word possesses a decomposition into singular words. Afterwards, the study of singular words has been extended to include standard episturmian words. More particularly, a standard episturmian word over is -strict if every letter , , occurs infinitely many times in its directive word. In fact, the -strict standard episturmian words are exactly the -letter Arnoux–Rauzy sequences. To learn more about the subject, we refer the reader to [7]. In [7, Chapter 7], it is shown that any word in a class of specific -strict standard episturmian words has several kinds of generalized singular words. Roughly, those singular words turn out to be notably useful to study factors of (e.g., squares, cubes and other powers), and can also be used to factorize (this particular factorization is referred to as a partition in [7, Chapter 7]).
Following the same lead, we define the p-singular words in the general case of the $m$-bonacci word. Those particular words are useful to obtain its palindromic $z$- and $c$-factorizations. In this section, we study some of their properties.
Definition 19.
Define the sequence of finite words over the alphabet by , , and
- (1)
For all , ;
- (2)
For all , .
Note that for (resp. ), is centered at (resp., ).
In the Fibonacci case when , we will show in Proposition 22 that the corresponding words are the singular words . For that reason, the sequence is the sequence of words called p-singular words. The p-singular words satisfy a number of identities related to the standard and central words; for instance, the following result gives another way we could have chosen to define the p-singular words; see [6], where the ordinary - and -factorizations of episturmian words are best described in terms of the words .
Lemma 20.
For all , we have
[TABLE]
The lemma can be proved directly from the definitions, but we are able to give a much more elegant proof after first proving some preliminary results. We therefore postpone the proof until after Lemma 29.
Example 21.
In Table 2, the first few p-singular words are displayed for .
In fact, in the context of the Fibonacci word (), the word is the st singular word, as shown below. As a consequence, the palindromic - and -factorizations of the Fibonacci word can be rewritten in terms of the sequence of p-singular words; see Theorems 8 and 12. In the same way, we will show that the palindromic - and -factorizations of any -bonacci word involve the p-singular words.
Proposition 22.
For all , we have .
Proof.
We proceed by induction on . The result is true for . Now suppose that and the result holds up to . We show it is still true for . Using Proposition 4, then the induction hypothesis and finally Definition 19, we get the result
[TABLE]
Again, for the sake of simplicity, when the context is clear, we write instead of . By induction and Definition 19, it is clear that the p-singular words are palindromes.
Proposition 23.
For all , is a palindrome.
Also from Definition 19, we know the prefixes and suffixes of length at most of the p-singular words.
Proposition 24.
For all , starts and ends with the letter [math] (resp., ) if is even (resp., odd). Moreover, for all even , starts and ends with if , or if ; for all odd , starts with if , or if , or if , and ends with if , or if , or if .
Proof.
For the first part of the result, we proceed by induction on . From Definition 19, and , so the result is true for . Now suppose that , and that the result holds for values less than . If (resp., ), then Definition 19(1) (resp., Definition 19(2)) shows that ends and starts with . Using the induction hypothesis since , we know that starts and ends with [math] (resp., ) if is even (resp., odd). Consequently, starts and ends with the letter [math] (resp., ) if is even (resp., odd).
The proof of the second part of the statement is obtained in the same manner by first observing that Definition 19 (or Table 2) gives if , or if , and
[TABLE]
∎
In the following two corollaries, resulting from Definition 19, we study the length of p-singular words.
Corollary 25.
We have , and for all , . In particular, for all , .
Proof.
From Definition 19, we have . Now let . From Definition 19(1), we get
[TABLE]
which proves the first part of the statement. Let us show the second part of the statement. The case is easily handled. Suppose that . From the first part of the result with , we know that
[TABLE]
In the following corollary, when is big enough, the length of the p-singular word is expressed in terms of the length of the previous p-singular words . Note that, when , then the following result is implied by Propositions 4(2) and 22. Also observe that, when is even, the sequence of positive integers satisfies a -bonacci type recurrence relation. However, that is not the case when is odd.
Corollary 26.
If is even, then, for all , we have
[TABLE]
If is odd, then, for all , we have
[TABLE]
Proof.
If , the result follows from Propositions 4(2) and 22. Now suppose that , and, as a first case, suppose that is even. Proceed by induction on . If , then using Corollary 25 several times, we have
[TABLE]
since , and . Now suppose that and the result holds for values less than . From Definition 19, we obtain
[TABLE]
and using the induction hypothesis, we find
[TABLE]
Secondly, assume that is odd, and, as in the previous case, proceed by induction on . If , then using Corollary 25 several times, we have
[TABLE]
since and . Now suppose that , and assume that the result holds for all values less than . From Definition 19, we have
[TABLE]
The induction hypothesis allows us to conclude that
[TABLE]
as desired. ∎
The following inequalities on the lengths of p-singular words will be useful later on.
Proposition 27.
We have the following inequalities.
- (1)
For all ,
[TABLE]
- (2)
For all , .
- (3)
For all , .
Proof.
Let us prove (1) by induction on . If , then . If , then . Now suppose that , and that the result is true for values less than . By the induction hypothesis, we have
[TABLE]
and by Corollary 25, we have
[TABLE]
Let us prove (2). First, suppose that . Then Corollary 25 implies that since . Suppose that . If is even, then by Corollary 26, we know that
[TABLE]
since (when , the inequality above is an equality). If is odd, then by Corollary 26, we have
[TABLE]
When is even, then clearly . When is odd, then , so we have .
Let us show that (3) holds. Suppose that . Since , the result can easily be deduced as a corollary of (2). ∎
From Table 2, one can observe that the first few words in two consecutive sequences of p-singular words are the same. In the following proposition, we compare the first terms of the sequences and by showing that they are equal. Also notice the words differ after that.
Proposition 28.
For all , we have .
Proof.
We proceed by induction on , with . It is clear that for all , we have and , so the base case is true. Now suppose that , and that for . From Definition 19(1), we have
[TABLE]
Now using the induction hypothesis, we have
[TABLE]
and using Definition 19(1) again, we get . ∎
The idea to obtain the palindromic $z$- and $c$-factorizations of the $m$-bonacci word is to mimic the reasoning in the previous case. Namely, we establish results similar to Propositions 4, 5 and 7. Before getting those properties in the more general $m$-bonacci case, a few preliminaries are necessary. In the following lemma, we get a formula for the p-singular word in terms of the morphism and the p-singular word .
Lemma 29.
For all ,
[TABLE]
Proof.
We proceed by induction on . If , then . If , then
[TABLE]
Now suppose that and that the result is true for values less than . As a first case, suppose that . In particular, , and we deduce from Definition 19(1) that
[TABLE]
If is even, then
[TABLE]
By the induction hypothesis, we obtain
[TABLE]
and the last equality holds because . From Definition 19(1), we have . If is odd, then
[TABLE]
By the induction hypothesis, we obtain
[TABLE]
and the last equality is true because . From Definition 19(1), we have .
Now suppose that (this implies that ). By Definition 19(1), we have
[TABLE]
If is even, then
[TABLE]
By the induction hypothesis and since , we obtain
[TABLE]
From Definition 19(2), we have . If is odd, then
[TABLE]
By the induction hypothesis and since , we obtain
[TABLE]
From Definition 19(2), we have .
Finally, assume that . By Definition 19(2), we have
[TABLE]
If is even, then
[TABLE]
Inserting where needed (places where to insert it differ when is even or odd) and using the induction hypothesis , we obtain
[TABLE]
From Definition 19(2), we have as desired. If is odd, then
[TABLE]
As before, inserting where needed and making use of the induction hypothesis, we get
[TABLE]
From Definition 19(2), we have . This ends the proof. ∎
Using the previous lemma, we are able to prove Lemma 20.
Proof of Lemma 20.
For the sake of simplicity, let us drop the exponent in this proof. We equivalently show that, if is odd,
[TABLE]
and if is even,
[TABLE]
We proceed by induction on . If , then holds. If , then . Now suppose that and that the result is true for values less than .
If is odd, then is even and the induction hypothesis yields
[TABLE]
Applying on both sides, we get
[TABLE]
We may now insert before in the left-hand side of the previous equality to obtain
[TABLE]
We conclude by using the fact that and thanks to Lemma 29.
If is even, then is odd and the induction assumption gives
[TABLE]
Applying on both sides and appending a letter [math], we obtain
[TABLE]
We end this case by using the fact that and thanks to Lemma 29. ∎
The following result matches Proposition 4(4) in the Fibonacci case.
Proposition 30.
For all , is not a factor of .
Proof.
Observe first that the case is covered using Propositions 4(4) and 22. So we can suppose that , and we proceed by induction on . The result can be checked by hand for since , , and (see Definition 19). Suppose that and assume that is not a factor of for . We show it still holds for , i.e., is not a factor of . Proceed by contradiction and suppose that is a factor of . We divide the proof into two cases according to the parity of .
Case 1. Suppose that is odd. From Lemma 29, we know that and . We prove that is a factor of . By hypothesis, there exist words such that . If , then the statement is true. If , write with for all . By definition of the morphism , we find since starts with a positive letter, which ends the intermediate result. Now, we claim that is in fact a factor of . First, using Proposition 24,
[TABLE]
for non-empty words (recall that ). Thus (resp., ) starts and ends with (resp., ). Consequently, is a factor of , as claimed. From Lemma 17, is a factor of , which contradicts the induction hypothesis.
Case 2. Assume that is even. From Lemma 29, we know that and . By hypothesis, is a factor of . Thus, is in fact a factor of . Using Proposition 24,
[TABLE]
for non-empty words (recall that ). From Lemma 17, is a factor of , which again contradicts the induction hypothesis. ∎
The following result is the counterpart to Proposition 4(5) in the Fibonacci case.
Proposition 31.
For all , is not a factor of the product .
Proof.
For the sake of simplicity, we define, for all ,
[TABLE]
Observe first that the case follows from Propositions 4(5) and 22. So we can suppose that . To prove the result, we proceed by induction on .
If , then by Proposition 27, we have . If the inequality is strict, then we are done. If we actually have an equality, then would be a factor of , which contradicts Proposition 30.
Suppose that and assume that is not a factor of for . We show it still holds for , i.e., is not a factor of . Proceed by contradiction and suppose that is a factor of . We divide the proof into two cases according to the parity of .
Case 1. Suppose that is odd. From Lemma 29, we get
[TABLE]
By hypothesis, is a factor of , so is also a factor of (the reasoning is similar to the one developed in the previous proof). We claim that is in fact a factor of . First, using Proposition 24, we have
[TABLE]
for two non-empty words . Consequently, starts and ends with , and starts with and ends with . Thus, is a factor of , as expected. From Lemma 17, is a factor of , which contradicts the induction hypothesis.
Case 2. Assume that is even. From Lemma 29, we get
[TABLE]
By hypothesis, is a factor of , so is also a factor of . Using Proposition 24, we have
[TABLE]
for two non-empty words . From Lemma 17, is a factor of , which also contradicts the induction hypothesis. ∎
Proposition 32.
For all , the words do not contain the letter .
Proof.
If , then is the empty word, and we are done. If , neither of the words and contains the letter . Let . Then the words are well defined (see Definition 19). Iteratively applying Proposition 28, we obtain for all . Since is a word defined over the alphabet for any , the conclusion follows. ∎
The following result compares the prefixes of to suffixes of . Its proof follows the same lines as the proof of Lemma 9 in the Fibonacci case.
Lemma 33.
For all , the only suffix of that is also a prefix of is the empty word.
Proof.
We proceed by induction on . From Definition 19, we have , and , so the result can be checked by hand for .
Now suppose that , and that the only suffix of that is also a prefix of is the empty word, for all . We show that the result still holds for . Proceed by contradiction and suppose there exists a word which is a non-empty suffix of and a non-empty prefix of . We have . Using Definition 19, starts and ends with .
If , then is a prefix of (recall that is a prefix of ). Consequently, is a non-empty suffix of and a non-empty prefix of . This contradicts the inductive assumption.
If , then is a prefix of (recall that is a prefix of ). In particular, is a factor of , and also a factor of (recall that is a suffix of ). This contradicts Proposition 30. ∎
4.3 Two particular factorizations of the -bonacci word
In this section, we study two different factorizations of the -bonacci word in terms of p-singular words (see Propositions 34 and 37), extending Proposition 5. The first one is similar to the factorization (5) of the Fibonacci word given in Proposition 5. To see this, simply combine (5) with Proposition 22.
Proposition 34.
We have the following factorization of the -bonacci word
[TABLE]
Proof.
For all , set (when , is the empty word). To prove the statement, we show two things:
- (1) For all , ,
- (2) is a sequence of prefixes of .
Then, the mentioned factorization easily follows. For all , we trivially have
[TABLE]
since implies . Thus (1) is proved. For (2), we proceed by induction on . The -bonacci word starts with , so it is clear that is a prefix of for . Suppose that and that is a prefix of . The proof is again divided into two parts, according to the parity of .
Case 1. Suppose that is odd. From (7) (which is valid for any odd ), we know that , and using Proposition 24, ends with . By the induction hypothesis, is a prefix of ending with . Thus, there exists an infinite word over such that . Since is a fixed point of , we get
[TABLE]
showing that is also a prefix of .
Case 2. Assume that is even. From (8) (which is valid for any even ), we already have . By the induction hypothesis, there exists an infinite word over such that . Since is a fixed point of , we get
[TABLE]
as desired. ∎
The factorization of the -bonacci word in Proposition 37 is similar to the factorization (5) of the Fibonacci word given in Proposition 5. We first need some notation.
Definition 35.
Let be the finite word over defined by
[TABLE]
For all , define the finite word over by
[TABLE]
Note that the word is centered at .
Example 36.
If , and for all . When , we find , and for all , we have .
Proposition 37.
We have the following factorization of the -bonacci word
[TABLE]
Proof.
From Proposition 34, we get
[TABLE]
Using Definition 19(1), we have
[TABLE]
and for all , Definition 19(2) shows that
[TABLE]
Plugging these equalities into (9), we find
[TABLE]
using Definition 35. Since , we finally get
[TABLE]
In the following proposition, we get a particular factorization of the prefix of the -bonacci word . This factorization is a step toward obtaining the palindromic -factorization of the -bonacci word .
Proposition 38.
The word can be factorized as
[TABLE]
In particular, this factorization contains factors and all of them are palindromes. Moreover, if this factorization is written as
[TABLE]
then, for all and for any infinite word , is the longest palindromic prefix of with a previous occurrence in , or if this prefix does not exist, the factor is a single letter.
Proof.
To prove this result, we proceed by induction on . To avoid any confusion, from Definition 35, write
[TABLE]
The case is easily checked for we have . Now suppose that and assume that the result holds for values less than . Let us prove the first part of the statement. Using Definition 19(1) to rewrite , we first have
[TABLE]
From Proposition 28, the two finite words
[TABLE]
and
[TABLE]
are equal since for all . Since the latter word is by Definition 35, we deduce that
[TABLE]
By the induction hypothesis, we know that
[TABLE]
Proposition 28 finally gives
[TABLE]
as expected. Moreover, using the induction hypothesis, this factorization contains factors, which are all palindromes. Note that we have
[TABLE]
and . This ends the proof of the first part of the statement.
Let us show that the second part of the statement also holds. The proof is divided into three cases according to the value of the index of the considered factor .
Case 1. Suppose that . For every infinite word , is the longest palindromic prefix of with a previous occurrence in , or if this prefix does not exist, is limited to a single letter. Indeed, by the induction hypothesis and since is a particular infinite word, is the longest palindromic prefix of with a previous occurrence in , or if this prefix does not exist, is limited to a single letter.
Case 2. Assume that , and let be any infinite word. Looking at (10), we get
[TABLE]
Using Proposition 32, we see that is the longest palindromic prefix of that has already occurred in .
Case 3. Suppose that , and let be any infinite word. Proposition 32 shows that does not appear previously in . Hence, also satisfies the second part of the statement. ∎
Since the idea is to adopt the same strategy as in the previous case, we define a sequence of specific prefixes of the -bonacci word . This definition gives the sequence of prefixes of Definition 6 in the Fibonacci case as proved in Remark 40.
Definition 39.
For all , define
[TABLE]
where the words and are given in Definition 35. Notice that . From Proposition 37, also observe that, for all , we have
[TABLE]
Remark 40.
Let denote the sequence of words of Definition 6. For all , Proposition 22 gives
[TABLE]
This shows that Definition 39 agrees with Definition 6 when .
As in the Fibonacci case (see Proposition 7), any word can be written using p-singular words.
Proposition 41.
For all , we have
[TABLE]
Proof.
We proceed by induction on . For the base case , Definitions 35 and 39 give
[TABLE]
By Definition 19(1) (), we have
[TABLE]
Using Definition 35, we thus have
[TABLE]
as expected. Assume that , and suppose the result holds up to ; we show that it still holds for . Using Definition 39, we have . By the induction hypothesis, we get
[TABLE]
Rewriting using Definition 35, we find
[TABLE]
Since , from Definition 19(2), we have
[TABLE]
and we deduce that
[TABLE]
Consequently, from Definition 35, we obtain
[TABLE]
which ends the proof. ∎
4.4 The palindromic -factorization of the -bonacci word
In this section, we obtain the palindromic -factorization of the -bonacci word.
Lemma 42.
Let be a non-empty palindromic factor of .
- If begins with the letter [math], then , where is a palindromic factor of .
- If begins with the letter , then , where is a palindromic factor of .
Proof.
First, let us write where (resp., ) is a finite (resp., infinite) word over . By definition, we have
[TABLE]
The proof is by induction on . The result is certainly true when is a single letter (for instance, combine Propositions 37 and 38), so suppose .
Case 1a. Suppose begins with . If , then , as required (observe that is indeed a palindromic factor of : for instance, make use of Propositions 37 and 38). Suppose . By the induction hypothesis, we have , where is a palindromic factor of . We get . It is clear that is a palindrome. Let us show that is also a factor of , then we are done. From (11), being a factor of implies that it is also a factor of . By Lemma 17, is a factor of , so of .
Case 1b. Suppose begins with , where . Then . By the induction hypothesis, we have , where is a palindromic factor of . We get , as required.
Case 2. Suppose begins with , where . Then , where is a palindromic factor of that begins with [math]. By the induction hypothesis, we have for a palindromic factor of . We get . Again, it is easy to see that is a palindrome. It remains to prove that is a factor of . From (11), we deduce that being a factor of implies that it is a factor of too. Thus, is a factor of . By Lemma 17, is a factor of , so also of . ∎
Lemma 43.
Let . The set of palindromic prefixes of is
[TABLE]
Proof.
The proof is by induction on . The result is clearly true for . Suppose is even; the case where is odd is similar. First, let be a palindromic prefix of . Since is even, the word , and hence , begins with [math] by Proposition 24. By Lemma 42 (and also Proposition 34), we have where is a palindrome. By Lemma 29, the word is a palindromic prefix of . By the induction hypothesis, we have ; i.e., for some . By Lemma 29 again, we have . We have just shown that .
Let us prove that the other inclusion holds too. We clearly have . Now let . By Lemma 29, . By the induction hypothesis, we know that , i.e., there exists a non-empty word such that . Using Lemma 29 again, we get . By definition of , we have with . As a consequence, we find and , as expected. ∎
The last result of this section establishes the -factorization of the -bonacci word in terms of p-singular words.
Theorem 44.
The palindromic -factorization of the -bonacci word is
[TABLE]
Proof.
If , one simply has to combine Theorem 8 and Proposition 22. Now assume that . Let be the palindromic -factorization of the -bonacci word . We proceed by induction on to show that . From Proposition 34, we have
[TABLE]
and we see that the first two factors of the -factorization of the -bonacci word are and . Now suppose that . From (12), we deduce that
[TABLE]
To prove that , we need to show that every palindromic prefix of which is different from is a factor of
[TABLE]
This is clear from Lemma 43. ∎
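The statement of Theorem 44 can be checked experimentally on finite prefixes. The following Python sketch is an illustration, not part of the paper. It rests on two assumptions: the m-bonacci word is taken to be the fixed point of the morphism 0 → 01, 1 → 02, ..., (m-2) → 0(m-1), (m-1) → 0, and each factor is taken to be the shortest palindromic prefix of the remaining suffix that has no earlier occurrence in the word. The function names `mbonacci_prefix` and `palindromic_z_factorization` are hypothetical.

```python
def mbonacci_prefix(m, length):
    """Prefix of the m-bonacci word (assumed morphism: i -> 0(i+1), m-1 -> 0).

    Letters are single decimal digits, so this sketch assumes 2 <= m <= 10.
    """
    images = {str(i): "0" + str(i + 1) for i in range(m - 1)}
    images[str(m - 1)] = "0"
    word = "0"
    # Each iterate is a prefix of the next one, so we can simply iterate.
    while len(word) < length:
        word = "".join(images[c] for c in word)
    return word[:length]


def is_palindrome(s):
    return s == s[::-1]


def palindromic_z_factorization(word):
    """Greedy factorization of a finite word (assumed definition, see lead-in).

    At each position, take the shortest palindromic prefix of the remaining
    suffix whose first occurrence in `word` starts at that position.
    Stops as soon as the finite prefix is too short to decide the next factor.
    """
    factors = []
    pos = 0
    n = len(word)
    while pos < n:
        for end in range(pos + 1, n + 1):
            cand = word[pos:end]
            # word.find(cand) == pos means: no occurrence starts earlier.
            if is_palindrome(cand) and word.find(cand) == pos:
                factors.append(cand)
                pos = end
                break
        else:
            break  # no admissible factor fits inside this finite prefix
    return factors


if __name__ == "__main__":
    for m in (2, 3):
        print(m, palindromic_z_factorization(mbonacci_prefix(m, 200))[:6])
```

Only the factors that are fully determined by the chosen prefix are returned, so a longer prefix is needed to see more factors.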
4.5 The palindromic -factorization of the -bonacci word
In the following lemma, which is similar to Lemma 10, recall that we start indexing words at [math].
Lemma 45.
Let . There are exactly two occurrences of the factor inside the word : one at position , the other at position .
Proof.
First consider the case , and let . Using the notation introduced in Remark 40, Lemma 10 shows that there are exactly two occurrences of the factor inside the word , one occurring at position , the other at position , as desired.
Assume that . Let . Using Definition 35 and Proposition 41, let us write
[TABLE]
with and . Thanks to this factorization, we immediately see that occurs at least twice as a factor of : one starting at position , the other beginning at position . We now show that there are no other occurrences of as a factor of . There are several cases to consider.
Case 1. The word cannot be a factor of , otherwise it contradicts Proposition 31.
Case 2. The word cannot be a factor of , otherwise it contradicts Proposition 30.
Case 3. If the word were a factor of , then would be a factor of , since the p-singular words are palindromes. This is impossible due to Proposition 31.
Case 4. Suppose that is a factor of , overlapping and . Using Corollary 26 (), we know that
[TABLE]
Consequently, is a factor of , overlapping and .
If starts somewhere within , with , or if starts with the first letter of , then, in each case, is a factor of , which contradicts Proposition 30. Therefore the occurrence of must start after the first letter of in , i.e., there exist a non-empty suffix of and a non-empty prefix of such that
[TABLE]
Observe that, in this case, is also a non-empty prefix of . This contradicts Lemma 33.
Case 5. Suppose that is a factor of , overlapping and . There exist a non-empty suffix of and a non-empty prefix of such that
[TABLE]
In this case, notice that is also a non-empty suffix of . This violates Lemma 33.
Case 6. Suppose that is a factor of , overlapping and . Since the p-singular words are palindromes, we obtain that is a factor of , overlapping and . As in the fifth case, we reach a contradiction.
Case 7. Suppose that is a factor of , overlapping and . Then is a factor of , overlapping (at least) and . In view of the fourth case, we also reach a contradiction. ∎
The following result is the counterpart to Proposition 11.
Proposition 46.
Let . Let be a non-empty common finite prefix of the infinite words
[TABLE]
and
[TABLE]
Then is not a palindrome.
Proof.
If , the result directly follows from Propositions 11 and 22. Suppose that . We proceed by induction on . First, suppose . Then and . Note that begins with and begins with . The only possibility for is and in this case is not a palindrome, as required.
We now suppose that the result holds for and . We proceed by contradiction and suppose that is a palindrome.
We first observe that by Definition 35 and Lemma 29 we have if begins with [math] and if begins with . Thus either
[TABLE]
and
[TABLE]
or
[TABLE]
and
[TABLE]
By Definition 39, observe that is a factor of . Now using Lemma 42, we have either or , where is a palindrome and is a common prefix of and . This contradicts the induction hypothesis. We conclude that is not a palindrome, as required. ∎
In this last result, we obtain the -factorization of the -bonacci word in terms of p-singular words.
Theorem 47.
Let denote the palindromic -factorization of the -bonacci word . Then, the first factors are given by the factorization of given in Proposition 38, and, for all , .
Proof.
Consider first the case where . From Theorem 12, we know that , , and, for all ,
[TABLE]
We clearly have , and for all , we also get using Proposition 22. Assume now that . From Definition 35, Propositions 37 and 38, we have
[TABLE]
Using the definition of the palindromic -factorization and looking at (14), the first factors of are given by the factorization of given in Proposition 38 (recall the second part of Proposition 38: each , for , is the longest palindromic prefix of that has already occurred in , or if this prefix does not exist, is a single letter).
For the second part of the result, proceed by induction on . If , we must find the factor of the palindromic -factorization of . Using Definition 35, we have
[TABLE]
and from (13), we get
[TABLE]
On the one hand, observe that starts with by Definition 35, and . On the other hand, does not contain the letter by Proposition 32. Thus, the longest palindrome occurring before is .
If , we must find the factor of the palindromic -factorization of . Using Definition 35, we have
[TABLE]
Starting from (15) and using Definition 19(1), we have
[TABLE]
On the one hand, starts with by Definition 35, and . On the other hand, Proposition 32 shows that never occurs in . So the longest palindrome occurring before is .
For the induction step, suppose and assume that, for all , we have . We show it is still true for . On the one hand, using the induction hypothesis, Proposition 38 and finally Definition 39, we have
[TABLE]
and the goal is to find the next factor of the palindromic -factorization of , i.e., the word . On the other hand, using Proposition 37 first, then Definition 39 and finally Proposition 41, we get
[TABLE]
Comparing (16) and (17), we see that since is a palindrome occurring before in . Therefore, there exists a word such that . We claim that is in fact the empty word. For the following argument, recall that Definition 35 gives
[TABLE]
By Lemma 45, we know that there are exactly two occurrences of in : one starting at position , the other at position .
Case 1. Let us deal with the first occurrence of in . In this case, must be a common prefix of the infinite words
[TABLE]
and
[TABLE]
But by Proposition 46, we know that is not a palindrome unless is the empty word.
Case 2. Let us examine the second occurrence of in . In this case, the suffix of starts at position in . In particular, is a prefix of the infinite word
[TABLE]
Since , Definition 19 gives
[TABLE]
We now show that
[TABLE]
and
[TABLE]
Let us show (18). We have
[TABLE]
since implies that . Let us prove (19). We get
[TABLE]
Indeed, to see this, we make use of Corollary 26. If is even, then the result is clear. If is odd, then since . As a consequence of (18) and (19), must end with a non-empty prefix of , which contradicts Lemma 33.
Combining both cases, the longest palindrome occurring before is , as required. ∎
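Theorem 47 can likewise be explored numerically. The Python sketch below is an illustration only. It assumes the same morphism 0 → 01, 1 → 02, ..., (m-1) → 0 for the m-bonacci word, and it reads "has already occurred" as "occurs entirely inside the product of the previous factors" (no overlap with the current position). If the intended definition allows overlapping occurrences, the search would have to be adapted. The function names are hypothetical.

```python
def mbonacci_prefix(m, length):
    """Prefix of the m-bonacci word (assumed morphism: i -> 0(i+1), m-1 -> 0).

    Letters are single decimal digits, so this sketch assumes 2 <= m <= 10.
    """
    images = {str(i): "0" + str(i + 1) for i in range(m - 1)}
    images[str(m - 1)] = "0"
    word = "0"
    while len(word) < length:
        word = "".join(images[c] for c in word)
    return word[:length]


def is_palindrome(s):
    return s == s[::-1]


def palindromic_c_factorization(word):
    """Greedy factorization of a finite word (assumed definition, see lead-in).

    At each position, take the longest palindromic prefix of the remaining
    suffix that occurs inside the already factorized part, or a single
    letter if no such prefix exists.
    """
    factors = []
    pos = 0
    n = len(word)
    # A candidate has length at most pos, so 2 * pos <= n guarantees that
    # every candidate is fully visible inside the finite prefix.
    while pos < n and 2 * pos <= n:
        seen = word[:pos]
        best = word[pos]  # fallback: a single letter
        for length in range(1, pos + 1):
            cand = word[pos:pos + length]
            if is_palindrome(cand) and cand in seen:
                best = cand  # lengths increase, so the last hit is the longest
        factors.append(best)
        pos += len(best)
    return factors


if __name__ == "__main__":
    for m in (2, 3):
        print(m, palindromic_c_factorization(mbonacci_prefix(m, 200))[:6])
```

The stopping rule `2 * pos <= n` is a conservative design choice: it discards factors near the end of the prefix whose maximality cannot be certified from the finite data.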
5 Open Problems
Problem 48.
Let be a finite alphabet of size , . Define the family of infinite words over such that . For instance, when , observe that (Proposition 5 and Theorem 8) and
[TABLE]
but the Thue–Morse word
[TABLE]
which is the fixed point of the morphism does not belong to . Give a characterization of this family. Also give a characterization of the set of automatic words among this family.
The -bonacci words belong to the family of episturmian words. When studying episturmian words, it is standard to introduce a particular sequence of finite words related to their directive word and palindromic prefixes. Justin and Pirillo [8] showed that the sequence of palindromic prefixes of a standard episturmian word verifies
[TABLE]
When it comes to -bonacci words, the sequence coincides with the one from Definition 13. From Lemma 20, we can also show that for all , we have
[TABLE]
which in particular gives another way of showing that the words are all palindromes. We observe that the sequences and are closely related, so a natural question is the following open problem.
Problem 49.
Find the palindromic - and -factorizations of other infinite words such as the Thue–Morse word, or more specifically episturmian words, billiard words, or rich words.
As we observed in Section 4.2, Lemma 20 gives another definition of the p-singular words, which may possibly be the more useful one when trying to extend the results of this paper to episturmian words (see the ordinary - and -factorizations of episturmian words given by Ghareghani, Mohammad-Noori, and Sharifani [6]).
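The palindromic prefixes of a standard episturmian word, as studied by Justin and Pirillo, can be generated by iterated palindromic closure. The Python sketch below is an illustration under two assumptions that are not spelled out in this section: the recurrence is taken in its usual form, where the next palindromic prefix is the shortest palindrome having the previous one followed by the next directive letter as a prefix, and the directive word of the m-bonacci word is taken to be the periodic word (0 1 ... m-1)(0 1 ... m-1)... The function names are hypothetical.

```python
def palindromic_closure(word):
    """Shortest palindrome having `word` as a prefix."""
    for i in range(len(word) + 1):
        suffix = word[i:]
        if suffix == suffix[::-1]:
            # word[i:] is the longest palindromic suffix; mirror what precedes it.
            return word + word[:i][::-1]


def palindromic_prefixes(m, count):
    """First `count` + 1 palindromic prefixes, starting from the empty word.

    Assumes the directive word 0, 1, ..., m-1, 0, 1, ... (single-digit
    letters, so 2 <= m <= 10).
    """
    prefixes = [""]
    for n in range(count):
        letter = str(n % m)
        prefixes.append(palindromic_closure(prefixes[-1] + letter))
    return prefixes


if __name__ == "__main__":
    print(palindromic_prefixes(2, 4))
    print(palindromic_prefixes(3, 4))
```

Each word in the returned list is a palindrome and a prefix of the next one, which makes the list convenient for testing conjectures on other episturmian words by changing the directive letters.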
Acknowledgements
We thank the anonymous referee for their useful and thorough proofreading of a first draft of this paper. Their comments helped us significantly.
Morteza Mohammad-noori is supported in part by a grant from IPM No. 96050113. Narad Rampersad is supported by NSERC Discovery Grant 418646-2012. Manon Stipulanti is supported by FRIA Grant 1.E030.16.
References
- [1] J.-P. Allouche, M. Baake, J. Cassaigne, D. Damanik, “Palindrome complexity”, Theoret. Comput. Sci. 292 (2003), 9–31.
- [2] J. Berstel and A. Savelli, “Crochemore factorization of Sturmian and other infinite words”. In Proc. MFCS’06, LNCS 4162, Springer, 2006, pp. 157–166.
- [3] S. Constantinescu, L. Ilie, “The Lempel–Ziv complexity of fixed points of morphisms”, SIAM J. Discrete Math. 21 (2007), 466–481.
- [4] M. Crochemore, “Recherche linéaire d’un carré dans un mot”, C. R. Math. Acad. Sci. Paris 296 (1983), 781–784.
- [5] G. Fici, “Factorizations of the Fibonacci infinite word”, J. Integer Seq. 18 (2015), no. 9, Article 15.9.3, 14 pp.
- [6] N. Ghareghani, M. Mohammad-Noori, P. Sharifani, “On $z$-factorization and $c$-factorization of standard episturmian words”, Theoret. Comput. Sci. 412 (2011), 5232–5238.
- [7] A. Glen, “On Sturmian and Episturmian Words, and Related Topics”, Ph.D. Thesis (2006).
- [8] J. Justin and G. Pirillo, “Episturmian words and episturmian morphisms”, Theoret. Comput. Sci. 276 (2002), no. 1-2, 281–313.
