On some interesting ternary formulas
Pascal Ochem, Matthieu Rosenfeld

Pascal Ochem (LIRMM, CNRS, Université de Montpellier, France. [email protected])
Matthieu Rosenfeld (LIP, ENS de Lyon, CNRS, UCBL, Université de Lyon, France. [email protected])
Abstract
We obtain the following results about the avoidance of ternary formulas. Up to renaming of the letters, the only infinite ternary words avoiding the formula (resp. ) have the same set of recurrent factors as the fixed point of , , . The formula is avoided by polynomially many binary words and there exist arbitrarily many infinite binary words with different sets of recurrent factors that avoid it. If every variable of a ternary formula appears at least twice in the same fragment, then the formula is -avoidable. The pattern ABACADABCA is unavoidable for the class of -minor-free graphs with maximum degree . This disproves a conjecture of Grytczuk. The formula , or equivalently the palindromic pattern , has avoidability index .
Acknowledgements: This work was partially supported by the ANR project CoCoGro (ANR-16-CE40-0005).
1 Introduction
A pattern $p$ is a non-empty finite word over an alphabet $\Delta = \{A, B, C, \ldots\}$ of capital letters called variables. An occurrence of $p$ in a word $w$ is a non-erasing morphism $h : \Delta^* \to \Sigma^*$ such that $h(p)$ is a factor of $w$. The avoidability index $\lambda(p)$ of a pattern $p$ is the size of the smallest alphabet $\Sigma$ such that there exists an infinite word over $\Sigma$ containing no occurrence of $p$.
A variable that appears only once in a pattern is said to be isolated. Following Cassaigne [4], we associate a pattern $p$ with the formula $f$ obtained by replacing every isolated variable in $p$ by a dot. For example, the pattern gives the formula . The factors that are separated by dots are called fragments. So , , and are the fragments of .
An occurrence of a formula $f$ in a word $w$ is a non-erasing morphism $h : \Delta^* \to \Sigma^*$ such that the $h$-image of every fragment of $f$ is a factor of $w$. As for patterns, the avoidability index $\lambda(f)$ of a formula $f$ is the size of the smallest alphabet allowing the existence of an infinite word containing no occurrence of $f$. Clearly, if a formula $f$ is associated with a pattern $p$, every word avoiding $f$ also avoids $p$, so $\lambda(p) \le \lambda(f)$. Recall that an infinite word is recurrent if every finite factor appears infinitely many times. If there exists an infinite word over $\Sigma$ avoiding $p$, then there exists an infinite recurrent word over $\Sigma$ avoiding $p$. This recurrent word also avoids $f$, so that $\lambda(p) = \lambda(f)$. Without loss of generality, a formula is such that no variable is isolated and no fragment is a factor of another fragment. We say that a formula $f$ is divisible by a formula $f'$ if $f$ does not avoid $f'$, that is, there is a non-erasing morphism $h$ such that the image of any fragment of $f'$ under $h$ is a factor of a fragment of $f$. If $f$ is divisible by $f'$, then every word avoiding $f'$ also avoids $f$. Let $\Sigma_k = \{0, 1, \ldots, k-1\}$ denote the $k$-letter alphabet. We denote by $\Sigma_k^n$ the $k^n$ words of length $n$ over $\Sigma_k$.
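The definition of an occurrence can be made concrete with a small brute-force test. The following is only a sketch for short finite words, not the procedure used in the paper: the function names `factors` and `contains_formula` are hypothetical, formulas are written as strings of capital letters with `.` between fragments, and the search simply tries every non-empty factor of the word as the image of each variable.

```python
def factors(word):
    """Return the set of all non-empty factors (contiguous substrings) of word."""
    n = len(word)
    return {word[i:j] for i in range(n) for j in range(i + 1, n + 1)}


def contains_formula(word, formula):
    """Return True if the finite word contains an occurrence of the formula.

    A pattern is treated as a formula with a single fragment.
    Hypothetical helper: exponential in the number of variables.
    """
    fragments = formula.split(".")
    variables = sorted(set(formula) - {"."})
    # Every variable lies in some fragment, so its image must be a factor of word.
    candidates = sorted(factors(word))

    def image(fragment, h):
        return "".join(h[v] for v in fragment)

    def extend(i, h):
        if i == len(variables):
            return all(image(frag, h) in word for frag in fragments)
        for u in candidates:
            h[variables[i]] = u
            # Prune: fragments whose variables are all assigned must already be factors.
            assigned_ok = all(
                image(frag, h) in word
                for frag in fragments
                if all(v in h for v in frag)
            )
            if assigned_ok and extend(i + 1, h):
                return True
        h.pop(variables[i], None)
        return False

    return extend(0, {})
```

For instance, `contains_formula("0110", "AB.BA")` finds the occurrence sending A to 0 and B to 1, whereas `contains_formula("01", "AB.BA")` fails because 10 is not a factor of 01.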
A formula is binary if it has at most 2 variables. We have recently determined the avoidability index of every binary formula [13]. This exhaustive study led to the discovery of some interesting binary formulas that are avoided by only a few binary words. Determining the avoidability index of every ternary formula would be a huge task. However, we have identified some interesting ternary formulas and this paper describes their properties.
We say that two infinite words are equivalent if they have the same set of factors. Let $b_3$ be the fixed point of $0 \mapsto 012$, $1 \mapsto 02$, $2 \mapsto 1$. A famous result of Thue [2, 14, 15] can be stated as follows:
Theorem 1.
Every recurrent ternary word avoiding squares, 010, and 212 is equivalent to $b_3$.
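As a sanity check of the easy direction of Theorem 1, one can generate a prefix of the fixed point and test the three forbidden configurations. This sketch assumes that the morphism is the classical one, 0 to 012, 1 to 02, 2 to 1; the helper names `fixed_point_prefix` and `has_square` are hypothetical. A finite prefix can only illustrate the statement, it proves nothing about the infinite word.

```python
def fixed_point_prefix(morphism, start, length):
    """Iterate the morphism from `start` until the word has at least `length` letters.

    Assumes morphism[start] begins with `start`, so each iterate is a prefix
    of the next one and the result is a prefix of the fixed point.
    """
    word = start
    while len(word) < length:
        word = "".join(morphism[c] for c in word)
    return word[:length]


def has_square(word):
    """Return True if the word contains a factor uu with u non-empty."""
    n = len(word)
    for i in range(n):
        for half in range(1, (n - i) // 2 + 1):
            if word[i:i + half] == word[i + half:i + 2 * half]:
                return True
    return False


# Assumed morphism (classical Thue morphism on three letters).
THUE = {"0": "012", "1": "02", "2": "1"}
prefix = fixed_point_prefix(THUE, "0", 300)
```

On this prefix, `has_square(prefix)` is false and neither `"010"` nor `"212"` occurs as a substring.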
In Section 2, we obtain a similar result for by forbidding one ternary formula but without forbidding explicit factors in . In Section 3, we describe the set of binary words avoiding and . We show that these formulas are avoided by polynomially many binary words and that there exist infinitely many recurrent binary words with different sets of recurrent factors that avoid them. In the terminology of [13], these formulas are not essentially avoided by a finite set of words. In Section 4, we consider nice formulas. A formula $f$ is nice if for every variable $X$ of $f$, there exists a fragment of $f$ that contains $X$ at least twice. This notion generalizes to formulas the notion of a doubled pattern (that is, a pattern that contains every variable at least twice). Every doubled pattern is 3-avoidable [12]. We show that every ternary nice formula is 3-avoidable. In Section 5, we show that ABACADABCA is a 2-avoidable pattern that is unavoidable on graphs with maximum degree . In Section 6, we show that there exists a palindromic pattern with index .
A preliminary version of this paper, without the results in Sections 4 and 6, has been presented at WORDS 2017.
2 Formulas closely related to
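For every letter $c \in \Sigma_3$, $\sigma_c : \Sigma_3^* \to \Sigma_3^*$ is the morphism such that $\sigma_c(a) = b$, $\sigma_c(b) = a$, and $\sigma_c(c) = c$ with $\{a, b, c\} = \Sigma_3$. So $\sigma_c$ is the morphism that fixes $c$ and exchanges the two other letters.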
We consider the following formulas.
- •
- •
- •
- •
- •
Notice that is divisible by , , , .
Theorem 2.
Let . Every ternary recurrent word avoiding is equivalent to , , or .
By considering divisibility, we can deduce that Theorem 2 holds for ternary formulas. Since , , and are equivalent to their reverse, Theorem 2 also holds for the reverse ternary formulas.
Proof.
Using Cassaigne’s algorithm [3], we have checked that avoids , for . By symmetry, and also avoid .
Let be a ternary recurrent word avoiding . Suppose to get a contradiction that contains a square . Then there exists a non-empty word such that is a factor of . Thus, contains an occurrence of given by the morphism . This contradiction shows that is square-free.
An occurrence of a ternary formula over is said to be basic if . As already noticed by Thue [2], no infinite ternary word avoids squares and 012. So, every infinite ternary square-free word contains the factors obtained by letter permutation of 012. Thus, an infinite ternary square-free word contains a basic occurrence of if and only if it contains the same basic occurrence of . Therefore, contains no basic occurrence of .
A computer check shows that the longest ternary words avoiding , squares, 021020120, 102101201, and 210212012 have length . So we assume without loss of generality that contains 021020120.
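A computer check of this kind is typically an exhaustive backtracking search over the tree of words that avoid the forbidden configurations. The sketch below is a simplified stand-in, not the authors' program: it forbids only squares and a list of explicit factors (it does not handle the formula), and the names `ends_with_square` and `longest_avoiding` as well as the `limit` parameter are hypothetical.

```python
def ends_with_square(word):
    """Return True if some non-empty square uu is a suffix of the word."""
    n = len(word)
    return any(
        word[n - 2 * half:n - half] == word[n - half:]
        for half in range(1, n // 2 + 1)
    )


def longest_avoiding(alphabet, forbidden_factors, limit):
    """Depth-first search over square-free words avoiding the given factors.

    Returns a longest such word if the search tree is finite, or None as soon
    as a word of length `limit` is reached (the tree may then be infinite).
    Only suffixes are tested, since every proper prefix was already checked.
    """
    best = ""
    stack = [""]
    while stack:
        word = stack.pop()
        if len(word) > len(best):
            best = word
        if len(word) >= limit:
            return None
        for letter in alphabet:
            extended = word + letter
            if ends_with_square(extended):
                continue
            if any(extended.endswith(f) for f in forbidden_factors):
                continue
            stack.append(extended)
    return best
```

For example, `longest_avoiding("01", [], 50)` returns a word of length 3, the classical bound for binary square-free words, while `longest_avoiding("012", [], 30)` returns `None` because ternary square-free words of every length exist. Calling `longest_avoiding("012", ["012"], 200)` explores the finite tree behind Thue's remark used in the proof above.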
Suppose to get a contradiction that contains 010. Since is square-free, contains 20102. Moreover, contains the factor 20120 of 021020120. So contains the basic occurrence , , of . This contradiction shows that avoids 010.
Suppose to get a contradiction that contains 212. Since is square-free, contains 02120. Moreover, contains the factor 021020 of 021020120. So contains the basic occurrence , , of . This contradiction shows that avoids 212.
Since avoids squares, 010, and 212, Theorem 1 implies that is equivalent to . By symmetry, every ternary recurrent word avoiding is equivalent to , , or . ∎
3 Avoidability of and
Following the terminology in [13], we say that a finite set of infinite words essentially avoids a formula if every infinite word over avoiding has the same set of recurrent factors as a word in . In this terminology, Theorem 2 says that essentially avoids many ternary formulas. Let be the fixed point of , , , , let be obtained from by exchanging 0 and 1, and let be obtained from by exchanging 0 and 3. Then essentially avoids [1]. Finally, five binary formulas [13] are known to be essentially avoided by a finite set of binary morphic words.
Thus, every formula that is known to be avoided by polynomially many words is actually essentially avoided by a finite set of morphic words. In this section, we show in particular that this is not the case for and .
We consider the morphisms , and , . That is, and for every .
We construct the set of binary words as follows:
- •
.
- •
If , then and .
- •
If and is a factor of , then .
Theorem 3.
Let . The set of words such that is recurrent in an infinite binary word avoiding is .
Proof.
Let be the set of words such that is recurrent in an infinite binary word avoiding . Let be the set of words such that is recurrent in an infinite binary word avoiding . An occurrence of is also an occurrence of , so that .
Let us show that . We study the small factors of a recurrent binary word avoiding . Notice that avoids the pattern since it contains the occurrence , , of . Since contains recurrent factors only, also avoids .
A computer check shows that the longest binary words avoiding , , 1001101001, and 0110010110 have length . So we assume without loss of generality that contains 1001101001.
Suppose to get a contradiction that contains 1100. Since avoids , contains 011001. Then contains the occurrence of . This contradiction shows that avoids 1100.
Since contains 0110, the occurrence of shows that avoids 01010. Similarly, contains 1001 and avoids 10101.
Suppose to get a contradiction that contains 0101. Since avoids 01010 and 10101, contains 001011. Moreover, avoids , so contains 10010110. Then contains the occurrence of . This contradiction shows that avoids 0101.
So avoids every factor in . Thus, it is easy to check that if we extend any factor 01 in to three letters to the right, we get either 01001 or 01101, that is, with . This implies that is the -image of some binary word.
Obviously, the image by a non-erasing morphism of a word containing a formula also contains the formula. Thus, the pre-image of by also avoids . This shows that .
Let us show that , that is, every word in avoids . We suppose to get a contradiction that a finite word avoids and that contains an occurrence of .
If we write , then the word is such that:
- •
Every factor 00 occurs at position .
- •
Every factor 01 occurs at position .
- •
Every factor 11 occurs at position .
- •
Every factor 10 occurs at position [math] or , depending on whether a factor is 100 or 110.
We say that a factor is gentle if either or . By previous remarks, all the occurrences of the same gentle factor have the same position modulo 3.
First, we consider the case such that is gentle. This implies that the distance between two occurrences of is . Because the repetitions , , and are contained in the formula, we deduce that
- •
.
- •
.
- •
.
This gives . Clearly, such an occurrence of the formula in implies an occurrence of the formula in , which is a contradiction.
Now we consider the case such that is gentle. If is also gentle, then the factors and imply that . Thus, is gentle and the first case applies. If is not gentle, then , that is, and . Thus, contains both and . Since is gentle, this implies that 01 and 10 have the same position modulo 3, which is impossible.
The case such that is gentle is symmetrical. If is gentle, then and imply that . If is not gentle, then and . Thus, contains both and . Since is gentle, this implies that 10 and 01 have the same position modulo 3, which is impossible.
Finally, if , , and are not gentle, then the length of the three fragments of the formula is . So it suffices to consider the factors of length at most in to check that no such occurrence exists.
This shows that . Since , we obtain , which proves Theorem 3. ∎
Corollary 4.
Neither nor is essentially avoided by a finite set of morphic words.
Proof.
Let denote the number of words of length in . By construction of ,
[TABLE]
Thus . Devyatov [7] has recently shown that the factor complexity (i.e. the number of factors of length ) of a morphic word is either or for some integer . Thus, cannot be the union of the factors of a finite number of morphic words. ∎
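Factor complexity is easy to estimate experimentally on a finite prefix. The sketch below uses the Thue–Morse word as a stand-in morphic word; this choice, and the helper names `thue_morse_prefix` and `factor_complexity`, are assumptions made for illustration and do not come from the section. Counts taken from a finite prefix are only lower bounds on the true complexity, although they are exact for small lengths once the prefix is long enough.

```python
def thue_morse_prefix(length):
    """Return the prefix of the given length of the Thue-Morse word.

    The word is the fixed point of the morphism 0 -> 01, 1 -> 10 starting at 0.
    """
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "10" for c in word)
    return word[:length]


def factor_complexity(word, n):
    """Count the distinct factors of length n that occur in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})


tm = thue_morse_prefix(1024)
complexities = [factor_complexity(tm, n) for n in range(1, 5)]
```

Here `complexities` evaluates to `[2, 4, 6, 10]`, the known first values for the Thue–Morse word. The same function applied to words of each length in a language gives the kind of growth comparison used in the proof of Corollary 4.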
4 Ternary nice formulas
Clark [5] introduced the notion of -avoidance basis for formulas, which is the smallest set of formulas with the following property: for every , every avoidable formula with variables is divisible by at least one formula with at most variables in the -avoidance basis. See [5, 8] for more discussions about the -avoidance basis. The avoidability index of every formula in the -avoidance basis has been determined:
- •
( [14])
- •
( [4])
- •
( [8])
- •
( [8])
- •
( [8])
- •
(, reverse of )
- •
( [1])
Recall that a formula $f$ is nice if for every variable $X$ of $f$, there exists a fragment of $f$ that contains $X$ at least twice. Every formula in the -avoidance basis except is both nice and -avoidable. This raised the question in [8] whether every nice formula is 3-avoidable, which would generalize the 3-avoidability of doubled patterns. In this section, we answer this question positively for ternary formulas.
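The definition of a nice formula translates directly into a short predicate. This is a minimal sketch with a hypothetical function name `is_nice`; formulas are written as strings of capital letters with `.` separating fragments.

```python
def is_nice(formula):
    """Return True if every variable occurs at least twice in some single fragment."""
    fragments = formula.split(".")
    variables = set(formula) - {"."}
    return all(
        any(fragment.count(v) >= 2 for fragment in fragments)
        for v in variables
    )
```

For example, `is_nice("ABA.BAB")` holds because A is repeated in ABA and B in BAB, while `is_nice("AB.BA")` fails because no fragment repeats a variable. A doubled pattern is nice exactly because its single fragment contains every variable at least twice.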
Theorem 5.
Every nice formula with at most variables is -avoidable.
We say that a nice formula is minimal if it is not divisible by another nice formula with at most the same number of variables. The following property of every minimal nice formula is easy to derive. If a variable appears as a prefix of a fragment , then
- •
is also a suffix of ,
- •
contains exactly two occurrences of ,
- •
is neither a prefix nor a suffix of any fragment other than ,
- •
Every fragment other than contains at most one occurrence of .
Thus, if is a minimal nice formula with variables, then has at most fragments. Moreover, every fragment has length at most , since otherwise it would contain a doubled pattern as a factor.
This implies an algorithm to list the minimal nice formulas with at most variables. The table below lists the formulas that need to be shown -avoidable, that is, the minimal nice formulas with at most variables that do not belong to the -avoidance basis. Also, if two distinct formulas are the reverse of each other, then only one of them appears in the table and the given avoiding word avoids both formulas. Some of these formulas are avoided by and the proof uses Cassaigne’s algorithm [3] as in Section 2. The other formulas are each avoided by the image by a uniform morphism of either any infinite -free word over or any infinite -free word over . We refer to [11, 12] for details about the technique to prove avoidance with morphic images of Dejean words.
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
[TABLE]
5 A counter-example to a conjecture of Grytczuk
Grytczuk [9] considered the notion of pattern avoidance on graphs. This generalizes the definition of nonrepetitive coloring, which corresponds to the pattern . Given a pattern and a graph , the avoidability index is the smallest number of colors needed to color the vertices of such that every non-intersecting path in induces a word avoiding .
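For a small finite graph, avoidance along paths can be checked by enumerating every non-intersecting path. The sketch below treats only the square pattern AA, which is the pattern forbidden by nonrepetitive colorings; the adjacency-dictionary representation and the names `simple_paths`, `is_square` and `is_nonrepetitive` are assumptions made for illustration. An undirected graph is given as a symmetric directed graph.

```python
def simple_paths(adj):
    """Yield every non-intersecting directed path as a list of vertices."""
    def extend(path, seen):
        yield path
        for v in adj[path[-1]]:
            if v not in seen:
                yield from extend(path + [v], seen | {v})

    for start in adj:
        yield from extend([start], {start})


def is_square(word):
    """Return True if the sequence has the form uu with u non-empty."""
    half, remainder = divmod(len(word), 2)
    return remainder == 0 and half > 0 and word[:half] == word[half:]


def is_nonrepetitive(adj, color):
    """Return True if no non-intersecting path induces a square.

    Every sub-path of a path is itself enumerated, so it suffices to test
    whether the whole word read along each path is a square.
    """
    return not any(
        is_square([color[v] for v in path]) for path in simple_paths(adj)
    )


# The 4-cycle, viewed as a symmetric directed graph.
C4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
```

With this graph, the coloring `{0: 0, 1: 1, 2: 0, 3: 1}` is rejected because the path 0, 1, 2, 3 reads 0101, while `{0: 0, 1: 1, 2: 2, 3: 1}` is accepted. Enumerating all paths is exponential, so this only serves to experiment on very small graphs.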
We think that the natural framework is that of directed graphs, and we consider only non-intersecting paths that are oriented from a starting vertex to an ending vertex. This way, where is the infinite oriented path with vertices and arcs , for every . The directed graphs that we consider have no loops and no multiple arcs, since they do not modify the set of non-intersecting oriented paths. However, opposite arcs (i.e., digons) are allowed. Thus, an undirected graph is viewed as a symmetric directed graph: for every pair of distinct vertices and , either there exists no arc between and , or there exist both the arcs and . Let denote the infinite undirected path. We are nitpicking about directed graphs because, even though , there exist patterns such that . For example, and .
We do not attempt the hazardous task of defining a notion of avoidance for formulas on graphs.
A conjecture of Grytczuk [9] says that for every avoidable pattern , there exists a function such that , where is an undirected graph and denotes its maximum degree. Grytczuk [9] obtained that his conjecture holds for doubled patterns.
As a counterexample, we consider the pattern ABACADABCA, which is 2-avoidable by the result in Section 3. Of course, ABACADABCA is not doubled because of the isolated variable D. Let us show that ABACADABCA is unavoidable on the infinite oriented graph with vertices and arcs and , for every . Notice that is obtained from by adding the arcs . The constant in the construction is arbitrary and can be replaced by any constant.
Suppose that is colored with colors. Consider the factors in the subgraph induced by the paths from to , for every . Since these factors have bounded length, the same factor appears on two disjoint such paths and (such that is on the left of ). Notice that contains vertices with index . By the pigeon-hole principle, contains three such vertices with the same color . Thus, contains an occurrence of such that on vertices with index . The same is true for . In , the occurrences of in and imply an occurrence of since we can skip an occurrence of the variable in thanks to some arc of the form .
This shows that is unavoidable on . So Grytczuk’s conjecture is disproved since has maximum degree . It is also a counterexample to Conjecture 6 in [6] which states that every avoidable pattern is avoidable on the infinite graph with vertices and the arcs and for every .
6 A palindrome with index
Mikhailova [10] considered the largest avoidability index of an avoidable pattern that is a palindrome. She proved that . An obvious lower bound is . For a better lower bound, we consider the palindromic pattern or, equivalently, the ternary formula . Since it is a ternary formula, is -avoidable. It remains to show that is not -avoidable. Let be a ternary recurrent word avoiding . Suppose to get a contradiction that contains a square . Then there exists a non-empty word such that is a factor of . Thus, contains an occurrence of given by the morphism . This contradiction shows that is square-free. A computer check shows that no infinite ternary square-free word avoids . This holds even if we forbid only squares and every occurrence of such that and . Thus, .
References
[1] K. A. Baker, G. F. McNulty, and W. Taylor. Growth problems for avoidable words. Theoret. Comput. Sci., 69(3):319–345, 1989.
[2] J. Berstel. Axel Thue’s papers on repetitions in words: a translation, volume 20 of Publications du LACIM. Université du Québec à Montréal, 1994.
[3] J. Cassaigne. An Algorithm to Test if a Given Circular HDOL-Language Avoids a Pattern. IFIP Congress, pages 459–464, 1994.
[4] J. Cassaigne. Motifs évitables et régularité dans les mots. PhD thesis, Université Paris VI, 1994.
[5] R. J. Clark. Avoidable formulas in combinatorics on words. PhD thesis, University of California, Los Angeles, 2001.
[6] M. Debski, U. Pastwa, and K. Wesek. Grasshopper avoidance of patterns. Electron. J. Combin., 23(4):#P4.17, 2016.
[7] R. Devyatov. On subword complexity of morphic sequences. Math. USSR Sbornik.
[8] G. Gamard, P. Ochem, G. Richomme, and P. Séébold. Avoidability of circular formulas. Theor. Comput. Sci., 726:1–4, 2018.
