Characteristic Parameters and Special Trapezoidal Words
Alma D'Aniello, Alessandro De Luca

TL;DR
This paper investigates the properties of trapezoidal words, focusing on their characteristic parameters and classifications, including their prefixes, symmetry, and special types, extending previous work on Sturmian words.
Contribution
It provides new characterizations of right special and strictly bispecial trapezoidal words, expanding understanding of their structure and parameters.
Findings
Characterization of right special trapezoidal words
Identification of strictly bispecial trapezoidal words
Analysis of prefixes' symmetry and periodicity
Abstract
Following earlier work by Aldo de Luca and others, we study trapezoidal words and their prefixes, with respect to their characteristic parameters $K$ and $R$ (length of shortest unrepeated suffix, and shortest length without right special factors, respectively), as well as their symmetric versions $H$ and $L$. We consider the distinction between closed (i.e., periodic-like) and open prefixes, and between Sturmian and non-Sturmian ones. Our main results characterize right special and strictly bispecial trapezoidal words, as done by de Luca and Mignosi for Sturmian words.
Alma D’Aniello: Dipartimento di Matematica e Applicazioni “R. Caccioppoli”, Università degli Studi di Napoli Federico II, Italy
Alessandro De Luca: DIETI, Università degli Studi di Napoli Federico II, via Claudio 21, 80125 Napoli, Italy
Keywords: Trapezoidal word · Closed word · Periodic-like word
1 Introduction
Sturmian words are certainly among the most studied objects in combinatorics on words, thanks to their natural definition, interesting characterizations, and numerous applications in several fields; see [1, 11] for surveys. An infinite word is Sturmian if it has exactly $n+1$ distinct factors (blocks of consecutive letters) for each length $n \geq 0$.
Trapezoidal words, introduced in [4, 8], are a natural finite analogue of Sturmian words. They have at most $n+1$ factors of each length $n$, so that the graph of their factor complexity function is in the shape of an isosceles trapezoid (or triangle), whence their name.
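The factor-count condition can be checked mechanically on small examples. The following Python sketch is only an illustration: the names `factor_complexity` and `is_trapezoidal` are ours, and the test simply takes the bound of $n+1$ factors per length at face value.

```python
def factor_complexity(w):
    """Return [C(0), ..., C(|w|)], where C(n) counts the distinct factors of length n."""
    return [len({w[i:i + n] for i in range(len(w) - n + 1)})
            for n in range(len(w) + 1)]


def is_trapezoidal(w):
    """True if w has at most n + 1 distinct factors of every length n."""
    return all(c <= n + 1 for n, c in enumerate(factor_complexity(w)))


if __name__ == "__main__":
    for word in ("abaab", "aabb", "aabba"):
        print(word, factor_complexity(word), is_trapezoidal(word))
```

For `aabb` the complexity profile is `[1, 2, 3, 2, 1]`, the triangular case, whereas `aabba` already has four factors of length 2 and is rejected.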
The original definition of trapezoidal words, however, uses characteristic parameters $K$ and $R$. For a finite word $w$, $K_w$ denotes the length of the shortest unrepeated suffix of $w$, whereas $R_w$ denotes the smallest integer $n$ such that $w$ has no right special factor of length $n$. A word $w$ is then trapezoidal if and only if its length verifies $|w| = R_w + K_w$ (with $|w| \geq R_w + K_w$ true in general, see [8]). Finite Sturmian words, i.e., factors of Sturmian words, are trapezoidal, but there exist non-Sturmian trapezoidal words such as $aabb$.
In [2], the property of being closed (aka periodic-like) or open is considered for trapezoidal words. In particular, the case of (prefixes of) the Fibonacci word was completely characterized, and this was later (cf. [5]) extended to all characteristic Sturmian words. Our first aim, explored in Section 3, is to extend some of those arguments to the general trapezoidal case, with respect to the values of characteristic parameters for closed and open prefixes.
In [7], a characterization was given for right (resp. left) special Sturmian words, i.e., finite words $w$ over $A = \{a, b\}$ such that both extensions $wa, wb$ (resp. $aw, bw$) are Sturmian. Previously, in [9], strictly bispecial ones (that is, words $w$ such that $awa, awb, bwa, bwb$ are all Sturmian) had been characterized; in particular, these turn out to be the noteworthy family of central words. In [6], bispecial (i.e., simultaneously right and left special) Sturmian words were characterized. Our main objective in Section 4 is to give similar characterizations for special trapezoidal words. Special words in a language are often useful for dealing with enumerative and structural questions.
2 Notation and Preliminaries
Let $A$ be an alphabet. The free monoid of all words over $A$ under concatenation is denoted by $A^*$; its neutral element is the empty word $\varepsilon$. For $w \in A^*$ and $x \in A$, $|w|_x$ denotes the number of occurrences of $x$ in $w$.
If $w = uvz$, we say that $v$ is a factor of $w$, $u$ is a prefix, and $z$ is a suffix. A border of $w$ is a word that is simultaneously a proper prefix and a suffix of $w$. The definitions of factor and prefix also apply to (right-)infinite words over $A$. A word $u$ is a right (resp. left) special factor of a finite or infinite word $w$ over $A$ if $ua$ and $ub$ (resp. $au$, $bu$) are both factors of $w$, where $a \neq b$ are letters of $A$.
As anticipated above, an infinite word is Sturmian if it has exactly $n+1$ distinct factors of each length $n \geq 0$. Therefore Sturmian words are the simplest aperiodic words in terms of factor complexity, and this is one of the reasons of interest in their study (see [1, 11]). Equivalently, a binary infinite word is Sturmian if and only if it has exactly one right (resp. left) special factor of each length. In particular, a Sturmian word is called standard or characteristic if all its left special factors occur as prefixes.
Among the many known characterizations of factors of Sturmian words, or finite Sturmian words, perhaps the most famous and widely used one deals with balance: a finite word $w$ over $\{a, b\}$ is Sturmian if and only if there is no word $u$ such that $aua$ and $bub$ are both factors of $w$. Such a pair $(aua, bub)$ of factors for a non-Sturmian word is called a pathological pair (cf. [2, 4]).
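The balance criterion translates directly into a brute-force search for pathological pairs. The sketch below is illustrative only: it assumes the alphabet is exactly `{a, b}`, and the helper names `all_factors`, `pathological_pairs` and `is_finite_sturmian` are ours.

```python
def all_factors(w):
    """Set of all factors of w, including the empty word."""
    return {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}


def pathological_pairs(w):
    """List the pairs (aua, bub) of factors of w, shortest first."""
    facs = all_factors(w)
    pairs = []
    for f in facs:
        # f must have the form a u a, with u possibly empty
        if len(f) >= 2 and f[0] == "a" and f[-1] == "a":
            g = "b" + f[1:-1] + "b"
            if g in facs:
                pairs.append((f, g))
    return sorted(pairs, key=lambda p: (len(p[0]), p))


def is_finite_sturmian(w):
    """True if the binary word w has no pathological pair."""
    return not pathological_pairs(w)
```

For instance, `pathological_pairs("aabb")` finds the single pair `("aa", "bb")`, while a prefix of the Fibonacci word such as `abaab` has none.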
Central words are palindromic prefixes of characteristic Sturmian words. They enjoy many equivalent definitions and interesting properties (cf. [7, 1]). In particular, a word is central if and only if it can be written as $a^n$, $b^n$, or $uabv = vbau$ for some integer $n \geq 0$ and $u, v \in A^*$; in the latter case, $u$ and $v$ are central words themselves.
The parameters $K$ and $R$ defined in the previous section were introduced in [8], along with their “left” counterparts $H$ (length of the shortest unrepeated prefix) and $L$ (smallest $n$ such that $w$ has no left special factor of length $n$). As already stated, a finite word $w$ is trapezoidal if $|w| = R_w + K_w$, or equivalently if $|w| = L_w + H_w$ (cf. [8, 4]).
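The four parameters are easy to compute by brute force, which helps when experimenting with small words. The following Python sketch is ours, not the authors': the function names mirror the symbols $H$, $K$, $L$, $R$, the helper `_first_length_without_special` is hypothetical, and setting all parameters of the empty word to 0 is a convention we assume.

```python
def occurrences(w, u):
    """Count (possibly overlapping) occurrences of u in w."""
    return sum(w.startswith(u, i) for i in range(len(w) - len(u) + 1))


def H(w):
    """Length of the shortest unrepeated prefix of w."""
    return next(n for n in range(len(w) + 1) if occurrences(w, w[:n]) == 1)


def K(w):
    """Length of the shortest unrepeated suffix of w."""
    return next(n for n in range(len(w) + 1)
                if occurrences(w, w[len(w) - n:]) == 1)


def _first_length_without_special(w, extend):
    """Smallest n such that no factor of length n has two distinct extensions."""
    letters = set(w)
    n = 0
    while any(sum(extend(u, x) in w for x in letters) >= 2
              for u in {w[i:i + n] for i in range(len(w) - n + 1)}):
        n += 1
    return n


def R(w):
    """Smallest n such that w has no right special factor of length n."""
    return _first_length_without_special(w, lambda u, x: u + x)


def L(w):
    """Smallest n such that w has no left special factor of length n."""
    return _first_length_without_special(w, lambda u, x: x + u)


def is_trapezoidal(w):
    """Parameter-based test: |w| = R_w + K_w."""
    return len(w) == R(w) + K(w)
```

On `aabb` all four parameters equal 2, so $|w| = R_w + K_w = L_w + H_w = 4$; on `aabba` one finds $R_w = K_w = 2$, so the equality fails and the word is not trapezoidal.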
Another noteworthy parameter is the (minimal) period $\pi_w$ of a word $w$, which can be defined by $\pi_w = |w| - |u|$ where $u$ is the longest border of $w$. Central words can also be characterized in terms of periods (cf. [7]). It is known (see for example [10]) that periodic extensions of a finite Sturmian word $w$, i.e., words $v$ such that $w$ is a factor of $v$ and $\pi_v = \pi_w$, are still Sturmian; however, this property does not extend to trapezoidal words.
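The definition of the period through the longest border can be turned into a few lines of Python. This is a minimal sketch with names of our choosing; `is_periodic_extension` encodes our reading of a periodic extension as a word containing the given one as a factor and having the same minimal period.

```python
def longest_border(w):
    """Longest word that is both a proper prefix and a suffix of w."""
    for n in range(len(w) - 1, -1, -1):
        if w[:n] == w[len(w) - n:]:
            return w[:n]
    return ""  # reached only for the empty word, which has no proper prefix


def period(w):
    """Minimal period of w: |w| minus the length of its longest border."""
    return len(w) - len(longest_border(w))


def is_periodic_extension(v, w):
    """True if w is a factor of v and both words have the same minimal period."""
    return w in v and period(v) == period(w)
```

For example, `abaab` has longest border `ab` and period 3, and `abaabaab` is a periodic extension of it, whereas `abaabb` (period 6) is not.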
Example 1
Let . Then and , so that is trapezoidal. Its period is , but the periodic extension is not trapezoidal, as .
The following theorem is essentially a restatement of [4, Theorem 5]. It characterizes non-Sturmian trapezoidal words as products of two periodic extensions of the elements in a pathological pair.
Theorem 2.1
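A word $w$ is trapezoidal non-Sturmian if and only if it can be written as
$$w = p\,xux\,yuy\,q,$$
where $u$ is a central word, $\{x, y\} = \{a, b\}$, $\pi_{pxux} = \pi_{xux}$, and $\pi_{yuyq} = \pi_{yuy}$. Furthermore, $R_w = |pxux|$ and $K_w = |yuyq|$.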
Note that for such a word, $(xux, yuy)$ is actually the shortest pathological pair (cf. [2]).
Example 2
The trapezoidal word considered in Example 1 can be written as , where and are periodic extensions (to the left and to the right, respectively) of the elements of the pathological pair .
The non-trapezoidal word does not verify the condition in Theorem 2.1. Indeed, its only pathological pair is , and writing we obtain .
3 Closed and Open Trapezoidal Words
A finite, nonempty word $w$ is said to be closed (or periodic-like in earlier works) if it has a border $u$ with no internal occurrences, that is, a factor occurring exclusively as a prefix and as a suffix; another common terminology for describing this situation is that $w$ is a complete (first) return to $u$. In particular, single letters are closed, their border being the empty word.
A non-closed word is said to be open. Equivalently, $w$ is open if and only if its longest repeated prefix (resp. suffix) is a right (resp. left) special factor (cf. [8]).
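Deciding whether a word is closed only requires looking at its longest border, since any border without internal occurrences must be the longest one. The Python sketch below relies on that observation; the function names are ours and the empty word is excluded, as in the definition.

```python
def longest_border_length(w):
    """Length of the longest proper prefix of w that is also a suffix."""
    return next(n for n in range(len(w) - 1, -1, -1)
                if w[:n] == w[len(w) - n:])


def is_closed(w):
    """True if the longest border of the nonempty word w has no internal occurrence."""
    if not w:
        raise ValueError("closed and open are defined for nonempty words only")
    b = longest_border_length(w)
    # internal occurrences start strictly after position 0 and end before |w|
    return not any(w.startswith(w[:b], i) for i in range(1, len(w) - b))


def is_open(w):
    return not is_closed(w)
```

Single letters, `aba` and `abaab` are reported closed, while `ab`, `abaa` and `aabb` are open.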
The following result was proved in [3, Proposition 3.6].
Proposition 1
All closed trapezoidal words are Sturmian.
The following result, showing a basic connection between the behavior of $H_w$ and the property of being closed or open, is essentially known (see [5, Lemma 6 and Remark 8]). We report a proof for the sake of completeness.
Lemma 1
Let $w \in A^*$ and $x \in A$. Then $H_{wx} = H_w + 1$ if $wx$ is closed, and $H_{wx} = H_w$ if $wx$ is open.
Proof
Trivial if $w = \varepsilon$. Let then $w$ be nonempty, and $hy$ ($h \in A^*$, $y \in A$) be its shortest unrepeated prefix, so that $H_w = |hy|$. If $wx$ is closed, then its longest border $u$ has to be longer than $h$ since $h$ has internal occurrences in $wx$, and not longer than $hy$ since otherwise $hy$ would reoccur in $w$. Hence $u = hy$ and $H_{wx} = |hy| + 1 = H_w + 1$ as desired. If $wx$ is open, then $hy$ cannot have internal occurrences in $wx$, since it is unrepeated in $w$, and it cannot be a suffix either, otherwise $wx$ would be closed. Hence $hy$ is unrepeated in $wx$, i.e., $H_{wx} = H_w$. ∎
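Lemma 1 can be sanity-checked exhaustively on short words. The sketch below is our own experiment, not part of the proof: it reads the lemma as $H_{wx} = H_w + 1$ for closed $wx$ and $H_{wx} = H_w$ for open $wx$, assumes $H_\varepsilon = 0$, and uses function names of our choosing.

```python
from itertools import product


def occurrences(w, u):
    """Count (possibly overlapping) occurrences of u in w."""
    return sum(w.startswith(u, i) for i in range(len(w) - len(u) + 1))


def H(w):
    """Length of the shortest unrepeated prefix (0 for the empty word)."""
    return next(n for n in range(len(w) + 1) if occurrences(w, w[:n]) == 1)


def is_closed(w):
    """True if the longest border of the nonempty word w has no internal occurrence."""
    b = next(n for n in range(len(w) - 1, -1, -1) if w[:n] == w[len(w) - n:])
    return not any(w.startswith(w[:b], i) for i in range(1, len(w) - b))


def lemma_1_counterexamples(max_len, alphabet="ab"):
    """Pairs (w, x) with |w| <= max_len for which the stated equalities fail."""
    bad = []
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            for x in alphabet:
                expected = H(w) + (1 if is_closed(w + x) else 0)
                if H(w + x) != expected:
                    bad.append((w, x))
    return bad
```

Calling `lemma_1_counterexamples(7)` scans every binary word of length at most 7 together with both of its one-letter right extensions.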
Lemma 2
Let $wx$ be a trapezoidal word, $x \in A$. Then $L_{wx} = L_w$ if $wx$ is closed, and $L_{wx} = L_w + 1$ if $wx$ is open.
Proof
Follows from Lemma 1 as $|w| = L_w + H_w$ and $|wx| = L_{wx} + H_{wx}$. ∎
Clearly, the following symmetric statement holds for left extensions $xw$.
Lemma 3
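Let $w \in A^*$ and $x \in A$. Then $K_{xw} = K_w + 1$ if $xw$ is closed, and $K_{xw} = K_w$ if $xw$ is open. Moreover, if $xw$ is trapezoidal, then $R_{xw} = R_w$ if $xw$ is closed, and $R_{xw} = R_w + 1$ otherwise.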
For a trapezoidal word $w$, the equality $\{H_w, L_w\} = \{K_w, R_w\}$ holds (cf. [8]). The following theorem, proved in [2, Proposition 4.4], is more precise.
Theorem 3.1
Let $w$ be a trapezoidal word. Then $H_w = K_w$ and $L_w = R_w$ if $w$ is closed, whereas $H_w = R_w$ and $K_w = L_w$ if $w$ is open.
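As an experiment, Theorem 3.1 can be tested on all short binary trapezoidal words. The Python sketch below is ours: it reads the theorem as $H_w = K_w$, $L_w = R_w$ for closed words and $H_w = R_w$, $K_w = L_w$ for open ones, selects trapezoidal words through $|w| = R_w + K_w$, and uses hypothetical helper names throughout.

```python
from itertools import product


def occurrences(w, u):
    return sum(w.startswith(u, i) for i in range(len(w) - len(u) + 1))


def H(w):  # shortest unrepeated prefix
    return next(n for n in range(len(w) + 1) if occurrences(w, w[:n]) == 1)


def K(w):  # shortest unrepeated suffix
    return next(n for n in range(len(w) + 1)
                if occurrences(w, w[len(w) - n:]) == 1)


def _bound(w, extend):
    """Smallest n such that no factor of length n has two distinct extensions."""
    n = 0
    while any(sum(extend(u, x) in w for x in set(w)) >= 2
              for u in {w[i:i + n] for i in range(len(w) - n + 1)}):
        n += 1
    return n


def R(w):  # no right special factor of this length
    return _bound(w, lambda u, x: u + x)


def L(w):  # no left special factor of this length
    return _bound(w, lambda u, x: x + u)


def is_closed(w):
    b = next(n for n in range(len(w) - 1, -1, -1) if w[:n] == w[len(w) - n:])
    return not any(w.startswith(w[:b], i) for i in range(1, len(w) - b))


def theorem_3_1_counterexamples(max_len):
    """Nonempty binary trapezoidal words violating the stated equalities."""
    bad = []
    for n in range(1, max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if len(w) != R(w) + K(w):
                continue  # not trapezoidal
            if is_closed(w):
                ok = H(w) == K(w) and L(w) == R(w)
            else:
                ok = H(w) == R(w) and K(w) == L(w)
            if not ok:
                bad.append(w)
    return bad
```

For instance, the open word `aab` has $(H, K, L, R) = (2, 1, 1, 2)$, and the closed word `abaab` has $(3, 3, 2, 2)$.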
Corollary 1
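Let $wx$ be a trapezoidal word, with $w \in A^*$ and $x \in A$. Then $R_{wx} = R_w$ and $K_{wx} = K_w + 1$, unless
- $w$ is closed and $wx$ is open, or
- $w$ is open and $wx$ is closed,
in which cases we have $R_{wx} = K_w$ and $K_{wx} = R_w + 1$ instead.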
Proof
Consequence of Lemmas 1, 2, and Theorem 3.1.
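The four cases behind Corollary 1 can be written out explicitly. The display below is our own expansion: it assumes $wx$ trapezoidal, reads Lemmas 1 and 2 as $H_{wx} = H_w + 1$, $L_{wx} = L_w$ for closed $wx$ and $H_{wx} = H_w$, $L_{wx} = L_w + 1$ for open $wx$, and reads Theorem 3.1 as $H = K$, $L = R$ for closed words and $H = R$, $K = L$ for open ones.

```latex
\begin{align*}
&w \text{ closed},\ wx \text{ closed}: & K_{wx} &= H_{wx} = H_w + 1 = K_w + 1, & R_{wx} &= L_{wx} = L_w = R_w,\\
&w \text{ open},\ wx \text{ open}:     & K_{wx} &= L_{wx} = L_w + 1 = K_w + 1, & R_{wx} &= H_{wx} = H_w = R_w,\\
&w \text{ closed},\ wx \text{ open}:   & K_{wx} &= L_{wx} = L_w + 1 = R_w + 1, & R_{wx} &= H_{wx} = H_w = K_w,\\
&w \text{ open},\ wx \text{ closed}:   & K_{wx} &= H_{wx} = H_w + 1 = R_w + 1, & R_{wx} &= L_{wx} = L_w = K_w.
\end{align*}
```

In each row the first and last equalities come from Theorem 3.1 applied to $wx$ and to $w$ respectively, and the middle one from Lemma 1 or Lemma 2.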
Proposition 2
Let $w \in A^*$, $z \in A$. If $wz$ is trapezoidal but not Sturmian, then $w$ is open.
Proof
If $w$ is not Sturmian, by Proposition 1 we are done. Let then $w$ be Sturmian and assume it is closed, by contradiction. By Proposition 1, $wz$ is open. Writing $wz = p\,xux\,yuy\,q$ as in Theorem 2.1, we have $q = \varepsilon$ as $w$ is Sturmian, and $K_w = H_w = H_{wz} = R_{wz} = |pxux|$ by Lemma 1 and Theorems 3.1 and 2.1. It follows that $pxu$ is the longest border of $w$. As $x \neq y$ and $w$ ends in $yu$, this is clearly absurd. ∎
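Let $\operatorname{pref}_n(w)$ denote the prefix of $w$ of length $n$. The oc-sequence of a word $w$ is the characteristic sequence of its closed prefixes. In other terms, it is the binary word $oc(w) = c_1 c_2 \cdots$ such that, for each $n \geq 1$ not exceeding the length of $w$,
$$c_n = \begin{cases} 1 & \text{if } \operatorname{pref}_n(w) \text{ is closed},\\ 0 & \text{if } \operatorname{pref}_n(w) \text{ is open}. \end{cases}$$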
The oc-sequence is a useful tool in studying the structure of finite and infinite words. For example, in [5], the following was proved:
Theorem 3.2
Let $w$ be an infinite word, and let
$$oc(w) = \prod_{n \geq 1} 1^{k_n} 0^{k'_n}$$
for suitable positive integers $k_n, k'_n$, with $n \geq 1$. Then $k_n \leq k'_n$ for all $n$, with equality holding for all $n$ if and only if $w$ is a characteristic Sturmian word.
In terms of oc-sequences, an immediate consequence of Lemmas 1 and 2 is the following (see also [5, Remark 8]):
Proposition 3
For any word $w$, $H_w$ is the number of closed nonempty prefixes of $w$, i.e., $H_w = |oc(w)|_1$. If $w$ is trapezoidal, then $L_w = |oc(w)|_0$.
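The oc-sequence of a finite word is straightforward to compute prefix by prefix, which makes Proposition 3 easy to observe in practice. The sketch below is a naive illustration with names of our choosing (a linear-time algorithm is mentioned in the literature, but this is not it); it reads the proposition as $H_w = |oc(w)|_1$ and, for trapezoidal $w$, $L_w = |oc(w)|_0$.

```python
def occurrences(w, u):
    return sum(w.startswith(u, i) for i in range(len(w) - len(u) + 1))


def H(w):
    """Length of the shortest unrepeated prefix (0 for the empty word)."""
    return next(n for n in range(len(w) + 1) if occurrences(w, w[:n]) == 1)


def L(w):
    """Smallest n such that w has no left special factor of length n."""
    n = 0
    while any(sum(x + u in w for x in set(w)) >= 2
              for u in {w[i:i + n] for i in range(len(w) - n + 1)}):
        n += 1
    return n


def is_closed(w):
    b = next(n for n in range(len(w) - 1, -1, -1) if w[:n] == w[len(w) - n:])
    return not any(w.startswith(w[:b], i) for i in range(1, len(w) - b))


def oc(w):
    """oc-sequence of w: '1' for each closed nonempty prefix, '0' for each open one."""
    return "".join("1" if is_closed(w[:n]) else "0"
                   for n in range(1, len(w) + 1))


if __name__ == "__main__":
    for word in ("abaab", "aabb"):
        s = oc(word)
        print(word, s, H(word) == s.count("1"), L(word) == s.count("0"))
```

For the Fibonacci prefix `abaab` this gives `10101`, with $H_w = 3$ ones and $L_w = 2$ zeros; for `aabb` it gives `1100`.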
The following two results show the behavior of characteristic parameters and at the end of runs of 1 and 0 in the oc-sequence.
Proposition 4
Let and be such that is an open trapezoidal word, while is closed. Then .
Proof
Since by Lemma 2, the longest left special factor of occurs as a suffix, and is the longest left special factor of . Clearly, the suffix has internal occurrences in , so that it is strictly shorter than the longest border . This proves , whence the assertion. ∎
Proposition 5
Let and be such that is a closed trapezoidal word, while is open. Then .
Proof
Since by Lemma 1, the longest repeated prefix of occurs as a suffix, so that . The assertion follows by Theorem 3.1.∎
While Theorem 3.2 gives local constraints for an oc-sequence (namely, each run of 1s is followed by a longer or equal run of 0s), our last three results can be viewed as more global constraints in the case of trapezoidal words. Considering the integer parameter
[TABLE]
gives an interesting way to picture this situation. Indeed by Proposition 3, if is trapezoidal then , so that increases or decreases by 1 at each subsequent prefix, depending on whether it is closed or open; moreover by Propositions 4–5, is necessarily positive (resp. non-positive) when encountering the last closed (resp. open) prefix in a run.
Example 3
Let . Then is closed for and , and open otherwise; that is, . As predicted by Propositions 4–5, reaches its (positive) local maxima, respectively 1 and 4, at and , and its (non-positive) local minimum of for . Since is not Sturmian, by Proposition 1 any subsequent trapezoidal right extension will be open, leading to an indefinite decrease of .
4 Special Trapezoidal Words
In analogy with the case of finite Sturmian words (cf. [9, 7]), we say that a trapezoidal word $w$ is right (resp. left) special if $wa$ and $wb$ (resp. $aw$, $bw$) are both trapezoidal, and that $w$ is strictly bispecial if $awa$, $awb$, $bwa$, and $bwb$ are all trapezoidal.
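These definitions can be tested directly by extending a word on either side and checking the factor-count bound. The following Python sketch is illustrative: the predicate names are ours, the alphabet is assumed to be `{a, b}`, and trapezoidal words are recognized through the bound of $n+1$ factors per length.

```python
def is_trapezoidal(w):
    """True if w has at most n + 1 distinct factors of every length n."""
    return all(len({w[i:i + n] for i in range(len(w) - n + 1)}) <= n + 1
               for n in range(len(w) + 1))


def is_right_special_trapezoidal(w):
    return is_trapezoidal(w + "a") and is_trapezoidal(w + "b")


def is_left_special_trapezoidal(w):
    return is_trapezoidal("a" + w) and is_trapezoidal("b" + w)


def is_strictly_bispecial_trapezoidal(w):
    return all(is_trapezoidal(x + w + y) for x in "ab" for y in "ab")


if __name__ == "__main__":
    for word in ("ab", "aba", "aab", "aabb"):
        print(word,
              is_right_special_trapezoidal(word),
              is_left_special_trapezoidal(word),
              is_strictly_bispecial_trapezoidal(word))
```

On these inputs, `ab` and `aba` are strictly bispecial, `aab` is both right and left special but not strictly bispecial (the extension `baabb` has four factors of length 2), and `aabb` is not right special.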
Proposition 6
A right special trapezoidal word is Sturmian.
Proof
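Let $w$ be a non-Sturmian trapezoidal word, then open by Proposition 1. If $z \in A$ and $wz$ is trapezoidal, then it is also not Sturmian (like $w$) and hence open. By Corollary 1, $R_{wz} = R_w$ and $K_{wz} = K_w + 1$. By Theorem 2.1, writing $w = p\,xux\,yuy\,q$, it follows $wz = p\,xux\,yuy\,qz$ with $\pi_{yuyqz} = \pi_{yuy}$. This shows that $z$ is uniquely determined, so that $w$ cannot be right special. ∎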
Symmetrically, one can prove that
Proposition 7
A left special trapezoidal word is Sturmian.
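Propositions 6 and 7 lend themselves to an exhaustive check on short words. The sketch below is our own experiment under stated assumptions: binary alphabet `{a, b}`, trapezoidal words recognized by the bound of $n+1$ factors per length, and Sturmian words recognized by the absence of a pathological pair $(aua, bub)$; all function names are hypothetical.

```python
from itertools import product


def is_trapezoidal(w):
    """True if w has at most n + 1 distinct factors of every length n."""
    return all(len({w[i:i + n] for i in range(len(w) - n + 1)}) <= n + 1
               for n in range(len(w) + 1))


def is_sturmian(w):
    """True if no word u has both a u a and b u b as factors of w."""
    facs = {w[i:j] for i in range(len(w) + 1) for j in range(i, len(w) + 1)}
    return not any(len(f) >= 2 and f[0] == f[-1] == "a"
                   and "b" + f[1:-1] + "b" in facs
                   for f in facs)


def special_non_sturmian(max_len):
    """Non-Sturmian trapezoidal words (up to max_len) that are right or left special."""
    bad = []
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            w = "".join(letters)
            if not is_trapezoidal(w) or is_sturmian(w):
                continue
            right = is_trapezoidal(w + "a") and is_trapezoidal(w + "b")
            left = is_trapezoidal("a" + w) and is_trapezoidal("b" + w)
            if right or left:
                bad.append(w)
    return bad
```

For example, `aabb` is trapezoidal and non-Sturmian, and only one of its right extensions, `aabbb`, remains trapezoidal; `special_non_sturmian(8)` looks for exceptions among all binary words of length at most 8.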
Theorem 4.1
A trapezoidal word $w$ is right special if and only if either of the following holds:
1. $w$ is a suffix of a central word, or
2. $w = p\,xux\,yu$ for a central word $u$, distinct letters $x, y \in A$, and a word $p$ such that $\pi_{pxux} = \pi_{xux}$.
Symmetrically, $w$ is a left special trapezoidal word if and only if it is either a prefix of a central word, or written as $w = ux\,yuy\,q$ for a central word $u$, $\{x, y\} = \{a, b\}$, and $\pi_{yuyq} = \pi_{yuy}$.
Proof
As is well known (cf. [7]), both extensions of a word $w$ are Sturmian if and only if $w$ is a suffix of a central word. Let now $w$ be right special and such that one extension is not Sturmian. By Proposition 6, $w$ is Sturmian. As a consequence of Theorem 2.1, we must have $wy = p\,xux\,yuy$ where $\{x, y\} = \{a, b\}$, $u$ is some central word, and $p$ is such that $\pi_{pxux} = \pi_{xux}$.
Conversely, if $w = p\,xux\,yu$ with $\{x, y\} = \{a, b\}$, $u$ central and $\pi_{pxux} = \pi_{xux}$, then $w$ is Sturmian as $(xux, yuy)$ is the only pathological pair in the trapezoidal non-Sturmian word $wy$; therefore, $wx$ must be Sturmian (and then trapezoidal) too.
The left special case is similar. ∎
The following theorem is a restatement of results in [6, 7]; it characterizes Sturmian words that are bispecial (as Sturmian words).
Theorem 4.2
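Let $w \in A^*$. Then $aw, bw, wa, wb$ are all Sturmian if and only if $w = (uxy)^n u$ for some central word $u$, $\{x, y\} = \{a, b\}$, and a nonnegative integer $n$. Furthermore, $awa, awb, bwa, bwb$ are all Sturmian if and only if $w$ is central, i.e., $n = 0$, whereas for $n = 1$ exactly one such bilateral extension is not Sturmian, namely $xwy$.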
Semicentral words were defined in [2] by the property of having their longest repeated prefix, longest repeated suffix, longest left special factor, and longest right special factor coincide. In the same paper, they were characterized as words $w$ such that $w = uxyu$ for some central word $u$ over $A$ and distinct letters $x, y \in A$. Hence, they correspond to the case $n = 1$ in the previous theorem.
Our final result is a characterization of strictly bispecial trapezoidal words.
Theorem 4.3
A trapezoidal word is strictly bispecial if and only if it is central or semicentral.
Proof
By Theorem 4.2, central words are strictly bispecial. Moreover, by the same theorem all bilateral extensions of a semicentral word $w = uxyu$ are Sturmian, except for $xwy = xux\,yuy$ which is trapezoidal non-Sturmian by Theorem 2.1.
Conversely, if $w$ is a strictly bispecial trapezoidal word, then either all bilateral extensions are Sturmian, in which case $w$ is central by Theorem 4.2 and we are done, or at least one is not.
Assume then that some bilateral extension of $w$ is trapezoidal non-Sturmian. Since $w$ is strictly bispecial, both $aw$ and $bw$ are right special trapezoidal words, hence Sturmian by Proposition 6. Symmetrically, as a consequence of Proposition 7, $wa$ and $wb$ must be Sturmian as well. In all cases, $aw, bw, wa, wb$ are all Sturmian. By Theorem 4.2, it follows $w = (uxy)^n u$ for some $n \geq 0$; as a consequence of Theorem 2.1, we must have $n \leq 1$. ∎
5 Concluding Remarks
A few related problems remain open. In particular, in [5] the oc-sequence for (prefixes of) characteristic Sturmian words was characterized, see Theorem 3.2. The general trapezoidal case, and even the non-standard Sturmian one, is still open. We believe our results may shed some light on the matter, as illustrated at the end of Section 3.
Regarding the preceding section, a simple and elegant characterization of (not necessarily strictly) bispecial trapezoidal words, such as Theorem 4.2 is for the Sturmian case, is still missing. Theorem 4.1 might be an ingredient for such a result.
Acknowledgments
We thank the anonymous referees for their many helpful comments. This paper is dedicated to the memory of our dear colleague Aldo de Luca (1941–2018).
[1] Berstel, J., Séébold, P.: Sturmian Words. In: Lothaire, M. (ed.) Algebraic Combinatorics on Words. Cambridge University Press, Cambridge UK (2002), chapter 2
[2] Bucci, M., De Luca, A., Fici, G.: Enumeration and structure of trapezoidal words. Theoret. Comput. Sci. 468, 12–22 (2013). https://doi.org/10.1016/j.tcs.2012.11.007
[3] Bucci, M., de Luca, A., De Luca, A.: Rich and periodic-like words. In: Diekert, V., Nowotka, D. (eds.) Developments in Language Theory. Lecture Notes in Computer Science, vol. 5583, pp. 145–155. Springer (2009). https://doi.org/10.1007/978-3-642-02737-6_11
[4] D’Alessandro, F.: A combinatorial problem on trapezoidal words. Theoret. Comput. Sci. 273, 11–33 (2002). https://doi.org/10.1016/S0304-3975(00)00431-X
[5] De Luca, A., Fici, G., Zamboni, L.Q.: The sequence of open and closed prefixes of a Sturmian word. Advances in Applied Mathematics 90, 27–45 (2017). https://doi.org/10.1016/j.aam.2017.04.007
[6] Fici, G.: On the structure of bispecial Sturmian words. J. Comput. Syst. Sci. 80(4), 711–719 (2014). https://doi.org/10.1016/j.jcss.2013.11.001
[7] de Luca, A.: Sturmian words: structure, combinatorics, and their arithmetics. Theoret. Comput. Sci. 183, 45–82 (1997). https://doi.org/10.1016/S0304-3975(96)00310-6
[8] de Luca, A.: On the combinatorics of finite words. Theoret. Comput. Sci. 218, 13–39 (1999). https://doi.org/10.1016/S0304-3975(98)00248-5
