Automatic sequences and generalised polynomials
Jakub Byszewski, Jakub Konieczny

TL;DR
This paper investigates the conjecture that bounded generalised polynomial functions are not generated by finite automata unless they are ultimately periodic, using ergodic theory to provide partial results and connections to automatic sequences.
Contribution
It proves that certain sequences derived from polynomials with irrational coefficients are not automatic and relates the conjecture to the nature of powers of integers as generalised polynomials.
Findings
Sequences from polynomials with irrational coefficients are not automatic.
The conjecture is equivalent to powers of integers not being generalised polynomials.
Partial resolution shows such sequences are periodic outside a sparse set.
Abstract
We conjecture that bounded generalised polynomial functions cannot be generated by finite automata, except for the trivial case when they are ultimately periodic. Using methods from ergodic theory, we are able to partially resolve this conjecture, proving that any hypothetical counterexample is periodic away from a very sparse and structured set. In particular, we show that for a polynomial $p(n)$ with at least one irrational coefficient (except for the constant one) and integer $m \geq 2$, the sequence $\lfloor p(n) \rfloor \bmod m$ is never automatic. We also prove that the conjecture is equivalent to the claim that the set of powers of an integer $k \geq 2$ is not given by a generalised polynomial.
Department of Mathematics and Computer Science
Institute of Mathematics
Jagiellonian University
ul. prof. Stanisława Łojasiewicza 6
30-348 Kraków
Mathematical Institute
University of Oxford
Andrew Wiles Building
Radcliffe Observatory Quarter
Woodstock Road
Oxford
OX2 6GG
Einstein Institute of Mathematics
Edmond J. Safra Campus
The Hebrew University of Jerusalem
Givat Ram
Jerusalem, 9190401
Israel
Key words and phrases:
Generalised polynomials, automatic sequences, IP sets, nilmanifolds, linear recurrence sequences, regular sequences
2010 Mathematics Subject Classification:
Primary: 11B85, 37A45. Secondary: 37B05, 37B10, 11J71, 11B37, 05C20
Contents
- 1 Background
- 2 Density 1 results
- 3 Combinatorial structure of automatic sets
- 4 Examples and properties of automatic sets
- 5 Proof of Theorem D
- 6 Concluding remarks
Introduction
Automatic sequences are sequences whose $n$-th term is produced by a finite-state machine from the base-$k$ digits of $n$. (A precise definition is given below.) By definition, automatic sequences can take only finitely many values. Allouche and Shallit [AS92, AS03b] have generalised the notion of automatic sequences to a wider class of regular sequences and demonstrated its ubiquity and links with multiple branches of mathematics and computer science. The problem of demonstrating that a certain sequence is or is not automatic or regular has been widely studied, particularly for sequences of arithmetic origin (see, e.g., [AS92, AS03b, Bel07, SY11, MR15, SP11, Mos08, Row10]).
The aim of this article is to continue this study for sequences that arise from generalised polynomials, i.e., expressions involving algebraic operations and the floor function. Our methods rely on a number of dynamical and ergodic tools. A crucial ingredient in our work is one of the main results from the companion paper [BK16] concerning the combinatorial structure of the set of times at which an orbit on a nilmanifold hits a semialgebraic subset. This is possible because by the work of Bergelson and Leibman [BL07] generalised polynomials are closely related to dynamics on nilmanifolds.
In [AS03b, Theorem 6.2] it is proved that the sequence $(f(n))_{n \geq 0}$ given by $f(n) = \lfloor \alpha n + \beta \rfloor$ for real numbers $\alpha, \beta$ is regular if and only if $\alpha$ is rational. The method used there does not immediately generalise to higher degree polynomials in $n$, but the proof implicitly uses rotation on a circle by an angle of $2\pi\alpha$. Replacing the rotation on a circle by a skew product transformation on a torus (as in Furstenberg’s proof of Weyl’s equidistribution theorem [Fur61]), we easily obtain the following result. (For more on regular sequences, see Section 1.)
Theorem A.
Let $p \in \mathbb{R}[x]$ be a polynomial. Then the sequence $f(n) = \lfloor p(n) \rfloor$, $n \geq 0$, is regular if and only if all the coefficients of $p$ except possibly for the constant term are rational.
In fact, we show the stronger property that for any integer $m \geq 2$ the sequence $f(n) \bmod m$ is not automatic unless all the coefficients of $p$ except for the constant term are rational, in which case the sequence is periodic. It is natural to inquire whether a similar result can be proven for more complicated expressions involving the floor function. Sequences given by such expressions are called generalised polynomial and have been intensely studied (see, e.g., [Hål93, Hål94, HK95, BL07, Lei12, GTZ12, GT12]).
Another closely related motivating example comes from the classical Fibonacci word (we will freely identify infinite words over an alphabet with functions from the index set into that alphabet), whose systematic study was initiated by Berstel [Ber81, Ber85] (for historical notes, see [AS03a, Sec. 7.12]). There are several ways to define it, each shedding light from a different direction.
- (i) Morphic word. Define the sequence of words $w_0 := 0$, $w_1 := 01$, and $w_{i+2} := w_{i+1} w_i$ for $i \geq 0$. Then the Fibonacci word is the (coordinate-wise) limit of $w_i$ as $i \to \infty$.
- (ii) Sturmian word. Explicitly, .
- (iii) Fib-automatic sequence. If a positive integer is written in the form , where and there is no with , then .
The equivalence of (i) and (ii) is well-known, see, e.g., [Lot02, Chpt. 2]. The representation of a positive integer as a sum of Fibonacci numbers in (iii) is known as the Zeckendorf representation; it exists for each positive integer and is unique. The notion of automaticity using Zeckendorf representation (or, for that matter, a representation from a much wider class) in place of the usual base-$k$ representation of the input was introduced and studied by Shallit in [Sha88] (see also [Rig00]), where among other things the equivalence of (i) and (iii) is shown. We return to this subject in Section 6.
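The equivalence of these descriptions can be checked numerically on a finite prefix. The following Python sketch is an illustration only. It assumes the seed words `"0"` and `"01"` for the concatenation rule, indexes the word from $n = 0$, uses the Fibonacci numbers $F_2 = 1, F_3 = 2, \ldots$ in the Zeckendorf representation, and uses one standard closed form for the Sturmian description, which may be normalised differently from the formula in (ii). All function names are hypothetical.

```python
from math import isqrt


def fibonacci_word_by_concatenation(length):
    """Prefix of the Fibonacci word via w_0 = "0", w_1 = "01", w_{i+2} = w_{i+1} w_i."""
    prev, cur = "0", "01"
    while len(cur) < length:
        # Each new word extends the previous one, so prefixes are stable.
        prev, cur = cur, cur + prev
    return cur[:length]


def lowest_zeckendorf_digit(n):
    """Coefficient of F_2 = 1 in the greedy (Zeckendorf) representation of n >= 0."""
    fibs = [1, 2]  # F_2, F_3 (assumed indexing)
    while fibs[-1] <= n:
        fibs.append(fibs[-1] + fibs[-2])
    digit = 0
    for i in range(len(fibs) - 1, -1, -1):
        if fibs[i] <= n:
            n -= fibs[i]
            if i == 0:
                digit = 1
    return digit


def fibonacci_word_by_zeckendorf(length):
    """Prefix whose n-th letter is the lowest Zeckendorf digit of n (n = 0, 1, ...)."""
    return "".join(str(lowest_zeckendorf_digit(n)) for n in range(length))


def floor_times_golden_ratio(m):
    """Exact floor(m * (1 + sqrt(5)) / 2) using integer arithmetic only."""
    return (m + isqrt(5 * m * m)) // 2


def fibonacci_word_by_floor_formula(length):
    """Prefix via 2 + floor((n+1)*phi) - floor((n+2)*phi), one standard Sturmian form."""
    return "".join(
        str(2 + floor_times_golden_ratio(n + 1) - floor_times_golden_ratio(n + 2))
        for n in range(length)
    )
```

With these conventions the three functions agree on every prefix tested; agreement on a finite prefix is of course evidence, not a proof of the equivalence.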
Hence, the Fibonacci word gives a non-trivial example of a sequence which is given by a generalised polynomial and satisfies a variant of automaticity related to the Zeckendorf representation. It is natural to ask if similar examples exist for the usual notion of $k$-automaticity. Motivated by Theorem A, we believe the answer is essentially negative, except for trivial examples. We say that a sequence is ultimately periodic if it coincides with a periodic sequence except on a finite set. The following conjecture was the initial motivation for the line of research pursued in this paper.
Conjecture A.
Suppose that a sequence $f$ is simultaneously automatic and generalised polynomial. Then $f$ is ultimately periodic.
In this paper, we prove several slightly weaker variants of Conjecture A. First of all, we prove that the conjecture holds except on a set of density zero. In fact, in order to obtain such a result, we only need a specific property of automatic sequences. For the purpose of stating the next theorem, let us say that a sequence $f$ is weakly periodic if for any restriction $f'$ of $f$ to an arithmetic sequence given by $f'(n) = f(an + b)$, $a \in \mathbb{N}$, $b \in \mathbb{N}_0$, there exist $q \in \mathbb{N}$, $r, r' \in \mathbb{N}_0$ with $r \neq r'$, such that $f'(qn + r) = f'(qn + r')$ for all $n$. Of course, any periodic sequence is weakly periodic, but not conversely. All automatic sequences are weakly periodic (this follows from the fact that automatic sequences have finite kernels, see Lemma 2.1). Another non-trivial example of a weakly periodic sequence is the characteristic function of the square-free numbers.
Theorem B.
Suppose that a sequence $f$ is weakly periodic and generalised polynomial. Then there exists a periodic function $p$ and a set $Z$ of upper Banach density zero such that $f(n) = p(n)$ for $n \notin Z$.
(For the definition of Banach density, see Section 1.)
Theorem B is already sufficient to rule out automaticity of many natural examples of generalised polynomials. In particular, sequences such as or are not automatic. For details and more examples, see Corollary 2.7.
To obtain stronger bounds on the size of the “exceptional” set , we restrict ourselves to automatic sequences and exploit some finer properties of generalised polynomials studied in the companion paper [BK16]. We use results concerning growth properties of automatic sequences to derive the following dichotomy: If is an automatic sequence, then the set of integers where takes the value is either combinatorially rich (it contains what we call an set) or extremely sparse (in particular, the number of its elements up to grows as for some integer ); see Theorem 3.10. This result is especially interesting for sparse automatic sequences, i.e., automatic sequences which take non-zero values on a set of integers of density zero. Conversely, in [BK16] we show that sparse generalised polynomials must be free of similar combinatorial structures. As a consequence, we prove the following result.
Theorem C.
Suppose that a sequence is automatic and generalised polynomial. Then there exists a periodic function , a set , and a constant such that for and
[TABLE]
as for a certain constant (dependent on ).
In fact, we obtain a much more precise structural description of the exceptional set (see Theorem 3.7 for details). Similar techniques allow us to show non-automaticity of some sparse generalised polynomials. For instance, the sequence given by
[TABLE]
is not automatic provided that is small enough. (Here, denotes the distance of from .) For details, see Example 4.7.
While Theorem C does not resolve Conjecture A, our proof thereof greatly restricts the number of possible counterexamples. In fact, in order to prove Conjecture A, it would suffice to prove that the characteristic sequence of powers of an integer $k \geq 2$, given by
$$n \mapsto \begin{cases} 1 & \text{if } n = k^t \text{ for some } t \geq 0, \\ 0 & \text{otherwise,} \end{cases}$$
is not a generalised polynomial.
Theorem D.
Let $k \geq 2$ be an integer. Then exactly one of the following statements holds:
- (i) All sequences that are simultaneously $k$-automatic and generalised polynomial are ultimately periodic.
- (ii) The characteristic sequence of the powers of $k$ is generalised polynomial.
Unfortunately, we are currently unable to decide which of the two possibilities in Theorem D holds. Although we expect that should not be a generalised polynomial, in [BK16] we obtain several examples of algebraic numbers such that the characteristic function of the set is generalised polynomial, where denotes the closest integer to . All our examples are Pisot units (a Pisot number is an algebraic integer all of whose conjugates have modulus ; a Pisot unit is a Pisot number whose minimal polynomial has constant term ). Conversely, there is no for which we can prove that the characteristic function of is not given by a generalised polynomial. This prompts us to propose the following question.
Question A.
Suppose that is such that the characteristic function of the set is given by a generalised polynomial. Is it then necessarily the case that is a Pisot unit?
For a more detailed discussion of this question, see [BK16, Section 6]. If is a Pisot number, then obeys a linear recurrence. We show that for such , the characteristic function of cannot be a counterexample to Conjecture A (see Proposition 4.9) except possibly if is an integer.
By Theorem D, determining the validity of Conjecture A is equivalent to answering Question A in the special case when is an integer.
Contents
In Section 1, we discuss some basic notions and results concerning automatic sequences and dynamical systems. We intended this section to be accessible to readers familiar with only one (or neither) of these topics. In Section 2, we prove Theorem A and Theorem B using methods from topological dynamics. In Section 3, we use known results on growth and structure of automatic sequences to prove that they are either very sparse and structured (in which case we call them arid) or are combinatorially rich. Together with a result about dynamics on nilmanifolds, this allows us to obtain Theorem C. Section 4 contains four separate topics concerning examples and non-examples of automatic sets and uniform density of symbols in automatic sequences. Section 5 is devoted to the proof of Theorem D. Finally, Section 6 discusses some open problems and future research topics.
Acknowledgements
The authors thank Ben Green for much useful advice during the work on this project, Vitaly Bergelson and Inger Håland-Knutson for valuable comments on the distribution of generalised polynomials, and Jean-Paul Allouche and Narad Rampersad for information about related results on automatic sequences.
Thanks also go to Sean Eberhard, Dominik Kwietniak, Freddie Manners, Rudi Mrazović, Przemek Mazur, Sofia Lindqvist, and Aled Walker for many informal discussions.
This research was supported by the National Science Centre, Poland (NCN) under grant no. DEC-2012/07/E/ST1/00185.
Finally, we would like to express our gratitude to the organisers of the conference New developments around $\times 2$ $\times 3$ conjecture and other classical problems in Ergodic Theory in Cieplice, Poland in May 2016 where we began our project.
1. Background
Notations and generalities
We denote the sets of positive integers and of nonnegative integers by $\mathbb{N}$ and $\mathbb{N}_0$. We denote by $[n]$ the set $[n] = \{0, 1, \ldots, n-1\}$. We use the Iverson convention: whenever $\varphi$ is any sentence, we denote by $\llbracket \varphi \rrbracket$ its logical value ($1$ if $\varphi$ is true and $0$ otherwise). We denote the number of elements in a finite set $A$ by $|A|$.
For a real number , we denote its integer part by , its fractional part by , the nearest integer to by , and the distance from to the nearest integer by .
We use some standard asymptotic notation. Let $f$ and $g$ be two functions defined for sufficiently large integers. We say that $f = O(g)$ or $f \ll g$ if there exists $c > 0$ such that $|f(n)| \leq c\,|g(n)|$ for sufficiently large $n$. We say that $f = o(g)$ if for every $c > 0$ we have $|f(n)| \leq c\,|g(n)|$ for sufficiently large $n$.
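For a subset $E \subset \mathbb{N}_0$, we say that $E$ has natural density $d(E)$ if
$$\lim_{N \to \infty} \frac{|E \cap [0, N)|}{N} = d(E).$$
We say that $E$ has upper Banach density $d^*(E)$ if
$$\limsup_{N \to \infty} \max_{M \geq 0} \frac{|E \cap [M, M+N)|}{N} = d^*(E).$$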
We now formally define generalised polynomials.
Definition 1.1 (Generalised polynomial).
The family of generalised polynomials is the smallest set of functions $\mathbb{Z} \to \mathbb{R}$ containing the polynomial maps and closed under addition, multiplication, and the operation of taking the integer part. Whenever it is more convenient, we regard generalised polynomials as functions on $\mathbb{N}_0$.
A set $E \subset \mathbb{Z}$ (or $E \subset \mathbb{N}_0$) is called generalised polynomial if its characteristic function $f$ given by $f(n) = \llbracket n \in E \rrbracket$ is a generalised polynomial. (Note that this definition depends on whether we are regarding the generalised polynomial as a function on $\mathbb{Z}$ or on $\mathbb{N}_0$, and a generalised polynomial set $E \subset \mathbb{N}_0$ might a priori not be generalised polynomial when considered as a subset of $\mathbb{Z}$. It will always be clear from the context which meaning we have in mind.)
An example of a generalised polynomial is therefore a function given by the formula .
Automatic sequences
Whenever $A$ is a (finite) set, we denote the free monoid with basis $A$ by $A^*$. It consists of finite words in $A$, including the empty word $\epsilon$, with the operation of concatenation. We denote the concatenation of two words $v, w \in A^*$ by $vw$ and we denote the length of a word $w \in A^*$ by $|w|$. In particular, $|\epsilon| = 0$. We say that a word $v \in A^*$ is a factor of a word $w \in A^*$ if there exist words $u, u' \in A^*$ such that $w = u v u'$. We denote by $w^R$ the reversal of the word $w$ (the word in which the elements of $A$ are written in the opposite order).
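Let $k \geq 2$ be an integer and denote by $\Sigma_k = \{0, 1, \ldots, k-1\}$ the set of digits in base $k$. For $w \in \Sigma_k^*$, we denote by $[w]_k$ the integer whose expansion in base $k$ is $w$, i.e., if $w = v_l v_{l-1} \cdots v_1 v_0$, $v_i \in \Sigma_k$, then $[w]_k = \sum_{i=0}^{l} v_i k^i$. Conversely, for an integer $n \geq 0$, we write $(n)_k \in \Sigma_k^*$ for the base-$k$ representation of $n$ (without an initial zero). In particular, $(0)_k = \epsilon$.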
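The two conversions between integers and digit words can be sketched in a few lines of Python. This is a minimal illustration, assuming words are represented as lists of integer digits with the most significant digit first; the function names `to_base` and `from_base` are hypothetical.

```python
def to_base(n, k):
    """Base-k representation of n >= 0 without leading zeros; 0 maps to the empty word."""
    digits = []
    while n > 0:
        digits.append(n % k)  # least significant digit first
        n //= k
    return digits[::-1]       # most significant digit first


def from_base(word, k):
    """Integer whose base-k expansion is `word`; leading zeros are allowed."""
    value = 0
    for digit in word:
        value = value * k + digit
    return value
```

Note that `from_base` accepts words with leading zeros, whereas `to_base` never produces them, so `from_base(to_base(n, k), k) == n` holds for every `n`, but the converse composition only recovers a word up to leading zeros.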
The class of automatic sequences consists, informally speaking, of finite-valued sequences whose values are obtained via a finite procedure from the digits of the base-$k$ expansion of an integer $n$.
The most famous example of an automatic sequence is arguably the Thue–Morse sequence, first discovered by Prouhet in 1851. Let $s_2(n)$ denote the sum of digits of the base 2 expansion of an integer $n$. Then the Thue–Morse sequence $(t_n)_{n \geq 0}$ is given by $t_n = 1$ if $s_2(n)$ is odd and $t_n = 0$ if $s_2(n)$ is even.
We will introduce the basic properties of automatic sequences. For more information, we refer the reader to the canonical book of Allouche and Shallit [AS03a]. To formally introduce the notion of automatic sequences, we begin by discussing finite automata.
Definition 1.2.
A deterministic finite $k$-automaton with output (which we will just call a $k$-automaton) consists of the following data:
- (i) a finite set of states $S$;
- (ii) an initial state $s_0 \in S$;
- (iii) a transition map $\delta \colon S \times \Sigma_k \to S$;
- (iv) an output set $\Omega$;
- (v) an output map $\tau \colon S \to \Omega$.
We extend the map $\delta$ to a map $\delta \colon S \times \Sigma_k^* \to S$ (denoted by the same letter) by the recurrence formula
$$\delta(s, \epsilon) = s, \qquad \delta(s, wv) = \delta(\delta(s, w), v), \qquad s \in S,\ w \in \Sigma_k^*,\ v \in \Sigma_k.$$
We call a sequence $k$-automatic if it can be produced by a $k$-automaton in the following manner: one starts at the initial state of the automaton, follows the digits of the base-$k$ expansion of an integer $n$, and then uses the output function to print the $n$-th term of the sequence. This is stated more precisely in the following definition.
Definition 1.3.
A sequence $(a_n)_{n \geq 0}$ with values in a finite set $\Omega$ is $k$-automatic if there exists a $k$-automaton $(S, s_0, \delta, \Omega, \tau)$ such that $a_n = \tau(\delta(s_0, (n)_k))$. We call a set $E$ of nonnegative integers automatic if the characteristic sequence $(a_n)_{n \geq 0}$ of $E$ given by $a_n = \llbracket n \in E \rrbracket$ is automatic.
For some applications, it will be useful to consider the following variant of the definition. A function $\tilde{a} \colon \Sigma_k^* \to \Omega$ is automatic if there exists a $k$-automaton $(S, s_0, \delta, \Omega, \tau)$ such that $\tilde{a}(u) = \tau(\delta(s_0, u))$ for $u \in \Sigma_k^*$.
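The values of the Thue–Morse sequence are given by the $2$-automaton
[Diagram: two states, $s_0$ (initial) and $s_1$; each state carries a loop labelled $0$, and edges labelled $1$ lead from $s_0$ to $s_1$ and from $s_1$ to $s_0$.]
with nodes depicting the states of the automaton, edges describing the transition map, output $0$ at state $s_0$, and output $1$ at state $s_1$. Thus, the Thue–Morse sequence is $2$-automatic.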
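The automaton just described can be simulated directly. The sketch below is illustrative: the dictionary encoding of the transition and output maps and the names `DELTA`, `TAU`, and `run_automaton` are assumptions made for this example, and the digits are read most significant first.

```python
# Transition map of the two-state automaton: reading 0 keeps the state,
# reading 1 swaps the two states.
DELTA = {
    ("s0", 0): "s0", ("s0", 1): "s1",
    ("s1", 0): "s1", ("s1", 1): "s0",
}
# Output map: the state records the parity of the number of 1s read so far.
TAU = {"s0": 0, "s1": 1}


def run_automaton(n, k=2, delta=DELTA, tau=TAU, start="s0"):
    """Feed the base-k digits of n (most significant first) to the automaton."""
    digits = []
    while n > 0:
        digits.append(n % k)
        n //= k
    state = start
    for digit in reversed(digits):
        state = delta[(state, digit)]
    return tau[state]


def thue_morse_by_digit_sum(n):
    """Parity of the sum of binary digits of n."""
    return bin(n).count("1") % 2
```

Comparing `run_automaton(n)` with `thue_morse_by_digit_sum(n)` over a range of `n` confirms that the automaton outputs the parity of the binary digit sum.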
In the definition above, the automaton reads the digits starting with the most significant one. In fact, we might equally well demand that the digits be read starting with the least significant digit or that the automaton produce the correct answer even if the input contains some leading zeros. Neither of these modifications changes the notion of automatic sequence [AS03a, Theorem 5.2.3] (though of course for most sequences we would need to use a different automaton to produce a given automatic sequence).
There are a number of equivalent definitions of the notion of automatic sequence connecting them to different branches of mathematics (stated for example in terms of algebraic power series over finite fields or letter-to-letter projections of fixed points of uniform morphisms of free monoids). We will need one such definition that has a combinatorial flavour and is expressed in terms of the $k$-kernel.
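Definition 1.4.
The $k$-kernel $\mathcal{N}_k((a_n)_{n \geq 0})$ of a sequence $(a_n)_{n \geq 0}$ is the set of its subsequences of the form
$$\mathcal{N}_k((a_n)_{n \geq 0}) = \left\{ (a_{k^l n + r})_{n \geq 0} \ \middle|\ l \geq 0,\ 0 \leq r < k^l \right\}.$$
Automaticity of a sequence is equivalent to finiteness of its kernel, a result originally due to Eilenberg [Eil74].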
Proposition 1.5.
[AS03a, Theorem 6.6.2] Let $(a_n)_{n \geq 0}$ be a sequence. Then the following conditions are equivalent:
- (i) The sequence $(a_n)_{n \geq 0}$ is $k$-automatic.
- (ii) The $k$-kernel of $(a_n)_{n \geq 0}$ is finite.
For the Thue–Morse sequence $(t_n)_{n \geq 0}$ we have the relations $t_{2n} = t_n$, $t_{2n+1} = 1 - t_n$, and hence one easily sees that the $2$-kernel consists of only two sequences, $(t_n)_{n \geq 0}$ and $(1 - t_n)_{n \geq 0}$. This gives another argument for the $2$-automaticity of the Thue–Morse sequence.
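The size of the kernel can be explored numerically by comparing finite prefixes of the subsequences $n \mapsto a_{k^l n + r}$. The Python sketch below is a hedged illustration: distinct prefixes certify distinct kernel elements, so the count is a lower bound for the part of the kernel that is examined, while equal prefixes prove nothing on their own. The function names and the parameters `max_level` and `prefix_len` are hypothetical.

```python
def thue_morse(n):
    """n-th term of the Thue-Morse sequence: parity of the binary digit sum."""
    return bin(n).count("1") % 2


def kernel_prefixes(a, k, max_level, prefix_len):
    """Distinct length-`prefix_len` prefixes of n -> a(k**l * n + r), 0 <= l <= max_level."""
    seen = set()
    for level in range(max_level + 1):
        step = k ** level
        for r in range(step):
            seen.add(tuple(a(step * n + r) for n in range(prefix_len)))
    return seen
```

For the Thue–Morse sequence, `kernel_prefixes(thue_morse, 2, 6, 32)` contains exactly two prefixes, matching the two kernel elements identified above.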
An automatic sequence by definition takes only finitely many values. In 1992 Allouche and Shallit [AS92] generalised the notion of automatic sequences to the wider class of $k$-regular sequences that are allowed to take values in a possibly infinite set. The definition of regular sequences is stated in terms of the $k$-kernel. For simplicity, we state the definition over the ring of integers, though it could also be introduced over a general (noetherian) ring.
Definition 1.6.
Let $(a_n)_{n \geq 0}$ be a sequence of integers. We say that the sequence $(a_n)_{n \geq 0}$ is $k$-regular if its $k$-kernel spans a finitely generated abelian subgroup of $\mathbb{Z}^{\mathbb{N}_0}$.
For example, the following sequences are easily seen to be -regular: , , . (The corresponding subgroups spanned by the -kernel have rank , , and , respectively. In the case of , the subgroup spanned by the -kernel is free abelian with basis consisting of and the constant sequence .) In fact, every -automatic (integer-valued) sequence is obviously -regular, and the following converse result holds.
Theorem 1.7.
[AS03a, Theorem 16.1.5] Let $(a_n)_{n \geq 0}$ be a sequence of integers. Then the following conditions are equivalent:
- (i) The sequence $(a_n)_{n \geq 0}$ is $k$-automatic.
- (ii) The sequence $(a_n)_{n \geq 0}$ is $k$-regular and takes only finitely many values.
Corollary 1.8.
[AS03a, Corollary 16.1.6] Let $(a_n)_{n \geq 0}$ be a sequence of integers that is $k$-regular and let $m \geq 1$ be an integer. Then the sequence $(a_n \bmod m)_{n \geq 0}$ is $k$-automatic.
A convenient tool for ruling out that a given sequence is automatic is provided by the pumping lemma.
Lemma 1.9.
[AS03a, Lemma 4.2.1]** Let be a -automatic sequence. Then there exists a constant such that for any with and any integer there exist such that , , , and takes the same value for all .
The final issue that we need to discuss is the dependence of the notion of $k$-automaticity on the base $k$. While the Thue–Morse sequence is $2$-regular, and is also easily seen to be $4$-regular, it is not $3$-regular. This follows from the celebrated result of Cobham [Cob69]. We say that two integers $k, l \geq 2$ are multiplicatively independent if they are not both powers of the same integer (equivalently, $\log k / \log l \notin \mathbb{Q}$).
Theorem 1.10.
[AS03a, Theorem 11.2.2] Let $(a_n)_{n \geq 0}$ be a sequence with values in a finite set $\Omega$. Assume that the sequence $(a_n)_{n \geq 0}$ is simultaneously $k$-automatic and $l$-automatic with respect to two multiplicatively independent integers $k, l \geq 2$. Then $(a_n)_{n \geq 0}$ is eventually periodic.
We will have no use for Cobham’s theorem. We will, however, use the following much easier related result.
Theorem 1.11.
[AS03a, Theorem 6.6.4] Let $(a_n)_{n \geq 0}$ be a sequence with values in a finite set $\Omega$. Let $k, l \geq 2$ be two multiplicatively dependent integers. Then the sequence $(a_n)_{n \geq 0}$ is $k$-automatic if and only if it is $l$-automatic.
Let $\Sigma$ denote a finite alphabet and let $L$ and $L'$ be languages, i.e., subsets of $\Sigma^*$. We denote by $LL' = \{ww' \mid w \in L,\ w' \in L'\}$ the concatenation of $L$ and $L'$. For an integer $i \geq 0$, we denote by $L^i$ the concatenation of $i$ copies of $L$ with the understanding that $L^0 = \{\epsilon\}$. The Kleene closure of $L$ is $L^* = \bigcup_{i \geq 0} L^i$. A language is regular if it can be obtained from the empty set and the letters of the alphabet using the operations of union, concatenation, and the Kleene closure.
Regular languages are intimately connected with automatic sequences via Kleene’s theorem [Kle56] (see also [AS03a, Thm. 4.1.5]), which says that a language $L$ over the alphabet $\Sigma_k$ is regular if and only if the sequence $(a_n)_{n \geq 0}$ given by $a_n = \llbracket (n)_k \in L \rrbracket$ is $k$-automatic.
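As an illustration of this correspondence (a sketch that is not taken from the source), the set of powers of $2$ is $2$-automatic because its base-$2$ representations form the regular language $10^*$. The Python snippet below tests membership with the standard `re` module; the names `POWER_OF_TWO` and `is_power_of_two` are hypothetical, and the representation of $0$ is taken to be the empty word.

```python
import re

# Regular expression for the language 10*: a single 1 followed by zeros.
POWER_OF_TWO = re.compile(r"10*")


def is_power_of_two(n):
    """Decide whether the base-2 representation of n >= 0 lies in the language 10*."""
    word = format(n, "b") if n > 0 else ""  # 0 is represented by the empty word
    return POWER_OF_TWO.fullmatch(word) is not None
```

The same pattern works in any base $k$ with the digit string of $n$ in base $k$, which is why the characteristic sequence of the powers of $k$ is always $k$-automatic.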
Dynamical systems
An (invertible, topological) dynamical system $(X, T)$ is given by a compact metrisable space $X$ and a continuous homeomorphism $T \colon X \to X$. We say that $(X, T)$ is minimal if for every point $x \in X$ the orbit $\{T^n x \mid n \in \mathbb{N}_0\}$ is dense in $X$. (Equivalently, the only closed subsets $Y \subset X$ such that $T(Y) \subset Y$ are $Y = \emptyset$ or $Y = X$.) We say that $(X, T)$ is totally minimal if the system $(X, T^n)$ is minimal for all $n \geq 1$.
Let $(X, T)$ be a dynamical system. We say that a Borel measure $\mu$ on $X$ is invariant if for every Borel subset $A \subset X$ we have $\mu(T^{-1}(A)) = \mu(A)$. By the Krylov–Bogoliubov theorem (see, e.g., [EW11, Thm. 4.1]), each dynamical system has at least one invariant measure. We say that a dynamical system is uniquely ergodic if it has exactly one invariant measure.
If $(X, T)$ is minimal, $x \in X$, and $U \subset X$ is open and non-empty, then the set $\{n \in \mathbb{N}_0 \mid T^n x \in U\}$ is syndetic, i.e., has bounded gaps [Fur81, Thm. 1.15].
We will need the following standard consequence of the ergodic theorem [EW11, Thm 4.10], which we also note in [BK16, Corollary 1.4]. (Below and elsewhere, denotes the boundary of the set .)
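Corollary 1.12.
Let $(X, T)$ be a uniquely ergodic dynamical system with the invariant measure $\mu$. Then for any $x \in X$ and any $S \subset X$ with $\mu(\partial S) = 0$, the set $\{n \in \mathbb{N}_0 \mid T^n x \in S\}$ has upper Banach density $\mu(S)$.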
In fact, in this case the limit superior in the definition of upper Banach density can be replaced by a limit.
The connection between generalised polynomials and dynamics of nilsystems has been intensely studied by Bergelson and Leibman in [BL07] (see also [Lei12]). Nilsystems are a widely studied class of dynamical systems of algebraic origin. Here, we only need several properties which these systems enjoy; in particular, we shall spare the reader the definition of a nilsystem. A good introduction to nilsystems may be found in the initial sections of [BL07].
A nilsystem is minimal if and only if it is uniquely ergodic; the unique invariant measure then has full support. If $(X, T)$ is minimal but not totally minimal, then $X$ splits into finitely many connected components $X_1, \ldots, X_n$, each $X_i$ is preserved by $T^n$, and each $(X_i, T^n)$ is a totally minimal nilsystem.
As a special case of the aforementioned connection between nilsystems and generalised polynomials [BL07, Thm. A], we have the following result. (For more details, see also [BK16].)
Theorem 1.13 (Bergelson–Leibman).
Let be a generalised polynomial taking finitely many values . Then there exists a minimal nilsystem as well as a point and a partition such that and
[TABLE]
for each .
Remark 1.14.
Let be a generalised polynomial taking finitely many values. Then there exists such that for any the generalised polynomial has a representation as in Theorem 1.13 with totally minimal.
2. Density 1 results
Polynomial sequences
Our first purpose in this section is to prove Theorem A. Recall that we aim to show that the sequence $\lfloor p(n) \rfloor$ is not regular if $p$ has at least one irrational coefficient other than the constant term. We will show more, namely that the sequence $\lfloor p(n) \rfloor \bmod m$ is not automatic for any $m \geq 2$. In fact, we will only need to work with the weaker property of weak periodicity, defined in the introduction.
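Lemma 2.1.
Every automatic sequence is weakly periodic.
Proof.
Let $f$ be a $k$-automatic sequence. Since the restriction of a $k$-automatic sequence to an arithmetic progression is again $k$-automatic [AS03a, Theorem 6.8.1], it will suffice to find $q \in \mathbb{N}$ and $r, r' \in \mathbb{N}_0$ with $r \neq r'$ such that $f(qn + r) = f(qn + r')$ for all $n$.
The $k$-kernel of $f$, consisting of the functions $n \mapsto f(k^t n + r)$ for $t \geq 0$ and $0 \leq r < k^t$, is finite. Pick $t$ sufficiently large that $k^t$ exceeds the number of elements of the $k$-kernel. By the pigeonhole principle, there exist $r \neq r'$ with $0 \leq r, r' < k^t$ such that $f(k^t n + r) = f(k^t n + r')$ for all $n$. ∎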
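As a worked instance of this pigeonhole argument (an illustration only, with $t_n$ denoting the $n$-th term of the Thue–Morse sequence, i.e., the parity of the binary digit sum of $n$): the $2$-kernel has two elements, so it suffices to look at the four residues modulo $4$.

```latex
\[
  t_{4n} = t_n, \qquad t_{4n+1} = 1 - t_n, \qquad t_{4n+2} = 1 - t_n, \qquad t_{4n+3} = t_n ,
\]
\[
  \text{so } t_{4n+0} = t_{4n+3} \text{ for all } n \geq 0,
  \quad \text{i.e. } q = 4,\ r = 0,\ r' = 3 .
\]
```

The identities hold because appending the binary digits $00$, $01$, $10$, $11$ to the expansion of $n$ changes the digit sum by $0$, $1$, $1$, $2$ respectively.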
The proof of the following proposition is closely analogous to Furstenberg’s proof [Fur61] of Weyl’s equidistribution theorem [Wey16] (see also [EW11, Section 4.4.3]).
Proposition 2.2.
Let $p \in \mathbb{R}[x]$ be a polynomial, and let $m \geq 2$ be an integer. Then the sequence $\lfloor p(n) \rfloor \bmod m$ is weakly periodic if and only if it is periodic. This happens precisely when all non-constant coefficients of $p$ are rational.
Proof.
If all coefficients of $p$ are rational (except possibly for the constant term) then the sequence is easily seen to be periodic, hence weakly periodic.
Now suppose that at least one non-constant coefficient of is irrational. Replacing with for multiplicatively large and , we may assume that the leading coefficient of is irrational. We will prove marginally more than claimed, namely that for any the sequence given by
[TABLE]
fails to be weakly periodic. For a proof by contradiction, suppose this claim is false for some choice of .
It will be convenient to expand , where , , and . Note that and
[TABLE]
We will represent the sequence dynamically. Let be the -dimensional torus and define the self-map by
[TABLE]
Put for . A direct computation shows that for and we have
[TABLE]
and in particular Putting , we thus find that
[TABLE]
Since is weakly periodic, we may find and such that .
The dynamical system can be obtained as a sequence of iterated group extension over an irrational rotation, and hence is totally minimal (this follows easily from the results in, e.g., [EW11, Section 4.4.3]). In particular, for any point we may find a sequence such that and . It follows that the points converge to and lie in . Thus, . In light of total minimality of , this is only possible if or — but this is absurd. ∎
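For orientation, the degree-two case of Furstenberg's skew product can be written out explicitly. This is the classical example, stated here under the assumption $p(n) = \alpha n^2$ with $\alpha$ irrational; the map used in the proof above handles general degree and may be normalised differently.

```latex
\[
  T \colon \mathbb{T}^2 \to \mathbb{T}^2, \qquad
  T(x, y) = (x + \alpha,\ y + 2x + \alpha).
\]
By induction on $n$,
\[
  T^n(0, 0) = (n\alpha,\ n^2\alpha) \bmod 1,
\]
since
\[
  T(n\alpha,\ n^2\alpha)
  = \bigl((n+1)\alpha,\ n^2\alpha + 2n\alpha + \alpha\bigr)
  = \bigl((n+1)\alpha,\ (n+1)^2\alpha\bigr).
\]
```

Running the same construction with $\alpha/m$ in place of $\alpha$, the value $\lfloor \alpha n^2 \rfloor \bmod m$ equals $j$ exactly when the second coordinate of $T^n(0,0)$ lies in $[j/m, (j+1)/m)$, which is how a sequence of this type is read off from the orbit of a single point.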
Corollary 2.3.
With the notation of Proposition 2.2, the sequence $\lfloor p(n) \rfloor \bmod m$ is automatic if and only if it is periodic, and if and only if all the non-constant coefficients of $p$ are rational.
Proof.
Immediate from Proposition 2.2 and Lemma 2.1. ∎
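The contrast between an automatic sequence and a sequence of the form $\lfloor p(n) \rfloor \bmod m$ with an irrational coefficient can be observed numerically. The Python sketch below is a finite experiment, not a proof: it assumes the specific choice $p(n) = \sqrt{2}\,n$ and $m = 2$, computes $\lfloor \sqrt{2}\,n \rfloor$ exactly as the integer square root of $2n^2$, and counts distinct prefixes of the subsequences $n \mapsto a(2^l n + r)$ at a fixed level $l$. All names are hypothetical.

```python
from math import isqrt


def beatty_mod2(n):
    """floor(sqrt(2) * n) mod 2, computed exactly via the integer square root."""
    return isqrt(2 * n * n) % 2


def thue_morse(n):
    """Parity of the binary digit sum of n, used here as an automatic baseline."""
    return bin(n).count("1") % 2


def distinct_prefixes_at_level(a, k, level, prefix_len):
    """Number of distinct length-`prefix_len` prefixes of n -> a(k**level * n + r)."""
    step = k ** level
    return len({
        tuple(a(step * n + r) for n in range(prefix_len))
        for r in range(step)
    })
```

Already at level $2$ the four subsequences of `beatty_mod2` have pairwise distinct prefixes, whereas `thue_morse` never yields more than two; increasing the level shows how the first count develops, in line with the corollary.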
Proof of Theorem A.
Suppose first that all non-constant coefficients of are rational, and fix an integer . Let be such that has integer coefficients, except possibly for the constant term. Then is an integer-valued polynomial, hence is -regular ( is contained in the -dimensional -module consisting of integer-valued polynomials of degree ). Also, is periodic, hence -automatic, hence -regular. It follows that is regular.
Conversely, suppose that is regular. Then by Theorem 1.7 for any choice of the sequence is automatic. Now, it follows from Corollary 2.3 that all non-constant coefficients of are rational. ∎
Generalised polynomials
Having dealt with the case of polynomial maps, we move on to a more general context. Our next goal is to prove Theorem B. We begin by abstracting and generalising some of the key steps from the proof of Theorem A.
Recall that a set of integers is thick if it contains arbitrarily long segments of consecutive integers, and syndetic if it has bounded gaps; every thick set intersects every syndetic set.
Lemma 2.4.
Let be a totally minimal dynamical system. Let be a set which is neither empty nor dense and such that . Let . Suppose that is a sequence such that the set of with f(n)=\left\llbracket T^{n}z\in A\right\rrbracket is thick. Then is not weakly periodic.
Proof.
Suppose for the sake of contradiction that is weakly periodic. In particular there exist , with such that . Put .
We will show that . Since is continuous and , it will suffice to prove that . Once this is accomplished, the contradiction follows immediately, because is minimal, while .
Pick any and an open neighbourhood of ; we aim to show that . Put , and consider the set of those for which . Since is minimal and , the set is syndetic. Let be the set of those for which f(n)=\left\llbracket T^{n}z\in A\right\rrbracket and put and .
Since is thick, so is . Since is syndetic, is non-empty. Pick any and put . Since , we have , and so . Since , we have f(qn+r)=\left\llbracket x\in A\right\rrbracket=1, and hence also . Finally, since , we have 1=f(qn+r^{\prime})=\left\llbracket T^{d}x\in A\right\rrbracket, meaning that . In particular, , which was our goal. ∎
Remark 2.5.
Some mild topological restrictions on the target set are, of course, necessary in the above lemma. Note that any open, non-dense and non-empty subset of will satisfy the stated assumptions.
The assumption that the map is totally minimal is essential. Indeed, take to be the Thue–Morse shift, i.e., the closed orbit under the shift map of the Thue–Morse sequence. Let
[TABLE]
Since the Thue–Morse sequence has the property for all and since the Thue–Morse word contains no cubes (i.e., no occurrences of factors of the form with , ), we see that , and and are clopen. Let be the Thue–Morse sequence. Then the function $f(n) = \llbracket T^n z \in A \rrbracket$ is periodic with period , and while is minimal, it is not totally minimal.
The analogue of the representation of a polynomial sequence using a skew rotation on the torus in (5) is provided by the Bergelson–Leibman Theorem 1.13. We are now ready to state and prove the main result of this section, from which Theorem B easily follows.
Theorem 2.6.
Let be a generalised polynomial taking finitely many values, and let be a weakly periodic sequence which agrees with on a thick set . Then there exists a set with such that the common restriction of and to is periodic.
Proof.
Let the minimal nilsystem , , and a partition be as in Theorem 1.13, so that in particular
[TABLE]
If is not totally minimal, then (as in Remark 1.14) we may find such that for any , has a representation as in (6) on a totally minimal nilsystem. Clearly, is weakly periodic and agrees with on the thick set . Thus, it will suffice to prove the theorem under the additional assumption that is totally minimal.
We may write
[TABLE]
where unless . In particular (by Corollary 1.12), the set of with has upper Banach density [math]. Note that is then thick.
For , put g_{j}^{\prime}(n)=\left\llbracket T^{n}z\in\operatorname{int}S_{j}\right\rrbracket and f^{\prime}_{j}(n)=\left\llbracket f(n)=c_{j}\right\rrbracket. Then for . By Lemma 2.4, this is only possible if for each , the set is either empty or dense. Since , there is such that is dense, and for . Denoting by the set of with we have and for , as needed. ∎
Proof of Theorem B.
This is a direct application of Theorem 2.6 with and ∎
It is not a trivial matter to determine whether a given generalised polynomial is periodic away from a set of density [math], although it can be accomplished by the techniques in [BL07, Lei12]. In order to give explicit examples, we restrict ourselves to generalised polynomials of a specific form, which is somewhat more general than the one considered in Proposition 2.2.
Corollary 2.7**.**
Suppose that is a generalised polynomial with the property that is equidistributed in for any and , and let . Then the sequence is not automatic.
Proof.
Suppose were automatic. By Theorem B, there exist and with such that is constant for . Hence, there is some such that for , contradicting the equidistribution assumption. ∎
The uniform distribution of generalised polynomials has been extensively studied by Håland-Knutson [Hål93, Hål94, HK95], and later a very general theory was developed by Bergelson and Leibman [BL07, Lei12]. In view of the results in [Hål93], it is fair to say that a “generic” generalised polynomial is equidistributed modulo . Hence, the assumptions on in Corollary 2.7 are not overly restrictive.
To make the last remark precise, let us define the (multi)set of coefficients of a generalised polynomial as follows. If is a polynomial, then the coefficients of are the non-zero terms among the . If or , then the coefficients of are the union of the coefficients of and . Finally, if , then the coefficients of are the union of the coefficients of and the coefficients of . The set of coefficients will depend on the choice of a representation of the generalised polynomial at hand; we fix one such choice. We cite a slightly simplified version of the main theorem of [Hål93].
Theorem 2.8**.**
Suppose that is a generalised polynomial, and all of the products of subsets of the coefficients of are -linearly independent. Then is equidistributed modulo .
As an example of an application, we conclude that is not an automatic sequence.
3. Combinatorial structure of automatic sets
In this section, we begin the investigation of sparse sequences. Here, we call a sequence sparse if it is the characteristic function of a set of density [math] (if such a sequence comes from a generalised polynomial or is automatic, it also has upper Banach density [math], cf. [BK16] and Lemma 4.8 below). Note that for such sparse sequences, Theorem B conveys no useful information. Conversely, to prove Conjecture A, it would suffice (in light of Theorem B) to verify it for sparse sequences; this observation will be made precise in the proof of Theorem C below.
Arid sets
To formulate our main result, it is convenient to introduce the following piece of terminology, inspired by Kedlaya [Ked06]. Such sets appear in the papers of Szilard–Yu–Zhang–Shallit [SYZS92], Gawrychowski–Krieger–Rampersad–Shallit [GKRS10], Derksen [Der07] and Adamczewski–Bell [AB08] (among many others) under different names (regular languages of polynomial growth/sparse/poly-slender/bounded) or without any name. A closely related class of sets known as -normal sets plays a significant rôle in the study of zero sets of linear recurrences in positive characteristic; see also [DM15, AB12]. Other related classes of sets include Saguaro sets of [AB08] and -sets of [MS02]. Since we will use the notation simultaneously for languages and for the associated sets of integers, and since some of the existing terminology might be confusing in our context, we have decided to use a different term.
Definition 3.1** (Arid sets).**
Let , be integers. A basic -arid set (of rank ) is a set of the form
[TABLE]
where and . A set is -arid (of rank ) if it is a finite union of basic arid sets (of rank ). If is clear from the context, we speak simply of (basic) arid sets.
We similarly define these notions for sets of integers: a set is -arid (of rank ) if it has the form where is arid (of rank ). A sequence is arid if the set is arid.
Using the Kleene star notation, the -arid set in (8) can be alternatively written as
[TABLE]
In the following, we will not use this notation, and rather use the former notation which seems more appropriate for our context.
Lemma 3.2**.**
Any -arid sequence is -automatic.
Proof.
It is clear that any -arid set is given by a regular expression and hence it is -automatic by Kleene’s theorem. Alternatively, in this simple case one can construct the required automata by hand. ∎
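To illustrate Lemma 3.2 and the poly-logarithmic growth of arid sets, the following minimal sketch enumerates a basic arid set of rank 2 in base 2. It assumes that a basic arid set consists of the words $v_0 w_1^{l_1} v_1 \cdots w_r^{l_r} v_r$ with $l_1,\dots,l_r \geq 0$, in line with the Kleene star description above; the function names `arid_regex` and `arid_words` and the particular choice of words are hypothetical.

```python
import re
from itertools import product


def arid_regex(vs, ws):
    """Regular expression v0 (w1)* v1 ... (wr)* vr for a basic arid set."""
    assert len(vs) == len(ws) + 1 and all(ws)
    parts = [re.escape(vs[0])]
    for w, v in zip(ws, vs[1:]):
        parts.append("(?:" + re.escape(w) + ")*" + re.escape(v))
    return re.compile("".join(parts))


def arid_words(vs, ws, max_len):
    """All words v0 w1^l1 v1 ... wr^lr vr of length at most max_len."""
    words = set()
    for exps in product(range(max_len + 1), repeat=len(ws)):
        word = vs[0] + "".join(
            w * l + v for w, l, v in zip(ws, exps, vs[1:])
        )
        if len(word) <= max_len:
            words.add(word)
    return words


# Illustrative choice: the words 1 0^a 1^b over the binary alphabet.
vs, ws = ["1", "", ""], ["0", "1"]
pattern = arid_regex(vs, ws)
words = arid_words(vs, ws, 12)

# The same set, recovered by testing every integer below 2^12 against
# the regular expression (this is the "automatic" description).
matching = [n for n in range(1, 2 ** 12) if pattern.fullmatch(bin(n)[2:])]
values = sorted(int(w, 2) for w in words)
```

In this example the words of length at most $L$ correspond to pairs $(a,b)$ with $a+b \leq L-1$, so their number is quadratic in $L$, that is, quadratic in the logarithm of the size of the integers they represent.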
Cobham [Cob72] proved that there is a gap in the growth rate of automatic sets.
Proposition 3.3**.**
Let be a non-empty automatic set. Then exactly one of the following two conditions holds:
- (i) There exists an integer and a real number such that
[TABLE]
- (ii) There exists such that
[TABLE]
Proof.
This follows from [Cob72, Theorems 11 & 12]. ∎
According to the proposition above, automatic sets have either poly-logarithmic or polynomial rate of growth. Szilard–Yu–Zhang–Shallit [SYZS92] showed that the class of automatic sets of poly-logarithmic growth coincides with the class of arid sets. To state a more precise version of this result, we recall that a state in a -automaton with output is called accessible if there exists such that , and is called coaccessible if there exists such that .
Proposition 3.4**.**
Let be a -automatic set and let be a -automaton with output that produces , in the sense that an integer is in if and only if . Then the following conditions are equivalent:
- (i) The set is arid.
- (ii) There exists an integer such that .
- (iii) There does not exist an accessible and coaccessible state and such that and .
Moreover, if is arid of rank , then the limit exists and is finite.
Proof.
This is essentially proved in [SYZS92]; our formulation is influenced by [BHS17, Lemmas 2.1–2.3] (for more details and related results see references therein). ∎
Remark**.**
Some similar results are also implicit in [AB08, Lemma 6.7] and [Der07, Proposition 7.9]; see also [Ked06].
Remark 3.5**.**
Let be an integer. Then the notions of -arid sets and -arid sets coincide. This follows either from a direct argument or from Proposition 3.4. We will use this observation several times.
We will in fact need a slight improvement on the information on the rate of growth of arid sets from Proposition 3.4.
Lemma 3.6**.**
Let be arid of rank (exactly) . Then
[TABLE]
Proof.
It suffices to deal with basic arid sets given by
[TABLE]
We begin with some standard reductions. Replacing with suitably chosen powers, altering accordingly, and passing to basic arid subsets, we may assume that all have the same length . Replacing with and using Remark 3.5 enables us to assume that for each . If is minimal, we further know that if for some , then is not a power of . Finally, we may assume that is a large power of , and that is divisible by .
Since an element of is uniquely determined by its final digits, the bound follows immediately from counting the -tuples with . ∎
We are now ready to state the main theorem of this section in a more convenient language.
Theorem 3.7**.**
Suppose that a sparse set is simultaneously -automatic and generalised polynomial. Then is -arid.
For the proof of this result, we need to use the notion of IPS sets introduced in [BK16].
IPS sets and automatic sequences
The following notion generalises the classical notion of an IP set, which is of importance in combinatorial number theory and ergodic theory (for the origin of the term IP, which stands either for infinite-dimensional parallelepiped or idempotent, see, e.g., [BL16]). This notion is discussed in more detail in [BK16] (in particular, an equivalent definition of IPS sets in terms of ultrafilters is given there).
Definition 3.8** (IP and IPS sets).**
For a sequence , the corresponding set of finite sums is
[TABLE]
where . Any set containing a set of the form for some is called an IP set.
For a sequence and shifts , the corresponding set of shifted finite sums is
[TABLE]
where again . Any set containing a set of the form for some is called an IPS set.
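As a small illustration of the first part of the definition, the sketch below computes the set of finite sums of a finite list of integers. It assumes the standard convention that the set of finite sums consists of the sums $\sum_{i\in\alpha} n_i$ over finite non-empty index sets $\alpha$; it handles only a finite truncation of the sequence and does not implement the shifted variant. The function name `finite_sums` is hypothetical.

```python
from itertools import combinations


def finite_sums(terms):
    """Sums over all non-empty subsets of indices of the finite list `terms`."""
    sums = set()
    for size in range(1, len(terms) + 1):
        for subset in combinations(terms, size):
            sums.add(sum(subset))
    return sums


# Powers of 2 generate every positive integer below 2^5.
powers_of_two = finite_sums([2 ** i for i in range(5)])

# Powers of 3 generate the integers with digits 0 and 1 in base 3.
powers_of_three = finite_sums([1, 3, 9])
```

The first example shows that the positive integers themselves contain a set of finite sums; the second gives a much thinner set of the same kind.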
Example 3.9**.**
Fix . Let be two distinct words with , and let be arbitrary. Consider the set
[TABLE]
Then is an IPS set. Indeed, , where and = (assuming, as we may, that ). If , then is an IP set.
IPS sets occur in our work due to the following result.
Theorem 3.10**.**
Let be an automatic set. Then either is arid or it is IPS.
Proof.
Assume that is automatic but not arid; we need to show that is IPS. Let be a -automaton with output which produces the characteristic sequence of when reading digits starting from the most significant one and ignoring the initial zeros.
Since is not arid, neither is the set . Hence, by Proposition 3.4, there exists an accessible and coaccessible state and such that and . Replacing and by their powers and interchanging them if necessary, we may assume that and are of equal length and . Pick so that , and .
The set contains all words of the form , where and . It follows that is IPS (cf. Example 3.9). ∎
In order to prove Theorem 3.7, we need to recall one of the main results of [BK16] (Theorem A), whose proof uses ergodic theory and the machinery of ultrafilters.
Theorem 3.11**.**
Let be a sparse generalised polynomial set. Then is not IPS.
Theorem 3.7 and Theorem C now follow quite easily.
Proof of Theorem 3.7.
Let be the set in Theorem 3.7. By Theorem 3.11, is not IPS. Hence, by Theorem 3.10, it is arid. ∎
Proof of Theorem C.
Suppose that is automatic and generalised polynomial. Let be the periodic function such that the set has . (The existence of is guaranteed by Theorem B.)
Note that is generalised polynomial and automatic (automaticity is clear; to see that is generalised polynomial, compose with a polynomial such that and for , , ).
By Theorem 3.7, is arid. Hence, by Lemma 3.6, we have
[TABLE]
for some as . ∎
If Conjecture A is true then there are no nontrivial examples of arid generalised polynomial sets (indeed, by Theorem D non-existence of such sets is precisely equivalent to Conjecture A; see also Proposition 5.3). However, there are examples of generalised polynomial sets which exhibit some properties reminiscent of arid sets. We have already mentioned in this context that the set of Fibonacci numbers is a generalised polynomial set, and in [BK16, Theorems B & C] we have extended this to certain linear recurrences of order and as well as arbitrary sets whose size grows at a sublogarithmic rate.
It is important to note that in the statement of Theorem 3.10 it is not possible to replace IPS sets with IP sets or their translates (cf. Example 4.3). We discuss this question further in the next section.
4. Examples and properties of automatic sets
-free sets
In this subsection, we will discuss a simple class of examples of automatic sets, the -free sets, which will allow us to show that in the statement of Theorem 3.10 it is in general not possible to replace IPS sets with translates of IP sets (Example 4.3).
Example 4.1**.**
Let , and let be a finite set of ‘prohibited’ words of length . A word is -free if contains no as a factor. Accordingly, is -free if its base- expansion is -free. Denote the set of -free integers by .
- (i) The set is -automatic.
- (ii) If , then is sparse.
- (iii) If , then is not arid.
- (iv) If each contains at least two non-zero digits, then is .
- (v) If some consists only of [math]’s, then is not for any .
Proof.
- (i) It is not difficult to explicitly describe a -automaton which computes the characteristic function of ; alternatively, the claim follows immediately from Kleene’s theorem.
- (ii) We may assume that consists of a single string of length . Then the probability that a randomly chosen word of length does not contain is at most . The claim easily follows from this.
- (iii) We may assume . Construct an undirected graph (we allow to have loops), where , and if and are both -free. If is a walk in , then is -free. Assume that contains a walk of length with . Without loss of generality, we may assume that (otherwise, switch and ). Then for any the word is -free. Hence, and we can see either directly or from Proposition 3.4 that is not arid. Thus, it remains to check that contains a length walk with distinct endpoints; for the sake of contradiction suppose that this is not the case.
Since each vertex has at most one neighbour (including itself if is an edge), the graph is a disjoint union of paths of length , loops, and vertices, and hence . On the other hand, given , the number of pairs such that appears in or is , so
[TABLE]
(note that the assumption implies that ), which gives a contradiction.
- (iv) Let . Then .
- (v) Suppose that contains for some set and integer . Replacing with a smaller set if necessary, we may assume that . Since is , for any there exists which is divisible by . If is large enough (it suffices that ) then is an element of whose base- expansion contains consecutive zeros, contradicting the assumption on . ∎
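The sparsity claim in (ii) can be observed numerically. The following minimal sketch lists the integers whose base-$k$ expansion avoids a given finite set of prohibited words and counts them below powers of the base. The choice of base 2 with the single prohibited word `11` is an illustrative assumption, not an example taken from the text, and the function names `to_base` and `free_integers` are hypothetical.

```python
def to_base(n, k):
    """Base-k expansion of n >= 0 as a digit string (k <= 10), no leading zeros."""
    if n == 0:
        return "0"
    digits = []
    while n:
        n, d = divmod(n, k)
        digits.append(str(d))
    return "".join(reversed(digits))


def free_integers(limit, k, prohibited):
    """Integers 0 <= n < limit whose base-k expansion has no prohibited factor."""
    return [
        n
        for n in range(limit)
        if not any(b in to_base(n, k) for b in prohibited)
    ]


# Number of integers below 2^m with no factor "11" in base 2, for m = 1..10.
counts = [len(free_integers(2 ** m, 2, ["11"])) for m in range(1, 11)]

# Proportion of such integers below 2^m; it decreases towards 0.
densities = [c / 2 ** m for m, c in enumerate(counts, start=1)]
```

For this particular choice the counts are consecutive Fibonacci numbers, so the proportion decays exponentially in $m$ while the number of elements still grows exponentially in $m$, in line with the set being sparse but not arid.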
Remark 4.2**.**
A similar example was considered by Miller [Mil12], who gave sufficient conditions for to be infinite.
Example 4.3**.**
The set
[TABLE]
is -automatic, sparse, not arid, and does not contain a translate of an IP set.
Proof.
We see that is not arid by Proposition 3.4 or by a simple modification of the proof of 4.1.(iii). The remaining claims follow directly from Example 4.1.∎
The following two examples can be verified similarly.
Example 4.4**.**
The set
[TABLE]
is -automatic, sparse, not arid, and .
Example 4.5**.**
The Baum–Sweet sequence ([BS76]) is given by
[TABLE]
It takes the value on a set which is -automatic, sparse, not arid, and .
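For concreteness, the Baum–Sweet sequence can be computed directly from binary expansions. The sketch below assumes the standard definition (the value at $n$ is 1 if the binary expansion of $n$ contains no maximal block of consecutive 0s of odd length, and 0 otherwise, with the value 1 at $n=0$ by convention); the displayed formula in the text may be phrased differently, and the function name `baum_sweet` is hypothetical.

```python
import re


def baum_sweet(n):
    """1 if the binary expansion of n has no maximal block of 0s of odd length."""
    if n == 0:
        return 1  # usual convention for n = 0
    blocks = re.findall(r"0+", bin(n)[2:])  # maximal runs of zeros
    return int(all(len(block) % 2 == 0 for block in blocks))


first_terms = [baum_sweet(n) for n in range(16)]

# Number of n < 2^m at which the sequence takes the value 1, for m = 1..12.
counts = [sum(baum_sweet(n) for n in range(2 ** m)) for m in range(1, 13)]
```

Comparing `counts[m - 1]` with $2^m$ gives a quick impression of how thin the set of 1s becomes as $m$ grows.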
Translates of IP sets
Even though in general non-arid automatic sets need not contain translates of IP sets, this is nevertheless the case under certain stronger assumptions on the set.
Proposition 4.6**.**
Let be a -automatic set. Assume that for every there is an integer such that is a factor of . Then the set is for some .
Proof of Proposition 4.6.
Let be a -automaton that produces the characteristic sequence of by reading the digits of starting with the least significant one, allowing for leading zeros. We will denote the word with zeros by . We begin by proving the following claim.
Claim**.**
There exist states with , an integer , and a word that is not a power of [math] such that for we have , , , . This is portrayed below:
[Diagram: the states $s$ and $s^{\prime}$, joined by transitions labelled $v$ and $z$.]
Proof of the claim.
Let be the number of states in . We first show a weaker statement, namely that there is a state with such that if denotes the state reached from after reading zeros, then we can return from to along a path not consisting only of zeros, that is for some that is not a power of [math].
To prove this, we construct a word as follows. Enumerate all pairs in as for . In the first step, if is reachable from , let describe any path between the two, so that ; otherwise, let . In general, if have been defined, choose so that if possible (i.e., if is reachable from ), and otherwise.
By the assumption on the set , there exists some such that for we have . Applying the same assumption with in place of , we may ensure that is not a power of [math]. It remains to show that we can return from to . For , let denote the intermediate states on the path from to labelled , in particular . The construction of is arranged so that for any with , we have , provided that is reachable from .
Choose such that and . Since is reachable from and is reachable from , is reachable from . Hence, the construction of guarantees that . In particular, , where . Note that is not a power of [math] since neither is . This proves the weaker version of the claim.
To prove the stronger statement, note first that since has only states, there exist such that . Let be any integer divisible by and put . Since is divisible by , we have . Because is reachable from (actually, ), there is a word (equal to , hence not a power of [math]) such that . Take and . The states and the word (of length ) satisfy all the required conditions, namely , , , , and . ∎
To finish the proof of Proposition 4.6, we may assume that all states in are accessible. Choose states and and words and as in the statement of the claim. Let be such that . For any word , where for and , we have , whence . It follows that contains , where and , . ∎
Proposition 4.6 has the following amusing application which, however, does not require the full strength of Theorem 3.11. (Similar results can be shown in greater generality.)
Example 4.7**.**
There exists a constant such that for any sequence which is a rational power of a generalised polynomial such that as , the set
[TABLE]
is not automatic.
Proof.
It is shown in [BK16, Propositions 4.6 & 4.8] that is generalised polynomial, contains no translate of an IP set, and that for any , .
Suppose that were -automatic. Since intersects nontrivially any arithmetic progression, it would satisfy the assumptions of Proposition 4.6, and thus would contain a translate of an IP set, contradicting the previously mentioned results. ∎
Densities of symbols
In this subsection, we prove a lemma on densities of occurrences of symbols in automatic sequences. As a corollary, we obtain the claim that sparse automatic sequences take non-zero value at a set of Banach density [math].
The density of symbols for an automatic sequence is often uniform. A set has uniform density if as uniformly in . For an automaton , a strongly connected component is an automaton , where is non-empty, preserved under for all and minimal with respect to these properties, , and are the restrictions of and to , respectively.
Lemma 4.8**.**
Let be a -automatic sequence generated by an automaton reading input starting with the most significant digit, ignoring the initial zeros, and such that all the states are accessible. For , let . Then the following conditions are equivalent:
- (i)
For any , the set has density ; 2. (ii)
For any , the set has uniform density ; 3. (iii)
For any sequence produced by a strongly connected component of and for any we have
[TABLE]
Proof.
It is clear that (ii) implies (i). We will show that (i) implies (iii) and (iii) implies (ii). Throughout, it will be convenient to assume that , which we may do without loss of generality. We then write for .
Suppose that (i) holds, and take some as in (iii). There is some such that , whence
[TABLE]
as .
Now suppose that (iii) holds. For any and , we have
[TABLE]
uniformly in . For any , consider the sequence given by , so that if , then . Note that is produced by the automaton that is obtained from by changing the initial state to .
If lies in a strongly connected component of , then we may use (iii) to estimate the inner sums in (13):
[TABLE]
as (where the error term is uniform with respect to , since there are only finitely many possible sequences ). It is an easy exercise to check that the set of such that does not lie in a strongly connected component of has upper Banach density [math]. Estimating the inner sums in (13) corresponding to such trivially by , and letting slowly enough so that , we conclude that
[TABLE]
as uniformly in . Hence, (ii) holds. ∎
Linear recurrence sequences
We have already noted that the set of values of a linear recurrence sequence can be a generalised polynomial set. This is the case for the Fibonacci sequence; for more information, see [BK16, Theorem B]. In contrast, we show that the set of values of a linear recurrence sequence is not automatic, except for trivial examples. In the proof, we apply Theorem 3.10.
Proposition 4.9**.**
Let be an -valued sequence satisfying a linear recurrence of the form
[TABLE]
with integer coefficients . Suppose that for some the set is -automatic. Then is a finite union of the following standard sets: linear progressions with ; exponential progressions with and ; and finite sets.
Proof.
We first claim that there exists a representation of as a finite union
[TABLE]
where is finite, are arithmetic progressions, are value sets of polynomials with , and have exponential growth in the sense that .
In order to prove this claim, we begin by noting that any restriction of to an arithmetic progression obeys some (minimal length) linear recurrence
[TABLE]
with . Moreover, there exists a choice of such that each is either identically zero or non-degenerate, in the sense that the associated characteristic polynomial has no pair of roots such that is a root of unity (see, e.g., [EvdPSW03, Theorem 1.2] for a much stronger statement). Hence, for the purpose of showing the existence of a representation of the form (15), we may assume that is non-degenerate. Suppose also that is minimal, and let be the roots of with . Note that either is finite or .
If , then by the result of Evertse [Eve84] and van der Poorten and Schlickewei [vdPS91] (see [EvdPSW03, Theorem 2.3]), we have as . Hence, has exponential growth, and we are done.
Otherwise, if , then for all we have or . Kronecker’s theorem [Kro57] (or a standard Galois theory argument) shows that if is an algebraic integer all of whose conjugates have absolute value , then is a root of unity. Using the general formula for the solution of a linear recurrence, we may write for sufficiently large
[TABLE]
where are polynomials and are periodic. Splitting into arithmetic progressions where are constant, we conclude that is a finite union of value sets of polynomials. This again produces a representation of the form (15).
Such a representation is not unique. Splitting into a finite number of subprogressions and discarding those which are redundant, we may assume that for any . Likewise, we may assume that for any . Fix one such representation subject to these restrictions. The set
[TABLE]
is again -automatic; it will suffice to show that is a union of the standard sets mentioned above.
We claim that , i.e., the representation of uses no polynomial progressions of degree . Suppose for the sake of contradiction that appears in one of the sets , and write , where . Replacing with for a suitably chosen , we may assume that for . For sufficiently large , we have , where is a constant and is the base- expansion of , padded by [math]’s so as to have . Since , from the pumping lemma 1.9 it follows that there is such that for any it holds that
[TABLE]
For sufficiently large and a small absolute constant to be determined later, consider the set
[TABLE]
and put (for large ). Note that and that . For a fixed and , we shall consider the cardinality of the set . By an elementary counting argument, we find
[TABLE]
To obtain an upper bound, we separately estimate and for each .
Suppose that with , so in particular and for some . We then have the chain of inequalities:
[TABLE]
which is a contradiction for sufficiently large , provided that (which will hold if we put ). Thus, .
As for , from the bounds on growth of we immediately have
[TABLE]
In total, using (16) and (17) we find that
[TABLE]
contradicting the previously obtained bound . It follows that indeed .
Since contains no polynomial or linear progressions, we have . It follows from Proposition 3.4 that must be -arid of rank . Since all basic arid sets of rank are of the form described in the statement of the theorem, we are done. ∎
5. Proof of Theorem D
In this section, we derive Theorem D from Theorem C. Our argument is purely combinatorial and can be entirely phrased in terms of finite automata with no further recourse to dynamics.
Proposition 5.1**.**
Let be an infinite arid set. Then there exists such that takes the form
[TABLE]
where , and . In particular, is arid of rank .
Likewise, there exists such that takes the form
[TABLE]
where , and .
Proof.
Since the notion of an arid set is preserved under the reversal operation, it is sufficient to prove the former statement. For and , put . If is arid of rank , then so is .
Claim**.**
Let be arid of rank , and let be such that and . Then for sufficiently large (depending on ), is arid of rank .
Proof.
Replacing with , we may assume that .
Let . In analogy with Remark 3.5, note that there is a natural way to identify with a subset of , and any arid set is a finite union of translates with of arid sets . Hence, it will suffice to show that if is arid of rank , then for sufficiently large , is arid of rank . We may now replace with and assume that .
It will suffice to prove the claim for of the form
[TABLE]
where for all (note that here are required to be strictly positive; any arid set of rank is a union of such sets and an arid set of rank ). Now, if then either (in which case we are trivially done) or and both and is a power of . In the latter case, we further conclude that appears in (else would have rank ), which is necessarily of the form with . Hence
[TABLE]
is arid of rank . ∎
The proof of the proposition is now a simple induction on the rank of . Since is infinite, we have .
If , then takes the form , where for at least one , say . Then takes the required form for for large enough.
If , then we may find a rank basic arid set
[TABLE]
contained in . Without loss of generality, we may assume that . Apply the above Claim with , and equal to the first symbols of , where . Note that , because otherwise by an elementary computation one could show that the rank of is . Then for large enough is arid of rank and infinite. By the inductive assumption, there exists such that takes the required form. It remains to take .∎
Corollary 5.2**.**
Let be an infinite -arid set. Then there exist integers , , , and words , such that
[TABLE]
Proof.
Follows immediately from the second part of Proposition 5.1.∎
Proposition 5.3**.**
If the set is not generalised polynomial, then neither is any infinite -arid set.
Proof.
Assume we know that is not generalised polynomial. Then neither is any set of the form for since .
Suppose that there exists an infinite -arid set which is generalised polynomial. Since the class of generalised polynomial sets contains all arithmetic progressions and is closed under finite intersections, Corollary 5.2 allows us to assume that
[TABLE]
for some , . Let and note that
[TABLE]
Let be a generalised polynomial such that and assume further that is a restriction of a generalised polynomial of a real variable that has no further zeros in . (To this end, replace by .) Then an easy computation shows that the polynomial
[TABLE]
has as its zero set
[TABLE]
where , .
The set is also generalised polynomial and it has the form
[TABLE]
with and , where is the smallest integer such that divides . (If there is no such integer, the corresponding term is not present.)
Let be such that for . Replacing the set by the union
[TABLE]
and replacing by , we may assume that
[TABLE]
with and .
Consider the set . The set is generalised polynomial and an integer can be an element of only if or . Since , this gives or and whether the latter possibility is realised or not, we have . This is a contradiction with our remark that no set of the form , , is generalised polynomial (note that during the proof we have replaced by its power). ∎
We are now ready to finish the proof of Theorem D.
Proof of Theorem D.
The two statements in Theorem D are of course mutually exclusive. Now assume that there exists a sequence which is -automatic, generalised polynomial, and not ultimately periodic. By Theorem C, it nevertheless coincides with a periodic sequence except at a set of density zero. Consider the set . This set is -automatic, generalised polynomial, sparse, and infinite. By Theorem 3.7, is then arid and hence by Proposition 5.3 the set is generalised polynomial as well. ∎
6. Concluding remarks
In this section, we gather some remarks and questions which arise naturally. The question with which we begin was already alluded to in the introduction and in [BK16]. As previously discussed, its resolution would suffice to decide if Conjecture A is true.
Question 1**.**
Let be an integer. Is the set generalised polynomial?
We find this question exceptionally pertinent because of its simple formulation.
Morphic words
The class of morphic words is a natural extension of the class of automatic sequences. Let be a finite set. Any morphism of the monoid extends naturally to . A word (which we identify with a function ) is a pure morphic word if it is a fixed point of a non-trivial morphism of . A morphic word is the image of a pure morphic word under a coding (i.e., any set-theoretic map, not necessarily injective). Morphic words are connected with automatic sequences via the fact that -automatic sequences are precisely the morphic words coming from -uniform morphisms. Here, a morphism is -uniform if for all .
We have already encountered possibly the most famous example of a non-uniform morphic word, the Fibonacci word. Recall from the introduction that the Fibonacci word was defined as the limit of the words , , and . Directly from this definition, it is easy to see that is fixed by the morphism given by and .
Recall also that is a Sturmian word. Here, a Sturmian word is one of the form , where and (for we may take ). Some (but not all) of these sequences give rise to morphic words; see [BS93] for details (cf. also [Yas99, Fag06, BEIR07]).
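The three descriptions of the Fibonacci word mentioned above (as a limit of concatenations, as a fixed point of a morphism, and as a Sturmian word given by a generalised polynomial) can be compared on a finite prefix. The following is a minimal sketch under the standard conventions $w_0 = 0$, $w_1 = 01$, $w_{i+2} = w_{i+1}w_i$, the morphism $0 \mapsto 01$, $1 \mapsto 0$, and the formula $2 + \lfloor n\varphi \rfloor - \lfloor (n+1)\varphi \rfloor$ for the $n$-th letter ($n \geq 1$), where $\varphi$ is the golden ratio; these conventions are assumptions, since the precise formulas are not legible in the text, and all function names are hypothetical.

```python
from math import isqrt


def fibonacci_word_by_concatenation(length):
    """Prefix of the limit of w0 = 0, w1 = 01, w_{i+2} = w_{i+1} w_i."""
    prev, cur = "0", "01"
    while len(cur) < length:
        prev, cur = cur, cur + prev
    return cur[:length]


def fibonacci_word_by_morphism(length):
    """Prefix of the fixed point of the morphism 0 -> 01, 1 -> 0."""
    word = "0"
    while len(word) < length:
        word = "".join("01" if c == "0" else "0" for c in word)
    return word[:length]


def floor_n_phi(n):
    """floor(n * phi) for phi = (1 + sqrt(5)) / 2, in exact integer arithmetic."""
    return (n + isqrt(5 * n * n)) // 2


def fibonacci_word_by_floors(length):
    """Letters 2 + floor(n phi) - floor((n + 1) phi) for n = 1, ..., length."""
    return "".join(
        str(2 + floor_n_phi(n) - floor_n_phi(n + 1))
        for n in range(1, length + 1)
    )


prefix_a = fibonacci_word_by_concatenation(500)
prefix_b = fibonacci_word_by_morphism(500)
prefix_c = fibonacci_word_by_floors(500)
```

The integer square root is used instead of floating-point arithmetic so that the floors are computed exactly; agreement of the three prefixes is a finite check and not a proof.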
In analogy with Conjecture A, one could ask about a classification of all morphic words which are given by generalised polynomials. We believe that examples such as the Fibonacci word are essentially the only possible ones.
Question 2**.**
Assume that a sequence is both a morphic word and a generalised polynomial. Is it true that is a linear combination of a number of Sturmian morphic words and an eventually periodic sequence?
Regular sequences
We finish by presenting a generalisation of Conjecture A to regular sequences. We call a function a quasi-polynomial if there exists an integer such that the sequences given by , , are polynomials in . We say that a function is ultimately a quasi-polynomial if it coincides with a quasi-polynomial except on a finite set.
Question 3**.**
Assume that a sequence is both regular and generalised polynomial. Is it then true that is ultimately a quasi-polynomial?
If takes only finitely many values, then all the polynomials inducing are necessarily constant, and so in this case the question coincides with Conjecture A.
References
- [AB08] Boris Adamczewski and Jason Bell. Function fields in positive characteristic: expansions and Cobham’s theorem. J. Algebra, 319(6):2337–2350, 2008.
- [AB12] Boris Adamczewski and Jason P. Bell. On vanishing coefficients of algebraic power series over fields of positive characteristic. Invent. Math., 187(2):343–393, 2012.
- [AS92] Jean-Paul Allouche and Jeffrey Shallit. The ring of $k$-regular sequences. Theoret. Comput. Sci., 98(2):163–197, 1992.
- [AS03a] Jean-Paul Allouche and Jeffrey Shallit. Automatic sequences. Cambridge University Press, Cambridge, 2003.
- [AS03b] Jean-Paul Allouche and Jeffrey Shallit. The ring of $k$-regular sequences. II. Theoret. Comput. Sci., 307(1):3–29, 2003.
- [BEIR07] Valérie Berthé, Hiromi Ei, Shunji Ito, and Hui Rao. On substitution invariant Sturmian words: an application of Rauzy fractals. Theor. Inform. Appl., 41(3):329–349, 2007.
- [Bel07] Jason P. Bell. $p$-adic valuations and $k$-regular sequences. Discrete Math., 307(23):3070–3075, 2007.
- [Ber81] Jean Berstel. Mots de Fibonacci. In Séminaire d’Informatique Théorique, pages 57–78, Paris, 1980–1981.
