Iteration of Quadratic Polynomials Over Finite Fields
D. R. Heath-Brown

TL;DR
This paper investigates the iteration of quadratic polynomials over finite fields of odd cardinality q, showing that the sequence of iterates recurs after O(q/log log q) steps, and argues that the Birthday Paradox model is unsuitable for X^3+c when q ≡ 2 mod 3.
Contribution
It provides bounds on the recurrence time of quadratic polynomial iterates over finite fields and discusses limitations of the Birthday Paradox model for cubic polynomials.
Findings
Iterates of aX^2+c starting at 0 recur after O(q/log log q) steps
For X^2+1, the same bound holds for any starting value
The Birthday Paradox model is unsuitable for X^3+c when q ≡ 2 mod 3
Abstract
For a finite field of odd cardinality $q$, we show that the sequence of iterates of $aX^2+c$, starting at $0$, always recurs after $O(q/\log\log q)$ steps. For $X^2+1$ the same is true for any starting value. We suggest that the traditional "Birthday Paradox" model is inappropriate for iterates of $X^3+c$, when $q$ is 2 mod 3.
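Table 1: number of scaled cycle lengths falling in each interval, for two primes.
| Interval | Prime 100019 | Prime 100043 |
|---|---|---|
| 1 | 10030 | 9936 |
| 2 | 9944 | 9730 |
| 3 | 9992 | 9976 |
| 4 | 10122 | 10232 |
| 5 | 10212 | 10034 |
| 6 | 9830 | 10000 |
| 7 | 9902 | 10086 |
| 8 | 9904 | 10012 |
| 9 | 10070 | 9946 |
| 10 | 10012 | 10090 |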
Iteration of Quadratic Polynomials Over Finite Fields
D.R. Heath-Brown
Mathematical Institute, Oxford
1 Introduction
Let $f(X)\in\mathbb{F}_q[X]$ and define the iterates $f^n(X)$ by setting $f^0(X)=X$ and $f^{n+1}(X)=f(f^n(X))$. Let $x_0\in\mathbb{F}_q$, and consider the sequence of values $x_n=f^n(x_0)$. Since the field is finite, the sequence eventually recurs, and one enters a closed cycle. We are interested in two questions: how long is it before one enters the cycle, and how long is the cycle? In general we can construct a directed graph whose vertices are the elements of $\mathbb{F}_q$, and with edges $x\to f(x)$. The trajectory of $x_0$ then consists of a pre-cyclic “tail”, followed by a cycle.
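The tail and cycle of a trajectory can be computed directly for a small field. The following is a minimal sketch, assuming a prime field represented by the integers modulo p; the function name `tail_and_cycle`, the prime 101 and the polynomial x^2+1 are illustrative choices, not taken from the paper.

```python
def tail_and_cycle(f, x0):
    """Return (tail_length, cycle_length) for the orbit x0, f(x0), f(f(x0)), ..."""
    first_seen = {}            # value -> index at which it first appeared
    x, n = x0, 0
    while x not in first_seen:
        first_seen[x] = n
        x = f(x)
        n += 1
    tail = first_seen[x]       # index at which the cycle is entered
    return tail, n - tail


p = 101                        # illustrative prime (an assumption, not from the paper)
f = lambda x: (x * x + 1) % p  # illustrative quadratic map on F_p
tail, cycle = tail_and_cycle(f, 0)
print("tail length:", tail, "cycle length:", cycle)
```

The dictionary stores every visited value, which is fine for small p; for large fields a constant-memory cycle finder would be the usual substitute.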
Linear polynomials are easily handled. When $f(X)=X+b$ one has $f^n(X)=X+nb$, so that if $b=0$ the cycles are singleton sets, and if $b\neq 0$ then the graph is a union of cycles of length $p$, the characteristic of the field. For linear polynomials $f(X)=aX+b$ with $a\neq 0,1$ one finds that
$$f^n(X)=a^nX+b\,\frac{a^n-1}{a-1}.$$
Thus the graph consists of cycles of length $\mathrm{ord}(a)$ together with a cycle of length 1.
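The cycle structure for linear maps is easy to confirm numerically. The sketch below assumes a prime field and a map of the form x -> ax+b with a not equal to 0 or 1; the helper names and the values p = 13, a = 3, b = 5 are hypothetical and chosen only for illustration.

```python
from collections import Counter


def cycle_length_counts(f, p):
    """Count cycles by length for a permutation f of {0, ..., p-1}."""
    seen = [False] * p
    counts = Counter()
    for start in range(p):
        if seen[start]:
            continue
        length, x = 0, start
        while not seen[x]:
            seen[x] = True
            x = f(x)
            length += 1
        counts[length] += 1
    return counts


def multiplicative_order(a, p):
    """Smallest k >= 1 with a^k = 1 mod p (requires a not divisible by p)."""
    k, y = 1, a % p
    while y != 1:
        y = y * a % p
        k += 1
    return k


p, a, b = 13, 3, 5             # illustrative values, not from the paper
counts = cycle_length_counts(lambda x: (a * x + b) % p, p)
print(dict(counts), "order of a:", multiplicative_order(a, p))
```

For these values one expects one fixed point and (p-1)/ord(a) cycles of length ord(a).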
The situation is much more interesting for higher degree polynomials, and forms the basis for Pollard’s famous “Rho Algorithm” for integer factorization [4]. If one wishes to factor $N$ the algorithm calculates successive iterates $x_i$ and $x_{2i}$ modulo $N$, until one reaches a value $i$ for which $\gcd(x_{2i}-x_i,N)>1$. If this highest common factor is different from $N$ then one has obtained a non-trivial factor of $N$. When $p$ is a prime divisor of $N$, the sequence of iterates modulo $p$ will have an initial segment of length $t$ say (the “tail” of the letter rho), followed by a cycle of length $\ell$ say. Thus $p\mid x_{2i}-x_i$ first occurs when $i$ is the smallest positive multiple of $\ell$ for which $i\ge t$. In particular the first such $i$ is at most $t+\ell$. If $p'$ is some other prime divisor of $N$ there will be a corresponding value $i'$ for which $p'\mid x_{2i'}-x_{i'}$. Unless the two values $i$ and $i'$ are the same, the method will produce a non-trivial divisor of $N$. The efficiency of the algorithm depends on $t$ and $\ell$ being small.
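The procedure just described can be sketched in a few lines. This is a minimal illustration rather than Pollard's original formulation: the polynomial x^2+c with c = 1, the starting value 2 and the function name `pollard_rho` are assumptions made for the example.

```python
from math import gcd


def pollard_rho(n, c=1, x0=2):
    """Look for a non-trivial factor of n using iterates of x -> x^2 + c mod n.

    Returns d with 1 < d < n, or None if this choice of (c, x0) fails.
    """
    f = lambda x: (x * x + c) % n
    slow, fast = x0, x0            # slow holds x_i, fast holds x_{2i}
    while True:
        slow = f(slow)
        fast = f(f(fast))
        d = gcd(fast - slow, n)
        if d == n:                 # both prime factors recurred at the same step
            return None
        if d > 1:
            return d


n = 8051                           # 83 * 97, a small illustrative composite
print(pollard_rho(n))
```

When the function returns None one would normally retry with a different c or starting value.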
A crude probabilistic argument predicts that, over the field $\mathbb{F}_q$, the sequence is likely to complete a cycle after roughly $\sqrt{q}$ steps. This is a version of the “Birthday Paradox”. Specifically, if one imagines the sequence as taking values in $\mathbb{F}_q$ independently and uniformly at random, then the chance of having a repetition within $n$ steps, say, is
$$1-\prod_{j=0}^{n-1}\left(1-\frac{j}{q}\right),$$
and when $n$ is of order $\sqrt{q}$ this is roughly $1-\exp(-n^2/2q)$. Thus there is a positive probability of a repetition as soon as $n\gg\sqrt{q}$.
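To get a feel for the sizes involved, the exact repetition probability for independent uniform draws can be compared with the exponential approximation. This sketch assumes the standard birthday-problem product formula; the value q = 100003 and the function name are illustrative only.

```python
from math import exp, sqrt


def repetition_probability(q, n):
    """Probability that n independent uniform draws from q values are not all distinct."""
    distinct = 1.0
    for j in range(n):
        distinct *= 1 - j / q
    return 1 - distinct


q = 100003                         # illustrative field size
n = int(sqrt(q))                   # about sqrt(q) steps
exact = repetition_probability(q, n)
approx = 1 - exp(-n * n / (2 * q))
print(exact, approx)
```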
Unfortunately there are examples in which this heuristic clearly fails. Thus if one has , and if has odd order one gets a pure cycle of length , where is the order of 2 modulo . Thus if is a prime of the shape , with a prime for which 2 is a primitive root, then the cycle length will be whenever has order modulo . While it is not known that infinitely many such primes exist it is certainly conjectured to be so. Thus we will expect to get cycles of length for a positive proportion of initial values .
A second example is provided by the polynomial . If for some , then , and we have a situation similar to that described above. If with a prime for which 2 is a primitive root, then again we will have cycles of length for a positive proportion of initial values .
Thirdly one can consider polynomials of the shape , in the case in which is a prime with . Here one sees that induces a permutation of , since has a unique solution in , for every . If is given, the trajectory is therefore completely cyclic, and our question merely concerns the length of the cycle. However the proportion of permutations in the symmetric group for which belongs to a cycle of given length , is exactly . Thus one might expect all cycle lengths to occur equally often, and that one should get a cycle of length at least , say, with probability around . The numerical evidence seems to support this. For a given prime and every we compute the length, say, of the cycle which starts at . We then see for how many values of the scaled cycle length falls into each of the intervals , for . If the permutations induced by the various polynomials were genuinely random we would expect roughly the same number of scaled cycle lengths in each such interval. The data for the first two primes beyond are presented in Table 1. The figures appear to support the random permutation model well.
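An experiment of the kind summarised in Table 1 can be sketched as follows. Several details are assumptions made for illustration, since they are not recoverable from the text above: the orbit is started at 0, c runs over the nonzero residues (which would match the column totals of Table 1, each equal to the prime minus one), and the scaled length L/p is binned into the ten intervals ((k-1)/10, k/10]. The prime 1019 is a small stand-in for the primes in the table.

```python
def cycle_length_from_zero(p, c):
    """Length of the cycle of x -> x^3 + c (mod p) through 0.

    Assumes p is a prime with p % 3 == 2, so that the map is a permutation
    and the orbit of 0 is a pure cycle.
    """
    x, steps = c % p, 1            # x = f(0) after one step
    while x != 0:
        x = (x * x * x + c) % p
        steps += 1
    return steps


def decile_counts(p):
    """Histogram of scaled cycle lengths L/p over c = 1, ..., p-1."""
    counts = [0] * 10
    for c in range(1, p):
        length = cycle_length_from_zero(p, c)
        k = (10 * length + p - 1) // p   # ceil(10 * L / p), lies in 1..10
        counts[k - 1] += 1
    return counts


p = 1019                           # illustrative prime with p % 3 == 2
print(decile_counts(p))
```

Under the random permutation model each of the ten counts should be close to (p-1)/10.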
The main goal of the present paper is to describe a quite different theory for the iterates of quadratic polynomials in odd characteristic, in which it is clear why the anomalous cases above must be excluded. In contrast to the situation with , when , the equation typically has either 2 solutions or none at all, the latter case holding for roughly half the possible values of (those for which is a non-square). When has two solutions and , the equations and will again typically have either 2 solutions or none. In this way, considering solutions of , one sees that is potentially much more complicated than a series of cycles.
Our main result demonstrates this distinction clearly.
Theorem 1
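Let $\mathbb{F}_q$ be a finite field of characteristic $p\neq 2$, and let $f(X)=aX^2+c\in\mathbb{F}_q[X]$ with $a\neq 0$. Suppose that $f^i(0)\neq f^j(0)$ for $0\le i<j\le r$. Then
$$\#f^r(\mathbb{F}_q)=\mu_r q+O\big(2^{4^r}\sqrt{q}\big) \tag{1}$$
uniformly in $f$ and $r$, where the constant $\mu_r$ is defined recursively by taking $\mu_0=1$ and
$$\mu_{r+1}=\mu_r-\tfrac12\mu_r^2. \tag{2}$$
Moreover we have $\mu_r\sim 2/r$ as $r\to\infty$.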
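The density predicted by the theorem can be compared with actual image sizes for a small prime. This is a rough sketch under stated assumptions: the recursion is taken to be mu_0 = 1, mu_{r+1} = mu_r - mu_r^2/2 (the form suggested by the heuristic argument later in the Introduction), and the prime 10007 and the polynomial x^2+1 are illustrative choices.

```python
from fractions import Fraction


def mu(r):
    """mu_0 = 1 and mu_{k+1} = mu_k - mu_k^2 / 2, computed exactly."""
    m = Fraction(1)
    for _ in range(r):
        m = m - m * m / 2
    return m


def image_size(p, a, c, r):
    """Number of distinct values taken by the r-th iterate of a*x^2 + c on F_p."""
    values = set(range(p))
    for _ in range(r):
        values = {(a * x * x + c) % p for x in values}
    return len(values)


p = 10007                          # illustrative prime
for r in range(5):
    print(r, float(mu(r)), image_size(p, 1, 1, r) / p)
```

The two printed columns should agree to within an error that shrinks as p grows with r fixed.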
At this point we should mention some closely related work. Shao [5, Theorem 1.6] handles the case by a method which generalizes readily to other quadratics. The condition in his theorem is stronger than ours (that for ) but an examination of the proof shows that he only needs something like our condition. His result does not include an explicit dependence on . Juul, Kurlberg, Madhu and Tucker [3] handle general rational functions rather than restricting to quadratic polynomials. Their emphasis is on the reductions of a given rational function modulo different primes, but they show under quite general conditions that the sum of all cycle lengths is as . (See Corollary 2 below.)
Before discussing the implications of the theorem, let us examine the condition that $f^i(0)\neq f^j(0)$ for $0\le i<j\le r$. The critical points of a polynomial $f$ are the roots of $f'$, and $f$ is said to be “post-critically finite” if the iterates $f^n(\xi)$ eventually enter a cycle, for every critical point $\xi$. In dynamics in general post-critically finite maps are a very important subclass. Of course, over a finite field every polynomial is post-critically finite. However our condition can be viewed as saying that, in an approximate sense, $f$ fails to be post-critically finite. (When $f(X)=aX^2+c$ the only critical point is $0$.)
Certainly the condition that for fails for the polynomials and , with and respectively. Suppose next that is the reduction of a polynomial , with , then the sequence is strictly monotonic, with . Thus if we cannot have with . The condition of the theorem will therefore hold when
[TABLE]
In following this paper the reader may wish to bear in mind the archetypal example , for which suffices.
Our main theorem above has the following immediate consequences.
Corollary 1
Let be a finite field of characteristic , and let with . Then for some with
[TABLE]
Corollary 2
Let be a finite field with prime, and let be the reduction of , where . Then the sum of all the cycle lengths in will be . Similarly the length of any pre-cyclic path in will be .
The first corollary gives an unconditional bound for the first recurrence in the sequence . The second corollary proves a similar result for arbitrary initial values for the reductions of fixed positive definite quadratic polynomials . Moreover it highlights the difference in behaviour between such polynomials and the cubic case , where the cycle lengths can sum to .
To prove Corollary 1 we choose , so that . Then, according to Theorem 1, we have either for some , or . Writing the latter bound as for an appropriate constant we deduce in the latter case that if then the values cannot be distinct, since they all lie in and . In either case there must therefore be acceptable values . The claim then follows.
For Corollary 2 we observe as above that the condition of the theorem holds under the assumption (3). The choice , will satisfy (3) when , and the theorem then yields . All cycles lie inside , giving the first assertion of the corollary. Moreover if is a pre-cyclic path then are distinct elements in , so that . We then see that , from which the second assertion follows.
We should explain the restriction to polynomials . For an arbitrary polynomial , if we define , then we will have . Thus may be obtained from by relabelling each vertex as . Since the two graphs are isomorphic in this sense, it suffices to study for a suitably chosen . In the case in which (and has odd characteristic) we can choose to produce a polynomial of the shape . Thus we may translate our results into statements about general quadratic polynomials as follows.
Corollary 3
Let be a finite field of characteristic , and let with . Suppose that for . Then
[TABLE]
uniformly in and , with the same as before.
In particular for some with
[TABLE]
If is prime, and is the reduction of a positive definite quadratic polynomial , then the sum of all the cycle lengths in will be . Similarly the length of any pre-cyclic path in will be .
In much the same way one can show that it would suffice to prove our theorem for polynomials . One could then deduce the corresponding result for by considering iterates of .
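Theorem 1 gives us an asymptotic formula $\#f^r(\mathbb{F}_q)\sim\mu_r q$. We proceed to give a probabilistic argument showing why one might expect this, and how the recurrence relation (2) arises. When $r=0$ we have $f^0(\mathbb{F}_q)=\mathbb{F}_q$, so that $\mu_0=1$. Suppose now that we have a relation $\#f^r(\mathbb{F}_q)\sim\mu_r q$. We will use an inductive argument to produce the corresponding result for $r+1$.
To have $x\in f^{r+1}(\mathbb{F}_q)$ it is necessary and sufficient that $x=f(y)$ be solvable and that $y\in f^r(\mathbb{F}_q)$ for at least one solution $y$ of $f(y)=x$. Since $\mathbb{F}_q$ contains $(q+1)/2$ squares one has $x=f(y)$ solvable in exactly $(q+1)/2$ cases, and except for the value $x=c$ there will then be precisely two possible values of $y$. Let these be $y_1$ and $y_2$. If the probability of these lying in $f^r(\mathbb{F}_q)$ were $\mu_r$ each, independently, one might expect that the probability of at least one being in $f^r(\mathbb{F}_q)$ should be $2\mu_r-\mu_r^2$, by the inclusion-exclusion principle. It would then follow that $x$ belongs to $f^{r+1}(\mathbb{F}_q)$ with probability around $\mu_r-\tfrac12\mu_r^2$. One would therefore produce an asymptotic expression $\#f^{r+1}(\mathbb{F}_q)\sim\mu_{r+1}q$ with $\mu_{r+1}$ as in (2).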
We next explain why , as claimed in Theorem 1. Writing we see that and
[TABLE]
An easy induction then shows that for all , whence . Another induction shows that
[TABLE]
so that for . Together with the lower bound this shows that and hence .
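One way to carry out an argument of this kind is sketched below. It assumes the recursion $\mu_0=1$, $\mu_{r+1}=\mu_r-\tfrac12\mu_r^2$ and uses the substitution $\nu_r=2/\mu_r$, which is chosen here for illustration; the author's own change of variable and intermediate bounds may differ.

```latex
\begin{align*}
\nu_0 &= 2, \qquad
\nu_{r+1} = \frac{2}{\mu_r\,(1-\mu_r/2)} = \frac{\nu_r^2}{\nu_r-1}
          = \nu_r + 1 + \frac{1}{\nu_r-1},\\
\nu_r &\ge r+2 \quad\text{(by induction, since } \nu_{r+1}\ge \nu_r+1\text{)},\\
\nu_{r+1} &\le \nu_r + 1 + \frac{1}{r+1}
  \;\Longrightarrow\; \nu_r \le r+2+\sum_{j=1}^{r}\frac{1}{j} \le r+3+\log r \quad (r\ge 1),\\
\frac{2}{r+3+\log r} &\le \mu_r \le \frac{2}{r+2}
  \;\Longrightarrow\; \mu_r \sim \frac{2}{r} \quad (r\to\infty).
\end{align*}
```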
Acknowledgments The author would particularly like to extend his thanks to Giacomo Micheli, for a number of interesting conversations introducing the author to the subject of polynomial iteration. Joe Silverman also provided a number of helpful comments. Thanks are also due to Tim Browning, for elucidating a technical point in Section 3, to Maksym Radziwiłł for some preliminary computational results, and to Ben Green, Rafe Jones, Tom Tucker and Michael Zieve for some useful references.
2 A Second Moment Calculation
Fundamental to our treatment of Theorem 1 will be moments of the functions
[TABLE]
Our first task is to estimate the moments
[TABLE]
for and . Trivially we have for all so that for every . Moreover it is also clear that for every .
Before moving to the general situation it may be helpful to think first about the case , for which
[TABLE]
The equation defines a curve in . An absolutely irreducible curve over will have points, by Weil’s “Riemann Hypothesis”. However our curve is far from being irreducible.
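Indeed, since $f(X)=aX^2+c$,
$$f^r(X)-f^r(Y)=a\big(f^{r-1}(X)-f^{r-1}(Y)\big)\big(f^{r-1}(X)+f^{r-1}(Y)\big),$$
whence a trivial induction produces
$$f^r(X)-f^r(Y)=a^r(X-Y)\prod_{j=0}^{r-1}\big(f^j(X)+f^j(Y)\big).$$
Thus we obtain $r+1$ factors. However it is not immediately clear when polynomials of the form $f^j(X)+f^j(Y)$ are absolutely irreducible over $\mathbb{F}_q$.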
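The factorization can be sanity-checked numerically. The sketch below assumes f(X) = aX^2 + c over a prime field and tests the identity f^r(x) - f^r(y) = a^r (x - y) times the product of f^j(x) + f^j(y) for j from 0 to r-1; the helper names and the sample parameters are hypothetical.

```python
import random


def iterate(a, c, p, r, x):
    """Apply x -> a*x^2 + c (mod p) r times."""
    for _ in range(r):
        x = (a * x * x + c) % p
    return x


def factorization_holds(a, c, p, r, x, y):
    """Check f^r(x) - f^r(y) == a^r (x - y) prod_{j<r} (f^j(x) + f^j(y)) mod p."""
    lhs = (iterate(a, c, p, r, x) - iterate(a, c, p, r, y)) % p
    rhs = pow(a, r, p) * (x - y) % p
    for j in range(r):
        rhs = rhs * (iterate(a, c, p, j, x) + iterate(a, c, p, j, y)) % p
    return lhs == rhs


random.seed(0)
p = 10007                          # illustrative prime
ok = all(
    factorization_holds(random.randrange(1, p), random.randrange(p), p,
                        random.randrange(6), random.randrange(p), random.randrange(p))
    for _ in range(200)
)
print(ok)
```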
In general, suppose that is a polynomial of degree , over a field , and let be the corresponding form. If factors as over the algebraic completion then there will necessarily be a triple such that . For any such triple we then have . This gives us a simple criterion for absolute irreducibility, which is sufficient, though not necessary: If vanishes only at the origin in , then must be absolutely irreducible.
We apply this criterion to . Writing for convenience, and
[TABLE]
we have
[TABLE]
If then . It then follows by induction that
[TABLE]
In particular, if vanishes, then there are indices for which . Since
[TABLE]
we see that would imply , which is excluded. We then see that we would have for some such that . However , and similarly for . It follows that if fails to be absolutely irreducible, then for some pair of non-negative integers . If then since has odd characteristic we have with . Otherwise with distinct positive integers . Since Theorem 1 assumes that the values are distinct we therefore conclude that the polynomial is irreducible over the algebraic completion , for every .
We are now ready to estimate . In view of (4) and (5) we have
[TABLE]
there being solutions to . To get a corresponding lower bound we may use the inclusion-exclusion principle to show that
[TABLE]
where is the number of common solutions to
[TABLE]
and is the number of common solutions to
[TABLE]
However if with then , which has at most solutions. Thus . Similarly, if were to lie on two distinct curves and with , then
[TABLE]
since is an even polynomial. We would then have so that , and similarly , would be a root of . There are therefore at most choices for , and since then satisfies there are at most choices of for each possible . Thus . It follows that
[TABLE]
We therefore conclude that
[TABLE]
It remains to count points on the curves . We have already shown that these are absolutely irreducible, and indeed nonsingular, under the assumptions of Theorem 1. If we write for the number of projective points on the curve, and for its degree, then Weil’s “Riemann Hypothesis” tells us that
[TABLE]
There are at most points at infinity, so that
[TABLE]
Finally, summing for we find that
[TABLE]
We may therefore summarize the conclusions of this section as follows.
Lemma 1
Under the assumptions of Theorem 1 we have
[TABLE]
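The moments can be computed by brute force for small primes. This sketch assumes that the function being averaged is the fibre count rho_r(x), the number of y in the field with f^r(y) = x, so that the second moment counts pairs (x, y) with f^r(x) = f^r(y). With r+1 factors each contributing about q points, one would heuristically expect a value near (r+1)q. The function name and parameters are illustrative.

```python
from collections import Counter


def fibre_moment(p, a, c, r, m):
    """Sum over x of rho_r(x)^m, where rho_r(x) = #{y in F_p : f^r(y) = x}."""
    fibres = Counter()
    for y in range(p):
        x = y
        for _ in range(r):
            x = (a * x * x + c) % p
        fibres[x] += 1
    return sum(n ** m for n in fibres.values())


p = 10007                          # illustrative prime
for r in range(1, 5):
    print(r, fibre_moment(p, 1, 1, r, 2) / p)   # compare with r + 1
```

The first moment is always p, and the zeroth moment recovers the size of the image of the r-th iterate.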
3 Higher Moments — Irreducible Curves
We now develop the ideas of the previous section to estimate for . Here is the number of solutions of
[TABLE]
in . These equations define a curve, but, as in the previous section, it is far from being an irreducible curve. Our task in this section is to identify the absolutely irreducible components, and to show that they are all defined over .
In view of (5), for any solution of (7) and any pair of distinct indices , there is a corresponding
[TABLE]
such that , where
[TABLE]
If there is more than one choice for we choose the smallest.
We now make the following definition.
Definition 1
A “-graph” is a weighted graph on vertices, for which any edge has integral weight in the range . If some edge has weight equal to we say that we have a “strict -graph”. If there is an edge between every pair of vertices we say we have a “complete -graph”.
Thus each solution of (7) produces a complete -weighted graph. We now introduce the following further definition.
Definition 2
Let be a complete -graph. Then we say is “proper” if, whenever are distinct vertices, with , then either or .
We then have the following lemma.
Lemma 2
The graph associated to a solution of (7) is proper.
To prove the claim, observe firstly that if , then and , whence , so that . Next we show that one cannot have . Writing this would imply that and , whence
[TABLE]
The factorization (5) would then show that for some . This however is impossible, since was chosen minimally.
To complete the proof of the claim we show that if
[TABLE]
then . In view of (5) the relation would imply . Since
[TABLE]
this would show that and the minimal choice of then produces , giving the required conclusion. This now establishes the lemma in full.
Thus each solution of (7) is associated to a unique proper weighted graph, such that
[TABLE]
However there is considerable redundancy in the equations (8). To investigate this we begin with the following result.
Lemma 3
Let be a proper strict -graph, with . Then there is a unique partition into non-empty sets and such that when and , while whenever or .
Firstly it is easy to see that such a partition must be unique. For if were a different partition then, after relabeling if necessary, we could find indices with and . We would then have both (because ) and (because and ). This contradiction shows that such partitions are unique.
In order to show the existence of a suitable partition we fix a pair with , and let
[TABLE]
Then and , so that neither set is empty. If , say, were in , then
[TABLE]
contradicting Definition 2. For any Definition 2 shows that we must have either or , so that is a partition of .
If had then the triple would contradict Definition 2. Thus , and similarly when . Finally, if then if . Otherwise Definition 2 applied to the triple shows that , since . Thus for all . Now, if with , Definition 2 applied to the triple shows that , since . Hence whenever and . This completes the proof of the lemma.
We now show how a complete -graph can be generated by a smaller graph.
Definition 3
Let be a complete -graph, and suppose is a subgraph of with the same set of vertices but fewer edges. We then say that “generates” if for some , where is obtained from by the following procedure:
Take three distinct vertices for which the edges and belong to but does not, and for which either or . Then is obtained from by adding the edge with weight .
For our purposes it is not necessary to know whether, using a different sequence of edge additions, might generate two different complete -graphs. All we need to know is whether there exists some sequence of edge additions resulting in .
To motivate the definition we consider the ideal generated by those polynomials for which the edge is in . Then trivially we have , since is formed from by the addition of one further generator . However, if in the procedure in Definition 3 we have
[TABLE]
and
[TABLE]
Hence if then
[TABLE]
so that . Alternatively, if in the procedure in Definition 3, we have
[TABLE]
Here we have by (5), since . Hence is in the ideal generated by and . We therefore see again that . It follows that if is the proper complete -graph associated to a system of equations (8), and is generated by , then the system (8) has the same solutions as the smaller system
[TABLE]
We now introduce the small graphs we shall use.
Definition 4
A -graph is said to be a “chain” if there is a permutation such that the edges are precisely the pairs
[TABLE]
and, for any , the maximum of
[TABLE]
is either or is attained at only one point.
We then have the following result.
Lemma 4
For any complete -graph there is a chain -graph which generates .
We prove this by induction on . If we may take to consist of the edges with weights , which clearly generates . Now assume the result is true for complete graphs with . If is not a strict -graph the conclusion is immediate from the induction hypothesis.
Thus we assume that is a strict complete graph with , so that Lemma 3 applies. Let be the restriction of to the vertices in , so that is a complete -graph, where . The induction hypothesis then shows that there is a chain graph say, which generates , in which one re-orders the vertices in as so as to satisfy the chain property in Definition 4. Similarly if is the restriction of to the vertices in , we can obtain a subgraph of which is a chain, and which generates . If there will again be an appropriate ordering of the indices in .
We then take to be the graph with vertices whose edges are the edges of , the edges of , and the additional edge . Moreover we permute the vertices into the order . We claim firstly that this ordering makes a chain, and secondly that generates .
To verify that is a chain we consider a sequence of consecutive pairs of the vertices from the sequence . If the sequence is entirely contained in the first terms the required chain property follows from that for , and similarly if all the elements are taken from the last terms. However if one of the pairs is the edge it suffices to note that this edge has weight while all other edges have weight at most .
To check that generates we note that generates and generates . Thus certainly generates the graph containing the edges of , the edges of and the edge . Hence it suffices to show that generates . Let be an edge of which is not already an edge in . Then, according to Lemma 3 we may assume that and , and that . Applying the procedure in Definition 3 to the triple we see that the edge is in , since , and the edge is also in , by definition. Moreover . Thus the edge can be generated from , with weight . We may then apply the procedure in Definition 3 to the triple . This time the edge is in , since , and the edge can be generated from , as we have just shown. Moreover we have , so that the edge can also be generated from , and is given weight , as required. This completes the proof of the lemma.
As an immediate consequence of Lemma 4 we have the following.
Lemma 5
After a suitable relabelling of the variables, any solution to the equations (7) satisfies some system of equations of the type
[TABLE]
with . Moreover, if , then the maximum of is either or occurs at only one point.
We call a system of equations of the above type a “chain system”. The system defines a variety in . We set and
[TABLE]
so that the corresponding projective variety is given by
[TABLE]
The importance of the chain property is demonstrated by the following result.
Lemma 6
Suppose that for . Then, for a chain system, the variety is a nonsingular complete intersection. Hence is an absolutely irreducible curve over , with degree at most .
To prove that is a nonsingular complete intersection we need to show that the vectors are linearly independent at any point of . Suppose to the contrary that
[TABLE]
If the are not all zero we take to be the smallest index with , and to be the largest index with , so that
[TABLE]
The entries of this vector are labelled by the variables , and one sees that the entry corresponding to is just . We therefore conclude that , and similarly that . In particular we must have . However
[TABLE]
in the notation (6). We therefore see that for some index in the range , and similarly for some with .
We next show that cannot vanish. If, on the contrary, we had then the relation would yield . In general, if , then the relation implies , while implies . Thus, using both forwards and backwards induction we would have for all , which is impossible.
We may therefore assume that , taking us back to the affine situation. Thus we have and with and . Since the chain property shows that the maximum of occurs at only one point, , say. Since we have . Similarly we have . If then , whence . Thus for . It follows that . Similarly, when we have , whence . However with , whence . As in the previous section we therefore conclude that for some pair of non-negative integers . This leads either to (if ) or (if , say). In either case we contradict the assumption of Theorem 1, since . This completes the proof that is a nonsingular complete intersection.
The remainder of the lemma is now straightforward. In general a nonsingular complete intersection is necessarily absolutely irreducible, with degree equal to the product of the degrees of the defining forms, see Browning and Heath-Brown [1, Lemma 3.2] for details. In our case has degree at most , since , and the result follows.
4 Higher Moments — Counting Points, And Counting Curves
In this section we will firstly estimate the number of points on each curve , and then compute the number of such curves that the variety given by (7) produces. Putting these results together will give us an asymptotic formula for .
Since is an absolutely irreducible curve defined over , Weil’s “Riemann Hypothesis” yields
[TABLE]
where is the genus of . In general, if is an irreducible non-degenerate curve of degree in (with ), then according to the Castelnuovo genus bound [2], one has
[TABLE]
where with . This implies in particular that irrespective of the degree of the ambient space in which lies. We therefore deduce that
[TABLE]
since has degree at most .
By inclusion-exclusion we see that
[TABLE]
For distinct curves of degree at most we have
[TABLE]
by Bézout’s Theorem. Hence if there are different curves we see that
[TABLE]
We can get a crude bound for by observing that there are possible permutations describing a chain system, and for each of the edges one has . Thus if we have
[TABLE]
Applying (10) we then deduce the following result.
Lemma 7
If there are different curves then
[TABLE]
Our task now is to investigate the number . We have seen that each curve arises from a proper -graph. We proceed to show that different graphs cannot produce the same curve . The graphs and must differ on at least one edge, so that one would have both and and on . If say, then and , whence for all points on the curve. It then follows that . However is an irreducible component of the curve (7), whence for every index . This gives us a contradiction since it would imply that has dimension zero.
We therefore need to count proper graphs. For a proper strict -graph, Lemma 3 produces a unique partition , for which the corresponding graphs and will be proper -graphs. There are proper strict -graphs. Moreover the number of partitions with is
[TABLE]
while, for even , the number with is
[TABLE]
We then see that
[TABLE]
for and . Indeed, since for every we see that this holds for too. If we now define for all the above formula simplifies to give
[TABLE]
We therefore define power series
[TABLE]
for each . Since (11) yields we see that this converges absolutely for . Now, after checking that we have the correct coefficient for , we arrive at
[TABLE]
Since for all we have , so that the coefficients can easily be calculated in general. Moreover it is clear by induction that
[TABLE]
with non-negative real coefficients summing to 1. We then see that
[TABLE]
We clearly have absolute convergence for small , and we may rearrange to get
[TABLE]
We therefore deduce that
[TABLE]
We also see that the coefficient satisfies the recurrence
[TABLE]
for , with . We can then check that has the initial value and satisfies the recurrence described in Theorem 1.
Recall that our goal is to estimate
[TABLE]
Since the equation has at most solutions we will always have , whence
[TABLE]
Setting
[TABLE]
we then have
[TABLE]
Our plan is to substitute the approximate value for given by Lemma 7.
We first investigate the contribution from the main term . This produces
[TABLE]
However the identity (12) shows that the inner sum vanishes for , and takes the value 1 for . Thus the main term for (13) is just , producing the leading term in (1).
For the proof of Theorem 1 it remains to handle the contribution to (13) arising from the error term in Lemma 7, which will be
[TABLE]
with
[TABLE]
However it is clear from (12) that
[TABLE]
if , so that
[TABLE]
say. This suffices for Theorem 1.
References
[1] T.D. Browning and D.R. Heath-Brown, Forms in many variables and differing degrees, J. Eur. Math. Soc. (to appear), arXiv:1403.5937.
[2] G. Castelnuovo, Ricerche di geometria sulle curve algebriche, Atti Reale Accademia delle Scienze di Torino, 24 (1889), 346–373.
[3] J. Juul, P. Kurlberg, K. Madhu and T.J. Tucker, Wreath products and proportions of periodic points, Int. Math. Res. Not., 2016, no. 13, 3944–3969.
[4] J.M. Pollard, A Monte Carlo method for factorization, Nordisk Tidskr. Informationsbehandling (BIT), 15 (1975), no. 3, 331–334.
[5] X. Shao, Polynomial values modulo primes on average and sharpness of the larger sieve, Algebra Number Theory, 9 (2015), no. 10, 2325–2346.
