Furstenberg–Sárközy theorem and partition regularity of polynomial equations over finite fields
Ethan Ackelsberg, Vitaly Bergelson

TL;DR
This paper advances the understanding of polynomial configurations in finite fields, providing bounds, characterizations, and partition regularity results that extend classical theorems to finite field settings with fixed and large characteristics.
Contribution
It offers a complete algebraic characterization of polynomials satisfying the Furstenberg--Sárközy theorem over finite fields and establishes partition regularity of polynomial equations.
Findings
Bounded polynomial configurations in finite fields matching Weil bounds
Complete algebraic characterization of polynomials for Furstenberg--Sárközy theorem
Partition regularity of polynomial equations with quantitative bounds
Abstract
We prove new combinatorial results about polynomial configurations in large subsets of finite fields. Bergelson–Leibman–McCutcheon (2005) showed that for any polynomial with , if is a subset of a -element finite field and does not contain distinct such that for some , then . In fields of sufficiently large characteristic, the bound can be improved to by the Weil bound. We match this bound in the low characteristic setting and give a complete algebraic characterization of the class of polynomials for which the Furstenberg–Sárközy theorem holds over finite fields of fixed characteristic. Our next main result deals with an enhancement of the Furstenberg–Sárközy theorem over finite fields. Another consequence of the Weil bound is that if $P(x)…
Asymptotic total ergodicity for actions of $\mathbb{F}[t]$ and polynomial configurations over finite fields and rings
Ethan Ackelsberg
School of Mathematics, Institute for Advanced Study, Princeton, NJ 08540
and
Vitaly Bergelson
Department of Mathematics, Ohio State University, Columbus, OH 43210
Abstract.
We obtain new combinatorial results about polynomial configurations over finite fields and rings by utilizing the phenomenon of asymptotic total ergodicity (previously studied for actions of on modular rings in [BB23]) in the context of actions of the polynomial ring over a finite field . We prove that a sequence of quotient rings , , is asymptotically totally ergodic if and only if the minimal degree of the irreducible factors of diverges to infinity as . We then show that asymptotic total ergodicity of a sequence of quotient rings leads to asymptotic equidistribution of polynomial sequences in subgroups of . This has several combinatorial consequences:
- (1)
We prove a power saving bound for the Furstenberg–Sárközy theorem over finite fields of fixed characteristic: given an intersective polynomial , there exists such that if , , does not contain distinct with for some , then . This complements recent work of Li and Sauermann [LS22], where they obtain power saving bounds under the assumption .
- (2)
Under a natural equidistribution condition on , we prove the following enhancement of the Furstenberg–Sárközy theorem in the presence of asymptotic total ergodicity. Suppose the sequence of quotient rings is asymptotically totally ergodic. Then for any and any two sets with and sufficiently large, there exist with and . Moreover, the values for which and obey the “correct” statistical behavior as :
[TABLE]
We also show that, in the absence of asymptotic total ergodicity and an equidistribution condition on the polynomial , one cannot hope for such a refinement of the Furstenberg–Sárközy theorem.
- (3)
We establish partition regularity of families of polynomial equations over finite fields. For example, we are able to prove: if with , then for any , there exists and such that if and , then there are at least monochromatic solutions to the equation .
Interactions between finitary mathematics (combinatorics in finite fields and rings) and infinitary mathematics (ergodic theory, equidistribution, and Loeb measure spaces) play a central role throughout the paper.
Key words and phrases:
Finite fields, Furstenberg–Sárközy theorem, total ergodicity, equidistribution, partition regularity, Loeb measure
2020 Mathematics Subject Classification:
11B30 (11T06, 37A25, 05D10)
Contents
- 1 Introduction
- 2 Asymptotic total ergodicity
- 3 Asymptotic projection theorem
- 4 Power saving bounds for the Furstenberg–Sárközy theorem in characteristic
- 5 Proof of equivalences
- 6 Partition regularity of polynomial equations over finite fields
1. Introduction
Our starting point is the following classical result:
Theorem 1.1 (Furstenberg–Sárközy theorem [F77, S78]).
Let be a nonzero polynomial with . For any , there exists such that any subset of size contains distinct elements with for some .
The assumption that can be weakened as follows. Call a polynomial intersective if has a root mod for every . It follows from the work of Kamae and Mendès France [KM78, Example 3] that the conclusion of Theorem 1.1 holds if and only if is intersective.
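As a small illustration of intersectivity over the integers, the following sketch brute-forces roots modulo m. The helper names and the search bound are our own choices and do not come from the paper. The test polynomial (x^2 - 13)(x^2 - 17)(x^2 - 221) is a classical example with a root modulo every m but no integer root, while x^2 - 2 already has no root modulo 3. A finite search of this kind only gives evidence of intersectivity, not a proof.

```python
def has_root_mod(poly, m):
    """Return True if poly (a callable on integers) has a root modulo m."""
    return any(poly(x) % m == 0 for x in range(m))


def has_root_mod_all_up_to(poly, bound):
    """Check for a root modulo every m = 1, ..., bound (evidence only, not a proof)."""
    return all(has_root_mod(poly, m) for m in range(1, bound + 1))


def classical(x):
    # Classical example: a root modulo every m, but no integer root.
    return (x * x - 13) * (x * x - 17) * (x * x - 221)


def square_minus_two(x):
    # Not intersective: 2 is not a square modulo 3.
    return x * x - 2
```

For instance, `has_root_mod_all_up_to(classical, 200)` inspects every modulus up to 200, whereas `has_root_mod(square_minus_two, 3)` fails immediately.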
A version of the Furstenberg–Sárközy theorem holds also in a finite characteristic setting. The following result is a consequence of [BL16, Theorem 9.2] together with (a corrected version of) the remark following Theorem 9.5 in [BL16] [footnote 1].
[Footnote 1: In the remark following Theorem 9.5 in [BL16], intersective polynomials are defined as polynomials such that for any finite index subgroup , there exists such that for every . The definition of intersective given in item (i) of Theorem 1.2 is different and deals with a wider class of polynomials but is the correct notion in order to get the desired “if and only if” conclusion. An example of an intersective polynomial that does not fit the condition in [BL16] is . There is no multiple for which always belongs to the subgroup of index . However, , so is intersective (according to our definition).]
Theorem 1.2.
Let be a finite field with elements. The following are equivalent for a polynomial :
- (i)
is intersective: for any , there exists such that ;
- (ii)
for any , there exists such that any subset of elements of degree with contains distinct with for some .
A consequence of Theorem 1.2 is a Furstenberg–Sárközy type theorem over large finite fields:
Corollary 1.3.
Let be a prime number, and let be an intersective polynomial. Then for any , there exists such that if and with , then contains distinct with for some .
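To make the pattern in Corollary 1.3 concrete, here is a brute-force sketch for the simplest case: the prime field F_p and the polynomial P(z) = z^2. Both choices, the function name, and the exhaustive search are illustrative assumptions on our part; the search is exponential in p and is only meant for very small fields.

```python
from itertools import combinations


def largest_pattern_free_subset(p):
    """Largest size of A in F_p with no distinct x, y in A such that
    x - y = z^2 (mod p) for some nonzero z. Exhaustive search, tiny p only."""
    squares = {(z * z) % p for z in range(1, p)}
    for size in range(p, 0, -1):
        for subset in combinations(range(p), size):
            # The pattern is ordered, so both x - y and y - x must avoid squares.
            if all((x - y) % p not in squares and (y - x) % p not in squares
                   for x, y in combinations(subset, 2)):
                return size
    return 0
```

When p is congruent to 3 mod 4, exactly one of d and -d is a square for every nonzero d, so only singletons avoid the pattern; the search reflects this for p = 7.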
Recent work of Li and Sauermann [LS22], building on earlier work of Green [G17] using the Croot–Lev–Pach [CLP17] polynomial method, establishes quantitative improvements of Theorem 1.2 and Corollary 1.3 under the assumption .
Theorem 1.4 ([LS22], Theorem 1.4 and Corollary 1.5).
For , let
[TABLE]
where
[TABLE]
- (i)
Fix and a polynomial of degree with . If is a set of polynomials of degree less than and does not contain distinct with for some , then [footnote 2]
[TABLE]
[Footnote 2: The notation means that there is a constant such that for all . Subscripts on denote on which parameters the constant depends.]
- (ii)
Fix a prime and a polynomial of degree with . If does not contain distinct with for some , then
[TABLE]
In this paper, we produce power saving bounds for the Furstenberg–Sárközy theorem over finite fields by a different method. The bounds we obtain are different from those in Theorem 1.4. In some cases, our bounds are stronger, while in other cases, ours are weaker; see Remark 1.13 below. Our approach draws inspiration from infinitary sources. These are: equidistributional results for polynomial sequences defined over and the phenomenon of asymptotic total ergodicity for actions of . As a consequence of our approach, our results apply in a more general setting than finite fields of characteristic , including quotient rings under some conditions on . The phenomenon of asymptotic total ergodicity for -actions was previously explored in [BB23], where similar Furstenberg–Sárközy-type results are proved in the setting of modular rings when all prime factors of are sufficiently large. The results of this paper are natural analogues of the results in [BB23]. Where appropriate, we note the correspondences between our setting (dealing with -actions and quotient rings ) and the more familiar setting of -actions and modular rings . However, some caution is needed, as our finite characteristic setting introduces new complications. Namely, the distributional behavior of a polynomial whose degree exceeds the characteristic is more sophisticated than the behavior of polynomials over (see, e.g., Theorem 1.9 below), and this creates additional difficulties in our analysis.
Before stating our results, we fix some notation. Let be a finite field of characteristic . We denote the set of monic polynomials over by . Every element has a unique factorization (up to reordering) into monic irreducibles . We denote the quotient ring by , and we have an isomorphism
[TABLE]
by the Chinese remainder theorem. Note that when is irreducible, is a finite field of characteristic . Moreover, any finite field of characteristic can be obtained as for some irreducible element . The decomposition of the ring for general given in (1.1) is parallel to the situation with modular rings, where the Chinese remainder theorem gives an isomorphism
[TABLE]
for .
We define an absolute value on by if . Equivalently, for any , is the cardinality of the quotient ring . For , we set to be the size of the least prime factor of .
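The two quantities just introduced can be computed directly for small examples. The sketch below makes several assumptions that are ours rather than the paper's: it works over a prime field F_p only, represents a monic polynomial as a list of coefficients from lowest to highest degree, takes the absolute value to be p raised to the degree (the cardinality of the quotient ring), and uses the hypothetical name `lpf` for the size of the least prime factor. The least-degree nonconstant monic divisor is automatically irreducible, so trial division suffices.

```python
from itertools import product


def trim(a):
    """Remove trailing zero coefficients in place and return the list."""
    while a and a[-1] == 0:
        a.pop()
    return a


def poly_rem(a, b, p):
    """Remainder of a modulo a monic polynomial b over F_p (low degree first)."""
    a = trim([c % p for c in a])
    db = len(b) - 1
    while len(a) - 1 >= db:
        lead, shift = a[-1], len(a) - 1 - db
        for i, c in enumerate(b):
            a[shift + i] = (a[shift + i] - lead * c) % p
        trim(a)
    return a


def abs_value(Q, p):
    """|Q| = p^(deg Q), the number of residue classes modulo Q."""
    return p ** (len(Q) - 1)


def lpf(Q, p):
    """Size p^d of the least-degree monic irreducible factor of a monic Q."""
    for d in range(1, len(Q)):
        for low in product(range(p), repeat=d):
            if not poly_rem(Q, list(low) + [1], p):
                return p ** d
    raise ValueError("Q must be monic of degree at least 1")
```

Over F_2, the polynomial t^2 + t + 1 is irreducible, so `lpf([1, 1, 1], 2)` equals its absolute value 4, while t^3 + 1 = (t + 1)(t^2 + t + 1) has absolute value 8 but least prime factor of size 2.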
For a finite set and a function , we write
[TABLE]
For , we define the -norm on by
[TABLE]
Recall that a measure-preserving -system [footnote 3] is totally ergodic if for every , is ergodic. By analogy, we say that a measure-preserving -system is totally ergodic if for every , the action is ergodic. Our first result provides a finitization of the phenomenon of total ergodicity. A similar result for -actions and the quotient rings appears in [BB23].
[Footnote 3: Given an abelian group , a measure-preserving -system is a quadruple , where is a probability space, and is an action of on by measure-preserving transformations.]
Theorem 1.5.
Let be a sequence in . The following are equivalent:
- (i)
The sequence of quotient rings is asymptotically totally ergodic: for any ,
[TABLE]
- (ii)
.
We prove Theorem 1.5 in Section 2.
Using the spectral theorem for unitary actions of and equidistributional results for polynomial sequences over , one may show the following:
Theorem 1.6.
Let be a polynomial with zero constant term. Then for any totally ergodic system , any Følner sequence [footnote 4] , and any ,
[TABLE]
where is the orthogonal projection onto the space
[TABLE]
[Footnote 4: A Følner sequence in is a sequence of finite subsets of such that, for any $n$, $\lim_{N\to\infty} |(\Phi_N + n) \triangle \Phi_N| / |\Phi_N| = 0$. Examples include (the set of all polynomials over of degree ) and (the set of all monic polynomials of degree ).]
Remark 1.7.
(1) A similar result with replaced by a countably infinite field was obtained by Larick in [L98, Theorem 1.1.1].
(2) For -actions, the corresponding version of Theorem 1.6 is simpler. Namely, for any polynomial , any totally ergodic -system , any Følner sequence in , and any ,
[TABLE]
The presence of the projection in Theorem 1.6 rather than is a reflection of the more intricate distributional behavior of polynomials over . The situation where the projection is equal to can be characterized by an equidistributional assumption on the polynomial ; see Proposition 1.14 below.
Our goal is to prove an asymptotic version of Theorem 1.6 and to deduce from it new combinatorial results over finite fields (and rings of the form ). Before formulating our result, we sketch a proof of Theorem 1.6, which will serve as a model for our finitary results.
By the spectral theorem for actions of by unitary operators on a Hilbert space, we may work with the Hilbert space , where is a positive Borel measure on the dual group , and the unitary action is represented by the multiplication operators for and .
Rather than working with as the abstract dual group of , it will be convenient to work with the dual group in a more concrete form. Let be the field of rational functions . Extending the absolute value we defined on to by , the completion of is the field . We think of and as natural analogues of the rational numbers and the real numbers , respectively.
In the characteristic zero setting, the dual group of the integers is isomorphic to the torus . A similar result is true in our setting: the dual group is isomorphic to . In particular, every character takes the form
[TABLE]
for some , where
[TABLE]
and is a fixed nontrivial character on .
A word of caution: with the objects discussed above, we have an isomorphism . The corresponding statement in the more familiar characteristic zero setting is not true: .
To prove Theorem 1.6, it then suffices to show: for -a.e. ,
[TABLE]
For -actions, total ergodicity is equivalent to the absence of rational spectrum. Similarly, the assumption that is a totally ergodic -action means that
[TABLE]
Therefore, Theorem 1.6 reduces to studying equidistribution of the sequences for irrational .
A general Weyl-type equidistribution theorem for polynomials over was established in [BL16], and we can use the result to finish the proof of Theorem 1.6. First, we need some definitions for polynomials in finite characteristic:
Definition 1.8.
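- (1)
A polynomial is called separable if and for .
- (2)
A polynomial is additive if for any .
- (3)
For and , define the differencing operator . Then define recursively by . The derivational degree (abbreviated d-) of a polynomial is the minimum such that for any .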
For and , define the differencing operator . Then define recursively by . The derivational degree (abbreviated d-) of a polynomial is the minimum such that for any .
Note that d-, since in characteristic . More generally, for with , we have d-. Any polynomial can be written in the form , where , are additive polynomials, and are distinct separable monomials.
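The derivational degree can be explored experimentally by applying the differencing operator symbolically. The following sketch is an illustration under our own assumptions: coefficients are taken in a prime field F_p, a one-variable polynomial is given as a dictionary from exponents to coefficients, and each differencing step introduces a fresh variable. The function names are hypothetical. For the monomials tested, the computed value agrees with the sum of the base-p digits of the exponent.

```python
from math import comb


def difference(poly, p):
    """Map P(x, u_1, ...) to P(x + u, u_1, ...) - P(x, u_1, ...) for a fresh u.

    poly maps exponent tuples (e_x, e_u1, ...) to nonzero coefficients mod p.
    """
    out = {}
    for exps, c in poly.items():
        e, rest = exps[0], exps[1:]
        for j in range(1, e + 1):  # the j = 0 term cancels against -P
            coeff = (c * comb(e, j)) % p
            if coeff:
                key = (e - j,) + rest + (j,)
                out[key] = (out.get(key, 0) + coeff) % p
    return {k: v for k, v in out.items() if v}


def derivational_degree(coeffs, p):
    """Least d such that d + 1 successive differences vanish identically.

    coeffs maps exponents of x to integer coefficients; returns -1 for the
    zero polynomial.
    """
    poly = {(e,): c % p for e, c in coeffs.items() if c % p}
    steps = 0
    while poly:
        poly = difference(poly, p)
        steps += 1
    return steps - 1


def digit_sum(r, p):
    """Sum of the base-p digits of r."""
    total = 0
    while r:
        total, r = total + r % p, r // p
    return total
```

For example, with p = 3 the monomial x^3 has derivational degree 1 even though its degree is 3, and x^4 has derivational degree 2.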
We say that is well-distributed mod if
[TABLE]
for every continuous function and every Følner sequence in . A more refined notion of equidistribution is as follows. A function is well-distributed mod in a subgroup if
[TABLE]
for every continuous function and every Følner sequence in . For a subgroup and a finite set , we say that is well-distributed in the components of if there exists such that, for every , the sequence is well-distributed in for some .
Theorem 1.9 ([BL16], Theorem 0.3).
An additive polynomial is well distributed in the subgroup [footnote 5] . For any polynomial , the orbit closure is of the form , where and is a finite subset of , and is well-distributed in the components , .
[Footnote 5: In [BL16], the subgroup is called a -subtorus of level , where is a finite subgroup.]
For an additive polynomial and irrational , the orbit closure is equal to the subtorus rather than a union of finitely many shifts of . It follows that for with and , the sequence is well-distributed in the subtorus ; see [BL16, Theorem 8.1] for more details. Thus, for any ,
[TABLE]
This completes the proof of Theorem 1.6.
We can now state our main result, which is an asymptotic version of Theorem 1.6 with quantitative bounds:
Theorem 1.10.
Let be a nonconstant polynomial of degree and derivational degree . Write . Let and . Then for any and any ,
[TABLE]
where .
Remark 1.11.
(1) In the case , we get that the average $\mathbb{E}_{y\in\mathbb{F}[t]_{Q}} f(x+P(y))$ is approximated in by the function
[TABLE]
which is the projection of onto the space of -invariant functions, so Theorem 1.10 can indeed be seen as a finitary version of Theorem 1.6.
(2) The phenomenon encompassed by Theorem 1.10 is simpler in the low degree situation . In the context of finite fields ( with irreducible in our notation), a closely related result was previously established in [BBI21, Lemma 3]. In particular, it is shown that for any , any subsets , and any polynomial with degree ,
[TABLE]
see the statement of Lemma 3 in [BBI21] and the formula for at the top of page 713.
As a consequence of Theorem 1.10, we obtain the following power savings for the Furstenberg–Sárközy theorem:
Corollary 1.12.
Let be an intersective polynomial of degree and derivational degree . Let . If does not contain distinct with for some , then
[TABLE]
In particular, if is irreducible (so that is a field with elements), then
[TABLE]
Remark 1.13.
(1) If we restrict to being sufficiently large (so that does not reduce to the zero polynomial mod ), then the implicit constant in the conclusion of Corollary 1.12 depends only on the degree and the derivational degree .
(2) The bound given in Theorem 1.4 is difficult to compute in general and to compare with the bound in Corollary 1.12. We can, however, highlight some general features of the different bounds. The power saving obtained in Corollary 1.12 depends only on the derivational degree of the polynomial and applies to all intersective polynomials with coefficients in . By contrast, the Li–Sauermann bound depends on the degree of the polynomial and on the characteristic of the field and applies only to polynomials with zero constant term and with coefficients in .
The disadvantage of our bound is that the power saving decays exponentially with the derivational degree. How the quantity appearing in Theorem 1.4 depends on and is not immediately clear from the definition. However, a related bound due to Green [G17, Theorem 1.2] (which Li and Sauermann optimize) gives power savings of
[TABLE]
for a degree polynomial with . That is, for any , the largest subset of with no nontrivial pattern has cardinality
[TABLE]
Therefore, for fixed , there is a regime of sufficiently high degree polynomials for which the Li–Sauermann bound beats ours. On the other hand, the quantity decays as , so the bound in Corollary 1.12 will outperform this bound in sufficiently high characteristic (for fixed degree ).
Thus, neither of the power saving bounds is universally better than the other, and both methods have their advantages and disadvantages. The main goal of our work is not to produce the best possible power saving bounds but to provide a heuristic backing for why any power saving bound should hold at all and to place the Furstenberg–Sárközy theorem over finite fields within the appropriate general framework.
Combining Theorems 1.9 and 1.10, we may deduce several finitary combinatorial statements from an infinitary statement about equidistribution. Say that a polynomial is good for irrational equidistribution if is well distributed for every irrational .
Proposition 1.14.
A polynomial is good for irrational equidistribution if and only if for any totally ergodic system , any Følner sequence in , and any ,
[TABLE]
Proposition 1.14 can be proved along the lines of Theorem 1.6 outlined earlier using the spectral theorem for unitary actions of . For the “only if” direction, upon replacing by the multiplication operators on , we use the fact that is good for irrational equidistribution to conclude
[TABLE]
in . This corresponds to the desired convergence result
[TABLE]
For the “if” direction: suppose is not good for irrational equidistribution, and let such that $\lim_{N\to\infty} \mathbb{E}_{y\in\Phi_{N}} e\left(P(y)\alpha\right) \neq 0$. We then take as our totally ergodic system , , , and . For the function , we have , since is a nontrivial character on , while
[TABLE]
Remark 1.15.
The naive analogue of Proposition 1.14 for -actions is true. That is, a polynomial is good for irrational equidistribution (meaning that is well-distributed mod 1 for every irrational ) if and only if for any totally ergodic system , any Følner sequence in , and any ,
[TABLE]
However, this result is far less meaningful in the setting of -actions, since every nonconstant integer polynomial is good for irrational equidistribution by Weyl’s equidistribution theorem. This is far from the case in the setting of ; see the examples below.
Example 1.16.
(1) Every nonconstant separable polynomial is good for irrational equidistribution; see [BL16, Corollary 0.5].
(2) If are additive polynomials such that for some and , then for any distinct not divisible by , is good for irrational equidistribution. This follows from Theorem 1.9, since the condition ensures that the orbit closure contains the orbit , which is dense mod for irrational .
(3) The polynomial is good for irrational equidistribution. This follows from Theorem 1.18 below. Indeed, upon writing with and , we see that satisfies condition (iii) of Theorem 1.18 for and .
(4) The polynomial is not good for irrational equidistribution: for any of the form , the orbit closure is contained in the infinite index subgroup .
(5) The polynomial is not good for irrational equidistribution. Write with . Then clearly . For any of the form
[TABLE]
satisfying for , one can check by direct calculation that for every . Hence, for any such , is not well distributed mod . Moreover, the set of all such is uncountable so contains irrational elements.
(6) An additive polynomial is good for irrational equidistribution if and only if ; see Proposition 5.4.
The following theorem summarizes the main achievements in this paper. In particular, it emphasizes the role of equidistribution properties in obtaining finitary combinatorial results over the rings . Note that item (v) below strengthens the conclusion of Corollary 1.12 under the assumption that the polynomial is good for irrational equidistribution.
Theorem 1.17.
Let be a nonconstant polynomial. The following are equivalent:
- (i)
for any ,
[TABLE]
- (ii)
there exist such that for any with , one has
[TABLE]
- (iii)
there exists such that if and , then , where is the group generated by .
- (iv)
for any , there exists such that if has and are subsets with , then there exist such that and ;
- (v)
there exist such that for any with , one has
[TABLE]
Moreover, if is good for irrational equidistribution, then each of the properties (i)-(v) holds.
By restricting the coefficients of the polynomial , we can prove a stronger version of Theorem 1.17:
Theorem 1.18.
Let . Let be additive polynomials and distinct positive integers not divisible by so that . The following are equivalent:
- (i)
is good for irrational equidistribution;
- (ii)
for any totally ergodic system , any Følner sequence in , and any ,
[TABLE]
- (iii)
there exist additive polynomials and such that
[TABLE]
- (iv)
for any ,
[TABLE]
- (v)
there exist such that for any with , one has
[TABLE]
- (vi)
there exists such that if and , then , where , .
- (vii)
for any , there exists such that if has and are subsets with , then there exist such that and ;
- (viii)
there exist such that for any with , one has
[TABLE]
Combining Furstenberg–Sárközy-type results with the technology of Loeb measures on ultraproducts, we are able to establish partition regularity of families of polynomial equations over finite fields, such as the following:
Theorem 1.19.
Let be a nonconstant polynomial, and let be good for irrational equidistribution. For any , there exists and such that for any and any -coloring , there are at least monochromatic solutions to the equation . That is,
[TABLE]
We are in fact able to prove partition regularity under a weaker (but more technically cumbersome) condition on ; see Corollary 6.7. Under this weaker assumption, one application of note is a polynomial Schur theorem over finite fields:
Corollary 1.20.
Let with . Then for any , there exists and such that if and , then there are at least monochromatic solutions to the equation . In particular, if the coefficients of are not all divisible by the characteristic of , then there are monochromatic solutions with .
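For a hands-on view of what counting monochromatic solutions means, the sketch below counts monochromatic solutions of the Fermat-type equation x^k + y^k = z^k in a prime field F_p under a given coloring. The restriction to prime fields, the function name, and the colorings used in the usage note are illustrative assumptions of ours; the corollary itself concerns arbitrary finite fields of sufficiently large order.

```python
def count_monochromatic(p, k, color):
    """Count triples (x, y, z) in F_p^3 with x^k + y^k = z^k (mod p)
    such that color(x) == color(y) == color(z)."""
    powers = [pow(x, k, p) for x in range(p)]
    count = 0
    for x in range(p):
        for y in range(p):
            if color(x) != color(y):
                continue
            target = (powers[x] + powers[y]) % p
            for z in range(p):
                if powers[z] == target and color(z) == color(x):
                    count += 1
    return count
```

With a single color, `count_monochromatic(p, 2, lambda v: 0)` counts all points of the cone x^2 + y^2 = z^2, which has p^2 points for odd p. Passing a coloring such as `lambda v: v % 2` shows how many of those solutions stay monochromatic.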
Remark 1.21.
In the case , Corollary 1.20 corresponds to the Fermat equation . Schur proved the existence of solutions to the Fermat equation in all prime fields of sufficiently high characteristic using his eponymous partition regularity theorem in [S16]. (This was in fact the original motivation for Schur’s theorem.) The much stronger property of partition regularity of the Fermat equation was established previously in the context of prime fields in [CGS12, Theorem 4] and generalized to a family of related polynomial equations in [L18]. We complete the picture here by extending the partition regularity property to arbitrary finite fields of sufficiently large order (with no assumption on the characteristic).
Related density results for Pythagorean pairs and triples in finite fields were obtained in [DLMS23, Section 6], where they also show that a density version (“density regularity”) of Corollary 1.20 fails already for the Pythagorean equation .
The paper is organized as follows. In Section 2, we prove Theorem 1.5, showing that appropriately captures the phenomenon of asymptotic total ergodicity. Section 3 is dedicated to the proof of Theorem 1.10. The final three sections concern applications of Theorem 1.10: Section 4 deals with power saving bounds in the Furstenberg–Sárközy theorem for intersective polynomials in the presence of asymptotic total ergodicity (Corollary 1.12); Section 5 with further enhancements of the Furstenberg–Sárközy theorem for polynomials with good equidistributional behavior (Theorems 1.17 and 1.18); and Section 6 with partition regularity of polynomial equations over finite fields. The relevant tools are introduced in the corresponding sections as needed.
2. Asymptotic total ergodicity
In this section, we prove that the quantity captures the phenomenon of asymptotic total ergodicity. Recall Theorem 1.5:
(See the statement of Theorem 1.5 in Section 1.)
Proof.
(ii) ⇒ (i). Fix . If has , then is an element of the multiplicative group . Hence, for any and any ,
[TABLE]
(i) ⇒ (ii). Let , and let be an irreducible factor of . Enumerate , and let be the function . Define by . Then
[TABLE]
for every . On the other hand,
[TABLE]
Therefore,
[TABLE]
Now suppose . Taking a subsequence if necessary, we may assume is bounded. By the pigeonhole principle, we may then take a further subsequence and assume that there is a common irreducible factor of every , . The above calculation shows that we may find with
[TABLE]
for each , contradicting (i). ∎
3. Asymptotic projection theorem
We now turn to proving Theorem 1.10 with Fourier analysis. Characters on take the form for some . For a function , we therefore define its Fourier transform by
[TABLE]
We state some basic properties of the Fourier transform:
Proposition 3.1.
For any , one has
- Fourier inversion:
[TABLE]
- Parseval’s identity:
[TABLE]
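The two identities can be checked numerically in a simpler model. The sketch below uses the cyclic group Z_N as a stand-in for the additive group of the quotient ring and assumes the normalization in which the Fourier transform is an average over the group while sums over frequencies are unnormalized. Both the model and the normalization are assumptions on our part, since the displayed formulas are not available here.

```python
import cmath


def fourier(f):
    """Fourier transform on Z_N: fhat(xi) = E_x f(x) e(-x * xi / N)."""
    N = len(f)
    return [sum(f[x] * cmath.exp(-2j * cmath.pi * x * xi / N) for x in range(N)) / N
            for xi in range(N)]


def invert(fhat):
    """Fourier inversion: f(x) = sum over xi of fhat(xi) e(x * xi / N)."""
    N = len(fhat)
    return [sum(fhat[xi] * cmath.exp(2j * cmath.pi * x * xi / N) for xi in range(N))
            for x in range(N)]


def parseval_sides(f):
    """Return (E_x |f(x)|^2, sum over xi of |fhat(xi)|^2)."""
    fhat = fourier(f)
    lhs = sum(abs(v) ** 2 for v in f) / len(f)
    rhs = sum(abs(v) ** 2 for v in fhat)
    return lhs, rhs
```

Up to floating-point error, the two values returned by `parseval_sides` agree, and `invert(fourier(f))` recovers `f`.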
Define $F(x) := \mathbb{E}_{y\in\mathbb{F}[t]_{Q}} f(x+P(y)) - \mathbb{E}_{z\in H_{Q}} f(x+a_{0}+z)$. Then
[TABLE]
where is the annihilator of the subgroup . Hence, by Parseval’s identity
[TABLE]
All that remains to show is the inequality:
Proposition 3.2.
Let be a nonconstant polynomial of degree and derivational degree . Write . Then for any and any ,
[TABLE]
A key ingredient in Proposition 3.2 is the following van der Corput-type inequality. We do not use any ring structure for this result, so we state and prove it in the setting of an arbitrary finite abelian group . For a function , define a multiplicative differencing operator by , and let for and .
Lemma 3.3.
Let be a finite abelian group, and let be a subgroup. For any function and any ,
[TABLE]
Remark 3.4.
It is worth commenting on two special cases of Lemma 3.3. When , the quantity on the right hand side is equal to , so the conclusion of Lemma 3.3 reduces to the inequality , which is a special case of monotonicity for the Gowers (semi)norms. On the other hand, when , the right hand side is equal to $\mathbb{E}_{u\in G} |f(u)|^{2^{k}}$, so the conclusion of Lemma 3.3 follows by Jensen’s inequality. The general case can be seen as interpolating between these two extremes.
Proof.
Suppose . Note that
[TABLE]
Therefore, by Jensen’s inequality,
[TABLE]
Interchanging the order of averaging and making the substitutions , , we obtain the desired inequality
[TABLE]
Suppose the inequality holds for . Then
[TABLE]
which is in turn bounded above by
[TABLE]
For fixed , applying the case with the function gives
[TABLE]
Putting everything together,
[TABLE]
∎
Proof of Proposition 3.2.
We first make a reduction to separable polynomials. If , then for every , since . Suppose now that . We want to show
[TABLE]
Noting that , we have
[TABLE]
Moreover, for any ,
[TABLE]
Now, for each , the function is a nontrivial character on , so there exists such that . It therefore suffices to prove the following: for any nonconstant separable polynomial ,
[TABLE]
Suppose . Then with . Therefore,
[TABLE]
Now suppose . Let , where d- and d-. By Lemma 3.3,
[TABLE]
for any subgroup . (We will take a convenient choice for later.) We now wish to obtain an expression for that will allow us to bound the average
[TABLE]
For , one has that is constant (as a function of ), since d-, so we can pull the constant outside of the average.
Let so that and . For each , we may write with and . Since is separable by assumption, we have for . Then
[TABLE]
where , is the sum of all monomials of the form
[TABLE]
with
[TABLE]
and is a symmetric polynomial in variables. (If , then .) We can therefore write
[TABLE]
where .
Let
[TABLE]
Note that is a group homomorphism . It follows that
[TABLE]
whenever is a nonzero function. Noting that is a character on , it may be written in the form for some . We have thus obtained the bound
[TABLE]
The remainder of the proof consists of two main steps. First, we show that, for a convenient choice of , the function becomes (after a change of coordinates) a polynomial in variables. Next, we establish a bound on the number of roots of multivariable polynomials mod .
Recall . Let . This is a subgroup, since the function is a homomorphism. For , let so that
[TABLE]
and each of the polynomials is an additive polynomial of degree at most in each coordinate. In particular, for each . Making the substitution , we therefore have
[TABLE]
for some .
For each , the function is a character on , so there exists such that . Hence, defining , we have
[TABLE]
That is, is a polynomial of degree at most in each coordinate.
We claim that is not the zero polynomial. By definition, . We also have
[TABLE]
The coefficients are integers coprime to , so . Hence, . Now, is a sum of terms of the form
[TABLE]
with the property . Therefore, the monomials appearing in are distinct from the monomials appearing in for . It follows that is not the zero polynomial. Thus,
[TABLE]
is not the zero polynomial, and each monomial appearing has degree divisible by . Finally, for , we have
[TABLE]
which consists of monomials in which each variable has degree congruent to mod . This proves that is not the zero polynomial.
The final step is to show that has only a small number of zeros.
Lemma 3.5.
Let , and let be a nonzero polynomial of degree in the variable for . Then
[TABLE]
Proof of Lemma.
Let us first consider the case . Write . We view as elements of with . Let , , and . Fix irreducible such that . For some , we have . Since is a field, and reduces to a nonzero polynomial of degree mod , we have
[TABLE]
Now suppose . Then . That is, , so . Hence, . Equivalently, . Therefore,
[TABLE]
Suppose . If , then there is nothing to prove, so assume . Fix , and let . If is not the zero polynomial, then by the induction hypothesis,
[TABLE]
Hence,
[TABLE]
It therefore suffices to prove
[TABLE]
Fix , and let . If , then . By Lemma 3.5, it follows that
[TABLE]
unless is the zero polynomial. So, it remains to find such that is a nonzero polynomial. Note that the coefficients of are polynomial expressions in of degree at most in the variable . Since is not the zero polynomial, there is at least one coefficient that is a nonzero polynomial . By the induction hypothesis,
[TABLE]
Since by assumption, it follows that for some . For this choice of , the polynomial is not the zero polynomial, so we are done. ∎
Applying Lemma 3.5 to , we get the bound
[TABLE]
Thus,
[TABLE]
∎
4. Power saving bounds for the Furstenberg–Sárközy theorem in characteristic
We now prove Corollary 1.12, restated below:
See 1.12
Proof of Corollary 1.12.
Since is intersective, there exists with . Hence, . Therefore, applying Theorem 1.10 with , we have
[TABLE]
by the Cauchy–Schwarz inequality.
On the one hand, if contains no nontrivial patterns , then by Lemma 3.5,
[TABLE]
as long as is large enough so that is not the zero polynomial mod . On the other hand,
[TABLE]
by Lemma 3.3. Therefore,
[TABLE]
where . Multiplying both sides by , we get the desired bound
[TABLE]
∎
5. Proof of equivalences
The goal of this section is to prove Theorem 1.17, restated here for convenience:
See 1.17
First we prove that irrational equidistribution implies condition (iii).
Proposition 5.1.
Suppose is good for irrational equidistribution, and let be the group generated by . Then there exists such that if satisfies , then . That is, (iii) holds.
Proof.
We prove the contrapositive. Suppose (iii) fails. Then there is a sequence in such that and for . Equivalently,
[TABLE]
Since , it follows that for some . If in reduced terms (i.e., and are coprime), then , so we may assume without loss of generality that and are coprime. The sequence then consists of distinct elements, so is infinite. Every infinite compact group is uncountable, and there are only countably many rational points, so must contain an irrational element. That is, for some irrational , for every . Hence, is not well distributed, so is not good for irrational equidistribution. ∎
We will now prove the equivalences in Theorem 1.17 by showing the implications illustrated in the following diagram:
(ii) ⇒ (i) ⇒ (iii) ⇒ (ii) ⇒ (v) ⇒ (iv) ⇒ (iii)
Condition (ii) is a quantitative refinement of condition (i), so we immediately have the implication (ii) ⇒ (i). By Theorem 1.10, we have the additional implications (i) ⇒ (iii) ⇒ (ii).
Condition (v) follows from (ii) by a straightforward application of the Cauchy–Schwarz inequality.
Proposition 5.2.
(v) ⇒ (iv).
Proof.
Let . Let , , and be as in (v). Let with . Let with . Then by (v),
[TABLE]
Thus, if
[TABLE]
then we can find with and . ∎
We now prove the final implication to complete the proof of Theorem 1.17:
Proposition 5.3.
(iv) ⇒ (iii).
Proof.
We prove the contrapositive. Suppose (iii) fails. Then there is a sequence in with such that for every . The subgroup has index for some . Let be a union of cosets of , and let . For any and , we have
[TABLE]
That is, if and , then . Moreover,
[TABLE]
Therefore, property (iv) fails for . ∎
Now we proceed to prove the remaining equivalences in Theorem 1.18. As a first step, we have the following characterization of irrational equidistribution for additive polynomials:
Proposition 5.4.
Let be an additive polynomial. The following are equivalent:
- (i) is good for irrational equidistribution;
- (ii) there exists such that if and , then ;
- (iii) for some .
Proof.
(i) ⇒ (ii). See Proposition 5.1.
(ii) ⇒ (iii). We prove the contrapositive. Suppose with , . We consider two cases separately.
Case 1: .
In this case, we may write , where . Hence, for any , . If , then , while . Therefore, the homomorphism has a nontrivial kernel mod , so is a proper subgroup of . Taking to be an arbitrarily large irreducible element of , this shows that (ii) does not hold.
Case 2: .
Write with . Since is a nonconstant polynomial, the set
[TABLE]
is infinite. Indeed, for any finite collection of irreducibles , consider
[TABLE]
Since is a nonzero polynomial, there exists such that . From the expression on the right hand side of (5.1), we have , and for each . Therefore, there is some irreducible such that . Hence, is infinite as claimed.
Suppose and . Let such that . Then , but , since . Hence, has a nontrivial kernel mod for infinitely many irreducibles , which contradicts condition (ii).
(iii) ⇒ (i). See [BL16, Theorem 0.1]. ∎
The following lemma is the key tool to reduce equidistributional properties of polynomials to the additive case with which we have just dealt.
Lemma 5.5.
Let be additive polynomials, and let for . There exists an additive polynomial such that . Moreover, , where are additive polynomials.
Proof.
If for some , then take with .
Suppose now that and are both nonzero. Let . Write and . Without loss of generality, . Define
[TABLE]
and let . Then .
Claim: .
For any , (5.2) expressed as a sum of an element of and an element of . Hence, . Rearranging (5.2), we have
[TABLE]
Thus, . This proves the claim.
We have shown that, given any nonzero additive polynomials , we may find with such that , and and are of the appropriate form. Repeating this process finitely many times, we eventually reduce to the situation that one of the additive polynomials is zero. We then take to be the remaining nonzero polynomial. ∎
The argument in the proof of Lemma 5.5 provides an algorithm for obtaining that bears a strong resemblance to the Euclidean algorithm. We work through a few simple examples to see more concretely how the algorithm works.
Example 5.6.
(1) , . The polynomial has larger degree, so we shift the exponents of to match the degree of and subtract:
[TABLE]
If , then , so we stop, and the resulting polynomial is simply . (Note that when , may be rewritten as , and then it is clear that , so the range of is manifestly a subset of the range of .) Suppose . Then , so we shift the exponents of and subtract:
[TABLE]
Since , the element is invertible, so the image of is all of , and we are done: . (One can check that applying one more step of the algorithm would result in , indicating that the process has terminated.)
(2) , . First, shifting and subtracting, we have
[TABLE]
Next, subtracting without any shifting gives
[TABLE]
Shifting and subtracting from produces , so we are done and .
The following proposition completes the proof of Theorem 1.18:
Proposition 5.7.
Let and write with additive polynomials and distinct positive integers not divisible by . The following are equivalent:
- (i) is good for irrational equidistribution;
- (ii) there exists such that if and , then , where is the group generated by ;
- (iii) there exist additive polynomials and such that
[TABLE]
Proof.
(i) ⇒ (ii). See Theorem 1.17.
(ii) ⇒ (iii). For and , let . By (ii), we have that whenever . By Lemma 5.5, there is an additive polynomial of the form
[TABLE]
such that for every . In particular, for all with . Hence, by Proposition 5.4, . That is, (iii) holds.
(iii) ⇒ (i). Let . Let
[TABLE]
By Theorem 1.9, is well distributed if and only if is the full “torus” . By (iii),
[TABLE]
for every . But is well-distributed (in particular, it is dense) mod (see [BL16, Theorem 0.1]), so . Thus, is good for irrational equidistribution. ∎
6. Partition regularity of polynomial equations over finite fields
In this section, we deduce additional applications of Theorem 1.10 to partition regularity of polynomial equations over finite fields. As a first step, we observe the following criterion for the existence of solutions to polynomial equations of the form over finite fields:
Proposition 6.1.
Let be nonconstant polynomials with . Let and . Let , and let . Then for any and any ,
[TABLE]
where . In particular, if is sufficiently large, then
[TABLE]
Remark 6.2.
The main term, , expresses an asymptotic equidistribution property. There are choices of and possible values that are free from obvious algebraic obstructions to solvability of . Proposition 6.1 states that, asymptotically, the polynomial takes on each such value of with roughly equal frequency.
Proof.
For any prime power , any , and any functions , let denote the number of solutions to the equation . Our goal is to show
[TABLE]
for .
For each , let be an additive polynomial with .
Claim: .
Let . Then
[TABLE]
Similarly,
[TABLE]
Hence, by the Cauchy–Schwarz inequality,
[TABLE]
Now, by Theorem 1.10,
[TABLE]
Finally, for each , the polynomial equation has at most solutions , so . Putting everything together,
[TABLE]
as claimed.
Applying the claim also to and , we obtain the estimate
[TABLE]
Let , . Then is a group homomorphism with image . Therefore, is constant in , so
[TABLE]
∎
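The equidistribution phenomenon described in Remark 6.2 can be observed numerically in a toy case. The sketch below is only an illustration under stated assumptions: it works over a prime field rather than a general finite field, it assumes a two-variable equation of the shape P(x) + Q(y) = u, and the helper names `poly_eval` and `solution_counts` are hypothetical. It counts, for each target u, the pairs (x, y) solving the equation, so that the counts can be compared with the heuristic value p.

```python
from collections import Counter


def poly_eval(coeffs, x, p):
    """Evaluate sum(coeffs[i] * x**i) modulo the prime p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def solution_counts(P, Q, p):
    """Return a list whose entry u is #{(x, y) in F_p^2 : P(x) + Q(y) = u}.

    P and Q are coefficient lists (constant term first). The equation shape
    P(x) + Q(y) = u is an assumption made for this illustration.
    """
    counts = Counter()
    for x in range(p):
        px = poly_eval(P, x, p)
        for y in range(p):
            counts[(px + poly_eval(Q, y, p)) % p] += 1
    return [counts[u] for u in range(p)]


if __name__ == "__main__":
    p = 7
    square = [0, 0, 1]  # the polynomial x^2
    counts = solution_counts(square, square, p)
    # Each nonzero target is hit close to p times; the total is always p^2.
    print(counts, sum(counts) == p * p)
```

In this small example the counts deviate from p by a bounded amount, which is consistent with a main term plus a lower-order error; it says nothing about the precise error term in Proposition 6.1.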
Problems in the vein of Proposition 6.1 counting solutions to polynomial equations over finite fields are well-studied. For instance, if the equation defines a geometrically irreducible variety, then a theorem of Lang and Weil [LW54] states that the number of solutions is equal to . The class of polynomials handled by Proposition 6.1 is very restricted, and the error term in Proposition 6.1 is in general much weaker than what can be obtained with the use of algebraic geometry. However, the elementary method of proof allows us to avoid any irreducibility assumption and is more flexible for combinatorial enhancements, such as the following Ramsey-theoretic result, restated from the introduction:
See 1.19
Our proof of Theorem 1.19 combines Theorem 1.10 with tools from the theory of Loeb measures on ultraproduct spaces. We will need the following generalization of Theorem 1.10, which in fact comes as an easy consequence of the key estimate on character sums in Proposition 3.2:
Proposition 6.3.
Let be a nonconstant polynomial of degree and derivational degree . Write . Let and . Then for any , any , and any ,
[TABLE]
where .
Proof.
Expand as a Fourier series:
[TABLE]
Then for , we have
[TABLE]
For each , Proposition 3.2 gives the bound
[TABLE]
Therefore, by Parseval’s identity,
[TABLE]
∎
The relevant constructions for employing measure theory on ultraproducts are summarized as follows:
Definition 6.4.
- An ultrafilter on is a collection of nonempty subsets of such that:
  - if , then ;
  - for any , either or .
  The ultrafilter is principal if for some and non-principal otherwise. The space of ultrafilters is denoted .
- Given and a family of sets , the ultraproduct is the set
[TABLE]
  where is the equivalence relation defined by if and only if .
- Given and a sequence taking values in a compact Hausdorff space , the limit of along is defined to be the unique point such that for any neighborhood of , one has . (Such a point exists by compactness and is unique by the Hausdorff property.) The limit of along is denoted by .
- Let , and let be a family of probability spaces. Let .
  - An internal set is a set of the form with .
  - The Loeb -algebra is the -algebra on generated by the algebra of internal sets.
  - The Loeb measure is the unique probability measure on with the property
[TABLE]
    for any internal set .
The main property of the Loeb measure that we will use is the following version of Fubini’s theorem:
Proposition 6.5 (cf. [K77], Theorem 1.12).
Let and be sequences of finite sets. Let be a non-principal ultrafilter. Let and . Let be a bounded Loeb-measurable function. Then
- (1) for any , the function is Loeb-measurable on ;
- (2) the function is Loeb-measurable on ; and
- (3)
[TABLE]
Remark 6.6.
Proposition 6.5 does not follow from the standard version of Fubini’s theorem. The subtlety lies in the structure of the Loeb -algebra on the product space : there are internal subsets of that cannot be approximated by Boolean combinations of Cartesian products of internal subsets of and (on the finitary level, this corresponds to approximating subsets of by products of boundedly many subsets of and ). Therefore, the function need not be measurable with respect to the product of the Loeb -algebras on and . Nevertheless, Proposition 6.5 shows that shares important features with the product measure .
Proof of Theorem 1.19.
Let . Suppose for contradiction that there are -colorings with such that , where
[TABLE]
is the collection of monochromatic solutions to the equation .
Now we define a limit object associated with this sequence of colorings. Fix a non-principal ultrafilter on . Let be the pseudo-finite field , let , and let . Denote by the Loeb measure on obtained by equipping with the normalized counting measure. For any , we denote the Loeb measure on by (not to be confused with the product measure on ). Let and . Finally, let be the Loeb measure on obtained from the normalized counting measures on .
Claim 1: .
Let . For , let . Then , so for some , since is an ultrafilter. By the definition of the sets , it follows that . This proves the claim.
Arguing as in the proof of Claim 1 above, one can check that is the set of monochromatic solutions to the equation with respect to the coloring .
Claim 2: .
We have constructed as an internal set, so by the definition of the Loeb measure,
[TABLE]
Now, by Proposition 6.1, for some . By assumption, . Hence, , so , since is non-principal.
Let . Note that
[TABLE]
where . In particular, if and only if .
Without loss of generality, we may assume that for and for , for some . Let . Note that . Let be the map for , . For each and , let , and let . Also let be the map for and . Now, since is good for irrational equidistribution, we have
[TABLE]
by Proposition 6.3 and the Cauchy–Schwarz inequality. Hence,
[TABLE]
The steps are justified as follows. Step (1) is a direct application of Proposition 6.5. The equality (2) comes from the definition of the Loeb measure . In step (3), we have taken the limit of both sides of (6.1) along . The inequality (4) holds for each by the Cauchy–Schwarz inequality:
[TABLE]
Finally, (5) follows from the definition of the Loeb measure .
Thus,
[TABLE]
Let . Since the set has Loeb measure zero, .
For , let such that . Noting that
[TABLE]
it follows that
[TABLE]
The set lies in the image of by definition, so taking the inverse image under ,
[TABLE]
Therefore,
[TABLE]
This final inequality contradicts Claim 2, so the theorem follows by reductio ad absurdum. ∎
The proof of Theorem 1.19 is sufficiently flexible as to allow for many variations. We note that in the proof of Theorem 1.19, we do not use the full strength of the equidistribution assumption on . Carefully following each step in the proof reveals the following characterization of polynomials for which the conclusion of Theorem 1.19 holds:
Corollary 6.7.
Let . Write with distinct with and additive polynomials. For , let and . The following are equivalent:
- (i) there exists such that for all ;
- (ii) for any nonconstant polynomial and any , there exist and such that for any and any -coloring , there are at least monochromatic solutions to the equation .
Remark 6.8.
Condition (i) in Corollary 6.7 holds for intersective ; see the first two lines in the proof of Corollary 1.12.
Proof.
(i) ⇒ (ii). The proof of Theorem 1.19 goes through with only minor changes, which we will now describe. Equation (6.1) must be replaced with
[TABLE]
This follows from Proposition 6.3 together with the assumption (i) to eliminate the presence of the constant term . We then note that the Cauchy–Schwarz inequality implies
[TABLE]
for each . The steps (3) and (4) in the proof of Theorem 1.19 may be replaced by the above considerations.
Finally, by Proposition 6.1, the set has cardinality , where is the subgroup generated by and . By assumption, is nonconstant, so , where . Hence, . These upper and lower bounds on the cardinality of allow the remainder of the argument to be carried out without difficulty.
(ii) ⇒ (i). We prove the contrapositive. Suppose , and take . Then for any , we have
[TABLE]
so does not have any solutions over , much less monochromatic solutions for an arbitrary coloring of . ∎
Another feature of the proof of Theorem 1.19 is the following. Taking the ultraproduct of a sequence of finite fields with characteristic growing to infinity, the same method shows that for any nonconstant polynomials , the equation is partition regular over all fields of sufficiently high characteristic. This follows by noting that and will be nonconstant and separable (hence good for irrational equidistribution; see Example 1.16(1) above) once the characteristic exceeds the degrees of and and the size of some nonconstant coefficient.
In the special case , Theorem 1.19 can be seen as a polynomial version of Schur’s theorem over finite fields. Indeed, the classical theorem of Schur asserts that the equation is partition regular over . We have just established partition regularity of the equation over finite fields whenever satisfies the equidistribution assumption in item (i) of Corollary 6.7. While the condition (i) in Corollary 6.7 depends on the characteristic , it is automatically satisfied for polynomials with zero constant term. Hence, Corollary 1.20 holds.
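For orientation, the classical theorem of Schur quoted above can be checked by brute force in the two-color case. The following sketch is only an illustration over the positive integers and plays no role in the finite-field argument. The function names are hypothetical, and the code assumes the usual convention that x = y is allowed in x + y = z.

```python
from itertools import product


def has_mono_schur_triple(coloring):
    """Return True if some x + y = z has x, y, z all of the same color.

    coloring[i] is the color of the integer i + 1, and x = y is allowed.
    """
    n = len(coloring)
    for x in range(1, n + 1):
        for y in range(x, n + 1):
            z = x + y
            if z <= n and coloring[x - 1] == coloring[y - 1] == coloring[z - 1]:
                return True
    return False


def every_coloring_has_triple(n, r):
    """Check all r-colorings of {1, ..., n} exhaustively (r**n cases)."""
    return all(has_mono_schur_triple(c) for c in product(range(r), repeat=n))


if __name__ == "__main__":
    # {1, 4} and {2, 3} are both sum-free, so n = 4 admits a good 2-coloring,
    # while every 2-coloring of {1, ..., 5} contains a monochromatic triple.
    print(every_coloring_has_triple(4, 2), every_coloring_has_triple(5, 2))
```

The exhaustive search grows like r**n, so it is only practical for very small parameters.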
The equation is often not partition regular (and may not even be solvable) over . A key fact leveraged in the proof of Theorem 1.19 is that polynomials take on a positive proportion of values in finite fields, something that is far from the case in . It remains an interesting and difficult open problem, asked by Erdős and Graham in [EG80], whether the Pythagorean equation is partition regular over . (This was settled with a computer-assisted proof in the case of 2-colorings in [HKM16] but is wide open for 3 or more colors.)
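The contrast between value sets in finite fields and in the integers can be made concrete. Over a field, a polynomial of degree d takes each value at most d times, so its image in a field with p elements has at least p/d elements. By comparison, the number of perfect squares up to N is about the square root of N. The sketch below assumes a prime field and uses the hypothetical helper names `image_size` and `squares_up_to`.

```python
from math import isqrt


def poly_eval(coeffs, x, p):
    """Evaluate sum(coeffs[i] * x**i) modulo the prime p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc


def image_size(coeffs, p):
    """Number of distinct values the polynomial takes on F_p (p prime)."""
    return len({poly_eval(coeffs, x, p) for x in range(p)})


def squares_up_to(n):
    """Number of perfect squares in {1, ..., n}."""
    return isqrt(n)


if __name__ == "__main__":
    p = 101
    square = [0, 0, 1]  # the polynomial x^2, of degree 2
    # Proportion of F_p covered by squares versus proportion of {1..N}.
    print(image_size(square, p) / p, squares_up_to(10**6) / 10**6)
```

For x^2 and an odd prime p the image has exactly (p + 1)/2 elements, so the proportion stays near 1/2, whereas the proportion of squares in {1, ..., N} tends to zero.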
Some comments are in order on the use of ultraproducts in the proofs of the aforementioned partition regularity results. The basic strategy we have taken is to discard those colors that have zero Loeb measure in the ultraproduct and then to use recurrence along the polynomial to find the desired points with . One may be tempted to carry out this strategy in purely finitary terms, avoiding the use of ultraproducts and Loeb measure. Unfortunately, this does not work (at least in its most straightforward implementation). The following discussion illuminates the issues that arise. Fix a polynomial that is good for irrational equidistribution. For simplicity, we will consider . Let . Suppose is large and an -coloring is given. We wish to use a function as a cutoff for distinguishing “large” from “small” color classes. That is, we will consider a color class large if and small if . Without loss of generality, we may assume are large and are small for some . (The requirement that guarantees that at least one color class is large.) We now proceed as in the proof of Theorem 1.19, using “large” as a replacement for having positive Loeb measure. Let . Proposition 6.3 gives the bound
[TABLE]
where . Since for each , we deduce that
[TABLE]
where in the last step we have used the bound . In order to complete the argument, we want to find satisfying . To that end, one would like to show
[TABLE]
The total size of the small color classes is bounded by
[TABLE]
The goal, then, is to choose the function so that
[TABLE]
Dividing by , this reduces to the inequality
[TABLE]
But for , this requires , which violates the condition that .
Working with the ultraproduct allows us to replace “small” with measure zero. This is crucial, as we have just seen that “small” contributions in the finitary setting may accumulate and overtake individual “large” terms. In contrast, finite unions of measure zero sets remain of measure zero. However, our infinitary methods come at a cost: we are unable to provide any quantitative control on the values and appearing in the statement of Theorem 1.19 and related corollaries. It is therefore an interesting problem to obtain a purely finitary proof of Theorem 1.19 with effective bounds.
Acknowledgements
The first author was supported by the National Science Foundation under Grant No. DMS-1926686. We thank Peter Sarnak for pointing us to the work of Lang and Weil [LW54] and for insightful discussions that helped shape Section 6.
References
- [BB23] Vitaly Bergelson and Andrew Best. The Furstenberg-Sárközy theorem and asymptotic total ergodicity phenomena in modular rings. J. Number Theory 243 (2023) 615–645.
- [BBI21] Vitaly Bergelson, Andrew Best, and Alex Iosevich. Sums of powers in large finite fields: a mix of methods. Amer. Math. Monthly 128 (2021) 701–718.
- [BL16] V. Bergelson and A. Leibman. A Weyl-type equidistribution theorem in finite characteristic. Adv. Math. 289 (2016) 928–950.
- [CGS12] Péter Csikvári, Katalin Gyarmati, and András Sárközy. Density and Ramsey type results on algebraic equations with restricted solution sets. Combinatorica 32 (2012) 425–449.
- [CLP17] Ernie Croot, Vsevolod F. Lev, and Péter Pál Pach. Progression-free sets in $\mathbb{Z}_{4}^{n}$ are exponentially small. Ann. of Math. (2) 185 (2017) 331–337.
- [DLMS23] Sebastián Donoso, Anh N. Le, Joel Moreira, and Wenbo Sun. Additive averages of multiplicative correlation sequences and applications. J. Analyse Math. 149 (2023) 719–761.
- [EG80] P. Erdős and R. L. Graham. Old and New Problems and Results in Combinatorial Number Theory. L’Enseignement Mathématique 28 (Université de Genève, Geneva, 1980).
- [F77] Harry Furstenberg. Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions. J. Analyse Math. 31 (1977) 204–256.
