Ergodic Theorems for Nonconventional Arrays and an Extension of the Szemerédi Theorem
Yuri Kifer

TL;DR
This paper extends ergodic theorems and Szemerédi's theorem to nonconventional arrays involving polynomial iterates, establishing convergence results and combinatorial implications for subsets of integers with positive density.
Contribution
It introduces new ergodic theorems for nonconventional arrays with polynomial iterates and extends Szemerédi's theorem to these settings, including multidimensional cases.
Findings
Convergence of averages for weakly mixing transformations with linear polynomial iterates.
Extension of Szemerédi's theorem ensuring structured subsets within dense integer sets.
Results for multiple commuting transformations generalizing classical combinatorial theorems.
Institute of Mathematics, The Hebrew University, Jerusalem 91904, Israel
Abstract.
The paper is primarily concerned with the asymptotic behavior as $N\to\infty$ of averages of nonconventional arrays having the form $N^{-1}\sum_{n=1}^N\prod_{j=1}^\ell T^{P_j(n,N)}f_j$ where $f_j$'s are bounded measurable functions, $T$ is an invertible measure preserving transformation and $P_j$'s are polynomials of $n$ and $N$ taking on integer values on integers. It turns out that when $T$ is weakly mixing and $P_j(n,N)=p_jn+q_jN$ are linear or, more generally, have the form $P_j(n,N)=P_j(n)+Q_j(N)$ for some integer valued polynomials $P_j$ and $Q_j$ then the above averages converge in $L^2$ but for general polynomials $P_j$ of both $n$ and $N$ the $L^2$ convergence can be ensured even in the case $\ell=1$ only when $T$ is strongly mixing. Studying also weakly mixing and compact extensions and relying on Furstenberg's structure theorem we derive an extension of Szemerédi's theorem saying that for any subset of integers $\Lambda$ with positive upper density there exists a subset $\mathcal N_\Lambda$ of positive integers having uniformly bounded gaps such that for $N\in\mathcal N_\Lambda$ and at least $\varepsilon N$, $\varepsilon>0$, of $n$'s all numbers $p_jn+q_jN$, $j=1,\dots,\ell$, belong to $\Lambda$. We obtain also a version of these results for several commuting transformations which yields a corresponding extension of the multidimensional Szemerédi theorem.
Key words and phrases:
Szemerédi theorem, multiple recurrence, nonconventional averages, triangular arrays
2010 Mathematics Subject Classification:
Primary: 37A30 Secondary: 37A45, 28D05
A part of this work was done during the author’s visit to University of Pennsylvania in Fall of 2016 as the Bogen family visiting professor.
1. Introduction
In 1975 Szemerédi proved the conjecture of Erdős and Turán saying that any set of integers with positive upper density contains arbitrarily long arithmetic progressions. In 1977 Furstenberg [10] published an ergodic theory proof of this result which turned out to be a corollary of a multiple recurrence statement for measure preserving transformations.
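Namely, let $(X,\mathcal B,\mu)$ be a probability space, $T:X\to X$ be an invertible $\mu$-preserving transformation and $A\in\mathcal B$ be a set of positive $\mu$-measure. Furstenberg proved that in these circumstances for any positive integer $k$,
$$\liminf_{N\to\infty}\frac 1N\sum_{n=1}^N\mu\Big(\bigcap_{j=0}^{k}T^{-jn}A\Big)>0 \tag{1.1}$$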
which, in fact, implies existence of infinitely many arithmetic progressions in any set of integers having positive upper density (for a nice exposition of this result see [14]).
An important part of the proof of (1.1) was to show that
$$\lim_{N\to\infty}\frac 1N\sum_{n=1}^N\prod_{j=1}^kT^{jn}f_j=\prod_{j=1}^k\int f_j\,d\mu\quad\text{in } L^2(X,\mu) \tag{1.2}$$
(where $T^nf(x)=f(T^nx)$) provided $T$ is a measure preserving weakly mixing invertible transformation and $f_j$'s are bounded measurable functions. In fact, (1.1) required more general results concerning weak mixing and compact extensions together with a structure theorem describing all possible extensions. Observe that in [3] the convergence (1.2) for weakly mixing transformations was extended from the powers $jn$ to arbitrary essentially distinct polynomials $p_j(n)$ (i.e. having nonconstant pairwise differences) taking on integer values on integers.
In this paper we consider the averages of the form
$$\frac 1N\sum_{n=1}^N\prod_{j=1}^\ell T^{P_j(n,N)}f_j \tag{1.3}$$
where $f_j$'s are bounded measurable functions, $T$ is an invertible measure preserving transformation and $P_j(n,N)$, $j=1,\dots,\ell$, are essentially distinct polynomials of $n$ and $N$ taking on integer values on integers. It is customary in probability to call sums whose summands depend on the number of summands (triangular) arrays, and it seems appropriate to use the same name for the sums in (1.3) too, while the term "nonconventional" comes from [12].
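First, we study the linear case $P_j(n,N)=p_jn+q_jN$ where $p_j$'s are distinct and $q_j$'s are arbitrary integers. It turns out that under the weak mixing assumption on $T$,
$$\lim_{N\to\infty}\frac 1N\sum_{n=1}^N\prod_{j=1}^\ell T^{p_jn+q_jN}f_j=\prod_{j=1}^\ell\int f_j\,d\mu\quad\text{in } L^2(X,\mu). \tag{1.4}$$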
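In particular, when $\ell=2k$, $p_j=j$, $q_j=0$ for $j=1,\dots,k$ and $p_j=k-j$, $q_j=j-k$ for $j=k+1,\dots,2k$ the left hand side of (1.4) takes on the following symmetric form
$$\lim_{N\to\infty}\frac 1N\sum_{n=1}^N\prod_{j=1}^kT^{jn}f_j\,T^{j(N-n)}f_{k+j}. \tag{1.5}$$
It is known by [19] that when $q_j=0$ for all $j$ then the left hand side of (1.4) still converges in $L^2$ also without the weak mixing assumption on $T$ but not necessarily to the right hand side of (1.4). On the other hand, a simple example shows that for arbitrary $q_j$'s there is no convergence of the left hand side of (1.4) if $T$ is not weakly mixing. Indeed, take $k=1$ in (1.5) and let $T$ be the rotation of the unit circle by one half of it while $f_1=f_2=\mathbb I_\Gamma$ is the indicator of an arc $\Gamma$ having length less than one half of the circle. Then $T^nf_1\,T^{N-n}f_2=\mathbb I_{T^{-n}\Gamma\cap T^{-(N-n)}\Gamma}$ and this expression equals the indicator of $\Gamma$ or the indicator of $T\Gamma$ (depending on the parity of $n$) if $N$ is even while it equals zero otherwise, since $\Gamma\cap T\Gamma=\emptyset$. Thus, the averages (1.5) will be equal to $\frac 12(\mathbb I_\Gamma+\mathbb I_{T\Gamma})$ for each even $N$ and 0 for each odd $N$.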
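The parity effect in this example can be checked numerically. The following is a minimal sketch, assuming the circle is modelled as $[0,1)$ with $Tx=x+\frac12 \bmod 1$, the arc is $[0,\frac{3}{10})$, and the averaged product is $f(T^nx)\,f(T^{N-n}x)$; the function names and the arc length are illustrative choices, not taken from the paper.

```python
from fractions import Fraction

HALF = Fraction(1, 2)  # rotation by one half of the circle R/Z


def in_arc(x, length):
    """Indicator of the arc [0, length) on the circle R/Z (exact arithmetic)."""
    return 1 if (x % 1) < length else 0


def symmetric_average(x, N, length=Fraction(3, 10)):
    """(1/N) * sum_{n=1}^{N} f(T^n x) * f(T^(N-n) x) with T x = x + 1/2 mod 1.

    Here f is the indicator of an arc of length < 1/2 (an assumed concrete choice).
    """
    total = sum(
        in_arc(x + n * HALF, length) * in_arc(x + (N - n) * HALF, length)
        for n in range(1, N + 1)
    )
    return Fraction(total, N)


if __name__ == "__main__":
    x = Fraction(1, 10)  # a point inside the arc
    for N in (10, 11, 1000, 1001):
        print(N, symmetric_average(x, N))
```

For a point in the arc or in its rotated copy the value alternates between 1/2 (even $N$) and 0 (odd $N$), so the sequence of averages has no limit.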
We complement the above study by considering weak mixing and compact extensions and relying on the structure theorem from [10] and [11] we conclude that for any invertible measure preserving transformation , numbers as above and a set of positive measure there exists a subset of positive integers with uniformly bounded gaps, called syndetic set, such that
[TABLE]
while this does not hold true, in general, if we take the limit over all positive integers which can be seen from the above example. In fact, we also show that (1.6) follows by a shorter argument relying on recent advanced results from [6] and [2] concerning convergence along Følner sequences in multidimensional multiple recurrence results but our direct proof still has value since, in particular, it concentrates attention on nonconventional arrays and the convergence results like (1.4) cannot be derived from the above references.
By the standard Furstenberg’s argument (1.6) implies an extended version of Szemerédi’s theorem saying that for any subset of integers with positive upper density there exists a syndetic subset such that for all and at least of ’s, all numbers belong to . We obtain also more general results concerning families of commuting transformations studying the limits of
[TABLE]
where and are as above.
If we consider in (1.3), where ’s are essentially distinct and ’s are arbitrary polynomials, then the convergence in of nonconventional averages (1.3) to the product of integrals can be established under the weak mixing assumption. On the other hand, already for and weak mixing is not sufficient, in general, for the convergence of averages (1.3) though strong mixing suffices here. For and general polynomials we show convergence of the expression (1.3) assuming strong -mixing of .
Acknowledgement.
The author is grateful to anonymous referees for many helpful suggestions which led to improvements of the original version of this paper.
2. Preliminaries and main results
Let be a separable probability space and be an invertible measure preserving transformation. In studying polynomial nonconventional averages (1.3) we start with the linear case . Let be bounded measurable functions on . The example described in Introduction shows that, in general, the limit
[TABLE]
does not exist in the -sense. Still, we will see that the limit (2.1) exists in the sense if is weakly mixing which means that the product transformation on is ergodic (see, for instance, [11]). Thus we have the following ergodic theorem for nonconventional arrays.
2.1 Theorem.
Suppose that an invertible transformation is weakly mixing, are bounded measurable functions and are integers such that ’s are distinct (ordered without loss of generality as ) and ’s are arbitrary. Then
[TABLE]
The condition of Theorem 2.1 that are distinct is important for (2.2). Indeed, let and . Then
[TABLE]
which does not converge to zero as , in general, unless is (strongly) mixing (and ) while under weak mixing only convergence outside of a set of ’s having zero density can be ensured. We observe (as pointed out by the referee) that Theorem 2.1 actually follows from Theorem 3.2 in the recent paper [20] though the motivation and goals of the latter paper seem to be different from ours. In fact, we will study convergence in a more general situation of weak mixing extensions and, in addition, will consider also compact extensions, which together with the structure theorem from [10] and [11] will produce the following result.
2.2 Theorem.
Let be integers such that if and . Then for any with there exists an infinite subset of positive integers with uniformly bounded gaps such that
[TABLE]
As the example in the Introduction shows, this statement does not hold true, in general, if we take over all . On the other hand, if for all then (2.1) was proved in [10] (see also [14]) with over all and it was shown there how such a result yields the Szemerédi type theorem. Recall briefly the latter argument. Let be the space of sequences, be the left shift and consider the special sequence where if and only if with being a subset of integers with a positive upper density (called also the upper Banach density), i.e.,
[TABLE]
for some sequence of intervals with as , denoting by the number of elements in a set . Take the closure in of then any weak limit of the sequence of measures (where is the unit mass at ) is a -invariant probability measure on and if then .
It is easy to see that contains an arithmetic progression of length if and only if is nonempty for some . More generally, contains all numbers for some if and only if is nonempty. Thus, Theorem 2.2 yields the following result.
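The correspondence between progressions in the set and intersections of shifted cylinder sets can be checked on a finite window. The sketch below rests on assumptions about notation that the text above does not preserve: the set is called `Lambda`, its indicator sequence `omega`, the left shift acts by $(T\omega)_i=\omega_{i+1}$, the cylinder is $A=\{\omega:\ \omega_0=1\}$, and the intersection runs over $j=0,\dots,k-1$. All function names are hypothetical.

```python
def shift(omega, a):
    """Left shift T^a on a finite 0/1 window: (T^a omega)_i = omega_{i+a}."""
    return omega[a:]


def in_A(omega):
    """Membership in the cylinder A = {omega : omega_0 = 1}."""
    return len(omega) > 0 and omega[0] == 1


def in_intersection(omega, n, k):
    """omega lies in the intersection of T^{-jn} A, j = 0..k-1,
    exactly when T^{jn} omega lies in A for every such j."""
    return all(in_A(shift(omega, j * n)) for j in range(k))


def has_progression(Lambda, k, limit):
    """True if some shift T^a of the indicator sequence of Lambda (restricted
    to [0, limit)) lies in the intersection for some step n >= 1, i.e. if
    a, a+n, ..., a+(k-1)n all belong to Lambda."""
    omega = [1 if m in Lambda else 0 for m in range(limit)]
    return any(
        in_intersection(shift(omega, a), n, k)
        for a in range(limit)
        for n in range(1, limit)
    )


if __name__ == "__main__":
    print(has_progression({1, 4, 7, 10}, 4, 12))
    print(has_progression({0, 1, 3, 4, 9, 10, 12, 13}, 3, 14))
```

The second set consists of the integers below 14 whose base 3 expansion uses only the digits 0 and 1, a standard example without three-term progressions.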
2.3 Corollary.
Let be a subset of nonnegative integers with a positive upper density and be integers satisfying conditions of Theorem 2.2. Then there exist and an infinite set of positive integers with uniformly bounded gaps such that for any the interval contains not less than integers with the property that for some ,
[TABLE]
In particular, if , for and for then for at least integers in the interval the set contains arithmetic progressions with length of both step and of step .
Clearly, the above corollary does not hold true, in general, if we replace by all positive integers. Indeed, let be the set of all even numbers then and cannot both belong to if is odd since then and cannot be both even.
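The even-numbers example can be explored by brute force. The following is a rough sketch, assuming the simplest instance of the corollary in which the two required numbers are $a+n$ and $a+(N-n)$ for some base point $a$; the threshold `eps`, the search window for `a`, and all names are hypothetical choices made for illustration only.

```python
def is_in_lambda(m):
    """Lambda = the set of all even integers."""
    return m % 2 == 0


def good_n_count(N, window=50):
    """Number of n in [1, N] for which some a in [0, window) has both
    a + n and a + (N - n) in Lambda."""
    return sum(
        1
        for n in range(1, N + 1)
        if any(is_in_lambda(a + n) and is_in_lambda(a + N - n) for a in range(window))
    )


def max_gap(sorted_ints):
    """Largest difference between consecutive elements of a sorted list."""
    return max(b - a for a, b in zip(sorted_ints, sorted_ints[1:]))


eps = 0.5  # assumed proportion of good n's
good_N = [N for N in range(1, 41) if good_n_count(N) >= eps * N]

if __name__ == "__main__":
    print(good_N)
    print(max_gap(good_N))
```

In this toy case the admissible values of $N$ are exactly the even ones, a set with gaps bounded by 2, while no odd $N$ admits a single good $n$ because $(a+n)+(a+N-n)=2a+N$ is then odd.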
Next, we will discuss an extension of the above results to families of commuting transformations, which will yield also a multidimensional version of Corollary 2.3. Let be a multiplicative free finitely generated abelian group acting on by measure -preserving transformations which are necessarily invertible. Any such group is isomorphic to a -dimensional integer lattice group. Let be bounded measurable functions on . As in the case of one transformation, in general, the limit
[TABLE]
does not exist if over all . Nevertheless, we will see that the limit (2.6) exists in the sense if the abelian group is totally weak mixing, i.e. it consists of weakly mixing transformations with the only exception of the identity.
2.4 Theorem.
Suppose that distinct and different from the identity (id) transformations belong to a totally weak mixing free finitely generated abelian group acting on by measure preserving transformations. Let be invertible -preserving transformations of , which commute with each other and with . Then for any bounded measurable functions ,
[TABLE]
where .
Considering weak mixing and primitive extensions we will obtain the following generalization of Theorem 2.2.
2.5 Theorem.
Let where are distinct and different from the identity id of while are any transformations from . Then for any with there exists an infinite subset of positive integers with uniformly bounded gaps such that
[TABLE]
where we set .
Clearly, if we set and then we arrive back at the setup of Theorem 2.2. For equal the identity (2.9) was proved in [13] with but our proof will follow more closely Chapter 7 of [11]. Similarly to the one transformation case Theorem 2.5 yields an extension of a multidimensional version of the Szemerédi theorem. Recall, the notion of the upper (Banach) density of a set . For any two vectors such that denote by the parallelepiped . A set is said to have positive upper (Banach) density if there exists a sequence of parallelepipeds with satisfying and such that
[TABLE]
where, again, denotes the number of points in a set .
Since the group in Theorem 2.5 is isomorphic to we can identify the actions of and with additions of some vectors and . For any ordered finite set , and we set and . Next, if and are two ordered finite sets then we write . Now Theorem 2.5 yields the following extension of the multidimensional Szemerédi theorem.
2.6 Corollary.
Let be a subset of with a positive upper (Banach) density and let be two ordered sets of vectors from such that are all distinct and non zero. Then there exist and an infinite set of positive integers with uniformly bounded gaps such that for any the interval contains not less than integers such that for some ,
[TABLE]
Corollary 2.6 follows from Theorem 2.5 similarly to the one transformation case. Namely, we consider the action of on by for any . Again, we take to be the closure in of the orbit of the special sequence if and only if and an -invariant measure comes as a weak limit as of the measures where are the same as in (2.9).
The proofs of the above results proceed similarly to [14] and [11], and so we will be trying to make a compromise between keeping the paper relatively self-contained and still avoiding too many repetitions of arguments from [14] and [11]. Though, of course, Theorem 2.2 is a particular case of Theorem 2.5, in order to facilitate the reading, we will consider first the one transformation case and then pass to the case of commuting transformations.
As we mentioned in the Introduction, it is possible to give a shorter argument yielding Theorems 2.2 and 2.5, which will be presented in Section 5. This argument relies on quite general results from the recent paper [2]. In fact, this argument together with [6] yields Theorem 2.1 with linear terms replaced by arbitrary polynomials taking on integer values for integer pairs and such that for any integer there exist and with divisible by for each . The results in [6] and [2] rely on advanced machinery developed with the purpose to derive convergence of nonconventional averages in various situations. The direct proof presented here, which proceeds along the lines of the original proof in [14] and [11], still seems to be useful, in particular, for focusing attention on limiting behavior of nonconventional arrays, which is a somewhat different point of view in comparison to other research on multiple recurrence problems and since Theorems 2.2 and 2.5 do not follow from [6] and [2].
2.7 Remark.
As we have seen, the limit in Theorem 2.1 does not exist, in general, without the weak mixing assumption but it is plausible that the limit may exist over syndetic subsequences of ’s. It would be interesting also to obtain some uniform versions of Theorems 2.2 and 2.5 in the spirit of [4]. It would be also natural to find most general conditions, which ensure almost everywhere convergence of averages of nonconventional arrays though this question is not completely settled even for standard nonconventional averages (i.e. without dependence of summands on ). Finally, we observe that it may be interesting to obtain a result of the type of Corollary 2.3 for the set of primes in place of a set of positive upper density extending to this situation the main result of [15]. In this case relevant sets of ’s will probably have gaps containing only bounded number of primes.
Next, we consider averages of nonconventional arrays (1.3) with higher degree polynomials $P_j(n,N)$. When we can separate the dependencies on $n$ and $N$ then, applying the "PET-induction" from [5] for polynomials in $n$ and, essentially, treating $N$ as a constant there, we will obtain in Section 6 the following result.
2.8 Theorem.
Let be different from identity transformations belonging to a totally weak mixing finitely generated free abelian group acting on by measure preserving transformations and be invertible -preserving transformations of which commute with each other and with . Furthermore, let be polynomials taking on integer values on integers and suppose that the expressions
[TABLE]
and the expressions
[TABLE]
depend nontrivially on (i.e. that they are nonconstant maps from to ). In addition, let be arbitrary functions of taking on integer values on integers. Then, for any bounded measurable functions
[TABLE]
If all ’s and ’s coincide with one transformation then (2.11) becomes
[TABLE]
with ’s taking the form where ’s are nonconstant essentially distinct polynomials of and ’s are functions of , both taking on integer values on integers. It turns out that for general polynomials of and weak mixing may not be enough for the convergence in (1.3). In Section 6 we will show, employing a version of a spectral argument suggested to us by Benji Weiss, that already the averages
[TABLE]
do not converge in as , in general, if is only weak mixing. Still, strong mixing of ensures convergence in for this example. More generally, we will prove the following result where we rely on the notion of strong -mixing, which means that
[TABLE]
for any measurable sets .
2.9 Theorem.
Let be nonconstant essentially distinct polynomials of and (i.e. is not a constant identically) taking on integer values on integers and nontrivially depending on (i.e. is not just a polynomial of ). If is a strongly -mixing invertible transformation of then (2.12) holds true for any bounded measurable functions .
Observe that both conditions that the polynomials are essentially distinct and nontrivially depend on are important for Theorem 2.9 to hold true. As to the first condition consider which by the ergodic theorem converges as to which usually differs from the product of integrals of and . As to the second condition we can consider which does not converge at all as unless is a constant -almost everywhere. It would be natural to try to show that for strong mixing (i.e. 2-mixing) is not enough, in general, for Theorem 2.9 to hold true but this is not easy since then we would have to construct an example of a 2-mixing but not -mixing transformation which is a version of the old open problem attributed to Rokhlin.
We observe that such dynamical systems as topologically mixing subshifts of finite type, Axiom A diffeomorphisms and expanding transformations considered with an invariant Gibbs measure constructed by a Hölder continuous function (potential) are strong mixing of all orders so the above theorem is applicable for them. This is also true for the Gauss map mod 1, considered with its Gauss invariant measure , as well as some other maps of the interval. Actually, mixing of all orders follows from the property called in probability -mixing (or strong mixing) and the above dynamical systems have this property (and even stronger property called -mixing with exponential speed, see, for instance, [7], [18] and [8]).
These notions are defined via two parameter families of -algebras , on a probability space such that if . We define also for and , for and or for and as minimal -algebras containing for all , containing for all or containing for all , respectively. Such family of -algebras is called -mixing if
[TABLE]
Now, we have the following result which is probably well known but for readers’ convenience we provide details here.
2.10 Proposition.
Suppose that is an -mixing family of -algebras on a probability space with . Let be a measure -preserving transformation such that for all . Then for any ,
[TABLE]
Proof.
First, observe that if with then applying the definition of the mixing coefficient subsequently we obtain that
[TABLE]
Next, let for some and all . Since we obtain from (2.15) that (2.14) holds true. Now, let be arbitrary. Then for each there exist and such that where denotes the symmetric difference. Since (2.14) holds true for in place of we obtain that
[TABLE]
and since is arbitrary (2.14) follows. ∎
Observe, that a typical application of the above setup is in the symbolic setup where is a sequence space, is the left shift and the -algebras are generated by the cylinder sets for which the sequence elements on places from to are fixed. This can be extended to dynamical systems having appropriate symbolic representations via, for instance, Markov partitions.
3. One transformation case
In this section we will establish Theorems 2.1, 2.2 and Corollary 2.3.
3.1. Factors and extensions
The strategy of our proof is the same as in [14]. It is based on the notions of factors and extensions. Recall, that if is a measure preserving transformation of a probability space and then is called a factor of while the latter is called an extension of . The latter factor is said to be nontrivial if contains sets of measure strictly between 0 and 1. It is often more convenient to view factors in the following equivalent way (see [14] for more details). Namely, the factor is identified with a system such that for some measurable onto map we have , and . Furthermore, disintegrates into so that and -almost everywhere (a.e.).
Next, let and let be a factor of . Following [14] we set
[TABLE]
This is essentially the conditional expectation provided is identified with . Since and then is constant on for -almost all , and so this conditional expectation can be viewed as a function on . Since we refer often to [14] we will keep the notations from there though they differ slightly from the way conditional expectations with respect to -algebras are written in probability. We will use also the following well known formulas
[TABLE]
provided and are integrable.
Fix a measure preserving system and let be a -invariant -subalgebra. If (2.3) holds true for any and all satisfying the conditions of Theorem 2.2 then we say that the action of on the factor is generalized Szemerédi (GSZ). To make this shorter we will also say in this case that the action of on is GSZ and if is identified with then this is equivalent to saying that the action of on is GSZ.
Similarly to [14] we can see that the set of factors for which is GSZ contains a maximal element and that no proper factor can be maximal. The proof of Theorem 2.2 is based on the notions of relative weak mixing and relative compact extensions of other factors, which will be defined below. We will show that if the action is GSZ for a smaller factor then it is also GSZ for a larger factor which is either relative weak mixing or relative compact with respect to the smaller factor. Considered together with the following two facts this will yield our result. First, similarly to [14] we see that if is GSZ for a totally ordered (by inclusion) family of factors (i.e. factors ) then is GSZ for (i.e. for where the latter is the minimal -algebra containing each ). Secondly, we rely on the general result from [14] saying that if is an extension of , which is not relative weak mixing, then there exists an intermediate factor between and so that is a (relative) compact extension of .
3.2. Relative weak mixing
Let be a probability space, , , and where preserves a probability measure , is measurable in and all preserve the measure . Then is measure preserving on and is called in [14] a skew product of with (while usually itself is called a skew product transformation). Set , , and . Then is called a relative weak mixing extension of if the action of on is ergodic.
3.1 Proposition.
Let be a relative weak mixing extension of and . Then for any ,
[TABLE]
and
[TABLE]
where satisfy conditions of Theorem 2.2.
Proof.
The proof proceeds similarly to Theorem 8.3 in [14]. Recall, that the conditional expectations can be viewed as functions in both and in , which is identified with , and so this conditional expectation is -measurable. Denote the assertions (3.2) and (3.3) by and , respectively, where both mean that they hold true for all relatively weak mixing extensions of and all functions on corresponding spaces.
First, observe that is obvious and will not play a role here so we can denote by any correct assertion. Next, we proceed by induction in showing that (cf. [14]),
(i) implies and
(ii) for (which is also a relative weak mixing extension of ) implies for .
We start with (ii) which is easier. If is measurable with respect to , the integrals in (3.2) have the form
[TABLE]
where still satisfy conditions of Theorem 2.2 and we use (3.1) here and that is -preserving. Thus follows from if is -measurable (assuming the induction hypothesis for all satisfying the conditions of Theorem 2.2).
It follows that writing and using that we can assume that . With this the left hand side of (3.2) takes the form
[TABLE]
where is a function on whenever is a function on (see (6.6) in [14]). By for the above limit equals
[TABLE]
Since the sum here is -measurable we can insert the conditional expectation inside of the integral concluding that the latter limit is zero since
[TABLE]
completing the proof of (ii).
In order to prove (i) we observe that
[TABLE]
It follows that it suffices to prove under the additional condition that for some we have (replacing by ).
We now have to show that for provided . Rewrite
[TABLE]
where will be chosen large but much smaller than . By the convexity of the function we have (up to ),
[TABLE]
By integration and the fact that is measure preserving,
[TABLE]
where and satisfy conditions of Theorem 2.2.
Set and observe that a pair appears in the above sums only if and then for values of we rewrite the above estimate as
[TABLE]
Inserting conditional expectation inside the integral and using for a fixed , every such that and large enough we can replace the integral term in the above inequality by
[TABLE]
Hence, we obtain
[TABLE]
Next, we estimate the integrals appearing in (3.4) by
[TABLE]
Since we obtain from for the case when , which is proved as Lemma 8.1 in [14] (where the ergodicity of by the definition of weak mixing extensions is used), that
[TABLE]
Hence, most of the terms in the right hand side of (3.4) are small provided that is large enough. Since all terms in the right hand side of (3.4) are bounded by and most of them are small, their average in (3.4) becomes arbitrarily small when and are large enough, completing the proof of Proposition 3.1. ∎
Now Theorem 2.1 is a particular case of (3.3) considering a trivial factor , i.e. such that the corresponding -algebra contains only sets of zero or full measure. As to Theorem 2.2 we will need the following corollary of Proposition 3.1.
3.2 Corollary.
Let be a relative weak mixing extension of . If the action of on is GSZ, then so is the action of on .
Proof.
The result follows immediately from (3.3) in the same way as in Theorem 8.4 from [14]. ∎
We observe that Proposition 3.1 implies also that if is a relative weak mixing extension of and (2.3) holds true for any with taken over all then the same is true for any , and so the restriction of to comes not from relative weak mixing extensions but from relative compact extensions which will be studied below.
3.3. Relative compact extensions
For brevity and following [14] we will drop here the word "relative" and will speak about compact extensions. Recall that is said to be a compact extension of if there exists a set dense in and such that for every there exist functions satisfying
[TABLE]
where, again, .
As explained in Section 3.1 above the proof of Theorem 2.2 will be complete after we establish the following result.
3.3 Proposition.
Let be a compact extension of . If the action of on is GSZ then so is the action of on .
Proof.
We will follow the proof of Theorem 9.1 from [14] with a modification at the end. For an arbitrary with we have to show that (2.3) holds true. First, similarly to [14] we conclude that without loss of generality the indicator function of can be assumed to belong to the set appearing in the above definition of compact extensions. We will assume for convenience that is ergodic, otherwise pass to an ergodic decomposition. Then is also ergodic. The condition is equivalent to saying that the sequence is totally bounded, or relatively compact, in for almost all . Since we conclude that the total boundedness of in for in a set of positive measure already implies for an ergodic that is totally bounded in a uniform manner in for almost all .
Denote by the direct sum of copies of endowed with the norm . It is clear that if then the set
[TABLE]
is totally bounded in for -almost all , in fact, uniformly in . We write
[TABLE]
where means that the vector function is considered on a fiber above and, recall, . Throwing away -measure zero set of ’s we can assume that uniform estimates hold true on the whole .
Set . Then . Indeed, this is clear if while if then
[TABLE]
and so . Thus, we can assume without loss of generality that for all . We consider only for which the corresponding elements of have all nonzero components, and so these elements have norm in . The corresponding subset of is denoted by and it is still uniformly totally bounded. For each and let denote the maximum cardinality of -separated sets in , which is a finite monotone decreasing piece-wise constant function of with at most countably many jumps. Since is measurable as a function of there exist , and with so that equals a constant for and .
Take and find integers and so that is a maximal -separated set in . Next, , , as functions on are measurable and can be chosen so that each neighborhood of values of these functions at occurs with positive measure in the set . Let now be the subset of of points such that
[TABLE]
for any with and . Then by the choice of .
Now we use the assumption that the action of on is GSZ, applying it to . Let , be such that
[TABLE]
and let . Since for and for by the definition of (together with (3.6)) then for and .
Similarly to [14] we conclude that the vectors are separated in for , and so these vectors form a maximal such set which must be then dense in . Since there exists such that is -close to it. By the choice of this implies
[TABLE]
The index depends on , so now we sum over to obtain that for each ,
[TABLE]
Integrating over we derive
[TABLE]
Now we sum in , and multiply by ,
[TABLE]
Next, set . Then
[TABLE]
Now we use the assumption that the action of on is GSZ which implies that
[TABLE]
where is an infinite set of positive integers with bounded gaps. Define , which are also sets with bounded gaps. Clearly, (3.9) implies that there exists such that for any large enough
[TABLE]
Then by (3.7) and (3.8) we obtain that for any large enough
[TABLE]
Let
[TABLE]
Then by (3.10) for any large enough there exists such that . Hence, the gaps in are bounded by the bound on gaps of plus and, clearly,
[TABLE]
This completes the proof of Proposition 3.3, as well as of Theorem 2.2. ∎
4. Commuting transformations
In this section we will obtain Theorems 2.4, 2.5 and Corollary 2.6.
4.1. Factors and extensions with respect to an abelian group of transformations
Let be a commutative group of transformations acting on so that all preserve a probability measure on . A probability space is called a factor of if there exists an onto map such that and . Define the action of on by for each and . This action preserves the measure and we say that the system is an extension of and the latter is called a factor of the former. Clearly, this definition is compatible with the one given for one transformation in Section 3.1.
Next, is called a relative weak mixing extension of if is a relative weak mixing extension of for each as defined in Section 3.2. Furthermore, is called a (relative) compact extension of if (3.5) holds true simultaneously for all (with the same and ) for -almost all . Finally, following [11] we call an extension primitive if is the direct product of two subgroups where is a compact and is a relative weak mixing extension of and of , respectively.
Next, as above will be called GSZ if (2.8) holds true for any with and all , where the set depends on and ’s, and are distinct and different from the identity. Next, we rely on Theorem 6.17 in [11] describing the structure of extensions and show similarly to Proposition 7.1 in [11] that if each is GSZ for a totally ordered (by inclusion) family of σ-algebras then is also GSZ. It follows that in order to establish Theorem 2.5 it suffices to show that any primitive extension of is GSZ provided is GSZ itself.
4.2. Weak mixing extensions
The following result generalizes Proposition 3.1 to the case of several commuting transformations.
4.1 Proposition.
Suppose that is a relative weak mixing extension of where is a commutative group of (both and ) measure preserving transformations as above. Let be distinct and different from identity while be invertible (both and ) measure preserving transformations of leaving invariant and commuting with each other and with . Then for each ,
[TABLE]
where , and
[TABLE]
Proof.
First, observe that considering a weak mixing extension of a trivial factor we conclude that (4.2) implies Theorem 2.4. Denote the assertions (4.1) and (4.2) by and , respectively, and prove them by induction showing that
(i) implies and
(ii) (for and ) implies (for and ) where and were defined in Section 3.2.
First, observe that is obvious and does not play a role here, so we can denote by it any valid assertion. The proof proceeds essentially in the same way as for one transformation. We start with (ii) which is easier. As in the one transformation case we assume first that is -measurable. Then the integrals in (4.1) have the form
[TABLE]
where and . Thus follows from if is -measurable. Hence, as in Section 3.2 we can assume that . Then the left hand side of (4.1) takes the form
[TABLE]
By for and the above limit equals
[TABLE]
Since the sum here is -measurable we can insert the conditional expectation inside of the integral concluding as in Section 3.2 that the latter limit is zero completing the proof of (ii).
In order to prove (i) we observe that
[TABLE]
This enables us to prove under the additional condition that for some we have (replacing by ).
It remains to show that for provided . Rewrite
[TABLE]
where will be chosen large but much smaller than . By convexity of the function we have (up to ),
[TABLE]
Integrating the above inequality we obtain
[TABLE]
where , and we observe that remain distinct and different from the identity. Writing we conclude similarly to Section 3.2 that this inequality implies that
[TABLE]
Inserting conditional expectation inside the integral in the right hand side of (4.3) and using for a fixed , every such that and large enough we can replace the integral term in the above inequality by
[TABLE]
which gives
[TABLE]
Next, we estimate the integrals appearing in (4.4) by
[TABLE]
Since we assume that then by for the case when which is proved as Lemma 8.1 in [14] (where ergodicity of is used which we know from the definition of relative weak mixing),
[TABLE]
The concluding argument is the same as in Proposition 3.1 which yields and completes the proof of Proposition 4.1. ∎
4.3. Primitive extensions
Let be a primitive extension, so that with and are relative compact and weak mixing extensions, respectively. Here is supposed to be a finitely generated free abelian group and . It follows from Proposition 4.1 that
4.2 Lemma.
Let be distinct and different from the identity, be arbitrary and . Define . Then for each the number of elements of the set
[TABLE]
satisfies
[TABLE]
denoting by the cardinality of a set .
Proof.
Since then by Proposition 4.1,
[TABLE]
and (4.5) follows. ∎
The implications of compactness which will be needed below are summarized in the following lemma (see Lemma 7.10 in [11]).
4.3 Lemma.
Let with . Then we can find a measurable set with as close to as we like and such that for any there exist a finite set of functions and a measurable function with the property that for almost all and every .
We will need also the following consequence of the multidimensional van der Waerden theorem.
4.4 Lemma.
(i) Let the number be given and let . There is a finite subset and a number such that for any map there exist and such that
[TABLE]
(ii) Let the number be given and . There is a finite set and a number such that for any map satisfying for some there exist and such that
[TABLE]
Proof.
The assertion (i) is Lemma 7.11 in [11]. In order to prove (ii) we apply (i) with and in place of and , respectively, there. With and given by (i) for such and ’s we obtain
[TABLE]
for . ∎
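For orientation, the simplest one-dimensional instance of the van der Waerden theorem can be checked by brute force. The sketch below is an illustration only and is not the multidimensional statement of Lemma 4.4: it verifies the classical fact that every 2-coloring of 9 consecutive integers contains a monochromatic 3-term arithmetic progression, while 8 integers do not suffice. The function names are hypothetical.

```python
from itertools import product


def has_mono_ap(coloring, length):
    """True if `coloring` (a tuple of colors indexed by 0..n-1) contains a
    monochromatic arithmetic progression with `length` terms."""
    n = len(coloring)
    for start in range(n):
        for step in range(1, n):
            if start + (length - 1) * step >= n:
                break  # larger steps only overshoot further
            colors_seen = {coloring[start + i * step] for i in range(length)}
            if len(colors_seen) == 1:
                return True
    return False


def every_coloring_has_mono_ap(n, colors, length):
    """Exhaustively test all colorings of n points with `colors` colors."""
    return all(has_mono_ap(c, length) for c in product(range(colors), repeat=n))


print(every_coloring_has_mono_ap(9, 2, 3))
print(every_coloring_has_mono_ap(8, 2, 3))
```

The exhaustive search is feasible only for tiny parameters; the lemma itself asserts the existence of a finite configuration set and a bound without giving their size.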
The following is the main result of this section which, as explained in Section 4.1, yields Theorem 2.5.
4.5 Proposition.
Let be a primitive extension and be a GSZ system. Then is also a GSZ system.
Proof.
We proceed similarly to Proposition 7.12 in [11] adapting the proof there to our situation. Let with and let . Replacing by a slightly smaller set, we can assume that has the compactness property described in Lemma 4.3. Writing , we see that there exists a measurable subset with for all . We express as products of elements in and in and assume without loss of generality that for all ,
[TABLE]
where , , and are distinct. Since the set of transformations in the right hand side of (4.6) is at least as large as the one in the left hand side of (4.6) then (2.8) will follow if we prove that for an infinite syndetic set ,
[TABLE]
Let . We will show that there exist an infinite syndetic set and such that for each there exist a subset with and such that for every we can find a set , with satisfying
[TABLE]
Integrating the inequality (4.8) over and taking into account (4.6) we obtain that for any ,
[TABLE]
and both (4.7) and (2.8) will follow.
The set will be determined by two requirements. For we will require that
[TABLE]
whenever and . Choose such that if
[TABLE]
(where denotes the symmetric difference) then (4.9) implies (4.8). Then we require that (4.10) holds true for any and .
Suppose now that and have been found so that (4.10) is satisfied for all , and, in addition,
[TABLE]
Now, applying Lemma 4.2 with and we obtain
[TABLE]
with defined in Lemma 4.2, for all except for a set of ’s of measure less than and for . Set and then considering new and we obtain (4.9). The problem is reduced to finding and such that (4.10) and (4.11) are satisfied.
Next, we replace (4.10) by the requirement that there exists such that
[TABLE]
(where ) with . Since we will have
[TABLE]
which gives (4.10) since
[TABLE]
Now recall that was chosen to comply with conditions of Lemma 4.3. We can therefore find and a function so that for every and -almost all . We define now a sequence of functions by
[TABLE]
for integers and transformations . This is well defined since is a direct product. Then for -almost all ,
[TABLE]
Fix and for which (4.13) holds true and apply Lemma 4.4(ii) to the function on . Independently of and there is a finite set and a number such that takes on the same value for , , for some and some with . Then if and is the corresponding we obtain from (4.13) for , that
[TABLE]
where we took into account that . We have shown that for every , and -almost all there exist and , both having a finite range of possibilities, such that (4.12) is satisfied with and for in place of .
Next, we will produce the set and the sets such that both (4.11) and (4.12) are satisfied for . For each form the set
[TABLE]
where the intersection is taken over with . Using the fact that is a GSZ system we conclude that for each from an infinite syndetic set there exists with for some independent of and such that for some and all .
Now let for . There exist and such that (in place of ) satisfies (4.12) for and . In addition, also satisfies (4.11) for these and taking into account that by the definition of this condition is satisfied with by all and all such that since whenever , and so .
Let be the total number of possibilities for . Then for a subset with , and take a constant value, say, and , respectively. We now define and set and . Then , and
[TABLE]
for for an appropriately defined . Finally, and the gaps of the set are bounded by times of the maximal gap of . This completes the proof of Proposition 4.5, as well as of Theorem 2.5. ∎
5. Short proofs of Theorems 2.2 and 2.5
Recall that is called a Følner sequence if the cardinality of the symmetric difference is o as for any . Now, suppose that for any Følner sequence , ,
[TABLE]
(in fact, we will need this only when ’s are squares). Then there exists and an integer such that in any square with the side of length we can find such that . Indeed, if this were not true then we could find a sequence of squares with sides of length as and a sequence as such that for all . Then, of course,
[TABLE]
which contradicts our assumption since is a Følner sequence. Clearly, this argument remains true for any replacing squares by -dimensional boxes but we will not need this here.
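The Følner property of squares, which is all that the argument above needs, is easy to check directly. The following sketch assumes the squares are of the form [0, n) x [0, n) in the integer lattice (an assumption made for illustration) and computes the relative size of the symmetric difference between a square and its translate by a fixed vector; the function names are hypothetical.

```python
def square(n):
    """The lattice square [0, n) x [0, n) as a set of integer pairs."""
    return {(i, j) for i in range(n) for j in range(n)}


def boundary_ratio(n, shift):
    """|F symmetric-difference (F + shift)| / |F| for the n x n square F."""
    f = square(n)
    shifted = {(i + shift[0], j + shift[1]) for (i, j) in f}
    return len(f ^ shifted) / len(f)


# The ratio decays like a constant over n, which is the Følner condition.
for n in (10, 100, 400):
    print(n, boundary_ratio(n, (2, 3)))
```

For a shift (a, b) with 0 <= a, b <= n the overlap has (n - a)(n - b) points, so the ratio equals 2(n^2 - (n - a)(n - b)) / n^2, which tends to zero as n grows with the shift held fixed.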
Now, let be numbers whose existence was established above and assume that for all integer and . Set and . Then contains disjoint squares with the side of length , and so
[TABLE]
Hence, there exists such that
[TABLE]
Clearly, is a set of integers with gaps bounded by and
[TABLE]
Next, we will apply the above arguments to the situation of Theorem 2.5. Let be as in Theorem 2.5 commuting measure preserving transformations of a measure space and set with being the identity transformation. Then are commuting measure preserving transformations of and . Now, it follows from Theorem B of [2] that for any set with and any Følner sequence ,
[TABLE]
i.e. the limit exists and it is positive. Taking $a_{n,m}=\mu\big(\bigcap_{j=0}^{\ell}(T_{j}^{n}\hat{T}_{j}^{m})^{-1}A\big)$ we obtain by the above arguments that there exists an infinite set with bounded gaps such that (2.8) holds true, completing the proof of Theorem 2.5. ∎
Next, we derive a polynomial version of Theorem 2.2. Replace in (2.1) the linear terms by general polynomials taking on integer values on integer pairs and such that for each there exists a pair with all divisible by . Then by Theorem 1.4 in [6],
[TABLE]
for every with and any Følner sequence . Set . Then by the above argument there exists an infinite set of positive integers with uniformly bounded gaps such that
[TABLE]
providing a polynomial version of (2.3). ∎
6. Nonconventional polynomial arrays
6.1. Proof of Theorem 2.8
We start with the proof of Theorem 2.8 which closely follows the proof of Theorem D in [5]. First, by changing functions we can always assume without loss of generality that . If and where is an integer and when while ’s are functions of taking on integer values on integers then for any measurable function ,
[TABLE]
since is weakly mixing, and so is weakly mixing and, in particular, ergodic, and so the result follows from the ergodic theorem.
In order to deal with the general case of Theorem 2.8 we will need the following version of the van der Corput theorem whose proof is the same as that of Theorem 1.4 in [3] (see also Theorem 1.5 there), and so we refer the reader there. This follows also from uniform versions of the van der Corput theorem (see, for instance, [21]).
6.1 Lemma.
Let be a bounded sequence of vectors in a Hilbert space such that
[TABLE]
where is the inner product and denotes the limit as outside a set of integers having zero upper density. Then
[TABLE]
where is the Hilbert space norm.
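The mechanism behind van der Corput type lemmas can be seen in a numerical toy example. The sketch below is an illustration under stated assumptions and is not the array version used in this paper: it takes the one-parameter sequence x_n = exp(2 pi i alpha n^2) in the one-dimensional Hilbert space of complex numbers, with alpha = sqrt(2) chosen arbitrarily, and shows that the averaged correlations with a fixed lag are small and that the averages of x_n themselves are small as well. The function names are hypothetical.

```python
import cmath
import math


def phase(alpha, n):
    """x_n = exp(2*pi*i*alpha*n^2), a unit vector in the Hilbert space C."""
    return cmath.exp(2j * math.pi * ((alpha * n * n) % 1.0))


def mean_norm(alpha, N):
    """| (1/N) * sum_{n=1}^{N} x_n |."""
    return abs(sum(phase(alpha, n) for n in range(1, N + 1))) / N


def mean_correlation(alpha, h, N):
    """| (1/N) * sum_{n=1}^{N} <x_{n+h}, x_n> | with <x, y> = x * conj(y)."""
    total = sum(phase(alpha, n + h) * phase(alpha, n).conjugate()
                for n in range(1, N + 1))
    return abs(total) / N


alpha = math.sqrt(2.0)  # arbitrary irrational number, chosen for illustration
N = 20000
for h in range(1, 6):
    print(h, mean_correlation(alpha, h, N))
print("average of x_n:", mean_norm(alpha, N))
```

Here the lag-h correlation is exp(2 pi i alpha (2nh + h^2)), a linear phase in n, so its averages vanish in the limit for every h >= 1; the lemma then forces the averages of x_n to vanish in norm.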
Next, we will describe the “PET induction” in our circumstances where we closely follow [5] and refer the reader there for more details. Let be any polynomials and be any functions taking on integer values on integers and such that . Similarly to [5] we will call
[TABLE]
-polynomial expressions where indicates the fact that ’s are not necessarily polynomials. Products of -polynomial expressions and their inverses are -polynomial expressions, and so they form a group . Clearly, if then . The degree, deg of is the maximal degree of polynomials and the degree, deg of a -polynomial expression is defined as the degree of . Again, following [5] we define the weight of a -polynomial expression with as the pair such that degdeg, deg. The weight is greater than if or if and .
Two -polynomial expressions and with and are called equivalent if they have the same weight and the leading coefficient of the polynomials and coincide, as well. Any finite subset of is called a system and the degree of a system is the maximal degree of its elements. To every system a weight matrix is associated where is the number of equivalence classes formed by the elements of the system whose weights are and is the maximal degree of the polynomials appearing in Theorem 2.8. As in [5] we say that the weight matrix precedes the weight matrix if for some , when and except for and , and are arbitrary nonnegative integers when and except for and (for a picture explanation see [5]).
Now observe that the system appearing in (6.1) has the weight matrix where and if . Thus, (6.1) proves Theorem 2.8 for any system with the weight matrix . Next, we proceed step by step considering systems with weight matrices such that each precedes arriving finally at the matrix with arbitrary predefined weights (for a graphical explanation of this see [5]). Our goal is to show that if Theorem 2.8 is valid for any system with the weight matrix then it is valid for any system with the weight matrix which by induction will yield Theorem 2.8.
Next, we remark that without loss of generality we can assume that for any which is the result of the equality
[TABLE]
Indeed, taking and we transform the left hand side of (2.11) into a sum of similar product expressions where all functions have zero integrals and the result to be proved now is that all corresponding limits are zero. Thus, writing
[TABLE]
[TABLE]
we have to prove that
[TABLE]
As in [5] we can assume without loss of generality that are linearly independent elements of the basis of the finitely generated free abelian group . Then for some polynomials implies . By Lemma 6.1, (6.4) would follow if
[TABLE]
where
[TABLE]
Next, we will need the following result.
6.2 Lemma.
Let nonconstant polynomials of and be essentially distinct and nontrivially depend on . Then for each sufficiently large the polynomials are pairwise essentially distinct (where is viewed as a constant) except for pairs where and then .
Proof.
Clearly, are essentially distinct since this was true for . It remains to show that and are essentially distinct for any provided is large enough and either or and does not have the form . Clearly, this is true if and have different degrees in , and so we can assume that they have the same degree in . Then we can write and where , are nonzero while , are arbitrary polynomials in only and , are polynomials of degree less than in . Then where is a polynomial whose degree in is less than having coefficients depending on . Since is a nonzero polynomial then for any large enough and if then and are essentially distinct provided is large enough. The case is ruled out by our assumptions. If and then either or and either or is nonconstant. In both of these cases and are essentially distinct. Next, if and then , and so and are essentially distinct if and only if is nonconstant concluding the proof of the lemma (where, in fact, we did not use that ’s depend polynomially on ). ∎
Observe that if deg, then deg, and it follows from Lemma 6.2 that depends nontrivially on provided is large enough. Rearranging -polynomial expressions if needed, we can assume that deg for and deg for . The condition deg means that for some integers . Hence, in this case . Thus, if we can write
[TABLE]
where is either or it is for some between 1 and and is either for some between 1 and or it is for some between and .
Consider the new system and suppose, without loss of generality, that has the minimal weight in . Since all then is measure preserving and we can write
[TABLE]
where . It follows from the assumptions of Theorem 2.8 that and for . Writing we see from here and Lemma 6.2 that for and large enough . Writing we conclude from here that and for for all large enough.
Introduce the new system . In the same way as in [5] (referring the reader there for more explanations) we conclude that the weight matrix of precedes that of . In order to invoke PET-induction we assume that Theorem 2.8 holds true for all systems whose weight matrices precede that of . Hence, we have for ,
[TABLE]
as . Then by the Cauchy inequality
[TABLE]
[TABLE]
If one of is not linear then degdeg and for some , and so the last product in (6.6) equals zero yielding
[TABLE]
Otherwise, degdeg for all and then , and for some , . Then by weak mixing
[TABLE]
which together with (6.9) yields again (6.10) concluding the proof of Theorem 2.8 since the initial step of the induction is given by (6.1). ∎
6.2. Nonconvergence under weak mixing
Next, we will show that, in general, weak mixing of is not enough to ensure -convergence in (1.3) for general polynomials taking on integer values on integers even in the “conventional” case . Consider the sum
[TABLE]
where is a measure preserving transformation of a separable probability space and is a bounded measurable function. Recall that the Koopman operator is unitary and it has a spectral representation in the form
[TABLE]
where is the spectrum of and is the corresponding projection operator valued spectral measure (see, for instance, [17] or [22]). Then
[TABLE]
and so
[TABLE]
Fix a small and for each set
[TABLE]
Observe that if then and . Define inductively and where is the integral part of . Set also . Then
[TABLE]
and is a Cantor like set, in particular, it is a perfect set and for any ,
[TABLE]
Let be a continuous (non-atomic) probability measure on , say, constructed in the same way as the Cantor distribution on the standard Cantor set. Next, we introduce a spectral measure concentrated on by the standard formula for each measurable function on and a measurable set where is the indicator of . The spectral measure is continuous considering it on the probability space since for each any function is zero -almost everywhere. Next, we can find a transformation such that its Koopman operator has the spectral representation
[TABLE]
(see, for instance, Ch. 4 in [9]) and since is a continuous spectral measure then is weakly mixing (see, for instance, [16] or [23]).
By (6.15),
[TABLE]
for any , and all . Hence
[TABLE]
Now, choose a function such that and . If the ergodic theorem holds true for the averages then as which leads to a contradiction in the above inequality if . ∎
6.3. Proof of Theorem 2.9
For the proof of Theorem 2.9 we will need the following result.
6.3 Lemma.
Let be a nonconstant polynomial of and taking on integer values on integers. Set
[TABLE]
where denotes the cardinality of a set in brackets and if does not depend on then we set if and , for otherwise. If nontrivially depends on then
[TABLE]
where degn is the degree of the polynomial in considering as a constant. If depends only on then there exists such that for all , and so for such . In both cases .
Proof.
For any there exist at most deg solutions in of the equation , and so (6.17) follows. If is nonconstant then as and the second assertion follows, as well. ∎
Next we can prove Theorem 2.9. As before, without loss of generality we can assume that at least one of the functions has zero integral with respect to . Set
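The counting step in this proof can be checked on a concrete example. The sketch below uses a hypothetical sample polynomial q(n, N) = n^2 - 3n + N of degree 2 in n, and assumes that n ranges over 1, ..., N (both choices are assumptions made for illustration, not taken from the paper). It counts, for each fixed N, the largest number of n giving the same value of q, which the root-counting argument bounds by the degree in n.

```python
from collections import Counter


def q(n, N):
    """Hypothetical sample polynomial of degree 2 in n (illustration only)."""
    return n * n - 3 * n + N


def max_level_set_size(poly, N):
    """max over l of #{1 <= n <= N : poly(n, N) == l}."""
    counts = Counter(poly(n, N) for n in range(1, N + 1))
    return max(counts.values())


DEGREE_IN_N = 2
for N in (5, 50, 500):
    size = max_level_set_size(q, N)
    # A polynomial of degree d in n takes each value at most d times,
    # so the normalized count size / N is at most d / N.
    print(N, size, size / N)
```

For this sample the values at n = 1 and n = 2 coincide, so the bound by the degree is attained, while the normalized count still tends to zero as N grows.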
[TABLE]
and in order to prove Theorem 2.9 we have to show that
[TABLE]
which according to Lemma 6.1 will follow if (6.2) holds true.
Without loss of generality assume that are all indexes such that for some nonzero integers and polynomials in taking on integer values on integers. Then
[TABLE]
By Lemma 6.2, are essentially distinct polynomials, and so their pairwise differences and are nonconstant polynomials of and . Since is strongly -mixing then for any and any bounded measurable functions with there exists such that
[TABLE]
By Lemma 6.3,
[TABLE]
where run over indexes appearing in the above definitions of ’s. Hence, for large enough choosing for functions equal either to some or to we obtain,
[TABLE]
and since is arbitrary we obtain that
[TABLE]
Finally, relying on strong mixing we let and obtain
[TABLE]
since one of integrals is zero, completing the proof of Theorem 2.9. ∎
References
- [1]
- [2] T. Austin, Non-conventional ergodic averages for several commuting actions of an amenable group, J. d'Analyse Math. 130 (2016), 243–274.
- [3] V. Bergelson, Weakly mixing PET, Ergod. Th. & Dyn. Sys. 7 (1987), 337–349.
- [4] V. Bergelson, B. Host, R. McCutcheon and F. Parreau, Aspects of uniformity in recurrence, Colloq. Math. 84/85 (2000), 549–576.
- [5] V. Bergelson and A. Leibman, Polynomial extensions of van der Waerden’s and Szemerédi’s theorems, J. Amer. Math. Soc. 9 (1996), 725–753.
- [6] V. Bergelson, A. Leibman and E. Lesigne, Intersective polynomials and polynomial Szemerédi theorem, Adv. Math. 219 (2008), 369–388.
- [7] R. Bowen, Equilibrium States and the Ergodic Theory of Anosov Diffeomorphisms, Lecture Notes in Math. 470, Springer–Verlag, Berlin, 1975.
- [8] R.C. Bradley, Introduction to Strong Mixing Conditions, Kendrick Press, Heber City, 2007.
