This paper establishes conditions under which random walks on compact groups become uniformly distributed, and proves related limit theorems including the law of large numbers, the law of the iterated logarithm, and the central limit theorem.
Contribution
It provides necessary and sufficient conditions for equidistribution of random walks on compact groups and extends classical limit theorems to this setting.
Findings
01
The random walk equidistributes if and only if the step distribution is not supported on any proper closed subgroup and $S_k$ has an absolutely continuous component for some $k \ge 1$.
02
The strong law of large numbers and the law of the iterated logarithm hold for sums of functions along the walk.
03
A central limit theorem with remainder term is established for sums along the walk.
Abstract
Let $X_1, X_2, \dots$ be independent, identically distributed random variables taking values from a compact metrizable group $G$. We prove that the random walk $S_k = X_1 X_2 \cdots X_k$, $k = 1, 2, \dots$ equidistributes in any given Borel subset of $G$ with probability $1$ if and only if $X_1$ is not supported on any proper closed subgroup of $G$, and $S_k$ has an absolutely continuous component for some $k \ge 1$. More generally, the sum $\sum_{k=1}^{N} f(S_k)$, where $f : G \to \mathbb{R}$ is Borel measurable, is shown to satisfy the strong law of large numbers and the law of the iterated logarithm. We also prove the central limit theorem with remainder term for the same sum, and construct an almost sure approximation of the process $\sum_{k \le t} f(S_k)$ by a Wiener process provided $S_k$ converges to the Haar measure in the total variation metric.
Equidistribution of random walks on compact groups
Theorem A.
Let $G$ be a compact Hausdorff group, and let $\nu$ be a regular Borel probability measure on $G$. The following are equivalent.
(i) $\nu$ is adapted and strictly aperiodic.
(ii) $\nu^{*k} \to \mu$ weakly as $k \to \infty$.
A similar classical result gives a necessary and sufficient condition for convergence in the total variation metric $\|\cdot\|_{\mathrm{TV}}$. The Lebesgue decomposition of $\nu^{*k}$ with respect to the Haar measure $\mu$ will be written as $\nu^{*k} = (\nu^{*k})_{\mathrm{abs}} + (\nu^{*k})_{\mathrm{sing}}$, where $(\nu^{*k})_{\mathrm{abs}}$ is absolutely continuous and $(\nu^{*k})_{\mathrm{sing}}$ is singular with respect to $\mu$.
Theorem B.
Let $G$ be a compact Hausdorff group, and let $\nu$ be a regular Borel probability measure on $G$. The following are equivalent.
(i) $\nu$ is adapted and strictly aperiodic, and $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$.
(ii) $\|\nu^{*k} - \mu\|_{\mathrm{TV}} \to 0$ as $k \to \infty$.
Moreover, if these equivalent conditions hold, then the convergence in (ii) is exponentially fast.
Special cases of Theorem B were proved by Bhattacharya [5]. For the general case and the history of related results see [1] and [13]. We mention that if $G$ is connected, then the assumption that $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$ implies that $\nu$ is adapted and strictly aperiodic. This follows from the fact that in a connected, compact Hausdorff group any proper closed subgroup has Haar measure $0$.
Theorems A and B concern the distribution of $S_k$ for a given $k \ge 1$. We can also view $S_k$, $k = 1, 2, \dots$ as a random sequence in $G$, and consider the empirical distribution of the terms $S_1, S_2, \dots, S_N$ for some $N \ge 1$. Under the technical assumption that $G$ is metrizable, Berger and Evans [2, Corollary 3.1] proved the following.
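Theorem C.
Let $G$ be a compact metrizable group. Let $X_1, X_2, \dots$ be i.i.d. $G$-valued random variables with distribution $\nu$, and set $S_k = \prod_{j=1}^{k} X_j$. The following are equivalent.
(i) $\nu$ is adapted.
(ii) For any continuous function $f : G \to \mathbb{R}$
$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f(S_k) = \int_G f \, d\mu \quad \text{a.s.} \tag{1}$$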
Note that a.s. (almost surely) means that the given relation holds with probability $1$. Since the Banach space of continuous, real-valued functions on $G$ (or indeed, on any compact metric space) is separable, condition (ii) in the previous theorem is equivalent to the property that with probability $1$, (1) holds for all continuous functions $f : G \to \mathbb{R}$ simultaneously. A (deterministic) sequence $a_k$, $k = 1, 2, \dots$ in $G$ is called uniformly distributed if $\lim_{N \to \infty} (1/N) \sum_{k=1}^{N} f(a_k) = \int_G f \, d\mu$ for any continuous function $f : G \to \mathbb{R}$. Theorem C thus states that the random sequence $S_k$, $k = 1, 2, \dots$ is uniformly distributed with probability $1$ if and only if $\nu$ is adapted. See [3], [4] and [14] for related results on the circle group $G = \mathbb{R}/\mathbb{Z}$, and [2] for the case of continuous time processes.
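The notion of uniform distribution along a random walk can be illustrated numerically. The following is a minimal sketch on the circle group $\mathbb{R}/\mathbb{Z}$, written additively; the step distribution (uniform on $[0, 0.5)$), the two test functions and the helper names `empirical_average` and `uniform_step` are assumptions chosen for illustration and do not come from the paper.

```python
import math
import random


def empirical_average(f, n_steps, step_sampler, rng):
    """Return (1/N) * sum_{k=1}^N f(S_k) for a walk on R/Z (additive notation)."""
    position = 0.0  # identity element of R/Z
    total = 0.0
    for _ in range(n_steps):
        position = (position + step_sampler(rng)) % 1.0  # S_k = S_{k-1} + X_k mod 1
        total += f(position)
    return total / n_steps


def uniform_step(rng):
    # Hypothetical step distribution: uniform on [0, 0.5). It is absolutely
    # continuous and not supported on a proper closed subgroup of R/Z.
    return 0.5 * rng.random()


rng = random.Random(12345)
n = 20000
# Continuous test function with Haar integral 0.
avg_cos = empirical_average(lambda x: math.cos(2 * math.pi * x), n, uniform_step, rng)
# Indicator of the Borel set [0, 1/4), which has Haar measure 1/4.
avg_ind = empirical_average(lambda x: 1.0 if x < 0.25 else 0.0, n, uniform_step, rng)
print(avg_cos, avg_ind)
```

Both averages should be close to the corresponding Haar integrals ($0$ and $1/4$); a single simulated path only suggests, and of course does not prove, almost sure convergence.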
In this paper we consider $\sum_{k=1}^{N} f(S_k)$ for Borel measurable functions $f : G \to \mathbb{R}$, and we prove the following analogue of Theorem C.
Theorem 1.
Let $G$ be a compact metrizable group. Let $X_1, X_2, \dots$ be i.i.d. $G$-valued random variables with distribution $\nu$, and set $S_k = \prod_{j=1}^{k} X_j$. The following are equivalent.
(i) $\nu$ is adapted, and $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$.
(ii) For any bounded, Borel measurable function $f : G \to \mathbb{R}$
$$\lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f(S_k) = \int_G f \, d\mu \quad \text{a.s.} \tag{2}$$
(iii) For any bounded, Borel measurable function $f : G \to \mathbb{R}$
[TABLE]
The implications (iii)$\Rightarrow$(ii)$\Rightarrow$(i) are straightforward. Condition (ii) is in fact equivalent to the assumption that (2) holds for the indicator function $f = I_B$ of any Borel set $B \subseteq G$; indeed, a bounded, Borel measurable function can be uniformly approximated by finite linear combinations of such indicator functions. The equivalence (i)$\Leftrightarrow$(ii) in Theorem 1 thus states that the random sequence $S_k$, $k = 1, 2, \dots$ equidistributes in any given Borel set with probability $1$ if and only if $\nu$ is adapted, and $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Equidistribution of a random sequence in any given Borel set with probability $1$ is sometimes called the "strong uniform distribution" property. In contrast, (ordinary) uniform distribution means equidistribution in any Borel set $B \subseteq G$ such that $\mu(\partial B) = 0$. Note that equidistribution in all Borel sets simultaneously is impossible; in particular, no deterministic sequence satisfies the strong uniform distribution property (unless $G$ is finite).
In Theorems C and 1 we did not assume that $\nu$ is strictly aperiodic, whereas in Theorems A and B strict aperiodicity is required for the convergence of $\nu^{*k}$. In the proof of the implication (i)$\Rightarrow$(iii) in Theorem 1 we will thus first assume that $\nu$ is strictly aperiodic. In case the support of $\nu$ is contained in a coset of a closed normal subgroup $H \lhd G$, we will see that the factor group $G/H$ is finite and cyclic, and we will argue by induction on the index $|G : H|$. Surprisingly, in Theorem 1 the strong law of large numbers (condition (ii)) and the law of the iterated logarithm (condition (iii)) are equivalent. This is a consequence of the fact that whenever $\nu^{*k}$ converges to the Haar measure $\mu$ in the total variation metric, the convergence is necessarily exponentially fast. This fact does not have an analogue for weak convergence. We also prove the following central limit theorem under the technical assumption that $\nu$ is a central measure. Note that condition (ii) below expresses convergence in distribution to the standard normal distribution.
Theorem 2.
Let $G$ be a compact metrizable group. Let $X_1, X_2, \dots$ be i.i.d. $G$-valued random variables with distribution $\nu$, and set $S_k = \prod_{j=1}^{k} X_j$. Assume that $\nu$ is central. The following are equivalent.
(i) $\nu$ is adapted and strictly aperiodic, and $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$.
(ii) For any bounded, Borel measurable function $f : G \to \mathbb{R}$ such that $\int_G f \, d\mu = 0$ and $f$ is not $\mu$-a.e. zero, we have
[TABLE]
with some constant $C(f, \nu) > 0$ depending only on $f$ and $\nu$.
Recall that a compact Hausdorff topological space is metrizable if and only if it is second countable. We mention that in the proofs of Theorems 1 and 2 the choice of the metric on G is irrelevant; we only use the second countability of G. Whether Theorems 1 and 2 are true for compact Hausdorff groups remains open.
2 Results
2.1 Preliminaries
For the rest of the paper we assume that $G$ is a compact metrizable group. Let $\mu$ denote the Haar measure on $G$ normalized so that $\mu(G) = 1$. We will write $L^p(G) = L^p(G, \mu)$ for the Lebesgue space of real-valued, Borel measurable functions with respect to $\mu$, and $\|f\|_p = \|f\|_{L^p(G, \mu)}$. In addition, $\|\cdot\|_p$ will also denote the $L^p$-norm of (real-valued) random variables. Recall that $\mu$ is both left and right invariant; that is, for any Borel set $B \subseteq G$ and any $y \in G$ we have $\mu(yB) = \mu(By) = \mu(B)$. Therefore for any $f \in L^1(G)$ and any $y \in G$ we have
$$\int_G f(yx) \, d\mu(x) = \int_G f(xy) \, d\mu(x) = \int_G f(x) \, d\mu(x).$$
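The total variation of a finite, signed Borel measure $\vartheta$ on $G$ is defined as
$$\|\vartheta\|_{\mathrm{TV}} = \sup \left\{ \left| \int_G f \, d\vartheta \right| : f : G \to \mathbb{R} \text{ Borel measurable}, \ \sup_G |f| \le 1 \right\}.$$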
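Given two finite, signed Borel measures $\vartheta_1$ and $\vartheta_2$ on $G$, their convolution $\vartheta_1 * \vartheta_2$ is the unique finite, signed Borel measure such that for any bounded, Borel measurable function $f : G \to \mathbb{R}$
$$\int_G f \, d(\vartheta_1 * \vartheta_2) = \int_G \int_G f(xy) \, d\vartheta_1(x) \, d\vartheta_2(y).$$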
It is easy to see that $\|\vartheta_1 * \vartheta_2\|_{\mathrm{TV}} \le \|\vartheta_1\|_{\mathrm{TV}} \cdot \|\vartheta_2\|_{\mathrm{TV}}$. A finite, signed Borel measure $\vartheta$ on $G$ is called central if $\vartheta(y^{-1} B y) = \vartheta(B)$ for all Borel sets $B \subseteq G$ and all $y \in G$. Similarly, a Borel measurable function $f : G \to \mathbb{R}$ is called a class function if $f(y^{-1} x y) = f(x)$ for all $x, y \in G$. Note that for any Borel probability measure $\nu$ on $G$ we have $\nu * \mu = \mu * \nu = \mu$. If $\nu_1$ and $\nu_2$ are Borel probability measures on $G$, then $\|\nu_1 - \nu_2\|_{\mathrm{TV}} = 2 \sup_B |\nu_1(B) - \nu_2(B)|$, where the supremum is over all Borel sets $B \subseteq G$. The support of a Borel probability measure $\nu$ on $G$, denoted by $\operatorname{supp} \nu$, is the smallest closed set $F \subseteq G$ such that $\nu(F) = 1$.
Remark.
Every finite Borel measure on $G$ (or indeed, on any Polish space) is regular. Therefore in the definitions of total variation and convolution we could have equivalently used continuous functions $f : G \to \mathbb{R}$ instead of bounded, Borel measurable ones. The existence and uniqueness of the convolution thus follows from the Riesz representation theorem.
A $G$-valued random variable is a Borel measurable map $X$ from a probability space to $G$. Let $\nu_X$ denote the distribution of $X$; that is, $\nu_X(B) = \Pr(X \in B)$ for all Borel sets $B \subseteq G$. The variable $X$ is called uniformly distributed if $\nu_X = \mu$. If $X$ and $Y$ are independent $G$-valued random variables, then $\nu_{XY} = \nu_X * \nu_Y$. We shall write $X \overset{d}{=} Y$ if the (real-valued or $G$-valued) random variables $X$ and $Y$ have the same distribution.
Let $\widehat{G}$ denote the unitary dual of $G$; that is, a complete set of pairwise unitarily inequivalent, irreducible unitary representations of $G$. Recall that every such representation is finite dimensional, and let $d_\pi$ denote the dimension of $\pi \in \widehat{G}$. Thus $\pi(x)$ is a $d_\pi \times d_\pi$ unitary matrix with complex entries for any given $x \in G$. Let $\pi_0 \in \widehat{G}$, $\pi_0 = 1$ denote the trivial representation. Given $f \in L^1(G)$ and $\pi \in \widehat{G}$, let $\hat{f}(\pi) = \int_G f(x) \pi(x)^* \, d\mu(x)$ denote the Fourier coefficients of $f$. Here $\pi(x)^*$ denotes the conjugate transpose of $\pi(x)$, and the integral is taken entrywise. The Fourier coefficients of a finite, signed Borel measure $\vartheta$ on $G$ are defined similarly as $\hat{\vartheta}(\pi) = \int_G \pi(x)^* \, d\vartheta(x)$, $\pi \in \widehat{G}$. The Parseval formula states that for any $f, g \in L^2(G)$ we have
$$\int_G f g \, d\mu = \sum_{\pi \in \widehat{G}} d_\pi \operatorname{tr}\left( \hat{f}(\pi) \hat{g}(\pi)^* \right),$$
where $\operatorname{tr}$ denotes trace. Given $\pi \in \widehat{G}$ the complex conjugate $\bar{\pi}$ is also an irreducible unitary representation of $G$, called the contragredient of $\pi$. The contragredient $\bar{\pi}$ may or may not be unitarily equivalent to $\pi$. For the theory of Fourier analysis on compact groups we refer the reader to [6].
The notation $a_n \ll b_n$ and $a_n = O(b_n)$ mean that there exists an (implied) constant $K \ge 0$ such that $|a_n| \le K b_n$ for all $n \ge 1$. We write $a_n = \Theta(b_n)$ if $a_n \ll b_n \ll a_n$. We will use subscripts to denote dependence of the implied constant on certain parameters; thus e.g. $a_n \ll_{f, \nu} b_n$ and $a_n = O_{f, \nu}(b_n)$ mean that the implied constant may depend on $f$ and $\nu$. We emphasize that in the statement of all theorems, propositions and lemmas implied constants in the notation $\ll$ and $O$ are universal; in particular, they do not even depend on the group $G$.
2.2 The main theorems
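Let $G$ be a compact metrizable group with normalized Haar measure $\mu$, let $X_1, X_2, \dots$ be i.i.d. $G$-valued random variables with distribution $\nu$, and set $S_k = \prod_{j=1}^{k} X_j$. Let $\Delta_k = \|\nu^{*k} - \mu\|_{\mathrm{TV}} = 2 \sup_B |\Pr(S_k \in B) - \mu(B)|$ denote the total variation distance of the distribution of $S_k$ from the uniform distribution. Note that
$$\|\nu^{*(k+1)} - \mu\|_{\mathrm{TV}} = \|(\nu^{*k} - \mu) * \nu\|_{\mathrm{TV}} \le \|\nu^{*k} - \mu\|_{\mathrm{TV}} \cdot \|\nu\|_{\mathrm{TV}} = \|\nu^{*k} - \mu\|_{\mathrm{TV}},$$
hence $\Delta_{k+1} \le \Delta_k$.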
The precise rate of convergence in the total variation metric was found by Anoussis and Gatzouras [1, Theorem 4.1]. Let
[TABLE]
where $\rho(\hat{\nu}(\pi))$ denotes the spectral radius of the matrix $\hat{\nu}(\pi)$. Then $\lim_{k \to \infty} \Delta_k^{1/k} = \inf_{k \ge 1} \Delta_k^{1/k} = q$; moreover, $q < 1$ if and only if $\nu$ is adapted and strictly aperiodic, and $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Thus, as already stated in Theorem B, whenever $\|\nu^{*k} - \mu\|_{\mathrm{TV}} \to 0$, the convergence is necessarily exponentially fast; more precisely, we have $q^k \le \Delta_k$ for every $k \ge 1$, and $\Delta_k \le (q + \varepsilon)^k$ for every $k \ge k_0(\nu, \varepsilon)$. Let $\Delta = 1 + 2 \sum_{k=1}^{\infty} \Delta_k$, and observe $1/(1 - q) \le \Delta$.
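The exponential decay of the total variation distance can be checked by hand on a finite group, where every measure is absolutely continuous with respect to the (uniform) Haar measure. The sketch below works on the cyclic group $\mathbb{Z}_5$ with a hypothetical step distribution, and assumes that in this abelian setting the rate $q$ reduces to the largest modulus of a Fourier coefficient of $\nu$ at a nontrivial character. The helper names are illustrative only.

```python
import math


def convolve(p, r):
    """Convolution of two probability vectors on the cyclic group Z_n."""
    n = len(p)
    out = [0.0] * n
    for x in range(n):
        for y in range(n):
            out[(x + y) % n] += p[x] * r[y]
    return out


def tv_distance_to_uniform(p):
    """Total variation distance with the normalisation 2 * sup_B |p(B) - mu(B)|."""
    n = len(p)
    return sum(abs(px - 1.0 / n) for px in p)


n = 5
nu = [0.5, 0.5, 0.0, 0.0, 0.0]  # hypothetical step distribution on Z_5

deltas = []  # deltas[k - 1] holds the distance of the k-fold convolution power
power = nu[:]
for k in range(1, 61):
    deltas.append(tv_distance_to_uniform(power))
    power = convolve(power, nu)

# Largest modulus of a Fourier coefficient of nu at a nontrivial character.
q = max(
    abs(sum(nu[x] * complex(math.cos(2 * math.pi * j * x / n),
                            -math.sin(2 * math.pi * j * x / n)) for x in range(n)))
    for j in range(1, n)
)
print(q, deltas[59] ** (1 / 60))
```

In this example the distances are nonincreasing in $k$, bounded below by $q^k$, and their $k$-th roots approach $q = \cos(\pi/5)$.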
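We now give a more quantitative form of Theorem 1. For any $m \ge 1$ and $\varepsilon > 0$ let $\psi_{m, \varepsilon}(N) = N \left( \prod_{i=1}^{m-1} \log_i N \right) (\log_m N)^{1 + \varepsilon}$, where $\log_1 N = \log N$ and $\log_i N = \log \log_{i-1} N$ denotes the $i$-fold iterated logarithm.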
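Theorem 3.
Suppose that $\nu$ is adapted, and that $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Let $f : G \to \mathbb{R}$ be Borel measurable such that $\int_G f \, d\mu = 0$.
(i) If $\sup_{c \in G} \mathbb{E} |f(c X_1)|^p < \infty$ for some $1 \le p \le 2$, then for any $m \ge 1$ and $\varepsilon > 0$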
[TABLE]
(ii) If $\sup_{c \in G} \mathbb{E} |f(c X_1)|^p < \infty$ for some $p > 2$, then
[TABLE]
Remark.
Note that $\sup_{c \in G} \mathbb{E} |f(c X_1)|^p < \infty$ clearly implies $f \in L^p(G)$. To mention a sufficient condition, suppose that $\nu$ is absolutely continuous with density $\frac{d\nu}{d\mu}$. If there exists a Hölder conjugate pair $r, s \in [1, \infty]$, $1/r + 1/s = 1$ such that $f \in L^{pr}(G)$ and $\frac{d\nu}{d\mu} \in L^s(G)$, then $\sup_{c \in G} \mathbb{E} |f(c X_1)|^p \le \|f\|_{pr}^p \left\| \frac{d\nu}{d\mu} \right\|_s < \infty$.
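The normalising function introduced before Theorem 3, namely $N$ times the product of the first $m-1$ iterated logarithms times the $m$-fold iterated logarithm raised to the power $1 + \varepsilon$, is easy to evaluate numerically. The following small sketch does so; the function names `iterated_log` and `psi` are assumptions made here for illustration.

```python
import math


def iterated_log(i, n):
    """i-fold iterated natural logarithm: log_1 n = log n, log_i n = log(log_{i-1} n)."""
    value = float(n)
    for _ in range(i):
        value = math.log(value)
    return value


def psi(m, eps, n):
    """N * (prod_{i=1}^{m-1} log_i N) * (log_m N)^(1 + eps); needs log_m N > 0."""
    product = 1.0
    for i in range(1, m):
        product *= iterated_log(i, n)
    return n * product * iterated_log(m, n) ** (1.0 + eps)


n = 10 ** 6
print(psi(1, 0.1, n), psi(2, 0.1, n))
```

Increasing $m$ gives a slower growing normalisation only for sufficiently large $N$; for moderate $N$ and small $\varepsilon$ the ordering can be reversed, so the comparison is genuinely asymptotic.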
Under the extra condition that $\nu$ is strictly aperiodic, we will approximate $\sum_{k=1}^{N} f(S_k)$ by a sum of independent random variables. Following Strassen [15], we can even construct an almost sure approximation by a Wiener process. To state the precise form of this result, let us introduce the following technical definition. Given a function $E(t)$ positive on $(t_0, \infty)$ for some $t_0$, we shall say that two stochastic processes $Y(t)$ and $Z(t)$ in the Skorokhod space $D[0, \infty)$, possibly defined on different probability spaces, are $o(E(t))$-equivalent if there exist finitely many processes $Y_1(t), Y_2(t), \dots, Y_m(t)$ in $D[0, \infty)$ such that $Y_1(t) = Y(t)$, $Y_m(t) = Z(t)$, and for all $1 \le i \le m - 1$ one of the following holds:
(i) The processes $Y_i(t)$ and $Y_{i+1}(t)$, possibly defined on different probability spaces, have the same distribution.
(ii) The processes $Y_i(t)$ and $Y_{i+1}(t)$ are defined on the same probability space, and $\lim_{t \to \infty} (Y_i(t) - Y_{i+1}(t)) / E(t) = 0$ a.s.
Roughly speaking (ignoring the different probability spaces), $Y(t)$ and $Z(t)$ being $o(E(t))$-equivalent thus means $Y(t) = Z(t) + o(E(t))$ a.s. Given $f \in L^2(G)$ with $\int_G f \, d\mu = 0$, let
$$C(f, \nu) = \mathbb{E} f(U)^2 + 2 \sum_{k=1}^{\infty} \mathbb{E} f(U) f(U S_k), \tag{5}$$
where $U$ is a uniformly distributed $G$-valued random variable, independent of $X_1, X_2, \dots$. As we shall see, the series in (5) is absolutely convergent, and $C(f, \nu) \ge 0$.
Theorem 4.
Suppose that $\nu$ is adapted and strictly aperiodic, and that $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Let $f : G \to \mathbb{R}$ be Borel measurable such that $\int_G f \, d\mu = 0$. If $\sup_{c \in G} \mathbb{E} |f(c X_1)|^{2 + \delta} < \infty$ for some $0 < \delta < 2$, then the processes $\sum_{k \le t} f(S_k)$ and $\sqrt{C(f, \nu)} \, W(t)$ are $o(t^{1/2 - \delta/20})$-equivalent, where $W(t)$ is a standard Wiener process.
The almost sure approximation by a Wiener process yields even more precise asymptotics than those in Theorem 3; for instance, it shows that for strictly aperiodic $\nu$ the value of the limsup in (4) is $\sqrt{2 C(f, \nu)}$. The almost sure asymptotics as well as the limit distribution of continuous functionals of the process $\sum_{k \le t} f(S_k)$ also follow. Instead of the random step functions $\sum_{k \le t} f(S_k)$, we could have also used the piecewise linear functions $\sum_{k \le \lfloor t \rfloor} f(S_k) + (t - \lfloor t \rfloor) f(S_{\lfloor t \rfloor + 1})$. In that case the $o(t^{1/2 - \delta/20})$-equivalence holds in the space $C[0, \infty)$ as well.
In Theorem 4 in general we only know $C(f, \nu) \ge 0$; in the case $C(f, \nu) = 0$ the result simply states that $\sum_{k \le t} f(S_k) = o(t^{1/2 - \delta/20})$ a.s. The natural question whether $C(f, \nu) > 0$ is surprisingly delicate. We shall prove a necessary and sufficient condition in terms of irreducible unitary representations of $G$, see Proposition 7 below. As this condition is rather cumbersome to use, we also give simpler criteria to ensure $C(f, \nu) > 0$. In particular, we will show that under mild technical assumptions (e.g. $f$ is a class function or $\nu$ is a central measure) we have $C(f, \nu) = 0$ if and only if $f = 0$ $\mu$-a.e. We work out the details in Section 3.
It clearly follows from Theorem 4 that $N^{-1/2} \sum_{k=1}^{N} f(S_k)$ has a (possibly degenerate) Gaussian limit distribution. Under slightly weaker assumptions than those in Theorem 4 we also prove a Lyapunov-type bound on the remainder term in the central limit theorem. Let $\Phi(x) = \int_{-\infty}^{x} (2\pi)^{-1/2} e^{-t^2/2} \, dt$ denote the standard normal distribution function.
Theorem 5.
Suppose that $\nu$ is adapted and strictly aperiodic, and that $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Let $f : G \to \mathbb{R}$ be Borel measurable such that $\int_G f \, d\mu = 0$, and assume $f \in L^{2 + \delta}(G)$ for some $0 < \delta \le 1$ and $\sup_{c \in G} \mathbb{E} f(c X_1)^2 < \infty$.
(i) If $C(f, \nu) > 0$, then for any integer $N \ge N_0(f, \nu, \delta)$
[TABLE]
where $K = \Delta \left( \|f\|_{2 + \delta} / \sqrt{C(f, \nu)} \right)^{(2 + \delta)/(1 + \delta)}$.
(ii) If $C(f, \nu) = 0$, then $N^{-1/2} \sum_{k=1}^{N} f(S_k) \to 0$ in $L^2$.
Remark.
If $f \in L^3(G)$, then the right hand side of (6) becomes $K N^{-1/4} \log^{1/2} N$ with $K = \Delta \|f\|_3^{3/2} / C(f, \nu)^{3/4}$. We mention that if $f \in L^4(G)$, then here $\|f\|_3$ can be replaced by $\|f\|_2$ (see the end of Section 5.2). As we will see in Proposition 8 below, if $f$ is a class function or $\nu$ is a central measure, then $C(f, \nu) \ge \|f\|_2^2 / (2\Delta)$. Thus in this case $K \le 2 \Delta^{7/4}$, so the right hand side of (6) does not depend on $f$. ($N_0(f, \nu, \delta)$ always depends on $f$, however.)
3 Moment estimates
Throughout this section we assume that $\nu$ is adapted and strictly aperiodic, and that $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Further, we fix a Borel measurable function $f : G \to \mathbb{R}$ such that $\int_G f \, d\mu = 0$, and a uniformly distributed $G$-valued random variable $U$ independent of $X_1, X_2, \dots$. We now prove moment estimates for the modified sum $\sum_{k=1}^{N} f(U S_k)$. In Section 4 we shall give the counterparts of these estimates for shifted sums $\sum_{k=M+1}^{M+N} f(S_k)$.
For every nonempty, finite interval of positive integers $J \subset \mathbb{N}$ let $S_J = \prod_{j \in J} X_j$. Note that $S_J$ has distribution $\nu^{*|J|}$, hence by definition for any bounded, Borel measurable function $g : G \to \mathbb{R}$
$$\left| \mathbb{E} g(S_J) - \int_G g \, d\mu \right| \le \sup_G |g| \cdot \Delta_{|J|}. \tag{7}$$
Proposition 6.
Assume $f \in L^2(G)$. The series in (5) is absolutely convergent, and $0 \le C(f, \nu) \le \|f\|_2^2 \Delta$. Further, for any integer $N \ge 1$
[TABLE]
Proof.
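Let $A_k = \mathbb{E} f(U) f(U S_k)$. Since $U$ is independent of $S_k$, we have
$$A_k = \int_G \int_G f(u) f(ux) \, d\mu(u) \, d\nu^{*k}(x). \tag{8}$$
The function $g(x) = \int_G f(u) f(ux) \, d\mu(u)$ satisfies $\int_G g \, d\mu = 0$ and $\sup_G |g| \le \|f\|_2^2$. Applying (7) to $g$ we thus obtain $|A_k| \le \|f\|_2^2 \Delta_k$. Since $\Delta_k \to 0$ exponentially fast, the series in (5) is absolutely convergent. As $\mathbb{E} f(U)^2 = \|f\|_2^2$, we have $C(f, \nu) \le \|f\|_2^2 \Delta$. Finally, $C(f, \nu) \ge 0$ will follow from the second claim.
Expanding the square we have
$$\mathbb{E} \left( \sum_{k=1}^{N} f(U S_k) \right)^2 = \sum_{k=1}^{N} \mathbb{E} f(U S_k)^2 + 2 \sum_{1 \le k < \ell \le N} \mathbb{E} f(U S_k) f(U S_\ell). \tag{9}$$
Let us write $U S_\ell = U S_k S_{[k+1, \ell]}$. Since $\mu * \nu^{*k} = \mu$, the variable $U S_k$ is uniformly distributed on $G$ and independent of $S_{[k+1, \ell]}$; moreover, $S_{[k+1, \ell]} \overset{d}{=} S_{\ell - k}$. Thus $\mathbb{E} f(U S_k)^2 = \mathbb{E} f(U)^2$ and $\mathbb{E} f(U S_k) f(U S_\ell) = \mathbb{E} f(U) f(U S_{\ell - k})$, so (9) simplifies as
$$\mathbb{E} \left( \sum_{k=1}^{N} f(U S_k) \right)^2 = N \, \mathbb{E} f(U)^2 + 2 \sum_{d=1}^{N-1} (N - d) A_d.$$
The second claim thus follows from $|A_d| \le \|f\|_2^2 \Delta_d$ and the fact that $\Delta_d \to 0$ exponentially fast.
∎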
We now study the question whether the normalizing factor $C(f, \nu)$ in the variance is zero or positive. To this end, we derive an alternative formula for $C(f, \nu)$ in the form of an infinite series with nonnegative terms. Next, we will consider the special case when $f$ is a class function or $\nu$ is a central measure. As we shall see, the behavior of $C(f, \nu)$ then simplifies considerably, allowing for effective lower bounds.
Proposition 7.
Assume $f \in L^2(G)$. We have
[TABLE]
where $B_\nu(\pi) = I_{d_\pi} + (I_{d_\pi} - \hat{\nu}(\pi))^{-1} \hat{\nu}(\pi) + (I_{d_\pi} - \hat{\nu}(\pi)^*)^{-1} \hat{\nu}(\pi)^*$ and $I_{d_\pi}$ denotes the $d_\pi \times d_\pi$ identity matrix. The series in (10) has nonnegative terms and is convergent. In particular, $C(f, \nu) > 0$ if and only if at least one term in (10) is nonzero.
Proof.
Let $A_k = \mathbb{E} f(U) f(U S_k)$, and recall (8). The function $h_k(u) = \int_G f(ux) \, d\nu^{*k}(x)$ is in $L^2(G)$, and its Fourier coefficients are $\hat{h}_k(\pi) = \hat{f}(\pi) (\hat{\nu}(\pi)^k)^*$. Applying the Parseval formula in (8) we thus obtain
justifying a change in the order of summation. Since $\rho(\hat{\nu}(\pi)) \le q < 1$, we have $\sum_{k=1}^{\infty} \hat{\nu}(\pi)^k = (I_{d_\pi} - \hat{\nu}(\pi))^{-1} \hat{\nu}(\pi)$ in operator norm. We thus obtain
[TABLE]
As $C(f, \nu)$ is clearly real, we can take the real part of the series in the previous line, resulting in formula (10).
Finally, we prove that every term in (10) is nonnegative. Fix $\pi \in \widehat{G} \setminus \{\pi_0\}$. First, suppose that $\pi$ and $\bar{\pi}$ are unitarily inequivalent. Then we may assume that $\pi, \bar{\pi} \in \widehat{G}$. Since $f$ and $\nu$ are real-valued, we have $\hat{f}(\bar{\pi}) = \overline{\hat{f}(\pi)}$ and $\hat{\nu}(\bar{\pi}) = \overline{\hat{\nu}(\pi)}$. Hence $B_\nu(\bar{\pi}) = \overline{B_\nu(\pi)}$, and the terms in (10) indexed by $\pi$ and $\bar{\pi}$ are equal. Let $F$ be the orthogonal projection of $f$ in $L^2(G)$ to the linear subspace spanned by the matrix elements $\{\pi_{ij} : 1 \le i, j \le d_\pi\} \cup \{\bar{\pi}_{ij} : 1 \le i, j \le d_\pi\}$; that is, $F(x) = d_\pi \operatorname{tr}(\hat{f}(\pi) \pi(x)) + d_\pi \operatorname{tr}(\hat{f}(\bar{\pi}) \bar{\pi}(x))$. Note that $F$ is real-valued, $\hat{F}(\pi) = \hat{f}(\pi)$, $\hat{F}(\bar{\pi}) = \hat{f}(\bar{\pi})$ and $\hat{F}(\pi') = 0$ for all $\pi' \neq \pi, \bar{\pi}$. Therefore the terms in (10) indexed by $\pi$ and $\bar{\pi}$ are both $C(F, \nu)/2$. But $C(F, \nu) \ge 0$ from Proposition 6, and we are done. Next, suppose that $\pi$ and $\bar{\pi}$ are unitarily equivalent. Let $F$ be the orthogonal projection of $f$ in $L^2(G)$ to the linear subspace spanned by the matrix elements $\{\pi_{ij} : 1 \le i, j \le d_\pi\}$; note that $\{\bar{\pi}_{ij} : 1 \le i, j \le d_\pi\}$ span the same linear subspace. Thus $F(x) = d_\pi \operatorname{tr}(\hat{f}(\pi) \pi(x)) = d_\pi \operatorname{tr}(\hat{f}(\bar{\pi}) \bar{\pi}(x))$. Again, $F$ is real-valued, $\hat{F}(\pi) = \hat{f}(\pi)$ and $\hat{F}(\pi') = 0$ for all $\pi' \neq \pi$. Therefore the term in (10) indexed by $\pi$ is $C(F, \nu) \ge 0$.
∎
Proposition 8.
Assume $f \in L^2(G)$, and let $\nu^{*}(B) = \nu(B^{-1})$ ($B \subseteq G$ Borel) denote the distribution of $X_1^{-1}$. Suppose at least one of the following holds.
(i) $f$ is a class function
(ii) $\nu * \nu^{*} = \nu^{*} * \nu$
(iii) $\nu$ is a central measure
Then $\frac{1 - q}{1 + q} \|f\|_2^2 \le C(f, \nu) \le \frac{1 + q}{1 - q} \|f\|_2^2$. In particular, $C(f, \nu) = 0$ if and only if $f = 0$ $\mu$-a.e.
Proof.
First, assume (i). It follows from Schur's lemma that $\hat{f}(\pi)$ is a scalar multiple of the identity matrix. Hence (10) simplifies as
[TABLE]
Let $\lambda_1, \lambda_2, \dots, \lambda_{d_\pi}$ denote the eigenvalues of $\hat{\nu}(\pi)$. Then
[TABLE]
Since $\rho(\hat{\nu}(\pi)) \le q < 1$, we have $|\lambda_i| \le q$, and so $d_\pi \frac{1 - q}{1 + q} \le \operatorname{tr} B_\nu(\pi) \le d_\pi \frac{1 + q}{1 - q}$. The claim thus follows from the Parseval formula.
Next, assume (ii). Since $\widehat{\nu^{*}}(\pi) = \hat{\nu}(\pi)^*$, the condition $\nu * \nu^{*} = \nu^{*} * \nu$ implies that the matrix $\hat{\nu}(\pi)$ is normal. Therefore there exists an orthonormal basis $v_1, v_2, \dots, v_{d_\pi}$ of $\mathbb{C}^{d_\pi}$ comprised of eigenvectors of $\hat{\nu}(\pi)$; say, $\hat{\nu}(\pi) v_i = \lambda_i v_i$. It follows that $\hat{\nu}(\pi)^* v_i = \overline{\lambda_i} v_i$, and hence
[TABLE]
The eigenvalues of $B_\nu(\pi)$ again satisfy $\frac{1 - q}{1 + q} \le \frac{1 - |\lambda_i|^2}{|1 - \lambda_i|^2} \le \frac{1 + q}{1 - q}$. It is now easy to see that
[TABLE]
and the claim follows. Finally, note that condition (iii) implies condition (ii).
∎
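On a finite cyclic group the variance constant can be computed exactly in two ways, which gives a quick sanity check of the bounds in Proposition 8. The sketch below assumes that the series defining $C(f, \nu)$ is $\mathbb{E} f(U)^2 + 2 \sum_{k \ge 1} \mathbb{E} f(U) f(U S_k)$ (as the proof of Proposition 6 suggests), and uses the eigenvalue expression $(1 - |\lambda|^2)/|1 - \lambda|^2$ from the proof above with one-dimensional characters. The group $\mathbb{Z}_7$, the step distribution, the function $f$ and all helper names are hypothetical choices.

```python
import cmath


def char(j, x, n):
    """Character chi_j(x) = exp(2*pi*i*j*x/n) of the cyclic group Z_n."""
    return cmath.exp(2j * cmath.pi * j * x / n)


def variance_constant_series(f, nu, terms):
    """Truncation of E f(U)^2 + 2 * sum_k E f(U) f(U + S_k) on Z_n."""
    n = len(f)
    # g(x) = E f(U) f(U + x) for U uniform on Z_n
    g = [sum(f[u] * f[(u + x) % n] for u in range(n)) / n for x in range(n)]
    total = g[0]
    power = nu[:]  # distribution of S_k, starting with k = 1
    for _ in range(terms):
        total += 2.0 * sum(power[x] * g[x] for x in range(n))
        power = [sum(power[y] * nu[(x - y) % n] for y in range(n)) for x in range(n)]
    return total


def variance_constant_fourier(f, nu):
    """Sum over nontrivial characters of |f_hat|^2 * (1 - |lam|^2) / |1 - lam|^2."""
    n = len(f)
    total = 0.0
    for j in range(1, n):
        f_hat = sum(f[x] * char(j, x, n).conjugate() for x in range(n)) / n
        lam = sum(nu[x] * char(j, x, n).conjugate() for x in range(n))
        total += abs(f_hat) ** 2 * (1.0 - abs(lam) ** 2) / abs(1.0 - lam) ** 2
    return total


n = 7
nu = [0.2, 0.5, 0.0, 0.3, 0.0, 0.0, 0.0]     # hypothetical step distribution
f = [3.0, -1.0, 0.5, -2.0, 1.5, -1.5, -0.5]  # hypothetical function with mean zero
c_series = variance_constant_series(f, nu, terms=400)
c_fourier = variance_constant_fourier(f, nu)
q = max(abs(sum(nu[x] * char(j, x, n).conjugate() for x in range(n))) for j in range(1, n))
norm_sq = sum(v * v for v in f) / n  # squared L^2 norm with respect to the uniform measure
print(c_series, c_fourier, q)
```

The two values agree up to rounding and lie between $\frac{1-q}{1+q} \|f\|_2^2$ and $\frac{1+q}{1-q} \|f\|_2^2$, as expected for an abelian group, where condition (ii) holds automatically.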
We conclude this section with an estimate of the $L^p$-norm for $1 \le p \le 4$. These estimates, combined with the Erdős–Stechkin and the Rademacher–Menshov inequalities will help us bound the fluctuations of $\sum_{k=1}^{N} f(S_k)$ as $N$ runs in a short interval. Additionally, we will also use them to verify the Lyapunov condition in the proof of Theorem 5.
Proposition 9.
Assume $f \in L^p(G)$ for some $1 \le p \le 4$. For any integer $N \ge 1$
[TABLE]
In the case p=4 we also have
[TABLE]
Proof.
First, assume $p = 4$. Expanding the fourth power we get
[TABLE]
Fix $1 \le k_1 \le k_2 \le k_3 \le k_4 \le N$. Since $U S_{k_1}$ is uniformly distributed on $G$ and independent of $X_{k_1 + 1}, X_{k_1 + 2}, \dots$, we have
[TABLE]
Here we use the convention that $\nu^{*0}$ is the Dirac measure concentrated on the identity element of $G$, and $\Delta_0 = \|\nu^{*0} - \mu\|_{\mathrm{TV}} \le 2$. Let
[TABLE]
As $\int_G g(z) \, d\mu(z) = 0$, applying (7) to $g$ we obtain
[TABLE]
Fix $z \in G$, and let $h_z(y) = \int_G \int_G f(u) f(ux) f(uxy) f(uxyz) \, d\mu(u) \, d\nu^{*(k_2 - k_1)}(x)$. Note that
[TABLE]
where $w_z = \int_G f(uxy) f(uxyz) \, d\mu(y) = \int_G f(y) f(yz) \, d\mu(y)$ does not depend on $u$ and $x$. Applying (7) to $h_z$ we get
[TABLE]
Here $|w_z| \le \|f\|_2^2$, and the double integral in the previous line is $\le \|f\|_2^2 \Delta_{k_2 - k_1}$, as seen in the proof of Proposition 6. Hence
[TABLE]
Now fix $y, z \in G$, and let $r_{y,z}(x) = \int_G f(u) f(ux) f(uxy) f(uxyz) \, d\mu(u)$. Note that $\sup_G |r_{y,z}| \le \|f\|_4^4$, and that $\int_G r_{y,z}(x) \, d\mu(x) = 0$. Applying (7) we thus get
Summing over $1 \le k_1 \le k_2 \le k_3 \le k_4 \le N$, (12) follows.
On the other hand, we can use $\Delta_{k_3 - k_2} \le 2$ to deduce the simpler estimate
[TABLE]
and by summing over $1 \le k_1 \le k_2 \le k_3 \le k_4 \le N$ we get $\left\| \sum_{k=1}^{N} f(U S_k) \right\|_4 \ll \|f\|_4 \sqrt{\Delta N}$. Proposition 6 shows that if $f \in L^2(G)$, the same estimate holds with $\|\cdot\|_4$ replaced by $\|\cdot\|_2$ on both sides. Moreover, we also have the trivial estimate $\left\| \sum_{k=1}^{N} f(U S_k) \right\|_1 \le \|f\|_1 N$ for any $f \in L^1(G)$. This settles the endpoints of the intervals $1 \le p \le 2$ and $2 \le p \le 4$. The cases $1 < p < 2$ and $2 < p < 4$ follow from the Riesz–Thorin interpolation theorem applied to the linear operator $f \mapsto \sum_{k=1}^{N} f(U S_k) - N \int_G f \, d\mu$.
∎
4 Approximation by independent variables
Assume again that $\nu$ is adapted and strictly aperiodic, and that $(\nu^{*k})_{\mathrm{abs}} \neq 0$ for some $k \ge 1$. Fix a Borel measurable function $f : G \to \mathbb{R}$ such that $\int_G f \, d\mu = 0$. In this section we approximate the shifted sum $\sum_{k=M+1}^{M+N} f(S_k)$ by a sum of independent random variables. The main tool of this approximation is a coupling between $\nu^{*k}$ and $\mu$, which we will construct using Strassen's theorem. We mention that this is the only step of the proof where we use the fact that $G$ is metrizable. A similar approach was used by Schatte [14] on the circle group $G = \mathbb{R}/\mathbb{Z}$, with a different type of coupling based on the Kolmogorov metric instead of the total variation metric.
Recall that for any two Borel probability measures $\nu_1$ and $\nu_2$ on $G$ (or indeed, on any Polish space) we have $\|\nu_1 - \nu_2\|_{\mathrm{TV}} = 2 \inf_\vartheta \vartheta(\{(x, y) \in G \times G : x \neq y\})$, where the infimum is over all Borel probability measures $\vartheta$ on $G \times G$ whose marginals are $\vartheta(B \times G) = \nu_1(B)$ and $\vartheta(G \times B) = \nu_2(B)$. This fact follows from Strassen's theorem, which in turn is a special case of the Kantorovich duality theorem in the theory of optimal transportation (see e.g. [18, Chapter 1]). In particular, for any $k \ge 1$ there exists a Borel probability measure $\vartheta_k$ on $G \times G$ with marginals $\nu^{*k}$ and $\mu$ such that $\vartheta_k(\{(x, y) \in G \times G : x \neq y\}) \le \Delta_k$. After a suitable extension of the probability space, for any nonempty, finite interval of positive integers $J \subset \mathbb{N}$ we may therefore introduce auxiliary $G$-valued random variables $T_J, U_J$ whose joint distribution is $\vartheta_{|J|}$; that is, $T_J \overset{d}{=} S_J$, $U_J$ is uniformly distributed on $G$, and $\Pr(T_J \neq U_J) \le \Delta_{|J|}$. Moreover, we may assume $(T_J, U_J)$, $J \subset \mathbb{N}$ and $X_1, X_2, \dots$ are independent. The independence of the approximating variables will follow from the following observation.
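The coupling characterisation of the total variation distance recalled above is elementary on a finite set, where an optimal coupling can be written down explicitly. The following sketch builds such a coupling; the function name `maximal_coupling` and the two example distributions are assumptions made for illustration, and the construction shown is the standard one for finite sets, not necessarily the one used in the paper.

```python
from fractions import Fraction


def maximal_coupling(p, r):
    """Joint law on {0..n-1}^2 with marginals p and r minimising Pr(X != Y)."""
    n = len(p)
    overlap = [min(p[x], r[x]) for x in range(n)]
    deficit = 1 - sum(overlap)  # equals Pr(X != Y) under this coupling
    joint = [[Fraction(0)] * n for _ in range(n)]
    for x in range(n):
        joint[x][x] = overlap[x]  # keep as much mass as possible on the diagonal
    if deficit > 0:
        for x in range(n):
            for y in range(n):
                # Leftover mass of p and r is paired independently off the diagonal.
                joint[x][y] += (p[x] - overlap[x]) * (r[y] - overlap[y]) / deficit
    return joint


p = [Fraction(1, 2), Fraction(1, 2), Fraction(0), Fraction(0)]  # hypothetical law
mu = [Fraction(1, 4)] * 4                                       # uniform law
joint = maximal_coupling(p, mu)
off_diagonal = sum(joint[x][y] for x in range(4) for y in range(4) if x != y)
tv = sum(abs(p[x] - mu[x]) for x in range(4))  # normalisation 2 * sup_B |p(B) - mu(B)|
print(off_diagonal, tv)
```

With the normalisation of the total variation distance used here, the off-diagonal mass of this coupling is exactly half of the distance, which is in particular at most the distance itself.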
Lemma 1.
Let $G$ be a compact metrizable group, and let $(S, \mathcal{A})$ be a measurable space. Let $U$ be a $G$-valued, and let $V$ be an $S$-valued random variable. If $U$ and $V$ are independent and $U$ is uniformly distributed on $G$, then for any Borel measurable function $g : S \to G$ the variables $g(V) U$ and $V$ are also independent.
Proof.
Note that $g(V) U$ is uniformly distributed on $G$. Let $\gamma$ denote the distribution of $V$. For any Borel set $B \subseteq G$ and any $A \in \mathcal{A}$ we have
[TABLE]
∎
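Lemma 1 can be verified exhaustively in a small finite case. The sketch below takes $G = \mathbb{Z}_4$ (written additively), a three-point state space, and a hypothetical law for $V$ and map $g$; it computes the joint law of $(g(V) + U, V)$ with exact rational arithmetic and checks that it is the product of its marginals.

```python
from fractions import Fraction
from itertools import product

n = 4                                   # cyclic group Z_4, written additively
v_law = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}  # hypothetical law of V
g = {"a": 1, "b": 3, "c": 2}            # hypothetical map g from the state space to Z_4

# Joint law of (g(V) + U, V) with U uniform on Z_4 and independent of V.
joint = {}
for (v, pv), u in product(v_law.items(), range(n)):
    key = ((g[v] + u) % n, v)
    joint[key] = joint.get(key, Fraction(0)) + pv * Fraction(1, n)

# Independence means the joint law is the product of its marginals.
factorises = all(
    joint.get((x, v), Fraction(0)) == Fraction(1, n) * v_law[v]
    for x in range(n) for v in v_law
)
print(factorises)
```

The check succeeds because, for each fixed value of $V$, translation by $g(V)$ is a bijection of the group and therefore preserves the uniform distribution.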
We construct the approximating variables as follows. Fix an integer $M \ge 0$, and let us decompose the infinite set $\{M + 1, M + 2, \dots\}$ into consecutive, nonempty, finite intervals of integers $H_1, J_1, H_2, J_2, \dots$. For all $i \ge 1$ and $k \in J_i$ let
[TABLE]
Similarly, for all $i \ge 2$ and $k \in H_i$ let
[TABLE]
Note that here the case $i = 1$ is excluded to ensure that $H_i$ is preceded by an interval $J_{i-1}$. Let us also introduce the variables
[TABLE]
Observe that the random sequence $W_k$, $k \in \bigcup_{i=1}^{\infty} J_i$ has the same distribution as $S_k$, $k \in \bigcup_{i=1}^{\infty} J_i$. Similarly, $W_k$, $k \in \bigcup_{i=2}^{\infty} H_i$ has the same distribution as $S_k$, $k \in \bigcup_{i=2}^{\infty} H_i$.
For every $R \ge 1$ let $N_R$ be such that $M + N_R = \max J_R$. Then $\sum_{k=M+1}^{M+N} f(S_k)$ along the subsequence $N_R$ satisfies
[TABLE]
Here the sequence $\sum_{i=1}^{R} \sum_{k \in J_i} f(S_k)$, $R = 1, 2, \dots$ has the same distribution as $\sum_{i=1}^{R} Y_i$, $R = 1, 2, \dots$; similarly, the sequence $\sum_{i=2}^{R} \sum_{k \in H_i} f(S_k)$, $R = 2, 3, \dots$ has the same distribution as $\sum_{i=2}^{R} Z_i$, $R = 2, 3, \dots$. The main idea is to replace $Y_i$ by $Y_i^*$, and $Z_i$ by $Z_i^*$. First, we establish the properties of the approximating variables $Y_i^*$ and $Z_i^*$, then we estimate the error committed.
Lemma 2.
$Y_1^*, Y_2^*, \dots$ are independent, and $\mathbb{E} Y_i^* = 0$.
(i) If $f \in L^2(G)$, then $\mathbb{E} (Y_i^*)^2 = C(f, \nu) |J_i| + O_\nu(\|f\|_2^2)$.
(ii) If $f \in L^p(G)$ for some $1 \le p \le 4$, then for any $0 \le R < S$
[TABLE]
In the case $p = 4$ we also have
[TABLE]
The same hold for $Z_2^*, Z_3^*, \dots$ with $|J_i|$ replaced by $|H_i|$.
Proof.
To see that $Y_1^*, Y_2^*, \dots$ are independent, it will be enough to prove that $Y_i^*$ is independent of the random vector $(Y_1^*, Y_2^*, \dots, Y_{i-1}^*)$ for all $i \ge 2$. Let $W$ be the random vector whose coordinates are the variables $X_k$, $k \in [1, M] \cup J_1 \cup \cdots \cup J_{i-1}$ and $T_{H_1}, U_{H_1}, T_{H_2}, U_{H_2}, \dots, T_{H_{i-1}}, U_{H_{i-1}}$. Further, let $W'$ be the random vector with coordinates $X_k$, $k \in J_i$. Applying Lemma 1 to $V = (W, W')$ and $U = U_{H_i}$ we get that $(W, W')$ and $g(W, W') U_{H_i}$ are independent for any Borel measurable function $g$. But $W$ and $W'$ are also independent, therefore $W$, $W'$, $g(W, W') U_{H_i}$ are independent as well. Note that $(Y_1^*, Y_2^*, \dots, Y_{i-1}^*)$ is a function of $W$, whereas $Y_i^*$ is a function of $W'$ and $g(W, W') U_{H_i}$ for some $g$ (in fact, $g(W, W')$ is simply the product of certain components of $W$). The independence thus follows.
Now fix $i \ge 1$. Note that $S_M \prod_{j=1}^{i-1} (T_{H_j} S_{J_j}) U_{H_i}$ is uniformly distributed on $G$ and independent of $X_k$, $k \in J_i$. Hence $Y_i^* = \sum_{k \in J_i} f(W_k^*) \overset{d}{=} \sum_{k=1}^{|J_i|} f(U S_k)$. Here $U S_k$ is uniformly distributed on $G$; in particular, $\mathbb{E} Y_i^* = \sum_{k=1}^{|J_i|} \mathbb{E} f(U S_k) = 0$.
Claim (i) follows from Proposition 6. Now fix $0 \le R < S$, and let us prove (ii). The case $p = 1$ follows from $\|Y_i^*\|_1 \le \sum_{k=1}^{|J_i|} \|f(U S_k)\|_1 = \|f\|_1 |J_i|$. If $p = 2$, Proposition 6 gives $\|Y_i^*\|_2 = \left\| \sum_{k=1}^{|J_i|} f(U S_k) \right\|_2 \le \|f\|_2 \sqrt{\Delta |J_i|}$, hence the claim follows from independence. Now assume $p = 4$. The independence of $Y_1^*, Y_2^*, \dots$ implies
[TABLE]
Proposition 9 shows $\|Y_i^*\|_4 = \left\| \sum_{k=1}^{|J_i|} f(U S_k) \right\|_4 \ll \|f\|_2 \sqrt{\Delta |J_i|} + \|f\|_4 \Delta^{3/4} |J_i|^{1/4}$, yielding (17). On the other hand, Proposition 9 also gives $\|Y_i^*\|_4 \ll \|f\|_4 \sqrt{\Delta |J_i|}$, and so $\left\| \sum_{i=R+1}^{S} Y_i^* \right\|_4 \ll \|f\|_4 \sqrt{\Delta \sum_{i=R+1}^{S} |J_i|}$ follows as well. This settles the endpoints of the intervals $1 \le p \le 2$ and $2 \le p \le 4$.
Observe that for a given integer $M \ge 0$, given intervals $H_1, J_1, \dots$ and given $0 \le R < S$, the sum $\sum_{i=R+1}^{S} Y_i^*$ is linear in $f$. Applying the Riesz–Thorin interpolation theorem to the linear operator $f \mapsto \sum_{i=R+1}^{S} Y_i^* - \left( \sum_{i=R+1}^{S} |J_i| \right) \int_G f \, d\mu$, the cases $1 < p < 2$ and $2 < p < 4$ follow. The proof for $Z_2^*, Z_3^*, \dots$ is analogous.
∎
Lemma 3.
If $L_p = \sup_{c \in G} \mathbb{E} |f(c X_1)|^p < \infty$ for some $p \ge 1$, then $\|Y_i - Y_i^*\|_p \le 2 |J_i| (L_p \Delta_{|H_i|})^{1/p}$ and $\|Z_i - Z_i^*\|_p \le 2 |H_i| (L_p \Delta_{|J_i|})^{1/p}$.
Proof.
We have $\|Y_i - Y_i^*\|_p \le \sum_{k \in J_i} \|f(W_k) - f(W_k^*)\|_p$. Let $\mathcal{F}$ be the $\sigma$-algebra generated by $S_M \prod_{j=1}^{i-1} (T_{H_j} S_{J_j})$, $T_{H_i}$, $U_{H_i}$ and $X_\ell$, $\ell \in J_i$, $\ell < k$. Then $W_k = a X_k$ and $W_k^* = a^* X_k$ with some $\mathcal{F}$-measurable random variables $a, a^*$. Note that if $T_{H_i} = U_{H_i}$, then $W_k = W_k^*$. Therefore $\mathbb{E}\left( |f(W_k) - f(W_k^*)|^p \mid \mathcal{F} \right) \le 2^p L_p I_{\{T_{H_i} \neq U_{H_i}\}}$. Taking the (total) expectation we get $\mathbb{E} |f(W_k) - f(W_k^*)|^p \le 2^p L_p \Pr(T_{H_i} \neq U_{H_i}) \le 2^p L_p \Delta_{|H_i|}$, and the result follows. The proof for $\|Z_i - Z_i^*\|_p$ is analogous.
∎
As a simple application of the approximating variables constructed above, we deduce moment estimates for shifted sums $\sum_{k=M+1}^{M+N} f(S_k)$ from the results of Section 3.
Corollary 10.
(i) If $\sup_{c \in G} \mathbb{E} f(c X_1)^2 < \infty$, then for any integers $M \ge 0$ and $N \ge 1$
[TABLE]
(ii) If $\sup_{c \in G} \mathbb{E} |f(c X_1)|^p < \infty$ for some $1 \le p \le 4$, then for any integers $M \ge 0$ and $N \ge 1$
[TABLE]
with some constant $K_{f, \nu, p} > 0$. In the case $p = 4$ we also have
[TABLE]
Proof.
We may assume that N is large enough in terms of f, Ξ½ and p. Let us decompose the index set [M+1,M+N] into two consecutive intervals of integers H1β and J1β such that β£H1ββ£=β4ΞlogNβ. We then have βk=M+1M+Nβf(Skβ)=βkβH1ββf(Skβ)+βkβJ1ββf(Skβ), where βkβJ1ββf(Skβ)=dY1β. To see (i), let us write
[TABLE]
By Lemma 2 (i), here $\|Y_1^*\|_2 = \sqrt{C(f,\nu) N} + O_{f,\nu}(1)$. Since, say, $\Delta_k^{1/k} \le (1+q)/2$ for $k \ge k_0(\nu)$ and $((1+q)/2)^{\Delta} \le ((1+q)/2)^{1/(1-q)} \le e^{-1/2}$, we have $\Delta_{|H_1|} \le N^{-2}$. Lemma 3 thus gives $\|Y_1 - Y_1^*\|_2 \ll_{f,\nu} 1$. Finally, note that $\sup_{c \in G} \mathbb{E} f(cX_1)^2 < \infty$ implies $\sup_{k \ge 1} \mathbb{E} f(S_k)^2 < \infty$. Hence $\sum_{k \in H_1} \|f(S_k)\|_2 \ll_{f,\nu} \log N$, and (i) follows. If we use Lemma 2 (ii) instead of Lemma 2 (i), similar arguments show (ii).
∎
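The decay bound used in this proof can be unpacked as follows. This is only a sketch, under the assumptions indicated in the proof: $\Delta_k$ denotes the sequence satisfying $\Delta_k \le ((1+q)/2)^k$ for $k \ge k_0(\nu)$, $\Delta \ge 1/(1-q)$ is the constant appearing in $|H_1| = \lceil 4\Delta \log N \rceil$, and $N$ is large enough that $|H_1| \ge k_0(\nu)$. The only outside ingredient is the elementary inequality $\log x \le x - 1$.

```latex
\begin{align*}
\log\frac{1+q}{2} &\le \frac{1+q}{2} - 1 = -\frac{1-q}{2}
  \quad\Longrightarrow\quad
  \Big(\frac{1+q}{2}\Big)^{1/(1-q)} \le e^{-1/2}, \\
\Delta_{|H_1|} &\le \Big(\frac{1+q}{2}\Big)^{|H_1|}
  \le \Big(\frac{1+q}{2}\Big)^{4\Delta \log N}
  \le \big(e^{-1/2}\big)^{4 \log N} = N^{-2}, \\
\|Y_1 - Y_1^*\|_2 &\le 2 |J_1| \big(L_2 \Delta_{|H_1|}\big)^{1/2}
  \le 2 N \cdot L_2^{1/2} N^{-1} = 2 L_2^{1/2}.
\end{align*}
```

The last line is Lemma 3 with $p = 2$ together with the trivial bound $|J_1| \le N$, which is why the approximation error is $O_{f,\nu}(1)$.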
Remark.
We could easily improve the error term $K_{f,\nu,p} \log(N+1)$ in (ii) by decomposing $[M+1, M+N]$ into more than 2 consecutive intervals of exponentially increasing sizes.
5 Proof of the theorems
We prove Theorem 3 (i) for strictly aperiodic measures and Theorem 4 in Section 5.1; the general case of Theorem 3 and Theorem 1 in Section 5.2; finally, Theorem 5 and the Remark thereafter, and Theorem 2 in Section 5.3.
5.1 Almost sure asymptotics, strictly aperiodic case
Suppose that $\nu$ is adapted and strictly aperiodic, and that $(\nu^{*k})_{\mathrm{abs}} \ne 0$ for some $k \ge 1$. Let $f : G \to \mathbb{R}$ be Borel measurable such that $\int_G f \, \mathrm{d}\mu = 0$, and assume $\sup_{c \in G} \mathbb{E}|f(cX_1)|^p < \infty$. In this section we prove the strong law of large numbers (3) in the case $1 \le p \le 2$, and the almost sure approximation by a Wiener process in Theorem 4 in the case $2 < p < 4$. For the sake of brevity, in the proofs of this section implied constants are allowed to depend on $f$, $\nu$ and $p$.
First, assume $1 \le p \le 2$. We start by estimating the fluctuations. Recall that $\log_m$ denotes the $m$-fold iterated logarithm.
Lemma 4.
For any integers $m \ge 1$, $M \ge 0$ and $N \ge N_0(f,\nu,p,m)$, and any $\lambda > \lambda_0(f,\nu,p,m)$
[TABLE]
Proof.
We use induction on $m$. Corollary 10 (ii) and the Rademacher–Menshov inequality [11, Theorem F] give $\big\| \max_{1 \le n \le N} \big| \sum_{k=M+1}^{M+n} f(S_k) \big| \big\|_p \ll N^{1/p} \log(N+1)$. The $m = 1$ case thus follows from the Markov inequality. Now assume the claim holds for some $m \ge 1$, and let us prove it for $m+1$. Let us decompose the index set $[M+1, M+N]$ into consecutive intervals of integers $H_1, J_1, H_2, J_2, \dots, H_R, J_R$, as in Section 4, such that $|H_i|, |J_i| \ge 4\Delta \log N$ for all $i$, and $R = \Theta(N/\log N)$. Similarly to the proof of Corollary 10 we have $\Delta_{|H_i|}, \Delta_{|J_i|} \le N^{-2}$. Let $M + n_r = \max J_r$, and recall that for any $2 \le r \le R$
[TABLE]
Here the variables $\sum_{i=2}^{r} \sum_{k \in J_i} f(S_k)$, $2 \le r \le R$ have the same joint distribution as $\sum_{i=2}^{r} Y_i$, $2 \le r \le R$; similarly, $\sum_{i=2}^{r} \sum_{k \in H_i} f(S_k)$, $2 \le r \le R$ have the same joint distribution as $\sum_{i=2}^{r} Z_i$, $2 \le r \le R$. Let us introduce the random events
[TABLE]
The event in the claim of the lemma is a subset of $A \cup B \cup \bigcup_{i=1}^{R} C_i$. Applying the inductive hypothesis on the interval $H_i \cup J_i$ of length $\ll \log N$ we get $\Pr(C_i) \ll \lambda^{-p} (\log N / N) (\log_m \log N)^p$, and hence $\Pr\big(\bigcup_{i=1}^{R} C_i\big) \ll \lambda^{-p} (\log_{m+1} N)^p$.
provided $\lambda$ is large enough. On the other hand, Lemma 3 gives $\mathbb{E}|Y_i - Y_i^*|^p \ll |J_i|^p \Delta_{|H_i|} \ll N^{-2} (\log N)^p$,
and thus
[TABLE]
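Therefore $\Pr\big( \sum_{i=2}^{R} |Y_i - Y_i^*| \ge (\lambda/8) N^{1/p} \big) \ll \lambda^{-p}$. This relation, together with (18) shows $\Pr(A) \ll \lambda^{-p}$. Repeating the same arguments for $Z_i$ and $Z_i^*$ we get $\Pr(B) \ll \lambda^{-p}$. Hence $\Pr\big( A \cup B \cup \bigcup_{i=1}^{R} C_i \big) \ll \lambda^{-p} (\log_{m+1} N)^p$, as claimed.
∎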
We are now ready to prove (3). Fix $m \ge 1$ and $\varepsilon > 0$. Let us decompose the set of positive integers into consecutive intervals of integers $H_1, J_1, H_2, J_2, \dots$, as in Section 4 (with the choice $M = 0$), such that, say, $|H_i| = |J_i| = i$ for all $i \ge 1$. Similarly to the proof of Corollary 10 we have $i \ge 16\Delta \log i$, and so $\Delta_{|H_i|} = \Delta_{|J_i|} \le i^{-8}$ for all integers $i$ large enough in terms of $\nu$.
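With this choice of block lengths the right endpoint of $J_R$ can be computed in closed form. A short check, assuming only that the blocks $H_1, J_1, H_2, J_2, \dots$ are consecutive and start at $1$ (the case $M = 0$):

```latex
\[
\max J_R \;=\; \sum_{i=1}^{R} \big( |H_i| + |J_i| \big)
         \;=\; \sum_{i=1}^{R} 2i
         \;=\; R(R+1) \;=\; \Theta(R^2).
\]
```

In particular, consecutive right endpoints differ by $2R$, the length of $H_R \cup J_R$.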
Consider (3) along the subsequence $N_R = \max J_R = \Theta(R^2)$. We have
[TABLE]
Here the sequences βi=2RββkβJiββf(Skβ) and βi=2RββkβHiββf(Skβ), R=2,3,β¦ have the same distribution as βi=2RβYiβ and βi=2RβZiβ, R=2,3,β¦, respectively. Using Lemma 2 (ii) we get βi=i0β(m)ββEβ£Yiβββ£p/(iΟm,Ξ΅β(i))βͺβi=i0β(m)ββ1/Οm,Ξ΅β(i)<β. By a classical form of the strong law of large numbers (see e.g.Β [12, p.Β 209]) and R1/pΟm,Ξ΅β(R)1/p=Ξ(Οm,Ξ΅β(NRβ)1/p), we have
[TABLE]
Lemma 3 gives $\mathbb{E}|Y_i - Y_i^*|^p \ll |J_i|^p \Delta_{|H_i|} \ll i^{p-8}$, and hence $\Pr(|Y_i - Y_i^*| \ge 1/i^2) \ll i^{3p-8} \le i^{-2}$. By the Borel–Cantelli lemma $\sum_{i=2}^{\infty} |Y_i - Y_i^*| < \infty$ a.s., and consequently (19) remains true if we replace $Y_i^*$ by $Y_i$. Repeating the same arguments for $Z_i$ and $Z_i^*$, we obtain (3) along the subsequence $N_R = \max J_R$.
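The probability bound in the preceding paragraph is an instance of the Markov inequality applied to the $p$-th moment. Spelled out as a sketch, using only the moment bound from Lemma 3 stated there and $1 \le p \le 2$:

```latex
\begin{align*}
\Pr\big( |Y_i - Y_i^*| \ge i^{-2} \big)
  &\le i^{2p} \, \mathbb{E}|Y_i - Y_i^*|^p
   \ll i^{2p} \cdot i^{p-8} = i^{3p-8} \le i^{-2}
   \qquad (p \le 2), \\
\sum_{i=2}^{\infty} \Pr\big( |Y_i - Y_i^*| \ge i^{-2} \big)
  &\ll \sum_{i=2}^{\infty} i^{-2} < \infty .
\end{align*}
```

By the Borel–Cantelli lemma, almost surely $|Y_i - Y_i^*| < i^{-2}$ for all but finitely many $i$, which gives the almost sure convergence of $\sum_i |Y_i - Y_i^*|$.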
On the other hand, applying Lemma 4 with $m+2$ on the interval $H_R \cup J_R$ of length $2R$, we get
[TABLE]
The BorelβCantelli lemma shows that with probability 1, for any Rβ₯1 and any NβHRββͺJRβ the fluctuation satisfies ββk=minHRβNβf(Skβ)ββͺΟβΟm+1,Ξ΅β(N)1/p with an implied constant depending on the point Ο of the probability space. Therefore (3) holds along all N. This finishes the proof of Theorem 3 (i) under the extra condition that Ξ½ is strictly aperiodic.
Assume $p = 2 + \delta$ for some $0 < \delta < 2$, and $C(f,\nu) > 0$. Let us decompose the set of positive integers into consecutive intervals of integers $H_1, J_1, H_2, J_2, \dots$, as in Section 4 (with the choice $M = 0$), such that $|H_i| = \lceil 22\Delta \log(i+1) \rceil$ and $|J_i| = \lceil i^{\delta/(4+2\delta)} \rceil$ for all $i \ge 1$. As before, we have $\Delta_{|H_i|} \le i^{-11}$ and $\Delta_{|J_i|} \le i^{-11}$ for all integers $i$ large enough in terms of $\nu$ and $\delta$.
Corollary 10 (ii) and the Erdős–Stechkin inequality [11, Theorem A] give $\big\| \max_{1 \le n \le N} \big| \sum_{k=M+1}^{M+n} f(S_k) \big| \big\|_{2+\delta} \ll \sqrt{N}$ for any $M \ge 0$ and $N \ge 1$. Therefore for any $R \ge 1$ we have
[TABLE]
The Borel–Cantelli lemma shows that with probability 1, for any $R \ge 1$ and any $N \in H_R \cup J_R$ the fluctuation satisfies $\big| \sum_{k=\min H_R}^{N} f(S_k) \big| \ll_\omega R^{1/2}$ with an implied constant depending on the point $\omega$ of the probability space. For any $t \ge 1$ let $R(t)$ denote the positive integer for which $\lfloor t \rfloor \in H_{R(t)} \cup J_{R(t)}$. Summing over $\min J_1 \le k \le \max J_{R(t)}$ instead of $1 \le k \le t$, we thus obtain
[TABLE]
Here $\sum_{i=1}^{R(t)} \sum_{k \in J_i} f(S_k) \overset{d}{=} \sum_{i=1}^{R(t)} Y_i$ and $\sum_{i=2}^{R(t)} \sum_{k \in H_i} f(S_k) \overset{d}{=} \sum_{i=2}^{R(t)} Z_i$ in the Skorokhod space $D[0,\infty)$. From Lemma 3 we get $\mathbb{E}|Y_i - Y_i^*|^{2+\delta} \ll |J_i|^{2+\delta} \Delta_{|H_i|} \ll i^{-11+\delta/2}$. Hence $\Pr(|Y_i - Y_i^*| \ge 1/i^2) \ll i^{-7+5\delta/2} \le i^{-2}$, so by the Borel–Cantelli lemma $\sum_{i=1}^{\infty} |Y_i - Y_i^*| < \infty$ a.s. Clearly the same holds for $Z_i - Z_i^*$.
By Lemma 2 we have $\|Z_i^*\|_{2+\delta} \ll \sqrt{\log i}$, and $\sum_{i=2}^{R} \mathbb{E}|Z_i^*|^2 = \Theta(R \log R)$. It follows (see e.g. [12, p. 246]) that $\sum_{i=2}^{R} Z_i^*$ satisfies the law of the iterated logarithm; in particular, $\big| \sum_{i=2}^{R} Z_i^* \big| \ll_\omega \sqrt{R \log R \log\log R}$ a.s. Note that $R(t)^{1/2} \ll t^{(2+\delta)/(4+3\delta)}$ and $(2+\delta)/(4+3\delta) < 1/2 - \delta/20$ whenever $0 < \delta < 2$. Thus the second double sum on the right hand side of (20) is $o(t^{1/2-\delta/20})$ a.s., and consequently the processes $\sum_{k \le t} f(S_k)$ and $\sum_{i=1}^{R(t)} Y_i^*$ are $o(t^{1/2-\delta/20})$-equivalent.
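The two exponent claims in the last paragraph follow from the block lengths by direct computation. A sketch, assuming only $|H_i| \ll \log(i+1)$ and $|J_i| = \lceil i^{\delta/(4+2\delta)} \rceil$ as chosen above:

```latex
\begin{align*}
\max J_R &= \sum_{i=1}^{R} \big( |H_i| + |J_i| \big)
          = \Theta\big( R^{1+\delta/(4+2\delta)} \big)
          = \Theta\big( R^{(4+3\delta)/(4+2\delta)} \big), \\
t = \Theta\big( R(t)^{(4+3\delta)/(4+2\delta)} \big)
  &\;\Longrightarrow\;
  R(t)^{1/2} = \Theta\big( t^{(2+\delta)/(4+3\delta)} \big), \\
\frac12 - \frac{2+\delta}{4+3\delta}
  &= \frac{\delta}{2(4+3\delta)} > \frac{\delta}{20}
  \iff 4 + 3\delta < 10 \iff \delta < 2 .
\end{align*}
```

Since the inequality between the exponents is strict, the logarithmic factors coming from the law of the iterated logarithm are absorbed in the $o(t^{1/2-\delta/20})$ bound.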
A special case of a theorem of Strassen [15, Theorem 4.4] states the following. Given independent random variables $\zeta_i$, $i = 1, 2, \dots$ with $\mathbb{E}\zeta_i = 0$ and $V_R = \sum_{i=1}^{R} \mathbb{E}|\zeta_i|^2 \to \infty$, for any $t \ge V_1$ let $R'(t)$ denote the positive integer for which $V_{R'(t)} \le t < V_{R'(t)+1}$. If $\sum_{i=1}^{\infty} \mathbb{E}|\zeta_i|^p / V_i^{\theta p/2} < \infty$ for some $p > 2$ and $0 \le \theta \le 1$, then the processes $\sum_{i=1}^{R'(t)} \zeta_i$ and $W(t)$ are $o(t^{(1+\theta)/4} \log t)$-equivalent, where $W(t)$ is a standard Wiener process.
We apply Strassen's theorem to $\zeta_i = Y_i^* / \sqrt{C(f,\nu)}$, $i = 1, 2, \dots$. By Lemma 2 we have $V_R = \sum_{i=1}^{R} \mathbb{E}|\zeta_i|^2 = \sum_{i=1}^{R} |J_i| + O(R) = \Theta(R^{1+\delta/(4+2\delta)})$ and $\mathbb{E}|\zeta_i|^{2+\delta} \ll |J_i|^{1+\delta/2} \ll i^{\delta/4}$. Hence $\sum_{i=1}^{\infty} \mathbb{E}|\zeta_i|^{2+\delta} / V_i^{\theta(1+\delta/2)} < \infty$ for any $\theta > (4+\delta)/(4+3\delta)$. Choosing $\theta$ close enough to $(4+\delta)/(4+3\delta)$, we have $(1+\theta)/4 < 1/2 - \delta/20$, and so the processes $\sum_{i=1}^{R'(t)} Y_i^* / \sqrt{C(f,\nu)}$ and $W(t)$ are $o(t^{1/2-\delta/20})$-equivalent; clearly so are $\sum_{i=1}^{R'(t)} Y_i^*$ and $\sqrt{C(f,\nu)}\, W(t)$.
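The threshold for $\theta$ can be verified by comparing exponents. A sketch, using only the two estimates $V_i = \Theta(i^{(4+3\delta)/(4+2\delta)})$ and $\mathbb{E}|\zeta_i|^{2+\delta} \ll i^{\delta/4}$ from the paragraph above:

```latex
\begin{align*}
V_i^{\theta(1+\delta/2)}
  &= \Theta\Big( i^{\,\theta \cdot \frac{4+3\delta}{4+2\delta} \cdot \frac{2+\delta}{2}} \Big)
   = \Theta\big( i^{\theta(4+3\delta)/4} \big), \\
\sum_{i=1}^{\infty} \frac{\mathbb{E}|\zeta_i|^{2+\delta}}{V_i^{\theta(1+\delta/2)}}
  &\ll \sum_{i=1}^{\infty} i^{\,\delta/4 - \theta(4+3\delta)/4},
  \quad\text{which is finite if }
  \frac{\theta(4+3\delta) - \delta}{4} > 1,
  \text{ i.e. } \theta > \frac{4+\delta}{4+3\delta}, \\
\frac{1+\theta}{4} \bigg|_{\theta = \frac{4+\delta}{4+3\delta}}
  &= \frac{8+4\delta}{4(4+3\delta)} = \frac{2+\delta}{4+3\delta}
   < \frac12 - \frac{\delta}{20}
   \qquad (0 < \delta < 2).
\end{align*}
```

Because the final inequality is strict, it persists for $\theta$ slightly above the threshold, and the factor $\log t$ in Strassen's theorem is absorbed as well.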
Finally, we show that the processes $Y(t) = \sum_{i=1}^{R'(t)} Y_i^*$ and $\sum_{i=1}^{R(t)} Y_i^*$ are $o(t^{1/2-\delta/20})$-equivalent. Clearly $\max J_R = \sum_{i=1}^{R} (|H_i| + |J_i|)$, and recall that $V_R = \sum_{i=1}^{R} |J_i| + O(R)$. Therefore for all large enough integer $r$, on the interval $V_r \le t < V_{r+1}$ we have $R'(t) = r$ and $R(t) = R'(V_r - s)$ for some $0 \le s \ll V_r^{(4+2\delta)/(4+3\delta)} \log V_r$, and hence $\sum_{i=1}^{R'(t)} Y_i^* = Y(V_r)$ and $\sum_{i=1}^{R(t)} Y_i^* = Y(V_r - s)$. Letting $K_r = c V_r^{(4+2\delta)/(4+3\delta)} \log V_r$ with a large enough constant $c > 0$, it will thus be enough to prove that
[TABLE]
Recalling the distribution of the running maximum of a Wiener process, we have
[TABLE]
Choosing, say, $\lambda = 2\sqrt{\log V_r}$ and noting $(2+\delta)/(4+3\delta) < 1/2 - \delta/20$, the Borel–Cantelli lemma shows that the process $W(t)$ satisfies the property in (21); clearly so does $\sqrt{C(f,\nu)}\, W(t)$. Since (21) is invariant under $o(t^{1/2-\delta/20})$-equivalence, $Y(t)$ also satisfies (21). This finishes the proof in the case $C(f,\nu) > 0$.
If $C(f,\nu) = 0$, the proof is much simpler. In this case Lemma 2 gives $\mathbb{E}|Y_i^*|^2 \ll 1$. Therefore $\sum_{i=1}^{\infty} \mathbb{E}|Y_i^*|^2 / i^{1+2\varepsilon} < \infty$ for any $\varepsilon > 0$, and by the strong law of large numbers $\sum_{i=1}^{R(t)} Y_i^* = o(R(t)^{1/2+\varepsilon})$ a.s. Similarly, $\sum_{i=2}^{R(t)} Z_i^* = o(R(t)^{1/2+\varepsilon})$ a.s. Using these relations instead of the law of the iterated logarithm and Strassen's theorem and noting that $R(t)^{1/2+\varepsilon} \ll t^{1/2-\delta/20}$ for small enough $\varepsilon > 0$, we get $\sum_{k \le t} f(S_k) = o(t^{1/2-\delta/20})$ a.s., as claimed.
∎
Under the extra condition that Ξ½ is strictly aperiodic, we proved claim (i) in Section 5.1, whereas claim (ii) follows from Theorem 4. We now show that the condition of strict aperiodicity can be removed, and prove the general case of Theorem 3.
Assume that the pair $(G,\nu)$ satisfies the conditions of Theorem 3; that is, $G$ is a compact metrizable group, and $\nu$ is a Borel probability measure on $G$ such that $\nu$ is adapted, and $(\nu^{*k})_{\mathrm{abs}} \ne 0$ for some $k \ge 1$. We shall use the notation $\mu_G$ for the normalized Haar measure on $G$. It is not difficult to see that if $\nu_1$ and $\nu_2$ are Borel probability measures on $G$, then $\operatorname{supp}(\nu_1 * \nu_2) = (\operatorname{supp} \nu_1)(\operatorname{supp} \nu_2)$ (see e.g. [17, Lemma 2]). Therefore $\operatorname{supp} \nu^{*k} = (\operatorname{supp} \nu)^k$, where we use the notation $A^k = \{a_1 a_2 \cdots a_k : a_1, a_2, \dots, a_k \in A\}$. In particular, $\operatorname{supp} \nu^{*(k+1)}$ contains a translate of $\operatorname{supp} \nu^{*k}$, so the sequence $\mu_G(\operatorname{supp} \nu^{*k})$ is nondecreasing. Let $\alpha(G,\nu) = \lim_{k \to \infty} \mu_G(\operatorname{supp} \nu^{*k})$. Note that $\alpha(G,\nu) > 0$ because $(\nu^{*k})_{\mathrm{abs}} \ne 0$ for some $k \ge 1$. The following simple observation is a special case of [7, Theorem 14]. For the sake of completeness we include a short proof.
Lemma 5.
Let $G$ be a compact metrizable group. If $K \subseteq G$ is nonempty and closed, and $K^2 \subseteq K$, then $K$ is a subgroup.
Proof.
Let $a \in K$ be arbitrary. By assumption $a^n \in K$ for all $n \ge 1$. Using the compactness of $K$ we have $a^{n_k} \to b \in K$ as $k \to \infty$ for some subsequence $a^{n_k}$. For any fixed $n \ge 1$ we have $a^{n_k - n} \to a^{-n} b \in K$ as $k \to \infty$. After replacing $n_k$ by another subsequence we may assume that $a^{n_k} \to b \in K$ and $a^{-n_k} b \to c \in K$ as $k \to \infty$ for some $c$. Then $b = a^{n_k} a^{-n_k} b \to bc$, hence $c = 1 \in K$. It remains to prove that for any $a \in K$ we have $a^{-1} \in K$. But $aK$ is also nonempty and closed, and $(aK)^2 \subseteq aK$. By the previous argument $1 \in aK$, therefore $a^{-1} \in K$.
∎
Assume now that there exists a proper closed normal subgroup $H \lhd G$ such that $\operatorname{supp} \nu \subseteq aH$ for some coset $aH$. Since $H$ is normal, we have $\operatorname{supp} \nu^{*k} \subseteq a^k H$ for all $k \ge 1$. Thus $\mu_G(H) = \mu_G(a^k H) \ge \mu_G(\operatorname{supp} \nu^{*k})$, and so $\mu_G(H) \ge \alpha(G,\nu) > 0$. In particular, $aH$ has finite order $d$ in the factor group $G/H$. Since $\bigcup_{i=1}^{d} a^i H$ is a closed subgroup of $G$ containing $\operatorname{supp} \nu$, and $\nu$ is assumed to be adapted, we have $G = \bigcup_{i=1}^{d} a^i H$ and $|G : H| = d$. As $\operatorname{supp} \nu^{*d} \subseteq H$, we can view $\nu^{*d}$ as a Borel probability measure on the compact metrizable group $H$. Note that $\mu_H(B) = d \cdot \mu_G(B)$ ($B \subseteq H$ Borel) is the normalized Haar measure on $H$. Clearly $(\nu^{*d})^{*k}$ has an absolutely continuous component with respect to $\mu_H$ for some $k \ge 1$. It is also not difficult to see that $\nu^{*d}$ is adapted on $H$. Indeed, suppose $K < H$ is a proper closed subgroup for which $\operatorname{supp} \nu^{*d} \subseteq K$. Consider $C = \bigcup_{i=1}^{d} \operatorname{supp} \nu^{*i} K$, and note that here $\operatorname{supp} \nu^{*i} K \subseteq a^i H$; in particular, $C \ne G$. On the other hand, writing an arbitrary integer $k \ge 1$ in the form $k = nd + i$, $1 \le i \le d$ we have $\operatorname{supp} \nu^{*k} = (\operatorname{supp} \nu^{*i})(\operatorname{supp} \nu^{*d})^n \subseteq \operatorname{supp} \nu^{*i} K$. Therefore the topological closure $\overline{\bigcup_{k=1}^{\infty} \operatorname{supp} \nu^{*k}}$ is a subset of $C \ne G$. Using Lemma 5 we get that $\overline{\bigcup_{k=1}^{\infty} \operatorname{supp} \nu^{*k}}$ is a proper closed subgroup of $G$, contradicting the adaptedness of $\nu$. Altogether, we find that the pair $(H, \nu^{*d})$ satisfies the conditions of Theorem 3. Observe, moreover, that $\alpha(H, \nu^{*d}) = d \cdot \alpha(G,\nu)$.
Assume in addition that the pair $(H, \nu^{*d})$ satisfies the claims of Theorem 3. We now prove that under all these assumptions $(G,\nu)$ also satisfies the claims of Theorem 3. Fix a Borel measurable function $f : G \to \mathbb{R}$ such that $\sup_{c \in G} \mathbb{E}|f(cX_1)|^p < \infty$ for some $p \ge 1$. It will be enough to prove that for any $1 \le i \le d$ we have
[TABLE]
for any $m \ge 1$ and $\varepsilon > 0$ in the case $1 \le p \le 2$, and
[TABLE]
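in the case $p > 2$. Fix $1 \le i \le d$, and let $\mathcal{F}_i$ denote the $\sigma$-algebra generated by $X_1, X_2, \dots, X_i$. Letting $Y_n = \prod_{j=i+(n-1)d+1}^{i+nd} X_j$, the variables $Y_1, Y_2, \dots$ are i.i.d. $H$-valued random variables with distribution $\nu^{*d}$, independent of $\mathcal{F}_i$. Let $b = X_1 X_2 \cdots X_i$, and note $b \in a^i H$ a.s. Let $g : H \to \mathbb{R}$, $g(x) = f(bx)$, and observe $\sup_{c \in H} \mathbb{E}\big( |g(cY_1)|^p \mid \mathcal{F}_i \big) < \infty$ a.s. and $\int_H g \, \mathrm{d}\mu_H = d \int_{a^i H} f \, \mathrm{d}\mu_G$ a.s. We thus have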
[TABLE]
By the assumption that (H,Ξ½βd) satisfies the claims of Theorem 3, we have
[TABLE]
in the case 1β€pβ€2, and
[TABLE]
in the case p>2. Taking the (total) probability, (22) and (23) follow.
Finally, we prove that the pair $(G,\nu)$ satisfies the claims of Theorem 3. Let $H_0 = G$. If $\nu$ is not strictly aperiodic in $H_0$, then let $H_1 \lhd H_0$ be a proper closed normal subgroup such that $\operatorname{supp} \nu$ is contained in a coset of $H_1$, and let $d_1 = |H_0 : H_1|$. As seen above, the pair $(H_1, \nu^{*d_1})$ satisfies the conditions of Theorem 3, hence we can iterate this procedure. We obtain a sequence $H_0 \rhd H_1 \rhd \cdots \rhd H_j$, where $H_i$ is a proper closed normal subgroup of $H_{i-1}$ with index $d_i = |H_{i-1} : H_i|$, and $\operatorname{supp} \nu^{*(d_1 \cdots d_{i-1})}$ is contained in a coset of $H_i$ for all $1 \le i \le j$. The procedure ends after step $j$ if $\nu^{*(d_1 \cdots d_j)}$ is strictly aperiodic in $H_j$. Note that $1 \ge \alpha(H_i, \nu^{*(d_1 \cdots d_i)}) = d_1 \cdots d_i \, \alpha(G,\nu)$, therefore the procedure terminates after finitely many steps. We prove the claims by induction on $j$. If $j = 0$, that is, $\nu$ is strictly aperiodic, the claims have already been proved. To prove the inductive step from $j-1$ to $j$, we first apply the inductive hypothesis to $(H_1, \nu^{*d_1})$, then the arguments above to conclude that $(G,\nu)$ satisfies the claims of Theorem 3.
∎
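The termination of the iteration can be made quantitative. A sketch, assuming only that each $H_i$ is a proper subgroup of $H_{i-1}$, so that every index satisfies $d_i \ge 2$:

```latex
\[
2^{j} \;\le\; d_1 d_2 \cdots d_j
      \;=\; \frac{\alpha\big(H_j, \nu^{*(d_1 \cdots d_j)}\big)}{\alpha(G,\nu)}
      \;\le\; \frac{1}{\alpha(G,\nu)}
\qquad\Longrightarrow\qquad
j \;\le\; \log_2 \frac{1}{\alpha(G,\nu)} .
\]
```

So the number of reduction steps is bounded in terms of $\alpha(G,\nu)$ alone.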
The implication (iii)$\Rightarrow$(ii) is trivial, whereas (i)$\Rightarrow$(iii) is a special case of Theorem 3. Let us finally prove (ii)$\Rightarrow$(i). First, suppose that $\nu^{*k}$ is singular with respect to $\mu$ for every $k \ge 1$. Then there exists a Borel set $B \subseteq G$ such that $\mu(B) = 0$ and $\Pr(S_k \in B) = 1$ for every $k \ge 1$. Hence the indicator function $f = I_B$ does not satisfy (ii), giving a contradiction. Suppose next that $\nu$ is not adapted; that is, there exists a proper closed subgroup $H < G$ such that $\Pr(X_1 \in H) = 1$. Then $\Pr(S_k \in H) = 1$ for all $k \ge 1$. Since every nonempty open subset of $G$ has positive Haar measure, we have $\mu(H) < 1$. Therefore $f = I_H$ does not satisfy (ii), giving a contradiction.
∎
In this proof implied constants will be universal. Claim (ii) follows from Corollary 10 (i). To see (i), fix a positive integer $N$ large enough in terms of $f$, $\nu$ and $\delta$, and let us prove (6). Let $E_N = N^{-\delta/(2+2\delta)} \log^{\delta/(1+\delta)} N$ and $K = \Delta \big( \|f\|_{2+\delta} / \sqrt{C(f,\nu)} \big)^{(2+\delta)/(1+\delta)}$. Let us decompose the set $\{1, 2, \dots, N\}$ into consecutive intervals of integers $H_1, J_1, \dots, H_R, J_R$, as in Section 4 (with the choice $M = 0$), such that $|H_i| = \lceil 4\Delta \log N \rceil$ and $|J_i| = \Theta\big( (\Delta/K^2) N^{\delta/(1+\delta)} \log^{2/(1+\delta)} N \big)$ for all $1 \le i \le R$. As in the proof of Corollary 10, we have $\Delta_{|H_i|} \le N^{-2}$, and clearly the same holds for $\Delta_{|J_i|}$.
Recall that
[TABLE]
where $\sum_{i=1}^{R} \sum_{k \in J_i} f(S_k) \overset{d}{=} \sum_{i=1}^{R} Y_i$ and $\sum_{i=2}^{R} \sum_{k \in H_i} f(S_k) \overset{d}{=} \sum_{i=2}^{R} Z_i$. From Lemma 2 and the classical Lyapunov condition (see e.g. [12, p. 154]) we get
[TABLE]
Here $\sum_{i=1}^{R} \mathbb{E}(Y_i^*)^2 = C(f,\nu) N + O_{f,\nu}(N^{1/(1+\delta)})$, therefore the error of replacing the normalizing factor on the left hand side of (24) by $\sqrt{C(f,\nu) N}$ is $o(E_N)$. Similarly, $\sum_{i=2}^{R} Z_i^*$ also satisfies the central limit theorem with remainder term $O(K E_N)$. In particular,
[TABLE]
Applying this with $x = \sqrt{\log N}$ and noting that $1 - \Phi(\sqrt{\log N}) = O(N^{-1/2}) = o(E_N)$, we obtain
[TABLE]
From Lemma 3 we get $\big\| \sum_{i=1}^{R} (Y_i - Y_i^*) \big\|_2 \ll_{f,\nu} \sum_{i=1}^{R} |J_i| \sqrt{\Delta_{|H_i|}} \ll_{f,\nu} 1$, hence the Chebyshev inequality gives
[TABLE]
We similarly deduce
[TABLE]
Finally, note that $\sup_{c \in G} \mathbb{E} f(cX_1)^2 < \infty$ implies $\sup_{k \ge 1} \mathbb{E} f(S_k)^2 < \infty$. Therefore $\big\| \sum_{k \in H_1} f(S_k) \big\|_2 \ll_{f,\nu} \log N$, and the Chebyshev inequality gives
We now prove the Remark made after Theorem 5. If $f \in L^4(G)$, then instead of $\|Y_i^*\|_3 \ll \|f\|_3 \sqrt{\Delta |J_i|}$, Lemma 2 gives the slightly better estimate $\|Y_i^*\|_3 \le \|Y_i^*\|_4 \ll \|f\|_2 \sqrt{\Delta |J_i|} + \|f\|_4 \Delta^{3/4} |J_i|^{1/4}$. Therefore if in the definition of $K$ we replace $\|f\|_{2+\delta}$ by $\|f\|_2$, the Lyapunov condition gives that (24) and (25) hold with error terms $O(K E_N) + o(E_N) = O(K E_N)$. The rest of the proof remains unchanged.
The implication (i)$\Rightarrow$(ii) follows from Theorem 5 and Proposition 8. The latter is needed to ensure $C(f,\nu) > 0$. We now prove (ii)$\Rightarrow$(i). The facts that $\nu$ is adapted, and that $(\nu^{*k})_{\mathrm{abs}} \ne 0$ for some $k \ge 1$ follow similarly to the proof of Theorem 1. Suppose that $\operatorname{supp} \nu$ is contained in a coset $aH$ of some proper closed normal subgroup $H \lhd G$. We have seen in Section 5.2 that the index $d = |G : H|$ is finite, and $G = \bigcup_{i=1}^{d} a^i H$. Note that if $k = nd + i$ for some $1 \le i \le d$, then $S_k \in a^i H$ a.s. Letting $f = I_H - \mu(H)$, we thus have $\big| \sum_{k=1}^{N} f(S_k) \big| \le 1 - 1/d$ a.s. Hence $N^{-1/2} \sum_{k=1}^{N} f(S_k)$ cannot have a nondegenerate limit distribution.
∎
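The bound $1 - 1/d$ in the last step comes from a cancellation over each full period of length $d$. A worked version, using only the facts stated in the proof: $\mu(H) = 1/d$, and $S_k \in a^i H$ a.s. for $k = nd + i$, so that $S_k \in H$ exactly when $d \mid k$.

```latex
\begin{align*}
f(S_k) &=
  \begin{cases}
    1 - \tfrac{1}{d}, & d \mid k, \\[2pt]
    -\tfrac{1}{d},    & d \nmid k,
  \end{cases}
  \qquad\text{a.s.} \\
\sum_{k=nd+1}^{(n+1)d} f(S_k)
  &= (d-1)\Big(-\frac{1}{d}\Big) + \Big(1 - \frac{1}{d}\Big) = 0, \\
\sum_{k=1}^{nd+r} f(S_k) &= -\frac{r}{d}
  \qquad (0 \le r \le d-1),
  \qquad\text{so}\qquad
  \Big| \sum_{k=1}^{N} f(S_k) \Big| \le \frac{d-1}{d} = 1 - \frac{1}{d}.
\end{align*}
```

The partial sums therefore stay bounded, and after dividing by $N^{1/2}$ they converge to $0$ almost surely.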
Bibliography
[1] M. Anoussis and D. Gatzouras: A spectral radius formula for the Fourier transform on compact groups and applications to random walks. Adv. Math. 188 (2004), no. 2, 425–443.
[3] I. Berkes and B. Borda: On the law of the iterated logarithm for random exponential sums. Trans. Amer. Math. Soc. 371 (2019), no. 5, 3259–3280.
[4] I. Berkes and M. Raseta: On the discrepancy and empirical distribution function of $\{n_k \alpha\}$. Unif. Distrib. Theory 10 (2015), no. 1, 1–17.
[5] R. N. Bhattacharya: Speed of convergence of the $n$-fold convolution of a probability measure on a compact group. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete 25 (1972/73), 1–10.
[6] G. B. Folland: A Course in Abstract Harmonic Analysis. Second edition. CRC Press, Boca Raton, FL, 2016.
[7] B. Gelbaum, G. K. Kalisch and J. M. H. Olmsted: On the embedding of topological semigroups and integral domains. Proc. Amer. Math. Soc. 2 (1951), 807–821.
[8] Y. Kawada and K. Itô: On the probability distribution on a compact group. I. Proc. Phys.-Math. Soc. Japan (3) 22 (1940), 977–998.