A quantitative version of the theorem on Khintchine's constant
Piotr Kamieński

TL;DR
This paper provides explicit measure estimates for the set of numbers with continued fraction partial quotients' products growing exponentially at a rate close to Khintchine's constant, using large deviations theory and cumulant methods.
Contribution
It offers a quantitative, non-asymptotic measure estimate for numbers with continued fraction products near Khintchine's predicted growth rate, with explicit bounds.
Findings
Measure estimates can be made arbitrarily close to full for large N.
Bounds are explicit and not asymptotic, depending on parameters.
Employs large deviations theory and cumulant method in proof.
Abstract
In the paper we provide measure estimates for the set of numbers whose sequence of products of continued fraction partial quotients has exponential growth with rate close to the one predicted by Khintchine's theorem, i.e. for which \begin{equation*} e^{(\kappa - T)n} \leqslant M_n \leqslant e^{(\kappa + T)n} \end{equation*} for a fixed $T > 0$ and all $n$ greater than some fixed integer $N$, where $e^{\kappa}$ is the Khintchine constant. Choosing $N$ large enough the measure can be made arbitrarily close to full, for any given $T$. The bounds are not of asymptotic nature, but explicit in terms of the parameters involved. In the proof we compile several known results of large deviations theory, employing the cumulant method in particular. We also discuss the numerical values of the quantities involved.
1. Motivation
Diophantine (and Brjuno) numbers (see definition 5.1 for details) are commonly used in small divisors problems ([1, 16, 18]). In KAM theory, for instance, if the frequency of an invariant torus is Diophantine then this torus survives once a perturbation is introduced; this happens for small enough values of the perturbation parameter and under the additional twist condition. The term “small enough”, however, if specified precisely by the KAM-type theorem, usually means “smaller than some explicit formula depending on the Diophantine constant and exponent” (as in e.g. [5]). The problem with this approach appears when we consider a family of tori with varying frequencies. Changing the frequency by replacing either its first few decimal digits or its first few continued fraction partial quotients does not change the Diophantine exponent, but might decrease the Diophantine constant quite significantly, lowering the applicability threshold of the theorem through a multiplicative correction.
We propose an alternative to the Diophantine condition, which we call the Khintchine-Lévy condition (or KL condition for short), to account for this disadvantage. In the present paper we give the definition of the Khintchine-Lévy numbers and prove that the set of all those numbers is generic in the sense of Lebesgue measure, as is the case with the set of Diophantine numbers. Specifically, we provide explicit lower bounds on the measure of the set of KL numbers in terms of the parameters involved in their definition. In a parallel paper [9] we employ KL numbers to prove a small denominators result that is similar in nature to the ones already obtained for Diophantine numbers. Khintchine-Lévy numbers, however, have one advantage over Diophantine ones, namely they are less sensitive to the aforementioned changes in the initial part of the continued fraction. In the estimates in [9] such changes are reflected only through a minor additive correction.
We briefly introduce some notation before proceeding with the details. We will be working with irrational numbers in the unit interval, considered as a probability space with the Borel algebra and either the Lebesgue measure or the Gauss measure, the latter given in terms of the density function $\frac{1}{\log 2}\cdot\frac{1}{1+x}$. (Note that the two measures are absolutely continuous with respect to one another. In particular the terms “Gauss almost all” and “Lebesgue almost all” can be used interchangeably.) Expected value of a random variable w.r.t. a measure will be denoted by and will denote the complement of a set .
Each irrational number in the unit interval has a unique infinite continued fraction expansion into a sequence of positive integer partial quotients (note that the integer part of the expansion is zero, since we are in the unit interval):
[TABLE]
The shift on the continued fraction expansion is a measurable transformation known as the Gauss map :
[TABLE]
It preserves the Gauss measure and is ergodic with respect to that measure ([15]). We also note that . For we denote
[TABLE]
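As an illustration of the notation introduced so far, the following minimal sketch computes partial quotients by iterating the Gauss map on exact rationals. It assumes the standard definitions, which may differ in notation from the displayed formulas: the Gauss map is taken to be $x \mapsto 1/x - \lfloor 1/x \rfloor$, the $n$-th partial quotient is the integer part of the reciprocal of the $(n-1)$-st iterate, and the helper names `gauss_map`, `partial_quotients` and `log_products` are ours. Rational inputs have finite expansions, so the orbit eventually hits $0$ and the loop stops early.

```python
from fractions import Fraction
from math import floor, log


def gauss_map(x: Fraction) -> Fraction:
    """Gauss map x -> 1/x - floor(1/x); by convention we send 0 to 0."""
    if x == 0:
        return Fraction(0)
    inv = 1 / x
    return inv - floor(inv)


def partial_quotients(x: Fraction, n: int) -> list[int]:
    """First n partial quotients of x in (0, 1), read off along the Gauss orbit."""
    quotients = []
    for _ in range(n):
        if x == 0:  # rational input: the expansion has terminated
            break
        quotients.append(floor(1 / x))
        x = gauss_map(x)
    return quotients


def log_products(quotients: list[int]) -> list[float]:
    """Running sums log(a_1) + ... + log(a_n), i.e. logarithms of the products M_n."""
    sums, total = [], 0.0
    for a in quotients:
        total += log(a)
        sums.append(total)
    return sums


if __name__ == "__main__":
    a = partial_quotients(Fraction(5, 8), 10)
    print(a)  # [1, 1, 1, 2]
    print(log_products(a))
```

Exact rational arithmetic is used on purpose: iterating the Gauss map in floating point loses roughly one partial quotient's worth of precision per step, so long orbits computed with floats are not reliable.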
The main motivation comes from the classical theorem on Khintchine’s constant ([10]). It tells us that for almost all numbers the limit of $\frac{1}{n}\log M_n$ exists, is finite and is constant within said full measure set. We denote this limit by $\kappa$ and refer to $e^{\kappa}$ as the Khintchine constant (this is consistent with the existing literature, where Khintchine’s constant is defined as the pointwise a.e. limit of $M_n^{1/n}$). One can observe that $\frac{1}{n}\log M_n$ is actually the time-$n$ average of the test function $\log a_1$, the logarithm of the first partial quotient, along the orbit of a given number under the action of the Gauss map. Khintchine’s theorem is thus a consequence of the Birkhoff pointwise ergodic theorem and $\kappa$ must be equal to the spatial average of $\log a_1$:
[TABLE]
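The spatial average can be evaluated numerically. The sketch below relies on a standard fact that is an assumption relative to the text above: under the Gauss measure the first partial quotient equals $k$ with probability $\log_2\bigl(1 + \tfrac{1}{k(k+2)}\bigr)$, so the average of a function of the first partial quotient is a series over $k$. The helper name `gauss_average` and the truncation length are ours; the series converges slowly, so the truncated value is only accurate to a few decimal places.

```python
from math import exp, fsum, log, log1p


def gauss_average(f, terms: int = 200_000) -> float:
    """Truncated series for the Gauss-measure average of f(a_1).

    Uses P(a_1 = k) = log2(1 + 1/(k(k+2))) and sums over k = 1..terms.
    """
    series = fsum(f(k) * log1p(1.0 / (k * (k + 2))) for k in range(1, terms + 1))
    return series / log(2)


# Average of log(a_1): an approximation of kappa, so exp(kappa) approximates
# Khintchine's constant (about 2.6854).
kappa = gauss_average(log)
print(kappa, exp(kappa))
```

The same helper applies to other test functions; for instance `gauss_average(lambda k: log(k + 1))` approximates the average relevant for products of incremented partial quotients.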
In probabilists’ language Khintchine’s theorem is actually a strong law of large numbers for the sequence of “samples” $\log a_n$. This result was later improved in the form of a plethora of limit theorems (see the monograph [8, Chapter 3] and references therein for a survey). To the author’s knowledge, however, all of the existing results are of asymptotic nature and none provide exact estimates of the measure with explicitly computed constants. Our main result, theorem 2.2, aims to fill this gap.
It is also worth noting that the sequence of denominators of convergents (defined as the denominators of the reduced fractions obtained by truncating the continued fraction expansion after finitely many partial quotients) has also been extensively studied in the literature. In [9] we discuss why this sequence is even more important from the point of view of small denominators problems and KAM theory. Notable results include the analogue of Khintchine’s theorem by Khintchine and Lévy ([12]), its refinement by Philipp and Stackelberg in the form of a law of the iterated logarithm ([14]) and further refinements by Ibragimov [7] and Misevičius [13], who obtained a central limit theorem with error bounds. We were, however, unable to prove the counterpart of theorem 2.2 for this sequence for technical reasons, which we discuss in the final section 7.
In section 2 we provide the reader with a precise definition of a Khintchine-Lévy number in definition 2.1 and later in theorem 2.2 we specify how far from full the measure of the set of these numbers is. In section 3 we introduce all the necessary tools and combine them into a proof of this result, the most important one being the cumulant method in theorem 3.13, which provides estimates for the tails of a r.v. given the estimates for its cumulants. In section 4 we formulate and discuss the proof of a variation on theorem 2.2 for a slight modification of the sequence . Section 5 contains a brief comparison of the Khintchine-Lévy numbers with Diophantine numbers. We conclude the paper with a brief practical analysis of the numerical values of parameters used along its course and some final remarks in sections 6 and 7. The simple, but lengthy formulas are contained within appendix A for clarity.
2. Main result
The idea behind the Khintchine-Lévy condition is the following. From Khintchine’s theorem we can vaguely conclude that on a full measure set the sequence $M_n$ asymptotically exhibits exponential growth similar to that of $e^{\kappa n}$. We therefore conjecture that on a slightly smaller set, one whose measure is only close to full, the sequence also exhibits exponential growth, but with slightly more relaxed requirements on its rate. In the course of the paper we will learn that this is indeed the case, as stated in theorem 2.2.
Definition 2.1 (Khintchine-Lévy condition).
We say that an irrational number is upper- with constants and if the following inequality holds for all :
[TABLE]
Similarly, a number is lower- with constants and if for all we have
[TABLE]
We denote the sets formed by the numbers with the above properties by, respectively, and . We also denote and where .
Also, for a given natural number , we denote by the set
[TABLE]
and similarly for and .
If a set of numbers is of the form for some and we will refer to it as an (upper/lower-)Khintchine-Lévy set or a KL-set for short.
We are now ready to state the main result of this paper, which is in fact the aforementioned conjecture with all the necessary details accounted for.
Theorem 2.2 (Estimates on the measure of KL-sets).
Let be a natural number and let be a positive real number. Denote
[TABLE]
where and are universal constants given in (15) and (28). Also denote . The lower bounds on the Gauss measures of Khintchine-Lévy sets are given by
[TABLE]
In particular for being a square of an integer we have
[TABLE]
Formulas (8) and (10) imply in particular that regardless of how small $T$ is we can still find an $N$ such that the measure is arbitrarily close to full. In section 6 we discuss this bound from a numerical point of view.
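The statement can be explored empirically, although a simulation says nothing about the explicit constants of the theorem. The sketch below is only a finite-horizon proxy under several assumptions of ours: numbers are sampled as random rationals with a 4000-bit denominator (so that their first few hundred partial quotients behave like those of a Lebesgue-random number), the inequality $e^{(\kappa - T)n} \leqslant M_n \leqslant e^{(\kappa + T)n}$ is checked only for $N \leqslant n \leqslant$ `n_max`, and $\kappa$ is replaced by the rounded value `KAPPA`. All function names are hypothetical.

```python
import random
from math import log

KAPPA = 0.98785  # rounded logarithm of Khintchine's constant (assumed value)


def partial_quotients_of_rational(p: int, q: int, n: int) -> list[int]:
    """First n partial quotients of p/q in (0, 1), via Euclid's algorithm."""
    out = []
    while p != 0 and len(out) < n:
        a, r = divmod(q, p)
        out.append(a)
        p, q = r, p
    return out


def satisfies_kl(quotients: list[int], T: float, N: int, kappa: float = KAPPA) -> bool:
    """Check |log M_n - kappa * n| <= T * n for all N <= n <= len(quotients)."""
    s = 0.0
    for n, a in enumerate(quotients, start=1):
        s += log(a)
        if n >= N and abs(s - kappa * n) > T * n:
            return False
    return True


def empirical_fraction(T: float, N: int, n_max: int = 400,
                       samples: int = 100, bits: int = 4000, seed: int = 0) -> float:
    """Fraction of sampled numbers passing the finite-horizon check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        p = rng.getrandbits(bits) | 1  # odd numerator, hence nonzero
        quotients = partial_quotients_of_rational(p, 1 << bits, n_max)
        hits += satisfies_kl(quotients, T, N)
    return hits / samples


if __name__ == "__main__":
    for T in (0.2, 0.5, 1.0):
        print(T, empirical_fraction(T, N=50))
```

Because the same seed yields the same sample for every call, the reported fraction is monotone in $T$ by construction; how it compares with the rigorous bounds is exactly the question addressed in section 6.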
3. Proof of theorem 2.2
To estimate the measure of from below is the same as to estimate the measure of its complement from above. The complement, however, can be expressed as a union of complements of :
[TABLE]
Our focus will therefore be centered on estimating from above, to use subadditivity of in the end. (The union in (11) is not disjoint, but we are not concerned with the overlaps of its terms in the proof.)
3.1. KL-sets as tails of probability distributions
To perform the proof of theorem 2.2 we first need to reformulate its statement in the spirit of large deviations theory; we will mainly use the language of random variables and . Once this is done we will lay out the framework of the proof and fill in the details in the following subsections of this section.
First observe that for all and thus - this is a consequence of the fact that and the -invariance of (recall (4)). Using this we can now write in terms of centerings of and :
[TABLE]
and similarly for . This way estimating from below is the same as estimating the right/left tail of the centering of from above.
Our strategy will be the following. We first estimate the moments of in lemma 3.2. These moment estimates will allow us to use theorem 3.11 to obtain estimates on the cumulants of and also of (shifting a random variable by a constant affects only the first cumulant, the only one we will not be concerned with). For this, however, we will need two additional assumptions on : the -mixing assumption and the Markov chain association assumption. We introduce them in definitions 3.3 and 3.6 and verify their validity for in lemmas 3.5 and 3.7. Once we have the cumulant estimates of we can estimate its tails; this is done with the help of theorem 3.13.
Before we proceed we clarify what exactly we mean by moments and cumulants.
Definition 3.1.
Let and let be a random variable on a probability space . We define the -th moment of as and the -th cumulant of as
[TABLE]
We will sometimes refer to as the cumulant generating function.
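To make the relation between moments and cumulants concrete, the following sketch converts a list of moments into cumulants using the standard recursion $c_n = m_n - \sum_{k=1}^{n-1} \binom{n-1}{k-1} c_k\, m_{n-k}$, which follows from differentiating the logarithm of the moment generating function. The recursion and the function name `cumulants_from_moments` are not taken from the text; they are an assumption chosen for illustration. The example uses the exponential distribution with unit rate, whose $n$-th moment is $n!$ and whose $n$-th cumulant is $(n-1)!$.

```python
from math import comb, factorial


def cumulants_from_moments(moments: list) -> list:
    """Convert raw moments [m_1, ..., m_n] into cumulants [c_1, ..., c_n].

    Uses c_n = m_n - sum_{k=1}^{n-1} C(n-1, k-1) * c_k * m_{n-k}.
    """
    cumulants = []
    for n in range(1, len(moments) + 1):
        c = moments[n - 1]
        for k in range(1, n):
            c -= comb(n - 1, k - 1) * cumulants[k - 1] * moments[n - k - 1]
        cumulants.append(c)
    return cumulants


if __name__ == "__main__":
    # Exponential distribution with rate 1: m_n = n!, c_n = (n-1)!
    raw = [factorial(n) for n in range(1, 5)]
    print(cumulants_from_moments(raw))  # [1, 1, 2, 6]

    # The centered variable X - 1 has moments 0, 1, 2, 9:
    # only the first cumulant changes under the shift.
    print(cumulants_from_moments([0, 1, 2, 9]))  # [0, 1, 2, 6]
```

The second call illustrates why centering is harmless for the estimates that follow: shifting by a constant changes the first cumulant only.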
3.2. Moment estimates
Lemma 3.2 (Estimates of the moments of ).
The following estimates on the -th moment of are valid for any and :
[TABLE]
Here
[TABLE]
Proof.
We will prove a stronger inequality, namely
[TABLE]
In the formulation of the lemma, however, we decided to keep the (severe) exponential overestimation so that our result fits the framework of theorem 3.11.
[TABLE]
Equality is a consequence of -invariance of , while in we used the fact that and that on the interval the function is decreasing. In the equalities following we simply substituted for and for , respectively. After splitting the integral into the sum of two integrals we change the variables once again: on from to and on from to . As a result we obtain
[TABLE]
The last equality stems from the definition of the Euler gamma function and the fact that for integer arguments we have . ∎
3.3. Mixing properties of
There are a number of types of mixing for sequences of random variables (for a deeper insight see e.g. [6] and references therein or [4]), the main idea behind all of them being the following: the further away from each other two random variables are in the sequence (in terms of the indexing number ) the closer they are to being independent. We will be primarily interested in the notion of -mixing. However, a stronger property of -mixing will also prove to be a useful tool.
Definition 3.3 (-mixing sequence of r.vs, -mixing function and -mixing coefficients).
Let be a sequence of random variables on a probability space . For indices denote by the -algebra generated by random variables with . We define the -mixing function of the sequence to be given by
[TABLE]
where the supremum is taken over for which .
We define the -mixing coefficients of the sequence to be
[TABLE]
We say that the sequence is -mixing (w.r.t. ) if as .
The property of -mixing is defined analogously, we alter only the mixing function in the definition:
Definition 3.4 (-mixing sequence of r.vs, -mixing function and -mixing coefficients).
With the notations of definition 3.3 we define the -mixing function of the sequence to be given by
[TABLE]
where the supremum is taken over for which .
The -mixing coefficients are defined analogously to in definition 3.3 and the sequence is called -mixing if they tend to $0$.
The -mixing property entails -mixing and additionally ([4]). It turns out that the sequence enjoys the -mixing property and the mixing coefficients decay at least exponentially fast:
Lemma 3.5 (Quantitative estimates on the mixing coefficients of , [8, Proposition 2.3.7]).
The coefficients of the sequence are bounded from above by , and
[TABLE]
for all , where is the Gauss-Kuzmin-Wirsing constant whose approximate value is $0.30366$.
Lemma 3.5 holds true also for the sequence (with exactly the same mixing coefficients). This is because the -mixing property depends only on algebras generated by the initial and tail parts of the sequence in question and these do not change upon composing the sequence with a bijective, measurable function (recall that ). This exponential decay will be useful for us in the technical results of subsection 3.5.
3.4. Markov chain association
Definition 3.6 (Sequence of r.vs. associated to a Markov chain).
We say that a sequence of random variables on a probability space is associated to a Markov chain through a sequence of functions if
[TABLE]
for a Markov chain .
Lemma 3.7.
The sequence is associated to the Markov chain
[TABLE]
through the sequence of functions given by
[TABLE]
Proof.
Equality is a direct consequence of , which stems from the definition of . We thus have to prove that is indeed a Markov chain. The definition of a Markov chain requires a choice of probability (in our case a natural one would be to choose ). However, is a Markov chain for any probability (for which the definition of a Markov chain makes sense). Observe that once the chain is at some state we can uniquely determine all its past states through the shift . This way any conditional probability under the condition of all past states being fixed is actually the conditional probability under the condition of just the previous state being fixed, provided that these probabilities are nonzero, which is the case for . ∎
3.5. The quantity
We now have almost all the necessary tools to proceed with the estimation of the cumulants. We need, however, to define and study one more quantity - . Its nature is purely technical, but it will become crucial for us in the formulation of theorem 3.11.
Definition 3.8 (The quantity).
Let be a function and let . We define to be
[TABLE]
We will be primarily interested in , where is the -mixing function of the sequence . Again, upper bounds on this quantity will turn out to be essential for us.
Lemma 3.9 (Estimates on for the sequence ).
If is the -mixing function of the sequence then the following inequality holds:
[TABLE]
where is a universal constant given by
[TABLE]
with being the Gauss-Kuzmin-Wirsing constant.
For the reader acquainted with various types of mixing and metrical theory of continued fractions it may have appeared that we use the -mixing property with regard to (and ) unnecessarily, as these sequences enjoy the stronger property of -mixing and therefore might be suitable for large deviation theorems which produce better estimates. This is not the case, however, as these theorems employ the eponymous , which in turn depends on which may turn out to be infinite in the case. Before we proceed with the proof of lemma 3.9 we clarify this subtlety in the following
Example 3.10.
Suppose that is a sequence of r.vs. such that the algebra generated by admits sets of arbitrarily small measure for some . Let be a sequence of sets in this algebra whose measures decrease to $0$. This algebra is contained in both and . We therefore have
[TABLE]
The phenomenon described above does not appear if we use -mixing instead. Note that in definitions 3.3 and 3.4 we used and as the codomains for and , respectively. This is because is a natural upper bound for since we can estimate .
Proof of lemma 3.9.
We will estimate the sum using the -mixing coefficients and lemma 3.5. We will, however, take into account what has been said in example 3.10 and majorize all the terms in the sum except the first one, for which we use .
We employ the bounds of lemma 3.5 for the remaining terms:
[TABLE]
Both the sum in curly brackets in (28) and the number are bounded from above by , which concludes the proof. ∎
3.6. Estimating the cumulants of the centered sum
We first state the abstract theorem that will allow us to pass from estimates on the moments of to estimates on the cumulants of .
Theorem 3.11 (Moment estimates imply cumulant estimates for the sum, [17, Theorem 4.21]).
Let be a sequence of random variables defined on a probability space and denote
[TABLE]
Assume that the sequence is associated to some Markov chain and that it is -mixing. Assume also that it satisfies the following moment estimate:
[TABLE]
for some constants and and all integers and . Then for each and the following cumulant estimate is valid for :
[TABLE]
where are taken with respect to .
Let us now apply theorem 3.11 with . Its assumptions are verified with and (lemma 3.2). Choosing and applying the estimates on (lemma 3.9) we arrive at the following
Theorem 3.12 (Cumulant estimates for ).
For any and the -th cumulant of is bounded by
[TABLE]
Theorem 3.12 holds also if we replace with its centering since shifting a random variable by a constant does not affect its cumulants of order . We will use this simple observation in what follows.
3.7. Estimating the tails of the centered sum
We now turn to estimating the tails of . Once again we begin by stating the abstract large deviations theorem.
Theorem 3.13 (Cumulant estimates imply tail estimates, [17, Lemma 2.4], [2]).
Let be a centered (i.e. of zero expected value) random variable defined on a probability space . Assume there exist constants and such that for all integers we have
[TABLE]
Then for all the following inequality is valid:
[TABLE]
Here denotes the cumulant taken w.r.t. , while the notation indicates that the inequality holds both for and .
We may now plug the results of theorem 3.12 for into theorem 3.13. Its assumptions are verified for measure and constants and .
Theorem 3.14 (Tail estimates for ).
For any and the following tail estimate holds for :
[TABLE]
We now combine all the results of this section to obtain the desired estimates on the measure of KL-sets.
Proof of theorem 2.2.
Rewriting the estimates of theorem 3.14 with we arrive at
[TABLE]
The final estimates (9) and (10) stem from (11), the subadditivity of and the estimates for the sum of terms of the form over for contained in lemma A.1 in appendix A. ∎
4. The case of incremented partial quotients
The applicability of theorem 2.2 is not limited to the sequence $M_n$; one can also use it for other sequences for which a counterpart of Khintchine’s theorem holds. We demonstrate it for the sequence of products of incremented partial quotients :
[TABLE]
We choose among other sequences for this purpose since it provides an upper bound for the sequence of denominators of convergents , similarly to , which provides a lower bound. This proves useful in the small divisors estimates that we perform in [9].
We begin by introducing the notations that are a counterpart of (3):
[TABLE]
for . By Birkhoff’s pointwise ergodic theorem the sequence tends to a constant almost everywhere just like in the theorem on Khintchine’s constant. This time, however, the test function is , therefore with
[TABLE]
The Khintchine-Lévy sets are thus defined as
[TABLE]
for and . The sets and are defined analogously to definition 2.1. Theorem 2.2 for -sets reads
Theorem 4.1 (Estimates on the measure of -sets).
Let be a natural number and let be a positive real number. Denote
[TABLE]
where is the Dirichlet function: . Define as in (8), but with in place of . With the notations of theorem 2.2 the estimates on the measures of are the same as in (9) and (10), but with in place of .
Proof.
For the theorem to be proven one needs lemma 3.2 to hold for the sequence along with the equality of averages for all . The claim on averages follows from the invariance of , as was the case with the sequence : we have for all . The cumulant estimates of theorem 3.12 depend only on the constants in the moment estimates and the value of . The latter stems in turn from the mixing coefficients of the sequence in question, which do not change when we switch from to . The Markov chain association assumption also holds for , only for a different sequence of functions: changes to in (25). With that the whole proof forms a food chain that feeds on the moment estimates, which read
[TABLE]
The changes of variables used along the way are and . We also employed the standard formulas for the Dirichlet function:
[TABLE]
and the fact that it is decreasing with so that for ∎
5. Properties of Khintchine-Lévy numbers
In this section we briefly compare Khintchine-Lévy numbers with Diophantine numbers. We begin by recalling the definition of the latter along with a few well-known properties.
Definition 5.1 (Diophantine number).
Let and . We say that a real number is -Diophantine if the inequality
[TABLE]
holds for all integers and with . A number is called Diophantine if it is -Diophantine for some and .
We also have the following characterization of Diophanticity in terms of the continued fraction expansion:
Lemma 5.2 (Diophanticity in terms of the continued fraction expansion).
If an irrational number is -Diophantine with and then its partial quotients can be estimated by
[TABLE]
Conversely, an estimate as in (47) for all results in being -Diophantine.
Proof.
If a number is -Diophantine we have which gives and this gives (47) since .
For the reverse implication fix and suppose we have . We have that for such and any and also for any . Therefore it suffices to show that with . Assuming (47) we have, however,
[TABLE]
The above reasoning works regardless of the choice of , therefore the proof is concluded. ∎
The denominators satisfy a recurrence relation
[TABLE]
which implies that
[TABLE]
for all through simple induction.
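The recurrence is easy to experiment with. The sketch below assumes the standard form $q_n = a_n q_{n-1} + q_{n-2}$ with $q_0 = 1$ and $q_{-1} = 0$ (the displayed formula may use different indexing) and checks numerically the two-sided bound mentioned in section 4: the denominator is bounded from below by the product of the partial quotients and from above by the product of the incremented partial quotients. The function names are ours.

```python
from math import prod


def convergent_denominators(quotients: list[int]) -> list[int]:
    """Denominators q_1, ..., q_n from q_n = a_n * q_{n-1} + q_{n-2}."""
    q_prev, q = 0, 1  # q_{-1} = 0, q_0 = 1
    out = []
    for a in quotients:
        q_prev, q = q, a * q + q_prev
        out.append(q)
    return out


def check_product_bounds(quotients: list[int]) -> bool:
    """Verify prod(a_i) <= q_n <= prod(a_i + 1) for every prefix length n."""
    denominators = convergent_denominators(quotients)
    for n, q in enumerate(denominators, start=1):
        lower = prod(quotients[:n])
        upper = prod(a + 1 for a in quotients[:n])
        if not lower <= q <= upper:
            return False
    return True


if __name__ == "__main__":
    a = [1, 1, 1, 2]  # partial quotients of 5/8
    print(convergent_denominators(a))  # [1, 2, 3, 8]
    print(check_product_bounds(a))  # True
```

Both bounds follow from the recurrence by induction, since $a_n q_{n-1} \leqslant q_n \leqslant (a_n + 1) q_{n-1}$ for a nondecreasing sequence of denominators.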
We first note that when a number is Diophantine with the smallest admissible exponent (otherwise known as a constant type number), then it is also Khintchine-Lévy.
Lemma 5.3.
A number that is -Diophantine with some satisfies with and .
Proof.
By lemma 5.2 constant type numbers are precisely the ones with a bounded sequence of partial quotients: , which implies for all . ∎
Note, however, that constant type numbers form a set of measure zero ([11]). On the other hand, the complement of the set of Diophantine numbers with fixed and is small whenever is small:
Lemma 5.4 (Measure of the set of Diophantine numbers).
The measure of the set of numbers that are not -Diophantine can be estimated from above by if and . Here $\zeta$ denotes the Riemann zeta function.
Proof.
The excluded numbers are contained in the set
[TABLE]
Each of the intervals has length equal to , apart from the intervals and and their total length (for a fixed ) adds up to , therefore . ∎
When it comes to on the other hand it turns out that Khintchine-Lévy numbers are Diophantine, but not the other way round.
Lemma 5.5.
If for some and then is -Diophantine with small enough and , where . If for some and , then it is -Diophantine with small enough and .
Proof.
From (49) we can infer that for any we have , where is the Fibonacci sequence with , and in consequence . Assuming that we have, for all , that
[TABLE]
By lemma 5.2 we see that is -Diophantine with and a suitably chosen (choosing we account for the fact that (52) holds for ).
The case of is similar with the exception that the estimates begin with
[TABLE]
to end with instead of and thus with . ∎
Note that in the first case in lemma 5.5 we can bring as close as we wish to by setting small, while in the second case the critical is .
Using the ideas of the proof of lemma 5.5 we can infer that for some implies at most exponential growth of partial quotients. Therefore any sequence of partial quotients that has a superexponential subsequence gives rise to a non-KL number. Using this we can construct a non-KL number which is Diophantine. In fact such a number can even have a very sparse distribution of partial quotients.
Example 5.6 (A non-KL Diophantine number).
Fix and and set . We define through its partial quotients:
[TABLE]
bearing in mind that the second case in (54) may produce two or more values for small enough and . For fixed , however, there are only finitely many ’s for which this happens and if this is the case we define . We will not be interested in the initial partial quotients.
For the number has a superexponential subsequence of partial quotients, therefore it cannot be in for any and . We will show that is -Diophantine for any and small enough, where is any constant with and is a constant specified later in the proof. (The constant can be chosen as close to as we wish, at the expense of .) Note that this way we can make the exponent as close to the critical as we wish.
To do this we will verify that for all we have
[TABLE]
as this entails (47) and we will be able to use lemma 5.2. After a minor alteration (55) is equivalent to
[TABLE]
Fix for large enough, so that there is no ambiguity in (54) and let be such that . First observe that since for . Also fix and note that for large enough. Additionally set to be the smallest number for which . We have
[TABLE]
If we now prove that (56) holds after we substitute with the right-hand side of (57) then the whole proof is concluded. To do this we need to consider two cases: and . In the first case , so we need
[TABLE]
to hold for all and some . This is, however, the case: the sequence in the largest brackets diverges to , so it must have a minimal value and we only need to set small enough to elevate the whole expression above $0$ since .
The second case gives , we therefore similarly require
[TABLE]
Subtracting from both sides gives a similar inequality to (58), but with a different coefficient at , namely
[TABLE]
For we have and by an argument analogous to the one in the previous case we can make inequality (59) valid by choosing a small enough .
6. Measure of KL-sets: a practical point of view
In this section we focus on the numerical values of estimates of theorems 2.2 and 4.1 for particular values of . We outline the motivation for this in [9], where we perform estimates in a small divisors problem under the assumption that the frequency belongs to one of the KL-sets. It turns out that the quality of these estimates is best when $T$ is as small as possible. There is, however, a price to pay if we want to set $T$ small, namely we have to set $N$ large to obtain reasonable estimates on the measure of KL-sets.
To better illustrate our reasoning we will focus on the set . At the end of the section we present a detailed exposition of numerical values of estimates from theorem 2.2 for selected values of and . For simplicity we will consider the case when is a square of an integer, so that the finite sum term in the estimates of theorem 2.2 vanishes.
Inequality (9) written for tells us that the quantities that will be essential for us are the numerator
[TABLE]
and the denominator
[TABLE]
appearing on its right-hand side; our goal will be to make the resulting lower bound on the measure as close to full as possible. The numerical value of $\kappa$ is approximately $0.9878$ (analogously to a well known formula for $e^{\kappa}$ we can express $\kappa$ as the sum of an infinite series), which suggests that it is only reasonable to consider $T$ of the same order of magnitude (and also when considering the set ; actually even , since all satisfy ). We will therefore consider to be a number satisfying . The problem is that evaluated even at a number as small as is very close to and the distance to gets even smaller as we decrease $T$ towards $0$. This makes small, which tells us that needs to be even smaller. For instance
[TABLE]
and this gives . The only thing we can do to overcome the effect of being big is manipulating the exponent , that appears in . It turns out that, for instance, to have we need . More general numerical values are provided in table 1 below.
Define . The cells in table 1 contain the approximations of minimal values of which guarantee that the estimate is better than the value given in the leftmost column with the value of given in the top row. For instance the bottom-right cell tells us that in order to have the estimate better than with one needs to have .
In other words the values appearing in table 1 tell us that if we want to have a guarantee that e.g. of numbers satisfy the inequality
[TABLE]
for all “large enough” then “large enough” means “greater than ”. Observe, however, that for a given value of the entries of the table are of the same order of magnitude. This means that in order to reach a sharp measure estimate one does not pay a significantly greater price than that of crossing the threshold given by the value in the “” line, the “currency” here being the amount of initial numbers that need to be excluded from our considerations.
The values for the -sets are provided in table 2, is defined analogously to .
7. Concluding remarks
Since theorem 2.2 holds for both and it is natural to ask whether it also does for . The sequence of denominators of convergents also enjoys exponential growth almost everywhere, with rate ([12]). We were, however, not able to reproduce the reasoning of section 3 due to a slightly different nature of this sequence, compared to either or . The first difference between and is in the averages: we have , while with a remainder bounded in . More importantly, however, the success of the reasoning in section 3 relies on the fact that can be expressed as the sum of summands , which satisfy both the mixing assumption and the Markov chain association assumption. For one can use the sequence as a counterpart of , but this sequence does not have the mixing property and this way the whole food chain of lemmas we used in section 3 breaks apart. To see this we need to take a closer look at the structure of the “past” and the “future” -algebras of the sequence (note that they are the same as the corresponding -algebras for the sequence ). The latter is given by (by with no indices we mean the -algebra generated by random variables or sets in brackets) for and is thus generated by the preimages of singletons of rationals with . Due to how is constructed from the sets , however, are actually finite intersections of the preimages of singletons of positive integers through functions . As a consequence can actually be written as for any , which in particular means that contains all of the “past” -algebras with as they are actually equal to by a similar argument. This inclusion of -algebras is what prevents the mixing coefficients of from converging to $0$ just as was the case in example 3.10, since admits sets of arbitrarily small measure.
We also made a choice of sticking to -mixing instead of -mixing even though we only consider quantities which are -mixing if they exhibit any kind of mixing. This is because in the formula for in definition 3.8 there is a dependence on for a mixing function and an integer index . In example 3.10 we learnt, however, that may be infinite, which would yield no control over and in consequence no control over the measure of sets. This is not the case if we consider -mixing.
The price we pay for this detail is, however, quite significant. The type of mixing we employ has an impact on the quality of the cumulant estimates in theorem 3.11. This result also has a -mixing counterpart ([17, Theorem 4.21, second inequality]) in which the cumulant estimates are better: instead of a factor in (33) there appears . This decrease of the exponent at , plugged into the rest of the food chain of theorems, would have an impact on theorem 2.2.
Observe that in the proof of this theorem we sum terms of the form (with ), which gives a slowly converging series and large values in tables 1 and 2. Using -mixing would turn the summands into a geometric progression, whose series converges much faster and gives much better values in the counterparts of tables 1 and 2. The orders of magnitude (i.e. the exponents at ) in said tables would be reduced roughly by half. We will delve into this matter in further research, as the problem seems to stem directly from the fact that we are using very general large deviation theorems which do not take into account the specifics of the very well studied sequence .
Some evidence that the numbers obtained in tables 1 and 2 are far from optimal comes also from the analysis of the continued fraction of π. A (non-rigorous) analysis of its initial partial quotients provided in [3] gives the maximal value of equal to (stemming from the unusually large ). We also have for , which is inconsistent with table 2 by several orders of magnitude if we assume that π has a somewhat “generic” continued fraction expansion. For larger the difference is even more striking: considering gives an oscillation of the order of magnitude of between and . Using the data in [3] and the estimates of the current paper one can only derive a rather ineffective result on π in the spirit of the ones provided by tables 1 and 2.
Corollary 7.1.
The number π satisfies the inequality
[TABLE]
for all with probability . (Footnote xviii: as in the rest of the paper, by probability we mean the Gauss measure .)
Proof.
The data in [3] give the estimate for . For we can use theorem 2.2 with which gives the base . ∎
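The behaviour of the initial partial quotients of π discussed before Corollary 7.1 can be explored with a short computation. The sketch below is an illustration under stated assumptions and does not reproduce the data of [3]. It computes π to a fixed number of decimal digits with Machin's formula in integer arithmetic, expands two rational bounds enclosing π into continued fractions, and keeps only their common prefix, which is then guaranteed to be a prefix of the expansion of π. The deviation printed is log(M_n)/n minus the logarithm of Khinchin's constant, with M_n assumed to be the product a_1 ⋯ a_n (integer part excluded). All function names and the choice of 60 digits are hypothetical.

```python
import math

LOG_KHINCHIN = math.log(2.6854520010653064)   # log of Khinchin's constant


def pi_times_pow10(digits, guard=10):
    """Integer within 1 of floor(pi * 10**digits), via Machin's formula."""
    scale = 10 ** (digits + guard)

    def arctan_inv(x):
        # scale * arctan(1/x) from the alternating Taylor series.
        power = scale // x            # approximates scale / x**(2k+1)
        total = power
        k, sign = 1, -1
        while power:
            power //= x * x
            k += 2
            total += sign * (power // k)
            sign = -sign
        return total

    return (16 * arctan_inv(5) - 4 * arctan_inv(239)) // 10 ** guard


def cf_terms(num, den):
    """Continued fraction of num/den by the Euclidean algorithm."""
    terms = []
    while den:
        a, r = divmod(num, den)
        terms.append(a)
        num, den = den, r
    return terms


def pi_partial_quotients(digits=60):
    """Partial quotients of pi certified by the given number of digits."""
    p = pi_times_pow10(digits)
    den = 10 ** digits
    lower = cf_terms(p - 1, den)      # (p - 1)/den < pi
    upper = cf_terms(p + 2, den)      # pi < (p + 2)/den
    common = []
    for a, b in zip(lower, upper):
        if a != b:
            break
        common.append(a)
    return common


terms = pi_partial_quotients(60)
log_m = 0.0
for n, a_n in enumerate(terms[1:21], start=1):   # skip the integer part 3
    log_m += math.log(a_n)
    print(n, a_n, round(log_m / n - LOG_KHINCHIN, 4))
```

The large entry 292 near the beginning of the expansion dominates the first few deviations, which is the kind of effect the discussion above attributes to an unusually large partial quotient. Increasing `digits` certifies more terms, roughly one partial quotient per decimal digit for a typical number.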
Acknowledgements
During part of the research that led to the preparation of this article, the author was supported by the Foundation for Polish Science under the MPD Programme Geometry and Topology in Physical Models, co-financed by the EU European Regional Development Fund, Operational Program Innovative Economy 2007-2013.
Appendix A Auxiliary identities
Lemma A.1 (Sum of ).
For any and the following inequality holds:
[TABLE]
where . In particular, when is the square of an integer, the estimate takes the form
[TABLE]
Proof.
The proof relies on the following identity:
[TABLE]
We have
[TABLE]
∎
References
- [1] V. I. Arnol’d. Small denominators. I. Mapping the circle onto itself. Izvestiya Akademii Nauk SSSR. Seriya Matematicheskaya, 25:21–86. English translation in Amer. Math. Soc. Transl. (2), 46:213–284, 1965.
- [2] R. Bentkus and R. Rudzkis. On exponential estimates of the distributions of random variables. Lithuanian Mathematical Journal, 20:15–30, 1980.
- [3] N. Bickford. Pi CF and the Continued Fraction of Pi. http://neilbickford.com/picf.htm, 2010. [Online, accessed 3-Oct-2018].
- [4] R. C. Bradley. Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions. Probab. Surveys, 2:107–144, 2005.
- [5] R. de la Llave, A. González, À. Jorba, and J. Villanueva. KAM theory without action-angle variables. Nonlinearity, 18:855–895, 2005.
- [6] S. Hörmann. Berry-Esseen bounds for econometric time series. Alea, 6:377–397, 2009.
- [7] I. A. Ibragimov. A theorem from the metric theory of continued fractions. Vestnik Leningrad. Univ., 16(1):13–24, 1961.
- [8] M. Iosifescu and C. Kraaikamp. Metrical Theory of Continued Fractions. Springer Netherlands, 2002.
