A quantitative version of the theorem on Khintchine's constant

Piotr Kamieński

arXiv:1903.00255·math.DS·March 4, 2019

TL;DR

This paper provides explicit measure estimates for the set of numbers with continued fraction partial quotients' products growing exponentially at a rate close to Khintchine's constant, using large deviations theory and cumulant methods.

Contribution

It offers a quantitative, non-asymptotic measure estimate for numbers with continued fraction products near Khintchine's predicted growth rate, with explicit bounds.

Findings

01

Measure estimates can be made arbitrarily close to full for large N.

02

Bounds are explicit and not asymptotic, depending on parameters.

03

Employs large deviations theory and cumulant method in proof.

Abstract

In the paper we provide measure estimates for the set of numbers whose sequence of products of continued fraction partial quotients $M_{n} = a_{1} \dots a_{n}$ has exponential growth with rate close to the one predicted by Khintchine's theorem, i.e. for which \begin{equation*} e^{(\kappa - T)n} \leqslant M_n \leqslant e^{(\kappa + T)n} \end{equation*} for a fixed $T > 0$ and all $n$ greater than some fixed integer $N$, where $e^{\kappa} = 2.685 \dots$ is the Khintchine constant. Choosing $N$ large enough, the measure can be made arbitrarily close to full, for any given $T$. The bounds are not of asymptotic nature, but explicit in terms of the parameters involved. In the proof we compile several known results of large deviations theory, employing the cumulant method in particular. We also discuss the numerical values of the quantities involved.

Tables

Table 1. Approximate minimal value of $N=K^{2}$ for given $T$ and desired $\mathsf{est}^{\prime}$.

	$T = 2$	$T = 1$	$T = 0.5$	$T = 0.1$
$\mathsf{est}^{\prime} > 1\%$	3.969e9	8.122e10	1.633e12	1.610e15
$\mathsf{est}^{\prime} > 50\%$	4.225e9	8.585e10	1.724e12	1.681e15
$\mathsf{est}^{\prime} > 90\%$	4.900e9	9.860e10	1.949e12	1.854e15
$\mathsf{est}^{\prime} > 99\%$	6.084e9	1.183e11	2.292e12	2.116e15
$\mathsf{est}^{\prime} > 99.9\%$	7.225e9	1.399e11	2.663e12	2.394e15

Table 2. Approximate minimal value of $N=K^{2}$ for given $T$ and desired $\mathsf{est}$.

	$T_{+} = 2$	$T_{+} = 1$	$T_{+} = 0.5$	$T_{+} = 0.1$
$\mathsf{est} > 1\%$	2.074e10	4.238e11	8.456e12	8.154e15
$\mathsf{est} > 50\%$	2.220e10	4.489e11	8.898e12	8.496e15
$\mathsf{est} > 90\%$	2.560e10	5.112e11	9.992e12	9.328e15
$\mathsf{est} > 99\%$	3.098e10	6.068e11	1.166e13	1.058e16
$\mathsf{est} > 99.9\%$	3.686e10	7.090e11	1.345e13	1.192e16

Equations

e^{(\kappa - T)n} \leqslant M_{n} \leqslant e^{(\kappa + T)n}

\omega = [a_{1}, a_{2}, a_{3}, \dots] = \cfrac{1}{a_{1} + \cfrac{1}{a_{2} + \cfrac{1}{a_{3} + \cfrac{1}{\ddots}}}}.

G(\omega) = G([a_{1}, a_{2}, a_{3}, \dots]) = [a_{2}, a_{3}, \dots] = \left\{\frac{1}{\omega}\right\}.

M_{n}(\omega) := a_{1}(\omega) \cdot \ldots \cdot a_{n}(\omega), \quad X_{n} := \log a_{n} \quad \mbox{and} \quad S_{n} := X_{1} + \ldots + X_{n}.

\kappa = \int_{\mathbb{X}} \log a_{1}\, d\gamma = \int_{0}^{1} \log\left(\lfloor x^{-1} \rfloor\right) \frac{dx}{(1+x)\log 2}.

M_{n}(\omega) \leqslant e^{(\kappa + T_{+})n}.

e^{(\kappa - T_{-})n} \leqslant M_{n}(\omega).

\mathtt{KL}_{n}^{+}(T) := \left\{\omega \in \mathbb{X} : M_{n}(\omega) \leqslant e^{(\kappa + T)n}\right\}

\Xi(T) = \exp\left(-\frac{T^{2}}{2\left(128\bar{r}^{2}\bar{\Lambda} + (16\bar{r}\bar{\Lambda})^{1/3}T\right)^{3/2}}\right),

\gamma(\mathtt{KL}^{\pm}(T,N)) \geqslant 1 - \sum_{n=N}^{K^{2}-1} \Xi(T)^{\sqrt{n}} - \frac{\Xi(T)^{K}}{1-\Xi(T)}\left(2K + 1 + \frac{4\Xi(T)}{1-\Xi(T)}\right).

\gamma(\mathtt{KL}^{\pm}(T,N)) \geqslant 1 - (1-\Xi(T))^{-1}\left(2\sqrt{N} + 1 + \frac{4\Xi(T)}{1-\Xi(T)}\right) \cdot \Xi(T)^{\sqrt{N}}.

\mathtt{KL}^{\pm}(T,N)^{c} = \bigcup_{n=N}^{\infty} \mathtt{KL}_{n}^{\pm}(T)^{c}.

\mathtt{KL}_{n}^{+}(T)^{c} = \left\{\omega \in \mathbb{X} : \frac{1}{n}\sum_{j=1}^{n}(X_{j} - \kappa) \geqslant T\right\} = \left\{\omega \in \mathbb{X} : S_{n} - n\kappa \geqslant Tn\right\}

\Gamma_{k}(Y) = \frac{1}{i^{k}} \frac{d^{k}}{dt^{k}} \left(\log\left(\mathbb{E}_{\mu}e^{itY}\right)\right)\Big|_{t=0}.

\mathbb{E}_{\gamma}|X_{n}|^{k} \leqslant k! \cdot \bar{r}^{k}.

\bar{r} = \sqrt{\frac{3}{2\log 2}} \approx 1.471.

\mathbb{E}_{\gamma}|X_{n}|^{k} \leqslant \bar{r}^{2} \cdot k!.

\mathbb{E}_{\gamma}|X_{n}|^{k} = \mathbb{E}_{\gamma}|\log a_{n}|^{k} \overset{(\star)}{=} \mathbb{E}_{\gamma}|\log a_{1}|^{k} = \int_{0}^{1} \frac{|\log\lfloor x^{-1}\rfloor|^{k}}{(1+x)\log 2}\,dx \overset{(\star\star)}{\leqslant} \int_{0}^{1} \frac{|\log(x^{-1}-1)|^{k}}{(1+x)\log 2}\,dx = \int_{0}^{\infty} \frac{|\log^{k} y|\,dy}{(1+y)(2+y)\log 2} = \int_{-\infty}^{\infty} \frac{|z|^{k}e^{z}\,dz}{(1+e^{z})(2+e^{z})\log 2} = \left(\int_{-\infty}^{0} + \int_{0}^{\infty}\right) \frac{|z|^{k}e^{z}\,dz}{(1+e^{z})(2+e^{z})\log 2}.

\left(\int_{-\infty}^{0} + \int_{0}^{\infty}\right) \frac{|z|^{k}e^{z}\,dz}{(1+e^{z})(2+e^{z})\log 2} = \frac{1}{\log 2}\left(\int_{0}^{\infty} \frac{u^{k}e^{u}\,du}{(1+e^{u})(2+e^{u})} + \int_{\infty}^{0} \frac{u^{k}e^{-u}(-du)}{(1+e^{-u})(2+e^{-u})}\right) = \frac{1}{\log 2}\int_{0}^{\infty} u^{k}e^{u}\left[\frac{1}{(1+e^{u})(2+e^{u})} + \frac{1}{(e^{u}+1)(2e^{u}+1)}\right]du = \frac{1}{\log 2}\int_{0}^{\infty} \frac{3u^{k}e^{u}\,du}{(2+e^{u})(1+2e^{u})} \leqslant \frac{3}{2\log 2}\int_{0}^{\infty} u^{k}e^{-u}\,du = \frac{3}{2\log 2} \cdot k!.

\varphi(s,t) = \sup \left|\mu(B \mid A) - \mu(B)\right|

\varphi_{n} = \sup_{k\in\mathbb{N}} \varphi(k, k+n).

\psi(s,t) = \sup \frac{\mu(A \cap B)}{\mu(A)\mu(B)} - 1

\psi_{n} \leqslant \psi_{2}\lambda_{0}^{n-2}

Y_{n} = f_{n}(\xi_{n})

s_{n} = [a_{n}, a_{n-1}, \dots, a_{1}] = \cfrac{1}{a_{n} + \cfrac{1}{a_{n-1} + \cfrac{1}{\ddots + \cfrac{1}{a_{1}}}}}.

f_{1}(\xi) = \ldots = f_{n}(\xi) = \ldots = \log\lfloor \xi^{-1} \rfloor.

\Lambda_{n}(f,u) = \max\left\{1, \max_{1\leqslant s\leqslant n} \sum_{t=s}^{n} f(s,t)^{1/u}\right\}.

\Lambda_{n}(\varphi,2) \leqslant \bar{\Lambda},

\bar{\Lambda} = 1 + \left(\log 2 - \frac{1}{2}\right)^{1/2} + \left(\frac{\pi^{2}\log 2}{12} - \frac{1}{2}\right)^{1/2} \cdot \frac{1}{1-\lambda_{0}^{1/2}} \approx 2.029

\psi(s,s) \geqslant \sup_{n} \frac{\mu(C_{n} \cap C_{n})}{\mu(C_{n})\mu(C_{n})} - 1 = \sup_{n} \frac{1}{\mu(C_{n})} - 1 = \infty

Full text

A quantitative version of the theorem on Khintchine’s constant

Piotr Kamieński

Abstract.

In the paper we provide measure estimates for the set of numbers whose sequence of products of continued fraction partial quotients $M_{n}=a_{1}\ldots a_{n}$ has exponential growth with rate close to the one predicted by Khintchine’s theorem, i.e. for which

\begin{equation*} e^{(\kappa - T)n} \leqslant M_{n} \leqslant e^{(\kappa + T)n} \end{equation*}

for a fixed $T>0$ and all $n$ greater than some fixed integer $N$, where $e^{\kappa}=2.685\ldots$ is the Khintchine constant. Choosing $N$ large enough, the measure can be made arbitrarily close to full, for any given $T$. The bounds are not of asymptotic nature, but explicit in terms of the parameters involved. In the proof we compile several known results of large deviations theory, employing the cumulant method in particular. We also discuss the numerical values of the quantities involved.

1. Motivation

Diophantine (and Brjuno) numbers (see definition 5.1 for details) are commonly used in small divisors problems ([1, 16, 18]). In KAM theory, for instance, if the frequency $\omega$ of an invariant torus is Diophantine then this torus survives once a perturbation is introduced; this happens for small enough values of the perturbation parameter and under the additional twist condition. The term “small enough”, however, when specified precisely by a KAM-type theorem, usually means “smaller than some explicit formula depending on the Diophantine constant $C$ and exponent $\tau$” (as in e.g. [5]). The problem with this approach appears when we consider a family of tori with varying frequencies. Changing $\omega$ by replacing either its first few decimal digits or continued fraction partial quotients does not change $\tau$, but might decrease $C$ quite significantly, lowering the applicability threshold of the theorem through a multiplicative correction.

We propose an alternative to the Diophantine condition, which we call the Khintchine-Lévy condition (or $\mathtt{KL}$-condition for short), to account for this disadvantage. In the present paper we give the definition of the Khintchine-Lévy numbers and prove that the set of all those numbers is generic in the sense of Lebesgue measure, as is the case with the set of Diophantine numbers. Specifically, we provide explicit lower bounds on the measure of the set of $\mathtt{KL}$ numbers in terms of the parameters involved in their definition. In a parallel paper [9] we employ $\mathtt{KL}$ numbers to prove a small denominators result that is similar in nature to ones already obtained for Diophantine numbers. Khintchine-Lévy numbers, however, have one advantage over Diophantine ones, namely they are less sensitive to the aforementioned changes in the initial part of the continued fraction. In the estimates in [9] such changes are reflected only through a minor additive correction.

We briefly introduce some notation before proceeding with the details. We will be working with irrational numbers $\omega\in\mathbb{X}:=[0,1]\setminus\mathbb{Q}$ considered as a probability space with the Borel $\sigma$-algebra and either the Lebesgue measure $\lambda$ or the Gauss measure $\gamma$ given in terms of a density function: $d\gamma(x)={d\lambda(x)\over(1+x)\log 2}$. (Note that the two measures are absolutely continuous with respect to one another. In particular the terms “Gauss almost all” and “Lebesgue almost all” can be used interchangeably.) The expected value of a random variable w.r.t. a measure $\mu$ will be denoted by $\mathbb{E}_{\mu}$ and $A^{c}$ will denote the complement of a set $A\subset\mathbb{X}$.

Each $\omega\in\mathbb{X}$ has a unique infinite continued fraction expansion into a sequence of partial quotients $a_{j}\in\mathbb{N}_{+}$, $j=1,2,\ldots$ (note that we consider $a_{0}=0$, since we are in $\mathbb{X}\subset(0,1)$):

\begin{equation*} \omega = [a_{1}, a_{2}, a_{3}, \dots] = \cfrac{1}{a_{1} + \cfrac{1}{a_{2} + \cfrac{1}{a_{3} + \cfrac{1}{\ddots}}}}. \end{equation*}

The shift on the continued fraction expansion is a measurable transformation known as the Gauss map $G:\mathbb{X}\mapsto\mathbb{X}$ :

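\begin{equation*} G(\omega) = G([a_{1}, a_{2}, a_{3}, \dots]) = [a_{2}, a_{3}, \dots] = \left\{\frac{1}{\omega}\right\}. \end{equation*}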

It preserves the Gauss measure and is ergodic with respect to that measure ([15]). We also note that $a_{n}(\omega)=a_{1}(G^{n-1}(\omega))$ . For $n\geqslant 1$ we denote

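\begin{equation*} M_{n}(\omega) := a_{1}(\omega) \cdot \ldots \cdot a_{n}(\omega), \quad X_{n} := \log a_{n} \quad \mbox{and} \quad S_{n} := X_{1} + \ldots + X_{n}. \end{equation*}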
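As a small illustration of this notation, the following Python sketch iterates the Gauss map $G$ to read off partial quotients and then forms $M_{n}$ and $S_{n}$. The function names are hypothetical, and a rational input is used purely so that the arithmetic is exact; the paper itself works with irrational $\omega$, whose expansions never terminate.

```python
from fractions import Fraction
from math import log


def partial_quotients(omega: Fraction, n: int) -> list[int]:
    """Return up to n partial quotients a_1, a_2, ... of omega in (0, 1).

    Uses a_1(x) = floor(1/x) and the Gauss map G(x) = {1/x}.
    For a rational input the expansion terminates, so fewer than n
    digits may be returned.
    """
    digits = []
    x = omega
    for _ in range(n):
        if x == 0:  # rational input: nothing left to expand
            break
        inv = 1 / x
        a = inv.numerator // inv.denominator  # floor(1/x)
        digits.append(a)
        x = inv - a  # fractional part of 1/x, i.e. G(x)
    return digits


def product_and_log_sum(digits: list[int]) -> tuple[int, float]:
    """Return (M_n, S_n) = (a_1 * ... * a_n, log a_1 + ... + log a_n)."""
    m = 1
    for a in digits:
        m *= a
    s = sum(log(a) for a in digits)
    return m, s


# Illustrative input only: 13/31 = [2, 2, 1, 1, 2].
digits = partial_quotients(Fraction(13, 31), 5)
M, S = product_and_log_sum(digits)
```

For an irrational $\omega$ one would need either exact arithmetic on a symbolic representation or enough floating-point precision, since each application of $G$ roughly loses the digits consumed by the partial quotient.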

The main motivation comes from the classical theorem on Khintchine’s constant ([10]). It tells us that for almost all $\omega\in\mathbb{X}$ the limit of ${1\over n}S_{n}(\omega)$ exists, is finite and constant as a function of $\omega$ within said full measure set. We denote this limit by $\kappa$ and refer to $e^{\kappa}$ as the Khintchine constant (this is consistent with existing literature, where Khintchine’s constant is defined as the pointwise a.e. limit of $\sqrt[n]{M_{n}}$). One can observe that ${1\over n}S_{n}(\omega)$ is actually the time-$n$ average of the test function $X_{1}$ along the orbit of $\omega$ under the action of $G$. Khintchine’s theorem is thus a consequence of the Birkhoff pointwise ergodic theorem and $\kappa$ must be equal to the spatial average of $X_{1}$:

\begin{equation*} \kappa = \int_{\mathbb{X}} \log a_{1}\, d\gamma = \int_{0}^{1} \log\left(\lfloor x^{-1} \rfloor\right) \frac{dx}{(1+x)\log 2}. \tag{4} \end{equation*}
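The spatial average can be checked numerically. Since $a_{1}(x)=k$ exactly when $x\in(1/(k+1),1/k]$, the Gauss measure of that event is $\log_{2}\left(1+\frac{1}{k(k+2)}\right)$ and the integral becomes a series over $k$. The sketch below sums that series in Python; the function names and the truncation point are assumptions chosen for illustration, and because every omitted term is positive the partial sum slightly underestimates $\kappa$.

```python
from math import exp, log, log1p


def gauss_prob_first_digit(k: int) -> float:
    """Gauss measure of {a_1 = k}, i.e. log2(1 + 1/(k(k+2)))."""
    return log1p(1.0 / (k * (k + 2))) / log(2)


def kappa_partial_sum(terms: int) -> float:
    """Partial sum of kappa = sum_k log(k) * gamma(a_1 = k).

    The k = 1 term vanishes because log(1) = 0, so the sum starts at 2.
    """
    return sum(log(k) * gauss_prob_first_digit(k) for k in range(2, terms + 1))


kappa = kappa_partial_sum(200_000)
khintchine = exp(kappa)  # should be close to the value 2.685... quoted above
```

The series converges slowly (its terms decay like $\log k/k^{2}$), so many terms are needed for even three correct decimals.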

In probabilists’ language Khintchine’s theorem is actually a strong law of large numbers for the sequence of “samples” $(X_{n})_{n=1}^{\infty}$. This result was later improved in the form of a plethora of limit theorems (see the monograph [8, Chapter 3] and references therein for a survey). To the author’s knowledge, however, all of the existing results are of asymptotic nature, and none provides exact estimates of the measure with explicitly computed constants. Our main result, theorem 2.2, aims to fill this gap.

It is also worth noting that the sequence of denominators $(q_{n})_{n=1}^{\infty}$ of convergents to $\omega$ (defined as the denominators of the reduced fractions obtained by truncating the continued fraction expansion at $a_{n}$) has also been extensively studied in the literature. In [9] we discuss why this sequence is even more important from the point of view of small denominators problems and KAM theory. Notable results include the analogue of Khintchine’s theorem by Khintchine and Lévy ([12]), its refinement by Philipp and Stackelberg in the form of a law of the iterated logarithm ([14]) and further refinements by Ibragimov [7] and Misevičius [13], who obtained a central limit theorem with error bounds. We were, however, unable to prove the counterpart of theorem 2.2 for the sequence $(q_{n})$ for technical reasons, which we discuss in the final section 7.

In section 2 we provide the reader with a precise definition of a Khintchine-Lévy number in definition 2.1 and later in theorem 2.2 we specify how far from full the measure of the set of these numbers is. In section 3 we introduce all the necessary tools and combine them into a proof of this result, the most important one being the cumulant method in theorem 3.13, which provides estimates for the tails of a r.v. given the estimates for its cumulants. In section 4 we formulate and discuss the proof of a variation on theorem 2.2 for a slight modification of the sequence $M_{n}$ . Section 5 contains a brief comparison of the Khintchine-Lévy numbers with Diophantine numbers. We conclude the paper with a brief practical analysis of the numerical values of parameters used along its course and some final remarks in sections 6 and 7. The simple, but lengthy formulas are contained within appendix A for clarity.

2. Main result

The idea behind the Khintchine-Lévy condition is the following. From Khintchine’s theorem we can vaguely conclude that on a full measure set of $\omega$ the sequence $M_{n}(\omega)$ asymptotically exhibits exponential growth similar to $e^{\kappa n}$ . We therefore conjecture that on a slightly smaller set, one whose measure is only close to full, the sequence $M_{n}(\omega)$ also exhibits exponential growth, but with slightly more relaxed requirements on its rate. Along the course of the paper we will learn that this is indeed the case, as stated in theorem 2.2.

Definition 2.1 (Khintchine-Lévy condition).

We say that an irrational number $\omega$ is upper- $\mathtt{KL}$ with constants $T_{+}>0$ and $N\in\mathbb{N}$ if the following inequality holds for all $n\geqslant N$ :

\begin{equation*} M_{n}(\omega) \leqslant e^{(\kappa + T_{+})n}. \end{equation*}

Similarly, a number is lower- $\mathtt{KL}$ with constants $T_{-}>0$ and $N\in\mathbb{N}$ if for all $n\geqslant N$ we have

\begin{equation*} e^{(\kappa - T_{-})n} \leqslant M_{n}(\omega). \end{equation*}

We denote the sets formed by the numbers $\omega$ with the above properties by, respectively, $\mathtt{KL}^{+}(T_{+},N)$ and $\mathtt{KL}^{-}(T_{-},N)$ . We also denote $\mathtt{KL}(T_{-},T_{+},N):=\mathtt{KL}^{+}(T_{+},N)\cap\mathtt{KL}^{-}(T_{-},N)$ and $\mathtt{KL}(T,N):=\mathtt{KL}(T,T,N)$ where $T>0$ .

Also, for a given natural number $n$ , we denote by $\mathtt{KL}^{+}_{n}(T)$ the set

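\begin{equation*} \mathtt{KL}_{n}^{+}(T) := \left\{\omega \in \mathbb{X} : M_{n}(\omega) \leqslant e^{(\kappa + T)n}\right\} \end{equation*}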

and similarly for $\mathtt{KL}^{-}$ and $\mathtt{KL}$ .

If a set of numbers $A\subset\mathbb{X}$ is of the form $A=\mathtt{KL}^{\pm}(T,N)$ for some $T$ and $N$ we will refer to it as an (upper/lower-)Khintchine-Lévy set or a $\mathtt{KL}$-set for short.

We are now ready to state the main result of this paper, which is in fact the aforementioned conjecture with all the necessary details accounted for.

Theorem 2.2 (Estimates on the measure of $\mathtt{KL}$ -sets).

Let $N$ be a natural number and let $T$ be a positive real number. Denote

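\begin{equation*} \Xi(T) = \exp\left(-\frac{T^{2}}{2\left(128\bar{r}^{2}\bar{\Lambda} + (16\bar{r}\bar{\Lambda})^{1/3}T\right)^{3/2}}\right), \tag{8} \end{equation*}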

where $\bar{r}$ and $\bar{\Lambda}$ are universal constants given in (15) and (28). Also denote $K=\left\lceil\sqrt{N}\right\rceil$ . The lower bounds on the Gauss measures of Khintchine-Lévy sets are given by

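\begin{equation*} \gamma(\mathtt{KL}^{\pm}(T,N)) \geqslant 1 - \sum_{n=N}^{K^{2}-1} \Xi(T)^{\sqrt{n}} - \frac{\Xi(T)^{K}}{1-\Xi(T)}\left(2K + 1 + \frac{4\Xi(T)}{1-\Xi(T)}\right). \tag{9} \end{equation*}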

In particular for $N=K^{2}$ being a square of an integer we have

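\begin{equation*} \gamma(\mathtt{KL}^{\pm}(T,N)) \geqslant 1 - (1-\Xi(T))^{-1}\left(2\sqrt{N} + 1 + \frac{4\Xi(T)}{1-\Xi(T)}\right) \cdot \Xi(T)^{\sqrt{N}}. \tag{10} \end{equation*}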

Formulas (8) and (10) imply in particular that, regardless of how small $T$ is, we can still find an $N=N(T)$ such that the measure is $\varepsilon$-close to full for any fixed $\varepsilon>0$. In section 6 we discuss the function $N(T)$ from a numerical point of view.
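To get a feel for how such an $N(T)$ could be located, here is a hedged Python sketch. It assumes the reading $\Xi(T)=\exp\left(-T^{2}/\left(2\left(128\bar{r}^{2}\bar{\Lambda}+(16\bar{r}\bar{\Lambda})^{1/3}T\right)^{3/2}\right)\right)$ and, for $N=K^{2}$, the lower bound $1-(1-\Xi)^{-1}\left(2K+1+\frac{4\Xi}{1-\Xi}\right)\Xi^{K}$, together with $\bar{r}=\sqrt{3/(2\log 2)}$ and the rounded value $\bar{\Lambda}\approx 2.029$. All function names are hypothetical, and the sketch is not claimed to reproduce the paper's tables.

```python
from math import exp, expm1, log, sqrt

R_BAR = sqrt(3 / (2 * log(2)))  # approx. 1.471 (assumed form of the constant)
LAMBDA_BAR = 2.029              # rounded value, an approximation


def log_xi(T: float) -> float:
    """Logarithm of Xi(T) under the assumed formula (always negative)."""
    base = 128 * R_BAR**2 * LAMBDA_BAR + (16 * R_BAR * LAMBDA_BAR) ** (1 / 3) * T
    return -(T**2) / (2 * base**1.5)


def measure_lower_bound(T: float, K: int) -> float:
    """Assumed lower bound on the Gauss measure for N = K^2."""
    lx = log_xi(T)
    xi = exp(lx)
    one_minus_xi = -expm1(lx)  # accurate even when Xi is very close to 1
    return 1 - (2 * K + 1 + 4 * xi / one_minus_xi) * exp(K * lx) / one_minus_xi


def minimal_K(T: float, target: float) -> int:
    """Smallest K whose bound reaches `target` (the bound increases with K)."""
    hi = 1
    while measure_lower_bound(T, hi) < target:
        hi *= 2
    lo = hi // 2  # the bound at lo is below target (or lo == 0)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if measure_lower_bound(T, mid) >= target:
            hi = mid
        else:
            lo = mid
    return hi


K_half = minimal_K(2.0, 0.5)  # corresponding N is K_half ** 2
```

Working with $\log\Xi$ and `expm1` avoids the cancellation that would occur when computing $1-\Xi$ directly, since $\Xi(T)$ is extremely close to $1$ for moderate $T$.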

3. Proof of theorem 2.2

To estimate the measure of $\mathtt{KL}^{\pm}(T,N)$ from below is the same as to estimate the measure of its complement $\mathtt{KL}^{\pm}(T,N)^{c}$ from above. The complement, however, can be expressed as a union of complements of $\mathtt{KL}_{n}^{\pm}(T)$:

\begin{equation*} \mathtt{KL}^{\pm}(T,N)^{c} = \bigcup_{n=N}^{\infty} \mathtt{KL}_{n}^{\pm}(T)^{c}. \tag{11} \end{equation*}

Our focus will therefore be on estimating $\gamma(\mathtt{KL}_{n}^{\pm}(T)^{c})$ from above, so that we can use the subadditivity of $\gamma$ in the end. (The union in (11) is not disjoint, but we are not concerned with the overlaps of its members in the proof.)

3.1. $\mathtt{KL}$ -sets as tails of probability distributions

To prove theorem 2.2 we first need to reformulate its statement in the spirit of large deviations theory; we will mainly use the language of the random variables $X_{n}$ and $S_{n}$. Once this is done we will lay out the framework of the proof and fill in the details in the following subsections of this section.

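First observe that $\mathbb{E}_{\gamma}X_{j}=\kappa$ for all $j$ and thus $\mathbb{E}_{\gamma}S_{n}=n\kappa$; this is a consequence of the fact that $X_{j}=X_{1}\circ G^{j-1}$ and the $G$-invariance of $\gamma$ (recall (4)). Using this we can now write $\mathtt{KL}_{n}^{+}(T)^{c}$ in terms of centerings of $X_{j}$ and $S_{n}$:

\begin{equation*} \mathtt{KL}_{n}^{+}(T)^{c} = \left\{\omega \in \mathbb{X} : \frac{1}{n}\sum_{j=1}^{n}(X_{j} - \kappa) \geqslant T\right\} = \left\{\omega \in \mathbb{X} : S_{n} - n\kappa \geqslant Tn\right\} \end{equation*}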

and similarly for $\mathtt{KL}_{n}^{-}(T)$ . This way estimating $\gamma(\mathtt{KL}_{n}^{\pm}(T))$ from below is the same as estimating the right/left tail of the centering of $S_{n}$ from above.

Our strategy will be the following. We first estimate the moments of $X_{n}$ in lemma 3.2. These moment estimates will allow us to use theorem 3.11 to obtain estimates on the cumulants of $S_{n}$ and also of $S_{n}-n\kappa$ (shifting a random variable by a constant affects only the first cumulant, the only one we will not be concerned with). For this, however, we will need two additional assumptions on $X_{n}$: the $\varphi$-mixing assumption and the Markov chain association assumption. We introduce them in definitions 3.3 and 3.6 and verify their validity for $X_{n}$ in lemmas 3.5 and 3.7. Once we have the cumulant estimates of $S_{n}-n\kappa$ we can estimate its tails; this is done with the help of theorem 3.13.

Before we proceed we clarify what we exactly mean by moments and cumulants.

Definition 3.1.

Let $k\in\mathbb{N}_{+}$ and let $Y$ be a random variable on a probability space $(\mathbb{Y},\mathcal{Y},\mu)$ . We define the $k$ -th moment of $Y$ as $\mathbb{E}_{\mu}|Y|^{k}$ and the $k$ -th cumulant of $Y$ as

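\begin{equation*} \Gamma_{k}(Y) = \frac{1}{i^{k}} \frac{d^{k}}{dt^{k}} \left(\log\left(\mathbb{E}_{\mu}e^{itY}\right)\right)\Big|_{t=0}. \end{equation*}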

We will sometimes refer to $t\mapsto\log\left(\mathbb{E}_{\mu}e^{itY}\right)$ as the cumulant generating function.

3.2. Moment estimates

Lemma 3.2 (Estimates of the moments of $X_{n}$ ).

The following estimates on the $k$ -th moment of $X_{n}$ are valid for any $k\geqslant 2$ and $n\geqslant 1$ :

\begin{equation*} \mathbb{E}_{\gamma}|X_{n}|^{k} \leqslant k! \cdot \bar{r}^{k}. \end{equation*}

Here

\begin{equation*} \bar{r} = \sqrt{\frac{3}{2\log 2}} \approx 1.471. \tag{15} \end{equation*}

Proof.

We will prove a stronger inequality, namely

\begin{equation*} \mathbb{E}_{\gamma}|X_{n}|^{k} \leqslant \bar{r}^{2} \cdot k!. \end{equation*}

In the formulation of the lemma, however, we decided to keep the (severe) exponential overestimation so that our result fits the framework of theorem 3.11.

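\begin{equation*} \begin{aligned} \mathbb{E}_{\gamma}|X_{n}|^{k} &= \mathbb{E}_{\gamma}|\log a_{n}|^{k} \overset{(\star)}{=} \mathbb{E}_{\gamma}|\log a_{1}|^{k} = \int_{0}^{1} \frac{|\log\lfloor x^{-1}\rfloor|^{k}}{(1+x)\log 2}\,dx \overset{(\star\star)}{\leqslant} \int_{0}^{1} \frac{|\log(x^{-1}-1)|^{k}}{(1+x)\log 2}\,dx \\ &= \int_{0}^{\infty} \frac{|\log^{k} y|\,dy}{(1+y)(2+y)\log 2} = \int_{-\infty}^{\infty} \frac{|z|^{k}e^{z}\,dz}{(1+e^{z})(2+e^{z})\log 2} \\ &= \left(\int_{-\infty}^{0} + \int_{0}^{\infty}\right) \frac{|z|^{k}e^{z}\,dz}{(1+e^{z})(2+e^{z})\log 2}. \end{aligned} \end{equation*}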

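Equality $(\star)$ is a consequence of the $G$-invariance of $\gamma$, while in $(\star\star)$ we used the fact that $\lfloor x^{-1}\rfloor\geqslant x^{-1}-1$ and that on the interval $(0,1)$ the function $|\log^{k}(\cdot)|$ is decreasing. In the equalities following $(\star\star)$ we simply substituted $y=x^{-1}-1$ and $y=e^{z}$, respectively. After splitting the integral into the sum of two integrals we change the variables once again: on $[0,\infty)$ we set $z=u$ and on $(-\infty,0)$ we set $z=-u$. As a result we obtain

\begin{equation*} \begin{aligned} \left(\int_{-\infty}^{0} + \int_{0}^{\infty}\right) \frac{|z|^{k}e^{z}\,dz}{(1+e^{z})(2+e^{z})\log 2} &= \frac{1}{\log 2}\left(\int_{0}^{\infty} \frac{u^{k}e^{u}\,du}{(1+e^{u})(2+e^{u})} + \int_{\infty}^{0} \frac{u^{k}e^{-u}(-du)}{(1+e^{-u})(2+e^{-u})}\right) \\ &= \frac{1}{\log 2}\int_{0}^{\infty} u^{k}e^{u}\left[\frac{1}{(1+e^{u})(2+e^{u})} + \frac{1}{(e^{u}+1)(2e^{u}+1)}\right]du \\ &= \frac{1}{\log 2}\int_{0}^{\infty} \frac{3u^{k}e^{u}\,du}{(2+e^{u})(1+2e^{u})} \leqslant \frac{3}{2\log 2}\int_{0}^{\infty} u^{k}e^{-u}\,du = \frac{3}{2\log 2} \cdot k!. \end{aligned} \end{equation*}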

The last equality stems from the definition of the Euler gamma function and the fact that for integer arguments we have $\Gamma(k+1)=k!$ . ∎
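The stronger inequality $\mathbb{E}_{\gamma}|X_{n}|^{k}\leqslant\frac{3}{2\log 2}\cdot k!$ can be sanity-checked numerically for small $k$. The Python sketch below writes the moment as a series over the values of $a_{1}$, using $\gamma(a_{1}=j)=\log_{2}\left(1+\frac{1}{j(j+2)}\right)$, and compares a partial sum with the bound. The function names and truncation point are assumptions for illustration; since the partial sums underestimate the true moments, this is a plausibility check rather than a verification of the lemma.

```python
from math import factorial, log, log1p


def moment_partial_sum(k: int, terms: int = 200_000) -> float:
    """Partial sum of E_gamma |log a_1|^k = sum_j (log j)^k * gamma(a_1 = j)."""
    total = sum(log(j) ** k * log1p(1.0 / (j * (j + 2))) for j in range(2, terms + 1))
    return total / log(2)


def stronger_bound(k: int) -> float:
    """The bound (3 / (2 log 2)) * k! from the proof."""
    return 3 / (2 * log(2)) * factorial(k)


# (k, truncated moment, bound) for a few small orders
rows = [(k, moment_partial_sum(k), stronger_bound(k)) for k in (2, 3, 4)]
```

Comparing the two columns of `rows` gives a rough idea of how much slack the bound leaves for small $k$.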

3.3. Mixing properties of $(X_{n})$

There are a number of types of mixing for sequences of random variables (for a deeper insight see e.g. [6] and references therein or [4]), the main idea behind all of them being the following: the further away from each other two random variables are in the sequence (in terms of the indexing number $n$) the closer they are to being independent. We will be primarily interested in the notion of $\varphi$-mixing. However, the stronger property of $\psi$-mixing will also prove to be a useful tool.

Definition 3.3 ( $\varphi$ -mixing sequence of r.vs, $\varphi$ -mixing function and $\varphi$ -mixing coefficients).

Let $(Y_{\nu})_{\nu=1}^{\infty}$ be a sequence of random variables on a probability space $(\mathbb{Y},\mathcal{Y},\mu)$ . For indices $a\leqslant b\in\mathbb{N}_{*}\cup\{\infty\}$ denote by $\sigma_{a}^{b}$ the $\sigma$ -algebra generated by random variables $Y_{\nu}$ with $a\leqslant\nu\leqslant b$ . We define the $\varphi$ -mixing function of the sequence $(Y_{\nu})$ to be $\varphi:\mathbb{N}^{2}\mapsto[0,1]$ given by

\begin{equation*} \varphi(s,t) = \sup \left|\mu(B \mid A) - \mu(B)\right| \end{equation*}

where the supremum is taken over $A\in\sigma_{1}^{s},B\in\sigma_{t}^{\infty}$ for which $\mu(A)>0$ .

We define the $\varphi$ -mixing coefficients of the sequence $(Y_{\nu})$ to be

\begin{equation*} \varphi_{n} = \sup_{k\in\mathbb{N}} \varphi(k, k+n). \end{equation*}

We say that the sequence $(Y_{\nu})$ is $\varphi$ -mixing (w.r.t. $\mu$ ) if $\varphi_{n}\to 0$ as $n\to\infty$ .

The property of $\psi$-mixing is defined analogously; we alter only the mixing function in the definition:

Definition 3.4 ( $\psi$ -mixing sequence of r.vs, $\psi$ -mixing function and $\psi$ -mixing coefficients).

With the notations of definition 3.3 we define the $\psi$ -mixing function of the sequence $(Y_{\nu})$ to be $\psi:\mathbb{N}^{2}\mapsto[0,\infty]$ given by

\begin{equation*} \psi(s,t) = \sup \frac{\mu(A \cap B)}{\mu(A)\mu(B)} - 1 \end{equation*}

where the supremum is taken over $A\in\sigma_{1}^{s},B\in\sigma_{t}^{\infty}$ for which $\mu(A)\mu(B)>0$ .

The $\psi$-mixing coefficients $\psi_{n}$ are defined analogously to $\varphi_{n}$ in definition 3.3 and the sequence $(Y_{\nu})$ is called $\psi$-mixing if they tend to $0$ as $n\to\infty$.

The $\psi$ -mixing property entails $\varphi$ -mixing and additionally $\varphi_{n}\leqslant\psi_{n}/2$ ([4]). It turns out that the sequence $(a_{n})$ enjoys the $\psi$ -mixing property and the mixing coefficients decay at least exponentially fast:

Lemma 3.5 (Quantitative estimates on the mixing coefficients of $(a_{n})$ , [8, Proposition 2.3.7]).

The coefficients $\psi_{n}$ of the sequence $(a_{n})$ are bounded from above by $\psi_{1}=2\log 2-1\approx 0.386$ , $\psi_{2}={\pi^{2}\log 2\over 6}-1\approx 0.140$ and

\begin{equation*} \psi_{n} \leqslant \psi_{2}\lambda_{0}^{n-2} \end{equation*}

for all $n\geqslant 2$ , where $\lambda_{0}$ is the Gauss-Kuzmin-Wirsing constant whose approximate value is $\lambda_{0}\approx 0.304$ .

Lemma 3.5 also holds true for the sequence $(X_{n})_{n=1}^{\infty}$ (with exactly the same mixing coefficients). This is because the $\psi$-mixing property depends only on the $\sigma$-algebras generated by the initial and tail parts of the sequence in question, and these do not change upon composing the sequence with a bijective, measurable function (recall that $X_{n}=\log a_{n}$). This exponential decay will be useful for us in the technical results of subsection 3.5.

3.4. Markov chain association

Definition 3.6 (Sequence of r.vs. associated to a Markov chain).

We say that a sequence of random variables $(Y_{n})_{n=1}^{\infty}$ on a probability space $(\mathbb{Y},\mathcal{Y},\mu)$ is associated to a Markov chain through a sequence of functions $(f_{n})_{n=1}^{\infty}$, $f_{n}:\mathbb{R}\mapsto\mathbb{R}$, if

\begin{equation*} Y_{n} = f_{n}(\xi_{n}) \end{equation*}

for a Markov chain $(\xi_{n})_{n=1}^{\infty}$ .

Lemma 3.7.

The sequence $(X_{n})_{n=1}^{\infty}$ is associated to the Markov chain

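\begin{equation*} s_{n} = [a_{n}, a_{n-1}, \dots, a_{1}] = \cfrac{1}{a_{n} + \cfrac{1}{a_{n-1} + \cfrac{1}{\ddots + \cfrac{1}{a_{1}}}}}. \end{equation*}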

through the sequence of functions $(f_{n})_{n=1}^{\infty}$ given by

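\begin{equation*} f_{1}(\xi) = \ldots = f_{n}(\xi) = \ldots = \log\lfloor \xi^{-1} \rfloor. \end{equation*}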

Proof.

Equality $X_{n}=\log\left\lfloor\left(s_{n}\right)^{-1}\right\rfloor$ is a direct consequence of $a_{n}<s_{n}^{-1}<a_{n}+1$ , which stems from the definition of $s_{n}$ . We thus have to prove that $(s_{n})_{n=1}^{\infty}$ is indeed a Markov chain. The definition of a Markov chain requires a choice of probability (in our case a natural one would be to choose $\gamma$ ). However, $(s_{n})$ is a Markov chain for any probability (for which the definition of a Markov chain makes sense). Observe that once the chain is at some state $\bar{s}\in\mathbb{Q}$ we can uniquely determine all its past states through the shift $G$ . This way any conditional probability under the condition of all past states being fixed is actually the conditional probability under the condition of just the previous state being fixed, provided that these probabilities are nonzero, which is the case for $\gamma$ . ∎

3.5. The $\Lambda_{n}(f,u)$ quantity

We now have almost all the necessary tools to proceed with the estimation of the cumulants. We need, however, to define and study one more quantity, $\Lambda_{n}(f,u)$. Its nature is purely technical, but it will become crucial for us in the formulation of theorem 3.11.

Definition 3.8 (The $\Lambda_{n}(f,u)$ quantity).

Let $f:\mathbb{N}^{2}\mapsto\mathbb{R}$ be a function and let $u\neq 0$ . We define $\Lambda_{n}(f,u)$ to be

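\begin{equation*} \Lambda_{n}(f,u) = \max\left\{1, \max_{1\leqslant s\leqslant n} \sum_{t=s}^{n} f(s,t)^{1/u}\right\}. \end{equation*}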

We will be primarily interested in $\Lambda_{n}(\varphi,2)$ , where $\varphi$ is the $\varphi$ -mixing function of the sequence $(X_{n})$ . Again, upper bounds on this quantity will turn out to be essential for us.

Lemma 3.9 (Estimates on $\Lambda_{n}(\varphi,2)$ for the sequence $(X_{n})$ ).

If $\varphi$ is the $\varphi$ -mixing function of the sequence $(X_{n})$ then the following inequality holds:

\begin{equation*} \Lambda_{n}(\varphi,2) \leqslant \bar{\Lambda}, \end{equation*}

where $\bar{\Lambda}$ is a universal constant given by

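\begin{equation*} \bar{\Lambda} = 1 + \left(\log 2 - \frac{1}{2}\right)^{1/2} + \left(\frac{\pi^{2}\log 2}{12} - \frac{1}{2}\right)^{1/2} \cdot \frac{1}{1-\lambda_{0}^{1/2}} \approx 2.029 \tag{28} \end{equation*}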

with $\lambda_{0}$ being the Gauss-Kuzmin-Wirsing constant.
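The numerical value of $\bar{\Lambda}$ can be reproduced from the constants of lemma 3.5. The Python sketch below assumes the formula $\bar{\Lambda}=1+(\log 2-\frac{1}{2})^{1/2}+(\frac{\pi^{2}\log 2}{12}-\frac{1}{2})^{1/2}\cdot\frac{1}{1-\lambda_{0}^{1/2}}$ and uses $\varphi_{n}\leqslant\psi_{n}/2$ to interpret its terms: $\log 2-\frac{1}{2}=\psi_{1}/2$ and $\frac{\pi^{2}\log 2}{12}-\frac{1}{2}=\psi_{2}/2$, with the last factor coming from a geometric series in $\lambda_{0}^{1/2}$. The variable names and the truncated decimal value of $\lambda_{0}$ are assumptions for illustration.

```python
from math import log, pi, sqrt

LAMBDA_0 = 0.3036630029  # Gauss-Kuzmin-Wirsing constant, truncated decimal value

psi_1 = 2 * log(2) - 1           # approx. 0.386
psi_2 = pi**2 * log(2) / 6 - 1   # approx. 0.140


def psi_bound(n: int) -> float:
    """Upper bound on the n-th psi-mixing coefficient for n >= 1."""
    if n == 1:
        return psi_1
    return psi_2 * LAMBDA_0 ** (n - 2)


# Closed form: 1 + sqrt(psi_1 / 2) + sqrt(psi_2 / 2) * sum_{m >= 0} LAMBDA_0^(m/2)
lambda_bar = 1 + sqrt(psi_1 / 2) + sqrt(psi_2 / 2) / (1 - sqrt(LAMBDA_0))

# Truncated version of the same sum, term by term, for comparison
partial = 1 + sum(sqrt(psi_bound(n) / 2) for n in range(1, 60))
```

The agreement between `partial` and `lambda_bar` reflects that the closed form is just the summed geometric series of the square roots of the bounds $\psi_{n}/2$.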

To a reader acquainted with various types of mixing and the metrical theory of continued fractions it may appear that we use the $\varphi$-mixing property of $(a_{n})$ (and $(X_{n})$) unnecessarily, as these sequences enjoy the stronger property of $\psi$-mixing and might therefore be suitable for large deviation theorems which produce better estimates. This is not the case, however, as these theorems employ the aforementioned quantity $\Lambda_{n}(\cdot,\cdot)$, which in turn depends on $f(s,s)$, and this may turn out to be infinite in the case $f=\psi$. Before we proceed with the proof of lemma 3.9 we clarify this subtlety in the following example.

Example 3.10.

Suppose that $(Y_{n})$ is a sequence of r.vs. such that the $\sigma$-algebra generated by $Y_{s}$ admits sets of arbitrarily small measure for some $s$. Let $(C_{n})$ be a sequence of sets in this $\sigma$-algebra whose measures $\mu(C_{n})$ decrease to $0$. This $\sigma$-algebra is contained in both $\sigma_{1}^{s}$ and $\sigma_{s}^{\infty}$. We therefore have

\begin{equation*} \psi(s,s) \geqslant \sup_{n} \frac{\mu(C_{n} \cap C_{n})}{\mu(C_{n})\mu(C_{n})} - 1 = \sup_{n} \frac{1}{\mu(C_{n})} - 1 = \infty \end{equation*}

The phenomenon described above does not appear if we use $\varphi$ -mixing instead. Note that in definitions 3.3 and 3.4 we used $[0,1]$ and $[0,\infty]$ as the codomains for $\varphi$ and $\psi$ , respectively. This is because $1$ is a natural upper bound for $\varphi(\cdot,\cdot)$ since we can estimate $|\mu(B|A)-\mu(B)|\leqslant\max\left\{\mu(B|A),\mu(B)\right\}\leqslant 1$ .

Proof of lemma 3.9.

We will estimate the sum $\sum_{t=s}^{n}\varphi(s,t)^{1/2}$ using the $\psi$ -mixing coefficients and lemma 3.5. We will, however, take into account what has been said in example 3.10 and majorize all the terms in the sum except the first one, for which we use $\varphi(s,s)\leqslant\varphi_{0}\leqslant 1$ .

We employ the bounds of lemma 3.5 for the remaining terms:

[TABLE]

Both the sum in curly brackets in (28) and the number $1$ are bounded from above by $\bar{\Lambda}$ , which concludes the proof. ∎

3.6. Estimating the cumulants of the centered sum

We first state the abstract theorem that will allow us to pass from estimates on the moments of $X_{n}$ to estimates on the cumulants of $S_{n}$ .

Theorem 3.11 (Moment estimates imply cumulant estimates for the sum, [17, Theorem 4.21]).

Let $(Y_{n})_{n=1}^{\infty}$ be a sequence of random variables defined on a probability space $(\mathbb{Y},\mathcal{Y},\mu)$ and denote

[TABLE]

Assume that the sequence $(Y_{n})$ is associated to some Markov chain and that it is $\varphi$ -mixing. Assume also that it satisfies the following moment estimate:

[TABLE]

for some constants $\gamma_{1}\geqslant 0$ and $H_{1}>0$ and all integers $k\geqslant 2$ and $n\geqslant 1$ . Then for each $k\geqslant 2,n\geqslant 1$ and $\delta>0$ the following cumulant estimate is valid for $W_{n}$ :

[TABLE]

where $\Gamma_{k}$ are taken with respect to $\mu$ .

Let us now apply theorem 3.11 with $Y_{n}=X_{n}$ . Its assumptions are verified with $\gamma_{1}=0$ and $H_{1}=\bar{r}$ (lemma 3.2). Choosing $\delta=1$ and applying the estimates on $\Lambda_{n}(\varphi,2)$ (lemma 3.9) we arrive at the following

Theorem 3.12 (Cumulant estimates for $S_{n}$ ).

For any $k\geqslant 2$ and $n\geqslant 1$ the $k$ -th cumulant of $S_{n}$ is bounded by

[TABLE]

Theorem 3.12 also holds if we replace $S_{n}$ with its centering $S_{n}-n\kappa$ , since shifting a random variable by a constant does not affect its cumulants of order $k\geqslant 2$ . We will use this simple observation in what follows.
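The shift invariance used here can be checked numerically. The following is a minimal sketch, not part of the original argument: the helper `sample_cumulants` and the sample `data` are hypothetical, and the cumulants are those of the empirical distribution, computed up to order four through the standard moment-to-cumulant relations.

```python
def sample_cumulants(xs):
    """Return the first four cumulants (k1, k2, k3, k4) of the empirical distribution of xs."""
    n = len(xs)
    mean = sum(xs) / n
    # Central moments of order 2, 3 and 4.
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    # k1 = mean, k2 = m2, k3 = m3, k4 = m4 - 3 * m2^2.
    return mean, m2, m3, m4 - 3.0 * m2 ** 2


data = [0.0, 0.7, 1.1, 1.9, 2.7, 5.7]  # hypothetical sample, for illustration only
shift = 10.0                            # plays the role of the constant n * kappa

original = sample_cumulants(data)
shifted = sample_cumulants([x - shift for x in data])
```

Only the first cumulant (the mean) moves by `shift`; the cumulants of order $k\geqslant 2$ agree up to rounding error.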

3.7. Estimating the tails of the centered sum

We now turn to estimating the tails of $S_{n}-n\kappa$ . Once again we begin by stating the abstract large deviations theorem.

Theorem 3.13 (Cumulant estimates imply tail estimates, [17, Lemma 2.4], [2]).

Let $W$ be a centered random variable (i.e. $\mathbb{E}_{\mu}W=0$ ) defined on a probability space $(\mathbb{Y},\mathcal{Y},\mu)$ . Assume there exist constants $\gamma_{2}\geqslant 0,H>0$ and $\bar{\Delta}>0$ such that for all integers $k\geqslant 2$ we have

[TABLE]

Then for all $x\geqslant 0$ the following inequality is valid:

[TABLE]

Here $\Gamma_{k}$ denotes the cumulant taken w.r.t. $\mu$ , while the notation $\pm W$ indicates that the inequality holds both for $W$ and $-W$ .

We may now plug the results of theorem 3.12 for $W=S_{n}-n\kappa$ into theorem 3.13. Its assumptions are verified for measure $\mu=\gamma$ and constants $\gamma_{2}=1,\bar{\Delta}=\left(16\bar{r}\bar{\Lambda}\right)^{-1}$ and $H=128\bar{r}^{2}\bar{\Lambda}\cdot n$ .

Theorem 3.14 (Tail estimates for $S_{n}-n\kappa$ ).

For any $n\geqslant 1$ and $x>0$ the following tail estimate holds for $S_{n}-n\kappa$ :

[TABLE]

We now combine the results of this section to obtain the desired estimates on the measure of $\mathtt{KL}$ -sets.

Proof of theorem 2.2.

Rewriting the estimates of theorem 3.14 with $x=nT$ we arrive at

[TABLE]

The final estimates (9) and (10) stem from (11), the subadditivity of $\gamma$ and the estimates for the sum of terms of the form $e^{-\alpha\sqrt{n}}$ over $n\geqslant N$ for $\alpha>0$ contained in lemma A.1 in appendix A. ∎

4. The case of incremented partial quotients

Theorem 2.2 is not limited to the sequence $(M_{n})$ ; it also applies to other sequences for which a counterpart of Khintchine’s theorem on the Khintchine constant holds. We demonstrate this for the sequence of products of incremented partial quotients $(M_{n}^{\prime})_{n=1}^{\infty}$ :

\begin{equation*} M_{n}^{\prime}:=(a_{1}+1)(a_{2}+1)\cdots(a_{n}+1). \end{equation*}

We choose $(M_{n}^{\prime})$ among other sequences for this purpose since it provides an upper bound for the sequence of denominators of convergents $(q_{n})$ , similarly to $(M_{n})$ , which provides a lower bound. This proves useful in the small divisors estimates that we perform in [9].

We begin by introducing the notations that are a counterpart of (3):

[TABLE]

for $n\geqslant 1$ . By Birkhoff’s pointwise ergodic theorem the sequence ${1\over n}S_{n}^{\prime}(\omega)$ tends to a constant almost everywhere just like in the theorem on Khintchine’s constant. This time, however, the test function is $X_{1}^{\prime}$ , therefore ${1\over n}S_{n}^{\prime}\to\kappa^{\prime}$ with

[TABLE]

The Khintchine-Lévy sets are thus defined as

[TABLE]

for $T>0$ and $N\in\mathbb{N}$ . The sets $\mathtt{KL}^{\prime-}(T,N),\mathtt{KL}^{\prime}(T,N)$ and $\mathtt{KL}_{n}^{\prime\pm}(T)$ are defined analogously to definition 2.1. Theorem 2.2 for $\mathtt{KL}^{\prime}$ -sets reads

Theorem 4.1 (Estimates on the measure of $\mathtt{KL}^{\prime}$ -sets).

Let $N$ be a natural number and let $T$ be a positive real number. Denote

[TABLE]

where $\eta$ is the Dirichlet $\eta$ function: $\eta(s):=\sum_{n=1}^{\infty}(-1)^{n-1}n^{-s}$ . Define $\Xi^{\prime}(T)$ as in (8), but with $\bar{r}^{\prime}$ in place of $\bar{r}$ . With the notations of theorem 2.2 the estimates on the measures of $\mathtt{KL}^{\prime\pm}(T,N)$ are the same as in (9) and (10), but with $\Xi^{\prime}(T)$ in place of $\Xi(T)$ .

Proof.

For the theorem to be proven one needs lemma 3.2 to hold for the sequence $(X_{n}^{\prime})$ along with the equality of averages $\mathbb{E}_{\gamma}X_{n}^{\prime}=\kappa^{\prime}$ for all $n\in\mathbb{N}$ . The claim on averages follows from the $G$ -invariance of $\gamma$ , as was the case with the sequence $(X_{n})$ : we have $X_{j}^{\prime}=X_{1}^{\prime}\circ G^{j-1}$ for all $j$ . The cumulant estimates of theorem 3.12 depend only on the constants in the moment estimates and the value of $\bar{\Lambda}$ . The latter stems in turn from the mixing coefficients of the sequence in question, which do not change when we switch from $(X_{n})$ to $(X_{n}^{\prime})$ . The Markov chain association assumption also holds for $(X_{n}^{\prime})$ , only for a different sequence of functions: $\xi\mapsto\log\lfloor\xi^{-1}\rfloor$ changes to $\xi\mapsto\log(1+\lfloor\xi^{-1}\rfloor)$ in (25). With that the whole proof forms a food chain that feeds on the moment estimates, which read

[TABLE]

The changes of variables used along the way are $x^{-1}=y$ and $y=e^{z}$ . We also employed the standard formulas for the Dirichlet $\eta$ function:

[TABLE]

and the fact that it is decreasing with $k$ so that $\eta(k+1)\leqslant\eta(3)$ for $k\geqslant 2.$ ∎

5. Properties of Khintchine-Lévy numbers

In this section we briefly compare Khintchine-Lévy numbers with Diophantine numbers. We begin by recalling the definition of the latter along with a few well-known properties.

Definition 5.1 (Diophantine number).

Let $\tau\geqslant 1$ and $C>0$ . We say that a real number $\omega$ is $(C,\tau)$ -Diophantine if the inequality

\begin{equation*} |q\omega-p|\geqslant{C\over|q|^{\tau}} \end{equation*}

holds for all integers $p$ and $q$ with $q\neq 0$ . A number is called Diophantine if it is $(C,\tau)$ -Diophantine for some $C>0$ and $\tau\geqslant 1$ .

We also have the following characterization of Diophanticity in terms of the continued fraction expansion:

Lemma 5.2 (Diophanticity in terms of the continued fraction expansion).

If an irrational number $\omega$ is $(C,\tau)$ -Diophantine with $C>0$ and $\tau\geqslant 1$ then its partial quotients can be estimated by

\begin{equation*} a_{n+1}\leqslant C^{-1}q_{n}^{\tau-1}. \tag{47} \end{equation*}

Conversely, an estimate as in (47) for all $n\geqslant 0$ results in $\omega$ being $(C/(1+2C),\tau)$ -Diophantine.

Proof.

If a number is $(C,\tau)$ -Diophantine we have ${1\over q_{n+1}}>|q_{n}\omega-p_{n}|\geqslant{C\over q_{n}^{\tau}}$ which gives ${q_{n+1}\over q_{n}}<C^{-1}q_{n}^{\tau-1}$ and this gives (47) since ${q_{n+1}\over q_{n}}>a_{n+1}$ .

For the reverse implication fix $n$ and suppose we have $q_{n}\leqslant q<q_{n+1}$ . We have that $|q_{n}\omega-p_{n}|\leqslant|q\omega-p|$ for such $q$ and any $p$ and also ${D\over q^{\tau}}\leqslant{D\over q_{n}^{\tau}}$ for any $D>0$ . Therefore it suffices to show that ${D\over q_{n}^{\tau}}\leqslant|q_{n}\omega-p_{n}|$ with $D={C\over 1+2C}$ . Assuming (47) we have, however,

[TABLE]

The above reasoning works regardless of the choice of $n$ , therefore the proof is concluded. ∎

The denominators $(q_{n})$ satisfy a recurrence relation

\begin{equation*} q_{n+1}=a_{n+1}q_{n}+q_{n-1},\qquad q_{0}=1,\quad q_{-1}=0, \tag{49} \end{equation*}

which implies that

\begin{equation*} M_{n}\leqslant q_{n}\leqslant M_{n}^{\prime} \end{equation*}

for all $n\geqslant 0$ through simple induction.
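As an illustration, the recurrence for the denominators and the two-sided bound $M_{n}\leqslant q_{n}\leqslant M_{n}^{\prime}$ (the lower and upper bounds attributed to $(M_{n})$ and $(M_{n}^{\prime})$ in section 4) can be checked on a few partial quotients. This is a minimal sketch: the function names are hypothetical, the standard initial values $q_{-1}=0$ , $q_{0}=1$ are assumed, and the sample quotients are the first four of $\pi-3$ .

```python
def convergent_denominators(partial_quotients):
    """Return [q_0, q_1, ..., q_n] using q_{k+1} = a_{k+1} * q_k + q_{k-1}."""
    q_prev, q_curr = 0, 1  # q_{-1} = 0, q_0 = 1
    qs = [q_curr]
    for a in partial_quotients:
        q_prev, q_curr = q_curr, a * q_curr + q_prev
        qs.append(q_curr)
    return qs


def running_products(partial_quotients, increment=0):
    """Return [P_0, ..., P_n] with P_k = (a_1 + increment) * ... * (a_k + increment)."""
    out = [1]
    for a in partial_quotients:
        out.append(out[-1] * (a + increment))
    return out


quotients = [7, 15, 1, 292]                          # first partial quotients of pi - 3
q = convergent_denominators(quotients)               # [1, 7, 106, 113, 33102]
m = running_products(quotients)                      # M_n
m_prime = running_products(quotients, increment=1)   # M'_n
sandwich_holds = all(lo <= mid <= hi for lo, mid, hi in zip(m, q, m_prime))
```

The denominators 7, 106, 113, 33102 are those of the familiar convergents 22/7, 333/106, 355/113 and 103993/33102 of $\pi$ .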

We first note that when a number $\omega$ is $(C,\tau)$ -Diophantine with $\tau=1$ (otherwise known as a constant type number) then it is also Khintchine-Lévy.

Lemma 5.3.

A number $\omega$ that is $(C,1)$ -Diophantine with some $C>0$ satisfies $\omega\in\mathtt{KL}^{\prime+}(T,N)$ with $N=1$ and $T=\log(C^{-1}+1)-\kappa^{\prime}$ .

Proof.

By lemma 5.2 constant type numbers are precisely the ones with a bounded sequence of partial quotients: $a_{n}\leqslant C^{-1}$ , which implies $M_{n}^{\prime}<(C^{-1}+1)^{n}$ for all $n\geqslant 1$ . ∎

Note, however, that constant type numbers form a set of measure zero ([11]). On the other hand, the complement of the set of Diophantine numbers with fixed $\tau>1$ and $C>0$ is small whenever $C$ is small:

Lemma 5.4 (Measure of the set of Diophantine numbers).

The measure of the set of numbers $\omega\in[0,1]$ that are not $(C,\tau)$ -Diophantine can be estimated from above by $2C\zeta(\tau)$ if $\tau>1$ and $C>0$ . Here $\zeta$ denotes the Riemann $\zeta$ function.

Proof.

The excluded numbers are contained in the set

[TABLE]

Each of the intervals has length $l(q)=2Cq^{-(1+\tau)}$ , apart from the two boundary intervals $[0,C/q^{1+\tau})$ and $(1-C/q^{1+\tau},1]$ , which have half that length. For a fixed $q$ the total length therefore adds up to $ql(q)$ , and so $\lambda(\mathtt{Excl})\leqslant\sum_{q=1}^{\infty}ql(q)=2C\zeta(\tau)$ . ∎
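To get a feeling for the size of the bound $2C\zeta(\tau)$ one can evaluate it numerically. The sketch below is illustrative only: the function names and the sample values of $C$ and $\tau$ are hypothetical, and $\zeta(\tau)$ is replaced by an upper bound consisting of a partial sum plus the integral estimate of the tail.

```python
import math


def zeta_upper(tau, terms=10_000):
    """Upper bound for zeta(tau), tau > 1: partial sum plus integral bound on the tail."""
    partial = sum(q ** -tau for q in range(1, terms + 1))
    # sum_{q > terms} q^{-tau} <= integral from `terms` to infinity of x^{-tau} dx
    tail = terms ** (1.0 - tau) / (tau - 1.0)
    return partial + tail


def excluded_measure_bound(c, tau):
    """Upper bound 2 * C * zeta(tau) on the measure of non-(C, tau)-Diophantine numbers."""
    return 2.0 * c * zeta_upper(tau)


bound = excluded_measure_bound(c=0.001, tau=2.0)  # hypothetical parameters
```

For $\tau=2$ the exact value is $2C\zeta(2)=C\pi^{2}/3$ , which the sketch reproduces to within the tail error.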

When it comes to $\tau>1$ on the other hand it turns out that Khintchine-Lévy numbers are Diophantine, but not the other way round.

Lemma 5.5.

If $\omega\in\mathtt{KL}^{+}(T,N)$ for some $T>0$ and $N\in\mathbb{N}$ then $\omega$ is $(C,\tau)$ -Diophantine with $C$ small enough and $\tau=1+{\kappa+T\over\log\varphi}$ , where $\varphi=(1+\sqrt{5})/2$ . If $\omega\in\mathtt{KL}(T_{-},T_{+},N)$ for some $T_{-},T_{+}>0$ and $N\in\mathbb{N}$ , then it is $(C,\tau)$ -Diophantine with $C$ small enough and $\tau=1+{T_{+}+T_{-}\over\log\varphi}$ .

Proof.

From (49) we can infer that for any $\omega$ we have $q_{n}\geqslant F_{n}$ , where $F_{n}$ is the Fibonacci sequence with $F_{1}=F_{2}=1$ , and in consequence $q_{n}>\varphi^{n}/3$ . Assuming that $\omega\in\mathtt{KL}^{+}(T,N)$ we have, for all $n\geqslant N$ , that

[TABLE]

By lemma 5.2 we see that $\omega$ is $(C,\tau)$ -Diophantine with $\tau=1+{\kappa+T\over\log\varphi}$ and a suitably chosen $C$ (in choosing $C$ we account for the fact that (52) holds only for $n\geqslant N$ ).

The case of $\mathtt{KL}(T_{-},T_{+},N)$ is similar with the exception that the estimates begin with

[TABLE]

to end with $T_{+}+T_{-}$ instead of $\kappa+T$ and thus with $\tau=1+{T_{+}+T_{-}\over\log\varphi}$ . ∎

Note that in the first case in lemma 5.5 we can bring $\tau$ as close as we wish to $1+\kappa/\log\varphi\approx 3.053$ by setting $T$ small, while in the second case the critical $\tau$ is $1$ .
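The exponent of lemma 5.5 is easy to evaluate. The following minimal sketch computes $\tau=1+(\kappa+T)/\log\varphi$ ; the function name is hypothetical and the value $e^{\kappa}\approx 2.6854520010$ of the Khintchine constant is hard-coded as an assumption rather than computed.

```python
import math

KHINTCHINE_CONSTANT = 2.6854520010          # e^kappa, hard-coded (assumption)
KAPPA = math.log(KHINTCHINE_CONSTANT)
GOLDEN_RATIO = (1.0 + math.sqrt(5.0)) / 2.0


def diophantine_exponent(t):
    """tau = 1 + (kappa + T) / log(phi), as in the first case of lemma 5.5."""
    return 1.0 + (KAPPA + t) / math.log(GOLDEN_RATIO)


critical_tau = diophantine_exponent(0.0)    # limiting value as T -> 0
tau_for_t_one = diophantine_exponent(1.0)   # exponent for T = 1
```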

Using the ideas of the proof of lemma 5.5 we can infer that $\omega\in\mathtt{KL}^{\prime+}(T,N)\cup\mathtt{KL}^{+}(T,N)$ for some $T>0,N\in\mathbb{N}$ implies at most exponential growth of partial quotients. Therefore any sequence of partial quotients that has a superexponential subsequence gives rise to a non- $\mathtt{KL}$ number $\omega$ . Using this we can construct a non- $\mathtt{KL}$ number $\omega^{*}$ , which is Diophantine. In fact $\omega^{*}$ can even have a very sparse distribution of partial quotients.

Example 5.6 (A non- $\mathtt{KL}$ Diophantine number).

Fix $s>1$ and $\delta>0$ and set $d_{n}:=\lfloor(1+\delta)^{n}\rfloor$ . We define $\omega^{*}$ through its partial quotients:

[TABLE]

bearing in mind that the second case in (54) may produce two or more values for small enough $\delta$ and $j$ . For fixed $\delta$ , however, there are only finitely many $j$ ’s for which this happens and if this is the case we define $a_{j}=1$ . We will not be interested in the initial partial quotients.
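The shape of this construction can be sketched in code. The sketch rests on assumptions: definition (54) is read off from the proof below as $a_{j}=\lfloor e^{d_{n}^{s}}\rfloor$ when $j=d_{n}$ for some $n$ and $a_{j}=1$ otherwise, the logarithm $X_{j}=\log a_{j}$ is approximated by $d_{n}^{s}$ (ignoring the floor), collisions between the values $d_{n}$ are ignored, and the function name and parameter values are hypothetical.

```python
import math


def log_partial_quotients(s, delta, length):
    """Approximate X_j = log a_j for j = 1..length; index 0 is unused."""
    xs = [0.0] * (length + 1)      # a_j = 1, i.e. X_j = 0, away from the indices d_n
    n = 1
    while True:
        d = math.floor((1.0 + delta) ** n)
        if d > length:
            break
        xs[d] = float(d) ** s      # approximates log(floor(exp(d^s)))
        n += 1
    return xs


xs = log_partial_quotients(s=2.0, delta=1.0, length=64)  # here d_n = 2^n
# X_{d_n} / d_n = d_n^{s-1} is unbounded, which is the superexponential growth.
ratios = [xs[d] / d for d in (2, 4, 8, 16, 32, 64)]
```

Working with $X_{j}$ instead of $a_{j}$ avoids the astronomically large integers $\lfloor e^{d_{n}^{s}}\rfloor$ .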

For $s>1$ the number $\omega^{*}$ has a superexponential subsequence of partial quotients, therefore it cannot be in $\mathtt{KL}^{\prime+}(T,N)\cup\mathtt{KL}^{+}(T,N)$ for any $T>0$ and $N\in\mathbb{N}$ . We will show that $\omega^{*}$ is $(C,\tau)$ -Diophantine for any $\tau>A(1+\delta)^{s}$ and $C=C(\tau)$ small enough, where $A$ is any constant with $A>\alpha^{s}$ and $\alpha>1$ is a constant specified later in the proof (the constant $\alpha$ can be chosen as close to $1$ as we wish, at the expense of $C$ ). Note that this way we can make the exponent as close to the critical $\tau=1$ as we wish.

To do this we will verify that for all $j\in\mathbb{N}$ we have

[TABLE]

as this entails (47) and we will be able to use lemma 5.2. After a minor alteration (55) is equivalent to

[TABLE]

Fix $j>J$ for $J$ large enough, so that there is no ambiguity in (54) and let $n$ be such that $d_{n-1}\leqslant j<d_{n}$ . First observe that $S_{d_{n-1}}=S_{d_{n-1}+1}=\ldots=S_{d_{n}-1}$ since $X_{i}=0$ for $i=d_{n-1}+1,\ldots,d_{n}-1$ . Also fix $\alpha>1$ and note that $\lfloor x\rfloor\geqslant x/\alpha$ for $x$ large enough. Additionally set $k_{0}$ to be the smallest number for which $d_{k_{0}}\geqslant J$ . We have

[TABLE]

If we now prove that (56) holds after we substitute $S_{j}$ with the right-hand side of (57) then the whole proof is concluded. To do this we need to consider two cases: $J<j<d_{n}-1$ and $j=d_{n}-1$ . In the first case $X_{j+1}=0$ , so we need

[TABLE]

to hold for all $n$ and some $C>0$ . This is, however, the case: the sequence in the largest brackets diverges to $+\infty$ , so it must have a minimal value and we only need to set $C$ small enough to elevate the whole expression above $0$ since $\tau-1>0$ .

The second case gives $X_{j+1}=X_{d_{n}}=\log\lfloor e^{d_{n}^{s}}\rfloor\leqslant d_{n}^{s}\leqslant(1+\delta)^{ns}$ , we therefore similarly require

[TABLE]

Subtracting $(1+\delta)^{ns}$ from both sides gives a similar inequality to (58), but with a different coefficient at $(1+\delta)^{ns}$ , namely

[TABLE]

For $\tau>A(1+\delta)^{s}$ we have $E>0$ and by an analogous argument to the one in previous case we can make inequality (59) valid choosing a small enough $C$ .

6. Measure of KL-sets: a practical point of view

In this section we focus on the numerical values of estimates of theorems 2.2 and 4.1 for particular values of $T$ . We outline the motivation for this in [9], where we perform estimates in a small divisors problem under the assumption that the frequency $\omega$ belongs to one of the $\mathtt{KL}$ -sets. It turns out that the quality of these estimates is best when $T$ is as small as possible. There is, however, a price to pay if we want to set $T$ small, namely we have to set $N$ large to obtain reasonable estimates on the measure of $\mathtt{KL}$ -sets.

To better illustrate our reasoning we will focus on the set $\mathtt{KL}^{\prime+}(T,N)$ . At the end of the section we present a detailed exposition of numerical values of estimates from theorem 2.2 for selected values of $T$ and $N$ . For simplicity we will consider the case when $N=K^{2}$ is a square of an integer, so that the finite sum term in the estimates of theorem 2.2 vanishes.

Inequality (9) written for $\Xi^{\prime}(T)$ tells us that the quantities that will be essential for us are the numerator

[TABLE]

and the denominator

[TABLE]

appearing on its right-hand side; our goal will be to make ${\mathsf{num^{\prime}}\over\mathsf{den^{\prime}}}$ as close to $0$ as possible. The numerical value of $\kappa^{\prime}$ is $\kappa^{\prime}\approx 1.410$ (analogously to a well known formula for $\kappa$ we can express $\kappa^{\prime}$ as the sum of an infinite series, $\kappa^{\prime}=\sum_{r=1}^{\infty}\log_{2}(r+1)\log(1+(r(r+2))^{-1})$ ), which suggests that it is only reasonable to consider $T$ of the same order of magnitude (and also $T<\kappa^{\prime}$ when considering the set $\mathtt{KL}^{-}$ ; actually even $T<\kappa^{\prime}-\log 2\approx 0.716$ , since all $\omega$ satisfy $2^{n}\leqslant M_{n}^{\prime}(\omega)$ ). We will therefore consider $T$ to be a number satisfying $T\leqslant 2$ . The problem is that $\Xi^{\prime}$ evaluated even at a number as large as $T=2$ is very close to $1$ and the distance to $1$ gets even smaller as we decrease $T$ towards $0$ . This makes $\mathsf{den^{\prime}}$ small, which tells us that $\mathsf{num^{\prime}}$ needs to be even smaller. For instance

[TABLE]

and this gives $\mathsf{den^{\prime}}^{-1}\approx 4.161\cdot 10^{3}$ . The only thing we can do to overcome the effect of $\mathsf{den^{\prime}}^{-1}$ being big is manipulating the exponent $K$ , that appears in $\mathsf{num^{\prime}}$ . It turns out that, for instance, to have ${\mathsf{num^{\prime}}\over\mathsf{den^{\prime}}}<10^{-2}$ we need $N\geqslant 6.084\cdot 10^{9}$ . More general numerical values are provided in table 1 below.
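The value of $\kappa^{\prime}$ quoted above can be reproduced from its series representation $\kappa^{\prime}=\sum_{r\geqslant 1}\log_{2}(r+1)\log(1+(r(r+2))^{-1})$ . The sketch below is a plain truncated sum; the function name and the truncation point are hypothetical choices, and since the terms decay like $\log r/r^{2}$ the truncation error after $2\cdot 10^{5}$ terms is of order $10^{-4}$ .

```python
import math


def kappa_prime(terms=200_000):
    """Truncated series for kappa' = sum_{r>=1} log2(r + 1) * log(1 + 1 / (r * (r + 2)))."""
    total = 0.0
    for r in range(1, terms + 1):
        # log1p keeps precision when 1 / (r * (r + 2)) is tiny.
        total += math.log2(r + 1) * math.log1p(1.0 / (r * (r + 2)))
    return total


kp = kappa_prime()
upper_limit_for_t = kp - math.log(2.0)  # the bound on T mentioned for the minus-type set
```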

Define $\mathsf{est}^{\prime}=1-{\mathsf{num^{\prime}}\over\mathsf{den^{\prime}}}$ . The cells in table 1 contain the approximations of minimal values of $N$ which guarantee that the estimate $\mathsf{est}^{\prime}$ is better than the value given in the leftmost column with the value of $T$ given in the top row. For instance the bottom-right cell tells us that in order to have the estimate $\mathsf{est}^{\prime}$ better than $99.9\%$ with $T=0.1$ one needs to have $N\geqslant 2.394\cdot 10^{15}$ .

In other words the values appearing in table 1 tell us that if we want to have a guarantee that e.g. $99.9\%$ of numbers $\omega$ satisfy the inequality

[TABLE]

for all $n$ “large enough” then “large enough” means “greater than $2.394\cdot 10^{15}$ ”. Observe, however, that for a given value of $T$ the entries of the table are of the same order of magnitude. This means that in order to reach a sharp measure estimate one does not pay a significantly greater price than that of crossing the threshold given by the value in the “ $\mathsf{est}^{\prime}>1\%$ ” line, the “currency” here being the amount of initial numbers that need to be excluded from our considerations.

The values for the $\mathtt{KL}^{+}$ -sets are provided in table 2; $\mathsf{est}$ is defined analogously to $\mathsf{est^{\prime}}$ .

7. Concluding remarks

Since theorem 2.2 holds for both $(M_{n})$ and $(M_{n}^{\prime})$ it is natural to ask whether it also does for $(q_{n})$ . The sequence of denominators of convergents also enjoys exponential growth almost everywhere, with rate $\ell={\pi^{2}\over 12\log 2}$ ([12]). We were, however, not able to reproduce the reasoning of section 3 due to a slightly different nature of this sequence, compared to either $(M_{n})$ or $(M_{n}^{\prime})$ . The first difference between $(M_{n})$ and $(q_{n})$ is in the averages: we have $\mathbb{E}_{\gamma}\log M_{n}=n\kappa$ , while $\mathbb{E}_{\gamma}\log q_{n}=n\ell+R_{n}$ with a remainder $R_{n}$ bounded in $n$ . More importantly, however, the success of the reasoning in section 3 relies on the fact that $\log M_{n}$ can be expressed as the sum of $n$ summands $X_{j}=\log a_{j}$ , which satisfy both the mixing assumption and the Markov chain association assumption. For $\log q_{n}$ one can use the sequence $\left(\log\left(s_{j}^{-1}\right)\right)$ as a counterpart of $(X_{j})$ , but this sequence does not have the mixing property and this way the whole food chain of lemmas we used in section 3 breaks apart. To see this we need to take a closer look at the structure of the “past” and the “future” $\sigma$ -algebras of the sequence $(s_{n})$ (note that they are the same as the corresponding $\sigma$ -algebras for the sequence $\left(\log\left(s_{j}^{-1}\right)\right)$ ). The latter is given by $\sigma_{t}^{\infty}=\sigma(s_{t},s_{t+1},\ldots)$ for $t>0$ (by $\sigma(\ldots)$ with no indices we mean the $\sigma$ -algebra generated by the random variables or sets in brackets) and is thus generated by the preimages of singletons of rationals $s_{j}^{-1}(\{r\})$ with $j=t,t+1,\ldots$ . Due to how $s_{j}$ is constructed from $a_{1},\ldots,a_{j}$ the sets $s_{j}^{-1}(\{r\})$ , however, are actually finite intersections of the preimages of singletons of positive integers through functions $a_{1},\ldots,a_{j}$ .
As a consequence $\sigma_{t}^{\infty}$ can actually be written as $\sigma(a_{1},a_{2},\ldots)$ for any $t>0$ , which in particular means that $\sigma_{t}^{\infty}$ contains all of the “past” $\sigma$ -algebras $\sigma_{1}^{s}=\sigma(s_{1},\ldots,s_{s})$ with $s<t$ as they are actually equal to $\sigma(a_{1},\ldots,a_{s})$ by a similar argument. This inclusion of $\sigma$ -algebras is what prevents the mixing coefficients of $(s_{j})$ from converging to $0$ just as was the case in example 3.10, since $\sigma_{1}^{s}$ admits sets of arbitrarily small measure.

We also made a choice of sticking to $\varphi$ -mixing instead of $\psi$ -mixing even though we only consider quantities which are $\psi$ -mixing if they exhibit any kind of mixing. This is because in the formula for $\Lambda(\cdot,\cdot)$ in definition 3.8 there is a dependence on $f(s,s)$ for a mixing function $f$ and an integer index $s$ . In example 3.10 we learnt, however, that $\psi(s,s)$ may be infinite, which would yield no control over $\Lambda(\cdot,\cdot)$ and in consequence no control over the measure of $\mathtt{KL}$ sets. This is not the case if we consider $\varphi$ -mixing.

The price we pay for this detail is, however, quite significant. The type of mixing we employ has an impact on the quality of the cumulant estimates in theorem 3.11. This result also has a $\psi$ -mixing counterpart ([17, Theorem 4.21, second inequality]) in which the cumulant estimates are better: instead of a $(k!)^{2+\gamma_{1}}$ factor in (33) there appears $(k!)^{1+\gamma_{1}}$ . This decrease of the exponent at $k!$ , plugged into the rest of the food chain of theorems, would have an impact on theorem 2.2.

Observe that in the proof of this theorem we sum terms of the form $e^{-\alpha\sqrt{n}}$ (with $e^{-\alpha}=\Xi(T)$ ) which gives a slowly converging series and large values in tables 1 and 2. Using $\psi$ -mixing would switch the summands to a geometric progression $e^{-\alpha n}$ whose series converges much faster and gives much better values in the counterpart of tables 1 and 2. The orders of magnitude (i.e. the exponents at $10$ ) in said tables would be reduced roughly by half. We will delve into this matter in further research of the subject as the problem seems to stem directly from the fact that we are using very general large deviation theorems which do not take into account the specifics of the very well studied sequence $(a_{n})$ .
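The halving of the orders of magnitude can be illustrated with a toy computation. This is only a sketch under stated assumptions: the values of `alpha` and `eps` are hypothetical (the paper's $\alpha=-\log\Xi(T)$ is not used), the tail of $\sum e^{-\alpha\sqrt{n}}$ over $n\geqslant K^{2}$ is bounded by grouping the indices $n\in[k^{2},(k+1)^{2})$ , which need not coincide with the exact form of lemma A.1, and all function names are invented for the illustration.

```python
import math


def sqrt_tail_bound(alpha, k):
    """Bound for sum_{n >= k^2} exp(-alpha * sqrt(n)), via sum_{m >= k} (2m + 1) x^m."""
    x = math.exp(-alpha)
    return x ** k * ((2 * k + 1) / (1 - x) + 2 * x / (1 - x) ** 2)


def geometric_tail(alpha, n):
    """Exact value of sum_{m >= n} exp(-alpha * m)."""
    x = math.exp(-alpha)
    return x ** n / (1 - x)


def first_index_below(bound, eps):
    """Smallest positive integer m with bound(m) < eps."""
    m = 1
    while bound(m) >= eps:
        m += 1
    return m


alpha, eps = 0.01, 0.01  # hypothetical values, for illustration only
k_min = first_index_below(lambda k: sqrt_tail_bound(alpha, k), eps)
n_sqrt = k_min ** 2      # threshold N = K^2 in the sqrt(n) regime
n_geo = first_index_below(lambda n: geometric_tail(alpha, n), eps)
```

With these values the threshold is of order $10^{6}$ in the $e^{-\alpha\sqrt{n}}$ regime and of order $10^{3}$ in the geometric one, in line with the exponent at $10$ roughly halving.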

Some evidence that the numbers obtained in tables 1 and 2 are far from optimal comes also from the analysis of the continued fraction of $\pi-3$ . A (non-rigorous) analysis of its $\mathtt{range}:=4.38\cdot 10^{8}$ initial partial quotients provided in [3] gives the maximal value of $W_{n}(\pi):=\left|{1\over n}S_{n}(\pi-3)-\kappa\right|$ equal to $W_{4}(\pi)\approx 1.598$ (stemming from the unusually large $a_{4}(\pi)=292$ ). We also have $W_{n}(\pi)<0.1$ for $15\leqslant n\leqslant\mathtt{range}$ , which is inconsistent with table 2 by several orders of magnitude if we assume that $\pi-3$ has a somewhat “generic” continued fraction expansion. For larger $n$ the difference is even more striking: considering $10000\leqslant n\leqslant\mathtt{range}$ gives an oscillation of the order of magnitude of $W_{n}$ between $10^{-2}$ and $10^{-4}$ . Using the data in [3] and the estimates of the current paper one can only derive a rather ineffective result on $\pi-3$ in the spirit of the ones provided by tables 1 and 2.
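The first few deviations can be recomputed directly. The sketch below takes $W_{n}$ to be $\left|{1\over n}S_{n}-\kappa\right|$ , the normalization consistent with the bound used in the proof of corollary 7.1, and uses the first twenty partial quotients of $\pi-3$ together with a hard-coded value of the Khintchine constant; the names are hypothetical and nothing here replaces the analysis of [3].

```python
import math

KAPPA = math.log(2.6854520010)  # log of the Khintchine constant, hard-coded (assumption)

# First twenty partial quotients of pi - 3 = [0; 7, 15, 1, 292, ...].
PI_QUOTIENTS = [7, 15, 1, 292, 1, 1, 1, 2, 1, 3, 1, 14, 2, 1, 1, 2, 2, 2, 2, 1]


def deviations(quotients):
    """Return [W_1, ..., W_n] with W_k = |S_k / k - kappa| and S_k = sum of log a_j, j <= k."""
    out = []
    log_sum = 0.0
    for n, a in enumerate(quotients, start=1):
        log_sum += math.log(a)
        out.append(abs(log_sum / n - KAPPA))
    return out


w = deviations(PI_QUOTIENTS)
worst_n = max(range(len(w)), key=w.__getitem__) + 1  # index n of the largest W_n
```

On this short range the maximum is attained at $n=4$ with a value close to the one quoted from [3], and $W_{15}$ already falls below $0.1$ .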

Corollary 7.1.

The number $\pi-3$ satisfies the inequality

[TABLE]

for all $n\geqslant 1$ with probability $99.9\%$ (as in the rest of the paper, by probability we mean the Gauss measure $\gamma$ ).

Proof.

The data in [3] give the estimate $M_{n}(\pi-3)\leqslant e^{(\kappa+1.598)n}<14^{n}$ for $1\leqslant n\leqslant\mathtt{range}$ . For $n>\mathtt{range}$ we can use theorem 2.2 with $T=5.62$ which gives the base $e^{\kappa+5.62}<743$ . ∎
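The two numerical bases appearing in this proof can be checked with a few lines. This is a minimal sketch; the function name is hypothetical and the Khintchine constant is hard-coded as an assumption.

```python
import math

KAPPA = math.log(2.6854520010)  # log of the Khintchine constant, hard-coded (assumption)


def growth_base(t):
    """Base e^(kappa + T) of the exponential bound M_n <= e^((kappa + T) n)."""
    return math.exp(KAPPA + t)


base_from_data = growth_base(1.598)  # range covered by the data of [3]
base_from_tail = growth_base(5.62)   # range covered by theorem 2.2
```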

Acknowledgements

During a part of the research, which led to preparation of this article, the author was supported by the Foundation for Polish Science under the MPD Programme Geometry and Topology in Physical Models, co-financed by the EU European Regional Development Fund, Operational Program Innovative Economy 2007-2013.

Appendix A Auxiliary identities

Lemma A.1 (Sum of $e^{-\alpha\sqrt{n}}$ ).

For any $N\geqslant 2$ and $\alpha>0$ the following inequality holds:

[TABLE]

where $K=\left\lceil\sqrt{N}\right\rceil$ . In particular when $N$ is a square of an integer the estimate takes form

[TABLE]

Proof.

The proof relies on the following identity:

[TABLE]

We have

[TABLE]

∎
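For orientation, the grouping idea behind such an estimate can be written out in the square case $N=K^{2}$ . The lines below are a sketch under the assumption $x:=e^{-\alpha}\in(0,1)$ ; they use only the facts that each block $k^{2}\leqslant n<(k+1)^{2}$ contains $2k+1$ integers and that $\sqrt{n}\geqslant k$ on it, and they need not coincide term by term with the statement of the lemma.

```latex
\begin{align*}
\sum_{n\geqslant K^{2}}e^{-\alpha\sqrt{n}}
  &=\sum_{k\geqslant K}\ \sum_{n=k^{2}}^{(k+1)^{2}-1}e^{-\alpha\sqrt{n}}
   \leqslant\sum_{k\geqslant K}(2k+1)\,x^{k},\\
\sum_{k\geqslant K}k\,x^{k}
  &=x^{K}\left({K\over 1-x}+{x\over(1-x)^{2}}\right),\\
\sum_{k\geqslant K}(2k+1)\,x^{k}
  &=x^{K}\left({2K+1\over 1-x}+{2x\over(1-x)^{2}}\right).
\end{align*}
```

When $N$ is not a square, the indices $N\leqslant n<K^{2}$ with $K=\left\lceil\sqrt{N}\right\rceil$ contribute an additional finite sum, which is the term that vanishes in the square case considered in section 6.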

Bibliography

[1] V. I. Arnol’d. Small denominators. I. Mapping the circle onto itself. Izvestiya Akademii Nauk SSSR. Seriya Matematicheskaya, 25:21–86. English translation in Amer. Math. Soc. Transl. (2), 46:213–284, 1965.
[2] R. Bentkus and R. Rudzkis. On exponential estimates of the distributions of random variables. Lithuanian Mathematical Journal, 20:15–30, 1980.
[3] N. Bickford. Pi CF and the Continued Fraction of Pi. http://neilbickford.com/picf.htm, 2010. [Online, accessed 3-Oct-2018].
[4] R. C. Bradley. Basic Properties of Strong Mixing Conditions. A Survey and Some Open Questions. Probab. Surveys, 2:107–144, 2005.
[5] R. de la Llave, A. González, À. Jorba, and J. Villanueva. KAM theory without action-angle variables. Nonlinearity, 18:855–895, 2005.
[6] S. Hörmann. Berry-Esseen bounds for econometric time series. Alea, 6:377–397, 2009.
[7] I. A. Ibragimov. A theorem from the metric theory of continued fractions. Vestnik Leningrad. Univ., 16(1):13–24, 1961.
[8] M. Iosifescu and C. Kraaikamp. Metrical Theory of Continued Fractions. Springer Netherlands, 2002.
