Continued fractions, the Chen-Stein method and extreme value theory

Anish Ghosh; Maxim Kirsebom; Parthanil Roy

arXiv:1904.07582·math.PR·August 6, 2019


TL;DR

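This paper combines the Chen-Stein method, the exponential mixing of the Gauss map, and a second-order regular variation estimate to bound the rate of convergence in the Doeblin-Iosifescu Poisson asymptotics for exceedances of continued fraction digits. As a consequence, it improves the best known rate of convergence for the maxima of the digits from a slowly varying one to a polynomial one.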

Contribution

It introduces new bounds for convergence rates in extreme value theory for continued fractions, utilizing the Chen-Stein method and ergodic theory techniques.

Findings

01 Improved upper bounds for convergence rates in Doeblin-Iosifescu asymptotics.

02 Enhanced understanding of the extremal behavior of continued fraction digits.

03 Methodology applicable to order statistics and extremal point processes.

Abstract

In this work, we deal with extreme value theory in the context of continued fractions using techniques from probability theory, ergodic theory and real analysis. We give an upper bound for the rate of convergence in the Doeblin-Iosifescu asymptotics for the exceedances of digits obtained from the regular continued fraction expansion of a number chosen randomly from $(0, 1)$ according to the Gauss measure. As a consequence, we significantly improve the best known upper bound on the rate of convergence of the maxima in this case. We observe that the asymptotics of order statistics and the extremal point process can also be investigated using our methods.

Equations

$$T(x)=\{1/x\},$$

$$A_{1}(x)$$

$$A_{j+1}(x)$$

$$\int_{X}\widehat{T}(f)\,g\,d\lambda=\int_{X}f\,(g\circ T)\,d\lambda$$

$$P(dx)=\big((1+x)\log 2\big)^{-1}\,dx$$

$$P\big((A_{m_{1}},A_{m_{2}},\ldots,A_{m_{k}})\in B\big)=P\big((A_{m_{1}+l},A_{m_{2}+l},\ldots,A_{m_{k}+l})\in B\big).$$

$$\mathcal{E}^{u}_{n}:=\#\{1\leq i\leq n:\,A_{i}\log 2>nu\}\stackrel{d}{\longrightarrow}\mathcal{E}^{u}_{\ast}\sim\operatorname{Poi}(u^{-1})$$

$$P\big(\mathcal{E}^{u}_{\ast}=k\big)=\frac{u^{-k}e^{-u^{-1}}}{k!},\quad k=0,1,2,\ldots.$$

$$P\left(\frac{\log 2\,\max_{i=1}^{n}A_{i}}{n}\leq u\right)\to e^{-u^{-1}},$$

$$nP\left(\frac{A_{1}\log 2}{n}\in\cdot\right)\stackrel{v}{\longrightarrow}\nu$$

$$P\left(\frac{A_{1}\log 2}{n}>u\right)$$

$$nP\left(\frac{A_{1}\log 2}{n}>u\right)$$

$$P\left(\frac{A_{1}\log 2}{n}>u\right)=\frac{\log\left(1+\frac{1}{\lceil nu/\log 2\rceil}\right)}{\log 2}\leq\frac{\log\left(1+\frac{\log 2}{nu}\right)}{\log 2}\leq\frac{1}{nu}.$$

$$\big|P(F\cap H)-P(F)P(H)\big|\leq\psi(n)P(F)P(H)=C\theta^{-n}P(F)P(H),$$

$$P\left(\frac{M_{n}^{(k)}}{n}\leq u\right)=P(\mathcal{E}^{u}_{n}\leq k-1)\to P(\mathcal{E}^{u}_{\ast}\leq k-1)=e^{-u^{-1}}\sum_{i=0}^{k-1}\frac{u^{-i}}{i!}.$$

$$l_{n}\theta^{l_{n}}=n$$

$$\sup_{u\in[\delta,\infty)}d_{TV}(\mathcal{E}^{u}_{n},\mathcal{E}^{u}_{\ast})\leq\frac{\kappa}{\min\{\delta,\delta^{2}\}}\,\frac{l_{n}}{n},$$

$$\sup_{k\in\mathbb{N}}\,\sup_{u\in[\delta,\infty)}\left|P\left(\frac{M_{n}^{(k)}}{n}\leq u\right)-e^{-u^{-1}}\sum_{i=0}^{k-1}\frac{u^{-i}}{i!}\right|\leq\frac{\kappa}{\min\{\delta,\delta^{2}\}}\,\frac{l_{n}}{n}.$$

$$\frac{l_{n}}{n}=o\left(e^{-(\log n)^{\delta}}\right)$$

$$Q_{n}:=\sum_{i=1}^{n}\delta_{\frac{A_{i}\log 2}{n}}\stackrel{d}{\longrightarrow}Q_{\ast}\sim\operatorname{PRM}\big((0,\infty],\nu\big).$$

$$b_{1}$$

$$b_{2}$$

$$b_{3}$$

$$W_{j}=\sum_{\alpha\in\mathcal{I}_{j}}X_{\alpha},\quad Z_{j}=\sum_{\alpha\in\mathcal{I}_{j}}Y_{\alpha}\quad\mbox{and}\quad\lambda_{j}=\sum_{\alpha\in\mathcal{I}_{j}}p_{\alpha}.$$

$$d_{TV}\big(\mathcal{L}(W_{1},W_{2},\ldots,W_{k}),\mathcal{L}(Z_{1},Z_{2},\ldots,Z_{k})\big)\leq\min\Big\{2,\,2.8\max_{1\leq j\leq k}\lambda_{j}^{-1/2}\Big\}(2b_{1}+2b_{2}+b_{3}),$$

$$d_{TV}(\mathcal{E}^{u}_{n},\mathcal{E}^{u}_{\ast})\leq d_{TV}(\mathcal{E}^{u}_{n},\tilde{\mathcal{E}}^{u}_{n})+d_{TV}(\tilde{\mathcal{E}}^{u}_{n},\mathcal{E}^{u}_{\ast})$$

$$d_{TV}(\mathcal{E}^{u}_{n},\tilde{\mathcal{E}}^{u}_{n})\leq\kappa_{1}\max\left\{\frac{1}{u},\frac{1}{u^{2}}\right\}\frac{l_{n}}{n},$$

$$D=(u,\infty].$$

$$\sum_{\alpha\in\mathcal{I}}p_{\alpha}=nP(n^{-1}A_{1}\log 2\in D)=nP(n^{-1}A_{1}\log 2>u).$$

$$b_{1}=\sum_{\alpha=1}^{n}\sum_{\beta\in B_{\alpha}}\big(P(n^{-1}A_{1}\log 2\in D)\big)^{2},$$

$$E(X_{\alpha}X_{\beta})$$


Full text

Continued fractions, the Chen-Stein method and extreme value theory

Anish Ghosh, School of Mathematics, Tata Institute of Fundamental Research, Mumbai 400005, India

Maxim Sølund Kirsebom, Department of Mathematics, University of Hamburg, 20146 Hamburg, Germany

Parthanil Roy, Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, Bangalore 560059, India


Key words and phrases:

Continued fractions, Gauss map, Chen-Stein method, Poisson approximation, rate of convergence, extreme value theory

1991 Mathematics Subject Classification:

Primary 60G70; Secondary 11K50

A. G. gratefully acknowledges support from a grant from the Indo-French Centre for the Promotion of Advanced Research; a Swarnajayanti Fellowship from Department of Science and Technology, Government of India and a MATRICS grant from the Science and Engineering Research Board. P. R. acknowledges the support from a MATRICS grant from the Science and Engineering Research Board and a Swarnajayanti Fellowship from Department of Science and Technology, Government of India.

1. Introduction

This short paper establishes an upper bound for the Doeblin-Iosifescu asymptotics for exceedances (defined below) arising from the Gauss dynamical system. We briefly recall the basic facts about continued fraction expansions and the Gauss map. The reader is referred to the classic text Khintchine (1964) for more details. Let $X=(0,1)$ and for all $x\in X,$ let $[A_{1}(x),A_{2}(x),\ldots]$ denote the regular continued fraction expansion. Define a transformation $T:X\to X$ by

$$T(x)=\{1/x\},$$

where $\{\cdot\}$ denotes the fractional part. With the notation above, for all $x\in X\setminus\mathbb{Q}$,

[TABLE]
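As an illustration of how the Gauss map generates digits, here is a minimal sketch in exact rational arithmetic. It assumes the standard relations $A_{1}(x)=\lfloor 1/x\rfloor$ and $A_{j+1}(x)=A_{1}(T^{j}x)$; the function names `gauss_map` and `cf_digits` are hypothetical. Rational inputs are used so that the expansion terminates, whereas the text above concerns irrational $x$.

```python
from fractions import Fraction
from math import floor


def gauss_map(x):
    """T(x) = {1/x}: the fractional part of the reciprocal of x in (0, 1)."""
    y = 1 / x
    return y - floor(y)


def cf_digits(x, max_digits=50):
    """Digits A_1, A_2, ... of x, assuming A_1(x) = floor(1/x) and A_{j+1}(x) = A_1(T^j x)."""
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(floor(1 / x))  # current leading digit
        x = gauss_map(x)             # shift the expansion by one digit
    return digits


# 7/30 = 1/(4 + 1/(3 + 1/2)), so the digits are 4, 3, 2.
example_digits = cf_digits(Fraction(7, 30))
```

Using `Fraction` avoids the rapid loss of precision that floating-point iteration of $T$ would suffer.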

It is easy to check that $T$ defines a nonsingular transformation on $(X,\lambda)$ , where $\lambda$ denotes the Lebesgue measure. This means that for all measurable $B\subseteq X$ , we have $\lambda\circ T^{-1}(B)=0$ if and only if $\lambda(B)=0$ .

Let $\widehat{T}:L^{1}(X,\lambda)\to L^{1}(X,\lambda)$ denote the dual operator (see, for example, Page 33 of Aaronson (1997)) corresponding to $T$ that satisfies

$$\int_{X}\widehat{T}(f)\,g\,d\lambda=\int_{X}f\,(g\circ T)\,d\lambda$$

for all $f\in L^{1}(X,\lambda)$ and for all $g\in L^{\infty}(X,\lambda)$ . It is easy to extend the domain of definition of $\widehat{T}$ to all nonnegative measurable functions. Solving the functional equation $\widehat{T}(h)=h$ , we get $h(x)=(1+x)^{-1}\in L^{\infty}(X,\lambda)$ . Hence by Proposition 1.4.1 of Aaronson (1997), the probability measure

$$P(dx)=\big((1+x)\log 2\big)^{-1}\,dx$$

on $X$ is $T$ -invariant making $T$ a positive transformation (see, for example, Aaronson (1997)). The measure $P$ is known as the Gauss measure.
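The invariance of the Gauss measure can be checked numerically on intervals $(0,t]$. The sketch below assumes the branch decomposition $T^{-1}(0,t]=\bigcup_{k\geq 1}[1/(k+t),1/k)$ and the distribution function $P((0,t])=\log(1+t)/\log 2$ that follows from the density above; the helper names and the truncation level are hypothetical choices.

```python
import math


def gauss_cdf(t):
    """P((0, t]) under the Gauss measure: log(1 + t) / log 2."""
    return math.log1p(t) / math.log(2)


def preimage_mass(t, terms=200_000):
    """P(T^{-1}(0, t]) as a truncated sum over the branches [1/(k+t), 1/k)."""
    total = 0.0
    for k in range(1, terms + 1):
        # Gauss mass of [1/(k+t), 1/k), up to the common factor 1/log 2.
        total += math.log1p(1.0 / k) - math.log1p(1.0 / (k + t))
    return total / math.log(2)


t = 0.3
gap = abs(preimage_mass(t) - gauss_cdf(t))  # small; only the truncation error remains
```

The sum telescopes, so the discrepancy is of order `t / terms` and shrinks as more branches are included.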

From now on, we shall think of $\{A_{n}\}_{n\geq 1}$ as a sequence of random variables $A_{n}:X\to\mathbb{N}$ defined on the probability space $(X,P)$. The $T$-invariance of $P$ makes this a stationary sequence, i.e., for all $k,l\in\mathbb{N}$, for all $m_{1},m_{2},\ldots,m_{k}\in\mathbb{N}$ and for all Borel subsets $B\subseteq X^{k}$,

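$$P\big((A_{m_{1}},A_{m_{2}},\ldots,A_{m_{k}})\in B\big)=P\big((A_{m_{1}+l},A_{m_{2}+l},\ldots,A_{m_{k}+l})\in B\big).$$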

We are interested in the extreme value theory for this stationary stochastic process. To the best of our knowledge, the first work in this direction was carried out by Doeblin (1940), who, among many other results, rightly observed that exceedances have Poissonian asymptotics: for all $u>0$ ,

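$$\mathcal{E}^{u}_{n}:=\#\{1\leq i\leq n:\,A_{i}\log 2>nu\}\stackrel{d}{\longrightarrow}\mathcal{E}^{u}_{\ast}\sim\operatorname{Poi}(u^{-1}) \tag{1.2}$$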

under $P$. Here $\stackrel{d}{\longrightarrow}$ denotes convergence in distribution and the notation $\mathcal{E}^{u}_{\ast}\sim\operatorname{Poi}(u^{-1})$ means that

$$P\big(\mathcal{E}^{u}_{\ast}=k\big)=\frac{u^{-k}e^{-u^{-1}}}{k!},\quad k=0,1,2,\ldots.$$

However, Doeblin’s proof of (1.2) had a subtle error, which was corrected much later in Theorem 2 of Iosifescu (1977). Therefore, we shall refer to (1.2) as the Doeblin-Iosifescu asymptotics; they form the background of this paper.
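To make the exceedance count concrete, the following sketch evaluates $\mathcal{E}^{u}_{n}=\#\{1\leq i\leq n: A_{i}\log 2>nu\}$ for a finite digit list and the limiting $\operatorname{Poi}(u^{-1})$ probabilities. The function names and the ten-digit list are illustrative assumptions, not data from the paper.

```python
import math


def exceedance_count(digits, u):
    """E_n^u = #{1 <= i <= n : A_i * log 2 > n * u}, with n = len(digits)."""
    n = len(digits)
    return sum(1 for a in digits if a * math.log(2) > n * u)


def poisson_pmf(k, u):
    """P(E_*^u = k) for the limit law Poi(1/u)."""
    return u ** (-k) * math.exp(-1.0 / u) / math.factorial(k)


# Illustrative digit list (an assumption for this example only).
digits = [7, 15, 1, 292, 1, 1, 1, 2, 1, 3]
count = exceedance_count(digits, u=1.0)  # digits exceeding 10 / log 2 (about 14.43)
```

With $n=10$ and $u=1$ only the digits 15 and 292 exceed the threshold, so the count is 2.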

Seemingly unaware of the work of Doeblin (1940), three decades later Galambos (1972) showed that for all $u>0$ ,

$$P\left(\frac{\log 2\,\max_{i=1}^{n}A_{i}}{n}\leq u\right)\to e^{-u^{-1}}, \tag{1.3}$$

which is a restatement of $P(\mathcal{E}^{u}_{n}=0)\to P(\mathcal{E}^{u}_{\ast}=0)$ and hence an easy consequence of (1.2). However, because of the subtle mistake of Doeblin (1940), the above result of Galambos (1972) stands as the first correctly proven result on extreme value theory of continued fractions. This has remained a topic of current interest; see, for example, the generalizations of (1.3) to fibred systems by Nakada and Natsui (2003) and to Oppenheim continued fractions by Chang and Ma (2017).

In view of the above, the following question arises naturally:

What is the rate of convergence of the asymptotics in (1.2)?

In this paper, we give an upper bound on the rate of convergence using the Chen-Stein method of Arratia et al. (1989) (more specifically, Theorem 2.1 below). As far as we are aware, our work is the first to specifically employ the Chen-Stein method in the context of the Gauss map and continued fractions.

The Chen-Stein method is a very useful technique which yields an upper bound that is uniform in $u$ bounded away from zero; see Theorem 1.1 below. As a consequence, we also get a locally uniform (in $(0,\infty]$) upper bound for the convergence of distribution functions in (1.3), and this bound is much better than the best known bound given in Philipp (1976) (we improve a slowly varying rate of convergence to a polynomial one; see Remark 1.4 below). In fact, we give a bound on the rate of convergence of the $k^{th}$ maxima, not just the maxima, and the Chen-Stein method is powerful enough to ensure that this locally uniform upper bound turns out to be uniform over $k\in\mathbb{N}$ as well (see Corollary 1.2).

Note that (1.3) implies that the $A_{i}$’s are in the Fréchet(1) maximal domain of attraction. It is not difficult to observe that (1.3) holds because the $A_{i}$’s enjoy a very strong exponential mixing property (see (1.7) below), and each $A_{i}$ (the $A_{i}$’s are identically distributed by stationarity) is regularly varying with index $-1$, i.e.,

$$nP\left(\frac{A_{1}\log 2}{n}\in\cdot\right)\stackrel{v}{\longrightarrow}\nu \tag{1.4}$$

as measures on $(0,\infty]$. Here “$\stackrel{v}{\longrightarrow}$” denotes vague convergence and $\nu$ is the unique measure on $(0,\infty]$ satisfying $\nu\big((u,\infty]\big)=u^{-1}$ for all $u\in(0,\infty)$. This was essentially the proof given in Galambos (1972) except that he did not use the language of vague convergence, and presented a direct proof instead.

The above vague convergence will play a very important role in this paper. Since $A_{1}$ is an integer-valued random variable, it follows that for each $u>0$ ,

[TABLE]

as $n\to\infty$ . From the above convergence, (1.4) follows by invoking Theorem 3.6 of Resnick (2007). Further, using the inequality $\log{(1+x)}\leq x$ whenever $x>0$ , we get the following upper bound, which will also be very useful in this paper: for all $u>0$ ,

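$$P\left(\frac{A_{1}\log 2}{n}>u\right)=\frac{\log\left(1+\frac{1}{\lceil nu/\log 2\rceil}\right)}{\log 2}\leq\frac{\log\left(1+\frac{\log 2}{nu}\right)}{\log 2}\leq\frac{1}{nu}. \tag{1.6}$$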
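The chain of inequalities for the tail probability can be sanity-checked numerically. The sketch below evaluates the closed form $\log(1+1/\lceil nu/\log 2\rceil)/\log 2$ together with the two upper bounds $\log(1+\log 2/(nu))/\log 2$ and $1/(nu)$; the function names are hypothetical.

```python
import math

LOG2 = math.log(2)


def tail_prob(n, u):
    """Closed form for P(A_1 log 2 / n > u): log(1 + 1/ceil(n u / log 2)) / log 2."""
    m = math.ceil(n * u / LOG2)
    return math.log1p(1.0 / m) / LOG2


def tail_bounds(n, u):
    """Return (exact, middle, upper), the three quantities of the chain, left to right."""
    exact = tail_prob(n, u)
    middle = math.log1p(LOG2 / (n * u)) / LOG2  # drop the ceiling
    upper = 1.0 / (n * u)                       # apply log(1 + x) <= x
    return exact, middle, upper


# n * P(A_1 log 2 / n > u) approaches 1/u as n grows (here u = 2).
scaled_tail = 10**6 * tail_prob(10**6, 2.0)
```

For $u=2$ and $n=10^{6}$ the scaled tail is already within about $10^{-6}$ of the limit $u^{-1}=0.5$.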

In some sense, the $A_{i}$ ’s behave very much like an i.i.d. sequence because of the following exponential mixing property. For all $m,n\in\mathbb{N}$ , for all $F\in\sigma(A_{1},A_{2},\ldots,A_{m})$ , and for all $H\in\sigma(A_{m+n},A_{m+n+1},\ldots)$ ,

$$\big|P(F\cap H)-P(F)P(H)\big|\leq\psi(n)P(F)P(H)=C\theta^{-n}P(F)P(H), \tag{1.7}$$

where $\psi(n)=C\theta^{-n}$ for some $C>0$ and $\theta>1$ ; see, for example, Lemma 2 of Galambos (1972).

In order to state our main result and its corollary, we need to introduce some notation as described below. For each $n\in\mathbb{N}$ and for each $k\in\{1,2,\ldots,n\}$ , denote by $M_{n}^{(k)}$ , the $k^{th}$ largest in the set $\{A_{i}\log{2}:1\leq i\leq n\}$ . Then it follows from (1.2) that for all $u>0$ ,

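$$P\left(\frac{M_{n}^{(k)}}{n}\leq u\right)=P(\mathcal{E}^{u}_{n}\leq k-1)\to P(\mathcal{E}^{u}_{\ast}\leq k-1)=e^{-u^{-1}}\sum_{i=0}^{k-1}\frac{u^{-i}}{i!}.$$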

Obviously, the $k=1$ case has already been taken care of in (1.3) above. Also, let $\{l_{n}\}$ be a sequence of positive real numbers such that

$$l_{n}\theta^{l_{n}}=n \tag{1.8}$$

for all $n\in\mathbb{N}$ (here $\theta$ is as in (1.7) above). Clearly, such a sequence exists by the intermediate value theorem and it increases to infinity at a rate strictly slower than $\log{n}$ .
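Since $l\mapsto l\,\theta^{l}$ is continuous and increasing on $(0,\infty)$, the number $l_{n}$ can be computed by bisection. The sketch below is illustrative only: the value `THETA = 2.0` is a hypothetical stand-in, because the mixing constant $\theta$ in (1.7) is not specified numerically in the text, and `solve_l` is a made-up helper name.

```python
import math


def solve_l(n, theta, tol=1e-12):
    """Solve l * theta**l = n for l > 0 by bisection (the left side increases in l)."""
    lo, hi = 0.0, max(1.0, math.log(n) / math.log(theta))  # bracket: f(lo) < n <= f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * theta ** mid < n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


THETA = 2.0  # hypothetical value of the mixing constant, for illustration only
l_n = solve_l(10**6, THETA)
log_theta_n = math.log(10**6) / math.log(THETA)
```

For $n\geq\theta$ the solution satisfies $l_{n}\geq 1$, hence $\theta^{l_{n}}\leq n$ and $l_{n}\leq\log n/\log\theta$, which is why that value is a valid upper end of the bracket.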

We are now ready to state our main result.

Theorem 1.1.

With the notation as above, we have the following upper bound on the rate of convergence in (1.2): there exists $\kappa>0$ such that for all $\delta>0$ and for all $n\in\mathbb{N}$ ,

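$$\sup_{u\in[\delta,\infty)}d_{TV}(\mathcal{E}^{u}_{n},\mathcal{E}^{u}_{\ast})\leq\frac{\kappa}{\min\{\delta,\delta^{2}\}}\,\frac{l_{n}}{n},$$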

where $d_{TV}$ denotes the total variation distance.

We would like to mention that we blend probability theory (namely, the Chen-Stein method; see Theorem 2.1), ergodic theory (specifically, the exponential mixing property (1.7)) and real analysis (more precisely, a second order regular variation estimate; the second inequality in (2.11)) to prove the result above.

Theorem 1.1 has the following very strong consequence on the rate of weak convergence of scaled $k^{th}$ maxima. The upper bound here is uniform over $u$ bounded away from zero and uniform over $k\in\mathbb{N}$ at the same time.

Corollary 1.2.

With $\kappa$ as in Theorem 1.1, we get that for all $\delta>0$ and for all $n\in\mathbb{N}$ ,

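$$\sup_{k\in\mathbb{N}}\,\sup_{u\in[\delta,\infty)}\left|P\left(\frac{M_{n}^{(k)}}{n}\leq u\right)-e^{-u^{-1}}\sum_{i=0}^{k-1}\frac{u^{-i}}{i!}\right|\leq\frac{\kappa}{\min\{\delta,\delta^{2}\}}\,\frac{l_{n}}{n}.$$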

The above corollary follows from Theorem 1.1 by restricting the supremum in the definition of total variation distance to sets of the form $\{0,1,\ldots,k-1\}$ with $k$ running over the set of all positive integers.
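The limiting distribution function of the scaled $k^{th}$ maximum is the Poisson probability $P(\mathcal{E}^{u}_{\ast}\leq k-1)=e^{-u^{-1}}\sum_{i=0}^{k-1}u^{-i}/i!$. A minimal sketch of that formula, with a hypothetical function name, is given below.

```python
import math


def kth_max_limit_cdf(u, k):
    """Limit of P(M_n^(k) / n <= u), i.e. P(Poi(1/u) <= k - 1)."""
    lam = 1.0 / u
    return math.exp(-lam) * sum(lam ** i / math.factorial(i) for i in range(k))


# For k = 1 this reduces to the Frechet limit exp(-1/u) of the maxima.
values = [kth_max_limit_cdf(2.0, k) for k in range(1, 6)]
```

The values increase with $k$ and approach 1, reflecting that the event $\{0,1,\ldots,k-1\}$ grows with $k$.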

Remark 1.3.

Note that if the $A_{i}$’s were i.i.d. with the same marginal distribution, then by Resnick and de Haan (1989), we would have obtained an upper bound of $O\left(\frac{1}{n}\right)$ on the rate of convergence of the maxima sequence. The Chen-Stein method gives the same rate in the i.i.d. case. In the Gauss dynamical system, we get an extra factor of $l_{n}$ because of the dependence of the $A_{i}$’s. However, since $l_{n}=o(\log{n})$, it follows that our bound on the rate of convergence is $o\left(\frac{\log{n}}{n}\right)$. Therefore, we almost attain the rate obtained in the i.i.d. case.

Remark 1.4.

The best known rate of convergence for the maxima in our setup was obtained by Philipp (1976), who gave an upper bound of $O\left(e^{-(\log{n})^{\delta}}\right)$ with $\delta\in(0,1)$ (the constant in $O$ depends on $\delta$ ). Note that $e^{-(\log{n})^{\delta}}$ is a slowly varying function of $n$ . Therefore, by the Potter bound (see, for example, Page 32 of Resnick (2007)), it follows that $n^{-\eta}=o\left(e^{-(\log{n})^{\delta}}\right)$ for all $\eta>0$ and for all $\delta\in(0,1)$ . Hence, by Remark 1.3, it follows that

$$\frac{l_{n}}{n}=o\left(e^{-(\log n)^{\delta}}\right)$$

for all $\delta\in(0,1)$. Therefore, our bound on the rate of convergence is significantly better than the one obtained by Philipp (1976). More precisely, we improve a slowly varying rate of convergence to a polynomial one, bettering an error term that was used by Philipp in his proof of a conjecture of Paul Erdős.
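The comparison between the two rates can be explored numerically. The sketch below computes the ratio of $l_{n}/n$ to $e^{-(\log n)^{\delta}}$ for growing $n$; it assumes the hypothetical value $\theta=2$ for the mixing constant (not given in the text) and uses made-up helper names.

```python
import math


def solve_l(n, theta, tol=1e-12):
    """Solve l * theta**l = n for l > 0 by bisection."""
    lo, hi = 0.0, max(1.0, math.log(n) / math.log(theta))
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * theta ** mid < n:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def rate_ratio(n, delta, theta=2.0):
    """(l_n / n) divided by exp(-(log n)**delta); theta = 2 is a hypothetical choice."""
    return (solve_l(n, theta) / n) / math.exp(-(math.log(n) ** delta))


# Ratios along n = 10^3, 10^6, 10^9, 10^12 for delta close to 1.
ratios = [rate_ratio(10.0 ** e, delta=0.9) for e in (3, 6, 9, 12)]
```

Even for $\delta=0.9$ the ratio decreases along this sequence, in line with the polynomial bound eventually dominating the slowly varying one.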

Note that the $D$ and $D^{\prime}$ conditions of Davis (1983) follow from (1.7). Therefore, by Example 5.1 in Davis and Hsing (1995), the following extremal point process weak convergence holds in the space $\mathcal{M}_{p}((0,\infty])$ of all Radon point measures (on $(0,\infty]$ ) equipped with the vague metric:

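$$Q_{n}:=\sum_{i=1}^{n}\delta_{\frac{A_{i}\log 2}{n}}\stackrel{d}{\longrightarrow}Q_{\ast}\sim\operatorname{PRM}\big((0,\infty],\nu\big). \tag{1.9}$$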

Here the limit $Q_{\ast}$ is a Poisson random measure on $(0,\infty]$ with mean measure $\nu$ ; see Section 4.1 of Tyran-Kamińska (2010) for a direct proof of (1.9). In this paper, we observe that a tiny detour of our proof of Theorem 1.1 yields (1.9); see Section 2.3 below.

Acknowledgements

This work was initiated during a visit by M.K. and P.R. at the Tata Institute of Fundamental Research, Mumbai and a significant portion of the work was carried out when the authors were at the International Centre for Theoretical Sciences, Bangalore for the program Probabilistic Methods in Negative Curvature (ICTS/pmnc2019/03). We thank both institutes for their hospitality and the lovely working conditions. We would also like to acknowledge an anonymous reviewer and an executive editor for their careful reading of the paper. Their detailed comments have significantly improved our work (especially, Remarks 1.3 and 2.2).

2. Proofs

As mentioned earlier, the proof of Theorem 1.1 relies on the Chen-Stein method of Arratia et al. (1989). We first state their result and then present our proof. Finally, we observe how a tiny detour of the proof also establishes the weak convergence of the extremal point process of the digits arising in the continued fraction expansion.

2.1. The Chen-Stein Method of Arratia et al. (1989)

Let $\mathcal{I}$ be an index set and $\{X_{\alpha}\sim Ber(p_{\alpha})\}_{\alpha\in\mathcal{I}}$ be a collection of possibly dependent Bernoulli random variables. Suppose, for each $\alpha\in\mathcal{I}$ , there exists a subset $B_{\alpha}\subseteq\mathcal{I}$ such that roughly speaking, $X_{\alpha}$ is nearly independent of $\{X_{\beta}:\,\beta\in\mathcal{I}\setminus B_{\alpha}\}$ . Arratia et al. (1989) called $B_{\alpha}$ the “neighborhood of dependence” of $X_{\alpha}$ . Following their notation, we define

[TABLE]

where $\mathcal{H}_{\alpha}$ is the $\sigma$ -field generated by $\{X_{\beta}:\,\beta\in\mathcal{I}\setminus B_{\alpha}\}$ .

Theorem 2.1 (Theorem 2 of Arratia et al. (1989)).

Partition $\mathcal{I}$ into disjoint nonempty subsets $\mathcal{I}_{1},\mathcal{I}_{2},\ldots,\mathcal{I}_{k}$ . Let $\{Y_{\alpha}\sim Poi(p_{\alpha})\}_{\alpha\in\mathcal{I}}$ be a collection of independent Poisson random variables. Set

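$$W_{j}=\sum_{\alpha\in\mathcal{I}_{j}}X_{\alpha},\quad Z_{j}=\sum_{\alpha\in\mathcal{I}_{j}}Y_{\alpha}\quad\mbox{and}\quad\lambda_{j}=\sum_{\alpha\in\mathcal{I}_{j}}p_{\alpha}.$$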

Then

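$$d_{TV}\big(\mathcal{L}(W_{1},W_{2},\ldots,W_{k}),\mathcal{L}(Z_{1},Z_{2},\ldots,Z_{k})\big)\leq\min\Big\{2,\,2.8\max_{1\leq j\leq k}\lambda_{j}^{-1/2}\Big\}(2b_{1}+2b_{2}+b_{3}), \tag{2.4}$$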

where $\mathcal{L}$ denotes the joint law.

We would like to elaborate a bit on the phrase “nearly independent” used above in the context of neighborhood of dependence $B_{\alpha}$ . In many examples (e.g., $m$ -dependent time-series models, certain random graph asymptotics, etc.) where Theorem 2.1 is used, $X_{\alpha}$ is totally independent of $\{X_{\beta}:\beta\not\in B_{\alpha}\}$ making $b_{3}=0$ . In our case, however, we need to bound $b_{3}$ tightly using the “near independence” property (1.7).

2.2. Proof of Theorem 1.1

Define a new Poisson random variable $\tilde{\mathcal{E}}^{u}_{n}$ with mean $nP(n^{-1}A_{1}\log{2}>u)$ . The basic strategy of the proof is to use that

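$$d_{TV}(\mathcal{E}^{u}_{n},\mathcal{E}^{u}_{\ast})\leq d_{TV}(\mathcal{E}^{u}_{n},\tilde{\mathcal{E}}^{u}_{n})+d_{TV}(\tilde{\mathcal{E}}^{u}_{n},\mathcal{E}^{u}_{\ast}) \tag{2.5}$$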

and to estimate each term separately. The bound on $d_{TV}(\mathcal{E}^{u}_{n},\tilde{\mathcal{E}}^{u}_{n})$ will need the Chen-Stein method and the exponential mixing property (1.7), while the second term $d_{TV}(\tilde{\mathcal{E}}^{u}_{n},\mathcal{E}^{u}_{\ast})$ will be estimated using a hard analytic bound on the second order term of the convergence in (1.5). Thus, our proof combines tools from probability theory, ergodic theory and real analysis in a systematic manner.
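The second term compares two Poisson laws, $\operatorname{Poi}(nP(n^{-1}A_{1}\log 2>u))$ and $\operatorname{Poi}(u^{-1})$. The sketch below computes their total variation distance directly and compares it with the difference of the means, using the standard fact that $d_{TV}(\operatorname{Poi}(\lambda),\operatorname{Poi}(\mu))\leq|\lambda-\mu|$. This is offered as an illustration only and is not claimed to be the exact form of the lemma used in the proof; the helper names and the truncation level `kmax` are assumptions.

```python
import math


def poisson_pmf_list(lam, kmax):
    """[P(Poi(lam) = k) for k = 0..kmax], via the recursion p_k = p_{k-1} * lam / k."""
    probs = [math.exp(-lam)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * lam / k)
    return probs


def tv_poisson(lam, mu, kmax=200):
    """Total variation distance (half the l1 distance of the pmfs), truncated at kmax."""
    p, q = poisson_pmf_list(lam, kmax), poisson_pmf_list(mu, kmax)
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


n, u = 1000, 2.0
# Mean of the intermediate Poisson variable, from the closed form in (1.6).
lam_n = n * math.log1p(1.0 / math.ceil(n * u / math.log(2))) / math.log(2)
lam_star = 1.0 / u  # mean of the limit Poi(1/u)
distance = tv_poisson(lam_n, lam_star)
```

For these small means the truncation at `kmax = 200` discards a negligible amount of mass.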

We will first show that there exists $\kappa_{1}>0$ such that for all $u>0$ and for all $n\geq 1$ ,

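$$d_{TV}(\mathcal{E}^{u}_{n},\tilde{\mathcal{E}}^{u}_{n})\leq\kappa_{1}\max\left\{\frac{1}{u},\frac{1}{u^{2}}\right\}\frac{l_{n}}{n}, \tag{2.6}$$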

where $l_{n}$ is as in (1.8). To this end, set

$$D=(u,\infty]. \tag{2.7}$$

We shall use Theorem 2.1 with $\mathcal{I}=\{1,2,\ldots,n\}$ , $k=1$ , $X_{\alpha}=I_{(n^{-1}A_{\alpha}\log{2}\in D)}$ (and hence $p_{\alpha}=E(X_{\alpha})=E(X_{1})=P(n^{-1}A_{1}\log{2}\in D)$ ) and $B_{\alpha}=(\alpha-l_{n},\alpha+l_{n})\cap\mathcal{I}$ for each $\alpha\in\mathcal{I}$ . Note that with these choices we have $W_{1}=\mathcal{E}^{u}_{n}$ and $Z_{1}$ may be thought of, intuitively, as “ $W_{1}$ if the $X_{\alpha}$ ’s were independent”.

Because of stationarity, we get

$$\sum_{\alpha\in\mathcal{I}}p_{\alpha}=nP(n^{-1}A_{1}\log 2\in D)=nP(n^{-1}A_{1}\log 2>u).$$

In order to establish (2.6), we have to estimate the quantities defined by (2.1), (2.2) and (2.3). For the first one, observe that

[TABLE]
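As a rough numerical companion to the estimate of the first term, the sketch below evaluates $b_{1}=\sum_{\alpha=1}^{n}\sum_{\beta\in B_{\alpha}}\big(P(n^{-1}A_{1}\log 2\in D)\big)^{2}$ with $B_{\alpha}=(\alpha-l_{n},\alpha+l_{n})\cap\mathcal{I}$ and $D=(u,\infty]$. The numerical value passed for $l_{n}$ is hypothetical, and the comparison bound $(2l_{n}+1)/(nu^{2})$ is a crude one derived here from (1.6) and a count of the integers in $B_{\alpha}$, not necessarily the bound used in the paper.

```python
import math


def neighbourhood_size(alpha, n, l):
    """Number of integers beta in the open interval (alpha - l, alpha + l) within {1..n}."""
    lo = max(1, math.floor(alpha - l) + 1)  # smallest integer strictly above alpha - l
    hi = min(n, math.ceil(alpha + l) - 1)   # largest integer strictly below alpha + l
    return max(0, hi - lo + 1)


def b1(n, u, l):
    """b_1 = sum over alpha and beta in B_alpha of p**2, with p = P(A_1 log 2 / n > u)."""
    p = math.log1p(1.0 / math.ceil(n * u / math.log(2))) / math.log(2)  # closed form (1.6)
    return sum(neighbourhood_size(a, n, l) for a in range(1, n + 1)) * p * p


n, u, l = 1000, 1.0, 7.13  # l is a hypothetical stand-in for l_n
value = b1(n, u, l)
crude_bound = (2 * l + 1) / (n * u * u)
```

Each neighbourhood contains at most $2l_{n}+1$ indices and $p\leq 1/(nu)$, which gives the crude bound used for comparison.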

In order to bound the second term in (2.4), note that for any $\alpha,\beta\in\mathbb{N}$ such that $\alpha\neq\beta$ ,

[TABLE]

where the last step follows from (1.7). Applying stationarity, (1.6) and the inequality $\psi(n)\leq C$ , we get from the above bound that

[TABLE]

for all $\alpha\neq\beta$ . Hence

[TABLE]

Finally, we need to estimate (2.3). Fixing $\alpha\in\mathcal{I}$ and taking $F=(n^{-1}A_{\alpha}\log{2}\in D)$ with $D$ as in (2.7), we see that (1.7) yields

[TABLE]

for all $H\in\mathcal{H}_{\alpha}=\sigma\{X_{\beta}:\,\beta\in\mathcal{I}\setminus B_{\alpha}\}$ . The above pair of inequalities can be rewritten as

[TABLE]

yielding

[TABLE]

which holds for all $H\in\mathcal{H}_{\alpha}$ and hence

[TABLE]

almost surely. Therefore, we get

[TABLE]

where we used (1.6) and the last step follows from the choice of $l_{n}$ as given in (1.8). The above upper bound, along with (2.8) and (2.9), yields (2.6) thanks to Theorem 2.1.

We now move on to estimating the second term in (2.5). We first use Taylor’s theorem to obtain the inequality $|\log(1+x)-x|\leq\frac{x^{2}}{2}$ , which can be rewritten as

[TABLE]

for all $x>0$ . Using this inequality, we shall now bound the second order term of the convergence in (1.5).

To this end, note that

[TABLE]

By virtue of (2.10), the second term above is bounded by $\frac{\log{2}}{2u^{2}n}.$ On the other hand, using the mean value theorem, we can estimate the first term as follows:

[TABLE]

Therefore, by Lemma (8) of Freedman (1974), it follows that

[TABLE]

The above inequality, (2.6) and (2.5) imply that there exists a constant $\kappa\in(0,\infty)$ such that for all $u>0$ and for all $n\geq 1$ ,

[TABLE]

from which Theorem 1.1 follows.

Remark 2.2.

We would like to mention here an alternative approach pointed out to us by an anonymous referee. Namely, Theorem 1 of Smith (1988) gives a similar Chen-Stein type upper bound in the more general setup of non-stationary processes. It is possible to use this result to give a bound on $d_{TV}(\mathcal{E}^{u}_{n},\tilde{\mathcal{E}}^{u}_{n})$ in our work leaving the estimation of $d_{TV}(\tilde{\mathcal{E}}^{u}_{n},\mathcal{E}^{u}_{\ast})$ (based on hard analysis) as it is. This will involve (in the notation of Smith (1988)) coming up with the function $g(n,r)$, the subsets $I_{nrk}$ and $I^{\ast}_{nrk}$ $(\subseteq I_{nrk})$, the latter being very similar to a neighborhood of dependence, and verifying the Condition $D^{\prime}$ of Smith (1988). We think that this will be more involved than the estimation of the terms $b_{1}$ and $b_{2}$ of our paper. On the other hand, Condition $D$ of Smith (1988) will follow directly from the exponential mixing property (1.7) of our paper and this verification will be shorter than bounding the term $b_{3}$ in our work. Overall, we feel that an application of Theorem 1 of Smith (1988), instead of Theorem 2 of Arratia et al. (1989), will perhaps result in an argument of similar length. However, we have not compared the rates obtained by these two results in our setup.

2.3. New Proof of (1.9)

By Theorem 4.7 of Kallenberg (1983), in order to establish (1.9), it is enough to show the following:

(i) For all $u,v\in(0,\infty]$ with $u<v$,

[TABLE]

as $n\to\infty$. Of course, we follow the convention $\infty^{-1}=0$.

(ii) Whenever $0<u_{1}<v_{1}<u_{2}<v_{2}<\cdots<u_{k}<v_{k}\leq\infty$,

[TABLE]

as $n\to\infty$ .

By linearity of expectation, in order to establish (2.12), it is enough to do so with $u\in(0,\infty)$ and $v=\infty$ . This special case follows using stationarity of $A_{i}$ ’s and (1.5) as shown below:

[TABLE]

as $n\to\infty$. This proves (2.12).

On the other hand, verification of (2.13) will need a tiny detour of the proof of Theorem 1.1 (as carried out in Chiarini et al. (2015) in the context of Gaussian free fields) and Theorem 2.1 will again play a significant role in the proof. To this end, fix $0<u_{1}<v_{1}<u_{2}<v_{2}<\cdots<u_{k}<v_{k}\leq\infty$ and set

[TABLE]

Note that by (1.4) and Proposition 3.12 of Resnick (1987), it follows that

[TABLE]

as $n\to\infty$ for $D$ as in (2.14). Therefore by changing the definition of $D$ from (2.7) to (2.14) in the proof of Theorem 1.1 and using (2.15), it is easy to show that

[TABLE]

as $n\to\infty$ . In particular, $P(Q_{n}(D)=0)\to P(Q_{\ast}(D)=0)=e^{-\nu(D)}$ , which is a restatement of (2.13). This completes the proof of (1.9) based on the Chen-Stein method of Arratia et al. (1989).

Bibliography

J. Aaronson (1997): An Introduction to Infinite Ergodic Theory. American Mathematical Society, Providence.
R. Arratia, L. Goldstein and L. Gordon (1989): Two moments suffice for Poisson approximations: the Chen-Stein method. Ann. Probab. 17:9–25.
Y. Chang and J. Ma (2017): Some distribution results of the Oppenheim continued fractions. Monatsh. Math. 184(3):379–399.
A. Chiarini, A. Cipriani and R. S. Hazra (2015): A note on the extremal process of the supercritical Gaussian free field. Electron. Commun. Probab. 20: paper no. 74, 10 pages.
R. Davis (1983): Stable limits for partial sums of dependent random variables. Ann. Probab. 11(2):262–269.
R. Davis and T. Hsing (1995): Point processes for partial sum convergence for weakly dependent random variables with infinite variance. Ann. Probab. 23(2):879–917.
W. Doeblin (1940): Remarques sur la théorie métrique des fractions continues. Compositio Mathematica 7:353–371.
D. Freedman (1974): The Poisson approximation for dependent events. Ann. Probab. 2:256–269.
