Continued fractions, the Chen-Stein method and extreme value theory
Anish Ghosh, Maxim Kirsebom, Parthanil Roy

TL;DR
This paper applies probability, ergodic theory, and real analysis to improve bounds on the convergence rate of extreme values in continued fraction digit distributions, enhancing understanding of their asymptotic behavior.
Contribution
It introduces new bounds for convergence rates in extreme value theory for continued fractions, utilizing the Chen-Stein method and ergodic theory techniques.
Findings
Improved upper bounds for convergence rates in Doeblin-Iosifescu asymptotics.
Enhanced understanding of the extremal behavior of continued fraction digits.
Methodology applicable to order statistics and extremal point processes.
Abstract
In this work, we deal with extreme value theory in the context of continued fractions using techniques from probability theory, ergodic theory and real analysis. We give an upper bound for the rate of convergence in the Doeblin-Iosifescu asymptotics for the exceedances of digits obtained from the regular continued fraction expansion of a number chosen randomly from $(0,1)$ according to the Gauss measure. As a consequence, we significantly improve the best known upper bound on the rate of convergence of the maxima in this case. We observe that the asymptotics of order statistics and the extremal point process can also be investigated using our methods.
Anish Ghosh, School of Mathematics, Tata Institute of Fundamental Research, Mumbai 400005, India
Maxim Sølund Kirsebom, Department of Mathematics, University of Hamburg, 20146 Hamburg, Germany
Parthanil Roy, Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, Bangalore 560059, India
Key words and phrases:
Continued fractions, Gauss map, Chen-Stein method, Poisson approximation, rate of convergence, extreme value theory
1991 Mathematics Subject Classification:
Primary 60G70; Secondary 11K50
A. G. gratefully acknowledges support from a grant from the Indo-French Centre for the Promotion of Advanced Research; a Swarnajayanti Fellowship from Department of Science and Technology, Government of India and a MATRICS grant from the Science and Engineering Research Board. P. R. acknowledges the support from a MATRICS grant from the Science and Engineering Research Board and a Swarnajayanti Fellowship from Department of Science and Technology, Government of India.
1. Introduction
This short paper establishes an upper bound for the Doeblin-Iosifescu asymptotics for exceedances (defined below) arising from the Gauss dynamical system. We briefly recall the basic facts about continued fraction expansions and the Gauss map. The reader is referred to the classic text Khintchine (1964) for more details. Let $\Omega = (0,1) \setminus \mathbb{Q}$ and for all $\omega \in \Omega$ let $\omega = [A_1(\omega), A_2(\omega), \ldots]$ denote the regular continued fraction expansion. Define a transformation $T: \Omega \to \Omega$ by
\[ T(\omega) = \{1/\omega\}, \]
where $\{\cdot\}$ denotes the fractional part. With the notations above, for all $j \geq 1$,
\[ A_j(\omega) = A_1\big(T^{j-1}(\omega)\big). \]
It is easy to check that $T$ defines a nonsingular transformation on $(\Omega, \lambda)$, where $\lambda$ denotes the Lebesgue measure. This means that for all measurable $B \subseteq \Omega$, we have $\lambda(T^{-1}(B)) = 0$ if and only if $\lambda(B) = 0$.
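To illustrate how the digits arise from iterates of the Gauss map, the following Python sketch computes continued fraction digits by repeatedly applying $x \mapsto \{1/x\}$. It is only an illustration: it uses exact rational arithmetic, so the expansion terminates (unlike for the irrational numbers considered in the paper), and the names `gauss_map` and `cf_digits` are hypothetical, chosen for this example.

```python
from fractions import Fraction
from math import floor


def gauss_map(x: Fraction) -> Fraction:
    """Gauss map: the fractional part of 1/x, for 0 < x < 1."""
    inv = 1 / x
    return inv - floor(inv)


def cf_digits(x: Fraction, max_digits: int = 50) -> list:
    """Digits of the regular continued fraction of x in (0, 1).

    The j-th digit is the integer part of 1/y, where y is the
    (j-1)-th iterate of the Gauss map applied to x.
    """
    digits = []
    while x != 0 and len(digits) < max_digits:
        digits.append(floor(1 / x))
        x = gauss_map(x)
    return digits


# 7/16 = 1/(2 + 1/(3 + 1/2))
digits = cf_digits(Fraction(7, 16))
# Digits of the image under the Gauss map: the same sequence shifted by one.
shifted = cf_digits(gauss_map(Fraction(7, 16)))
```

Applying the map once discards the first digit, which is the shift relation between the digit sequence and the iterates of the map.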
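Let $\hat{T}$ denote the dual operator (see, for example, Page 33 of Aaronson (1997)) corresponding to $T$ that satisfies
\[ \int_\Omega (\hat{T} f)\, g \, d\lambda = \int_\Omega f \, (g \circ T) \, d\lambda \]
for all $f \in L^1(\lambda)$ and for all $g \in L^\infty(\lambda)$. It is easy to extend the domain of definition of $\hat{T}$ to all nonnegative measurable functions. Solving the functional equation $\hat{T} h = h$, we get $h(x) = \frac{1}{1+x}$. Hence by Proposition 1.4.1 of Aaronson (1997), the probability measure
\[ \mu(B) = \frac{1}{\log 2} \int_B \frac{dx}{1+x} \]
on $\Omega$ is $T$-invariant, making $T$ a positive transformation (see, for example, Aaronson (1997)). The measure $\mu$ is known as the Gauss measure.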
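The invariance of the Gauss measure can be checked numerically on intervals of the form $(0,t)$. The sketch below is a sanity check rather than a proof. It uses the standard fact that the preimage of $(0,t)$ under the Gauss map is the disjoint union of the intervals $(1/(k+t), 1/k)$, $k \geq 1$, and it truncates this union to finitely many branches. The function names and the truncation level are assumptions made for this example.

```python
from math import log, log1p


def gauss_measure(a: float, b: float) -> float:
    """Gauss measure of the interval (a, b) contained in (0, 1)."""
    return (log1p(b) - log1p(a)) / log(2)


def preimage_measure(t: float, terms: int = 200_000) -> float:
    """Gauss measure of the preimage of (0, t) under the Gauss map.

    The preimage is the disjoint union over k >= 1 of the intervals
    (1/(k + t), 1/k); the series is truncated after `terms` branches.
    """
    return sum(gauss_measure(1 / (k + t), 1 / k) for k in range(1, terms + 1))


t = 0.3
lhs = preimage_measure(t)    # measure of the preimage (truncated series)
rhs = gauss_measure(0.0, t)  # measure of (0, t) itself
```

The two values agree up to the truncation error of the series, which is of order `t / terms`.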
From now on, we shall think of as a sequence of random variables defined on the probability space . The -invariance of makes this a stationary sequence, i.e., for all , for all and for all Borel subset ,
[TABLE]
We are interested in the extreme value theory for this stationary stochastic process. To the best of our knowledge, the first work in this direction was carried out by Doeblin (1940), who, among many other results, rightly observed that exceedances have Poissonian asymptotics: for all ,
[TABLE]
under . Here denotes convergence in distribution and the notation means that
[TABLE]
However, Doeblin’s proof of (1.2) had a subtle error, which was corrected much later in Theorem 2 of Iosifescu (1977). Therefore, we shall refer to (1.2) as the Doeblin-Iosifescu asymptotics; they form the background of this paper.
Seemingly unaware of the work of Doeblin (1940), three decades later Galambos (1972) showed that for all ,
[TABLE]
which is a restatement of and hence an easy consequence of (1.2). However, because of the subtle mistake of Doeblin (1940), the above result of Galambos (1972) stands as the first correctly proven result on extreme value theory of continued fractions. This has remained a topic of current interest; see, for example, the generalizations of (1.3) to fibred systems by Nakada and Natsui (2003) and to Oppenheim continued fractions by Chang and Ma (2017).
In view of the above, the following question arises naturally:
What is the rate of convergence in the asymptotics in (1.2)?
In this paper, we give an upper bound on the rate of convergence using the Chen-Stein method of Arratia et al. (1989) (more specifically, Theorem 2.1 below). As far as we are aware, our work is the first to employ the Chen-Stein method in the context of the Gauss map and continued fractions.
The Chen-Stein method is a very useful technique which yields an upper bound that is uniform in $u$ bounded away from zero; see Theorem 1.1 below. As a consequence, we also get a locally uniform (in $u$) upper bound for the convergence of distribution functions in (1.3), and this bound is much better than the best known bound given in Philipp (1976) (we improve a slowly varying rate of convergence to a polynomial one; see Remark 1.4 below). In fact, we give a bound on the rate of convergence of the $k$-th maxima, not just the maxima, and the Chen-Stein method is powerful enough to ensure that this locally uniform upper bound turns out to be uniform over $k$ as well (see Corollary 1.2).
Note that (1.3) implies that the ’s are in the Fréchet(1) maximal domain of attraction. It is not difficult to observe that (1.3) holds because the ’s enjoy a very strong exponential mixing property (see (1.7) below), and each (which are anyway identically distributed because of stationarity) is regularly varying with index , i.e.,
[TABLE]
as measures on $(0,\infty]$. Here “$\stackrel{v}{\longrightarrow}$” denotes vague convergence and $\nu$ is the unique measure on $(0,\infty]$ satisfying $\nu\big((u,\infty]\big)=u^{-1}$ for all $u>0$. This was essentially the proof given in Galambos (1972) except that he did not use the language of vague convergence, and presented a direct proof instead.
The above vague convergence will play a very important role in this paper. Since is an integer-valued random variable, it follows that for each ,
[TABLE]
as . From the above convergence, (1.4) follows by invoking Theorem 3.6 of Resnick (2007). Further, using the inequality whenever , we get the following upper bound, which will also be very useful in this paper: for all ,
[TABLE]
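The tail behaviour of the first digit can be examined in closed form. Under the Gauss measure, the event that the first digit exceeds an integer $m$ is the interval $(0, 1/(m+1))$, which has measure $\log_2(1 + 1/(m+1))$. The Python sketch below evaluates $n$ times this tail at the threshold $nu/\log 2$. This threshold is an assumption, taken to be the normalisation in Galambos's theorem. The sketch then compares the result with the limit $1/u$; the function names are hypothetical.

```python
from math import floor, log, log1p


def tail(m: int) -> float:
    """Gauss measure of {first digit > m}, i.e. log2(1 + 1/(m + 1))."""
    return log1p(1 / (m + 1)) / log(2)


def scaled_tail(n: int, u: float) -> float:
    """n times the Gauss measure of {first digit > n*u/log 2}.

    The digit is integer valued, so exceeding a real threshold is the
    same as exceeding its integer part.
    """
    return n * tail(floor(n * u / log(2)))


u = 2.0
# Gap between the limit 1/u and the scaled tail, for increasing n.
gaps = {n: 1 / u - scaled_tail(n, u) for n in (10, 100, 1000, 10_000)}
```

In this example every gap is positive, consistent with an upper bound of $1/u$ obtained from $\log(1+x) \leq x$, and the gaps shrink roughly like $1/n$, which is the second order behaviour exploited later in the proof.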
In some sense, the ’s behave very much like an i.i.d. sequence because of the following exponential mixing property. For all , for all , and for all ,
[TABLE]
where for some and ; see, for example, Lemma 2 of Galambos (1972).
In order to state our main result and its corollary, we need to introduce some notation as described below. For each and for each , denote by , the largest in the set . Then it follows from (1.2) that for all ,
[TABLE]
Obviously, the case has already been taken care of in (1.3) above. Also, let be a sequence of positive real numbers such that
[TABLE]
for all (here is as in (1.7) above). Clearly, such a sequence exists by the intermediate value theorem and it increases to infinity at a rate strictly slower than .
We are now ready to state our main result.
Theorem 1.1.
With the notation as above, we have the following upper bound on the rate of convergence in (1.2): there exists such that for all and for all ,
[TABLE]
where denotes the total variation distance.
We would like to mention that we blend probability theory (namely, the Chen-Stein method; see Theorem 2.1), ergodic theory (specifically, the exponential mixing property (1.7)) and real analysis (more precisely, a second order regular variation estimate; the second inequality in (2.11)) to prove the result above.
Theorem 1.1 has the following very strong consequence on the rate of weak convergence of scaled maxima. The upper bound here is uniform over bounded away from zero and uniform over at the same time.
Corollary 1.2.
With as in Theorem 1.1, we get that for all and for all ,
[TABLE]
The above corollary follows from Theorem 1.1 by restricting the supremum in the definition of total variation distance to sets of the form with running over the set of all positive integers.
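The link between exceedance counts and order statistics can be made concrete: the $k$-th largest digit lies below a threshold exactly when at most $k-1$ digits exceed it. The sketch below evaluates the corresponding limiting distribution function, assuming the limiting exceedance count is Poisson with mean $1/u$ (consistent with the Fréchet(1) limit $e^{-1/u}$ for the maximum). The function names are hypothetical.

```python
from math import exp, factorial


def poisson_cdf(k: int, lam: float) -> float:
    """P(Z <= k) for Z distributed as Poisson(lam)."""
    return exp(-lam) * sum(lam**j / factorial(j) for j in range(k + 1))


def kth_max_limit(k: int, u: float) -> float:
    """Limiting probability that the k-th largest scaled digit is at most u.

    This is the probability of at most k - 1 exceedances in the limit.
    """
    return poisson_cdf(k - 1, 1 / u)


frechet = kth_max_limit(1, 2.0)      # k = 1 gives the Frechet value exp(-1/u)
second_max = kth_max_limit(2, 2.0)   # equals exp(-1/u) * (1 + 1/u)
```

Restricting attention to the events of at most $k-1$ exceedances is what turns a total variation bound for the counts into a bound for the distribution functions of the $k$-th maxima.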
Remark 1.3.
Note that if ’s were i.i.d. with the same marginal distribution, then by Resnick and de Haan (1989), we would have obtained an upper bound of on the rate of convergence of the maxima sequence. The Chen-Stein method gives the same rate in the i.i.d. case. In the Gauss dynamical system, we get an extra factor of because of the dependence of the ’s. However, since , it follows that our bound on the rate of convergence is . Therefore, we almost attain the rate obtained in the i.i.d. case.
Remark 1.4.
The best known rate of convergence for the maxima in our setup was obtained by Philipp (1976), who gave an upper bound of with (the constant in depends on ). Note that is a slowly varying function of . Therefore, by the Potter bound (see, for example, Page 32 of Resnick (2007)), it follows that for all and for all . Hence, by Remark 1.3, it follows that
[TABLE]
for all . Therefore, our bound on the rate of convergence is significantly better than the one obtained by Philipp (1976). More precisely, we improve a slowly varying rate of convergence to a polynomial one, improving an error term that was used by Philipp in his proof of a conjecture of Paul Erdős.
Note that the and conditions of Davis (1983) follow from (1.7). Therefore, by Example 5.1 in Davis and Hsing (1995), the following extremal point process weak convergence holds in the space of all Radon point measures (on ) equipped with the vague metric:
[TABLE]
Here the limit is a Poisson random measure on with mean measure ; see Section 4.1 of Tyran-Kamińska (2010) for a direct proof of (1.9). In this paper, we observe that a tiny detour of our proof of Theorem 1.1 yields (1.9); see Section 2.3 below.
Acknowledgements
This work was initiated during a visit by M.K. and P.R. at the Tata Institute of Fundamental Research, Mumbai and a significant portion of the work was carried out when the authors were at the International Centre for Theoretical Sciences, Bangalore for the program Probabilistic Methods in Negative Curvature (ICTS/pmnc2019/03). We thank both institutes for their hospitality and the lovely working conditions. We would also like to acknowledge an anonymous reviewer and an executive editor for their careful reading of the paper. Their detailed comments have significantly improved our work (especially, Remarks 1.3 and 2.2).
2. Proofs
As mentioned earlier, the proof of Theorem 1.1 relies on the Chen-Stein method of Arratia et al. (1989). We first state their result and then present our proof. Finally, we observe how a tiny detour of the proof also establishes the weak convergence of the extremal point process of the digits arising in the continued fraction expansion.
2.1. The Chen-Stein Method of Arratia et al. (1989)
Let be an index set and be a collection of possibly dependent Bernoulli random variables. Suppose, for each , there exists a subset such that roughly speaking, is nearly independent of . Arratia et al. (1989) called the “neighborhood of dependence” of . Following their notation, we define
[TABLE]
where is the -field generated by .
Theorem 2.1 (Theorem 2 of Arratia et al. (1989)).
Partition into disjoint nonempty subsets . Let be a collection of independent Poisson random variables. Set
[TABLE]
Then
[TABLE]
where denotes the joint law.
We would like to elaborate a bit on the phrase “nearly independent” used above in the context of neighborhood of dependence . In many examples (e.g., -dependent time-series models, certain random graph asymptotics, etc.) where Theorem 2.1 is used, is totally independent of making . In our case, however, we need to bound tightly using the “near independence” property (1.7).
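The totally independent case mentioned above can be worked out explicitly. If the Bernoulli variables are independent with common success probability $p$ and each neighborhood of dependence is taken to be the singleton $\{\alpha\}$, then, in the notation of Arratia et al. (1989), the term $b_1 = \sum_\alpha \sum_{\beta \in B_\alpha} p_\alpha p_\beta$ reduces to $np^2$ and the other two terms vanish. The Python sketch below computes the exact total variation distance between Binomial$(n,p)$ and Poisson$(np)$ and compares it with $b_1$. The comparison relies on the classical Le Cam inequality, not on the exact constants of Theorem 2.1, and the function names are hypothetical.

```python
from math import comb, exp, factorial


def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(S = k) for S distributed as Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def poisson_pmf(k: int, lam: float) -> float:
    """P(Z = k) for Z distributed as Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def tv_binomial_poisson(n: int, p: float) -> float:
    """Total variation distance between Binomial(n, p) and Poisson(n * p)."""
    lam = n * p
    body = sum(abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(n + 1))
    # The Poisson law also puts mass on integers larger than n.
    poisson_tail = 1.0 - sum(poisson_pmf(k, lam) for k in range(n + 1))
    return 0.5 * (body + max(poisson_tail, 0.0))


n, p = 100, 0.02
b1 = n * p**2  # sum of p_alpha^2 when every neighborhood is a singleton
tv = tv_binomial_poisson(n, p)
```

In the dependent setting of this paper the remaining two terms no longer vanish, which is where the exponential mixing property enters.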
2.2. Proof of Theorem 1.1
Define a new Poisson random variable with mean . The basic strategy of the proof is to use that
[TABLE]
and to estimate each term separately. The bound on will need the Chen-Stein method and the exponential mixing property (1.7), while the second term will be estimated using a hard analytic bound on the second order term of the convergence in (1.5). Thus, our proof combines tools from probability theory, ergodic theory and real analysis in a systematic manner.
We will first show that there exists such that for all and for all ,
[TABLE]
where is as in (1.8). To this end, set
[TABLE]
We shall use Theorem 2.1 with , , (and hence ) and for each . Note that with these choices we have and may be thought of, intuitively, as “ if the ’s were independent”.
Because of stationarity, we get
[TABLE]
In order to establish (2.6), we have to estimate the quantities defined by (2.1), (2.2) and (2.3). For the first one, observe that
[TABLE]
In order to bound the second term in (2.4), note that for any such that ,
[TABLE]
where the last step follows from (1.7). Applying stationarity, (1.6) and the inequality , we get from the above bound that
[TABLE]
for all . Hence
[TABLE]
Finally, we need to estimate (2.3). Fixing and taking with as in (2.7), we see that (1.7) yields
[TABLE]
for all . The above pair of inequalities can be rewritten as
[TABLE]
yielding
[TABLE]
which holds for all and hence
[TABLE]
almost surely. Therefore, we get
[TABLE]
where we used (1.6) and the last step follows from the choice of as given in (1.8). The above upper bound, along with (2.8) and (2.9), yields (2.6) thanks to Theorem 2.1.
We now move on to estimating the second term in (2.5). We first use Taylor’s theorem to obtain the inequality , which can be rewritten as
[TABLE]
for all . Using this inequality, we shall now bound the second order term of the convergence in (1.5).
To this end, note that
[TABLE]
By virtue of (2.10), the second term above is bounded by On the other hand, using the mean value theorem, we can estimate the first term as follows:
[TABLE]
Therefore, by Lemma (8) of Freedman (1974), it follows that
[TABLE]
The above inequality, (2.6) and (2.5) imply that there exists a constant such that for all and for all ,
[TABLE]
from which Theorem 1.1 follows.
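The second term in the triangle inequality compares two Poisson laws with nearby means. A standard coupling argument gives that the total variation distance between Poisson$(a)$ and Poisson$(b)$ is at most $1 - e^{-|a-b|} \leq |a-b|$; whether this matches the exact form of the lemma cited from Freedman (1974) is not checked here. The Python sketch below evaluates the distance numerically for illustrative means, with hypothetical function names and an assumed truncation of the support.

```python
from math import exp


def poisson_pmfs(lam: float, kmax: int) -> list:
    """Poisson(lam) probabilities for 0..kmax, built recursively."""
    probs = [exp(-lam)]
    for k in range(1, kmax + 1):
        probs.append(probs[-1] * lam / k)
    return probs


def tv_poisson(a: float, b: float, kmax: int = 200) -> float:
    """Total variation distance between Poisson(a) and Poisson(b).

    The support is truncated at kmax, which is harmless for small means.
    """
    pa, pb = poisson_pmfs(a, kmax), poisson_pmfs(b, kmax)
    return 0.5 * sum(abs(x - y) for x, y in zip(pa, pb))


a, b = 0.5, 0.49
distance = tv_poisson(a, b)
coupling_bound = 1 - exp(-abs(a - b))
```

Since the difference of the two means is controlled by the second order regular variation estimate, this step contributes a term of the same order as that estimate.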
Remark 2.2.
We would like to mention here an alternative approach pointed out to us by an anonymous referee. Namely, Theorem 1 of Smith (1988) gives a similar Chen-Stein type upper bound in the more general setup of non-stationary processes. It is possible to use this result to give a bound on in our work leaving the estimation of (based on hard analysis) as it is. This will involve (in the notation of Smith (1988)) coming up with the function , the subsets and , the latter being very similar to a neighborhood of dependence, and verifying the Condition of Smith (1988). We think that this will be more involved than the estimation of the terms and of our paper. On the other hand, Condition of Smith (1988) will follow directly from the exponential mixing property (1.7) of our paper and this verification will be shorter than bounding the term in our work. Overall, we feel that an application of Theorem 1 of Smith (1988), instead of Theorem 2 of Arratia et al. (1989), will perhaps result in an argument of similar length. However, we have not compared the rates obtained by these two results in our setup.
2.3. New Proof of (1.9)
By Theorem 4.7 of Kallenberg (1983), in order to establish (1.9), it is enough to show the following:
(i) For all with ,
[TABLE]
as . Of course, we follow the convention .
(ii) Whenever ,
[TABLE]
as .
By linearity of expectation, in order to establish (2.12), it is enough to do so with and . This special case follows using stationarity of ’s and (1.5) as shown below:
[TABLE]
as . This proves (2.12).
On the other hand, verification of (2.13) will need a tiny detour of the proof of Theorem 1.1 (as carried out in Chiarini et al. (2015) in the context of Gaussian free fields) and Theorem 2.1 will again play a significant role in the proof. To this end, fix and set
[TABLE]
Note that by (1.4) and Proposition 3.12 of Resnick (1987), it follows that
[TABLE]
as for as in (2.14). Therefore by changing the definition of from (2.7) to (2.14) in the proof of Theorem 1.1 and using (2.15), it is easy to show that
[TABLE]
as . In particular, , which is a restatement of (2.13). This completes the proof of (1.9) based on the Chen-Stein method of Arratia et al. (1989).
References
- J. Aaronson (1997): An Introduction to Infinite Ergodic Theory. American Mathematical Society, Providence.
- R. Arratia, L. Goldstein and L. Gordon (1989): Two moments suffice for Poisson approximations: the Chen-Stein method. Ann. Probab. 17:9–25.
- Y. Chang and J. Ma (2017): Some distribution results of the Oppenheim continued fractions. Monatsh. Math. 184(3):379–399.
- A. Chiarini, A. Cipriani and R. S. Hazra (2015): A note on the extremal process of the supercritical Gaussian free field. Electron. Commun. Probab. 20:paper no. 74, 10 pages.
- R. Davis (1983): Stable limits for partial sums of dependent random variables. Ann. Probab. 11(2):262–269.
- R. Davis and T. Hsing (1995): Point processes for partial sum convergence for weakly dependent random variables with infinite variance. Ann. Probab. 23(2):879–917.
- W. Doeblin (1940): Remarques sur la théorie métrique des fractions continues. Compositio Mathematica 7:353–371.
- D. Freedman (1974): The Poisson approximation for dependent events. Ann. Probab. 2:256–269.
