Rates of convergence for extremal spacings in Kakutani's random interval-splitting process
Fraser Daly, Andrew Wade

TL;DR
This paper analyzes the rate of convergence in the distribution of extremal spacings in Kakutani's interval-splitting process, providing quantitative bounds and connecting to branching processes.
Contribution
It offers the first quantitative bounds for the convergence rates of the maximum and minimum sub-interval lengths in Kakutani's process, including Berry-Esseen bounds and exponential convergence results.
Findings
Central limit theorem for maximum sub-interval length with quantitative bounds
Exponential distribution convergence for minimum sub-interval length
Quantitative error bounds using Hermite-Edgeworth expansion
Topics: Stochastic processes and statistical mechanics
Fraser Daly, Department of Actuarial Mathematics and Statistics, and the Maxwell Institute for Mathematical Sciences, Heriot–Watt University, Edinburgh EH14 4AS; [email protected]
Andrew Wade, Department of Mathematical Sciences, Durham University, Durham DH1 3LE; [email protected]
Abstract
Kakutani’s random interval-splitting process iteratively divides, via a uniformly random splitting point, the largest sub-interval in a partition of the unit interval. The length of the longest sub-interval after $n$ steps, suitably centred and scaled, is known to satisfy a central limit theorem as $n \to \infty$. We provide a quantitative (Berry–Esseen) upper bound for the finite-$n$ approximation in the central limit theorem, with conjecturally optimal rates in $n$. We also prove convergence to an exponential distribution for the length of the smallest sub-interval, with quantitative bounds. The Kakutani process can be embedded in certain branching and fragmentation processes, and we translate our results into that context also. Our proof uses conditioning on an intermediate time, a conditional independence structure for statistics involving small sub-intervals, an Hermite–Edgeworth expansion, and moment estimates with quantitative error bounds.
Key words: Interval division, Kakutani process, maximum/minimum gap, central limit theorem, Berry–Esseen theorem, Crump–Mode–Jagers process, branching random walk.
AMS Subject Classification: 60F05 (Primary) 60G18, 60J80 (Secondary).
1 Kakutani’s interval-splitting process
1.1 Main results
The subject of this paper is the following random process, attributed to Kakutani, which takes values in partitions of the unit interval and evolves by successive uniform binary splitting of the maximal interval. Start with the unit interval $[0,1]$; at each subsequent step, choose the largest of the current collection of intervals, and split it into two random subintervals by inserting a uniform random splitting point, independently of previous steps. Ties can be broken arbitrarily, but, with probability $1$, they do not occur.
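As an informal illustration of the splitting rule just described (it plays no role in the proofs), the following Python sketch simulates the process using a heap of interval lengths. The function name `kakutani_gaps` and the seeded `random.Random` generator are our own choices for the example, not notation from the paper.

```python
import heapq
import random

def kakutani_gaps(n_steps, rng=None):
    """Return the sub-interval lengths after n_steps splits of the largest one."""
    rng = rng or random.Random()
    # heapq is a min-heap, so we store negated lengths to pop the largest gap.
    heap = [-1.0]
    for _ in range(n_steps):
        largest = -heapq.heappop(heap)
        u = rng.random()  # uniform relative position of the split point
        heapq.heappush(heap, -u * largest)
        heapq.heappush(heap, -(1.0 - u) * largest)
    return [-g for g in heap]

n = 1000
gaps = kakutani_gaps(n, random.Random(0))
print(len(gaps), sum(gaps), n * max(gaps), min(gaps))
```

Only the lengths are stored, since the rule depends on the partition only through the sizes of its sub-intervals. In runs of this sketch the largest gap after $n$ steps is typically close to $2/n$, in line with the strong law discussed next.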
We give some slightly more formal definitions, and some historical context and motivation, in Section 1.3 below. Our main result, Theorem 1.3, concerns the asymptotics of the random variable , the length of the largest among the subintervals in the partition resulting from steps of the process described above.
Since the sum of all the subintervals in the partition is always , it is clear that , a.s., for every . The strong law of large numbers
[TABLE]
is due to Lootgieter (Corollary 1.2 of [20, p. 397]) and Pyke (Lemma 1 in [26, p. 159]). Over 20 years later, Pyke and van Zwet [27, p. 414] obtained the following central limit theorem (CLT). Let $\Phi$ denote the cumulative distribution function of the standard normal distribution, i.e., $\Phi(x) = (2\pi)^{-1/2} \int_{-\infty}^{x} e^{-y^2/2} \, \mathrm{d}y$ for $x \in \mathbb{R}$.
Proposition 1.1 (Pyke & van Zwet, 2004).
Let . Then
[TABLE]
Remark 1.2.
Via an embedding we describe in Section 1.2, and an inversion we describe in Section 1.4, Proposition 1.1 also follows from earlier work of Sibuya & Itoh [31]: see Remark 1.11.
Our main result gives Berry–Esseen bounds on the rate of convergence in the CLT of Proposition 1.1.
Theorem 1.3.
There exists a constant such that,
[TABLE]
where is as defined in Proposition 1.1.
Remark 1.4.
We conjecture that the error bound in Theorem 1.3 is optimal, as is generic in the classical Berry–Esseen theorem (see e.g. [10, §7.6]). While we do not have a proof of a lower bound of matching order to the upper bound in (1.2), Figure 1 presents some simulation evidence that appears to support this conjecture.
A second limit theorem that we deduce from some of the structural results that we develop in this paper concerns the length of the smallest gap after steps of Kakutani’s interval-splitting process. In particular, the next result shows that converges in distribution to a unit-mean exponential distribution as .
Theorem 1.5.
There exists a constant such that,
[TABLE]
Remark 1.6.
We suspect that the polynomial rate in (1.3) cannot be improved (see Figure 1), in contrast to the rate in the analogous limit for the Dirichlet process at (1.19) below, but we are unsure if the factor is sharp. Establishing the optimal rate of convergence in Theorem 1.5, including removing any possibly superfluous logarithmic factors and explaining apparent non-linearities in the right-hand plot of Figure 1, is left as a topic for future work.
We give an overview of our proof strategy, and the organization of the paper, in Section 1.5 below. First, in Section 1.2 we describe how our result can be interpreted in the context of a branching process, and its relationship to some adjacent models.
1.2 Related branching, fragmentation, and parking processes
The recursive interval division of the Kakutani process is reminiscent of branching and fragmentation structures. Indeed, the Kakutani process admits several embeddings into (perhaps more familiar) classes of probability models, and is adjacent to several others. In this section we indicate some of these links, where, typically, optimal Berry–Esseen results do not appear to be known, in part to draw attention to scope for possible extensions of the present work.
We believe that Kingman [16] was the first to explicitly use an embedding into a branching process to put the strong law (1.1) for the Kakutani process into a more general framework; essentially the same construction appears in Sibuya & Itoh [31] but without the explicit link to the Kakutani model. Subsequent work has also emphasized correspondences to fragmentation processes and branching random walks, where considerable technology has been developed. The Kakutani process translates to quite special versions of these general structures, and, as far as we are aware, our main results are not subsumed within the general literature.
Kingman [16] observed that applying the bijective map to the collection of subinterval lengths in the Kakutani process gives a process on in which interval splitting translates to additive displacement; there are several ways to exploit this. The first is Kingman’s original embedding.
Total population in a binary branching process.
In the Kakutani process, since , a.s., every subinterval present in the process at some time , say, will be split at some time , and hence have length equal to . In other words, is the collection of all (a.s. distinct) subinterval lengths observed during the Kakutani process, every appears in , and, typically, for many pairs , . The length is in position in viewed as an ordered list. That is, if , we have if and only if .
Applying the map gives the branching process interpretation as a population of individuals with birth times indexed by : the ancestor is born at time $0$, and every individual gives birth to two offspring at two (different, correlated) times distributed as and , . (Throughout the paper, the notation stands for the continuous uniform distribution on interval , for real .) The quantity is the total population number observed before time .
This construction, relating in the Kakutani process to total population size in a Crump–Mode–Jagers branching process with binary offspring and correlated birth times, is due to Kingman [16]. The following is a translation of Theorem 1.3 (in fact, the translation is direct from Theorem 1.10 via the inversion described in Section 1.4 below).
Corollary 1.7.
There exists such that, with as defined in Proposition 1.1,
[TABLE]
A different perspective on Kingman’s construction, going back to Sibuya & Itoh [31], interprets as the height of a random fragmentation tree, a subject studied more generally in [13] (see also the references therein). That description is very close to the one we present in Section 1.4, and so we defer further discussion to Remark 1.11 below.
Recently, for a general class of Crump–Mode–Jagers processes, deep results [11, 12] have been obtained generalizing the convergence in distribution part of Corollary 1.7. Theorem 3.2 of [11] is specific to binary branching but does not apply directly to our case, as that work assumes independent birth times for siblings, while [12] does admit the correlations present here. Of course, the non-quantitative part of Corollary 1.7 is already known via a translation of Proposition 1.1, but the results of [11, 12] indicate that there is a much more general setting than the Kakutani process to which one might seek to extend the Berry–Esseen results of Theorem 1.3. We are not aware of any existing Berry–Esseen results in the Crump–Mode–Jagers context, and our approach in the present paper seems to make essential use of features of the Kakutani model.
Extremum-driven branching random walk.
Applying the bijective map to the collection of subinterval lengths, but retaining the original time indexing, translates the Kakutani process to the following special type of discrete-time branching random walk on . Start with a single particle at the origin. At each step, the leftmost particle is removed, and replaced with two independent offspring, displaced by independent unit-mean exponential random variables relative to the parent. Let be the location of the leftmost, respectively, rightmost particles after branching events. Then and in terms of in the Kakutani process described in Section 1.1. The following is then a consequence of Theorems 1.3 and 1.5.
Corollary 1.8.
There exists such that, with as defined in Proposition 1.1,
[TABLE]
Moreover,
[TABLE]
Other models with branching and rank-dependent dynamics, but a fixed population size, have been studied, motivated, for example, by selective pressures on individuals in an evolving population [7, 5] or on species in an evolving ecosystem [3].
A zero-length slack parking model.
Fix parameters and . Take a sequence of independent random variables, representing the left endpoints of length- cars that successively arrive at the kerb . Each car is allowed to park if and only if (i) its extent is contained in the subset of not already occupied by parked cars, and (ii) the gap in which it parks has length exceeding 1. The process becomes jammed once no gap between neighbouring cars has length exceeding , at which point no more cars can park. Let denote the (random) number of cars parked at jamming.
The version of this model was first studied by Rényi [28], who evaluated the asymptotic parking fraction . The generalization , which introduces some slack around each car, is included in [25, 31], with the asymptotic parking fraction being obtained in [17]. The parking model is part of a wide class of models motivated by various irreversible physical and chemical processes, known more broadly as random sequential adsorption.
When (zero-length cars) the random variable has the interpretation as the number of splitting events in the following procedure: starting with the interval , uniformly split any interval of length exceeding until there is no such interval left. It is then not hard to see that has the same distribution as the number of steps in the Kakutani model until all intervals have length at most . This quantity is described in more detail in Section 1.4 below; specifically, is defined at (1.14). In Theorem 1.10 below we give a Berry–Esseen theorem for as , and hence for as .
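The splitting procedure for zero-length cars can be sketched as follows. This is a minimal illustration under our own naming (`count_splits`, `length` and `threshold` are hypothetical names): it counts the uniform splits needed until no piece is longer than `threshold`. Pieces are processed in arbitrary order, which does not change the count because distinct pieces evolve independently.

```python
import random

def count_splits(length, threshold, rng):
    """Count uniform splits until every piece has length <= threshold."""
    pending = [length]          # pieces that may still need splitting
    splits = 0
    while pending:
        piece = pending.pop()
        if piece > threshold:
            u = rng.random()    # uniform relative split position
            splits += 1
            pending.append(u * piece)
            pending.append((1.0 - u) * piece)
    return splits

rng = random.Random(2)
length, threshold = 50.0, 1.0
samples = [count_splits(length, threshold, rng) for _ in range(400)]
mean_splits = sum(samples) / len(samples)
# Sanity check (our own calculation, not quoted from the paper): with
# t = threshold / length in (0, 1), conditioning on the first split
# gives expected count 2 / t - 1.
print(mean_splits, 2.0 * length / threshold - 1.0)
```

Rescaling all lengths by `1 / length` turns this into the count of Kakutani steps needed until every sub-interval of the unit interval has length at most `threshold / length`, which is the identification used in the text.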
For (the original Rényi model) a CLT for is due to Dvoretzky & Robbins [9], and a Berry–Esseen theorem for was obtained by Schreiber, Penrose & Yukich [30], as a special case of a much more general result, but with factors in the rate that Theorem 1.10 below suggests one might be able to remove, in the one-dimensional case, by adapting the approach of the present paper. More recently, a refined general approach for Berry–Esseen bounds for functions of Poisson point processes, with presumably-optimal rates, has been given by [18], but, as far as we know, has yet to be successfully applied to random sequential adsorption.
1.3 Notation and background
To discuss the earlier work on the Kakutani model, and for later use in the present paper, we need some more notation. At time (that is, after splitting events) the process is represented by the ordered collection of interval end-points
[TABLE]
The associated gap lengths at time are
[TABLE]
The maximal and minimal gap lengths at time are, respectively,
[TABLE]
The dynamics are driven by , a sequence of independent random variables. Let and then, recursively, given
[TABLE]
define (which is uniquely defined). Then set
[TABLE]
The fact that the successive divisions are generated by continuous distributions ensures that, a.s., properties (1.6) and (1.9) persist for all .
Kakutani conjectured in a 1973 lecture, in response to a question from H. Araki (see [32, p. 341], [15, p. 571] and [1, p. 258]) that the empirical distribution of endpoints is asymptotically uniform, or equidistributed, i.e.
[TABLE]
Kakutani [14] obtained an analogous result for a deterministic model in which the splitting distribution is substituted by a fixed parameter for the relative location of the division point. For the -splitting process, establishing (1.11) was posed as a challenge by R. Dudley in 1976 [6, p. 2443]. The result (1.11) was proved by van Zwet in [33] (submitted in February 1977) and, independently, using similar ideas, by Lootgieter [19, 20] (submitted later the same year). Van Zwet [33, p. 137] also acknowledges that Komlós and Tusnády were aware that a proof could be constructed by the same method. A paper of Slud [32], submitted in October 1976, asserted a proof of (1.11), via a rather different approach, but was found to contain an error; Slud later produced a correction. Results on empirical distributions extending (1.11) to a much wider class of interval-division schemes can be found in [22]; see also references therein.
Define, for and ,
[TABLE]
the distribution function of the empirical measure for the normalized gap lengths. Pyke’s uniform limit theorem [26, Thm. 1, p. 161] shows that
[TABLE]
which shows that a typical gap is approximately in distribution. While (1.13) says that there are at most gaps of size bigger than , , Proposition 1.1 says that even the maximum gap will typically be only of order different from .
1.4 Inversion and threshold times
Associated to the Kakutani process are the random times , , defined by
[TABLE]
since , we have for all . Moreover, since a.s. (which follows of course from (1.1), but also via a short elementary argument, as given by Kingman [16, p. 148]), we have . The usefulness of as defined at (1.14) for analyzing is due, firstly, to the inversion relation
[TABLE]
and, secondly, a more readily accessible recursive structure. Indeed, by conditioning on the first split (through the variable ) we obtain the fundamental self-similarity relation
[TABLE]
In (1.16), the processes and are independent of and of each other, and each has the same distribution as . To see that (1.16) is true, observe that to reach time the intervals and undergo independent Kakutani processes (and one can choose to execute all splittings on one side first, for example) but scaled by the relevant length factor, i.e., or . When the identification of in (1.16) with the generating sequence of the process is not relevant, we can write (1.16) in distributional form as
[TABLE]
where, on the right-hand side of (1.17), , and are independent. The central role of the threshold times defined at (1.14) and their associated recursions (1.16)–(1.17) was already identified, independently, by van Zwet [33, p. 134] and Lootgieter [19, p. 404].
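To make the inversion between threshold times and maximal gaps concrete, here is a small sketch that records the largest gap along one simulated run and checks the equivalence on that path. We assume, for the purposes of the example, that the threshold time for level `t` is the first step at which the largest gap is at most `t` (strict versus non-strict inequality is immaterial almost surely); the helper names are ours.

```python
import heapq
import random

def max_gap_path(n_steps, rng):
    """Largest gap after 0, 1, ..., n_steps splits along one run."""
    heap = [-1.0]               # negated lengths: heap[0] is the largest gap
    path = [1.0]
    for _ in range(n_steps):
        largest = -heapq.heappop(heap)
        u = rng.random()
        heapq.heappush(heap, -u * largest)
        heapq.heappush(heap, -(1.0 - u) * largest)
        path.append(-heap[0])
    return path

def threshold_time(path, t):
    """First index n with path[n] <= t, or None if t is not reached."""
    for n, gap in enumerate(path):
        if gap <= t:
            return n
    return None

path = max_gap_path(500, random.Random(1))
t = 0.01
n_t = threshold_time(path, t)
# Inversion: the threshold time is at most n exactly when the largest gap
# after n steps is at most t (the largest gap never increases).
inversion_holds = all((n_t <= n) == (path[n] <= t) for n in range(len(path)))
print(n_t, inversion_holds)
```

The check succeeds on every path because the largest gap is non-increasing in the number of steps, which is all that the inversion relation uses.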
The CLT for , Proposition 1.1, was obtained by Pyke & van Zwet [27] via the inversion (1.15) from a corresponding CLT for .
Proposition 1.9 (Pyke & van Zwet, 2004).
Let be as in Proposition 1.1. As , converges to the normal distribution with mean $0$ and variance .
In a similar way, we will obtain our quantitative CLT, Theorem 1.3, via an inversion of a corresponding Berry–Esseen result for .
Theorem 1.10.
There exists a constant such that, for all ,
[TABLE]
where is as defined in Proposition 1.1.
Remark 1.11.
Proposition 1.9 is Corollary 3.3 in [27, p. 396]. By Kingman’s embedding (see Section 1.2), it can also be recovered from the earlier Theorem 2 of [31]. See also [13, pp. 435–6] for a framing in terms of general CLTs for the height of random fragmentation trees; we do not know of any Berry–Esseen results in that context.
1.5 Overview of the proofs and some further remarks
Overview of the proofs.
The main work of the paper is proving Theorem 1.10. Theorem 1.3 will be deduced from a careful (but not difficult) inversion of Theorem 1.10, and Theorem 1.5 from some results established in the course of the proof of Theorem 1.10. The proof of Theorem 1.10 is in part analytical, with some delicate estimates needed to obtain our presumably-optimal rates, and the overall structure is perhaps of interest more broadly, and is broken down into the following main steps.
- The first step in the main line of the proof is to apply the classical Berry–Esseen theorem to prove (in Section 4) a conditional Berry–Esseen theorem (Proposition 4.1) for given the first steps of the process, where , ensuring that can be expressed as a sum of independent variables, using the basic self-similarity (1.17). Eventually, we will take for a small .
- The centering and scaling quantities in the conditional Berry–Esseen bound are themselves random variables (functions of ), related to conditional means and variances, denoted by and defined at (4.2) and (4.7) below. To “uncondition” the bound, we need detailed information about the joint distribution of and , summarized in Proposition 5.1 on their mixed moments. This is proved in Section 5, with groundwork laid in Sections 3 and 4 and making use of auxiliary results stated in Appendix A.
- To study the joint distribution of and , we exploit the fact that both can be expressed as sum-type statistics of small gaps, which enjoy a crucial conditional independence structure that we clarify in Section 3. This study of small gaps will also lead to a short proof of Theorem 1.5, given in Section 3.2.
- The “unconditioning” is achieved by a sort of Hermite–Edgeworth expansion, stated in Proposition 6.1. Combining this expansion with Proposition 5.1, which controls the remaining error terms, leads to the proof of Theorem 1.10, given in Section 6.
- To prepare for all of the above, we first collect some results (some known, some new, making use of ideas from [27]) on moments of and in Section 2.
Following the proofs of our main theorems, the proofs of Corollaries 1.7 and 1.8 are given in Section 6.
Comparison to the Dirichlet process and uniform spacings.
A natural comparator to the Kakutani problem is the Dirichlet partition of the interval generated by random variables, which can also be generated sequentially, like the Kakutani process, but one splits an interval chosen at random with probability proportional to its length (rather than always the longest). Denote the maximal spacing in the Dirichlet process after divisions by . A result of Lévy from 1939 shows that has a Gumbel limit, and Slud [32] showed that , a.s., as .
Let denote the length of the smallest gap in the -division Dirichlet process. A direct calculation shows that , , from which one can show
[TABLE]
and this bound is of the optimal order in . It follows from (1.19) that converges to a unit-mean exponential, roughly half the smallest gap in the Kakutani process. Some intuition for this comes from observing that in the Dirichlet process, one is typically splitting a gap of length on average half the size of , the length split in the Kakutani process.
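For the Dirichlet (uniform spacings) comparison, the law of the smallest spacing is available in closed form, which makes the exponential limit easy to inspect numerically. The sketch below assumes the standard indexing in which $n$ independent uniform points produce $n+1$ spacings, so that the probability that the smallest spacing exceeds $x$ is $(1-(n+1)x)_+^n$; the indexing used in the paper for the Dirichlet process may differ by one, and the function name is ours.

```python
import math

def min_spacing_survival(n, x):
    """P(smallest of the n+1 spacings of n uniform points exceeds x)."""
    return max(0.0, 1.0 - (n + 1) * x) ** n

n = 200
rows = []
for y in (0.5, 1.0, 2.0):
    exact = min_spacing_survival(n, y / n**2)   # smallest spacing on scale n^-2
    rows.append((y, exact, math.exp(-y)))       # unit-mean exponential tail
    print(y, exact, math.exp(-y))
```

On the scale $n^{-2}$ the exact survival function is already close to $e^{-y}$ for moderate $n$, with a discrepancy of relative order $1/n$ in this indexing.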
Other order statistics.
We expect that one can obtain some information about the length of near-maximal gaps, or near-minimal gaps, using our method and some extra work. It would be of interest to obtain results for more general order statistics of spacings, but it is not clear to us how to do this.
2 Means, variances, and moment bounds
In this section we study moments of the random variables and . For , let and . Since , a.s., for , we have for all , so of interest is only . The following exact results are known.
Proposition 2.1 (Lootgieter, 1977; van Zwet, 1978).
It holds that
[TABLE]
Moreover, with , it holds that
[TABLE]
Remark 2.2.
Proposition 2.1 is due, independently, to van Zwet [33] (submitted February 1977) and Lootgieter [20] (September 1977). Proposition 1.2 of [20, p. 396] covers both results for and , while [33] has the result for , and showed that for , but had not evaluated . Both proofs go by an analysis of the recursion (1.17). The full result for was rediscovered by Pyke & van Zwet [27, p. 392]. Furthermore, coincides with the quantity in [31], which satisfies the same integral equation, and then (2.2) can be found in [31, pp. 75, 83].
To prepare for our later arguments, we build on analysis of [27] to state some estimates for (higher) moments of (Lemma 2.3) and for moments of (Lemma 2.4). The intuition in both cases is that the variables are concentrated about their respective means, namely (for small ) and (for large ). The following rough, but useful, upper bounds on the moments of are derived directly from results in [27].
Lemma 2.3.
For each there exists with , for all .
Proof.
Let . It follows from (1.14) that for , and for all , so that is non-increasing in . Lemma 2.1 of [27, p. 385] shows that, for every ,
[TABLE]
Moreover, Theorem 2.2 of [27, p. 386] and the algebra relating cumulants to moments [24, pp. 266–7] show that there is a constant such that
[TABLE]
Combining (2.4) with the case of (2.3), we verify the statement in the lemma. ∎
Next are bounds on the moments of .
Lemma 2.4.
For each there exists with , for all .
Proof.
Let . By Lemma 2.3 and Markov’s inequality, for and ,
[TABLE]
From the integration by parts formula for moments [10, p. 75], combined with (1.15) and the fact that for all , we obtain
[TABLE]
Hence from (2) and (2.5), for ,
[TABLE]
which yields the claimed bound, with . ∎
We turn to centred moments of ; the intuition here, from the CLT in Proposition 1.1, is that is tight. The precise statement that we need is the following, which exploits some further ideas from Pyke & van Zwet [27].
Lemma 2.5.
There is a constant such that
[TABLE]
Proof.
Since , a.s., it holds that satisfies
[TABLE]
By (2.8) we see that , a.s. Then, by Markov’s inequality and the fact that from Lemma 2.4, we obtain
[TABLE]
Moreover, since , a.s., it follows that
[TABLE]
Next we follow [27, pp. 400–1]. Let ; recall that and, from (2.1), that . From the inversion relation (1.15) and Chebyshev’s inequality,
[TABLE]
Similarly,
[TABLE]
In particular, taking , which, for has for all , the formula (with ) from Proposition 2.1 yields
[TABLE]
Similarly, taking , for , we get, for ,
[TABLE]
If and , then , so we obtain
[TABLE]
Summing (2) and (2.11), we conclude that, for all ,
[TABLE]
For a random variable and a constant , we have (e.g. [10, p. 75])
[TABLE]
which, applied with and , yields
[TABLE]
Combining the above bound with (2) completes the proof. ∎
The last result of this section is more technical in nature, concerning moments of harmonic sums of the ; it plays an important role in the sections below.
Corollary 2.6.
There is a constant such that, for all and all ,
[TABLE]
Proof.
Define . Since , note that , a.s. Let , and observe that, for every ,
[TABLE]
using the bounds and . By Lemma 2.5, we thus obtain
[TABLE]
where . Using (2.13) and the triangle inequality,
[TABLE]
An induction on using the above relation, and the fact that , then shows that , for all . This yields (2.12). ∎
3 Small-gap statistics
3.1 Conditional independence structure and moments
To obtain Theorem 1.5 on the smallest gap, it is not surprising that we investigate the count of small gaps and use Poisson approximation. However, our approach to studying the fine fluctuations of the largest gap turns out to make essential use of more detailed information about small gaps, and the primary focus of this section is to present this detailed information. Later in this section we will then present the proof of Theorem 1.5, the main ingredient being Corollary 3.2 that we state shortly.
Of course a typical gap has length about , and is of the same order (about ; see (1.1) and Proposition 1.1); on the other hand, as Theorem 1.5 advertises, one expects to see small gaps all the way down to size around . The results in this section will give more information on small gaps, including those with lengths .
For , , and , define the statistic
[TABLE]
In the case where , then is a counting function; we use the particular notation , in that case. That is, for and ,
[TABLE]
the number of gaps of size in . Write . Some intuition for these quantities is provided by Pyke’s uniform limit theorem (1.13), a consequence of which is that, for bounded and measurable, and ,
[TABLE]
Another consequence of (1.13) is
[TABLE]
There are also second-order (fluctuation) results that complement (3.3)–(3.4), provided by [27]. However, these results are targeted at typical gaps, and are of limited value concerning when . Indeed, roughly speaking, the asymptotic (3.4) says that, , a.s., but for it is the term that dominates.
The aim of this section is to provide a sharper study of small gaps, which will allow us to conclude, for example, that even when (see Lemma 3.3 below). The additional structure we need is provided by the following important conditional independence result. In particular, representation (3.6) shows that can be represented as a sum of a random number of terms involving independent random variables.
Lemma 3.1.
Suppose that . There exist random variables and , such that (i) are i.i.d. random variables, independent of the ; (ii) given , the are independent with ; (iii) for every and every for which , we have the representation
[TABLE]
In other words, for fixed and with , we can write
[TABLE]
where are i.i.d. , independent of .
Proof.
Fix and with . Since and , a.s., we have for all . Hence splitting the interval of length () can never remove a gap of length in , and can create precisely zero or one gap of length in , and hence increase according to
[TABLE]
recalling from (1.10) that is the relative location of the split point in the maximal interval. Note that, since , we have , so that intervals and are disjoint, each of length . Moreover, conditional on , has the distribution; similarly for given . Thus we obtain the claimed representation (3.5) on setting
[TABLE]
for a sequence of i.i.d. random variables (merely to ensure that has the correct distribution even if ).
The second expression (3.6) is obtained by ignoring the terms in (3.5) where and re-labelling so that where . The number of non-zero terms is exactly , and the independence structure in (3.6) means that is independent of the in (3.6). ∎
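The mechanism in this proof can be checked on a simulated path. The sketch below assumes that `x` is smaller than half of every interval that gets split (our reading of the role of the hypothesis in Lemma 3.1), so that each split creates at most one gap of length at most `x` and never destroys one. It tracks the running count of such gaps together with the sum of the conditional creation probabilities `2 * x / largest`; all names are ours.

```python
import heapq
import random

def small_gap_run(n_steps, x, rng):
    """Run the splitting process, tracking gaps of length <= x as they appear."""
    heap = [-1.0]               # negated lengths: heap[0] is the largest gap
    count = 0                   # running number of gaps of length <= x
    compensator = 0.0           # sum of conditional creation probabilities
    most_created = 0            # most small gaps created by any single split
    for _ in range(n_steps):
        largest = -heapq.heappop(heap)
        if largest <= 2.0 * x:
            raise ValueError("x is too large for this many steps")
        compensator += 2.0 * x / largest
        u = rng.random()
        left, right = u * largest, (1.0 - u) * largest
        created = (left <= x) + (right <= x)   # 0 or 1 when largest > 2x
        most_created = max(most_created, created)
        count += created
        heapq.heappush(heap, -left)
        heapq.heappush(heap, -right)
    return count, compensator, most_created, [-g for g in heap]

x = 1e-5
count, compensator, most_created, gaps = small_gap_run(2000, x, random.Random(3))
direct = sum(1 for g in gaps if g <= x)   # count read off the final partition
print(count, direct, round(compensator, 2), most_created)
```

The running count agrees with the count read off the final partition because small gaps, once created, are never selected for splitting while the largest gap exceeds `2 * x`.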
Taking in (3.5) gives the following useful fact.
Corollary 3.2.
Suppose that and satisfy . Then
[TABLE]
where, given , the are independent with .
Corollary 3.2 is the basis for the proof of Theorem 1.5, which uses Poisson approximation and is presented later in this section. For Theorem 1.3 we need to further develop analysis of . The following result gives asymptotics for the moments of , which will be a key ingredient in the subsequent arguments. Part (i) gives an upper bound valid for a broad range of the parameters, part (ii) gives sharp asymptotics for a more restrictive range of parameters, and part (iii) gives a tail bound.
Lemma 3.3.
Suppose that . Then the following hold.
- (i) For every , , and with ,
[TABLE]
- (ii) Let . There exist constants and such that, for all ,
[TABLE]
- (iii) For every with , we have .
Proof.
Recall the representation for from Corollary 3.2, and that for , since , a.s. Consequently, the moment generating function of is dominated by that of a random variable. Hence (see [2, §3]) we may apply the tail bound from Theorem 1 of [2], which yields part (i).
Next we prove part (ii). For let . By (3.7),
[TABLE]
Let be the set of all for which the coordinates are distinct. Then contains elements, while contains elements. For , every element of contains at least one pair of the coordinates that match, so
[TABLE]
when , clearly . Now, from (3.10),
[TABLE]
For the first term on the right-hand side of (3.12), by conditional independence,
[TABLE]
We bound the error between the sum on the right of (3.13) and the quantity
[TABLE]
from Corollary 2.6. Indeed, using (3.11) and the fact that for all ,
[TABLE]
Thus, taking expectations in (3.13), we obtain
[TABLE]
Then from Corollary 2.6, there is a (depending on ) such that, for all ,
[TABLE]
Fix with (recall that ). Then , and
[TABLE]
So we conclude that, for a constant , for all ,
[TABLE]
Similarly to (3.11), using the fact that the are -valued,
[TABLE]
using the cases of (3.10) and (3.8). Since for , and , we have that is uniformly bounded. Hence, combining the preceding display and (3.15) with (3.12), we obtain, for some and all with ,
[TABLE]
Since , another application of (3.14) yields (3.9), completing the proof of (ii).
Finally, we prove (iii). Let be the -algebra generated by the first divisions; is the trivial -algebra. Fix with . Note by (3.7). Let and, for , and set for . Then is -measurable, , and, provided ,
[TABLE]
Thus is a martingale. Moreover, and for , and so , a.s. We apply a one-sided Azuma–Hoeffding inequality (see e.g. [23, p. 46]) to obtain, for all , . Since , a.s.,
[TABLE]
which yields the tail bound in part (iii). ∎
3.2 Limit theorem for the smallest gap
In this section we use some standard Poisson approximation bounds, the representation given in Corollary 3.2 for counts of small gaps, defined at (3.2), and the reciprocal moments bounds in Corollary 2.6, to give a proof of Theorem 1.5 on the asymptotics of the smallest gap, defined at (1.8).
Recall from Corollary 3.2 that , where the are supported on , are conditionally independent given , and satisfy . Throughout this section we let denote a positive, finite constant which is independent of and and whose value may vary from line to line.
Proof of Theorem 1.5.
For non-negative, integer-valued random variables and , the corresponding total variation distance is denoted by
[TABLE]
Recall a classic bound of Le Cam [8] (see also [4, p. 3]): letting be independent Bernoulli random variables with , the total variation distance (denoted below by ) between and a random variable is bounded by . Let have a mixed Poisson distribution; that is, conditional on , the random variable has a Poisson distribution with this parameter. A conditioning argument combined with Le Cam’s result says that
[TABLE]
using the fact that , a.s., for all . We may now approximate by . By Theorem 1.C(i) of [4] we have that
[TABLE]
for some , where the final inequality follows from Corollary 2.6. Hence, by the triangle inequality there exists such that
[TABLE]
Then, we choose for some and note that
[TABLE]
to obtain that there exists such that
[TABLE]
which immediately gives us that
[TABLE]
for any . For we write
[TABLE]
By (3.16), the first term in this final maximum is at most , and thus (3.17) also holds for these values of and for a suitable choice of . ∎
4 Conditional Berry–Esseen bounds
The starting point of our proof of Theorem 1.10 is a decomposition of into a sum of independent, self-similar contributions, obtained by considering the evolution of the process subsequent to time . Fix and . Since , . Extending (1.16) gives the representation, for independent copies of , independent of gap lengths (recall that ),
[TABLE]
see e.g. Proposition 1.1 of [19] or [20]. As a starting-point for proving (non-quantitative) CLTs, there is some similarity between (4.1) and the approach of Dvoretzky & Robbins [9] in their proof of the CLT for Rényi’s parking model (see also Section 1.2).
Recall that defines the filtration to which the Kakutani process is adapted. For and , define
[TABLE]
We will use the classical Berry–Esseen theorem to obtain the following conditional Berry–Esseen estimate; note that in (4.4) not only is the probability conditional on , but so are the centering and scaling quantities and .
Lemma 4.1.
There is a constant such that, for all and all ,
[TABLE]
Proof.
Fix and . Conditional on , the summands in the expression given in (4.1) for are independent (although not identically distributed). Denoting
[TABLE]
the Berry–Esseen theorem for sums of independent random variables with finite third moments (see Theorem 7.6.2 of [10, p. 356]) yields, for an absolute constant ,
[TABLE]
Using the elementary inequality , , we have
[TABLE]
for constant , from (2.1) and Lemma 2.3. Since , it follows that
[TABLE]
On the other hand, by (2.2), provided that ,
[TABLE]
Using (4.6) and the preceding bound for in (4.5) yields (4.4). ∎
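To illustrate the classical Berry–Esseen theorem for independent, non-identically distributed summands that drives the proof above, the following sketch computes the exact Kolmogorov distance between a standardized sum of independent Bernoulli variables and the standard Gaussian, and compares it with the Lyapunov ratio (sum of third absolute central moments divided by the 3/2 power of the total variance). The Bernoulli summands are a stand-in, not the summands of (4.1), and the absolute constant is taken to be 1 purely as a loose illustrative value.

```python
import math


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def poisson_binomial_pmf(ps):
    """Mass function of a sum of independent Bernoulli(p) variables, p in ps."""
    pmf = [1.0]
    for p in ps:
        nxt = [0.0] * (len(pmf) + 1)
        for k, mass in enumerate(pmf):
            nxt[k] += mass * (1.0 - p)
            nxt[k + 1] += mass * p
        pmf = nxt
    return pmf


def kolmogorov_distance_to_normal(ps):
    """sup_x |P((S - mean)/sd <= x) - Phi(x)| for S the Bernoulli sum."""
    mu = sum(ps)
    sigma = math.sqrt(sum(p * (1.0 - p) for p in ps))
    dist, cdf_left = 0.0, 0.0
    for k, mass in enumerate(poisson_binomial_pmf(ps)):
        phi = std_normal_cdf((k - mu) / sigma)
        cdf_right = cdf_left + mass
        # The supremum is attained at a jump: compare both one-sided limits.
        dist = max(dist, abs(cdf_left - phi), abs(cdf_right - phi))
        cdf_left = cdf_right
    return dist


def lyapunov_ratio(ps):
    """Sum of E|X_i - p_i|^3 divided by (sum of variances)^(3/2)."""
    third = sum(p * (1.0 - p) * ((1.0 - p) ** 2 + p ** 2) for p in ps)
    var = sum(p * (1.0 - p) for p in ps)
    return third / var ** 1.5


ps = [0.2 + 0.6 * i / 39 for i in range(40)]  # arbitrary, non-identical parameters
dist = kolmogorov_distance_to_normal(ps)
ratio = lyapunov_ratio(ps)
```

Comparing `dist` with `ratio` shows the slack in the bound for this example; in Lemma 4.1 the same comparison is made conditionally, with the moments of the summands controlled via (2.1), (2.2) and Lemma 2.3.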
To deduce Theorem 1.10 starting from Lemma 4.1, we need to examine the quantities and that appear as centering and scaling in (4.4). To do so, we define
[TABLE]
where is defined at (4.3) and is given by (2.2). A significant part of the remaining technical work of the paper is to obtain good asymptotic estimates for mixed moments of and (see Section 5). To facilitate this we derive, in the rest of the present section, basic properties of and , and crucial representations for and in terms of small-gap statistics as described in Section 3.
Lemma 4.2.
Suppose that and , and define and by (4.2) and (4.7). Then , and the following hold:
[TABLE]
Proof.
Clearly, by (4.2). Since by (4.1), for , , and hence, by (4.7) and the fact that ,
[TABLE]
By the (conditional) total variance formula, using (4.2) and (4.7),
[TABLE]
Comparison with (4.10) yields (4.8). Finally, the case of (4.8) yields (4.9). ∎
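The total variance formula used in this proof states that the variance of a random variable equals the expected conditional variance plus the variance of the conditional mean. A small exact check on a toy joint distribution is sketched below; the distribution and all names are hypothetical and unrelated to the quantities defined at (4.2) and (4.7).

```python
from fractions import Fraction

# Joint law of (G, X): G plays the role of the conditioning information,
# X is the variable whose variance is decomposed. Values are arbitrary.
joint = {
    (0, 1): Fraction(1, 8), (0, 3): Fraction(3, 8),
    (1, 2): Fraction(1, 4), (1, 6): Fraction(1, 4),
}


def variance(joint):
    """Var(X) computed directly from the joint law."""
    m1 = sum(x * p for (_, x), p in joint.items())
    m2 = sum(x * x * p for (_, x), p in joint.items())
    return m2 - m1 * m1


def total_variance_terms(joint):
    """Return (E[Var(X | G)], Var(E[X | G]))."""
    p_g = {}
    for (g, _), p in joint.items():
        p_g[g] = p_g.get(g, 0) + p
    cond_mean, cond_var = {}, {}
    for g, pg in p_g.items():
        m1 = sum(x * p for (h, x), p in joint.items() if h == g) / pg
        m2 = sum(x * x * p for (h, x), p in joint.items() if h == g) / pg
        cond_mean[g], cond_var[g] = m1, m2 - m1 * m1
    expected_cond_var = sum(p_g[g] * cond_var[g] for g in p_g)
    overall_mean = sum(p_g[g] * cond_mean[g] for g in p_g)
    var_cond_mean = sum(p_g[g] * (cond_mean[g] - overall_mean) ** 2 for g in p_g)
    return expected_cond_var, var_cond_mean


expected_cond_var, var_cond_mean = total_variance_terms(joint)
```

Exact rational arithmetic is used so that the two sides of the identity agree exactly rather than up to rounding.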
Recall from (2.2) that , , where . Set
[TABLE]
The next result includes a representation for via two sum statistics of the form (3.1).
Lemma 4.3.
Suppose that and . Then, with defined at (4.11),
[TABLE]
Moreover, with defined at (3.2), whenever it holds that
[TABLE]
Proof.
From Proposition 2.1 and the definition of from (4.11), we see that
[TABLE]
The function as defined in (4.11) satisfies , and so
[TABLE]
Now let . We have from (4.1) and conditional independence that
[TABLE]
Hence, from (4.7) and (4.14), since ,
[TABLE]
which yields (4.12). If also , we may apply (4.6) in (4.16) to obtain , giving the first bound in (4.13). By (4.12), (4.15), and (3.2) we get , which gives the second bound in (4.13). ∎
Remark 4.4.
Consider , so . From (3.3), it follows that, for ,
[TABLE]
which, using the formula from (4.11) to evaluate the integral, takes the (negative) value
[TABLE]
Thus (4.12) says that we should expect to be genuinely of order .
Next, we show that can be represented as a sum statistic of the form (3.1).
Lemma 4.5.
Let and take . Then defined by (4.2) satisfies
[TABLE]
for , given by (3.1) with , and defined at (3.2).
Proof.
Taking conditional expectations in (4.1) and using (2.1), we obtain
[TABLE]
using and . Thus for we identify from (3.1) that , and since , we verify (4.17) using (3.2). ∎
Remark 4.6.
The bound from (4.17) shows that with high probability. This bound is poor, since (4.9) and Lemma 4.3 say , and , so one expects to be around . Indeed, if , the fluctuation results of Pyke & van Zwet (Theorem 6.2 of [27]) show that has a Gaussian limit. However, when this result says only that in probability. Proposition 5.1 below includes moments asymptotics that address these points, giving finer control on the asymptotics of for a broader range of .
5 Conditional means, variances, and their moments
The aim of this section is to establish the following asymptotics on the mixed moments of and defined at (4.2) and (4.7) respectively. The result is in two parts, depending on the parity of the exponent of ; recall that .
Proposition 5.1.
Suppose that . Then the following hold:
- (i)
There exist constants and such that, for all , all , and all and with ,
[TABLE]
- (ii)
There exist constants and such that, for all , all , and all and with ,
[TABLE]
We give the proof of Proposition 5.1 later in this section. Lemma 3.1, on statistics of small gaps, combined with Lemmas 4.5 and 4.3 which represent, respectively, and in terms of small-gap functionals, enables us to represent and in the form
[TABLE]
where , , and, conditional on the and , the random variables and are all mutually independent. Write
[TABLE]
Then (5.3) is equivalent to
[TABLE]
The expressions for the moments of from Lemma 3.3, with the representation (5.5) and the associated conditional independence structure, made explicit in Lemma 5.3 below, are our starting point for the proof of Proposition 5.1.
Remark 5.2.
With as defined at (4.11), and , some calculus shows that the expectation of each summand appearing in (5.4) is
[TABLE]
which, by comparison with the formula for from Proposition 2.1, shows that
[TABLE]
Thus from the representation (5.5) we confirm that (as is clear from (4.2)) and
[TABLE]
Moreover, it also follows from (5.5) that
[TABLE]
so the relation (5.7) recovers , as at (4.9).
The next result summarizes the structure of the components of expressed in (5.5).
Lemma 5.3.
Let and . Conditional on , the random variables , , have the representation
[TABLE]
where , the , and are all independent.
Proof.
The representation for from (5.5) (coming from Lemma 4.5 and Lemma 3.1) together with the definition of from (5.4) gives
[TABLE]
where, given and , the are all mutually independent with and . By (3.2), and, with the notation from (3.1), with . Hence Lemma 3.1 shows that, given , . ∎
We will use Lemma 5.3 to obtain, in Lemma 5.4 below, estimates for the mixed moments of , , and . In the proof, we will make use of auxiliary results stated in Appendix A below, including estimates of moments of random sums, like those appearing in the triple representation in Lemma 5.3, given in Lemma A.1.
Lemma 5.4.
- (i)
Suppose that and . Then, for ,
[TABLE]
- (ii)
There is a constant such that for all , all , and all with ,
[TABLE]
- (iii)
Let . There are constants and such that, for all , for all , and for all with ,
[TABLE]
In the following proof, and frequently later on, we will need simple inequalities relating and derived from
[TABLE]
Proof of Lemma 5.4.
Lemma 5.3 shows that and have the same distribution, and this yields (5.8), and hence proves part (i).
For parts (ii)–(iii), denote, similarly to Lemma A.1, the moments
[TABLE]
for independent sequences and , . By Lemma 5.3,
[TABLE]
Since , by (5.4), and by (3.2), so
[TABLE]
since . By Lemma A.1(i) applied with , noting that for , there is a constant such that, for all and all ,
[TABLE]
by (5.11). From (5.12)–(5.13) there is such that, for all and all ,
[TABLE]
Now using Lemma 3.3(i), we verify part (ii).
Finally, for part (iii), suppose . From Lemma 5.3, with ,
[TABLE]
Here, by Lemma A.1(ii) applied with , , and the fact that and , by (5.6), for all and all ,
[TABLE]
Combining this bound with (5.13), using , we obtain, for some and all and all ,
[TABLE]
Using the bound (5.15) in (5.14), we get, for ,
[TABLE]
by an application of Lemma A.2. Another application of Lemma A.2 shows that
[TABLE]
It follows that
[TABLE]
by Lemma 3.3(i), provided that . For , take as in Lemma 3.3(ii). Assuming that and , we have that, indeed, for all large enough. Moreover, applying Lemma 3.3(ii) to estimate , we obtain (5.10), noting that , so that the term in the error coming from that lemma is negligible compared to . This proves part (iii). ∎
Proof of Proposition 5.1.
Let and as in Lemma 5.4(iii). Suppose with . We can use (5.5) and a trinomial expansion to write
[TABLE]
where is defined at (5.4), and the sum is over whose sum is equal to . Taking expectations in (5.16) and applying (5.8), we obtain
[TABLE]
Provided , we have and so where for all but finitely many . Hence the hypotheses of parts (ii) and (iii) of Lemma 5.4 are both satisfied when considering .
First suppose that ; note that in this case. In the sum in (5.17), we show that the terms with are dominant. To this end, consider
[TABLE]
using the upper bound from Lemma 5.4(ii). Since , using (5.11),
[TABLE]
since for . Now
[TABLE]
using the fact that . Thus, for some and all with ,
[TABLE]
On the other hand, from Lemma 5.4(iii) it follows that
[TABLE]
Furthermore,
[TABLE]
using the relation (5.7) between and . Combined with (5.18), we verify (5.1).
Finally, suppose that . Then if we have and (5.2) is trivial, so suppose that ; note that in this case. In the sum in (5.17), we show that now the terms with are dominant. Now, by (5.17),
[TABLE]
using the upper bound from Lemma 5.4(ii). Now following a similar argument to that leading to (5.18) we obtain, for some and all with ,
[TABLE]
On the other hand, from Lemma 5.4(iii) it follows that
[TABLE]
Furthermore,
[TABLE]
using (5.7). Combined with (5.19), we verify (5.2). ∎
6 Completing the proofs of the main theorems
In this section we combine the ingredients developed so far with an expansion in Hermite polynomials to prove our quantitative CLTs for (Theorem 1.10) and (Theorem 1.3). A key intermediate result is Proposition 6.1 below. Recall the definitions of , and from (2.2), (4.2) and (4.7), respectively, and for define
[TABLE]
The following result reduces the proof of Theorem 1.10 to controlling (sums of) the quantities from (6.1). Let denote the Euler gamma function, so , .
Proposition 6.1.
Let , and let be as in Proposition 5.1. Then there exist constants and such that, for all and such that ,
[TABLE]
The main additional element to Proposition 6.1 is a sort of Hermite–Edgeworth expansion. Denote by the Hermite polynomial of degree , which satisfies
[TABLE]
where is the standard Gaussian density: see e.g. [29, §20.2]. We will need the following inequality from [21, p. 78]:
[TABLE]
note that shows this bound is not far from optimal.
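For concreteness, the Hermite polynomials can be generated by a three-term recurrence. The sketch below assumes the probabilists' normalization, in which the polynomial of degree k arises from the k-th derivative of the standard Gaussian density (up to the sign (-1)^k), so that the degree-(k+1) polynomial equals x times the degree-k polynomial minus k times the degree-(k-1) polynomial. If the intended convention differs, the coefficients change accordingly. The function name is hypothetical.

```python
def hermite_coeffs(k):
    """Coefficients (lowest degree first) of the probabilists' Hermite polynomial of degree k."""
    prev, cur = [0], [1]  # degree -1 is taken to be 0; degree 0 is the constant 1
    for j in range(k):
        # Recurrence: He_{j+1}(x) = x * He_j(x) - j * He_{j-1}(x).
        shifted = [0] + cur  # multiplication by x
        scaled = [j * c for c in prev] + [0] * (len(shifted) - len(prev))
        prev, cur = cur, [a - b for a, b in zip(shifted, scaled)]
    return cur


def derivative(coeffs):
    """Coefficients of the derivative of a polynomial given lowest degree first."""
    return [i * c for i, c in enumerate(coeffs)][1:]


he3 = hermite_coeffs(3)  # x^3 - 3x
he4 = hermite_coeffs(4)  # x^4 - 6x^2 + 3
```

Under this normalization the derivative of the degree-k polynomial is k times the degree-(k-1) polynomial, which `derivative` can be used to confirm for small k.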
Lemma 6.2.
Let . Then, for all and all ,
[TABLE]
Proof.
For and , let ; then
[TABLE]
Let denote the real part of . The random variable has characteristic function
[TABLE]
Using the Taylor series for the exponential function with complex argument
[TABLE]
we obtain, for ,
[TABLE]
A standard inversion formula (see e.g. Theorem 3.2.1 of [21, p. 31]) gives
[TABLE]
Using the same formula for the standard Gaussian characteristic function shows that
[TABLE]
Hence, by (6.5),
[TABLE]
provided . Now, using the binomial theorem to expand ,
[TABLE]
For we have the equality
[TABLE]
Hence,
[TABLE]
So we obtain
[TABLE]
Thus we conclude that
[TABLE]
The stated result now follows by re-expressing the sums over and . ∎
Proof of Proposition 6.1.
Let and . Recall and from (4.2) and (4.3). We apply the conditional Berry–Esseen result, Lemma 4.1, which has random centering and scaling, to obtain, for deterministic centering and scaling,
[TABLE]
for -measurable random variables satisfying , a.s., by (4.4). Recall that for , by (2.2). Define
[TABLE]
where is defined at (4.7). Thus we can re-write the preceding bound as
[TABLE]
Note that, by (6.6) with the first inequality in (4.13) and our choice , we have . Moreover, for we may express as defined at (6.1) in terms of and defined at (6.6) via
[TABLE]
Then we apply Lemma 6.2 with and to give
[TABLE]
Taking expectations in the above display and using the bound (6.4), we obtain
[TABLE]
Hence we take expectations in (6.7), noting that by Lemma 2.4, to get
[TABLE]
Now suppose , and take as specified in Proposition 5.1. From Proposition 5.1(i) with and , we have
[TABLE]
since . On the other hand, . For , we have from Proposition 5.1(i) with and ,
[TABLE]
Using Stirling’s formula, , and the fact that , it follows that
[TABLE]
For , using Proposition 5.1(i) (again) with and , plus Jensen’s inequality, leads to the same conclusion. Thus from (6) we get
[TABLE]
provided . Taking and , we have for all large enough; thus provided we have that . Then from (6.10) we conclude (6.2). ∎
With Proposition 6.1 in hand, the remaining task in the proof of Theorem 1.10 is to bound from (6.1) and hence control the sum on the right-hand side of (6.2). This is the purpose of the next result, which needs the full strength of Proposition 5.1.
Lemma 6.3.
Suppose that , and let be as in Proposition 6.1.
- (i)
There exists such that, for all , all , and all with ,
[TABLE]
- (ii)
It holds that . Moreover, there exists such that, for all , all , and all with ,
[TABLE]
Proof.
Suppose that , and let be as in Proposition 6.1. Take and . Taking expectations in (6.1) and using (2.2), we have
[TABLE]
The idea is to apply Proposition 5.1 with and , so that for all large enough. For part (i), suppose that . Then , and so from Proposition 5.1(i) we get
[TABLE]
where in the bound we used , as follows from (5.11). Here
[TABLE]
and, similarly,
[TABLE]
From here we get (6.11). For part (ii), suppose is even. Then is odd, and so from Proposition 5.1(ii) we get
[TABLE]
using , by (5.11). Now
[TABLE]
and from here we obtain (6.12). ∎
Proof of Theorem 1.10.
Let be as in Proposition 5.1. Proposition 6.1 shows that, for constants and , for all and such that , say, the bound (6.2) holds. In particular, taking for which , the bound (6.2) holds whenever and is sufficiently large. For , we have for all large enough, and so Lemma 6.3, with , shows that
[TABLE]
Hence from (6.2) we get
[TABLE]
by our choice of and of . ∎
Proof of Theorem 1.3.
Consider the interval . We claim that it is a consequence of Theorem 1.10 and the relation from (1.15), which holds for all and all , that there exists such that
[TABLE]
Assuming (for now) that (6.13) holds, we extend the bound over all using monotonicity and comparison to the small Gaussian tails. Indeed, we have from (6.13) that
[TABLE]
by standard tail bounds for , and similarly for the positive tail. Then
[TABLE]
by (6.14). A similar argument applies to the values of , noting that
[TABLE]
where . Thus Theorem 1.3 follows from (6.13). It remains to verify the claim (6.13), which we deduce from a careful inversion of Theorem 1.10.
For and , define
[TABLE]
Observe that for . It follows from (1.15) that
[TABLE]
Also observe that
[TABLE]
the term in brackets in (6.16) is positive for , so has the same sign as , and indeed
[TABLE]
We have, by Theorem 1.10, there exists such that, for all and all ,
[TABLE]
Thus, by (6.15) and the fact that for , we get
[TABLE]
It remains to compare to . Write (as previously) for the standard Gaussian density. By the mean value theorem, there exists such that
[TABLE]
Here, for ,
[TABLE]
by (6.17). Moreover, from (6.16) we get for all and some constant . Hence there is a constant such that
[TABLE]
Combining (6.18) and (6.20) we verify (6.13). ∎
Finally, we give the proofs of the corollaries from Section 1.2.
Proof of Corollary 1.7.
This is direct from Theorem 1.10 since, as explained in Section 1.2, Kingman’s embedding gives , . ∎
Proof of Corollary 1.8.
Recalling the identification , we have, for ,
[TABLE]
where
[TABLE]
It follows that, for some , , for all and all . Hence there exists such that for all , and, in particular, and . Consequently, from (6),
[TABLE]
by Theorem 1.3; similarly for . On the other hand, by (6) and Theorem 1.3, there exists such that, for all and all ,
[TABLE]
By the mean value theorem, similarly to (6.19), for all ,
[TABLE]
which completes the proof of (1.4). The proof of (1.5) is direct from the fact that
[TABLE]
and applying Theorem 1.5. ∎
Appendix A Moments of partial sums
We give two auxiliary results needed in the proof of Lemma 5.4. While we suspect that neither result is new, we have not been able to locate a reference. Lemma A.1 gives first-order expansions of moments of sums of i.i.d. summands, with quantitative error bounds, and has some similarities to the Marcinkiewicz–Zygmund and Rosenthal inequalities [10, pp. 146–153].
Lemma A.1.
Let be a random variable with for every . Consider , where are i.i.d. copies of . Write , and
[TABLE]
- (i)
Suppose that . Then for all and all ,
[TABLE]
- (ii)
Suppose that . Then for all and all ,
[TABLE]
Proof.
First we prove (ii). Similarly to the proof of Lemma 3.3 (but with an index shift), write and for vectors in with no two coordinates the same. Then, by the fact that the are i.i.d.,
[TABLE]
On the other hand, given for which there are distinct indices appearing in multiplicities , by independence,
[TABLE]
where we used Lyapunov’s inequality for the middle step. Hence
[TABLE]
Moreover, by (3.11), we have . It follows that
[TABLE]
where is as defined at (A.1). This yields part (ii). For part (i), note that
[TABLE]
since the product has expectation zero whenever has a coordinate which appears exactly once, and the combinatorial factor comes from the number of ways of pairing up coordinates. The rightmost expression in the last display is for . Thus part (i) follows from part (ii), with in (A.1) being the quantity but with in place of . ∎
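The pairing argument in the proof can be illustrated on the simplest centred example. The sketch below takes the summands to be i.i.d. fair signs (values +1 and -1, an assumption made only for illustration), computes the even moments of the partial sum exactly, and compares them with the leading term given by the number of perfect pairings of 2k coordinates times n^k. Function names are hypothetical.

```python
import math
from fractions import Fraction


def sign_sum_moment(n, r):
    """Exact E[S_n^r] for S_n a sum of n i.i.d. fair +/-1 signs."""
    # S_n = 2*K - n with K ~ Binomial(n, 1/2).
    total = sum(math.comb(n, k) * (2 * k - n) ** r for k in range(n + 1))
    return Fraction(total, 2 ** n)


def pairing_count(k):
    """(2k - 1)!!, the number of ways of pairing up 2k coordinates."""
    return math.prod(range(1, 2 * k, 2))


n, k = 1000, 2
exact = sign_sum_moment(n, 2 * k)
leading = pairing_count(k) * n ** k  # the variance of each sign is 1
relative_error = abs(exact / leading - 1)
```

For the fourth moment the exact value is 3n^2 - 2n, so the relative error against the leading term 3n^2 is 2/(3n), consistent with a first-order expansion with a quantitative error term.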
Lemma A.2.
Let . Then, for every with ,
[TABLE]
Proof.
Write and . If , it follows from the mean value theorem that , . If , then has for all , and hence . Combining the two cases gives, for every ,
[TABLE]
Using (A.2) we observe that, for ,
[TABLE]
Hence, for ,
[TABLE]
Then, using (A),
[TABLE]
By Lyapunov’s inequality, , so that
[TABLE]
Considering separately the cases and (and using for ) it is not hard to verify that as long as . ∎
Acknowledgements
AW was supported by EPSRC grant EP/W00657X/1. Part of this work was undertaken during the programme “Stochastic systems for anomalous diffusion” (July–December 2024) hosted by the Isaac Newton Institute, under EPSRC grant EP/Z000580/1.
References
- [1] R.L. Adler and L. Flatto, Uniform distribution of Kakutani’s interval splitting procedure. Z. Wahrsch. Verw. Gebiete 38 (1977) 253–259.
- [2] T.D. Ahle, Sharp and simple bounds for the raw moments of the binomial and Poisson distributions. Statist. Probab. Lett. 182 (2022) 109306.
- [3] P. Bak and K. Sneppen, Punctuated equilibrium and criticality in a simple model of evolution. Phys. Rev. Letters 71 (1993) 4083–4086.
- [4] A.D. Barbour, L. Holst and S. Janson, Poisson Approximation, Oxford University Press, Oxford, 1992.
- [5] J. Bérard and J.-B. Gouéré, Brunet-Derrida behavior of branching-selection particle systems on the line. Commun. Math. Phys. 298 (2010) 323–342.
- [6] P. Bickel, M. Fiocco, M. de Gunst and F. Götze, Willem van Zwet’s research. Ann. Statist. 49 (2021) 2439–2447.
- [7] E. Brunet and B. Derrida, Microscopic models of traveling wave equations. Computer Phys. Commun. 121–122 (1999) 376–381.
- [8] L. Le Cam, An approximation theorem for the Poisson binomial distribution. Pacific J. Math. 10 (1960) 1181–1197.
