Error bounds for the normal approximation to the length of a Ewens partition
Koji Tsukuda

TL;DR
This paper derives error bounds for the normal approximation of the Ewens partition length, revealing how the approximation accuracy varies across different asymptotic regimes as parameters grow large.
Contribution
It provides the first explicit error bounds for the normal approximation of Ewens partition lengths under various asymptotic conditions.
Findings
Error bounds depend on the asymptotic regime.
Normal approximation improves as n and θ grow large.
Decay rate of error varies with the relationship between n and θ.
Abstract
Let $K$ be a positive integer-valued random variable whose distribution is given by $\mathrm{P}(K = x) = \bar{s}(n,x)\,\theta^x/(\theta)_n$ $(x = 1, \ldots, n)$, where $\theta$ is a positive number, $n$ is a positive integer, $(\theta)_n = \theta(\theta+1)\cdots(\theta+n-1)$, and $\bar{s}(n,x)$ is the coefficient of $\theta^x$ in $(\theta)_n$ for $x = 1, \ldots, n$. This formula describes the distribution of the length of a Ewens partition, which is a standard model of random partitions. As $n$ tends to infinity, $K$ asymptotically follows a normal distribution. Moreover, as $n$ and $\theta$ simultaneously tend to infinity, if $n^2/\theta \to \infty$, $K$ also asymptotically follows a normal distribution. In this paper, error bounds for the normal approximation are provided. The result shows that the decay rate of the error depends on the asymptotic regime.
Koji Tsukuda, Graduate School of Arts and Sciences, The University of Tokyo, 3-8-1 Komaba, Meguro-ku, Tokyo 153-8902, Japan.
This work was partly supported by Japan Society for the Promotion of Science KAKENHI Grant Numbers 16H02791 and 18K13454.
1 Introduction
Consider a nonnegative integer-valued random variable $K$ that follows
$$\mathrm{P}(K = x) = \bar{s}(n,x)\,\frac{\theta^x}{(\theta)_n} \qquad (x = 1, \ldots, n), \tag{1.1}$$
where $\theta$ is a positive value, $n$ is a positive integer, $(\theta)_n = \theta(\theta+1)\cdots(\theta+n-1)$, and $\bar{s}(n,x)$ is the coefficient of $\theta^x$ in $(\theta)_n$. This distribution is known as the falling factorial distribution (Watterson, 1974a, equation (2.22)), STR1F (i.e., the Stirling family of distributions with finite support related to the Stirling number of the first kind) (Sibuya, 1986, 1988), and the Ewens distribution (Kabluchko, Marynych and Sulzbach, 2016). The formula (1.1) describes the distribution of the length of a Ewens partition, which is a standard model of random partitions. A random partition is called a Ewens partition when the distribution of the partition is given by the Ewens sampling formula. The Ewens sampling formula and (1.1) appear in many scientific fields and have been extensively studied; see, e.g., Johnson, Kotz and Balakrishnan (1997, Chapter 41) or Crane (2016). In the context of population genetics, (1.1) was discussed in Ewens (1972) as the distribution of the number of allelic types included in a sample of size $n$ from the infinitely-many neutral allele model with scaled mutation rate $\theta$; see also Durrett (2008, Section 1.3). Moreover, in the context of nonparametric Bayesian inference, (1.1) describes the law of the number of distinct values in a sample from the Dirichlet process; see, e.g., Ghosal and van der Vaart (2017, Section 4.1). Furthermore, as introduced in Sibuya (1986), (1.1) relates to several statistical or combinatorial topics such as permutations, sequential rank order statistics and binary search trees.
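As a concrete check of (1.1), the following minimal Python sketch expands the rising factorial $(\theta)_n = \theta(\theta+1)\cdots(\theta+n-1)$ as a polynomial in $\theta$ and reads off the probabilities $\bar{s}(n,x)\theta^x/(\theta)_n$. The function names `rising_factorial_coeffs` and `ewens_length_pmf` are illustrative choices, not notation from the paper, and exact rational arithmetic is used only to keep the example free of rounding error.

```python
from fractions import Fraction


def rising_factorial_coeffs(n):
    """Return [c_0, ..., c_n], where c_x is the coefficient of theta^x
    in (theta)_n = theta (theta + 1) ... (theta + n - 1)."""
    coeffs = [1]  # the constant polynomial 1
    for i in range(n):
        # Multiply the current polynomial by (theta + i).
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += c * i      # contribution of the constant term i
            new[k + 1] += c      # contribution of the factor theta
        coeffs = new
    return coeffs


def ewens_length_pmf(n, theta):
    """Return {x: P(K = x)} for x = 1..n under the assumed formula (1.1)."""
    theta = Fraction(theta)
    coeffs = rising_factorial_coeffs(n)
    # Evaluating the polynomial at theta gives the normalizer (theta)_n.
    norm = sum(c * theta**x for x, c in enumerate(coeffs))
    return {x: coeffs[x] * theta**x / norm for x in range(1, n + 1)}
```

For example, `ewens_length_pmf(3, 1)` corresponds to the number of cycles of a uniform random permutation of three elements, since $(\theta)_3 = \theta^3 + 3\theta^2 + 2\theta$.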
Simple calculations imply that
$$\mathrm{E}[K] \sim \theta \log n \quad \text{and} \quad \mathrm{var}(K) \sim \theta \log n \tag{1.2}$$
as $n \to \infty$. Let $F$ be the distribution function of the random variable
$$\frac{K - \theta \log n}{\sqrt{\theta \log n}}$$
standardized by the leading terms of the mean and variance, and $\Phi$ be the distribution function of the standard normal distribution. By calculating the moment generating function of this standardized variable, Watterson (1974b) proved that it converges in distribution to the standard normal distribution; that is, $F(x) \to \Phi(x)$ as $n \to \infty$ for any $x \in \mathbb{R}$. For the history concerning this result, we refer readers to Arratia and Tavaré (1992, Remark after Theorem 3). In particular, when $\theta = 1$, Goncharov (1944) proved that $F(x) \to \Phi(x)$ for any $x \in \mathbb{R}$. From a theoretical perspective, it is important to derive error bounds for the approximation. Yamato (2013) discussed the first-order Edgeworth expansion of $F$ via the Poisson approximation (Arratia and Tavaré, 1992, Remark after Theorem 3) and proved that $\|F - \Phi\|_\infty = O(1/\sqrt{\log n})$, where $\|\cdot\|_\infty$ is the $L^\infty$-norm defined by
$$\|f\|_\infty = \sup_{x \in \mathbb{R}} |f(x)|$$
for a bounded function $f$. Note that when $\theta = 1$, Hwang (1998, Example 1) showed that $\|F - \Phi\|_\infty = O(1/\sqrt{\log n})$. Kabluchko, Marynych and Sulzbach (2016) derived the Edgeworth expansion of the probability function of $K$, and provided the first-order Edgeworth expansion of $F$.
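As the standardization of $K$ above comes from (1.2), the normal approximation only works well when $n$ is sufficiently large with respect to $\theta$. However, this assumption has limited validity in practical cases, so it is important to consider alternative standardized variables; see, e.g., Yamato (2013) and Yamato, Nomachi and Toda (2015). In particular, we consider the random variables $Z$ and $W$ defined by
$$Z := \frac{K - \mu}{\sigma}, \qquad W := \frac{K - \theta \log(1 + n/\theta)}{\sqrt{\theta\{\log(1 + n/\theta) - n/(n + \theta)\}}},$$
where
$$\mu := \mathrm{E}[K] = \sum_{i=1}^{n} \frac{\theta}{\theta + i - 1},$$
$$\sigma^2 := \mathrm{var}(K) = \sum_{i=1}^{n} \frac{\theta(i - 1)}{(\theta + i - 1)^2}.$$
These are standardized random variables that use the exact moments and approximate moments, respectively. Denote the distribution functions of $Z$ and $W$ by $F_Z$ and $F_W$, respectively. Then, Tsukuda (2017, Theorem 2 and Remark 6) proved that, under the asymptotic regime $n^2/\theta \to \infty$ as $n \to \infty$ (see Subsection 1.1 for the explicit assumptions), both $F_Z(x)$ and $F_W(x)$ converge to $\Phi(x)$ as $n \to \infty$ for any $x \in \mathbb{R}$. The problem considered in this paper is to provide upper and lower bounds for the approximation errors $\|F_Z - \Phi\|_\infty$ and $\|F_W - \Phi\|_\infty$.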
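To see why the choice of centering and scaling matters, the sketch below compares three candidate pairs of moments for $K$: the exact sums $\sum_{i=1}^{n} \theta/(\theta+i-1)$ and $\sum_{i=1}^{n} \theta(i-1)/(\theta+i-1)^2$, the integral-type approximations $\theta\log(1+n/\theta)$ and $\theta\{\log(1+n/\theta) - n/(n+\theta)\}$, and the leading term $\theta \log n$. The integral-type formulas are an assumption here (they are what one obtains by replacing the sums with integrals), and all function names are illustrative.

```python
import math


def exact_moments(n, theta):
    """Exact mean and variance of K, written as sums over i = 1..n."""
    mean = sum(theta / (theta + i - 1) for i in range(1, n + 1))
    var = sum(theta * (i - 1) / (theta + i - 1) ** 2 for i in range(1, n + 1))
    return mean, var


def integral_moments(n, theta):
    """Assumed integral-type approximations of the mean and variance."""
    log_term = math.log1p(n / theta)
    return theta * log_term, theta * (log_term - n / (n + theta))


def log_n_moments(n, theta):
    """Leading-order term theta * log(n), used for both mean and variance."""
    m = theta * math.log(n)
    return m, m


def compare(n, theta):
    """Collect the three (mean, variance) pairs for one (n, theta)."""
    return {
        "exact": exact_moments(n, theta),
        "integral": integral_moments(n, theta),
        "log_n": log_n_moments(n, theta),
    }
```

Calling `compare(1000, 100.0)` shows the integral-type mean staying within one unit of the exact mean, while $\theta \log n$ overshoots by several hundred, which illustrates why $\theta \log n$ is only adequate when $n$ is very large relative to $\theta$.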
Remark 1.
It holds that and as with .
1.1 Assumptions and asymptotic regimes
As explained in the Introduction, the regime $n \to \infty$ with fixed $\theta$ is sometimes unrealistic. Hence, we consider asymptotic regimes in which $\theta$ increases as $n$ increases. Such regimes have been discussed in Feng (2007, Section 4) and Tsukuda (2017, 2019). We follow these studies. In this subsection, let us summarize the assumptions on $n$ and $\theta$.
First, $\theta$ is assumed to be nondecreasing with respect to $n$. Moreover, when we take the limit operation, $n^2/\theta \to \infty$ is assumed.
The following asymptotic regimes are discussed in this paper:
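- Case A: $n/\theta \to \infty$
- Case B: $n/\theta \to c$, where $0 < c < \infty$
- Case C: $n/\theta \to 0$
- Case C1: $n/\theta \to 0$ and $n^2/\theta \to \infty$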
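A simple way to get a feel for these regimes is to take $\theta = n^\alpha$ for a fixed exponent $\alpha \geq 0$ and ask which case applies. The sketch below does this under the assumption that the cases are defined through the limits of $n/\theta$ and $n^2/\theta$ (Case A: $n/\theta \to \infty$; Case B: $n/\theta \to c \in (0,\infty)$; Case C: $n/\theta \to 0$; Case C1: Case C together with $n^2/\theta \to \infty$). The power-law family and the function name are illustrative only and do not appear in the paper.

```python
def regime_for_power_law(alpha):
    """Classify the regime when theta = n**alpha with a fixed alpha >= 0.

    n / theta   = n**(1 - alpha)  decides between Cases A, B and C;
    n**2 / theta = n**(2 - alpha) decides whether Case C is also Case C1.
    """
    if alpha < 0:
        raise ValueError("theta must be nondecreasing in n, so alpha >= 0")
    if alpha < 1:
        return "A"        # n / theta -> infinity (alpha = 0 is fixed theta)
    if alpha == 1:
        return "B"        # n / theta -> c with c = 1
    if alpha < 2:
        return "C1"       # n / theta -> 0 but n**2 / theta -> infinity
    return "C only"       # n**2 / theta does not diverge


def ratios(n, alpha):
    """Numerical values of n / theta and n**2 / theta for theta = n**alpha."""
    theta = float(n) ** alpha
    return n / theta, n * n / theta
```

For instance, `regime_for_power_law(1.5)` returns `"C1"`: with $\theta = n^{3/2}$ the ratio $n/\theta = n^{-1/2}$ vanishes while $n^2/\theta = n^{1/2}$ still diverges.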
Remark 2.
Feng (2007) was apparently the first to consider the asymptotic regimes in which $n$ and $\theta$ simultaneously tend to infinity. Specifically, Cases A, B, and C were considered by Feng (2007, Section 4). Case C1 was introduced by Tsukuda (2017).
Furthermore, let $c^\star$ be the unique positive root of the equation
[TABLE]
Then, we introduce a new regime, Case B⋆, as follows:
- Case B⋆: $n/\theta \to c$, where $0 < c < \infty$ and $c \neq c^\star$.
Remark 3.
Solving (1.3) numerically gives .
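The display of equation (1.3) is not legible in this copy, so the following sketch rests on an explicit assumption: that the threshold is the value of $c$ at which the leading term of the summed third central moments of the Bernoulli variables with success probabilities $\theta/(\theta+i-1)$ vanishes. Replacing that sum by an integral gives $\theta\, g(n/\theta)$ with $g(c) = \log(1+c) - 2 + 3/(1+c) - 1/(1+c)^2$, and the code finds the positive root of $g$ by bisection. The function names and the bracketing interval are illustrative choices.

```python
import math


def g(c):
    """Assumed form of the left-hand side of (1.3): the integral of
    x (x - 1) / (1 + x)**3 over [0, c]."""
    u = 1.0 + c
    return math.log(u) - 2.0 + 3.0 / u - 1.0 / u**2


def third_central_moment_sum(n, theta):
    """Sum over i of E[(xi_i - p_i)^3] = p_i (1 - p_i) (1 - 2 p_i),
    with p_i = theta / (theta + i - 1)."""
    total = 0.0
    for i in range(1, n + 1):
        p = theta / (theta + i - 1)
        total += p * (1.0 - p) * (1.0 - 2.0 * p)
    return total


def bisect_root(f, lo, hi, tol=1e-12):
    """Plain bisection; requires f(lo) and f(hi) to have opposite signs."""
    f_lo = f(lo)
    if f_lo * f(hi) > 0:
        raise ValueError("root is not bracketed")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


# g(0) = 0 as well, so the bracket starts at 1 to isolate the positive root.
c_star = bisect_root(g, 1.0, 10.0)
```

Under this assumption the root lies between 2.16 and 2.17, and the summed third central moment changes sign as $n/\theta$ crosses it, which is consistent with Case B⋆ excluding exactly one value of $c$.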
2 Main results
This section presents Theorems 2.1 and 2.4, which are the main results of this paper, together with their corollaries. Proofs of the results in this section are provided in Section 4.
2.1 An upper error bound
In this subsection, an upper bound for the error of the normal approximation to $K$ standardized by its exact moments is given in Theorem 2.1, and its convergence rate is given in Corollary 2.2. Moreover, the convergence rate of the upper bound for the error of the normal approximation to $K$ standardized by its approximate moments is given in Corollary 2.3.
We now present the first main theorem of this paper.
Theorem 2.1.
Assume that there exists such that
[TABLE]
for all . Then, it holds that
[TABLE]
for all , where is a constant not larger than 0.5591 and
[TABLE]
Remark 4.
Under our asymptotic regime ($n^2/\theta \to \infty$), (2.1) is valid for sufficiently large $n$.
Remark 5.
The constant in Theorem 2.1 is the universal constant appearing in the Berry–Esseen theorem.
Theorem 2.1 and asymptotic evaluations of the numerator and denominator of yield the following corollary.
Corollary 2.2.
In Cases A, B, and C1, it holds that
[TABLE]
Using Corollary 2.2, we can obtain the following convergence rate of the error bound for the normal approximation to $K$ standardized by its approximate moments.
Corollary 2.3.
It holds that
[TABLE]
2.2 Evaluation of the decay rate
In this subsection, a lower bound for the error of the normal approximation to $K$ standardized by its exact moments is given in Theorem 2.4. Together with Theorem 2.1, this theorem yields the decay rate of this error, as stated in Corollary 2.5.
We now present the second main theorem of this paper.
Theorem 2.4.
(i) Assume that there exists such that, for all , (2.1), and
[TABLE]
Then, it holds that
[TABLE]
for all , where is some constant,
[TABLE]
and
[TABLE]
(ii) Assume that there exists such that, for all , (2.1), and
[TABLE]
Then, it holds that
[TABLE]
for all , where is some constant, is as defined in (2.5), and
[TABLE]
Remark 6.
Under our asymptotic regime (), is valid for sufficiently large . In Case A, (2.3) is valid for sufficiently large . In Case B⋆, if then (2.3) is valid for sufficiently large , and if then (2.6) is valid for sufficiently large . In Case C1, (2.6) is valid for sufficiently large .
Remark 7.
The constant in Theorem 2.4 is the universal constant introduced by Hall and Barbour (1984). Note that this constant was denoted as in their theorem.
As a corollary to Theorems 2.1 and 2.4, we can make the following statement regarding the decay rate of .
Corollary 2.5.
It holds that
[TABLE]
3 Some preliminary results
3.1 A representation of $K$ by a Bernoulli sequence
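Consider an independent Bernoulli random sequence $\{\xi_i\}_{i \geq 1}$ defined by
$$\mathrm{P}(\xi_i = 1) = \frac{\theta}{\theta + i - 1}, \qquad \mathrm{P}(\xi_i = 0) = \frac{i - 1}{\theta + i - 1} \qquad (i = 1, 2, \ldots).$$
Then,
$$\mathrm{P}\left(\sum_{i=1}^{n} \xi_i = x\right) = \bar{s}(n,x)\,\frac{\theta^x}{(\theta)_n} \qquad (x = 1, \ldots, n), \tag{3.1}$$
that is, the law of $\sum_{i=1}^{n} \xi_i$ equals that of $K$; see, e.g., Johnson, Kotz and Balakrishnan (1997, equation (41.12)) or Sibuya (1986, Proposition 2.1). By virtue of this relation, and after some preparation, we will prove the results presented in Section 2. To use the Berry–Esseen-type theorem for independent random sequences (see Lemma B.1), we will evaluate the sum of the second- and third-order absolute central moments of $\{\xi_i\}_{i=1}^{n}$. That is, we will evaluate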
[TABLE]
and
[TABLE]
To derive a lower bound result, we will evaluate
[TABLE]
and
[TABLE]
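The Bernoulli representation of this subsection can be checked directly for small $n$. The sketch below builds the distribution of a sum of independent Bernoulli variables with success probabilities $\theta/(\theta+i-1)$ by repeated convolution, and compares it with the probabilities obtained from the coefficients of $(\theta)_n = \theta(\theta+1)\cdots(\theta+n-1)$. The success probabilities are the standard ones for this representation and are stated here as an assumption, since the defining display is not legible in this copy; the function names are illustrative.

```python
from fractions import Fraction


def bernoulli_sum_pmf(n, theta):
    """PMF (indexed 0..n) of xi_1 + ... + xi_n for independent Bernoulli
    variables with P(xi_i = 1) = theta / (theta + i - 1)."""
    theta = Fraction(theta)
    pmf = [Fraction(1)]  # the empty sum equals 0 with probability 1
    for i in range(1, n + 1):
        p = theta / (theta + i - 1)
        new = [Fraction(0)] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)   # xi_i = 0
            new[k + 1] += q * p     # xi_i = 1
        pmf = new
    return pmf


def coefficient_pmf(n, theta):
    """PMF (indexed 0..n) from the coefficients of the rising factorial."""
    theta = Fraction(theta)
    coeffs = [1]
    for i in range(n):
        # Multiply the polynomial by (t + i).
        new = [0] * (len(coeffs) + 1)
        for k, c in enumerate(coeffs):
            new[k] += c * i
            new[k + 1] += c
        coeffs = new
    norm = sum(c * theta**x for x, c in enumerate(coeffs))
    return [c * theta**x / norm for x, c in enumerate(coeffs)]
```

The two functions agree exactly in rational arithmetic because multiplying by $(\theta + i - 1)$ and convolving with the $i$-th Bernoulli variable are the same operation up to normalization. Note that $\xi_1 = 1$ almost surely, so the sum is at least 1.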
Remark 8.
It follows from the binomial theorem that
[TABLE]
for any .
3.2 Evaluations for moments
In this subsection, we evaluate several sums of moments of .
Lemma 3.1.
(i) It holds that
[TABLE]
(ii) If then it holds that
[TABLE]
(iii) In particular, it holds that
[TABLE]
Proof of Lemma 3.1.
(i) The desired inequality is an immediate consequence of (3.2) and Lemma A.1. (ii) As
[TABLE]
for any , it holds that
[TABLE]
whereas the remainder does not diverge to . This implies the assertion. (iii) The assertion is a direct consequence of (ii) (for Case C, the result follows from the Taylor expansion of as ). ∎
Lemma 3.2.
(i) It holds that
[TABLE]
(ii) If , then it holds that
[TABLE]
(iii) In particular, it holds that
[TABLE]
Proof of Lemma 3.2.
(i) The desired inequality is an immediate consequence of (LABEL:K3am) and Lemma A.1. (ii) As
[TABLE]
for any , it holds that
[TABLE]
whereas the remainder does not diverge to . This implies the assertion. (iii) The assertion is a direct consequence of (ii) (for Case C, the result follows from the Taylor expansion of as ). ∎
Lemma 3.3.
(i) It holds that
[TABLE]
(ii) In Case A, B⋆, or C, it holds that
[TABLE]
Proof of Lemma 3.3.
(i) The desired inequality is an immediate consequence of (3.4) and Lemma A.1. (ii) In Case A, the assertion holds because
[TABLE]
whereas the remainder does not diverge to . In Case B⋆, the assertion holds because
[TABLE]
and , whereas the remainder does not diverge to . In Case C, the assertion holds because
[TABLE]
whereas the remainder terms do not diverge to . ∎
Lemma 3.4.
It holds that
[TABLE]
Proof of Lemma 3.4.
The assertion is an immediate consequence of (LABEL:K2m2) and Lemma A.1. ∎
Remark 9.
The asymptotic value of the RHS in (3.7) is given by (Case A), (Case B), or (Case C).
4 Proofs of the results in Section 2
4.1 Proof of the results in Subsection 2.1
In this subsection, we provide proofs of the results in Subsection 2.1.
Proof of Theorem 2.1.
Let $n$ be an arbitrary integer satisfying the assumption of Theorem 2.1. From (3.1), Lemma B.1 yields that
[TABLE]
where is the constant appearing in Lemma B.1. Additionally, Lemmas 3.1-(i) and 3.2-(i) yield that
[TABLE]
∎
Proof of Corollary 2.2.
In Case A, B, or C1, it holds that
[TABLE]
Hence, Theorem 2.1, Lemmas 3.1 and 3.2 yield that
[TABLE]
This completes the proof. ∎
Proof of Corollary 2.3.
From
[TABLE]
and the triangle inequality, it follows that
[TABLE]
The first term on the RHS in (LABEL:pc2t1) is
[TABLE]
from Corollary 2.2. The second term on the RHS in (LABEL:pc2t1) is bounded above by
[TABLE]
from Lemma A.2-(i). This is because (Lemma A.1) and (Lemma 3.1). The third term of the RHS in (LABEL:pc2t1) is bounded above by
[TABLE]
from Lemma A.2-(ii). This is because, from for and
[TABLE]
(see Lemma 3.1-(i)), it follows that
[TABLE]
for . Note that the LHS and RHS of (4.2) are
[TABLE]
This completes the proof. ∎
4.2 Proof of the results in Subsection 2.2
In this subsection, we provide proofs of the results in Subsection 2.2.
Proof of Theorem 2.4.
(i) Let be an arbitrary integer such that . As for all , (3.1) and Lemma B.2 yield that
[TABLE]
where is the constant appearing in Lemma B.2. Additionally, Lemmas 3.1-(i) and 3.3-(i) yield that
[TABLE]
Moreover, Lemmas 3.1-(i) and 3.4-(i) yield that
[TABLE]
This completes the proof of (i).
(ii) Let $n$ be an arbitrary integer satisfying the assumptions of (ii). For the same reason as in (i), (3.1) and Lemma B.2 yield (4.3). Additionally, Lemmas 3.1-(i) and 3.3-(i) yield that
[TABLE]
Moreover, Lemmas 3.1-(i) and 3.4-(i) yield (4.4). This completes the proof of (ii). ∎
Proof of Corollary 2.5.
In Case A, it follows from
[TABLE]
that
[TABLE]
Moreover, it holds that
[TABLE]
Hence, Corollary 2.2 and Theorem 2.4 yield the desired result in Case A.
In Case B⋆, it follows from
[TABLE]
that
[TABLE]
Moreover, it holds that
[TABLE]
As
[TABLE]
the assumption of either Theorem 2.4 (i) or Theorem 2.4 (ii) is satisfied in Case B⋆. Hence, Corollary 2.2 and Theorem 2.4 yield the desired result in Case B⋆.
In Case C1, it follows from
[TABLE]
that
[TABLE]
Moreover, it holds that
[TABLE]
Hence, Corollary 2.2 and Theorem 2.4 yield the desired result in Case C1.
This completes the proof. ∎
5 Concluding remarks
In this paper, we evaluated the errors of the normal approximation to $K$ standardized by its exact moments and by its approximate moments. Deriving decay rates for the former when $n/\theta \to c^\star$ (i.e., Case B with $c = c^\star$) and for the latter is left for future research. Moreover, as normal approximations are refined by the Edgeworth expansion, it is also important to derive the Edgeworth expansion under our asymptotic regimes.
Appendix A Some evaluations
The following lemma is used in the main body.
Lemma A.1.
Let $\theta$ be a positive value and $n$ be a positive integer. (i) It holds that
[TABLE]
(ii) It holds that
[TABLE]
for any positive integer .
Proof.
For (i), see Tsukuda (2017, Proof of Proposition 1). For (ii), the conclusion follows from
[TABLE]
for any positive integer . This completes the proof. ∎
The next lemma provides some basic results on the standard normal distribution function.
Lemma A.2.
(i) For any , it holds that
[TABLE]
(ii) For any positive , it holds that
[TABLE]
Proof.
(i) For some between 0 and , it holds that
[TABLE]
(ii) As
[TABLE]
we prove the assertion for and , separately. First, we consider the case . For , it holds that . For ,
[TABLE]
For ,
[TABLE]
Next, we consider the case . For , it holds that . For ,
[TABLE]
For ,
[TABLE]
This completes the proof. ∎
Appendix B Error bounds for normal approximations
B.1 The Berry–Esseen-type theorem for independent sequences
In this subsection, we introduce the Berry–Esseen-type theorem for independent sequences. For further details, see Tyurin (2012).
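Let $\{X_i\}_{i \geq 1}$ be a sequence of independent random variables, and $\mathrm{E}[X_i] = 0$, $\mathrm{E}[X_i^2] = \sigma_i^2$, $\mathrm{E}[|X_i|^3] = \beta_i < \infty$ for all $i$. The quantity $\varepsilon_n = \sum_{i=1}^{n} \beta_i / (\sum_{i=1}^{n} \sigma_i^2)^{3/2}$ is called the Lyapunov fraction. We denote the distribution function of $\sum_{i=1}^{n} X_i / (\sum_{i=1}^{n} \sigma_i^2)^{1/2}$ by $F_n$. Then, the following result holds.
Lemma B.1 (Tyurin (2012)).
There exists a universal constant $C$ such that
$$\|F_n - \Phi\|_\infty \leq C \varepsilon_n$$
for all positive integers $n$, where $C$ does not exceed $0.5591$.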
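To make the Berry–Esseen-type bound concrete for the Bernoulli sum used in this paper, the sketch below computes the Lyapunov fraction (the sum of third absolute central moments divided by the 3/2 power of the sum of variances) and compares 0.5591 times that fraction with the exact Kolmogorov distance between the standardized sum and the standard normal law. It assumes success probabilities $\theta/(\theta+i-1)$ and uses the constant 0.5591 quoted in the text; the function names and the floating-point convolution are illustrative choices, not the paper's method of proof.

```python
import math

BERRY_ESSEEN_CONSTANT = 0.5591  # upper bound on the constant quoted in the text


def success_probs(n, theta):
    """Assumed success probabilities p_i = theta / (theta + i - 1)."""
    return [theta / (theta + i - 1) for i in range(1, n + 1)]


def lyapunov_fraction(probs):
    """Sum of E|xi - p|^3 divided by (sum of variances)^(3/2)."""
    var_sum = sum(p * (1 - p) for p in probs)
    abs3_sum = sum(p * (1 - p) * (p**2 + (1 - p) ** 2) for p in probs)
    return abs3_sum / var_sum**1.5


def pmf_of_sum(probs):
    """PMF of a sum of independent Bernoulli variables, by convolution."""
    pmf = [1.0]
    for p in probs:
        new = [0.0] * (len(pmf) + 1)
        for k, q in enumerate(pmf):
            new[k] += q * (1 - p)
            new[k + 1] += q * p
        pmf = new
    return pmf


def std_normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def kolmogorov_distance(probs):
    """sup_x |P(standardized sum <= x) - Phi(x)|, attained at an atom
    (either the value or the left limit of the step function)."""
    mean = sum(probs)
    sd = math.sqrt(sum(p * (1 - p) for p in probs))
    dist, cdf = 0.0, 0.0
    for k, q in enumerate(pmf_of_sum(probs)):
        phi = std_normal_cdf((k - mean) / sd)
        dist = max(dist, abs(cdf - phi))  # left limit at atom k
        cdf += q
        dist = max(dist, abs(cdf - phi))  # value at atom k
    return dist
```

For example, with `probs = success_probs(200, 5.0)`, the value `kolmogorov_distance(probs)` can be compared against `BERRY_ESSEEN_CONSTANT * lyapunov_fraction(probs)` to see how much slack the general bound leaves for this particular sum.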
Remark 10.
Here, we introduce the result given by Tyurin (2012). There have been many studies in which Berry–Esseen-type results are derived; see, e.g., Chen, Goldstein and Shao (2011, Chapter 3).
B.2 Lower bound
In this subsection, we introduce the result given by Hall and Barbour (1984) that considers reversing the Berry–Esseen inequality.
Let be a sequence of independent random variables satisfying and for all , and . We denote the distribution function of by . Letting
[TABLE]
the following result holds.
Lemma B.2 (Hall and Barbour (1984)).
There exists a universal constant such that
[TABLE]
As
[TABLE]
we use the RHS as a lower bound. This bound is sufficient in Cases A, B⋆, and C1 to show the decay rate of .
References
- Arratia, R.; Tavaré, S. (1992). Limit theorems for combinatorial structures via discrete process approximations. Random Structures Algorithms 3, no. 3, 321–345.
- Chen, L. H. Y.; Goldstein, L.; Shao, Q.-M. (2011). Normal approximation by Stein’s method. Probability and its Applications (New York). Springer, Heidelberg. xii+405 pp.
- Crane, H. (2016). The ubiquitous Ewens sampling formula. Statist. Sci. 31, no. 1, 1–19.
- Durrett, R. (2008). Probability models for DNA sequence evolution. Second edition. Probability and its Applications (New York). Springer, New York.
- Ewens, W. J. (1972). The sampling theory of selectively neutral alleles. Theoret. Population Biology 3, 87–112; erratum, ibid. 3 (1972), 240; erratum, ibid. 3 (1972), 376.
- Feng, S. (2007). Large deviations associated with Poisson–Dirichlet distribution and Ewens sampling formula. Ann. Appl. Probab. 17, no. 5–6, 1570–1595.
- Ghosal, S.; van der Vaart, A. (2017). Fundamentals of nonparametric Bayesian inference. Cambridge Series in Statistical and Probabilistic Mathematics, 44. Cambridge University Press, Cambridge.
- Goncharov, V. L. (1944). Some facts from combinatorics. Izv. Akad. Nauk SSSR, Ser. Mat. 8, 3–48.
