Second Order Expansions for Sample Median with Random Sample Size
Gerd Christoph, Vladimir V. Ulyanov, Vladimir E. Bening

TL;DR
This paper develops second order asymptotic expansions for the sample median when the sample size is random, extending classical results to more realistic scenarios where sample size varies unpredictably.
Contribution
It introduces novel second order Chebyshev–Edgeworth and Cornish–Fisher expansions for the median with a specific type of random sample size, advancing asymptotic theory.
Findings
Derived second order expansions for median with random sample size
Applied expansions to Student's t- and Laplace distributions
Enhanced understanding of median's asymptotic behavior under randomness
Abstract
In practice, we often encounter situations where a sample size is not defined in advance and can be a random value. The randomness of the sample size crucially changes the asymptotic properties of the underlying statistic. In the present paper second order Chebyshev-Edgeworth and Cornish-Fisher expansions based on Student's t- and Laplace distributions and their quantiles are derived for the sample median with random sample size of a special kind.
Gerd Christoph, Vladimir V. Ulyanov and Vladimir E. Bening
Otto-von-Guericke University Magdeburg, Department of Mathematics,
Postfach 4120,
39016 Magdeburg, Germany.
Lomonosov Moscow State University,
Faculty of Computational Mathematics and Cybernetics
119991, Leninskie Gory, 1/52, Moscow, Russia.
National Research University Higher School of Economics,
101000, Myasnitskaya ulitsa, 20, Moscow, Russia
Lomonosov Moscow State University,
Faculty of Computational Mathematics and Cybernetics
119991, Leninskie Gory, 1/52, Moscow, Russia
Key words and phrases:
Sample median; samples with random sizes; second order expansions; Laplace distribution; Student’s t-distribution; negative binomial distribution; discrete Pareto distribution.
2000 Mathematics Subject Classification:
60F05, 60G50, 62E17, 62H10.
1. Introduction
Usually in classical statistical inference the number of observations is known. But often we do not know in advance the sample sizes or there are missing observations. Therefore the sample size may be a realization of a random variable.
There are many practical situations, where it is almost impossible to have a fixed sample size. They often occur when observations are collected in a fixed time span. For example, in reliability testing this is the number of failed devices, in medicine – the number of patients with a specific disease, in finance – the number of market transactions, in queueing theory – the number of customers entering a store, in insurance – the number of claims. All these numbers are random variables.
The use of samples with random sample sizes has been steadily growing over the years. For an overview of statistical inferences with a random number of observations and some applications see, e.g. Esquível (2016) and the references therein.
Let and be the random variables on the same probability space . In statistics the random variables are observations. Let be a random size of the underlying sample, which depends on parameter . We suppose for each that is independent of and in probability as .
Let be some statistic of a sample with non-random sample size . Define the random variable for every :
[TABLE]
i.e. is some statistic obtained from a random sample .
Gnedenko (1989) considered the asymptotic properties of the distributions of sample quantiles for samples of random size. In Nunes et al. (2019a) unknown sample sizes are assumed in medical research for the analysis of one-way fixed effects ANOVA models to avoid false rejections. Applications of orthogonal mixed models to situations with samples of random sizes are investigated in Nunes et al. (2019b). Esquível (2016) considered inference for the mean with known and unknown variance and inference for the variance in the normal model. Prediction intervals for future observations of generalized order statistics and confidence intervals for quantiles based on samples of random sizes are studied in Barakat et al. (2018) and Al-Mutairi and Raqab (2020), respectively. They illustrated their results with a real biometric data set, the duration of remission of leukemia patients treated by one drug. General asymptotic expansions for statistics with random sample sizes are given in Bening et al. (2013) applying corresponding asymptotic expansions for the normalized statistic and the suitably scaled random sample size .
Many models lead to random sums and random means
[TABLE]
respectively. Wald’s identity for random sums if and have finite expectations is a powerful tool in statistical inference, particularly in sequential analysis, see e.g. Wald (1945) and Kolmogorov and Prokhorov (1949). Robbins (1948) proved that asymptotic normality of the index automatically implies asymptotic normality of the corresponding random sum .
The randomness of the sample size may crucially change asymptotic properties of random sums, see e.g. Gnedenko (1989) or Gnedenko and Korolev (1996). If the statistic is asymptotically normal, then the limit laws of normalized statistic are scale mixtures of normal distributions with zero mean, depending on the random sample size .
A fundamental introduction to asymptotic distributions of random sums is given in Döbler (2015). Using Stein’s method, quantitative Berry-Esseen bounds for random sums were proved in Chen et al. (2011, Theorem 10.6), Döbler (2015, Theorems 2.5 and 2.7) and Pike and Ren (2014, Theorem 1.3) in case of approximation by normal and Laplace distributions. Moderate and large deviations are investigated in Eichelsbacher and Löwe (2019), and Klüppelberg and Mikosch (1997). Many applications of geometric random sums when is geometrically distributed are given in Kalashnikov (1997). Bounds on the total variation distance between a geometric random sum of independent, non-negative, integer-valued random variables and the geometric distribution are studied in Peköz et al. (2014, Section 3).
It is worth mentioning that the choice of the scaling factor for random sums or random means affects the type of limit distribution. In fact, consider the random mean given in (1.2). For the sake of convenience let be independent standard normal random variables and be geometrically distributed with and independent of . Then one has
[TABLE]
We have three different limit distributions. The suitable scaled random mean is standard normal distributed or tends to the Student distribution with 2 degrees of freedom as the limit distributions depending on whether we take the random scaling factor or the non-random scaling factor , respectively. Moreover, we get the Laplace distribution with variance 1 if we use scaling with the mixed factor .
Assertion (1.3) follows by conditioning on the sample size and the stability of the normal law. The Student distribution as a limit for statistics from samples with a random sample size is proved, e.g., in Bening and Korolev (2005) and Schluter and Trede (2016), hence relationship (1.4) holds. Since statement (1.5) follows e.g. from Bening and Korolev (2008) or Schluter and Trede (2016).
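The effect of the scaling factor can be checked numerically. The following Python sketch is an illustration and not part of the paper. It assumes standard normal observations, a geometric sample size on {1, 2, ...} with mean n, and the three scalings sqrt(N), sqrt(n) and N/sqrt(n), which are the ones that lead to the normal, Student (2 degrees of freedom) and Laplace (variance 1) limits described above. It uses the fact that, conditionally on the sample size k, the mean of k standard normal variables is N(0, 1/k). All function names are hypothetical.

```python
import math
import random


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def t2_cdf(x):
    """Student's t distribution function with 2 degrees of freedom (closed form)."""
    return 0.5 + x / (2.0 * math.sqrt(2.0 + x * x))


def laplace_cdf(x):
    """Laplace distribution function with variance 1, density exp(-sqrt(2)|x|)/sqrt(2)."""
    tail = 0.5 * math.exp(-math.sqrt(2.0) * abs(x))
    return 1.0 - tail if x >= 0 else tail


def simulate_scalings(n, reps, x, seed=0):
    """Empirical P(c * mean <= x) for c = sqrt(N), sqrt(n), N / sqrt(n).

    N is geometric on {1, 2, ...} with success probability 1/n (assumed model).
    """
    rng = random.Random(seed)
    log_q = math.log1p(-1.0 / n)
    hits = [0, 0, 0]
    for _ in range(reps):
        u = 1.0 - rng.random()                      # uniform on (0, 1]
        size = int(math.log(u) / log_q) + 1         # inverse transform sampling
        mean = rng.gauss(0.0, 1.0) / math.sqrt(size)  # mean of `size` N(0,1) draws
        hits[0] += int(math.sqrt(size) * mean <= x)
        hits[1] += int(math.sqrt(n) * mean <= x)
        hits[2] += int(size / math.sqrt(n) * mean <= x)
    return [h / reps for h in hits]
```

Calling `simulate_scalings(200, 40000, 1.0)` returns three empirical probabilities that can be compared with `normal_cdf(1.0)`, `t2_cdf(1.0)` and `laplace_cdf(1.0)`, respectively.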
In Bening et al. (2013) first order expansions of the random mean are proved if the sample size is negative binomial distributed with success probability or it is the maximum of independent identically distributed discrete Pareto random variables with tail index 1, using first order Chebyshev-Edgeworth expansions for mean and the rate of convergence for the distribution of suitably normalized random sample size to the corresponding limit law. Second order asymptotic expansions of suitably normalized random sample size are proved in Christoph et al. (2020) which were used to derive second order Chebyshev-Edgeworth expansions for the random mean .
In the present paper we investigate the median of a sample with the random sizes mentioned above.
Let and be the known common distribution function and the probability density function of independent components of the sample , where is the unknown location parameter to be estimated from the given sample. By we denote the order statistics constructed from the original observations .
As statistic we consider the sample median , that is,
[TABLE]
Huang (1999) discussed the even-odd phenomenon for the median in statistical literature and gave a counterexample which contradicts the statistical folklore: “It never pays to base the median on an odd number of observations”.
When looking for change points in the location parameter of a time series, tests for a change in mean may be susceptible to outliers in the data, whereas tests for a change in median may show a change of the center of the marginal distribution, see Shao and Zang (2010), Vogel and Wendler (2017) and the references therein.
To perform statistical analysis of large data sets Minsker (2019) presents new results for the median-of-means estimator using new algorithms for distributed statistical estimation that exploit divide-and-conquer approach.
To estimate the location parameter one could use the random mean as well, but for its second order expansion more than the fourth moment of is required. For heavy tailed distributions of with tail index such second order Edgeworth expansions of the random mean cannot be obtained. If the tail index , then the mean does not exist: . The mean need not always exist, whereas the median always exists.
In Peña and Kim (2019) confidence regions for the median of in the nonparametric measurement error model are constructed and several applications are given when a confidence interval about the center of a distribution is desired.
Therefore, it is reasonable to use the sample median .
The asymptotic normality of the normalized sample median is well known, see e.g. Cramér (1946, Chapter 28.5): If , and the density is continuous and has a continuous derivative in some neighborhood of , then
[TABLE]
where is the standard Gaussian distribution function having density :
[TABLE]
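The asymptotic normality of the sample median can be illustrated by a small Monte Carlo experiment. The sketch below is an illustration only. It assumes standard normal observations with location parameter 0, so that the density at the median is 1/sqrt(2*pi), and it uses the classical normalization 2 * p0 * sqrt(m) for a sample of fixed size m. The function names are hypothetical.

```python
import math
import random
import statistics


def normal_cdf(x):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def median_cdf_estimate(m, reps, x, seed=0):
    """Monte Carlo estimate of P(2 * p0 * sqrt(m) * M_m <= x) for N(0,1) data."""
    rng = random.Random(seed)
    p0 = 1.0 / math.sqrt(2.0 * math.pi)   # normal density at its median
    scale = 2.0 * p0 * math.sqrt(m)
    hits = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(m)]
        if scale * statistics.median(sample) <= x:
            hits += 1
    return hits / reps
```

For example, `median_cdf_estimate(101, 4000, 1.0)` can be compared with `normal_cdf(1.0)`; the agreement improves as m grows.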
Instead of moment conditions now regularity assumptions on the density are required:
Assumption A: The density is symmetric around zero, i.e., and . Moreover, the density has three continuous bounded derivatives in some interval .
Define
The regularity conditions in Assumption A are fulfilled, for example, for
normal density (1.8),
heavy tailed Student’s t-distribution with density function
[TABLE]
including Cauchy distribution in case , where the degree of freedom parameter determines the heaviness of the distribution tail,
the triangular distribution with density
[TABLE]
the continuous uniform distribution or rectangular distribution with density
[TABLE]
and symmetric Laplace distribution having density
[TABLE]
The corresponding coefficients , and in these examples are:
[TABLE]
Under Assumption A Burnashev (1997, Theorem 1) proved in relation (1.7) an asymptotic expansion in terms of orders and with remainder as . In the present paper we prove a similar second order expansion for the sample median constructed from a sample with random sample size . Therefore in Section 2 we sharpen the result of Burnashev (1997) in the sense that we obtain non-asymptotic relations for any integer , estimating the closeness of the distribution of the sample median and the corresponding second order expansion by inequalities. In Section 3 we give a transfer proposition from non-random to random sample sizes, and in Sections 4 and 5 the cases of Student's t- and Laplace distributions as limit laws for the random median are considered. In Section 6 the Cornish-Fisher expansions for the quantiles of sample medians and are derived from the corresponding Edgeworth-type expansions.
2. Non-Asymptotic Expansions for Sample Median
Let denote the integer part of value . Define
[TABLE]
Proposition 2.1.
Let Assumption A be satisfied, then for all :
[TABLE]
where does not depend on ,
[TABLE]
Since for and or an immediate consequence of inequality (2.2) is
[TABLE]
where (2.4) for is trivial and does not depend on .
Remarks: 1. If the parent distributions of the sample have the normal density (1.8), Student’s t-density (1.9) or continuous uniform density (1.11), then with respect to (1.13) the first term vanishes since in these cases . Therefore the convergence rate of the distribution of the sample median to normality has order . The triangular density (1.10) and the Laplacian density (1.12) have discontinuous derivatives at , nevertheless and the convergence rate to normality has the order .
In Cramér (1946, Chapter 28.5) for asymptotic normality (1.7) it is required, that density has a continuous derivative in some neighborhood of .
2. As in Burnashev (1997) the natural normalizing factor in (2.2) is , i.e., for odd and for even . He proved also for all
[TABLE]
Hence, for the sample median each odd observation adds an amount of information of order and not as usual if the normalizing factor by is .
Proof of Proposition 2.1: Following the detailed proof of Burnashev (1997, Theorem 1), one has to replace Stirling’s formula for the Gamma functions and as by the inequalities proved in Nemes (2015, Theorem 1.3):
[TABLE]
with and .
Here is the Riemann zeta function with
Finally, whenever Taylor’s formula is used with a remainder in big-O notation, the remainder has to be estimated in Lagrange form by an inequality. The constants in (2.2) and (2.4) depend only on and the upper bound of in some interval .
3. Transfer Proposition from Non-Random to Random Sample Sizes
Suppose that distribution functions of the random sample size satisfy the following condition.
Assumption B: There exist a distribution function with , a function of bounded variation with , a sequence and real numbers and such that for all
[TABLE]
Theorem 3.1.
Let both Assumptions A and B be satisfied. Then the following inequality holds for all :
[TABLE]
[TABLE]
[TABLE]
[TABLE]
where are given in (2.3) and (3.1) and
[TABLE]
The positive constants do not depend on .
Remarks: 1. The scaling factor seems to be the natural one in case of the median of a sample with a random sample size since the distribution of has a known limit distribution and the same structure as in Burnashev (1997).
2. Without the quotient in the scaling factor an additional term in the expansion occurs:
[TABLE]
3. The lower bound of the integral in (3.3) depends on which can affect the coefficients at and in the approximation. For example the proof of Theorem 4.2 in Section 4 shows that among other integrals
[TABLE]
Proof of Theorem 3.1: The proof follows the same lines as that of the more general transfer theorem in Bening et al. (2013, Theorem 3.1), under the conditions of our Theorem 3.1. Conditioning on , we have
[TABLE]
Using now (2.4) with :
[TABLE]
Taking into account $\mathbb{P}\big(N_{n}/g_{n}<1/g_{n}\big)=\mathbb{P}\big(N_{n}<1\big)=0$, we obtain
[TABLE]
where , is defined in (3.3) and
[TABLE]
Estimating integral we use integration by parts for Lebesgue-Stieltjes integrals.
[TABLE]
First we calculate . Obviously and
[TABLE]
[TABLE]
where , and , see (2.3).
The functions and , , are bounded, we suppose
[TABLE]
To estimate defined in (3.4) we consider for since . Because for and for we find with (3.8) for . Therefore inequality (3.4) holds with . It follows now from (3.1) and (3.8) that
[TABLE]
and . Theorem 3.1 is proved.
Theorem 3.2.
Under the conditions of Theorem 3.1 and the additional conditions to functions and , depending on the convergence rate in (3.1):
[TABLE]
[TABLE]
we obtain for the function defined in (3.3):
[TABLE]
with
[TABLE]
[TABLE]
and
[TABLE]
Remarks: If then (3.9ii) implies (3.9i). If then (3.9iii) implies (3.9ii) and (3.9i). Conditions (3.9) and (3.10) lead to the range of the integrals in (3.12) which ensures (3.11). The length of the asymptotic expansion is defined by (3.12).
Proof of Theorem 3.2: Using condition (3.9i) we find
[TABLE]
It follows from (3.8), (3.9ii) and (3.9iii) that for
[TABLE]
Integration by parts, , (3.10i) and (3.10ii) lead to
[TABLE]
Taking into account (3.3), (3.12), (3.13) and (3.14) we obtain (3.11).
In the next two sections we use both Theorems 3.1 and 3.2 when the scale mixture as limiting distribution of can be expressed in terms of the well-known distributions. We obtain non-asymptotic results like in Proposition 2.1 for the sample median , using second order approximations for both the statistic and for the random sample size . In both cases the jumps of the distribution function of the random sample size only affect the function in formula (3.1).
4. Student’s Distribution as Limit for Random Sample Median
Let the sample size be the negative binomial distributed (shifted by 1) with parameters and , having probability mass function
[TABLE]
with . Schluter and Trede (2016, Section 2.1) pointed out that the negative binomial distribution is one of the two leading cases for count models: it accommodates the over-dispersion typically observed in count data (which the Poisson model cannot). They showed in a general unifying framework
[TABLE]
where is the Gamma distribution function with the shape parameter which coincides with the scale parameter and equals , having density
[TABLE]
The statement (4.2) was proved earlier in Bening and Korolev (2005, Lemma 2.2).
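The convergence of the normalized sample size to a Gamma law can be checked by exact summation. The sketch below is an illustration under stated assumptions: the probability mass function is taken as P(N = j) = Gamma(j + r - 1) / (Gamma(j) Gamma(r)) * (1/n)^r * (1 - 1/n)^(j - 1) for j = 1, 2, ..., the normalizing sequence is taken as g = E(N) = r(n - 1) + 1, and the limit is the Gamma distribution with shape r and rate r. For simplicity the limit distribution function is implemented for integer r only. The function names are hypothetical.

```python
import math


def shifted_nb_pmf(j, r, n):
    """P(N = j), j = 1, 2, ..., for the shifted negative binomial law (assumed form)."""
    p = 1.0 / n
    log_pmf = (math.lgamma(j + r - 1.0) - math.lgamma(j) - math.lgamma(r)
               + r * math.log(p) + (j - 1) * math.log1p(-p))
    return math.exp(log_pmf)


def shifted_nb_cdf(x, r, n):
    """P(N / g <= x) with g = r (n - 1) + 1, computed by direct summation."""
    g = r * (n - 1.0) + 1.0
    top = math.floor(g * x)
    return sum(shifted_nb_pmf(j, r, n) for j in range(1, top + 1))


def gamma_rr_cdf(x, r):
    """Distribution function of Gamma(shape r, rate r) for integer r >= 1."""
    if x <= 0:
        return 0.0
    partial = sum((r * x) ** k / math.factorial(k) for k in range(r))
    return 1.0 - math.exp(-r * x) * partial
```

For instance, `shifted_nb_cdf(1.0, 2, 500)` and `gamma_rr_cdf(1.0, 2)` differ only by a term that shrinks as n grows.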
The convergence rate in (4.2) for is given in Bening et al. (2013, Formula (21)) or Gavrilenko et al. (2017, Formula (17)):
[TABLE]
In Schluter and Trede (2016) and Gavrilenko et al. (2017) the negative binomial random variable is not shifted: with . Then we have as instead of . Moreover
[TABLE]
The statements (4.2) and (4.4) still hold when is shifted by a fixed integer. From Taylor expansion with Lagrange remainder term it follows that for
[TABLE]
Hence, for shifting has influence of a term by . Second order asymptotic expansions for were proved in Christoph et al. (2020, Theorem 1):
Proposition 4.1.
Let , discrete random variable have probability mass function (4.1) and . For and all there exists a real number such that
[TABLE]
where
[TABLE]
[TABLE]
Remark: The jumps of the sample size have an effect only on the function in the term . The function is periodic with period 1, it is right-continuous with jump height 1 at each integer point . The Fourier series expansion of at all non-integer points is
[TABLE]
see formula 5.4.2.9 in Prudnikov et al. (1992, p. 726) with .
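The Fourier series in the remark can be checked numerically. The sketch below assumes that the periodic function is the sawtooth Q(y) = 1/2 - (y - [y]), which has period 1, is right-continuous and jumps by 1 at each integer, and that its expansion at non-integer points is the sum over k >= 1 of sin(2 pi k y) / (k pi). Both the explicit form of the function and the function names are assumptions made for this illustration.

```python
import math


def sawtooth(y):
    """Q(y) = 1/2 - (y - floor(y)): period 1, right-continuous, jump +1 at integers."""
    return 0.5 - (y - math.floor(y))


def sawtooth_fourier(y, terms):
    """Partial sum of the Fourier series sum_{k>=1} sin(2 pi k y) / (k pi)."""
    return sum(math.sin(2.0 * math.pi * k * y) / (math.pi * k)
               for k in range(1, terms + 1))
```

The series converges only conditionally, so many terms are needed: with 20000 terms the partial sum agrees with `sawtooth(y)` to about three decimal places at points such as y = 0.3 or y = 2.7.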
In Theorem 3.1 an estimate for the negative moment of the random sample size is required. Proposition 4.1 is used in Bening (2020, Corollary 2) to obtain an asymptotic expansion of negative moments for and . Such expansions are applied in the mentioned paper to analyze asymptotic deficiencies and risk functions of estimates based on random-size samples. An improved result is given here, including the correct bounds for :
Corollary 4.2.
Let and . Then for all the following expansions hold for negative moments:
[TABLE]
where for some constants , .
Remark: The leading terms in (4.9) and the bound (4.5) lead to the estimate
[TABLE]
Proof of Corollary 4.2: Integrating by parts and substituting , we obtain
[TABLE]
where with (4.5) of Proposition 4.1
[TABLE]
Next we calculate the first part of the integral in (4):
[TABLE]
with
[TABLE]
where for
[TABLE]
In case we split the integral in into three parts, the first one leads to the leading term in (4.12) for and , respectively:
[TABLE]
Then we obtain
[TABLE]
Now we calculate the second part of the integral in (4) in case of :
[TABLE]
First we show that the integral has the order of the remainder:
[TABLE]
Let where . Then
[TABLE]
where since for
[TABLE]
and considering (4.8) and interchange integral and sum
[TABLE]
Applying formula 2.5.31.4 in Prudnikov et al. (1992, p. 446) with and then
[TABLE]
Hence
[TABLE]
with Riemann zeta function and and
[TABLE]
In case the Fourier series expansion (4.8) of and integration by parts lead to
[TABLE]
and
[TABLE]
If using we find
[TABLE]
and (4.15) is proved.
It remains to calculate the first term on the right-hand side of (4.14), say . Since the integrals in and have the same structure, one gets with the above method
[TABLE]
where , with some constants , .
Estimates (4), (4.12) and (4.16) lead to (4.9) and Corollary 4.2 is proved.
If the statistic is asymptotically normal, the limit distribution of the standardized statistic with random size is Student’s t-distribution having density (1.9) with , see Bening and Korolev (2005) or Schluter and Trede (2016).
Theorem 4.3.
Let . Consider the sample median with random sample size having probability mass function (4.1) and . If inequalities (2.4) and (4.5) hold for the median and the random sample size , respectively, then there exists a constant such that
[TABLE]
for all uniformly in , where is defined in (3.5),
[TABLE]
[TABLE]
[TABLE]
Remark: Under the condition (4.4) with a first order expansion of $\mathbb{P}_{\theta}\big(2p_{0}\sqrt{g_{n}}(M_{N_{n}}-\theta)\leq x\big)$ was announced in the conference paper Bening et al. (2016). Note that the convergence rate in Theorems 3.1 and 3.2 as well as in Corollaries 3.1 and 3.2 in case has to be instead of as announced. Moreover, in case the convergence order in (4.17) improves the rate given in Bening et al. (2016).
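The Student limit for the random median can be illustrated by simulation. The sketch below is not the authors' procedure. It assumes standard normal observations with location 0, integer r, a sample size of the form 1 plus a sum of r geometric variables on {0, 1, ...} with success probability 1/n (one way to realize a shifted negative binomial law), the normalizing sequence g = r(n - 1) + 1, and a limit law that is Student's t with 2r degrees of freedom, which is what mixing a normal law over a Gamma(r, r) scale produces. The example uses r = 2, where the t distribution function with 4 degrees of freedom has a closed form. All names are hypothetical.

```python
import math
import random
import statistics


def t4_cdf(x):
    """Student's t distribution function with 4 degrees of freedom (closed form)."""
    w = x / math.sqrt(1.0 + x * x / 4.0)
    return 0.5 + 0.375 * w * (1.0 - w * w / 12.0)


def draw_shifted_nb(rng, r, n):
    """1 + sum of r geometric variables on {0, 1, ...} with success probability 1/n."""
    log_q = math.log1p(-1.0 / n)
    total = 1
    for _ in range(r):
        u = 1.0 - rng.random()              # uniform on (0, 1]
        total += int(math.log(u) / log_q)   # inverse transform sampling
    return total


def random_median_cdf(n, r, reps, x, seed=0):
    """Monte Carlo estimate of P(2 * p0 * sqrt(g) * M_N <= x) for N(0,1) data."""
    rng = random.Random(seed)
    p0 = 1.0 / math.sqrt(2.0 * math.pi)     # normal density at its median
    g = r * (n - 1.0) + 1.0                 # assumed normalizing sequence E(N)
    scale = 2.0 * p0 * math.sqrt(g)
    hits = 0
    for _ in range(reps):
        size = draw_shifted_nb(rng, r, n)
        sample = [rng.gauss(0.0, 1.0) for _ in range(size)]
        if scale * statistics.median(sample) <= x:
            hits += 1
    return hits / reps
```

With r = 2, the estimate `random_median_cdf(40, 2, 4000, 1.0)` can be compared with `t4_cdf(1.0)`; the normal distribution function would give a visibly larger value at the same point.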
Proof of Theorem 4.3: We use Theorems 3.1 and 3.2 with , and defined in Proposition 4.1 .
It follows from (4.10) with that
[TABLE]
The conditions (3.9) and (3.10) follow from (4.13) with , k = 0, 1, 2, respectively , .
Now we estimate the integrals (3.13) and (3.14) to obtain a bound in inequality (3.11). Using (3.8) for and defined in (2.3) we find
[TABLE]
for . If then with
[TABLE]
Consider the second term in (3.13). Let . Using now , then
[TABLE]
If we define the polynomial by with and put . Then and using we obtain
[TABLE]
and for uniform in
[TABLE]
It remains to estimate in (3.14) for . Integration by parts for Lebesgue-Stieltjes integrals and (3.10i) lead to
[TABLE]
with
[TABLE]
where for functions and are bounded, see (3.8).
Moreover for and for .
If with above defined we find
[TABLE]
If with we obtain
[TABLE]
For the above estimates of lead to an exponential integral:
[TABLE]
In the latter case may be obtained with an analogous procedure as for estimating the above integral for in (4.20). This proof is omitted because the rate of convergence in Theorem 4.3, see (4.17), is determined by the negative moment (4.19), where the term cannot be omitted.
To obtain (4.18) we calculate integrals in (3.12), which are similar to that in the proof of Theorem 2 in Christoph et al. (2020). Using formula 2.3.3.1 in Prudnikov et al. (1992, p. 322) with and :
[TABLE]
we compute the first integral in (3.12) with in (4.25):
[TABLE]
Hence
[TABLE]
For we find with defined in (2.3) and in (4.25)
[TABLE]
For we obtain with from (2.3) and in (4.25)
[TABLE]
The integral in (3.12) is the same as the integral in the proof of Theorem 2 in Christoph et al. (2020) where is shown:
[TABLE]
With (4.26) and the term by in (4.18) follows.
5. Laplace Distribution as Limit for Random Sample Median
Let be discrete Pareto II distributed with parameter , having probability mass and distribution functions
[TABLE]
which is a particular class of a general model of discrete Pareto distributions, obtained by discretizing continuous Pareto II (Lomax) distributions on the integers, see Buddana and Kozubowski (2014).
Now, let , be independent random variables with the same distribution (5.1). Define for and the random variable
[TABLE]
The distribution of is extremely spread out on the positive integers.
In Christoph et al. (2020) the following Edgeworth expansion was proved:
Proposition 5.1.
Let the discrete random variable have distribution function (5.2). For , fixed and all then there exists a real number such that
[TABLE]
[TABLE]
where is defined in (4.7).
Remarks: 1. Lyamin (2010) proved a first order bound in (5.3) for integer
[TABLE]
In case and we have for and
[TABLE]
2. The continuous function with parameter is the distribution function of the inverse exponential random variable , where is exponentially distributed with rate parameter . Both and are heavy tailed with shape parameter 1.
Therefore $\mathbb{E}\big(N_{n}(s)\big)=\infty$ for all and $\mathbb{E}\big(W(s)\big)=\infty$. Moreover:
First absolute pseudo moment $\nu_{1}=\int_{0}^{\infty}x\,\big|d\big(\mathbb{P}\big(N_{n}(s)\leq n\,x\big)-e^{-s/x}\big)\big|=\infty$,
Absolute difference moment $\chi_{u}=\int_{0}^{\infty}x^{u-1}\big|\mathbb{P}\big(N_{n}(s)\leq n\,x\big)-e^{-s/x}\big|\,dx<\infty$
for . These statements are proved in Christoph et al. (2020, Lemma 2). On pseudo moments and some of their generalizations see e.g. Christoph and Wolf (1992, Chapter 2).
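The limit law e^{-s/x} of the normalized sample size can be compared with the exact distribution function. The sketch below assumes that the discrete Pareto variable satisfies P(Y(s) <= k) = k / (s + k) for k = 1, 2, ..., and that the sample size is the maximum of n independent copies, so that its distribution function is the n-th power of this expression. These explicit forms and the function names are assumptions made for the illustration.

```python
import math


def max_pareto_cdf(n, s, x):
    """Exact P(N_n(s) <= n x) for the maximum of n discrete Pareto variables.

    Assumes P(Y(s) <= k) = k / (s + k) for integers k >= 1.
    """
    k = math.floor(n * x)
    if k < 1:
        return 0.0
    return (k / (s + k)) ** n


def inverse_exponential_cdf(s, x):
    """Limit distribution function exp(-s / x) for x > 0."""
    return math.exp(-s / x) if x > 0 else 0.0
```

For n = 1000 the two functions already agree to about three decimal places, for example at s = 1, x = 1 or at s = 2, x = 0.5.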
Next we estimate the negative moment , , for the random sample size :
Corollary 5.2.
Let and . Then for all the following expansions hold for negative moments:
[TABLE]
where for some constants , .
Remarks: 1. The leading terms in (5.6) and the bound (5.3) lead to the estimate
[TABLE]
where for the order of the bound is optimal.
2. In Bening (2020, Corollary 3) the expansion (5.6) for is given with an additional term at .
Proof of Corollary 5.2: As in the beginning of the proof of Corollary 4.2 we obtain
[TABLE]
where
[TABLE]
[TABLE]
considering (4.8)
[TABLE]
and with (5.3) of Proposition 5.1
[TABLE]
Since for and
[TABLE]
we find with and
[TABLE]
Both remainders decrease exponentially with order respectively .
It remains to estimate . Partial integration leads to
[TABLE]
Considering (5.8) and we obtain .
For an asymptotically normally distributed statistic the limit distribution of the standardized is Laplace distribution having density (1.12) with , therefore . See Bening and Korolev (2008) or Schluter and Trede (2016).
Theorem 5.3.
Let . Consider the statistic with random sample size having distribution function (5.2). If for the statistic inequality (2.4) holds and , then there exists a constant such that
[TABLE]
where is defined in (3.5) and
[TABLE]
Remark: Under the condition (5.5) a first order expansion was announced in the conference paper Bening et al. (2016, Theorem 4.1).
Proof of Theorem 5.3: We use Theorems 3.1 and 3.2 with and defined in (5.4), and .
Considering (5.8) the functions , and the corresponding integrals decrease even exponentially with order or , . Moreover, . Hence conditions (3.9) and (3.10) are fulfilled.
It remains to estimate given in (3.14). Changing only by in the estimations (4.23) and (4.24) of the corresponding in the proof of Theorem 4.3, using partial integration, the relations (5.4), (3.8) and for , then we obtain
[TABLE]
To obtain (5.10) we calculate integrals in (3.12) for as in the proof of Theorem 5 in Christoph et al. (2020). Here we use formula 2.3.16.3 in Prudnikov et al. (1992, p. 344) with , , :
[TABLE]
where
[TABLE]
In the mentioned proof we obtained with (5.12) for
[TABLE]
and with (5.12) for
[TABLE]
Moreover, using (5.12) for , we find
[TABLE]
and, finally, with (5.12) for , we calculate
[TABLE]
Together with for all , proved in Christoph et al. (2020, Lemma 3) we proved (5.9).
6. Cornish-Fisher Expansions for Quantiles of and
In statistical inference it is of fundamental importance to obtain the quantiles of the distribution of statistics under consideration. Statistical applications and modeling with quantile functions are discussed extensively by Gilchrist (2000). There are very few quantile functions which can be expressed in closed form. The Cornish-Fisher expansions provide tools to approximate the quantiles of probability distributions.
Let be a distribution function admitting a Chebyshev-Edgeworth expansion in powers of with as :
[TABLE]
where is the density of a three times differentiable limit distribution .
Proposition 6.1.
Let be given by (6.1) and let and be quantiles of distributions and with the same order , i.e. . Then the following relation holds for :
[TABLE]
with
[TABLE]
Proposition 6.1 is a direct consequence of more general statements, see e.g. Ulyanov (2011, p. 311-315), Fujikoshi et al. (2010, Chapter 5.6.1) or Ulyanov et al. (2016) and the references therein.
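The transfer from an Edgeworth-type expansion to a Cornish-Fisher expansion can be checked on a toy example. The sketch below assumes an expansion of the form F(x) = G(x) + g(x) (a1(x) eps + a2(x) eps^2) with a standard normal limit G, and it uses the coefficients obtained by formally matching powers of eps, namely b1(u) = -a1(u) and b2(u) = g'(u) a1(u)^2 / (2 g(u)) + a1'(u) a1(u) - a2(u). These formulas are assumed to agree with the relation in Proposition 6.1. The polynomials a1 and a2 are hypothetical and chosen only for the illustration.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal limit law G with density g


def a1(x):
    return 0.3 * x * x               # hypothetical first order term


def a1_prime(x):
    return 0.6 * x


def a2(x):
    return 0.2 * x ** 3 - 0.1 * x    # hypothetical second order term


def expansion_cdf(x, eps):
    """Toy distribution function G(x) + g(x) (a1(x) eps + a2(x) eps^2)."""
    return STD.cdf(x) + STD.pdf(x) * (a1(x) * eps + a2(x) * eps * eps)


def exact_quantile(alpha, eps):
    """Solve expansion_cdf(x, eps) = alpha by bisection near the normal quantile."""
    u = STD.inv_cdf(alpha)
    lo, hi = u - 1.0, u + 1.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if expansion_cdf(mid, eps) < alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cornish_fisher(alpha, eps):
    """Return the zeroth, first and second order quantile approximations."""
    u = STD.inv_cdf(alpha)
    b1 = -a1(u)
    # For the normal density, g'(u) / g(u) = -u.
    b2 = 0.5 * (-u) * a1(u) ** 2 + a1_prime(u) * a1(u) - a2(u)
    return u, u + b1 * eps, u + b1 * eps + b2 * eps * eps
```

For eps = 0.02 and alpha = 0.9 the error of the three approximations against `exact_quantile` decreases with each additional term, as expected from a remainder of order eps^3.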
First we consider the random median if the sample size is negative binomial distributed with probability mass function (4.1) and Student’s t-distribution is the limit law. The second order expansion (4.17) in Theorem 4.3 admits a relation like (6.1) with and , . The transfer Proposition 6.1 implies the following statement:
Corollary 6.2.
Suppose . Let and be -quantiles of standardized statistic $\mathbb{P}\big(2p_{0}\sqrt{g_{n}}(M_{N_{n}(r)}-\theta)\leq x\big)$ and of the limit Student’s t-distribution , respectively. Then with previous definitions the following Cornish-Fisher expansion holds as :
[TABLE]
where $B_{2}(u)=\frac{p_{1}^{2}\,u^{3}}{8\,p_{0}^{4}}-\frac{(5-r)\,u^{3}+(5r+2)\,u}{4\,(2r-1)}-\frac{u^{3}}{4}\Big(1+\frac{p_{2}}{6\,p_{0}^{3}}\Big)$.
Next we study the approximation of quantiles for the random median if the sample size is based on discrete Pareto distributions with distribution function (5.2) and the Laplace distribution is the limit law. Relation (5.9) in Theorem 5.3 admits an expansion like (6.1) with and , . The transfer Proposition 6.1 leads now to:
Corollary 6.3.
Suppose . Let and be -quantiles of standardized statistic $\mathbb{P}\big(2p_{0}\sqrt{n}(M_{N_{n}(s)}-\theta)\leq x\big)$ and of the limit Laplace distribution , respectively. Then with previous definitions the following Cornish-Fisher expansion holds
[TABLE]
where $B_{2}(u)=\frac{p_{1}^{2}\,u^{3}}{8\,p_{0}^{4}}+\frac{(4-s)\,u\,(1+\sqrt{2s}\,|u|)}{8\,s}-\frac{u^{3}}{4}\Big(1+\frac{p_{2}}{6\,p_{0}^{3}}\Big)$.
For the sake of completeness let us consider the Cornish-Fisher expansion for the median , too, using (2.4) with , , defined in (2.3).
Corollary 6.4.
Let and be -quantiles of the standardized statistic $\mathbb{P}_{\theta}\big(2p_{0}\sqrt{2[m/2]}(M_{m}-\theta)\leq x\big)$ and of the limit normal distribution , respectively. Then with previous definitions the classical Cornish-Fisher expansion holds as :
[TABLE]
7. Acknowledgement
Proposition 2.1, Theorems 3.1 and 4.3 and Corollary 4.2 have been obtained under support of the RSF Grant No. 18-11-00132. The paper was prepared within the framework of the Moscow Center for Fundamental and Applied Mathematics, Moscow State University and HSE University Basic Research Programs.
References
- Al-Mutairi, J.S. and Raqab, M.Z. (2020). Confidence intervals for quantiles based on samples of random sizes. Statist. Papers 61 (1), 261-277. MR 4056802.
- Barakat, H.M., Nigm, E.M., El-Adll, M.E. and Yusuf, M. (2018). Prediction of future generalized order statistics based on exponential distribution with random sample size. Statist. Papers 59 (2), 605-631. MR 3800816.
- Bening, V.E. (2020). On risks of estimates based on random-size samples. Moscow University Computational Mathematics and Cybernetics 44 (1), 16-26.
- Bening, V.E. and Korolev, V.Yu. (2005). On the use of Student’s distribution in problems of probability theory and mathematical statistics. Theory Probab. Appl. 49 (3), 377-391. MR 2144862.
- Bening, V.E. and Korolev, V.Yu. (2008). Some statistical problems related to the Laplace distribution (Russian). Informatics and its Applications, IPI RAN, 2 (2), 19-34.
- Bening, V.E., Galieva, N.K. and Korolev, V.Yu. (2013). Asymptotic expansions for the distribution functions of statistics constructed from samples with random sizes (Russian). Informatics and its Applications, IPI RAN, 7 (2), 75-83.
- Bening, V.E., Korolev, V.Yu. and Zeifman, A.I. (2016). Asymptotic expansions for the distribution function of the sample median constructed from a sample with random size. In Proceedings 30th ECMS 2016 Regensburg, edited by Claus, T. et al., 669-675. doi:10.7148/2016-0669.
- Buddana, A. and Kozubowski, T.J. (2014). Discrete Pareto distributions. Econ. Qual. Control 29 (2), 143-156.
