Estimates for order statistics in terms of quantiles
Alexander E. Litvak, Konstantin Tikhomirov

TL;DR
This paper establishes a relationship between the median of the k-th order statistic of independent non-negative variables and the quantile of an averaged distribution, under mild conditions.
Contribution
It provides a novel equivalence between the median of order statistics and specific quantiles of an averaged distribution for independent non-negative variables.
Findings
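The median of the k-th smallest order statistic is equivalent to the quantile of order (k-1/2)/n of the averaged distribution.
The result holds when each cdf satisfies a mild condition (condition (1) of the paper).
The condition covers, for example, absolute values of log-concave random variables, including Gaussian and exponential distributions.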
Abstract
Let $X_1, \ldots, X_n$ be independent non-negative random variables with cumulative distribution functions $F_1, \ldots, F_n$, each satisfying certain (rather mild) conditions. We show that the median of the $k$-th smallest order statistic of the vector $(X_1, \ldots, X_n)$ is equivalent to the quantile of order $(k-1/2)/n$ with respect to the averaged distribution $F = \frac{1}{n}\sum_{i=1}^{n} F_i$.
AMS 2010 Classification: 62G30, 60E15
Keywords: Order statistics, INID case
1 Introduction
The goal of this note is to provide sharp estimates for order statistics of independent, not necessarily identically distributed random variables whose distributions satisfy certain (rather mild) conditions. Order statistics are important objects in probability and statistics, with many applications. We refer to [AN, BC, DN] and references therein for information on the subject, especially in the case of i.i.d. random variables. The case of independent but not identically distributed random variables is less studied; we refer to [DN, Chapter 5] for some results in this direction. Understanding this setting is important in some applications, for example in connection with the Mallat–Zeitouni problem [MZ, LT], the study of the asymptotic behaviour of some classes of normed spaces [GLSW1], and some problems in reconstruction [GLMP], to name a few.
Given and a sequence of real numbers , let and denote its -th smallest and -th largest elements, in particular,
[TABLE]
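The notions of k-th smallest and k-th largest element can be illustrated with a minimal Python sketch. The function names `k_min` and `k_max` are hypothetical and chosen only for this illustration; they are not notation taken from the paper.

```python
def k_min(values, k):
    """Return the k-th smallest element of values (k is 1-based)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    return sorted(values)[k - 1]


def k_max(values, k):
    """Return the k-th largest element of values (k is 1-based)."""
    if not 1 <= k <= len(values):
        raise ValueError("k must satisfy 1 <= k <= len(values)")
    return sorted(values, reverse=True)[k - 1]


a = [3.0, 1.0, 4.0, 1.5, 9.0]
n = len(a)

# For k = 1 the two notions reduce to the minimum and the maximum.
print(k_min(a, 1), k_max(a, 1))

# The k-th smallest element is the (n - k + 1)-th largest one.
print(all(k_min(a, k) == k_max(a, n - k + 1) for k in range(1, n + 1)))
```

Sorting costs O(n log n), which is enough for an illustration; a selection algorithm would be the natural choice if only one order statistic of a long sequence were needed.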
Let be cdf (cumulative distribution function) of a non-negative random variable. We employ the following condition:
[TABLE]
(see the next section for discussion and examples).
The main result of this note, Theorem 3.1, states that given , , and independent non-negative random variables with cdf’s , each satisfying condition (1) with parameter , one has
[TABLE]
where is the quantile of order with respect to the averaged distribution .
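A small Monte Carlo experiment may help to see what the statement says. The sketch below is only a numerical illustration under assumptions that are not part of the paper: the variables are exponential with hypothetical scales spread over [0.5, 2], and n = 50, k = 10 and the number of trials are arbitrary. Exponential distributions are log-concave, so they fall under the examples discussed in the next section. The constants of Theorem 3.1 are not reproduced; the code only compares the empirical median of the k-th smallest order statistic with the quantile of order (k-1/2)/n of the averaged distribution.

```python
import math
import random
import statistics


def averaged_cdf(t, scales):
    """F(t) = (1/n) * sum_i F_i(t), with F_i exponential of mean scales[i]."""
    return sum(1.0 - math.exp(-t / s) for s in scales) / len(scales)


def quantile_of_averaged_cdf(p, scales, tol=1e-12):
    """Solve F(q) = p by bisection; F is continuous and increasing here."""
    lo, hi = 0.0, 1.0
    while averaged_cdf(hi, scales) < p:  # enlarge the bracket if needed
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if averaged_cdf(mid, scales) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def empirical_median_of_k_min(k, scales, trials, rng):
    """Empirical median of the k-th smallest of independent exponentials."""
    samples = []
    for _ in range(trials):
        # expovariate takes the rate, i.e. 1 / mean.
        xs = sorted(rng.expovariate(1.0 / s) for s in scales)
        samples.append(xs[k - 1])
    return statistics.median(samples)


n, k = 50, 10                                        # illustrative sizes
scales = [0.5 + 1.5 * i / (n - 1) for i in range(n)]  # hypothetical scales
rng = random.Random(0)                               # fixed seed

med = empirical_median_of_k_min(k, scales, trials=4000, rng=rng)
q = quantile_of_averaged_cdf((k - 0.5) / n, scales)
ratio = med / q
print(f"empirical median = {med:.4f}, quantile = {q:.4f}, ratio = {ratio:.3f}")
```

In this particular setup the two quantities should come out close to each other; the theorem itself only asserts equivalence up to constants, so the closeness seen here is a feature of the chosen example and not a claim of the paper.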
This result improves and complements the results from [GLSW2, GLSW3, GLSW4], where, under somewhat stronger conditions on distributions, the authors proved estimates for the corresponding expectations up to a factor logarithmic in . More precisely, in [GLSW2, GLSW3] it was shown that given , , real numbers , and independent random variables satisfying
[TABLE]
one has
[TABLE]
where , and is an absolute positive constant. In [GLSW4] this was extended further to a larger class of distributions, namely it was shown that the expectation above is equivalent to some Orlicz norm of the sequence , again up to a factor logarithmic in .
We would also like to mention that order statistics of random vectors with independent but not identically distributed coordinates were studied in [Sen], where a result of Hoeffding [Ho] was used, in particular, to estimate the difference between the median of and the median of the -th order statistic of a random vector with i.i.d. coordinates distributed according to the law (see also [DN, pp. 96–97]). However, the results of [Sen] do not seem to directly imply the relations which we prove in Theorem 3.1.
2 Notation and preliminaries
Given a subset , we denote its cardinality by . Next, for a natural number and a set , we denote by the complement of inside . Similarly, for an event we denote by the complement of the event. Further, we say that a collection of sets is a partition of if each is non-empty, the sets are pairwise disjoint and their union is . The canonical Euclidean norm and the canonical inner product in will be denoted by and , respectively. We adopt the conventions and throughout the text.
Let be a real-valued random variable. As usual, we use the abbreviation cdf for the cumulative distribution function (that is, the cdf of is ). Given , by we denote a quantile of order , that is a number satisfying
[TABLE]
(note that in general is not uniquely defined).
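The non-uniqueness can be seen on a fair coin. The sketch below assumes the standard definition of a quantile, namely that q is a quantile of order t of a random variable ξ when P(ξ < q) ≤ t ≤ P(ξ ≤ q); the displayed definition of the paper is not reproduced here, so this form is an assumption. The helper `is_quantile` is hypothetical.

```python
from fractions import Fraction


def is_quantile(q, t, atoms):
    """Check P(xi < q) <= t <= P(xi <= q) for a discrete distribution.

    atoms maps each value of xi to its probability.
    """
    p_less = sum(p for v, p in atoms.items() if v < q)
    p_leq = sum(p for v, p in atoms.items() if v <= q)
    return p_less <= t <= p_leq


# Fair coin: xi takes the values 0 and 1 with probability 1/2 each.
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
half = Fraction(1, 2)

candidates = [Fraction(-1, 4), Fraction(0), Fraction(1, 3),
              Fraction(1), Fraction(5, 4)]
results = {q: is_quantile(q, half, coin) for q in candidates}
for q, ok in results.items():
    print(q, ok)
```

Under this definition every point of the interval [0, 1] is a median of the coin, while points outside the interval are not. Exact rational arithmetic is used so that the boundary comparisons are not affected by rounding.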
Now we discuss our main condition on the distributions, the condition (1). Clearly, if the cdf of a non-negative random variable satisfies condition (1) with some then for every the cdf of satisfies (1) with the same . Note that (1) is equivalent to
[TABLE]
where is the probability measure on (actually, on ) induced by . It is not difficult to see that the uniform distribution on satisfies the condition (1) with . Another example of a random variable satisfying (1) (with ) is a random variable taking values in with , , where is a fixed parameter. Next we show that the absolute value of any log-concave random variable satisfies (1). In particular, this includes Gaussian and exponential distributions.
Lemma 2.1.
Let be a log-concave random variable. Then the cdf of satisfies (1) with .
The lemma is an immediate consequence of the following statement and the fact that conditions (1) and (2) are equivalent.
Lemma 2.2.
Let be a non-degenerate log-concave probability measure on and let . Then
[TABLE]
and
[TABLE]
In particular, we have
[TABLE]
where is defined by , .
Proof. We prove only the first inequality; the proof of the second one is similar. Note that
[TABLE]
By log-concavity of this implies
[TABLE]
Thus
[TABLE]
which implies the result.
Remark 2.3.
We also note that (1) implies that
[TABLE]
This (weaker) assumption on was employed in [LT].
3 Main result
In this section we prove our main result, which states that, in the case of independent components, the medians of order statistics are equivalent to the corresponding quantiles of the averaged distribution.
Theorem 3.1.
Let and . Let be independent non-negative random variables with cdf’s , each satisfying condition (1) with parameter . Set . Then for one has
[TABLE]
and for one has
[TABLE]
In particular,
[TABLE]
In the proof of the theorem, we will use the following two auxiliary statements.
Lemma 3.2.
Let be a non-decreasing function satisfying (1). Let , , and . Then
[TABLE]
and, assuming that ,
[TABLE]
Proof. Applying (1) times we obtain
[TABLE]
which implies (3). Fix a parameter , which will be specified later. If then the above inequality implies
[TABLE]
Otherwise, if , we get
[TABLE]
Choosing , we get (4) and complete the proof.
The next simple lemma can be verified by computing the expectation and the variance of a sum of independent Bernoulli random variables and applying the Chebyshev inequality.
Lemma 3.3.
Let be independent Bernoulli random variables with probabilities of success . Then for every we have
[TABLE]
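The computation behind this verification can be sketched as follows. The notation is an assumption made for this illustration: $\delta_1, \ldots, \delta_n$ are the independent Bernoulli variables, $p_i$ is the probability of success of $\delta_i$, $S$ is their sum, and $s > 0$ is arbitrary. The exact form of the bound in the lemma is not reproduced; only the standard moment computation and the Chebyshev step are shown.

```latex
% Assumed notation (not taken from the paper): \delta_i are independent
% Bernoulli variables, p_i = P(\delta_i = 1), S is their sum, s > 0.
\begin{align*}
S &= \sum_{i=1}^{n} \delta_i, \qquad
\mathbb{E} S = \sum_{i=1}^{n} p_i, \qquad
\operatorname{Var} S = \sum_{i=1}^{n} p_i (1 - p_i) \le \sum_{i=1}^{n} p_i, \\
\mathbb{P}\bigl( |S - \mathbb{E} S| \ge s \bigr)
  &\le \frac{\operatorname{Var} S}{s^{2}}
  \le \frac{1}{s^{2}} \sum_{i=1}^{n} p_i .
\end{align*}
```

The variance is additive because the variables are independent, and the bound on the variance by the expectation is what makes the estimate useful when the expected number of successes is small.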
Proof of Theorem 3.1. We start with the first bound. Take any positive $q < q_F\big(\frac{k-1/2}{n}\big)$. By definition of the quantile, we have To estimate from below it is enough to show that the set of indices corresponding to “small” ’s has cardinality at most .
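The reduction to counting indices rests on an elementary fact: the k-th smallest element of a sequence exceeds a threshold exactly when at most k-1 of its elements are at or below that threshold. The following Python sketch checks this equivalence on random data. The helper names and the use of uniform samples are assumptions made for illustration only, and the specific thresholds used in the proof are not reproduced.

```python
import random


def kth_smallest(xs, k):
    """k-th smallest element of xs (k is 1-based)."""
    return sorted(xs)[k - 1]


def count_at_most(xs, q):
    """Number of indices i with xs[i] <= q (the 'small' coordinates)."""
    return sum(1 for x in xs if x <= q)


rng = random.Random(1)  # fixed seed for reproducibility
n = 8
checks = 0
for _ in range(1000):
    xs = [rng.random() for _ in range(n)]
    q = rng.random()
    for k in range(1, n + 1):
        # k-th smallest exceeds q  <=>  at most k - 1 coordinates are <= q
        assert (kth_smallest(xs, k) > q) == (count_at_most(xs, q) <= k - 1)
        checks += 1

print(checks)
```

Since the count of small coordinates is a sum of independent Bernoulli variables when the coordinates are independent, this equivalence is what allows tail bounds for such sums to be applied to order statistics.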
Fix such that and put , . Further, set
[TABLE]
We want to estimate the number of indices corresponding to “small” . Denote
[TABLE]
Applying (3) to , , we get that . Therefore, if then
[TABLE]
If then, applying Lemma 3.3, we get
[TABLE]
Thus, in both cases we have
[TABLE]
Next, we estimate the number of indices corresponding to “small” ’s. If , then we have
[TABLE]
and from (5) we obtain
[TABLE]
Now, assume that . Set . Applying (4) to , , we get . Note that . Therefore, by Lemma 3.3 we obtain
[TABLE]
Combining the last relation with (5) and using that , we obtain
[TABLE]
where in the last inequality we used the assumption and the identity . This proves
[TABLE]
Finally, by the choice of we have , which implies the first part of the theorem.
The second part is similar. To make the comparison with the first part of the proof straightforward, we use the same letters for the corresponding sets and numbers, adding a bar. Let . By definition, we have To estimate from above we will show that the set of indices corresponding to “small” typically has cardinality at least . Fix such that , and set , . Further, let
[TABLE]
Let us bound the number of indices corresponding to “small” . Denote
[TABLE]
Assume that . Applying Lemma 3.3, we get
[TABLE]
where we used the estimate , which follows from (3). Thus, in both cases and we have
[TABLE]
Next, we estimate the number of indices corresponding to “small” ’s. Fix
[TABLE]
If , then In this situation we have
[TABLE]
if and only if for all . Note also that for every one has
[TABLE]
This and independence of ’s imply
[TABLE]
Together with (6), it gives
[TABLE]
It remains to consider the case . Set
[TABLE]
Applying (4) to , , we get that . Note that . Therefore, by Lemma 3.3, we obtain
[TABLE]
Combining this with (6), we obtain
[TABLE]
Since and in view of the definitions of and , in both cases and one has
[TABLE]
Note that , thus the last estimate implies
[TABLE]
Finally, observe that by the choice of we have , which implies the second estimate in the theorem.
References
- [AN] B. C. Arnold, N. Balakrishnan, Relations, Bounds and Approximations for Order Statistics, Lecture Notes in Statistics, 53, Berlin: Springer-Verlag (1989).
- [BC] N. Balakrishnan, A. C. Cohen, Order Statistics and Inference, New York, NY: Academic Press (1991).
- [DN] H. A. David, H. N. Nagaraja, Order Statistics, 3rd ed., Wiley Series in Probability and Statistics, Chichester: John Wiley & Sons (2003).
- [GLMP] Y. Gordon, A. E. Litvak, S. Mendelson, A. Pajor, Gaussian averages of interpolated bodies and applications to approximate reconstruction, J. Approx. Theory, 149 (2007), 59–73.
- [GLSW1] Y. Gordon, A. E. Litvak, C. Schütt, E. Werner, Geometry of spaces between zonoids and polytopes, Bull. Sci. Math., 126 (2002), 733–762.
- [GLSW2] Y. Gordon, A. E. Litvak, C. Schütt, E. Werner, Minima of sequences of Gaussian random variables, C. R. Acad. Sci. Paris, Sér. I Math., 340 (2005), 445–448.
- [GLSW3] Y. Gordon, A. E. Litvak, C. Schütt, E. Werner, On the minimum of several random variables, Proc. Amer. Math. Soc., 134 (2006), 3665–3675.
- [GLSW4] Y. Gordon, A. E. Litvak, C. Schütt, E. Werner, Uniform estimates for order statistics and Orlicz functions, Positivity, 16 (2012), 1–28.
