On discrimination between two close distribution tails
Igor Vladimirovich Rodionov

TL;DR
This paper introduces a new goodness-of-fit test based on higher order statistics to distinguish between two similar distribution tails, proving its consistency without assuming maximum domain of attraction.
Contribution
It proposes a novel tail discrimination test that is consistent under various conditions, expanding applicability beyond traditional assumptions.
Findings
Test is consistent for different alternatives
Does not require maximum domain of attraction assumption
Applicable to distinguishing close distribution tails
Abstract
A goodness-of-fit test for discriminating between two distribution tails using higher order statistics is proposed. The consistency of the proposed test is proved for two different alternatives. We do not assume that the corresponding distribution function belongs to a maximum domain of attraction.
Rodionov I. V., Moscow State University, Faculty of Mathematics and Mechanics, and Moscow Institute of Physics and Technology, Faculty of Innovations and High Technologies. E-mail: [email protected]
1 Introduction. Main result.
Statistics often deals with discrimination between close distributions based on censored or truncated data, in particular in high-risk insurance and reliability problems. The situation in which one observes data exceeding a predetermined threshold is well studied; see [1], [2], [3] and references therein. On the other hand, the statistics of extremes says that only higher order statistics should be used to discriminate between close distribution tails, while moderate sample values can be modeled with standard statistical tools. In particular, such an approach for distributions from the Gumbel maximum domain of attraction (for the definitions see [4]) is considered in [5], [6], [7]. In addition, any estimators of the extreme value indices and (see [8]) can also be used to discriminate between distribution tails. Note that we do not assume that the corresponding distribution function belongs to a maximum domain of attraction.
Definition 1
The distribution functions and are said to satisfy the condition if for some and
[TABLE]
Denote by the class of continuous distribution functions satisfying either or Consider the simple hypothesis and the alternative hypothesis where is continuous. Notice that if distribution functions satisfy either or for some then it holds for all So denote
[TABLE]
Denote by the class of continuous distribution functions satisfying either or with and consider another alternative hypothesis Let be i.i.d. random variables with a common distribution function . Denote by the order statistics for them. Introduce the Hill-like statistics
[TABLE]
which we use to discriminate between the two hypotheses introduced above when the higher order statistics are known. Note that if is the Pareto distribution function with parameter , then
[TABLE]
where is the Hill estimator of If, furthermore, belongs to the Fréchet max-domain of attraction, then behaves asymptotically as that is, their ratio tends to one as We will show that the distributions of when either or is fulfilled are different, which gives a statistical test for discriminating between the hypotheses. The following two results describe the behavior of as with provided or is fulfilled.
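For reference, the classical Hill estimator mentioned above can be sketched as follows. This is the textbook estimator, not the Hill-like statistic of this paper, whose displayed formula is not reproduced in this text. The function name `hill_estimator`, the sample size, and the choice of `k` are illustrative assumptions.

```python
import math
import random


def hill_estimator(sample, k):
    """Classical Hill estimator from the k largest observations.

    Averages log X_(n-i) - log X_(n-k) over i = 0, ..., k-1, where
    X_(1) <= ... <= X_(n) are the order statistics of the sample.
    """
    n = len(sample)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    ordered = sorted(sample)
    threshold = ordered[n - k - 1]  # X_(n-k), the (k+1)-th largest value
    if threshold <= 0:
        raise ValueError("the k+1 largest observations must be positive")
    top = ordered[n - k:]  # X_(n-k+1), ..., X_(n)
    return sum(math.log(x / threshold) for x in top) / k


# Illustrative check on simulated Pareto data with tail index alpha = 2,
# for which the Hill estimator targets 1 / alpha = 0.5.
rng = random.Random(0)
alpha = 2.0
pareto_sample = [rng.paretovariate(alpha) for _ in range(20000)]
estimate = hill_estimator(pareto_sample, k=2000)
```

On exact Pareto data the estimate should be close to `1 / alpha` for any admissible `k`; on other heavy-tailed data the choice of `k` trades bias against variance.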
Theorem 1
If holds then
[TABLE]
where is standard normal random variable, i.e.
This theorem gives an obvious goodness-of-fit test for the tail of Besides, the following result provides some information about the consistency of this test. Assume that does not hold and is equal to which is different from Denote , the right endpoint of , that is, Assume that and any have the same right endpoint (on how to discriminate between distributions with different endpoints, see [10], [4]). Further consider otherwise a change of variables gives this assumption. The following theorem shows the consistency of the proposed test.
Theorem 2
(i) If holds then
[TABLE]
provided as
(ii) If holds then under the same conditions
[TABLE]
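A minimal sketch of how a test of this kind could be run in practice is given below. Since the displayed formulas are not reproduced in this text, the form of the statistic is an assumption: it is taken to be log S0(X_(n-k)) minus the average of log S0(X_(n-i)) over i = 0, ..., k-1, with S0 = 1 - F0, and sqrt(k) times its deviation from 1 is compared with standard normal quantiles. The names `tail_test` and `log_survival0`, the sample sizes, and the value of `k` are hypothetical.

```python
import math
import random
from statistics import NormalDist


def tail_test(sample, k, log_survival0, level=0.05):
    """Two-sided normal test based on the k largest observations.

    ASSUMED statistic (not taken verbatim from the paper):
        R = log S0(X_(n-k)) - (1/k) * sum_{i=0}^{k-1} log S0(X_(n-i)),
    where S0 = 1 - F0 is the hypothesized survival function.
    Returns the standardized value sqrt(k) * (R - 1) and the decision.
    """
    n = len(sample)
    if not 0 < k < n:
        raise ValueError("k must satisfy 0 < k < n")
    ordered = sorted(sample)
    log_s_threshold = log_survival0(ordered[n - k - 1])  # at X_(n-k)
    top = ordered[n - k:]  # the k largest observations
    r = log_s_threshold - sum(log_survival0(x) for x in top) / k
    z = math.sqrt(k) * (r - 1.0)
    critical = NormalDist().inv_cdf(1.0 - level / 2.0)
    return z, abs(z) > critical


def log_survival_exp(x):
    """log(1 - F0(x)) for the standard exponential null distribution."""
    return -x


rng = random.Random(1)
null_sample = [rng.expovariate(1.0) for _ in range(5000)]
heavy_sample = [rng.expovariate(0.5) for _ in range(5000)]  # heavier tail

z_null, reject_null = tail_test(null_sample, 200, log_survival_exp)
z_heavy, reject_heavy = tail_test(heavy_sample, 200, log_survival_exp)
```

With the heavier-tailed sample the standardized value is large and positive, so the null tail is rejected; under the null the value should behave roughly like a standard normal draw.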
The considered test makes it possible to discriminate, for example, between two normal distributions with different variances, but condition (1) must be weakened to discriminate between two normal distributions with the same variance and different means. Weakening condition (1), however, imposes some conditions on the behavior of the sequence
Definition 2
The distribution functions and are said to satisfy the condition if for some and
[TABLE]
Denote by the class of continuous distribution functions satisfying either or and the following condition: for some
[TABLE]
Note that if distribution functions satisfy either or for some then it holds for all Denote
[TABLE]
Denote by the class of continuous distribution functions satisfying (3) and either or with As before, consider the simple hypothesis and two alternative hypotheses with continuous
Theorem 3
(i) If holds then
[TABLE]
provided for some as
(ii) If holds then under the same conditions
[TABLE]
2 Auxiliary results and proofs.
2.1 Auxiliary results.
Since depends on the higher order statistics we cannot immediately use independence of the random variables Therefore consider the conditional distribution of given applying the following lemma.
Lemma 1
([4]) Let be i.i.d. random variables with common distribution function and let be the th order statistics. For any , the conditional joint distribution of given is equal to the (unconditional) joint distribution of the corresponding set of order statistics for i.i.d. random variables having the distribution function
[TABLE]
We call the tail distribution function linked with the distribution function Consider two continuous distribution functions and and a random variable with distribution function where is some parameter. Let
[TABLE]
Clearly, for all
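The tail distribution function of Lemma 1 can be illustrated by a small sampling sketch. It assumes the standard form F_q(x) = (F(x) - F(q)) / (1 - F(q)) for x >= q, equivalently a survival function S(x) / S(q); the display itself is not reproduced in this text, so this form is an assumption. The helper names and the exponential example are hypothetical.

```python
import math
import random


def sample_tail(survival, inverse_survival, q, size, rng):
    """Inverse-transform sampling from the tail distribution above q.

    ASSUMED form: the tail law has survival function S(x) / S(q), x >= q,
    so a draw is S^{-1}(S(q) * V) with V uniform on (0, 1].
    """
    s_q = survival(q)
    # 1 - rng.random() lies in (0, 1], so the argument stays positive.
    return [inverse_survival(s_q * (1.0 - rng.random())) for _ in range(size)]


def exp_survival(x):
    """Survival function of the standard exponential distribution."""
    return math.exp(-x)


def exp_inverse_survival(p):
    """Inverse of exp_survival on (0, 1]."""
    return -math.log(p)


rng = random.Random(3)
q = 2.0
tail_sample = sample_tail(exp_survival, exp_inverse_survival, q, 20000, rng)
mean_excess = sum(x - q for x in tail_sample) / len(tail_sample)
```

For the exponential distribution the excesses over `q` are again standard exponential, so `mean_excess` should be close to 1 whatever the threshold.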
The crucial point in the proof of Theorem 2 is the study of the asymptotic behavior of
Proposition 1
Let and be tail distribution functions of and respectively. Then
(i)
If for some , and any , then is standard exponential.
(ii)
for any if and only if is stochastically smaller than a standard exponential random variable.
for any if and only if is stochastically larger than a standard exponential random variable.
(iii)
for any and some if and only if is a nonincreasing function as
2.2 Proof of Proposition 1.
(i) Let for all then we have for the distribution function of ,
[TABLE]
[TABLE]
Furthermore, for the same ,
[TABLE]
(ii) Now assume that for all and some , Then from (4), since for all it follows that
[TABLE]
[TABLE]
Conversely, assume that is stochastically smaller than a standard exponential random variable, that is, for all With (4) we get that
[TABLE]
[TABLE]
[TABLE]
Denote and Since and we have,
[TABLE]
Further, since then
[TABLE]
This observation completes the proof since The proof of the second assertion is similar.
(iii) We have,
[TABLE]
[TABLE]
2.3 Proof of Theorem 1.
Under the conditions of Theorem 1, is uniformly distributed on , that is, hence is a standard exponential random variable. It follows from Rényi's representation (see [4]) that
[TABLE]
where are independent standard exponential variables. Therefore the distribution of the left-hand side does not depend on and
[TABLE]
where are the th order statistics of the sample Finally we have,
[TABLE]
and the assertion follows from the Central Limit Theorem.
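The role of Rényi's representation in this argument can be checked numerically. The sketch below builds exponential order statistics from the representation E_(i) = sum over j <= i of E*_j / (n - j + 1), with E*_j independent standard exponentials, and confirms by simulation that the mean excess of the k largest values over the (k+1)-th largest behaves like the mean of k independent standard exponentials. The function names and the simulation sizes are illustrative assumptions.

```python
import random


def exponential_order_statistics(n, rng):
    """Order statistics E_(1) <= ... <= E_(n) of n standard exponentials,
    generated via the Renyi representation
        E_(i) = sum_{j=1}^{i} E*_j / (n - j + 1).
    """
    total = 0.0
    ordered = []
    for j in range(1, n + 1):
        total += rng.expovariate(1.0) / (n - j + 1)
        ordered.append(total)
    return ordered


def mean_top_excess(ordered, k):
    """Average of E_(n-i) - E_(n-k) over i = 0, ..., k-1."""
    n = len(ordered)
    threshold = ordered[n - k - 1]  # E_(n-k)
    return sum(x - threshold for x in ordered[n - k:]) / k


rng = random.Random(2)
n, k, repetitions = 500, 50, 2000
excess_means = [
    mean_top_excess(exponential_order_statistics(n, rng), k)
    for _ in range(repetitions)
]
average = sum(excess_means) / repetitions
variance = sum((m - average) ** 2 for m in excess_means) / (repetitions - 1)
```

The simulated `average` should be near 1 and `variance` near `1 / k`, which is the mean and variance of an average of `k` standard exponentials and is what makes the Central Limit Theorem applicable after centering and scaling by the square root of `k`.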
2.4 Proof of Theorem 2.
We first prove (i). The steps of the proof are similar to the corresponding steps in [6] and [7]. Consider the asymptotic behavior of as Denote
[TABLE]
where are i.i.d. random variables introduced in Lemma 1 with the distribution function
[TABLE]
Taking and we have, . Notice that, in view of Lemma 1, the joint distribution of order statistics of the sample is equal to the joint conditional distribution of order statistics of given where
[TABLE]
Clearly,
[TABLE]
So the conditional distribution of given is equal to the distribution of Further, the distribution functions and satisfy or First suppose that the condition holds for some and Since a.s., we may consider the case only. Proposition 1 (iii) implies that
[TABLE]
With (5), we get that,
[TABLE]
[TABLE]
hence is stochastically larger than a random variable write Further, let be i.i.d. random variables with distribution function then
[TABLE]
Since (6) holds for all and a.s. as , we have under the conditions of Theorem 2, that
[TABLE]
It follows from the Lindeberg-Feller theorem that
[TABLE]
therefore
[TABLE]
Finally, with (7), we have,
[TABLE]
If the condition holds, then
[TABLE]
and the proof is similar. The second assertion easily follows from (7) and (8).
2.5 Proof of Theorem 3.
First we prove (i). Denote In the notation of the proof of Theorem 2, we find the distribution of First assume that holds. With (5) and Proposition 1 (iii) we have,
[TABLE]
[TABLE]
For
[TABLE]
and is the distribution function. Hence,
[TABLE]
Further, let be i.i.d. random variables with this distribution function. Therefore, as in the proof of Theorem 2,
[TABLE]
Clearly,
[TABLE]
so we have,
[TABLE]
Consider now the statistic Denote Since is continuous, are i.i.d. standard uniform random variables and Theorem 2.2.1 in [4] implies that
[TABLE]
Using the delta method (see [11]) for the function we have
[TABLE]
since under the conditions of the theorem
[TABLE]
Further,
[TABLE]
[TABLE]
and (11) implies that the first summand on the right-hand side tends to [math] in probability. Therefore,
[TABLE]
and under the conditions of Theorem 3,
[TABLE]
On the other hand, from (3) it follows that
[TABLE]
as Further, it follows from the law of large numbers for triangular arrays (see [9]) that for any
[TABLE]
This means that the term on the left-hand side is asymptotically smaller in probability than Hence for any given
[TABLE]
and finally,
[TABLE]
If the condition holds, then
[TABLE]
and the proof is the same. The second assertion clearly follows from (9), (10) and (12).
- [1] Dufour R., Maag U. R. Distribution Results for Modified Kolmogorov-Smirnov Statistics for Truncated or Censored Samples. Technometrics, 1978, v. 20, p. 29–32.
- [2] Guilbaud O. Exact Kolmogorov-Type Test for Left-Truncated and/or Right-Censored Data. Journal of American Statistical Association, 1998, v. 83, p. 213–221.
- [3] Chernobai A., Menn C., Rachev S. T., Truck S. Estimation of operational value-at-risk in the presence of minimum collection thresholds. Tech. Rep., University of California, Santa Barbara, Calif, USA, 2005.
- [4] Ferreira A., Haan L. de. Extreme value theory. An introduction. N. Y.: Springer, Springer Series in Operations Research and Financial Engineering, 2006.
- [5] Gardes L., Girard S., Guillou A. Weibull tail-distributions revisited: a new look at some tail estimators. Journal of Statistical Planning and Inference, 2009, v. 141, p. 429–444.
- [6] Rodionov I. V. A discrimination test for tails of Weibull-like distributions. To appear in Probability Theory and its Applications.
- [7] Rodionov I. V. Discrimination of close hypotheses on distribution tails using higher order statistics. To appear in Extremes.
- [8] Haan L. de, Resnick S. Second-order regular variation and rates of convergence in extreme value theory. The Annals of Probability, 1996, v. 24, i. 1, p. 97–124.
