Extreme value theory based confidence intervals for the parameters of a symmetric Lévy-stable distribution
Djamel Meraghni, Louiza Soltane

TL;DR
This paper develops confidence intervals for the parameters of a symmetric Lévy-stable distribution using EVT estimators and assesses their accuracy via simulations.
Contribution
It introduces EVT-based confidence intervals for Lévy-stable parameters and evaluates their performance through simulation studies.
Findings
Confidence intervals are constructed using EVT estimators.
Simulation results demonstrate the intervals' accuracy.
Method provides a new approach for parameter inference in stable distributions.
Abstract
We exploit the asymptotic normality of the extreme value theory (EVT) based estimators of the parameters of a symmetric Lévy-stable distribution to construct confidence intervals. The accuracy of these intervals is evaluated through a simulation study.
Djamel Meraghni*, Louiza Soltane
*Laboratory of Applied Mathematics, Mohamed Khider University, Biskra, Algeria
Keywords: Asymptotic normality; Confidence bounds; Lévy-stable law; Extreme values; Hill Estimator.
MSC 2010 Subject Classification: 60E07; 62G20; 62G32; 62G05
Corresponding author: D. Meraghni
E-mail addresses: [email protected] (D. Meraghni), [email protected] (L. Soltane)
1. Introduction
1.1. Lévy-stable Distributions
The Lévy-stable distribution, also called stable, $\alpha$-stable or stable Paretian, represents a rich class of probability distributions. Introduced in the 1920s by Paul Lévy (Lévy, 1925) while investigating the behavior of normalized sums of independent identically distributed (iid) random variables (rv's), it has received increased attention in recent decades for at least two good reasons. First, it is theoretically supported by the generalized central limit theorem, which states that the $\alpha$-stable law is the only possible limit distribution for properly normalized and centered sums of iid rv's. Second, it allows skewness and fat tails, which makes it suitable for data collected in areas as diverse as finance, hydrology and meteorology. Indeed, a great deal of empirical evidence indicates that these data can be so heavy-tailed that they are poorly described by the widely used Gaussian distribution. In other words, the stable model provides a much better fit for heavy-tailed observation sets than the commonly adopted normal one does.
The extreme value theory (EVT), which has proved to be an excellent tool in risk management, can be applied to estimate the parameters characterizing a stable distribution in order to determine the appropriate model for a given data set. In the sequel, let $\overset{d}{=}$, $\overset{p}{\rightarrow}$ and $\overset{d}{\rightarrow}$ stand for equality in distribution, convergence in probability and convergence in distribution respectively, and let $\mathcal{N}(m,s^{2})$ denote the normal distribution with mean $m$ and variance $s^{2}$.
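A rv $X$ is said to be Lévy-stable if and only if, for every $n\geq 2$, there exist $a_{n}>0$ and $b_{n}\in\mathbb{R}$ such that
$$X_{1}+X_{2}+\cdots+X_{n}\overset{d}{=}a_{n}X+b_{n},$$
where $X_{1},\ldots,X_{n}$ are independent copies of $X$. It is shown that there exists $\alpha\in(0,2]$ such that $a_{n}=n^{1/\alpha}$ (see, e.g., Feller, 1971).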
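Apart from three special cases, a stable rv lacks closed-form expressions for its distribution function (df) and probability density function (pdf). However, it is typically described by its characteristic function, which has many representations. The most famous one is defined, for $t\in\mathbb{R}$, by
$$\varphi(t)=\begin{cases}\exp\left\{-\sigma^{\alpha}|t|^{\alpha}\left(1-i\beta\,\mathrm{sign}(t)\tan\dfrac{\pi\alpha}{2}\right)+i\mu t\right\}, & \alpha\neq 1,\\[2mm] \exp\left\{-\sigma|t|\left(1+i\beta\,\dfrac{2}{\pi}\,\mathrm{sign}(t)\log|t|\right)+i\mu t\right\}, & \alpha=1,\end{cases}$$
where
$$\mathrm{sign}(t)=\begin{cases}1, & t>0,\\ 0, & t=0,\\ -1, & t<0.\end{cases}$$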
As we may see, this family of distributions is characterized by four parameters:
$\alpha\in(0,2]$: stability index, tail exponent or shape parameter.
$\sigma>0$: scale parameter.
$\beta\in[-1,1]$: skewness parameter.
$\mu\in\mathbb{R}$: location parameter.
Using the notation of Samorodnitsky and Taqqu (1994), a rv $X$ with stable distribution will be written as $X\sim S_{\alpha}(\sigma,\beta,\mu)$. The three cases where we have explicit formulas for the pdf are the very popular Gaussian distribution ($\alpha=2$) and the lesser known models of Cauchy ($\alpha=1$, $\beta=0$) and Lévy ($\alpha=1/2$, $\beta=1$). The tail exponent $\alpha$, which is the most important among all four parameters, indicates the rate at which the tails of the distribution taper off. For $\alpha<2$, the $p$th moment of a stable rv is finite if and only if $p<\alpha$, whereas for $\alpha=2$ all the moments exist. In particular, the distribution mean only exists when $1<\alpha\leq 2$ and is equal to the location parameter $\mu$. For $\alpha<2$, the variance is infinite and the distribution tails are asymptotically equivalent to those of a Pareto distribution, i.e., they exhibit a power-law behavior.
1.2. Heavy Tails Property of $S_{\alpha}(\sigma,\beta,\mu)$
In general, the upper and lower tails of a Lévy-stable distribution asymptotically exhibit a Pareto-like behavior, i.e., they fall off like a power function. The rate of decay is governed by the stability index $\alpha$: the smaller $\alpha$, the slower the decay and hence the heavier the distribution tails, as shown in Figure 1.1.
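More precisely, for a rv $X\sim S_{\alpha}(\sigma,\beta,\mu)$ with $0<\alpha<2$, the following result holds (see, e.g., Samorodnitsky and Taqqu, 1994):
$$\lim_{x\rightarrow\infty}x^{\alpha}P(X>x)=C_{\alpha}\frac{1+\beta}{2}\sigma^{\alpha}\quad\text{and}\quad\lim_{x\rightarrow\infty}x^{\alpha}P(X<-x)=C_{\alpha}\frac{1-\beta}{2}\sigma^{\alpha},$$
where
$$C_{\alpha}:=\left(\int_{0}^{\infty}x^{-\alpha}\sin x\,dx\right)^{-1}=\begin{cases}\dfrac{1-\alpha}{\Gamma(2-\alpha)\cos(\pi\alpha/2)}, & \alpha\neq 1,\\[2mm] \dfrac{2}{\pi}, & \alpha=1,\end{cases}$$
with $\Gamma$ being the gamma function defined, for $t>0$, by $\Gamma(t):=\int_{0}^{\infty}x^{t-1}e^{-x}dx$. From these limits we get what is specifically called the tail balance conditions. That is, we have, as $x\rightarrow\infty$,
$$\frac{P(X>x)}{P(|X|>x)}\rightarrow\frac{1+\beta}{2}\quad\text{and}\quad\frac{P(X<-x)}{P(|X|>x)}\rightarrow\frac{1-\beta}{2}.$$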
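As a small numerical illustration of the tail limits, the sketch below evaluates the tail constant in its standard closed form, $C_{\alpha}=(1-\alpha)/(\Gamma(2-\alpha)\cos(\pi\alpha/2))$ for $\alpha\neq 1$ and $C_{1}=2/\pi$, together with the resulting first-order tail approximations. The function names `tail_constant` and `tail_probabilities` are hypothetical and chosen only for this example.

```python
import math


def tail_constant(alpha: float) -> float:
    """C_alpha such that x^alpha * P(X > x) -> C_alpha * (1 + beta)/2 * sigma^alpha."""
    if not 0.0 < alpha < 2.0:
        raise ValueError("the power-law tail requires 0 < alpha < 2")
    if alpha == 1.0:
        return 2.0 / math.pi
    return (1.0 - alpha) / (math.gamma(2.0 - alpha) * math.cos(math.pi * alpha / 2.0))


def tail_probabilities(x: float, alpha: float, sigma: float, beta: float):
    """First-order approximations of P(X > x) and P(X < -x) for large x."""
    base = tail_constant(alpha) * sigma ** alpha * x ** (-alpha)
    return base * (1.0 + beta) / 2.0, base * (1.0 - beta) / 2.0


# In the symmetric case (beta = 0) both tails carry the same mass.
upper, lower = tail_probabilities(x=50.0, alpha=1.5, sigma=2.0, beta=0.0)
```

These are asymptotic approximations only; they say nothing about the accuracy at moderate values of $x$.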
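Let $F$ and $G$ denote the df's of $X$ and $Z:=|X|$ respectively. It is obvious that $F$ and $G$ are related by
$$1-G(x)=1-F(x)+F(-x),\quad x>0.$$
From this relation and the tail limits of $X$, we get that the distribution tail of $Z$ satisfies
$$1-G(x)\sim C_{\alpha}\sigma^{\alpha}x^{-\alpha},\quad\text{as }x\rightarrow\infty,$$
where $C_{\alpha}$ is the constant of the tail limits, and
$$\lim_{t\rightarrow\infty}\frac{1-G(tx)}{1-G(t)}=x^{-\alpha},\quad\text{for any }x>0.$$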
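The latter means that $1-G$ is regularly varying at infinity with index $-\alpha$. For full details on regular variation, see, for instance, Appendix B in de Haan and Ferreira (2006). From Gnedenko (1943), this relation is equivalent to saying that $G$ is in the Fréchet maximum domain of attraction. More precisely, for a sample $(Z_{1},\ldots,Z_{n})$ from the rv $Z$, we have
$$\lim_{n\rightarrow\infty}P\left(u_{n}^{-1}\max(Z_{1},\ldots,Z_{n})\leq x\right)=\Phi_{\alpha}(x),\quad x\in\mathbb{R},\quad\text{where }u_{n}:=G^{\leftarrow}(1-1/n),$$
$G^{\leftarrow}(s):=\inf\{x:G(x)\geq s\}$ is the generalized inverse or quantile function of $G$ and
$$\Phi_{\alpha}(x):=\begin{cases}\exp(-x^{-\alpha}), & x>0,\\ 0, & x\leq 0.\end{cases}$$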
For further details and a complete description of this class of distributions, we refer to the textbooks of Feller (1971), Zolotarev (1986), Samorodnitsky and Taqqu (1994) and Nolan (2001). In addition, some very useful computer programs are available, such as "STABLE", "Xplore" and the package "stabledist" of the statistical software R (Ihaka and Gentleman, 1996), specially developed for numerical purposes (computing stable df's and pdf's, generating stable rv's, estimating stable parameters, etc.).
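For readers without access to these programs, symmetric stable variates can also be generated directly. The sketch below uses the Chambers-Mallows-Stuck representation restricted to $\beta=0$. This technique is brought in here for illustration only (the text does not say which generator the cited packages use), and the function name `symmetric_stable` is hypothetical.

```python
import math
import random


def symmetric_stable(alpha, sigma=1.0, mu=0.0, rng=random):
    """One draw from S_alpha(sigma, 0, mu) via the Chambers-Mallows-Stuck method."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha must lie in (0, 2]")
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)   # uniform angle
    w = rng.expovariate(1.0)                         # standard exponential
    while w == 0.0:                                  # guard against a zero draw
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        x = math.tan(u)                              # standard Cauchy
    else:
        x = (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
             * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))
    return sigma * x + mu


rng = random.Random(2024)
draws = [symmetric_stable(1.5, sigma=2.0, mu=3.0, rng=rng) for _ in range(1000)]
```

With $\alpha=2$ the recipe returns a Gaussian variate with variance $2\sigma^{2}$, which is a convenient sanity check of the parametrization.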
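In this work, we concentrate on the case where $\beta=0$, that is, when the distribution is symmetric about $\mu$. In this case, the characteristic function and the tail balance conditions respectively reduce to the simpler forms
$$\varphi(t)=\exp\left\{-\sigma^{\alpha}|t|^{\alpha}+i\mu t\right\},\quad t\in\mathbb{R},$$
and
$$\lim_{x\rightarrow\infty}\frac{P(X>x)}{P(|X|>x)}=\lim_{x\rightarrow\infty}\frac{P(X<-x)}{P(|X|>x)}=\frac{1}{2}.$$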
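The symmetric characteristic function, $\varphi(t)=\exp\{-\sigma^{\alpha}|t|^{\alpha}+i\mu t\}$, makes the defining stability property easy to verify numerically: the sum of $n$ independent copies has characteristic function $\varphi(t)^{n}$, which should coincide with that of $a_{n}X+b_{n}$ for $a_{n}=n^{1/\alpha}$. The sketch below checks this identity. The centering $b_{n}=\mu(n-n^{1/\alpha})$ is derived here for the symmetric case and the function names are hypothetical.

```python
import cmath


def symmetric_stable_cf(t, alpha, sigma=1.0, mu=0.0):
    """Characteristic function of S_alpha(sigma, 0, mu) evaluated at t."""
    return cmath.exp(-(sigma * abs(t)) ** alpha + 1j * mu * t)


def stability_gap(t, n, alpha, sigma=1.0, mu=0.0):
    """|phi(t)^n - E exp(i t (a_n X + b_n))| with a_n = n^(1/alpha)."""
    a_n = n ** (1.0 / alpha)
    b_n = mu * (n - a_n)          # centering constant in the symmetric case
    lhs = symmetric_stable_cf(t, alpha, sigma, mu) ** n
    rhs = cmath.exp(1j * b_n * t) * symmetric_stable_cf(a_n * t, alpha, sigma, mu)
    return abs(lhs - rhs)


gap = stability_gap(t=0.7, n=5, alpha=1.5, sigma=2.0, mu=0.3)
```

The gap should vanish up to floating-point rounding for any admissible choice of the arguments.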
The rest of the paper is organized as follows. Section 2 is devoted to a brief reminder of EVT-based estimators of the stable parameters. In Section 3, we use the asymptotic normality of the estimators to build confidence intervals for the parameters $\alpha$, $\mu$ and $\sigma$. Finally, the accuracy of such intervals is investigated in a simulation study in Section 4.
2. EVT-based estimation
The lack of explicit forms for the df and pdf severely hampers the estimation of the distribution parameters. Nevertheless, several numerical estimation procedures, based on the sample quantiles, the sample characteristic function and maximum likelihood approaches, have been proposed in the literature. In a comparative study, Ojeda (2001) notices that maximum likelihood based methods are the most accurate but also the slowest. On the other hand, the nature of the Lévy-stable distribution tails suggests that EVT could play a major role in estimating its parameters. EVT is a classical topic in probability theory and mathematical statistics, developed for estimating the probability of occurrence of rare events. It allows one to extrapolate the behavior of distribution tails from the largest observed data. EVT techniques have proven to be very useful where estimation of tail-related quantities, such as the extreme value index, high quantiles, small exceedance probabilities and the mean excess function, is needed. The domains of application of EVT include insurance (premium computation, large losses), finance (asset returns, exchange rates), hydrology (floods, drought), meteorology (extreme weather conditions), ecology (pollution peaks), telecommunications (network traffic) and physics (nuclear reactions). The EVT-based estimation approach has at least three advantages. It focuses only on tail behavior and does not assume a parametric form for the entire distribution. It provides estimators in explicit form, making the computation of estimates easier and more direct. Finally, it produces estimators which enjoy the asymptotic normality property, leading to the construction of confidence bounds for the unknown parameters. A wide variety of textbooks may be consulted for a review of this topic and its multiple applications. We can cite, for instance, de Haan and Ferreira (2006), Embrechts et al. (1997), Reiss and Thomas (1997) and Beirlant et al. (2004).
2.1. Estimating the Stability Index
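The characteristic exponent $\alpha$ is the main parameter as it governs the behavior of the distribution tails. Many estimators of $\alpha$ have been proposed via the EVT approach, among which the most popular is the one introduced by Hill (1975):
$$\widehat{\alpha}=\widehat{\alpha}_{n}(k):=\left(\frac{1}{k}\sum_{i=1}^{k}\log Z_{n-i+1,n}-\log Z_{n-k,n}\right)^{-1},$$
where $Z_{1,n}\leq\cdots\leq Z_{n,n}$ are the order statistics pertaining to a sample $(Z_{1},\ldots,Z_{n})$ from the rv $Z=|X|$ and $k=k_{n}$ is an integer sequence such that
$$1<k<n,\quad k\rightarrow\infty\quad\text{and}\quad k/n\rightarrow 0\quad\text{as }n\rightarrow\infty.$$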
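A minimal sketch of Hill's estimator of the stability index follows, written for a sample of positive values (for instance the absolute values of the observations) and using the $k$ largest order statistics above the threshold $Z_{n-k,n}$. The function name `hill_alpha` is hypothetical, and the example data are exact Pareto draws rather than stable ones, chosen only because their tail index is known.

```python
import math
import random


def hill_alpha(sample, k):
    """Hill estimate of the tail index from the k largest values of the sample."""
    z = sorted(sample)
    n = len(z)
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    threshold = z[n - k - 1]                   # Z_{n-k,n} in 1-based notation
    if threshold <= 0.0:
        raise ValueError("the threshold order statistic must be positive")
    # z[n - i] is Z_{n-i+1,n}, the i-th largest observation.
    mean_log_excess = sum(math.log(z[n - i]) - math.log(threshold)
                          for i in range(1, k + 1)) / k
    return 1.0 / mean_log_excess


rng = random.Random(0)
pareto_sample = [rng.paretovariate(1.5) for _ in range(5000)]
alpha_hat = hill_alpha(pareto_sample, k=500)
```

The estimate depends strongly on `k`, so in practice it is worth recomputing it over a range of values before settling on one.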
The consistency of $\widehat{\alpha}$ is proved in Mason (1982), while its almost sure convergence is established in Necir (2006a). The asymptotic normality of $\widehat{\alpha}$ (and of other related estimators) requires an additional assumption, known as the second-order condition of regular variation (see de Haan and Stadtmüller, 1996), which specifies the rate of convergence in the regular variation relation of the tail. That is, we assume that there exist a constant $\rho$, called the second-order parameter, and a function $A$ tending to zero and not changing sign near infinity, such that for any $x>0$ we have
[TABLE]
Note that, in the stable case considered here, the second-order condition is fulfilled. Indeed, using the expansion (to the second order) given at the top of page 95 in Zolotarev (1986) yields that $G$ belongs to Hall's class of heavy-tailed distributions (Hall, 1982), which in turn implies that the second-order condition holds. A df $G$ is said to belong to Hall's class if
$$1-G(x)=cx^{-\alpha}\left(1+dx^{-\delta}+o(x^{-\delta})\right),\quad\text{as }x\rightarrow\infty,$$
where $c>0$, $d\in\mathbb{R}$ and $\delta>0$. Hall's class, which is a subset of the more general family of models with second-order regularly varying tails, includes distributions (Burr, Fréchet, etc.) that are most commonly used in extreme event modelling. Among the works on the asymptotic normality of $\widehat{\alpha}$, we can cite that of Peng (1998), who proved that, if the second-order condition holds, then for an integer sequence $k$ satisfying the conditions above together with a further rate condition whose limit is finite, we have
[TABLE]
Weron (2001) discussed the performance of Hill's estimator and noted that for $\alpha$ well below $2$ the estimation is quite reasonable, but as $\alpha$ approaches $2$ there is a significant overestimation when considering samples of typical size (for an illustration, see Figure 4.2 and Table 4.1). For such values of $\alpha$, a very large number of observations (a million or more) is needed in order to obtain acceptable estimates and avoid misleading inference on the stability index, because the true heavy-tailed nature of the distribution is visible only for extremely large datasets. Fortunately, datasets of this kind are available nowadays, and current technology makes their storage and processing possible.
The behavior of Hill's estimator (and therefore that of the EVT-based estimators) is affected by the number $k$ of upper order statistics used in the computation of the estimates. One needs to locate where the distribution tails really begin, because using too many observations results in a large bias whereas using too few leads to a substantial variance. Consequently, one has to make a trade-off between bias and variance in order to get an accurate estimate. To this end, it seems reasonable to minimize the mean squared error, which balances the bias and variance components. In addition, there exist several algorithms and data-adaptive procedures for selecting the optimal sample fraction of extreme values (see, for instance, Cheng and Peng, 2001, Danielsson et al., 2001, Ferreira and de Vries, 2004 and Neves and Fraga Alves, 2004).
2.2. Estimating the Location Parameter
The empirical mean $\overline{X}$, which is the natural estimator of the mean, is, by virtue of the central limit theorem, asymptotically normal provided that the second moment is finite. However, for $X\sim S_{\alpha}(\sigma,\beta,\mu)$ with $1<\alpha<2$, the latter theorem is not applicable because the variance of $X$ is infinite. Therefore, the asymptotic normality of the sample mean is not established. To solve this problem, Peng (2001) proposed an asymptotically normal estimator of $\mu$ based on the order statistics $X_{1,n}\leq\cdots\leq X_{n,n}$ associated with a sample $(X_{1},\ldots,X_{n})$ from $X$, as follows:
[TABLE]
where
[TABLE]
[TABLE]
with
[TABLE]
and
[TABLE]
being consistent estimators of $\alpha$ as well. The strong limiting behavior of $\widehat{\mu}$ is studied in Necir (2006b) when constructing a nonparametric sequential test with power $1$ for $\mu$. For the asymptotic normality of $\widehat{\mu}$, we notice that, by the expansion (to the second order) and the relationship between the tails of $X$, respectively given on pages 95 and 65 of Zolotarev (1986), both tails of $X$ satisfy the definition of Hall's model. Peng (2001) proved that, with a suitable choice of $k$,
[TABLE]
where
[TABLE]
and
[TABLE]
It is shown in Peng (2001) that, as $n\rightarrow\infty$,
[TABLE]
which in our case may be rewritten into
[TABLE]
2.3. Estimating the Scale Parameter
By combining the tail relations above with some approximations, Meraghni and Necir (2007) provided a consistent estimator $\widehat{\sigma}$ of the scale parameter $\sigma$ as follows:
[TABLE]
and proved that, with an adequate sequence $k$,
[TABLE]
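To convey the idea behind a tail-based scale estimator, the sketch below inverts the first-order tail approximation $1-G(x)\approx C_{\alpha}\sigma^{\alpha}x^{-\alpha}$ at the threshold $Z_{n-k,n}$, where the empirical tail probability is about $k/n$. This gives the plug-in value $Z_{n-k,n}\,(k/(nC_{\widehat{\alpha}}))^{1/\widehat{\alpha}}$. It is an illustrative derivation under that assumption and may differ from the exact estimator of Meraghni and Necir (2007). The names `tail_constant` and `scale_from_tail` are hypothetical.

```python
import math


def tail_constant(alpha):
    """C_alpha of the stable tail approximation, for 0 < alpha < 2."""
    if alpha == 1.0:
        return 2.0 / math.pi
    return (1.0 - alpha) / (math.gamma(2.0 - alpha) * math.cos(math.pi * alpha / 2.0))


def scale_from_tail(z_threshold, k, n, alpha_hat):
    """Plug-in scale value solving k/n = C_alpha * sigma^alpha * z^(-alpha) for sigma."""
    if not 0.0 < alpha_hat < 2.0:
        raise ValueError("alpha_hat must lie in (0, 2)")
    return z_threshold * (k / (n * tail_constant(alpha_hat))) ** (1.0 / alpha_hat)


# Consistency check on noiseless input: build the threshold that the tail
# approximation predicts for sigma = 2 and recover sigma from it.
alpha, sigma, n, k = 1.5, 2.0, 10000, 200
z_nk = sigma * (tail_constant(alpha) * n / k) ** (1.0 / alpha)
sigma_hat = scale_from_tail(z_nk, k, n, alpha)
```

On real data, `z_threshold` would be the order statistic $Z_{n-k,n}$ of the absolute values and `alpha_hat` an estimate such as Hill's, so the result inherits the sampling error of both.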
3. Confidence bounds
Let us fix the confidence level of estimation to be $1-\zeta$, with $0<\zeta<1$, and let $z_{\zeta/2}$ denote the $(1-\zeta/2)$-quantile of the standard Gaussian distribution. The first step in the construction of the confidence intervals is to determine the optimal sample fraction, which we denote by $k^{\ast}$, of extreme observations involved in the computation of the estimates. To this end, we adopt the methodology of Neves and Fraga Alves (2004), who discussed and evaluated the performance of the procedure proposed by Reiss and Thomas (1997). The latter consists in taking as optimal the value of $k$ that minimizes
[TABLE]
where med stands for the median and the weights are governed by a tuning exponent. In other words, we have
[TABLE]
Since we will be interested in a restricted range of values of $\alpha$, we choose the tuning exponent as indicated in Neves and Fraga Alves (2004).
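The selection step can be sketched as follows, assuming the criterion takes the weighted form $\frac{1}{k}\sum_{i\leq k}i^{\theta}\,|\widehat{\alpha}_{i}-\mathrm{med}(\widehat{\alpha}_{1},\ldots,\widehat{\alpha}_{k})|$ applied to the Hill estimates of $\alpha$. Whether the source applies it to $\widehat{\alpha}$ or to its reciprocal cannot be recovered from the text, so this is an assumption. The exponent `theta` and the search bounds `k_min` and `k_max` are hypothetical parameters, and the value of `theta` used in the example is illustrative, not the one chosen in the paper.

```python
import math
import random
import statistics


def hill_path(sample, k_max):
    """Hill estimates of alpha for k = 1, ..., k_max."""
    z = sorted(sample, reverse=True)               # z[0] is the sample maximum
    logs = [math.log(v) for v in z[:k_max + 1]]
    estimates, running = [], 0.0
    for k in range(1, k_max + 1):
        running += logs[k - 1]                     # sum of the k largest log-values
        estimates.append(1.0 / (running / k - logs[k]))
    return estimates


def reiss_thomas_k(sample, theta, k_min, k_max):
    """Return (k, alpha_hat) minimizing the weighted deviation from the median."""
    path = hill_path(sample, k_max)
    best_k, best_value = None, math.inf
    for k in range(k_min, k_max + 1):
        head = path[:k]
        med = statistics.median(head)
        value = sum((i + 1) ** theta * abs(a - med)
                    for i, a in enumerate(head)) / k
        if value < best_value:
            best_k, best_value = k, value
    return best_k, path[best_k - 1]


rng = random.Random(42)
sample = [rng.paretovariate(1.5) for _ in range(2000)]   # positive heavy-tailed data
k_star, alpha_star = reiss_thomas_k(sample, theta=0.3, k_min=20, k_max=400)
```

A lower bound on `k` is imposed because the criterion is trivially zero at $k=1$. The search range is a design choice of this sketch rather than part of the cited procedure.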
The second step is to compute the estimate values $\widehat{\alpha}$, $\widehat{\mu}$ and $\widehat{\sigma}$ which correspond to the optimal number $k^{\ast}$. Note that, for the parameters $\alpha$ and $\sigma$, we only use the top observations in the $Z$-sample, whereas for $\mu$ we use the whole $X$-sample. Finally, we exploit the asymptotic normality results of Section 2 to get asymptotic confidence bounds for $\alpha$, $\mu$ and $\sigma$ respectively. The respective confidence intervals for the parameters $\alpha$, $\mu$ and $\sigma$ are
[TABLE]
[TABLE]
and
[TABLE]
where
[TABLE]
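For the stability index, the construction can be illustrated under the common assumption that $\sqrt{k}(\widehat{\alpha}-\alpha)$ is asymptotically $\mathcal{N}(0,\alpha^{2})$ with negligible asymptotic bias. Replacing $\alpha$ by its estimate in the variance gives the interval $\widehat{\alpha}\,(1\pm z/\sqrt{k})$, where $z$ is the appropriate Gaussian quantile. The sketch below implements only this case. The intervals for $\mu$ and $\sigma$ need the asymptotic variances of their own estimators and are not reproduced. The function name is hypothetical.

```python
import math
from statistics import NormalDist


def alpha_confidence_interval(alpha_hat, k, level=0.95):
    """Asymptotic two-sided interval for alpha, assuming a N(0, alpha^2) limit."""
    if not 0.0 < level < 1.0:
        raise ValueError("level must lie in (0, 1)")
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)   # Gaussian quantile
    half_width = z * alpha_hat / math.sqrt(k)
    return alpha_hat - half_width, alpha_hat + half_width


lower, upper = alpha_confidence_interval(alpha_hat=1.5, k=100, level=0.95)
```

The interval length shrinks like $1/\sqrt{k}$, which is why the choice of the number of upper order statistics matters as much for the intervals as for the point estimates.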
4. Simulation study
We carry out a simulation study, by means of the statistical software R (Ihaka and Gentleman, 1996), to illustrate the finite sample behavior of the three estimators $\widehat{\alpha}$, $\widehat{\mu}$ and $\widehat{\sigma}$ by computing their absolute biases (abs bias) and mean squared errors (mse). We also evaluate the accuracy of the confidence intervals (conf int) through their lengths and coverage probabilities (cov prob). But first, we start by graphically checking Weron's note (Weron, 2001) on Hill's estimator of the stability index. It is noteworthy that each experiment is replicated several times and that the overall results are obtained by averaging the individual results of the repetitions.
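The design of such an experiment can be sketched in a few lines for the stability index alone. The example below is written in Python rather than R, draws symmetric stable samples with the Chambers-Mallows-Stuck method, uses a fixed `k` instead of the data-driven choice, and builds the interval $\widehat{\alpha}\,(1\pm z/\sqrt{k})$. All of these are simplifying assumptions of the sketch, and the sample size, `k` and number of replications are illustrative values, not those of the paper.

```python
import math
import random
from statistics import NormalDist


def symmetric_stable(alpha, rng):
    """One draw from S_alpha(1, 0, 0) via the Chambers-Mallows-Stuck method."""
    u = rng.uniform(-math.pi / 2.0, math.pi / 2.0)
    w = rng.expovariate(1.0)
    while w == 0.0:
        w = rng.expovariate(1.0)
    if alpha == 1.0:
        return math.tan(u)
    return (math.sin(alpha * u) / math.cos(u) ** (1.0 / alpha)
            * (math.cos((1.0 - alpha) * u) / w) ** ((1.0 - alpha) / alpha))


def hill_alpha(sample, k):
    """Hill estimate of alpha from the k largest values of a positive sample."""
    top = sorted(sample, reverse=True)[:k + 1]
    return 1.0 / (sum(math.log(v) for v in top[:k]) / k - math.log(top[k]))


def coverage_experiment(alpha, n, k, replications, level=0.95, seed=0):
    """Monte Carlo abs bias, mse, mean length and coverage for alpha."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1.0 - (1.0 - level) / 2.0)
    estimates, lengths, hits = [], [], 0
    for _ in range(replications):
        sample = [abs(symmetric_stable(alpha, rng)) for _ in range(n)]
        est = hill_alpha(sample, k)
        half = z * est / math.sqrt(k)
        estimates.append(est)
        lengths.append(2.0 * half)
        hits += (est - half <= alpha <= est + half)
    mean_est = sum(estimates) / replications
    return {
        "abs_bias": abs(mean_est - alpha),
        "mse": sum((e - alpha) ** 2 for e in estimates) / replications,
        "length": sum(lengths) / replications,
        "cov_prob": hits / replications,
    }


result = coverage_experiment(alpha=1.2, n=2000, k=100, replications=200, seed=1)
```

Repeating the call with `alpha` closer to $2$ is a quick way to observe the degradation of Hill's estimator discussed in the text.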
We see on the right graph of Figure 4.2, which is based on samples from two stable distributions, that there is no intermediate number $k$ which gives a good estimate of $\alpha$ and that the estimates can even be above the Lévy-stable regime. On the other hand, the left panel shows that, for the smaller stability index, accurate estimates can be obtained for a suitable range of $k$.
For the estimation of the shape parameter $\alpha$, we generate observations of symmetric $\alpha$-stable distributions with several values of the parameters. The results are summarized in Table 4.1, where we note that, as expected, the smaller the parameter values, the better the estimation. The bottom of the table shows that the estimation is very poor for large $\alpha$-values and confirms the graphical conclusion we made about the irrelevance of Hill's estimator, for large stability indices, when built on the basis of datasets of typical sizes. For this reason, only the smaller values of $\alpha$ will be considered thereafter. We gather the simulation results in Table 4.2 for the location parameter $\mu$ and in Table 4.3 for the scale parameter $\sigma$. The former shows that the further $\alpha$ gets from $1$, the better the estimation of $\mu$, while the latter indicates that the estimation of $\sigma$ is not good for the larger values of $\alpha$ but might be considered acceptable for smaller values. It is to be noted that, as regards the estimation of $\mu$, the results are extremely poor when the stability index is very close to $1$. This may be explained by the fact that, in this work, we only consider $\alpha$-values lying between $1$ and $2$, in which case the location parameter is equal to the distribution mean and is estimated as such. When $\alpha$ is less than or equal to $1$, the mean does not exist, and therefore the EVT-based estimation of $\mu$ is very bad when $\alpha$ is near $1$.
- Beirlant, J., Goegebeur, Y., Segers, J. and Teugels, J., 2004. Statistics of extremes: Theory and applications. Wiley.
- Cheng, S. and Peng, L., 2001. Confidence intervals for the tail index. Bernoulli 7, 751-760.
- Danielsson, J., de Haan, L., Peng, L. and de Vries, C.G., 2001. Using a bootstrap method to choose the sample fraction in tail index estimation. J. Multivariate Analysis 76, 226-248.
- Embrechts, P., Klüppelberg, C. and Mikosch, T., 1997. Modelling extremal events for insurance and finance. Springer-Verlag, New York.
- Feller, W., 1971. An introduction to probability theory and its applications, 2nd ed. Wiley, New York.
- Ferreira, A. and de Vries, C.G., 2004. Optimal confidence intervals for the tail index and high quantiles. Tinbergen Institute Discussion Paper 090/2.
- Gnedenko, B.V., 1943. Sur la distribution limite du terme maximum d'une série aléatoire. Annals of Mathematics 44, 423-453.
- de Haan, L. and Stadtmüller, U., 1996. Generalized regular variation of second order. J. Australian Math. Soc. (Series A) 61, 381-395.
