Extreme value statistics for censored data with heavy tails under competing risks
Julien Worms (LM-Versailles), Rym Worms (LAMA)

TL;DR
This paper introduces a novel estimator for the extreme value index in censored data with competing risks, demonstrating its asymptotic normality and finite-sample performance through simulations.
Contribution
It proposes the first estimator based on an Aalen-Johansen integral for extreme value index in this context, addressing heavy tails and censoring.
Findings
Estimator is asymptotically normal.
Performs well in finite-sample simulations.
Enables estimation of extreme quantiles in competing risks.
Abstract
This paper addresses the problem of estimating, in the presence of random censoring as well as competing risks, the extreme value index of the (sub)-distribution function associated to one particular cause, in the heavy-tail case. Asymptotic normality of the proposed estimator (which has the form of an Aalen-Johansen integral, and is the first estimator proposed in this context) is established. A small simulation study exhibits its performances for finite samples. Estimation of extreme quantiles of the cumulative incidence function is also addressed.
Julien Worms (1) & Rym Worms (2), corresponding author
(1) Université Paris-Saclay / Université de Versailles-Saint-Quentin-En-Yvelines
Laboratoire de Mathématiques de Versailles (CNRS UMR 8100),
F-78035 Versailles Cedex, France,
e-mail : [email protected]
(2) Université Paris-Est
Laboratoire d’Analyse et de Mathématiques Appliquées
(CNRS UMR 8050),
UPEMLV, UPEC, F-94010, Créteil, France,
e-mail : [email protected]
*AMS Classification.* Primary 62G32; Secondary 62N02
*Keywords and phrases.* Extreme value index. Tail inference. Random censoring. Competing risks. Aalen-Johansen estimator.
1 Introduction
The study of duration data (lifetime, failure time, re-employment time…) subject to random censoring is a major topic in statistics, with applications in many areas (in the sequel we will, for convenience, talk about lifetimes to refer to these observed durations, but without restricting our scope to lifetime data analysis). In general, the interest lies in obtaining information about the central characteristics of the underlying lifetime distribution (mean lifetime or survival probabilities for instance), often with the objective of comparing results between different conditions under which the lifetime data are acquired. In this work, we address the problem of inference on the (upper) tail of the lifetime distribution, for data subject both to random (right) censoring and competing risks.
Suppose indeed that we are interested in the lifetimes of individuals or items, which are subject to different causes of death or failure, and to random censorship (from the right) as well. We are particularly interested in one of these causes (this main cause will be considered as cause number $k$ thereafter), and we suppose that all causes are exclusive and are likely to be dependent on the others. The censoring time is assumed to be independent of the different causes of death or failure and of the observed lifetime itself. However, since the other causes (different from the $k$-th cause of interest) generally cannot be considered as independent of the main cause, they can in no way be included in the censoring mechanism. This prevents us from relying on the basic independent censoring statistical framework, and we are thus in the presence of what is called a competing risks framework (see Moeschberger and Klein (1995)).
For instance, if a patient is suffering from a very serious disease and starts some treatment, then the final outcome of the treatment can be death due to the main disease, or death due to other causes (nosocomial infection for instance). Censoring can occur due to loss to follow-up or the end of the clinical study. Another example, in a reliability experiment, is that the failure of some mechanical system can be due to the failure of a particular subpart, or component, of the system: since separating the different components for studying the reliability of only one of them is generally not possible, accounting for these different competing causes of failure is necessary. Another field where competing risks often arise is labor economics, for instance in re-employment studies (see Fermanian (2003) for practical examples).
One way of formalising this is to say that we observe a sample of independent couples where
[TABLE]
The i.i.d. samples and , of respective continuous distribution functions and , represent the lifetimes and censoring times of the individuals, and are supposed to be independent. For convenience, we will suppose in this work that they are non-negative. The variables form a discrete sample with values in , and represent the causes of failure or death of the individuals or items. It is important to note that these causes are observed only when the data is uncensored (i.e. when ), therefore we only observe the ’s, not the complete ’s.
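The observation scheme described above can be illustrated with a few lines of code. This is only a sketch: the function name `observe`, the variable names `Z`, `delta`, `xi`, and the choice of coding an unobserved cause as `0` are assumptions made for the example, not notation taken from the paper.

```python
import numpy as np

def observe(lifetimes, censoring_times, causes):
    """Turn latent (lifetime, censoring time, cause) triples into observed data.

    Returns (Z, delta, xi) where Z is the smaller of lifetime and censoring
    time, delta is 1 when the lifetime is observed (uncensored), and xi is the
    cause when delta == 1 and 0 otherwise (0 is a hypothetical "unknown" code).
    """
    lifetimes = np.asarray(lifetimes, dtype=float)
    censoring_times = np.asarray(censoring_times, dtype=float)
    causes = np.asarray(causes, dtype=int)
    Z = np.minimum(lifetimes, censoring_times)
    delta = (lifetimes <= censoring_times).astype(int)
    xi = np.where(delta == 1, causes, 0)  # the cause is hidden when censored
    return Z, delta, xi

# Three individuals: the first is uncensored, the other two are censored.
Z, delta, xi = observe([2.0, 5.0, 1.5], [3.0, 4.0, 1.0], [1, 2, 1])
```

In this toy sample the cause of the second and third individuals exists but is never seen, which is exactly why only the censored version of the cause variable is available to the statistician.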
One way of considering the failure times is to write
[TABLE]
where the variable is a (rather artificial) variable representing the imaginary latent lifetime of the -th individual when the latter is only affected by the -th cause (the other causes being absent). This viewpoint may be interesting in its own right, but we will not keep on considering it in the sequel, one reason being that such variables cannot be realistically considered as independent, and their respective distributions are of no practical use or interpretability (as explained and demonstrated in the competing risks literature, these distributions are in fact not statistically identifiable, see Tsiatis (1975) for example).
The object of interest is the probability that a subject dies or fails after some given time , due to the -th cause, for high values of . This quantity, denoted by
[TABLE]
is related to the so-called cumulative incidence function defined by
[TABLE]
Note that is not equal to , but to , because is only a sub-distribution function. However we have . In the sequel, the notation will be used, for any non-decreasing function .
In this paper, we are interested in investigating the behaviour of for large values of . This amounts to studying extreme values statistically in a context of censored data under competing risks, and will lead us to consider some extreme value index related to , which will be defined in a few lines. Equivalently, the object of interest is the high quantile when is close to [math], which can be interpreted as follows (in the context of lifetimes of individuals or failure times of systems): in the presence of the other competing causes, a given individual (or item) will die (or fail), due to cause after such a time , only with small probability . Nonparametric inference for quantiles of fixed (and therefore not extreme) order, in the competing risks setting, has already been proposed in Peng and Fine (2007).
One way of addressing this problem could be through a parametric point of view (see Crowder (2001) for further methods in the competing risks setting); however, the nonparametric approach is the most common choice when the data present censoring or competing risks. Of course, the standard Kaplan-Meier method for survival analysis does not yield valid results for a particular risk if failures from other causes are treated as censoring times, because the other causes cannot always be considered independent of the particular cause of interest.
The commonly used nonparametric estimator of the cumulative incidence function is the so-called Aalen-Johansen estimator (see Aalen and Johansen (1978), or Geffray (2009) equation ) defined by
[TABLE]
where denotes the standard Kaplan-Meier estimator of (and denotes ), so that we can introduce the following estimator for :
[TABLE]
But if the value considered is so high that only very few (if any) observations (such that ) exceed , then this purely nonparametric approach will lead to very unstable estimations of . This is why a semiparametric approach is desirable, and the one we will consider here is the one inspired by classical extreme value theory.
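A minimal sketch of the Aalen-Johansen estimator follows, written in its inverse-probability-of-censoring form: each uncensored observation of the cause of interest receives the mass 1 / (n × Kaplan-Meier estimate of the censoring survival function just before that observation). The sketch assumes no ties among the observations; the function and variable names are hypothetical and the code is not taken from the paper.

```python
import numpy as np

def censoring_km_left_limits(Z, delta):
    """Kaplan-Meier estimate of the censoring survival function, evaluated
    just before each ordered observation. Assumes no ties in Z."""
    order = np.argsort(Z)
    d = np.asarray(delta)[order]
    n = len(d)
    at_risk = n - np.arange(n)                 # n, n-1, ..., 1
    factors = 1.0 - (1 - d) / at_risk          # the estimate drops only at censored points
    surv = np.cumprod(factors)                 # value at each ordered observation
    left = np.concatenate(([1.0], surv[:-1]))  # value just before each ordered observation
    return order, left

def aalen_johansen_cif(Z, delta, xi, k, t):
    """Estimated probability of failing from cause k before or at time t."""
    order, g_left = censoring_km_left_limits(Z, delta)
    z = np.asarray(Z, dtype=float)[order]
    d = np.asarray(delta)[order]
    c = np.asarray(xi)[order]
    n = len(z)
    jumps = d * (c == k) / (n * g_left)        # mass placed on each ordered observation
    return float(np.sum(jumps[z <= t]))

# Without censoring, the estimator is the empirical proportion of cause-1 failures.
cif_no_censoring = aalen_johansen_cif([1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 1, 1], k=1, t=3.5)
# With one censored point, the later uncensored point receives extra mass.
cif_censored = aalen_johansen_cif([1, 2, 3], [1, 0, 1], [1, 0, 1], k=1, t=10.0)
```

The tail quantity of interest is then obtained by summing the same jumps over the observations exceeding the threshold, which is where the instability mentioned above comes from when very few such observations exist.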
First note that in this paper, we will only consider situations where the underlying distributions and of the variables and are supposed to present power-like tails (also commonly named heavy tails), and we will focus on the evaluation of the order of this tail. Our working hypothesis will be thus that the different functions (for ) as well as belong to the Fréchet maximum domain of attraction. In other words, we assume that they are (see Definition 1 in the Appendix) regularly varying at infinity, with respective negative indices and
[TABLE]
Consequently, and (the survival function of ) are regularly varying (at ) with respective indices and , where and satisfies (these relations are constantly used in this paper).
The estimation of has been already studied in the literature, as it corresponds to the random (right) censoring framework, without competing risks. We can cite Beirlant et al. (2007) and Einmahl et al. (2008), where the authors propose to use consistent estimators of divided by the proportion of non-censored observations in the tail, or Worms and Worms (2014), where two Hill-type estimators are proposed for , based on survival analysis techniques. However, our target here is (for a fixed ) and the point is that there seems to be no way to deduce an estimator of from an estimator of . Note that the useful trick used in Beirlant et al. (2007) and Einmahl et al. (2008) to construct an estimator of does not seem to be extendable to this competing risks setting. To the best of our knowledge, our present paper is the first one addressing the problem of estimating the cause-specific extreme value index .
Considering assumption (1), it is simple to check that, for a given , we have
[TABLE]
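A short sketch of why a relation of this type holds for a regularly varying tail. The notation is assumed for illustration: $F^{(k)}$ is the cumulative incidence function of cause $k$, $\bar{F}^{(k)}$ the corresponding tail function, regularly varying with index $-1/\gamma_k$. The passage to the limit under the integral sign is the usual one for regularly varying functions (for instance via Potter bounds).

```latex
\int_t^\infty \log\frac{u}{t}\, dF^{(k)}(u)
  \;=\; \int_t^\infty \frac{\bar{F}^{(k)}(u)}{u}\, du
  \qquad \text{(integration by parts)},
\\[6pt]
\frac{1}{\bar{F}^{(k)}(t)} \int_t^\infty \log\frac{u}{t}\, dF^{(k)}(u)
  \;=\; \int_1^\infty \frac{\bar{F}^{(k)}(tx)}{\bar{F}^{(k)}(t)}\, \frac{dx}{x}
  \;\xrightarrow[t\to\infty]{}\; \int_1^\infty x^{-1/\gamma_k - 1}\, dx \;=\; \gamma_k .
```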
It is therefore most natural to propose the following (Hill-type) estimator of , for some given threshold value (assumptions on this threshold are detailed in the next section) :
[TABLE]
which can be also written as
[TABLE]
where are the ordered random variables associated to , and and are the censoring indicator and cause number which correspond to the order statistic . It is clear that this estimator is a generalisation of one of the estimators proposed in Worms and Worms (2014), in which the situation (with only one cause of failure/death) was considered. The asymptotic result we prove in the present work is then valid in the situation studied in the latter, where only consistency was proved and a random threshold was used.
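The Hill-type estimator can be sketched as a ratio of two weighted sums over the observations above the threshold, the weights being the Aalen-Johansen jumps. This is a hedged illustration under the assumption of no ties; the function name `hill_competing_risks` and its arguments are hypothetical, and a fixed threshold `t` is supplied by the user.

```python
import numpy as np

def hill_competing_risks(Z, delta, xi, k, t):
    """Hill-type estimate of the cause-k extreme value index above threshold t.

    Each uncensored cause-k observation is weighted by
    1 / (n * KM estimate of the censoring survival just before it).
    """
    Z = np.asarray(Z, dtype=float)
    order = np.argsort(Z)
    z = Z[order]
    d = np.asarray(delta)[order]
    c = np.asarray(xi)[order]
    n = len(z)
    at_risk = n - np.arange(n)
    surv = np.cumprod(1.0 - (1 - d) / at_risk)       # KM of the censoring survival
    g_left = np.concatenate(([1.0], surv[:-1]))      # its left limits
    w = d * (c == k) / (n * g_left)                  # Aalen-Johansen jumps
    tail = z > t
    tail_mass = np.sum(w[tail])                      # estimated tail probability at t
    if tail_mass == 0:
        raise ValueError("no uncensored cause-k observation above the threshold")
    return float(np.sum(w[tail] * np.log(z[tail] / t)) / tail_mass)

# Sanity check: with one cause and no censoring, all weights are equal and the
# estimate is the plain average of the log-excesses above t.
gamma_plain = hill_competing_risks([1, 2, 4, 8], [1, 1, 1, 1], [1, 1, 1, 1], k=1, t=1.5)
gamma_cens = hill_competing_risks([1, 2, 4, 8], [1, 0, 1, 1], [1, 0, 1, 1], k=1, t=1.5)
```

Censored observations and observations from other causes contribute nothing to the numerator or to the denominator directly; they act only through the Kaplan-Meier weights and the sample size.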
Our paper is organized as follows: in Section 2, we state the asymptotic normality result of the proposed estimator, and of a corresponding estimator of an extreme quantile of the cumulative incidence function. In Section 3, we present some simulations in order to illustrate the finite-sample behaviour of our estimator. Section 4 concludes, and Section 5 is devoted to the proofs. Some technical aspects of the proofs are postponed to the Appendix.
2 Assumptions and Statement of the results
The central limit theorem which is going to be proved has the rate where and is a threshold tending to with the following constraint
[TABLE]
If we note the slowly varying function associated to (i.e. such that in condition (1)), the second order condition we consider is the classical condition for (see Bingham, Goldie and Teugels (1987)),
[TABLE]
where is a positive measurable function, slowly varying with index , and when , or when .
Theorem 1
Under assumptions , and , if there exists such that , and if then we have
[TABLE]
where
[TABLE]
with and .
Remark 1
Note that when , then , and, when and (for instance when there is only one cause of failure/death), then reduces to .
Proposition 1
Under assumptions and , we have
[TABLE]
Remark 2
The condition (weak censoring) is not necessary for the consistency of .
Now, concerning the estimation of an extreme quantile (of order tending to [math]) associated to , we propose the usual Weissman-type estimator (in this heavy tailed context), associated to the threshold used in the estimation of ,
[TABLE]
where is assumed to satisfy the constraint $p_{n}=o\big(\bar{F}^{(k)}(t_{n})\big)$. Recall that by definition , and thus the definition of this estimator is based on the fact that, by the assumed regular variation of , the ratio is close to .
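The extrapolation idea behind a Weissman-type estimator can be sketched as follows. The code uses the usual Weissman form, threshold × (estimated tail probability at the threshold / target probability) raised to the estimated index; the function name and argument names are hypothetical, and the three inputs are assumed to have been computed beforehand.

```python
def weissman_quantile(t, tail_prob_at_t, gamma_hat, p):
    """Extrapolated quantile x such that the cause-specific tail probability
    at x is approximately p, for p smaller than the tail probability at t."""
    if not (0.0 < p < tail_prob_at_t):
        raise ValueError("p must be positive and smaller than the tail probability at t")
    return t * (tail_prob_at_t / p) ** gamma_hat

# With threshold 10, estimated tail probability 0.1 and index 0.5, the level
# exceeded with probability 0.001 is extrapolated to 10 * 100 ** 0.5.
x_hat = weissman_quantile(10.0, 0.1, 0.5, 0.001)
```

The formula follows from inverting the power-law approximation of the tail between the threshold and the target quantile, so its accuracy depends on how well that approximation holds above the threshold.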
Corollary 1
Under the assumptions of Theorem 1, if in addition (in (3)) and satisfies the condition
[TABLE]
then (with , and being defined in the statement of Theorem 1)
[TABLE]
3 Simulations
In this section, a small simulation study is conducted in order to illustrate the finite-sample behaviour of our new estimator in some simple cases, and discuss the main issues associated with the competing risks setting.
For simplicity, we focus on the situation with two competing risks (), also called causes below, and our aim is the extreme value index associated to the first cause. Data are generated from one of the following two models : for , non-negative constants satisfying , we consider the following (sub-)distribution for each cause-specific function () :
Fréchet : , for ;
Burr : , for , where , .
The lifetime , of survival function , is generated by the inversion method (with numerical computation of ). Censoring times are then generated from a Fréchet or a Burr distribution :
[TABLE]
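A possible way to generate such data is sketched below for the Fréchet case. Two assumptions are made and should be read as such: each cause-specific tail function is taken to be a constant weight times a standard Fréchet survival function $1-\exp(-t^{-1/\gamma})$, and the lifetime is drawn by first sampling the cause with these weights and then sampling from the corresponding Fréchet law. This mixture sampling replaces the numerical inversion mentioned above and yields the same joint distribution under the stated assumption. All names are hypothetical.

```python
import numpy as np

def simulate_frechet_competing_risks(n, gammas, probs, gamma_c, rng):
    """Simulate censored competing-risks data with Frechet components.

    gammas  : tail index of each cause (assumed model, see lead-in)
    probs   : probability of each cause, summing to 1
    gamma_c : tail index of the Frechet censoring distribution
    Returns (Z, delta, xi) with xi = 0 when the observation is censored.
    """
    gammas = np.asarray(gammas, dtype=float)
    probs = np.asarray(probs, dtype=float)
    cause = rng.choice(len(gammas), size=n, p=probs) + 1     # causes coded 1..K
    # If E is standard exponential, E ** (-gamma) has distribution function
    # exp(-t ** (-1 / gamma)), i.e. a standard Frechet law with index gamma.
    X = rng.exponential(size=n) ** (-gammas[cause - 1])
    C = rng.exponential(size=n) ** (-gamma_c)
    Z = np.minimum(X, C)
    delta = (X <= C).astype(int)
    xi = np.where(delta == 1, cause, 0)
    return Z, delta, xi

rng = np.random.default_rng(0)
Z, delta, xi = simulate_frechet_competing_risks(
    n=500, gammas=[0.5, 0.25], probs=[0.7, 0.3], gamma_c=1.0, rng=rng
)
```

The parameter values in the call are arbitrary illustrations and are not the configurations studied in this section.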
In this section, we consider (as it is often done in simulation studies) that the threshold used in the definition of our new estimator is taken equal to (i.e. we consider it as random). One aim of this section is to show how our estimator (with random threshold)
[TABLE]
of behaves when the proportion of cause events varies : we consider , the case corresponding to the simple censoring framework, without competing risk.
Another aim is to illustrate the impact of dependency between the causes, when estimating the tail. The starting point is that, if cause could be considered independent of cause , then we could (and would) include it in the censoring mechanism and we would be in the simple random censoring setting, without competing risk. In this case, it would be possible to estimate by one of the following two estimators, the first one being proposed in Beirlant et al. (2007) (a Hill estimator weighted with a constant weight), and the second one in Worms and Worms (2014) (a Hill estimator weighted with varying Kaplan-Meier weights):
[TABLE]
where, in Equation , , and in Equation , the Kaplan-Meier estimators and are based on the . These two estimators consider the uncensored lifetimes associated to cause 2 as independent censoring times. Comparing our new estimator with these two estimators, when , will show empirically that considering cause as a competing risk independent of cause has a great (negative) impact on the estimation of . Note that when , the new estimator and are exactly the same (therefore the thick and dashed lines in sub-figures (a), (c) and (e) of Figures 2 and 3 coincide).
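The first of these two competitors, a Hill estimator divided by the proportion of uncensored observations in the tail, can be sketched as follows. The function name is hypothetical; `k_n` denotes the number of upper order statistics used, and to mimic the comparison described above the indicator `delta` passed to the function would be 1 only for uncensored cause-1 observations, so that cause-2 failures are treated as censored.

```python
import numpy as np

def hill_over_uncensored_proportion(Z, delta, k_n):
    """Hill estimator on the k_n largest observations, divided by the
    proportion of uncensored observations among them."""
    Z = np.asarray(Z, dtype=float)
    n = len(Z)
    if not (1 <= k_n < n):
        raise ValueError("k_n must satisfy 1 <= k_n < n")
    order = np.argsort(Z)
    z = Z[order]
    d = np.asarray(delta)[order]
    threshold = z[n - k_n - 1]                    # (k_n + 1)-th largest observation
    hill = np.mean(np.log(z[n - k_n:] / threshold))
    p_hat = np.mean(d[n - k_n:])                  # proportion uncensored in the tail
    if p_hat == 0:
        raise ValueError("all of the k_n largest observations are censored")
    return float(hill / p_hat)

estimate = hill_over_uncensored_proportion([1, 2, 4, 8], [1, 1, 0, 1], k_n=2)
```

The correction by a single constant weight is what makes this estimator sensitive to a misspecified censoring mechanism: if the observations treated as censored are in fact dependent competing events, the proportion in the denominator no longer has the interpretation the correction relies on.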
We address these two aims for each set-up (Fréchet, or Burr), by generating datasets of size , with three configurations of the triplet : (, moderate censoring ), (, heavy censoring ), or (, moderate censoring ). Median bias and mean squared error (MSE) of the different estimators are plotted against different values of , the number of excesses used. When Burr distributions are simulated, the parameter is taken equal to , and the parameters are taken equal to in configurations 1 and 2, and to in configuration 3.
Figure 1 illustrates the behaviour of our estimator when varies. In terms of bias and MSE, we can see that the first configuration is a little better than the second one, which is itself much better than the third one. We observed this phenomenon in many other cases, not reported here : our estimator behaves best when it is the smallest parameter which is estimated, and when the censoring is not too strong. Our simulations also show that the quality of our estimator (especially in terms of the MSE) diminishes with .
Figures 2 and 3 present the comparison between our new estimator and the ones described in (5) and (6). A general conclusion (confirmed by other simulations not reported here) is that and behave worse in most cases, even for a value of of , which is only a slight modification of the situation without competing risk (). Therefore, a contamination of the cause distribution by another cause rapidly yields inadequate estimations of if dependency between causes is ignored; this conclusion is true for both and , but to a greater extent for . In the third configuration , the improvement provided by (with respect to ) becomes notable when drops below .
4 Conclusion
In this paper, we consider heavy tailed lifetime data subject to random censoring and competing risks, and use the Aalen-Johansen estimator of the cumulative incidence function to construct an estimator for the extreme value index associated to the main cause of interest. To the best of our knowledge, this is the first estimator proposed in this context. Its asymptotic normality is proved and a small simulation study exhibiting its finite-sample performance shows that accounting for the dependency of the different causes is important, but that the bias can be particularly high. Estimating second order tail parameters would then be interesting in order to reduce this bias. A first step towards this aim could be to study the following moments
[TABLE]
whose asymptotic behaviour can be derived following the same lines as in the proof of Theorem 1.
5 Proofs
This section is essentially devoted to the proof of the main Theorem 1. Some hints about the proof of the consistency result contained in Proposition 1 are given in Subsection 5.3, and Corollary 1 is proved in Subsection 5.4.
We adopt a strategy developed in Stute (1995) to prove Theorem 1.1 there, a well-known result which states that a Kaplan-Meier integral of the form can be approximated by a sum of independent terms. This idea is used in Suzukawa (2002) in the context of competing risks. We thus intend to approximate by the integral of some deterministic function , with respect to the Aalen-Johansen estimator, and approximate this integral by the mean of independent variables (defined a few lines below). The passage from to (which amounts to replacing by in the denominator of ) will imply an additional sum of independent variables , which will contribute to the asymptotic variance of our estimator.
However, a major difference with Stute (1995) or Suzukawa (2002) is that the function we integrate here, , is not only an unbounded function, depending on , but it also has a ”sliding” support , which is therefore always close to the endpoint of the distribution . In Stute (1995), a crucial point of the proof consists in temporarily considering that the integrated function has a support which is bounded away from the endpoint of (condition (2.3) there). Considering the kind of function we have to deal with here, we cannot follow the same strategy : dealing with the remainder terms will thus be a particularly challenging part of our work. Finally note that, in order to deal with the ratio (and somehow approximate by ) we will have to consider simultaneously integrals (with respect to ) of and of another function , defined below, which basically shares the same flaws as .
Let us first recall or define the following objects :
[TABLE]
We thus have , where
[TABLE]
and we now introduce the following new quantities, related to the Stute-like decomposition of and :
[TABLE]
where, for any function , we note (for any given )
[TABLE]
This enables us to finally define the important objects
[TABLE]
which are the triangular sums of independent terms which will respectively approximate and . At the beginning of section 5.1, it will be proved that and , while and , yielding and ; the terms , , and only participate to the variance component of the estimator. The relation between all these quantities is made clearer in the following Lemma :
Lemma 1
We have
[TABLE]
where
[TABLE]
The proof of Lemma 1 is simple :
[TABLE]
which leads to the desired relation (8).
The main theorem thus becomes an immediate consequence of the following four results, the second one being the most difficult to establish.
Proposition 2
Under condition and assuming that
[TABLE]
if , then
[TABLE]
where is defined in the statement of Theorem 1.
Proposition 3
Under conditions and , if , then
[TABLE]
where , (and consequently too) are .
Corollary 2
Under the conditions of Proposition 3, converges in distribution to where .
Lemma 2
Under conditions , and , the bias term in (8) converges to as , where is defined in Theorem 1.
Propositions 2 and 3 will be proved in Sections 5.1 and 5.2 respectively, sometimes with the help of other results stated and established in the Appendix. The proofs of Corollary 2 and Lemma 2 are short; we give them below.
Concerning Corollary 2, once the proof of Proposition 2 has been gone through, it will become clear to the reader that converges in distribution to the centred gaussian distribution of variance , because where are centred, and (this is proved similarly as (11) and (15)). Since Proposition 3 states that , the same central limit theorem holds for and the corollary is proved.
Concerning now Lemma 2, recall that . An integration by parts and the fact that yield
[TABLE]
and, using assumption and Proposition 3.1 in de Haan and Ferreira (2006), we can write
[TABLE]
The result then follows from assumption and the fact that .
In the rest of the paper, we will very often handle the well-known sub-distributions functions and defined, for all , by
[TABLE]
Note that we have
[TABLE]
5.1 Proof of Proposition 2
We first write
[TABLE]
where , and are centred, because the random variables and have expectations respectively equal to and . Indeed, we have
[TABLE]
and
[TABLE]
as well as
[TABLE]
The proof for is similar.
We will now prove the asymptotic normality of by using the Lyapunov criterion.
Lemma 3
Under the conditions and , if :
1. we have
[TABLE]
2. we have
[TABLE]
3. we have, noting (which belongs to under our conditions) as well as ,
[TABLE]
Lemma 4
Under the conditions and , if , then
, as tends to infinity, for some .
We can then immediately prove Proposition 2. Indeed, since , Lemma 3 yields
[TABLE]
which, since , becomes
[TABLE]
Therefore, depending on the limit of the ratio when (for instance, it converges to [math] when ), it is simple to check that the variance of converges to the value described in the statement of Theorem 1. Thanks to Lemma 4, the Lyapunov CLT applies and Proposition 2 is proved.
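For reference, the Lyapunov central limit theorem for triangular arrays invoked here can be stated as follows. The symbols $X_{n,i}$, $s_n$ and $\delta$ are generic notation introduced for this statement only and are not the quantities of the proof.

```latex
\text{Let } X_{n,1},\dots,X_{n,n} \text{ be independent, centred, with finite variances, and }
s_n^2 = \sum_{i=1}^{n} \operatorname{Var}(X_{n,i}) > 0 .
\\[6pt]
\text{If, for some } \delta > 0, \qquad
\frac{1}{s_n^{2+\delta}} \sum_{i=1}^{n} \mathbb{E}\,\lvert X_{n,i} \rvert^{2+\delta}
\;\xrightarrow[n\to\infty]{}\; 0 ,
\qquad \text{then} \qquad
\frac{1}{s_n} \sum_{i=1}^{n} X_{n,i} \;\xrightarrow{\;d\;}\; \mathcal{N}(0,1) .
```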
The two subsections 5.1.1 and 5.1.2 are now respectively devoted to the proofs of Lemmas 3 and 4.
5.1.1 Proof of Lemma 3
Part of the lemma is straightforward : since and are centred, we have indeed
[TABLE]
and the result comes by using the fact that converges to as .
Now we proceed to the proof of part , and will only prove (12) because, by definition of and , the proofs for (13) and (14) will be completely similar. First of all, we obviously have
[TABLE]
The first term in the right-hand side of (12) is equal to , and the second one (without the minus sign) is equal to and to because
[TABLE]
and
[TABLE]
The expectation equals [math] because is constantly [math], and we are now going to prove that , which ends the proof of (12) in view of (16). Indeed, noting and using the simple fact that for every , we have
[TABLE]
and
[TABLE]
as announced.
We can now start proving part of the lemma, in which the exact nature of the function matters. First recall that functions , and are regularly varying of respective orders , and (for , this is proved in Lemma 8 with ). Let us define the constants and () by
[TABLE]
Since was assumed, then according to Lemma 7 part (applied first with for , and then with for ) , we have
[TABLE]
Hence, by definition of , , the first terms of , and in relations (12), (13) and (14) are respectively equivalent (as ) to , and where denotes
[TABLE]
Since is found to be equal to , then in view of (11) this proves the first term in relation (15). We now need to obtain equivalent expressions for the quantities , and in order to prove the second part of relation (15) and therefore finish the proof of Lemma 3.
For saving space, we will use temporarily the following notations :
[TABLE]
According to the technical Lemma 9 of the Appendix and, after splitting the integral into and , we can write
[TABLE]
where in is due to part of Lemma 7 and to the fact that in Lemma 9 converges to [math] uniformly in . According to the second part of relation , we thus have
[TABLE]
The other terms are treated similarly (using the fact that when , and when ) and we obtain
[TABLE]
In view of (11), combining , and and using Remark 3 (following Lemma 8) to write that (as ), this proves the second term in relation (15).
5.1.2 Proof of Lemma 4
We have to prove that, for some small enough, tends to [math], as . In the sequel, denotes an unspecified absolute positive constant. According to the definition of , it is clear that
[TABLE]
First, we clearly have as . Secondly, since has the same form as , with instead of (i.e. without the log factor), we will only prove that there exists some such that, as ,
[TABLE]
For , we have
[TABLE]
Applying part of Lemma 7 for , and (with sufficiently small so that is kept ), and using the fact that , this ends the proof of (22) for .
For , we have
[TABLE]
By definition of , , and , we have when . Therefore, splitting the integral above into two integrals and we obtain
[TABLE]
where, on one hand,
[TABLE]
and, on the other hand, using the technical Lemma 9, for some ,
[TABLE]
Applying Lemma 8 to , we have , therefore . It is then easy to check that tends to 0, because and , since .
For , since by Lemma 8 the function is regularly varying with index , the application of part of Lemma 7 to or and to various couples of values of and finally yields , and consequently tends to 0.
We now come to the study of relation (22) for . We have
[TABLE]
Proceeding as above by splitting the integral into two integrals and , we obtain
[TABLE]
where
[TABLE]
and
[TABLE]
where
[TABLE]
and, using the technical Lemma 9 as we did some lines above,
[TABLE]
Using Lemma 8 and part of Lemma 7, we find that both and are and, though the term is more involved, we are also going to prove below that the same property holds for : this will finish the proof of Lemma 4 because tends to [math], as already seen in the proof for .
We only treat the first integral in the right-hand side of , since the two others are very similar, i.e. we need to prove that
[TABLE]
Now,
[TABLE]
Using Potter-bounds for , integration by parts and then Potter-bounds for , it is easy to see that for sufficiently large and , there exists some positive constants , , such that
[TABLE]
where . Consequently
[TABLE]
This yields , by using part of Lemma 7 to this value of , to (and to or ), as well as Lemma 8.
5.2 Proof of Proposition 3
Let us start with an important note. In Proposition 3, the main result is that the remainder terms and are . Proving this will be conducted in a similar way as proving that is in Theorem 1.1 of Stute (1995). But, recall that in our situation, the function that we integrate here is , which is depending on , with a ”sliding” support . We will need to be particularly cautious with integrability issues, especially when dealing with U-statistics for the terms and in the remainder , defined below.
Before we proceed with the proof, let us define the following empirical (sub)-distribution functions : for ,
[TABLE]
First note that, since is the function without the log factor, it should be clear to the reader that proving that and will be simpler than proving that and . We will thus only prove the latter two relations.
Let us start with the first one, in other words let us define the remainder term . Recall that the definitions of and are and , where denotes the mean of the variables . We need to decompose the integral of with respect to , which is a stepwise subdistribution function whose jumps at the (ordered) observations are equal to . But it is known that (see Lemma 2.1 in Stute (1995))
[TABLE]
Therefore, using the fact that , we have
[TABLE]
Consequently, using the mean value theorem for , and introducing the important notations
[TABLE]
it is easy to see that
[TABLE]
where is the first term in the definition of , and is a random quantity lying between and .
What we now need to do is to show that the term involving the quantity in relation (25) above can be written as plus a remainder term , and therefore we have , where
[TABLE]
The rest of the proof will, afterwards, be devoted to showing that each term of is .
Proceeding as in Stute (1995) or Suzukawa (2002), and using the fact that for any given function we have , we can write
[TABLE]
where
[TABLE]
Note that and are a kind of U-statistics, which need to be approximated by sums of independent variables, called Hoeffding decompositions: more precisely, if we introduce the functions (important in the sequel)
[TABLE]
for , and , then these decompositions are defined by
[TABLE]
Therefore, if we introduce the remainder terms
[TABLE]
then (27) becomes
[TABLE]
We are thus left to prove that . This is indeed the case because, if we note
[TABLE]
then, by definition of , the last (fourth) term in equals , the third one equals , the second one is (because )
[TABLE]
and the first one is (because and )
[TABLE]
Likewise, the first term of equals , the second one equals , and the last one equals . After straightforward simplifications, we obtain the desired equality , and the proof of is over.
The proof of Proposition 3 is now based on the following two lemmas : Lemma 5 is proved in subsection 5.2.1, and Lemma 6 is the longest to establish, its proof will be split across subsections 5.2.2 to 5.2.5.
Lemma 5
If conditions and (2) hold with , then we have
[TABLE]
Lemma 6
If conditions and (2) hold with , then we have
[TABLE]
5.2.1 Proof of Lemma 5
We start with the remainder term , which is defined as
[TABLE]
where . Since, for all , , we obtain
[TABLE]
and then
[TABLE]
But
[TABLE]
so, if we define
[TABLE]
where
[TABLE]
then it remains to prove (thanks to part of Lemma 10) that and .
Concerning , since implies that , then is a consequence of Lemma 11, used with and .
Concerning , an integration by parts yields
[TABLE]
for any given . Lemma 10 (applied with ) and the fact thus imply that
[TABLE]
so that, by definition of , the desired statement is a consequence of Lemma 11, applied with sufficiently small and , and of
[TABLE]
Indeed converges to and equals , which is by the standard central limit theorem.
Let us now turn to the remainder term , which is defined as
[TABLE]
A simple calculation leads to
[TABLE]
for any . Taking sufficiently small, the rest of the proof is very similar to the one for (compare to ) and relies on Lemma 10 and Lemma 11.
We can finally deal with the last remainder term , defined as
[TABLE]
where is a random quantity lying between and . Since , we have
[TABLE]
Since , where and , we clearly have
[TABLE]
But , where is the cumulative hazard function associated to , and its Nelson-Aalen estimator. Relying on Zhou (1991) Theorem 2.1, we can deduce that . Hence, .
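As a reminder of what a Nelson-Aalen estimator computes, the following minimal Python sketch evaluates it on a tiny right-censored sample. It is a generic illustration, not the paper's exact object: the function name `nelson_aalen`, the data, and the assumption that there are no ties among the observed times are all ours.

```python
import numpy as np

def nelson_aalen(times, events):
    """Return the sorted times and the Nelson-Aalen cumulative hazard at each of them.

    times  : observed durations (assumed without ties)
    events : 1 if the event of interest was observed, 0 otherwise
    """
    order = np.argsort(times)
    t = np.asarray(times, dtype=float)[order]
    d = np.asarray(events, dtype=float)[order]
    at_risk = len(t) - np.arange(len(t))   # n, n-1, ..., 1 individuals still at risk
    return t, np.cumsum(d / at_risk)       # jump of size 1/at_risk at each observed event

# Hypothetical sample: the observation at time 5 is censored.
t, cum_hazard = nelson_aalen([2.0, 5.0, 3.0, 8.0], [1, 0, 1, 1])
# Increments: 1/4 at t=2, 1/3 at t=3, 0 at t=5 (censored), 1 at t=8.
```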
Now,
[TABLE]
By writing
[TABLE]
we prove (using Lemma 10 and simple integrations as for the previous treatment of above) that for .
Hence, on one hand , and on the other hand
[TABLE]
for any given (where the comes from , which does not depend on ). Therefore, it is sufficient to prove that
[TABLE]
are , and that
[TABLE]
is as well. But the first two statements are consequences of Lemma 11 with sufficiently close to [math] and, respectively, and . And for the third statement, the expectation of the expression turns out (thanks to Lemma 7 part ) to be equivalent to a constant times , which tends to [math].
5.2.2 Preliminaries to the proof of Lemma 6
We start this section by introducing important objects, stemming from an idea which appears (to the best of our knowledge) in Stute (1994). We define the improper variables and by
[TABLE]
which have and for respective subdistribution functions. We thus have and , which, according to the definitions of and on one hand, and of functions and (in (28)) on the other hand, leads to
[TABLE]
and
[TABLE]
Since the latter triple sum is not convenient, we also define
[TABLE]
where will be the quantity approximated by , and $\overset{\approx}{C}_{n}^{(1)}$ will be a remainder. We can indeed rewrite (31) as
[TABLE]
The terms in parentheses in (34) and (35) turn out to be genuine U-statistics of 2 and 3 variables, denoted by
[TABLE]
where functions and will be defined in a few lines (relation (37)) after some preliminaries, certainly well-known in the U-statistics literature, but which we include here to make our proof self-contained (and since we are dealing with improper variables).
If and denote independent improper random variables with subdistribution functions and (i.e. and where and are independent copies of ), we introduce the following notations : for any function ,
[TABLE]
as well as, for any function , with (of distribution function ) independent of and ,
[TABLE]
Since whenever or equals , we then have (the proof is simple)
[TABLE]
Therefore, setting (for in and and in )
[TABLE]
it is then not difficult to check (using (29) and (30)) that and in relation (36) are indeed equal to the differences in parentheses in relations (34) and (35), respectively. Lemma 6 thus becomes a consequence of the following facts : , , and
[TABLE]
We will prove these statements in the next 3 subsections.
5.2.3 Proof of
We note , when , and . It is clear that it suffices to prove that
[TABLE]
The good point is that turns out to be a sum of identically distributed centred and uncorrelated random variables , but unfortunately these variables are not square-integrable and potentially only have a moment of order slightly larger than when . In order to deal with this difficulty, since we cannot handle directly the norm of of order , we will follow a strategy similar to that found in Csorgo, Szyszkowicz and Wang (2008), based on truncation. We set
[TABLE]
The variables () are centred and bounded, but they lose the non-correlation property of the variables . This is why we define now
[TABLE]
which are centred and bounded but are also uncorrelated (see part of Lemma 13), and we write
[TABLE]
We thus need to prove that and both converge to [math] in probability.
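The truncation strategy described above can be pictured with the following minimal Python sketch. It is a toy illustration only: the symmetric Pareto variables with tail index 1.5 (finite mean, infinite variance), the truncation level `M` and all variable names are assumptions, and the recentring is done empirically here whereas the paper recentres with expectations.

```python
import numpy as np

rng = np.random.default_rng(1)
n, alpha, M = 10_000, 1.5, 50.0      # tail index 1.5: finite mean, infinite variance

# Symmetric heavy-tailed sample: random sign times a classical Pareto(alpha) variable.
signs = rng.choice([-1.0, 1.0], size=n)
x = signs * (rng.pareto(alpha, size=n) + 1.0)

bounded = np.where(np.abs(x) <= M, x, 0.0)    # truncated part, bounded by M
tail = x - bounded                            # part removed by the truncation
bounded_centred = bounded - bounded.mean()    # empirical recentring (assumption)
```

The bounded part has moments of every order and can be handled through its variance, while the tail part has to be controlled separately, for instance in a norm of lower order.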
Concerning , since the are centred and uncorrelated, we have
[TABLE]
where was defined in (39) (the justification of the last inequality is postponed to part of Lemma 13). Recall that is not square-integrable and , and introduce for some given small . We then write
[TABLE]
Thanks to Lemma 12 (parts and ) and to the definition of , the term is bounded by a quantity which is equivalent (as ) to . We now rely on Hölder’s inequality for dealing with the term . Let and such that . Since , again thanks to Lemma 12 (, and ), for sufficiently close to so that , we have
[TABLE]
which converges to [math] thanks to assumption (2), for small enough (we used part of Lemma 12 in the third upper bound).
We are thus left to prove that also converges to [math], but this time in . We start by writing that
[TABLE]
the last inequality being proved in the appendix (part of Lemma 13). The rest of the argument is similar to the treatment of above, relying on Lemma 12 (parts , and ) and on Hölder’s inequality: for close to and a large such that , we can write
[TABLE]
which, for small enough, is thanks to assumption (2).
5.2.4 Proof of
The proof is very similar to the one contained in the previous subsection. We nonetheless provide a few details to convince the reader of the validity of the result. We now write and when , with denoting the cardinality of the index set . Since the observations are i.i.d., it suffices to prove that
[TABLE]
As previously, the problem lies with the moments of the centred and uncorrelated variables , and now we only have a guaranteed moment of order slightly more than instead of in the previous situation. Fortunately, the cardinality is now of order , which turns out to be the right compensation.
We thus define, for ,
[TABLE]
as well as
[TABLE]
which are centred and bounded but are also uncorrelated (see part of Lemma 13 in the Appendix), and we write
[TABLE]
Introducing and skipping details, we obtain that
[TABLE]
and that this quantity converges to [math], as , thanks to parts and of Lemma 12. The same argument is used to prove that .
5.2.5 Proof of relation (38)
Let us first prove that, for some , $\mathbb{E}\big(|\sqrt{v_{n}}\,\overset{\approx}{C}_{n}^{(1)}|^{d}\big)$ tends to 0, as tends to infinity. Recall that $\overset{\approx}{C}_{n}^{(1)}=\frac{1}{n^{3}}\sum\sum_{i\neq j}h(V_{i},W_{j})/\overline{H}(V_{i})$. Since , we have
[TABLE]
According to part of Lemma 12, the right-hand side of the inequality above is , which tends to [math], since , and so we are done.
Let us now prove that tends to 0, as tends to infinity. is defined in , where the expectation of each of the four integrals is : therefore, we only need to prove that tends to [math]. This is straightforward using part of Lemma 12.
We can prove in a very similar way that tends to 0, as tends to infinity.
5.3 Proof of Proposition 1
Using the same notations as in the beginning of Section 5, we have
[TABLE]
The fact that is due to the application of a triangular weak law of large numbers (see Chow and Teicher (1997) for example) to and to . By carefully following the proof of Proposition 3 in Section 5.2, we can see that . The condition is not used, neither in the treatment of nor in that of . Details are omitted.
5.4 Proof of Corollary 1
The proof is very similar to that of Theorem 2 in Worms and Worms (2016), with and here replacing and there. For completeness, we provide some details about it. Recalling the notations and , we easily write
[TABLE]
where , and . We are going to prove that both and are , and that : this will conclude the proof, since both (Corollary 2) and tend to .
Concerning , the mean value theorem yields
[TABLE]
where and therefore tends to [math] in probability thanks to Theorem 1 and assumption . The desired result for is then implied by Theorem 1 again.
Concerning the fact that , the proof is completely similar to the corresponding one in Worms and Worms (2016), so we omit it here (basically, it relies on some uniform regular variation implied by the assumed negativity of the second order parameter , and on the assumption that converges).
Finally, concerning we use the mean value theorem to write
[TABLE]
where lies between and . But Corollary 2 (and the consistency of ) implies that on one hand, and on the other hand ; therefore, .
6 Appendix
This appendix contains various results: some of them are used repeatedly in the proof of the main result (in particular Proposition 4 and Lemmas 7 and 10, and to a lesser extent Lemmas 8 and 9), while the others concern parts of the main proof which are postponed to the appendix to keep the main flow of the proof clear (Lemmas 11, 12 and 13).
Definition 1
An ultimately positive function : is regularly varying (at infinity) with index , if
[TABLE]
This is denoted by . If , is said to be slowly varying.
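Definition 1 can be checked numerically on a toy example. The minimal Python sketch below assumes the function f(x) = x^alpha * log(1 + x) with alpha = -2, which is regularly varying with index alpha because log(1 + x) is slowly varying; the names `f` and `rv_ratio` are hypothetical. It evaluates the ratio f(tx)/f(t) for increasing t and compares it with the limit x^alpha.

```python
import math

def f(x, alpha=-2.0):
    """Toy regularly varying function: x**alpha times the slowly varying log(1 + x)."""
    return x ** alpha * math.log1p(x)

def rv_ratio(t, x, alpha=-2.0):
    """Ratio f(t x) / f(t), which approaches x**alpha as t grows."""
    return f(t * x, alpha) / f(t, alpha)

x = 3.0
limit = x ** -2.0
ratios = [rv_ratio(t, x) for t in (1e2, 1e4, 1e8)]
rel_errors = [abs(r / limit - 1.0) for r in ratios]   # decreases slowly with t
```

The convergence is only logarithmic in this example, which is typical when the slowly varying factor is not constant.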
Proposition 4
*(See de Haan and Ferreira (2006) Proposition B.1.9)
Suppose . If and , then there exists such that for every ,*
[TABLE]
and if ,
[TABLE]
Lemma 7
Let , , , and for and real numbers, and are two regularly varying functions at infinity, with index, respectively, and . Then, as ,
.
, if
, if
Proof :
A simple change of variable and the definition of the function yields the result.
For the sake of simplicity, we are going to treat the case and . The only difference for the other cases is the sign in front of the or appearing below (coming from the application of several times), which can depend on the sign of , or another constant, but does not affect the result. Using Potter-bounds for yields, for sufficiently large and ,
[TABLE]
Let us treat only the upper bound and the case (the other cases being similar). By integration by parts, with , we have
[TABLE]
Using Potter-bounds for yields, for sufficiently large and
[TABLE]
Doing the same with the lower bound and making and tend to [math], yields the result after simplifications.
As in , using Potter-bounds for , integration by parts and then again for yields the result.
Lemma 8
For any , let denote the function
[TABLE]
Under condition , this function is regularly varying of order and we have , as .
Proof : by writing , the lemma is an immediate consequence of part of Lemma 7, with and .
Remark 3
In the Lemma above, is the important function introduced at the beginning of Section 5, and thus , as . Hence, is regularly varying at infinity with index , a property which proves useful several times in the main proofs.
Lemma 9
Let , for and . Under condition , we have
[TABLE]
where is a sequence tending to [math] uniformly in , as , and a positive real number such that .
Proof : We only consider the second situation where (the first one is straightforward) :
[TABLE]
An integration by parts and the fact that is regularly varying at infinity with index , yield
[TABLE]
where
[TABLE]
Let be a positive real number. Then
[TABLE]
where the function is regularly varying with index . Then since
[TABLE]
and, when , we have , this concludes the proof.
Lemma 10
Recalling that is a distribution function with infinite right endpoint, we have :
for any ,
[TABLE]
Proof : part is well known (see for instance section 3 of chapter 10 of Shorack and Wellner (1986)), while the two statements in are proved by usual empirical processes techniques, showing that the family of functions defined in one case by , and in the other case by are Donsker whenever (using respective square integrable envelope functions and , which bound from above the functions uniformly in ) .
Lemma 11
Under conditions (1) and (2), suppose that and are real numbers. If and
[TABLE]
then we have , as tends to infinity, if is [math] or sufficiently close to it.
Proof :
According to the LLN for triangular arrays, we need to prove the following three statements :
[TABLE]
But, being positive, clearly implies . We thus need to prove that and hold.
Let us start with assertion . If is given, then
[TABLE]
Now, put (); since, for a given , there exists such that , , and using Potter-bounds for , we can write (using the definition of )
[TABLE]
where and is a constant depending on and only. Consequently, if tends to infinity,
[TABLE]
where and the last inequality is due to Potter-bounds applied to . Then, assertion above will be true as soon as we prove that and , as .
Since is equivalent to a positive constant times when , and , then , for and . Assumption (2) finally yields that tends to , since for sufficiently close to [math].
Now, proving that tends to [math] is equivalent to proving that tends to . The same arguments as in the previous paragraph yield that it is sufficient to prove that tends to , for and . This is a consequence of hypothesis , since and , for sufficiently close to [math]. This ends the proof of .
Let us now start the proof of assertion . If is given, using Potter-Bounds (41) for which belongs to , and introducing , we find that (for some positive constant )
[TABLE]
where we set . Hence, denoting by the inverse function of ,
[TABLE]
Consequently, using once again Potter-Bounds and bounding the log with a constant times a power of , we get
[TABLE]
where and is some given positive value (the inequality , was used). But, by integration by parts and applied to , setting , we have
[TABLE]
Proceeding similarly as in the previous paragraphs, we find that (and thus and as well) thanks to assumption (2), for close to [math]. We are thus left to prove that tends to [math], where . If is negative, this is immediate. We thus suppose that and, after some simple computations, we find out that tends to [math] if tends to , a property which holds true thanks to assumption (2), for close to [math] (we omit the details).
Lemma 12
Suppose that and are independent improper random variables of respective subdistribution functions and , and is independent of and and has distribution . Consider , , and the functions defined in (28) and (37).
For any , there exist some positive constants and such that
[TABLE]
For any , we have
[TABLE]
In particular, if , then is of the order of and is finite whenever is (greater than but) sufficiently close to .
For any , we have
[TABLE]
In particular, if , then is of the order of and is finite whenever is (greater than but) sufficiently close to .
For any , we have . In particular, if then taking (greater than but) sufficiently close to is permitted, otherwise it is instead of .
The integral is equivalent, as , to .
Proof :
Let , and recall that is a non-negative function. Using several times the inequality , we can write
[TABLE]
But using the fact that the norm is bounded by the norm whenever , we have and it is quite simple to prove (by independence of and ) that it is also the case of , as well as for . The inequality is thus proved. The other one (concerning and ) is proved similarly.
Let . Since (), we have
[TABLE]
where the function was defined in the statement of Lemma 8. This lemma and Lemma 7, applied with , and (the constraint specified on certifies that ), imply that the integral in the previous line converges to a constant. And Lemma 8 also implies that the ratio in front of this integral is equivalent, as , to a positive constant times , which is itself lower than , as desired.
Let . By definition of in (28), and proceeding as in the previous item, equals
[TABLE]
which is equivalent to as soon as, thanks to Lemma 7, the sum is negative, which turns out to be true whenever , as specified.
The proof is very similar to the previous ones, starting from
[TABLE]
so we omit the details.
Noting that is slowly varying at infinity null at [math], we have
[TABLE]
which can be dealt with using part of Lemma 7 with , and : the obtained constant is indeed equal to .
Lemma 13
In this Lemma, various notations defined in sections 5.2.2 to 5.2.4 are used.
The variables for are centred and uncorrelated . This is also true for the variables for .
We have .
We have
Proof :
Let us consider the first situation, where . First, if , then ; but, by definition of and independence of and , we have , and is obtained similarly, so we proved that . Note that we can prove (with similar arguments) that for every in , a property which is repeatedly used below. Let us now deal with the non-correlation of and , by considering the various cases where with and are in .
If all four indices are distinct, then non-correlation of and is immediate by mutual independence of the variables .
If but , then where , by independence of with , and of and .
The case and is similar using .
If but , then where ; the case and is treated similarly.
Note that the case and (i.e. , ) is not permitted (it would lead to dependence) since we cannot have simultaneously and ; this is the reason why, in the beginning of section 5.2.3, we restricted the study of the sum to that of the sum having terms satisfying .
The second situation, for and with in , is a bit more tedious (with more cases to detail) but very similar, so we omit its proof.
We start with the trivial bound
[TABLE]
Noting , we can write, on one hand, by definition of , . On the other hand, if is independent of , we have , which is the same term as the first one, and is thus lower than . The same is true of , so the desired inequality is proved.
First recall that denotes . Now, since is centred and we trivially have , noting yields
[TABLE]
Secondly, using the fact that (simple to prove), we can write
[TABLE]
where denotes and satisfies , and similarly
[TABLE]
with and . Summing these three terms finally leads to
[TABLE]
which is lower than , as announced.
References
- Aalen and Johansen (1978) O. Aalen and S. Johansen. An empirical transition matrix for nonhomogeneous Markov chains based on censored observations. Scand. J. Stat. 5, pages 141-150 (1978)
- Beirlant et al. (2007) J. Beirlant, G. Dierckx, A. Guillou and A. Fils-Villetard. Estimation of the extreme value index and extreme quantiles under random censoring. Extremes 10, pages 151-174 (2007)
- Bingham, Goldie and Teugels (1987) N.H. Bingham, C.M. Goldie and J.L. Teugels. Regular variation. Cambridge University Press (1987)
- Chow and Teicher (1997) Y.S. Chow and H. Teicher. Probability theory. Independence, interchangeability, martingales. Springer (1997)
- Crowder (2001) M. Crowder. Classical competing risks. Chapman and Hall, London (2001)
- Csorgo, Szyszkowicz and Wang (2008) M. Csorgo, B. Szyszkowicz and Q. Wang. Asymptotics of studentized U-type processes for change-point problems. Acta Math. Hungar. 121 (4), pages 333-357 (2008)
- de Haan and Ferreira (2006) L. de Haan and A. Ferreira. Extreme Value Theory: an Introduction. Springer Science + Business Media (2006)
- Einmahl et al. (2008) J. Einmahl, A. Fils-Villetard and A. Guillou. Statistics of extremes under random censoring. Bernoulli 14, pages 207-227 (2008)
