On the assumption of independent right censoring
Morten Overgaard, Stefan Nygaard Hansen

TL;DR
This paper examines various assumptions on right-censoring mechanisms in survival analysis, distinguishing between minimal identifiability and stronger independence assumptions, and characterizes their implications for estimator consistency.
Contribution
It provides a comprehensive classification of eight assumptions on right censoring, clarifying their relationships and implications for survival analysis estimators.
Findings
Eight assumptions categorized into two groups.
Characterization of pointwise and full independence.
Examples illustrating assumption differences.
Abstract
Various assumptions on a right-censoring mechanism to ensure consistency of the Kaplan–Meier and Aalen–Johansen estimators in a competing risks setting are studied. Specifically, eight different assumptions are seen to fall in two categories: a weaker identifiability assumption, which is the weakest possible assumption in a precise sense, and a stronger representativity assumption which ensures the existence of an independent censoring time. When a given censoring time is considered, similar assumptions can be made on the censoring time. This allows for a characterization of so-called pointwise independence as well as full independence of censoring time and event time and type. Examples illustrate how the various assumptions differ.
Department of Public Health, Aarhus University, Bartholins Allé 2 - Building 1260, DK-8000 Aarhus C, Aarhus, Denmark
May 7, 2019
Keywords: Censoring; competing risks; consistency; identifiability; product integral; representativity.
1 Introduction
When dealing with right censoring in survival analysis, assumptions on the censoring mechanism are inevitably needed in order to bridge the gap between the observable world and the underlying world of interest. Many seemingly different assumptions have been proposed in the literature. The papers of Williams & Lagakos (1977), Kalbfleisch & MacKay (1979), and Lagakos (1979) clarified connections between some of these assumptions. Since then, martingale theory has become a widely used tool in survival analysis, and assumptions on the censoring mechanism are formulated as martingale assumptions in, for instance, Aalen & Johansen (1978), Gill (1980), and Andersen et al. (1993). A clear overview and comparison of the various assumptions does, however, seem to be lacking.
The purpose of this paper is to provide a clear overview and a comparison of various assumptions on the censoring mechanism made in order to ensure consistency of estimators such as the Kaplan–Meier and Aalen–Johansen estimators. This is done in a competing risks setting and without assuming absolute continuity of the involved random variables. Along the way, we obtain assumptions that are minimal in a precise sense for ensuring this consistency. We also make clear that important differences exist between considering a given, underlying censoring time producing right censoring and not considering such a censoring time.
Additionally, the use of product integrals and the techniques used in the many proofs may themselves be of interest to researchers in theoretical survival analysis. In particular, the appendix provides a wealth of technical results that may be useful in other settings as well.
The paper is structured as follows. In Section 2, right censoring in a general form is studied and minimal conditions to ensure consistency of the Kaplan–Meier and Aalen–Johansen estimators are obtained. Various assumptions from the literature are discussed and it is shown that they correspond to only two nested properties: an identifiability assumption and a representativity assumption, the latter being the stronger. Section 3 concerns the setting where an explicit censoring time is given. We discuss assumptions on the censoring mechanism and show that independence of the event and censoring times is equivalent to representativity assumptions on both the event and censoring time. In Section 4 we treat two examples in order to show that the representativity assumption is strictly stronger than the identifiability assumption and to illustrate the assumptions in a practical setting. Finally, in Section 5 we discuss some of the perspectives of the paper.
2 A censored event time
Consider an event time and event type that are subject to right censoring meaning that we are only able to observe a with and an indicator with values in where 0 indicates a censoring. These are all considered proper random variables, that is, with . We will refer to and as the observed exit time and exit type, respectively, because the risk set is exited at time and states how. This setting does not involve an explicit, underlying censoring time and may be useful in certain practical settings where such a censoring time is difficult to define. A setting with a given censoring time is dealt with in the next section.
For the pair of interest we define the survival function , the cause-specific cumulative incidence functions for and the cause-specific cumulative hazard functions for . We define the corresponding functions for the observed pair , that is, , and for . Both and are well-defined functions from into for . Here and in the following, division by 0 can be interpreted as 0 or any arbitrary number since it only occurs in integrals on a null set of the integrator. Frequently, a restriction to the interval is relevant since we will never observe an exit time beyond . Let denote and note that either , when , or , when .
In this section we shall study the assumptions under which we can identify and by the Kaplan–Meier and Aalen–Johansen estimators defined in Appendix 2. To this end, let denote the matrix of transition probabilities
[TABLE]
where and for and . With a slight abuse of notation, we let which is the matrix of interest. If denotes the all-cause cumulative hazard function and denotes the observed counterpart, then we define the two matrices
[TABLE]
and again, with slight abuse of notation, we let and for .
According to (14) of the appendix, the Aalen–Johansen estimator , defined in (13) of the appendix, is consistent for for any in a setting with independent and identically distributed observations. We now have the following result.
Proposition 1.
In a setting with independent and identically distributed observations, the Aalen–Johansen estimator consistently estimates for all if and only if for all . In other words, the Aalen–Johansen estimator of is consistent for all for if and only if for all for .
Proof.
By uniqueness of the product integral, we immediately have for all if and only if for all . This is due to Theorem 3 of Gill & Johansen (1990) since both and are seen to be of bounded variation on for any . Now, is seen to satisfy the requirements of Lemma 8 of the appendix by definition of from which it follows that . This establishes the equivalence. ∎
A similar argument reveals that the Kaplan–Meier estimator from (15) in the appendix consistently estimates for all if and only if for all .
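To connect these consistency statements to computation, the following is a minimal Python sketch of the product-limit (Kaplan–Meier) estimate and the Aalen–Johansen cumulative incidence estimates, computed from observed exit times and exit types. It uses the standard textbook forms, which need not match definitions (13) and (15) of the appendix in every detail. The function name, argument names, and toy data are illustrative assumptions. Ties are handled by letting events precede censorings at the same time point.

```python
from collections import Counter


def km_and_aj(exit_times, exit_types, causes):
    """Kaplan-Meier survival and Aalen-Johansen cumulative incidences.

    exit_types uses 0 for a censoring and a value in `causes` for an event.
    Returns the sorted distinct exit times, the survival estimate at each of
    them, and a dict mapping each cause to its cumulative incidence estimates.
    """
    exits = Counter(exit_times)                    # all exits per time point
    events = Counter(zip(exit_times, exit_types))  # exits per (time, type)
    times = sorted(exits)
    at_risk = len(exit_times)
    surv = 1.0
    cif = {j: 0.0 for j in causes}
    surv_path = []
    cif_path = {j: [] for j in causes}
    for t in times:
        for j in causes:
            # Increment: S(t-) times the cause-specific hazard increment.
            cif[j] += surv * events[(t, j)] / at_risk
            cif_path[j].append(cif[j])
        n_events = sum(events[(t, j)] for j in causes)
        surv *= 1.0 - n_events / at_risk           # product-limit step
        surv_path.append(surv)
        at_risk -= exits[t]                        # events and censorings leave
    return times, surv_path, cif_path


# Toy data (hypothetical): five subjects, two causes, exit type 0 = censored.
times, surv, cif = km_and_aj([1, 2, 2, 3, 4], [1, 0, 2, 1, 0], causes=(1, 2))
```

At every time point the survival estimate and the cumulative incidence estimates sum to one, which is a convenient sanity check on an implementation of this kind.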
We call the property of Proposition 1 identity of forces of mortality, with inspiration from Elandt-Johnson (1976). An assumption of identity of forces of mortality is, for instance, used by Gail (1975) in a competing risks setting as a weaker substitute for the assumption of independent latent event times.
Williams & Lagakos (1977) study, in a setting without competing risks, the constant-sum assumption as a weaker alternative to the assumption of independence of event time and censoring time. Let be the function, unique up to -null sets, given by and let . In the competing risks setting, the constant-sum property can then be phrased as
[TABLE]
for -almost all for . Kalbfleisch & MacKay (1979) argue that this property is equivalent to identity of forces of mortality in a setting without competing risks and with a differentiable event hazard function.
Estimators in survival analysis and in the competing risks setting have often been studied using martingale theory, for instance in Aalen & Johansen (1978), Gill (1980), Jacobsen (1989), and Andersen et al. (1993). In such a setting, the following martingale property, which we will call the weak martingale property in light of stronger properties introduced later on, has been shown to ensure the desired consistency of estimators. Let for and . The weak martingale property is that the processes given by
[TABLE]
for , for are all martingales with respect to the filtration given by , which models the observed information. This or similar assumptions are made, for instance, in Assumption 3.1.1 of Gill (1980), in (2.9) of Jacobsen (1989), in Definition 3.1.1 of Martinussen & Scheike (2007), in (5.5) of Kalbfleisch & Prentice (1980), and in Theorem 1.3.1 of Fleming & Harrington (1991).
Recall that . We consider here yet another property, which we call status-independent observation. Status-independent observation is the property that
[TABLE]
for -almost all for , and it is called so because it states that between the statuses of surviving up to a certain time, , and having some event at that time, with , the probability, given a certain status, of that status actually being observed does not depend on the status.
As the following result shows, these four properties are in fact equivalent, and we will refer to them collectively as the identifiability property in light of Proposition 1.
Proposition 2.
The following properties are equivalent.
- (2.1) Identity of forces of mortality: for and for any .
- (2.2) The weak martingale property: The processes given by , , for are all martingales with respect to the filtration , the observed information.
- (2.3) Status-independent observation: for -almost all for .
- (2.4) The constant-sum property: for -almost all for .
Proof.
We consider it well known that , defines a martingale with respect to . Under the assumption of (2.1) and since is 0 and there is no increment in outside almost surely, we have that
[TABLE]
almost surely for all which yields the result. On the other hand, assume that (2.2) holds. Then, for a given and a given ,
[TABLE]
Since for , integrating with respect to both sides establishes
[TABLE]
and this yields (2.1).
Generally, and . For , this establishes
[TABLE]
and thereby the equivalence of (2.1) and (2.3), since and have the same null sets on . Assume that (2.1) and (2.3) hold. By using equation (6) of the appendix, it can be seen that for all under this assumption. Since for -almost all for under the assumption, we have established for -almost all for , which is (2.4). Assume instead that (2.4) holds. Equation (8) of the appendix implies that, again, for all . Use of the constant-sum condition again then yields for -almost all for , which is (2.3). ∎
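The link between identity of forces of mortality and status-independent observation can be checked exactly in a small discrete example. The Python sketch below is an illustration only. The distributions are hypothetical, time is discrete, and the observed pair is taken to be the minimum of event and censoring time together with the event type when the event is observed (0 otherwise). The two properties are written in a discrete-time form based on their verbal descriptions above, which is an assumption rather than a transcription of (2.1) and (2.3).

```python
from fractions import Fraction as F


def joint_law(p_event, cens_law):
    """Joint pmf of (event time, event type, censoring time).

    cens_law(t, d) returns the pmf of the censoring time given the event."""
    return {(t, d, c): p * q
            for (t, d), p in p_event.items()
            for c, q in cens_law(t, d).items()}


def prob(law, pred):
    return sum(p for (t, d, c), p in law.items() if pred(t, d, c))


def identity_of_forces(law):
    """Observed and underlying cause-specific hazard increments agree."""
    for s, j in sorted({(t, d) for t, d, _ in law}):
        obs_risk = prob(law, lambda t, d, c: min(t, c) >= s)
        if obs_risk == 0:
            continue  # beyond the observable time range
        true_h = (prob(law, lambda t, d, c: t == s and d == j)
                  / prob(law, lambda t, d, c: t >= s))
        obs_h = prob(law, lambda t, d, c: t == s and d == j and c >= s) / obs_risk
        if true_h != obs_h:
            return False
    return True


def status_independent(law):
    """P(observed | at risk at s) equals P(observed | event of type j at s)."""
    for s, j in sorted({(t, d) for t, d, _ in law}):
        obs_risk = prob(law, lambda t, d, c: min(t, c) >= s)
        if obs_risk == 0:
            continue
        seen_given_risk = obs_risk / prob(law, lambda t, d, c: t >= s)
        seen_given_event = (
            prob(law, lambda t, d, c: t == s and d == j and c >= s)
            / prob(law, lambda t, d, c: t == s and d == j))
        if seen_given_risk != seen_given_event:
            return False
    return True


# Hypothetical law of (event time, event type) with two causes.
p_event = {(1, 1): F(1, 6), (1, 2): F(1, 6), (2, 1): F(1, 4),
           (2, 2): F(1, 12), (3, 1): F(1, 6), (3, 2): F(1, 6)}
uniform = {c: F(1, 4) for c in (1, 2, 3, 4)}

# Censoring time independent of the event.
independent = joint_law(p_event, lambda t, d: uniform)
# Dependent censoring: everyone who would fail from cause 2 is censored at time 1.
dependent = joint_law(p_event, lambda t, d: uniform if d == 1 else {1: F(1)})

print(identity_of_forces(independent), status_independent(independent))
print(identity_of_forces(dependent), status_independent(dependent))
```

In this toy setting both checks pass under independent censoring and both fail under the cause-dependent censoring, consistent with the two properties being equivalent.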
A somewhat stronger martingale property has, however, also been considered. Let for and . Define also a filtration by and an enlarged filtration by . What we call the strong martingale property is that the processes given by
[TABLE]
for , for are all martingales with respect to the enlarged filtration . It appears to be well known that the processes are martingales with respect to . So, loosely speaking, the property states that enlarging the filtration by does not add any information relevant to the processes. This property has similarities to Definition III.2.1 of Andersen et al. (1993) of an independent right censoring concept, which also requires the underlying martingale processes to be martingales with respect to an enlarged filtration. Similarly, Aalen & Johansen (1978) also require the underlying martingale processes to be martingales with respect to an enlarged filtration.
The property that
[TABLE]
for all , and plays a role in Theorem 3.1.1 of Gill (1980), in condition (G) of Jacobsen (1989), and also matches the interpretation of independent right censoring given by Andersen & Keiding (2006, p. 466). We call this the property of non-prognostic observation since it implies that, given survival past time , the extra knowledge that the survival past is observed, , does not influence the prognosis, that is, the probability of having events at a later point in time.
In Williams & Lagakos (1977), survival is said to be independent of the conditions producing censoring when a property like
[TABLE]
for any and -almost all holds for . With inspiration from Lagakos (1979), we will call this property non-prognostic censoring because, under assumption of this property, the censoring does not provide any prognostic information about the event time or type other than survival to the censoring time.
The following result shows that these three properties are equivalent and, moreover, that they are equivalent to the existence of an independent censoring time. We will refer to them collectively as the representativity property because, looking at (3.2) and (3.3), this property implies that those at risk at a given time, , are representative for those being censored at this time, , in terms of the event risks.
Proposition 3.
The following properties are equivalent.
- (3.1) The strong martingale property: The processes that are given by , , for , are martingales with respect to the enlarged filtration .
- (3.2) Non-prognostic observation: for all and .
- (3.3) Non-prognostic censoring: for all and -almost all and .
- (3.4) Existence of an independent censoring time: A censoring time, $C$, exists such that and $C \perp\!\!\!\perp (T,D)$.
Proof.
Assume (3.1) and let and be given. Since , we can use the martingale property to obtain and divide by to get
[TABLE]
The matrix-valued function given by , for , , and for and , is right continuous with left limits in both variables and by (2) is seen to satisfy the conditions of Lemma 8. Thus, we conclude that , which then implies that since as seen earlier. In particular we have for all by this argument, and this extends to all since has probability 0 in either probability measure. We have thereby established (3.2).
Assuming (3.2), we may argue the other way to obtain, for ,
[TABLE]
which is enough to establish the martingale property of (3.1) since is generated by sets of the type for and for .
Assume again (3.2). Then we have the strong martingale property, (3.1), which is seen to imply (2.2) since integration of the -predictable process with respect to the integrator yields , which is then a -martingale and thus also a -martingale. In light of Proposition 2 this means that (2.1) holds. For any given and , equation (10) of the appendix reveals that for any since the integrand is 0 for , since we are assuming (3.2), and since the first two integrals of (10) are always zero because for either for all or . This establishes (3.3).
If we instead assume (3.3), we obtain (2.4) from equation (11) of the appendix and so (2.1) from Proposition 2. Then equation (10) of the appendix shows that (3.2) holds since, again, for either for or .
Assume now that (3.2) holds and let us show (3.4). The construction used is the one given in Appendix 3 and is based on the modification of as defined in equation (3) below. By construction we have that . Furthermore, we see how, for with ,
[TABLE]
according to equations (16) and (17) of the appendix since also (2.1) holds. The conclusion, remains valid for when and so since in this case either or because . For with , we have
[TABLE]
using among other things (3.2) and the product structure of (4) below. The conclusion remains valid when since either side is 0 in this case. Put together, this establishes independence of and and so (3.4).
Under assumption of (3.4) we have , using the independence, and this is (3.2). ∎
As noted by many authors working under assumption of some version of the representativity property, representativity implies identifiability. As demonstrated by Williams & Lagakos (1977) in their setting, the two properties are not equivalent. This is also the case in our setting.
Proposition 4.
The representativity property implies the identifiability property, but the reverse does not hold.
Proof.
In the proof of Proposition 3, the implication has already been established. Let us here present another argument. Assume (3.4) and choose a censoring time $C$ accordingly such that $C \perp\!\!\!\perp (T,D)$. Then (2.3) holds since for almost all for . This shows the implication.
On the other hand, the event time and the observed pair constructed in Section 4 below provide an example where identifiability holds but representativity does not. ∎
3 Censoring by a given censoring time
In this section we consider as given an event time $T$, an event type $D$, and a censoring time $C$. The observed pair is thus explicitly and , which is a special case of the setting in Section 2.
For the censoring time, we denote its survival function , distribution function and cumulative hazard function .
As a result of the asymmetry between and in the definition of where takes priority, a modification of is relevant for it to be comparable to the defined . We let
[TABLE]
define this modification and note that and so . The continuous parts of and are the same so by the characterization of the product integral of Definition 4 from Gill & Johansen (1990), the modification allows for the product structure
[TABLE]
which has technical importance in the following.
If we define , this modification can also be expressed as using the definition of . The difference is seen to be , and, by letting , we see that .
We now consider properties relating to the censoring similar to those of Proposition 2 which are then naturally termed the censoring identifiability property.
Proposition 5.
The following properties are equivalent.
- (5.1) We have that for any .
- (5.2) The process given by , , is a martingale with respect to the filtration , the observed information.
- (5.3) We have that for -almost all .
- (5.4) We have that for -almost all .
Proof.
The equivalence of (5.1) and (5.2) follows by a similar argument as in the proof of Proposition 2 but now using the fact that , , can be shown to define a martingale.
The equivalence of (5.1) and (5.3) is obtained by mimicking the steps in Proposition 2 while using the identity
[TABLE]
The equivalence then follows by noting that .
The identity in (7) of the appendix immediately shows that (5.1) implies (5.4) by exploiting the fact that we have already established the equivalence between (5.3) and (5.1). Similarly, the identity in (10) of the appendix immediately shows that (5.4) implies (5.3). ∎
Williams & Lagakos (1977) considered, with inspiration from Gail (1975), an independent censoring assumption which, in this setting, may be formulated as the following property. The property is
[TABLE]
for and
[TABLE]
for all . In Williams & Lagakos (1977), this assumption was seen to be a stronger assumption than the constant-sum property, here given in (2.4). As was also noted by Kalbfleisch & MacKay (1979), in the setting of their paper, this is the case only because it includes an additional requirement on the given censoring time. This is the content of the following result.
Proposition 6.
The following properties are equivalent.
- (6.1) for and for all .
- (6.2) for and for all .
- (6.3) for -almost all for and for -almost all .
Proof.
Assume that (6.1) holds. The product structure results in and similarly under the assumption. Using the assumption again, we have and which is (6.2).
Assume now that (6.2) holds. Then using the last part of the assumption. Using this in combination with the first part of the assumption yields . We already know that , so the constant sum property of (2.4) and hence also (2.1) follow. The property (5.4) and so (5.1) can be obtained in a similar manner. This establishes (6.1).
From the equalities and which hold for all , the properties of (6.2) and (6.3) are seen to be equivalent. ∎
The equivalence between (6.1) and (6.2) shows that the property introduced by Williams & Lagakos (1977) is equivalent to having both the identifiability and the censoring identifiability property. The property of (6.3) can be considered pointwise independence between and and, as is evident from the proof of Proposition 6, it also implies for all . For this reason, we refer to the properties in Proposition 6 collectively as the property of pointwise independence. It does not imply independence of and , however.
Independence of $C$ and $(T,D)$ is here referred to as full independence. This assumption is made by many authors and is, for instance, used in Kaplan & Meier (1958). In Lagakos (1979), the property is described as strictly stronger than the non-prognostic censoring property from Proposition 3. The next result shows that this is the case only because full independence includes a further property of representativity of the given censoring time. This property is that
[TABLE]
holds for any and -almost all for . We will refer to this as the censoring representativity property as it is a counterpart to (3.2). An argument similar to the one used in Proposition 4 shows that censoring representativity implies censoring identifiability but that the two properties are not equivalent. The following result now applies.
Proposition 7.
Full independence, $C \perp\!\!\!\perp (T,D)$, holds if and only if both the representativity property and the censoring representativity property hold.
Proof.
It is evident that full independence implies (3.4). Similarly, for any and -almost all for under the independence assumption.
Assume instead that the properties of Proposition 3 and (5) hold. By equation (12) of the appendix, (5.4) and so, by Proposition 5, also (5.1). Now (5) states that . This is exactly the conditional distribution of the independent censoring time constructed in the proof of Proposition 3. Since we are assuming that the properties of Proposition 3 hold, the same calculations lead to the independence of and . ∎
4 Examples
4.1 A technical setting
This technical example serves to illustrate the differences between the identifiability and representativity properties.
Consider the probability space with , the Borel -algebra, and the uniform distribution such that for , , all in . The random variables given by and are then independent. We further define the random variables
[TABLE]
and . A direct calculation reveals that the distributions of are all uniform on .
If we define and , then and for any choice of and . That is, any combination of the event and censoring times defined above yields the same observable exit time and exit type.
Note that the representativity property holds for by virtue of (3.4) because is independent of and and so by Proposition 4, the identifiability property also holds for . Thus, the identifiability property also holds for since, for instance, the property of identity of forces of mortality is inherited from as and have the same distribution. A calculation reveals that for -almost all we have and such that non-prognostic censoring and thereby representativity cannot hold for . Similarly, censoring representativity holds for , but cannot hold for .
Since, for , , the cumulative hazard associated with the distribution of is for all such that censoring identifiability cannot hold for .
Figure 1 illustrates the definition of and as well as the observed exit time as a heat map. Note how, for any combination of and , the minimum of their respective graphs corresponds to the graph of . The assumptions met for the various choices of and to produce are summarized in Table 1.
The primary idea behind these examples is that with basis in independent and , we can alter the unobserved parts of the underlying event and censoring time without altering the observed . If the event time is left unaltered but the unobserved part of the censoring time is altered arbitrarily, the representativity property is retained. If the marginal distribution is retained as is the case for and , the identifiability property is retained.
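The idea of altering only unobserved parts can be mimicked in a discrete toy example. The Python sketch below is not the construction used in this section. It is a hypothetical analogue without competing risks, in which probability mass is moved between cells where the event time exceeds the censoring time, so that the law of the observed pair and both marginal distributions are unchanged. Exit type is coded as 1 for an observed event and 0 for a censoring, and both properties are checked in an assumed discrete-time form.

```python
from fractions import Fraction as F

times = (1, 2, 3, 4)
# Baseline: independent event time T and censoring time C (hypothetical laws).
independent = {(t, c): F(1, 4) * F(1, 5)
               for t in times for c in (1, 2, 3, 4, 5)}


def alter_unobserved(law, eps):
    """Shift mass among censored cells (t > c) only, preserving both marginals."""
    out = dict(law)
    out[(3, 1)] += eps
    out[(4, 1)] -= eps
    out[(3, 2)] -= eps
    out[(4, 2)] += eps
    return out


def prob(law, pred):
    return sum(p for (t, c), p in law.items() if pred(t, c))


def observed_law(law):
    """Law of (exit time, exit type) with exit type 1 = event, 0 = censoring."""
    out = {}
    for (t, c), p in law.items():
        key = (min(t, c), int(t <= c))
        out[key] = out.get(key, 0) + p
    return out


def identifiability(law):
    """Observed hazard increments equal the underlying ones at every time."""
    for s in times:
        obs_risk = prob(law, lambda t, c: min(t, c) >= s)
        if obs_risk == 0:
            continue
        obs_h = prob(law, lambda t, c: t == s and c >= s) / obs_risk
        true_h = prob(law, lambda t, c: t == s) / prob(law, lambda t, c: t >= s)
        if obs_h != true_h:
            return False
    return True


def representativity(law):
    """P(T > u | T > s, C > s) equals P(T > u | T > s) for all s < u."""
    for s in times:
        seen = prob(law, lambda t, c: t > s and c > s)
        if seen == 0:
            continue
        for u in (v for v in times if v > s):
            lhs = prob(law, lambda t, c: t > u and c > s) / seen
            rhs = prob(law, lambda t, c: t > u) / prob(law, lambda t, c: t > s)
            if lhs != rhs:
                return False
    return True


altered = alter_unobserved(independent, F(1, 40))
print(observed_law(altered) == observed_law(independent))
print(identifiability(altered), representativity(altered))
```

The altered law produces exactly the same observable distribution as the independent baseline and keeps the hazard identity, yet it fails the representativity check, in line with Proposition 4.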
4.2 A practical setting
As an illustration of a practical setting, we can consider the following example of a register-based study. Suppose we are interested in studying the cumulative incidences of different causes of death in a certain population. In this case, we can let $(T,D)$ denote the pair of time of death and cause of death for a randomly picked member of the population. Imagine that we have information on age and cause of death of population members except in the case of emigration from the population. In other words, we have information on , which equals if the time of death is observed and is the time of emigration otherwise, and , which is the cause of death if the time of death is observed and 0, denoting emigration, otherwise. As can be seen from Proposition 13 of the appendix, data on the observed pair alone does not allow us to refute the idea that is produced by and a time to emigration that are independent, $C \perp\!\!\!\perp (T,D)$. However, in this case, common sense tells us that emigration cannot happen after death so the time to emigration can never be independent of . Instead, one should rather define when or, as in Section 2, simply not trouble oneself with defining a time to emigration for all individuals.
Suppose now we are interested in estimating the cumulative incidence proportion for various time points for the different causes of death . In the end, the problem of defining a time to emigration has no bearing on the validity of the Aalen–Johansen estimator as an estimate of . We instead require the identifiability property relating to directly as laid out in Proposition 2. In terms of the identity of forces of mortality property, this requirement has the interpretation that the observable hazard of any of the causes of death as represented by should equal the underlying hazard of the same cause as represented by on the relevant time interval. In terms of the status-independent observation property, the requirement has the interpretation that the status of survival up to any time point and the status of death of a certain cause at the same time point are equally likely to be observed, that is, the probability of not emigrating before that time point given the status does not depend on the status.
If we are instead interested in using the Aalen–Johansen estimator for prognosis, we need a stronger assumption. Suppose we are looking at population members that are alive and have not emigrated at time point and we are interested in estimating the probabilities of dying of the different causes before time . In other words, we are interested in estimating . Under the identifiability assumption, a valid estimate of can be obtained based on the Aalen–Johansen and related Kaplan–Meier estimator. In order for this estimate to be a valid estimate of as well, an assumption of the non-prognostic observation property from Proposition 3 is needed. That is, we require the stronger representativity property to hold. By the equivalence to non-prognostic censoring, this entails that population members emigrating at time have the same probabilities of dying of certain causes as members that are alive at time .
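The step from marginal estimates to a prognosis can be written out as follows. The notation is an assumption made for this illustration: $S$ is the survival function of $T$, $F_j$ is the cumulative incidence function of cause $j$, hats denote the Kaplan–Meier and Aalen–Johansen estimates, and $s < t$ are the two time points with $S(s) > 0$.

```latex
P(T \le t,\, D = j \mid T > s) = \frac{F_j(t) - F_j(s)}{S(s)},
\qquad
\widehat{P}(T \le t,\, D = j \mid T > s)
  = \frac{\hat F_j(t) - \hat F_j(s)}{\hat S(s)}.
```

The first identity holds by definition of the conditional probability. Under the identifiability property the plug-in estimate on the right is consistent for it. The representativity property is what additionally allows the same quantity to be read as the probability given that survival past $s$ is observed.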
The representativity assumption also implies the existence of a censoring time independent of which corresponds to the time to emigration for individuals who emigrate. It may be useful to think in terms of such a , but its value for individuals who are not observed to be emigrating should not, at least without further assumptions, be confused with a counterfactual emigration time that would have been observed if death had not occurred beforehand. In fact, the censoring time may not have any relevant interpretation for individuals who are observed to die.
In register-based studies, censoring at end of follow-up may be much more prominent than, for instance, censoring by emigration from the population. End of follow-up is an example of a censoring time that may be defined explicitly without consideration of the underlying . This extra piece of information can be used to judge whether censoring identifiability and censoring representativity are appropriate, but these properties do not help us in judging the validity of the representativity or identifiability properties related to and thus the validity of the Aalen–Johansen and Kaplan–Meier estimators.
5 Discussion
When no given, underlying censoring time is considered, the assumptions that we have studied that ensure consistency of the Kaplan–Meier and Aalen–Johansen estimators fall in two categories: an identifiability assumption and a representativity assumption. Although the properties within one category are all equivalent, they are quite different in their interpretations and hence some may be easier to communicate to a clinical researcher than others. Which interpretation is most suitable is a matter of preference but it seems to us that the properties of status-independent observation and non-prognostic observation are much easier to interpret and potentially refute than, for example, the corresponding martingale properties.
The appropriateness of either assumption cannot be assessed based on information on the exit time and exit type alone, as is seen from Proposition 13 of the appendix, which ensures the existence of an event time and type that realize the observed exit time and exit type and at the same time satisfy the representativity assumption. This is in a similar vein to the result by Molenberghs et al. (2008) that one cannot distinguish between missing-at-random and missing-not-at-random models based on only the observed data. Consequently, any information used to assess the validity of the identifiability or representativity assumption must come from an external source.
Other properties than the ones treated in this paper have been considered in the literature. Ebrahimi et al., (2003) considered a certain property and proceeded to argue its equivalence to the constant-sum property. Jacobsen, (1989) considered three nested properties in a setting where the observed censoring times need not be independent and identically distributed, see his Proposition 3.4. It appears that these three properties all translate into equivalents of the representativity property in our setting.
Our focus has been on marginal distributions, but in regression analysis in a survival analysis context, a similar question of necessary assumptions on the censoring mechanism is highly relevant. Seemingly, versions of the assumptions studied here in the conditional distribution given covariates of a regression model are useful in this respect.
Acknowledgements
The authors would like to thank an anonymous referee for valuable comments that have greatly improved the manuscript. Morten Overgaard is supported by the Novo Nordisk Foundation grant NNF17OC0028276.
Appendix 1
Technical results
Consider the matrix defined in (1). We then have the following characterization of the product integral .
Lemma 8.
Consider a matrix-valued function given by for which is right continuous with left limits in both variables. Then, for given , for all if and only if
[TABLE]
for , , and for for all .
Proof.
This is a special case of Theorem 5 of Gill & Johansen, (1990), which establishes that if and only if the forward equation for all holds. The only solutions to the equations for all for implied by the forward equation, are , see for instance Theorem 10 of Gill & Johansen, (1990). This, in turn, implies that for for to be a solution to the forward equation. ∎
In the following, we give some useful identities in the setup of Section 3 where event time , event type , and censoring time are all given and we observe and . The identities not involving the may, however, also be used in the setting of Section 2 where a censoring time is not explicitly given.
Lemma 9.
The equalities
[TABLE]
and
[TABLE]
hold for all .
Proof.
Since the product integral structures and hold, the equality holds according to the Duhamel equation, see Theorem 6 of Gill & Johansen, (1990). Note that and recall that to obtain the equality (6).
A similar argument leads to . The equality (7) now follows by realizing that and . ∎
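The Duhamel equation used in this proof can be checked numerically in the simplest case, where both integrators are step functions with finitely many jumps at the same time points. In that case the product integrals reduce to ordered finite products of matrices of the form identity plus jump, and the Duhamel equation becomes a telescoping sum. The following is a minimal sketch under that assumption. The helper names (`product_integral`, `duhamel_rhs`, `random_increments`) and the randomly generated jump matrices are illustrative only and do not come from the paper.

```python
import random

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b, sign=1.0):
    # Entrywise a + sign * b.
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def product_integral(increments, n):
    """Ordered product of (I + dA_k) over finitely many jumps dA_k."""
    result = identity(n)
    for d in increments:
        result = mat_mul(result, mat_add(identity(n), d))
    return result

def duhamel_rhs(inc_a, inc_b, n):
    """Sum over k of prod_{i<k}(I + dA_i) (dA_k - dB_k) prod_{i>k}(I + dB_i)."""
    total = [[0.0] * n for _ in range(n)]
    for k in range(len(inc_a)):
        left = product_integral(inc_a[:k], n)
        right = product_integral(inc_b[k + 1:], n)
        jump = mat_add(inc_a[k], inc_b[k], sign=-1.0)
        total = mat_add(total, mat_mul(mat_mul(left, jump), right))
    return total

def random_increments(rng, n, m):
    # Hypothetical jump matrices, used only to exercise the identity.
    return [[[rng.uniform(-0.1, 0.1) for _ in range(n)] for _ in range(n)]
            for _ in range(m)]

rng = random.Random(0)
inc_a = random_increments(rng, 3, 5)
inc_b = random_increments(rng, 3, 5)

# Left-hand side: difference of the two product integrals.
lhs = mat_add(product_integral(inc_a, 3), product_integral(inc_b, 3), sign=-1.0)
rhs = duhamel_rhs(inc_a, inc_b, 3)
max_gap = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
print(max_gap)  # agreement up to floating-point rounding
```

The check only covers the discrete case. The general statement for right-continuous integrators of bounded variation is Theorem 6 of Gill & Johansen, (1990).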
Lemma 10.
The equalities
[TABLE]
and
[TABLE]
hold for all .
Proof.
We know that and similarly that . Since , we have . Recall that . A change in the order of integration reveals that , where the definition of is used once more. Put together, this establishes that the equality
[TABLE]
holds for all . Now the desired result follows since . As for the second equality, similar arguments lead to and (9) then follows since . ∎
Lemma 11.
The equality
[TABLE]
holds for all and with .
Proof.
As a preliminary step, we have . On the other hand, an application of the Duhamel equation in dimensions, or a direct calculation, reveals that . Subtract the first expression from the second expression to obtain (10). ∎
Lemma 12.
The equalities
[TABLE]
and
[TABLE]
hold for all for all .
Proof.
A change in the order of integration shows that . Split up , where we have , and put together to obtain (11). The argument for (12) is similar. ∎
Appendix 2
Convergence of the Aalen–Johansen estimator
For an i.i.d. sample of , we let denote the Nelson–Aalen estimator for and . If we define the matrix
[TABLE]
then the Aalen–Johansen estimator is defined as
[TABLE]
By arguments similar to Section 4.2 of Gill & Johansen, (1990), we see that almost surely for for all and and thus also almost surely for for all . By continuity of the product integral (Gill & Johansen, 1990) we conclude that
[TABLE]
almost surely as for all .
The Kaplan–Meier estimator for the all-cause survival function is defined as
[TABLE]
which is just entry of .
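As an illustration of the two estimators, the sketch below computes the Aalen–Johansen estimate as an ordered product of matrices of the form identity plus Nelson–Aalen jumps, and compares one of its entries with the Kaplan–Meier estimate of all-cause survival. Several conventions here are assumptions made for the example and are not taken from the paper: state 0 is the initial state, states 1 to k are the competing event types, an exit type of 0 codes a censored observation, and subjects who exit at a time point count as at risk at that time point. The toy data set is invented.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def event_times_up_to(times, types, t):
    # Distinct time points at or before t with at least one observed event.
    return sorted({s for s, d in zip(times, types) if d != 0 and s <= t})

def aalen_johansen(times, types, n_causes, t):
    """Estimated transition matrix on (0, t] as a finite ordered product."""
    size = n_causes + 1
    p = identity(size)
    for s in event_times_up_to(times, types, t):
        at_risk = sum(1 for u in times if u >= s)
        # Nelson-Aalen jump for each cause j at time s.
        d_lambda = [sum(1 for u, d in zip(times, types) if u == s and d == j)
                    / at_risk for j in range(1, n_causes + 1)]
        step = identity(size)
        step[0][0] = 1.0 - sum(d_lambda)
        for j in range(1, size):
            step[0][j] = d_lambda[j - 1]
        p = mat_mul(p, step)
    return p

def kaplan_meier(times, types, t):
    """Kaplan-Meier estimate of all-cause survival at time t."""
    surv = 1.0
    for s in event_times_up_to(times, types, t):
        at_risk = sum(1 for u in times if u >= s)
        events = sum(1 for u, d in zip(times, types) if u == s and d != 0)
        surv *= 1.0 - events / at_risk
    return surv

# Invented exit times and exit types (0 = censored, 1 and 2 = event types).
times = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0]
types = [1, 0, 2, 1, 0, 2]

P = aalen_johansen(times, types, n_causes=2, t=3.0)
km = kaplan_meier(times, types, t=3.0)
print(P[0])  # estimated state occupation probabilities at time 3
print(km)    # equals P[0][0] up to rounding
```

With this indexing the Kaplan–Meier estimate coincides with the top-left entry of the estimated matrix, and the first row sums to one. For real analyses an established survival analysis package is preferable to this quadratic-time sketch.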
Appendix 3
Constructing latent times
Let us consider a probability space on which random variables and are defined. We now want to extend the probability space in order to define a random variable that satisfies when and when and follows a certain conditional distribution given . The desired conditional cumulative distribution function is given by where is defined in (3) solely based on the distribution of . For given and , the function is right-continuous and increasing. For a right-continuous and increasing function , the inversion, as defined in for instance Section II.2a of Asmussen & Glynn, (2007), given by is a useful concept. Using right continuity and that is increasing, the conclusion that if and only if can be reached. Extend the sample space to and the -algebra to , where is the Borel -algebra, and the probability measure by . By these extensions, the random variable defined on the probability space and given by for follows a uniform distribution on and is independent of . The random variable defined by for , where is the inversion of , now fulfills when and when and has the desired conditional distribution. Renaming to , we note that, for ,
[TABLE]
where the last equation uses and . Since , these equalities can also be used to establish that
[TABLE]
This reveals that the constructed is proper when and only when for .
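The inversion step in this construction can be sketched for a step function with finitely many jumps. The code assumes the usual definition of the inversion of a right-continuous increasing function F, namely the smallest x with F(x) at least u, and takes the value to be plus infinity when no such x exists, which corresponds to the improper case where the total mass is below one. The support points, the masses, and the function names are hypothetical and chosen only for illustration.

```python
import math
import random

def make_step_cdf(points, masses):
    """Right-continuous step function F(x) = total mass at points <= x."""
    def cdf(x):
        return sum(m for p, m in zip(points, masses) if p <= x)
    return cdf

def generalized_inverse(points, masses, u):
    """Smallest x with F(x) >= u, or +infinity if F never reaches u."""
    cumulative = 0.0
    for p, m in sorted(zip(points, masses)):
        cumulative += m
        if cumulative >= u:
            return p
    return math.inf

# Hypothetical improper distribution: total mass 0.75, so 0.25 is "missing".
points = [1.0, 2.0, 4.0]
masses = [0.25, 0.25, 0.25]
cdf = make_step_cdf(points, masses)

# The equivalence: inverse(u) <= x exactly when u <= F(x), for u in (0, 1].
grid_u = [0.1, 0.25, 0.3, 0.5, 0.75, 0.8, 1.0]
grid_x = [0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 5.0]
equivalence_holds = all(
    (generalized_inverse(points, masses, u) <= x) == (u <= cdf(x))
    for u in grid_u for x in grid_x
)

# Inverse transform sampling: push uniform draws on (0, 1] through the inverse.
rng = random.Random(1)
draws = [generalized_inverse(points, masses, 1.0 - rng.random())
         for _ in range(10000)]
fraction_infinite = sum(1 for v in draws if math.isinf(v)) / len(draws)
print(equivalence_holds, fraction_infinite)
```

The fraction of infinite draws approximates the missing mass, which mirrors the remark that the constructed variable is proper only when the conditional distribution function reaches one.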
In a setting identical to above, we now want to construct such that when and when and such that has a certain conditional distribution given . Here, the desired conditional cumulative distribution function is given by . This can be achieved in a manner similar to above. A two step procedure is to construct according to the conditional cumulative distribution function given by and then to construct according to the conditional cumulative distribution function given by
[TABLE]
where division by 0 can be taken to produce 0. The so constructed pair satisfies in particular for and for and so also
[TABLE]
These constructions lead to the following proposition, which has similarities to Theorem 2 of Tsiatis, (1975).
Proposition 13.
Given positive random variable and random variable with values in , positive random variables and and random variable with values in exist such that , and such that is independent of .
Proof.
Construct , and as described above. The distribution of is given by and the distribution of is given by . Equation (16) reveals that, for , . By construction we have, for , . According to Proposition 3, this implies for all and . Using this and (4), we have, for with ,
[TABLE]
Both sides are 0 if . We have thereby established that for all and , and so and are independent. ∎
References
- Aalen, OO, & Johansen, S. 1978. An empirical transition matrix for non-homogeneous Markov chains based on censored observations. Scandinavian Journal of Statistics, 141–150.
- Andersen, PK, & Keiding, N. 2006. Survival and event history analysis. Wiley.
- Andersen, PK, Borgan, Ø, Gill, RD, & Keiding, N. 1993. Statistical models based on counting processes. Springer Series in Statistics. Springer-Verlag, New York.
- Asmussen, S, & Glynn, PW. 2007. Stochastic simulation: algorithms and analysis. Vol. 57. Springer Science & Business Media.
- Ebrahimi, N, Molefe, D, & Ying, Z. 2003. Identifiability and censored data. Biometrika, 90 (3), 724–727.
- Elandt-Johnson, RC. 1976. Conditional failure time distributions under competing risk theory with dependent failure times and proportional hazard rates. Scandinavian Actuarial Journal, 1976 (1), 37–51.
- Fleming, TR, & Harrington, DP. 1991. Counting Processes and Survival Analysis. John Wiley & Sons.
- Gail, M. 1975. A review and critique of some models used in competing risk analysis. Biometrics, 209–222.
