The nonparametric bootstrap for the current status model
Piet Groeneboom, Kim Hendrickx

TL;DR
This paper demonstrates that while direct bootstrap of the nonparametric MLE in the current status model is inconsistent, bootstrapping functionals of the MLE yields valid confidence intervals, supported by convergence results.
Contribution
It introduces a method for valid bootstrap inference in the current status model by focusing on functionals of the MLE, overcoming previous inconsistency issues.
Findings
Bootstrapped MLE converges at the correct rate in Lp-distance.
Bootstrapping functionals of the MLE produces valid confidence intervals.
Results extend to the current status regression model.
Abstract
It has been proved that direct bootstrapping of the nonparametric maximum likelihood estimator (MLE) of the distribution function in the current status model leads to inconsistent confidence intervals. We show that bootstrapping of functionals of the MLE can however be used to produce valid intervals. To this end, we prove that the bootstrapped MLE converges at the right rate in the $L_p$-distance. We also discuss applications of this result to the current status regression model.
| Estimate | n | mean | var | MSE | Wald-type CI: CP | Wald-type CI: AL | Bootstrap CI: CP | Bootstrap CI: AL |
|---|---|---|---|---|---|---|---|---|
| SSE | 100 | 0.498943 | 0.310723 | 0.310968 | 0.978 | 0.265883 | 0.824 | 0.204163 |
| | 500 | 0.499717 | 0.220885 | 0.220925 | 0.982 | 0.097457 | 0.897 | 0.080317 |
| | 1000 | 0.500720 | 0.217415 | 0.217933 | 0.977 | 0.065837 | 0.924 | 0.055648 |
| | 5000 | 0.499993 | 0.195111 | 0.195112 | 0.977 | 0.027159 | 0.945 | 0.024423 |
| MRCE | 100 | 0.497996 | 0.308180 | 0.308582 | 0.979 | 0.268731 | 0.821 | 0.205522 |
| | 500 | 0.499761 | 0.251232 | 0.251260 | 0.978 | 0.098028 | 0.862 | 0.089143 |
| | 1000 | 0.500553 | 0.246388 | 0.246693 | 0.973 | 0.063990 | 0.911 | 0.053129 |
| | 5000 | 0.499876 | 0.208386 | 0.208462 | 0.965 | 0.027197 | 0.922 | 0.026987 |
| ESE | 100 | 0.500145 | 0.337755 | 0.337757 | 0.964 | 0.252687 | 0.824 | 0.223849 |
| | 500 | 0.499671 | 0.217428 | 0.217482 | 0.978 | 0.094390 | 0.896 | 0.080003 |
| | 1000 | 0.500742 | 0.207401 | 0.207953 | 0.973 | 0.063990 | 0.911 | 0.053129 |
| | 5000 | 0.500228 | 0.185614 | 0.185874 | 0.972 | 0.026396 | 0.904 | 0.022285 |

| Estimate | n | mean | var | MSE | Wald-type CI: CP | Wald-type CI: AL | Bootstrap CI: CP | Bootstrap CI: AL |
|---|---|---|---|---|---|---|---|---|
| SSE | 100 | 0.935732 | 4.525330 | 4.938096 | 0.922 | 1.000283 | 0.855 | 0.79952 |
| | 500 | 0.966217 | 4.676249 | 5.246881 | 0.926 | 0.399728 | 0.902 | 0.364210 |
| | 1000 | 0.977799 | 5.032432 | 5.525339 | 0.933 | 0.279928 | 0.914 | 0.262449 |
| | 5000 | 0.989466 | 4.580756 | 5.135616 | 0.945 | 0.124375 | 0.948 | 0.121388 |
| MRCE | 100 | 1.038510 | 8.500588 | 8.648890 | 0.925 | 1.125225 | 0.889 | 1.364034 |
| | 500 | 1.006050 | 6.443404 | 6.461690 | 0.932 | 0.429007 | 0.912 | 0.473787 |
| | 1000 | 1.002680 | 6.294143 | 6.301326 | 0.939 | 0.296537 | 0.903 | 0.320908 |
| | 5000 | 0.998502 | 5.160694 | 5.171915 | 0.962 | 0.129512 | 0.954 | 0.136487 |
| ESE | 100 | 0.974199 | 5.722576 | 5.789144 | 0.768 | 0.604649 | 0.827 | 0.910229 |
| | 500 | 0.998806 | 5.984291 | 5.985003 | 0.823 | 0.290297 | 0.902 | 0.430819 |
| | 1000 | 1.005545 | 6.032743 | 6.063495 | 0.841 | 0.214280 | 0.928 | 0.302124 |
| | 5000 | 1.002462 | 5.244373 | 5.274692 | 0.892 | 0.104281 | 0.951 | 0.131427 |
Piet Groeneboom
Delft University of Technology, Mekelweg 4, 2628 CD Delft, The Netherlands.
http://dutiosc.twi.tudelft.nl/~pietg
Kim Hendrickx
Hasselt University, I-BioStat, Agoralaan, B3590 Diepenbeek, Belgium.
http://www.uhasselt.be/fiche_en?voornaam=Kim&naam=HENDRICKX
AMS subject classifications: 62G09, 62N01.
Keywords: bootstrap, current status, MLE, smooth functionals.
arXiv:1701.07359
1 Introduction
In the current status model, the variable of interest is a survival variable $X$ with distribution function $F_0$. However, instead of observing the exact survival time $X$, a censoring variable $T$ is observed together with the indicator $\Delta = 1_{\{X \le T\}}$. Such data arise naturally in clinical trials when a patient can only be checked at a single measurement time due to destructive testing. A lot of research has been published on the behavior of the maximum likelihood estimator (MLE) $\hat F_n$ of the distribution function $F_0$. Writing $f_0$ for the density of $F_0$ and $g$ for the density of $T$, the limiting distribution of $\hat F_n(t)$, after scaling by the constant $\{4F_0(t)(1-F_0(t))f_0(t)/g(t)\}^{1/3}$, is given by
$$n^{1/3}\,\frac{\hat F_n(t)-F_0(t)}{\{4F_0(t)(1-F_0(t))f_0(t)/g(t)\}^{1/3}} \stackrel{d}{\longrightarrow} \operatorname*{argmin}_{s\in\mathbb{R}}\{W(s)+s^2\},$$
where $W$ is a two-sided Brownian motion with $W(0)=0$ (see [19]). Other estimators with similar asymptotic properties are Chernoff’s estimator of the mode ([6]), the Grenander estimator ([10]) of a nonincreasing density, Manski’s maximum score estimator ([27]) and Rousseeuw’s least median of squares estimator ([29]). A general framework for cube-root asymptotics is given in [25].
In this paper we investigate the behavior of Efron’s nonparametric bootstrap method ([9]) for constructing confidence intervals for smooth functionals of the MLE. It is known that the nonparametric bootstrap is inconsistent for generating the limit distribution of the MLE. The authors of [2] prove that (conditional on the data),
[TABLE]
where $\hat F_n^*$ is the bootstrap MLE and the two processes appearing in the limit are independent two-sided Brownian motions originating at zero. A similar result is obtained in [26] and in [31] for the Grenander estimator. The maximum score estimator of [27] is another example of a cube-root statistic, with asymptotic distribution derived in [25]; inconsistency of the nonparametric bootstrap for this estimator is shown in [2].
Constructing asymptotic confidence intervals for the distribution function $F_0$ in the current status model based on Chernoff’s distribution and the normalizing constant is complicated by the need to compute the critical values of the limiting distribution and to estimate the density $f_0$ of $F_0$ consistently. Since this turns out to be a rather difficult task, several alternative bootstrap methods have been proposed based on resampling from a smooth estimate. The authors of [32] consider a smooth kernel estimate of $F_0$ and resample the indicators $\Delta_i$ from a Bernoulli distribution with success probability equal to this estimate evaluated at $T_i$, while keeping the censoring variables $T_i$ fixed; they center the values in the bootstrap samples by subtracting the smooth estimate of the distribution function. [26] and [31] propose similar smooth resampling schemes for the Grenander estimator, and a model-based smoothed bootstrap procedure for inference on the maximum score estimator is developed in [28]. All methods result in consistent estimation of the (suitably standardized) distribution conditional on the original data.
A drawback of this approach is that it relies on smoothness conditions on $F_0$ which allow faster than cube-root estimation of $F_0$. This raises the question whether one should really use confidence intervals based on the MLE instead of on a faster converging estimate.
This latter procedure is followed in [14], where the authors construct confidence intervals around the smoothed maximum likelihood estimator (SMLE) of $F_0$ in the current status model. The SMLE is a kernel estimate based on the MLE with an asymptotic normal distribution, instead of Chernoff’s limiting distribution ([16]). The bootstrap method proposed in [14] is, however, still based on the smooth bootstrap procedure described in [32] and not on Efron’s nonparametric bootstrap. We show in this paper that the construction of confidence intervals around the SMLE based on the nonparametric bootstrap can also be proved to be valid; here one does not resample from a smooth estimate of $F_0$, but just resamples with replacement from the pairs $(T_i,\Delta_i)$ in the original sample. This method has already been used without proof in [17] and [18], and the present manuscript intends to fill the gap of the missing proofs. An important difference with the smooth bootstrap in [14] is that, for the centering of the estimates in the nonparametric bootstrap samples, the SMLE of the original sample is used. This will not work for the resampling proposed in [14], where one needs to center the estimates in the bootstrap samples by a kernel convolution of the SMLE in the original sample. It is not clear which method is better, and the most striking fact is the similarity of the results of the two methods in our simulations. An advantage of the purely nonparametric bootstrap discussed in the present paper might be its conceptual simplicity: the bootstrap estimates are centered with the SMLE itself rather than with a convolution of the SMLE. An advantage of the smooth bootstrap discussed in [14] might be that only the indicators $\Delta_i$ are resampled, so that one stays closest to the sample distribution of the observation times $T_i$, which stay fixed in this procedure.
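The two resampling schemes contrasted above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the simulated exponential and uniform data, and the use of the true distribution function as a stand-in for a kernel estimate are all assumptions made for the example.

```python
import numpy as np

def nonparametric_resample(t, delta, rng):
    """Efron's bootstrap: draw n pairs (T_i, Delta_i) with replacement."""
    n = len(t)
    idx = rng.integers(0, n, size=n)  # indices sampled uniformly with replacement
    return t[idx], delta[idx]

def smooth_resample(t, f_smooth_at_t, rng):
    """Smooth bootstrap: keep the T_i fixed and redraw each Delta_i as
    Bernoulli(p_i), where p_i is a smooth estimate of F_0 evaluated at T_i."""
    delta_star = (rng.random(len(t)) < f_smooth_at_t).astype(int)
    return t, delta_star

# Hypothetical data: exponential survival times, uniform observation times.
rng = np.random.default_rng(0)
n = 200
x = rng.exponential(1.0, n)    # survival times (not observed in practice)
t = rng.uniform(0.0, 2.0, n)   # observation (censoring) times
delta = (x <= t).astype(int)   # current status indicators

# Scheme 1: resample the observed pairs.
t_star, delta_star = nonparametric_resample(t, delta, rng)

# Scheme 2: resample only the indicators. The true F_0 is used here purely as
# a placeholder for a kernel estimate, which this sketch does not compute.
t_fixed, delta_smooth = smooth_resample(t, 1.0 - np.exp(-t), rng)
```

In the first scheme both coordinates of each bootstrap observation change; in the second the observation times are exactly those of the original sample and only the indicators are redrawn.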
Although it is argued in [8] that the naive bootstrap will not work for their goodness-of-fit test for monotone functions, based on the Grenander estimator, no theoretical justification for this conjecture is given. Other examples where a smooth bootstrap procedure is used are the likelihood ratio type two-sample test for current status data proposed by [11] and the test for equality of functions under monotonicity constraints proposed by [7]. For both tests, asymptotic normality of the test statistic is established.
The paper is organized as follows. In Section 2 we introduce the current status model and review some interesting properties of the MLE. The validity of the nonparametric bootstrap is discussed in Section 3. In Section 4 we provide two examples to illustrate the applicability of our result. In the first example we construct pointwise confidence intervals based on the smoothed MLE in the current status model. The second example deals with inference on a finite-dimensional regression parameter in the current status linear regression model. For both examples, the theoretical and finite sample behavior of the nonparametric bootstrap is discussed. Section 5 presents some concluding remarks. The proofs of our results are given in Section 6.
2 The current status model and the MLE
Let $(T_1,\Delta_1),\dots,(T_n,\Delta_n)$ be an i.i.d. sample defined on a common probability space, where $\Delta_i = 1_{\{X_i \le T_i\}}$. The $X_i$ are interpreted as (nonnegative) survival times with distribution function $F_0$. Instead of observing $X_i$, a censoring variable $T_i$ is observed (with density $g$) independent of $X_i$. One could say that in the current status model, each observation represents the current status of the item at time $T_i$. The density of $(T_i,\Delta_i)$ with respect to the product of Lebesgue measure and counting measure on $\{0,1\}$ is given by
$$p_{F_0}(t,\delta) = \{\delta F_0(t) + (1-\delta)(1-F_0(t))\}\,g(t).$$
The maximum likelihood estimator $\hat F_n$ is defined as the maximizer of the log likelihood given by (up to a constant not depending on $F$),
$$\ell_n(F) = \sum_{i=1}^n \left[\Delta_i \log F(T_i) + (1-\Delta_i)\log\{1-F(T_i)\}\right],$$
over all distribution functions $F$. [19] show that the MLE can be characterized as the left-continuous slope of the greatest convex minorant of a cumulative sum diagram consisting of the points $(0,0)$ and
$$\Bigl(i, \sum_{j=1}^{i} \Delta_{(j)}\Bigr), \qquad i=1,\dots,n,$$
where we let $T_{(i)}$ denote the $i$th order statistic of the $T_i$ and $\Delta_{(i)}$ be the $\Delta_j$ corresponding to it (assuming no ties are present in the data). An important property of the MLE is the so-called switch relation, see [17] p. 69. Let $G_n$ be the empirical distribution function of $T_1,\dots,T_n$ and define the process $V_n$ by
[TABLE]
and the process (in ) by
[TABLE]
Then, taking , we get the switch relation:
[TABLE]
see also Figure 2.
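The characterization of the MLE as the slope of the greatest convex minorant of the cumulative sum diagram can be sketched numerically. The slopes of that minorant coincide with the isotonic regression of the indicators ordered by observation time, which the pool-adjacent-violators algorithm computes. The function name `current_status_mle` and the simulated data below are assumptions made for illustration, not part of the paper.

```python
import numpy as np

def current_status_mle(t, delta):
    """Return the sorted observation times and the MLE evaluated at them.

    The values are the left-hand slopes of the greatest convex minorant of the
    cusum diagram (i, sum_{j<=i} Delta_(j)), computed by pooling adjacent
    violators. Ties in t are assumed absent, as in the text.
    """
    order = np.argsort(t, kind="stable")
    t_sorted = np.asarray(t, dtype=float)[order]
    d_sorted = np.asarray(delta, dtype=float)[order]

    sums, counts = [], []  # one (sum, count) entry per block of pooled points
    for d in d_sorted:
        sums.append(d)
        counts.append(1)
        # Merge the last two blocks while their means violate monotonicity
        # (comparison of means written without division).
        while len(sums) > 1 and sums[-2] * counts[-1] > sums[-1] * counts[-2]:
            s, c = sums.pop(), counts.pop()
            sums[-1] += s
            counts[-1] += c

    slopes = np.array(sums) / np.array(counts)
    return t_sorted, np.repeat(slopes, counts)

# Hypothetical data: exponential survival times, uniform observation times.
rng = np.random.default_rng(1)
n = 500
x = rng.exponential(1.0, n)
t = rng.uniform(0.0, 2.0, n)
delta = (x <= t).astype(int)

t_sorted, f_hat = current_status_mle(t, delta)
```

The returned values are nondecreasing and lie in [0, 1], so they define a (sub)distribution function at the observation times. In this sketch the estimate between observation times is left unspecified.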
References
- [1] Abrevaya, J. (1999). Rank regression for current-status data: asymptotic normality. Statist. Probab. Lett. 43, 275–287. doi:10.1016/S0167-7152(98)00267-3. MR1708095.
- [2] Abrevaya, J. and Huang, J. (2005). On the bootstrap of the maximum score estimator. Econometrica 73, 1175–1204.
- [3] Balabdaoui, F., Groeneboom, P. and Hendrickx, K. (2017). Score estimation in the monotone single index model. Working paper.
- [4] Banerjee, M. and Wellner, J. A. (2005). Confidence intervals for current status data. Scand. J. Statist. 32, 405–424. doi:10.1111/j.1467-9469.2005.00454.x. MR2204627.
- [5] Cheng, G., Huang, J. Z. et al. (2010). Bootstrap consistency for general semiparametric M-estimation. Ann. Statist. 38, 2884–2915.
- [6] Chernoff, H. (1964). Estimation of the mode. Ann. Inst. Statist. Math. 16, 31–41. MR0172382.
- [7] Durot, C., Groeneboom, P. and Lopuhaä, H. P. (2013). Testing equality of functions under monotonicity constraints. J. Nonparametr. Stat. 25, 939–970. doi:10.1080/10485252.2013.826356.
- [8] Durot, C. and Reboul, L. (2010). Goodness-of-fit test for monotone functions. Scand. J. Statist. 37, 422–441.
