Efficient Estimation of the Maximal Association between Multiple Predictors and a Survival Outcome
Tzu-Jung Huang, Alex Luedtke, Ian W. McKeague

TL;DR
This paper introduces a scalable, reliable method for testing the maximal association between high-dimensional predictors and survival outcomes, addressing bias and computational challenges in post-selection inference.
Contribution
It develops semi-parametrically efficient estimators and a stabilization technique for valid, scalable inference on predictor-survival associations in high-dimensional settings.
Findings
The test remains valid even when the number of predictors grows superpolynomially with the sample size
Simulation results support asymptotic theory at moderate sample sizes
Applied to viral gene expression data to identify relevant patterns
Abstract
This paper develops a new approach to post-selection inference for screening high-dimensional predictors of survival outcomes. Post-selection inference for right-censored outcome data has been investigated in the literature, but much remains to be done to make the methods both reliable and computationally scalable in high dimensions. Machine learning tools are commonly used to provide predictions of survival outcomes, but the estimated effect of a selected predictor suffers from confirmation bias unless the selection is taken into account. The new approach involves construction of semi-parametrically efficient estimators of the linear association between the predictors and the survival outcome, which are used to build a test statistic for detecting the presence of an association between any of the predictors and the outcome. Further, a stabilization technique reminiscent of…
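The selection bias mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the paper's estimator: it ignores censoring entirely, uses plain sample correlations, and all names (`max_abs_correlation`, `simulate_null`) and parameter values are hypothetical choices made for illustration. It generates predictors that are independent of the outcome, then compares the absolute correlation of the predictor chosen after looking at the data with that of a predictor fixed in advance.

```python
import numpy as np


def max_abs_correlation(X, y):
    """Return (index, correlation) of the column of X whose sample
    correlation with y is largest in absolute value."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc.T @ yc) / (np.linalg.norm(Xc, axis=0) * np.linalg.norm(yc))
    k = int(np.argmax(np.abs(corr)))
    return k, float(corr[k])


def simulate_null(n=100, p=500, reps=200, seed=0):
    """Average |correlation| of the selected predictor versus a pre-specified
    one when every true correlation is zero (assumed toy setting, no censoring)."""
    rng = np.random.default_rng(seed)
    selected = np.empty(reps)
    fixed = np.empty(reps)
    for r in range(reps):
        X = rng.standard_normal((n, p))
        y = rng.standard_normal(n)  # outcome independent of all predictors
        _, c = max_abs_correlation(X, y)
        selected[r] = abs(c)
        # Predictor 0 is chosen before seeing the data, so no selection occurs.
        fixed[r] = abs(np.corrcoef(X[:, 0], y)[0, 1])
    return float(selected.mean()), float(fixed.mean())


if __name__ == "__main__":
    sel, fix = simulate_null()
    print(f"mean |corr|, selected predictor:      {sel:.3f}")
    print(f"mean |corr|, pre-specified predictor: {fix:.3f}")
```

In this toy setting the selected predictor looks noticeably more associated with the outcome than the pre-specified one, even though neither has any true association. That gap is what a naive analysis of the selected predictor would overlook, and it is why inference has to account for the selection step.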
Taxonomy
Topics: Statistical Methods and Inference
