A Fast Divide-and-Conquer Sparse Cox Regression
Yan Wang, Nathan Palmer, Qian Di, Joel Schwartz, Isaac Kohane, Tianxi Cai

TL;DR
This paper introduces a fast divide-and-conquer algorithm for sparse Cox regression that efficiently handles massive datasets, maintaining statistical accuracy while significantly reducing computation time.
Contribution
The paper presents a novel divide-and-conquer (DAC) algorithm that combines a one-step linear approximation with a least squares approximation to the partial likelihood, so that sparse Cox models can be fitted efficiently to large-scale survival data.
Findings
Outperforms existing methods in computational speed
Achieves statistical efficiency similar to that of full-data methods
Successfully applied to large-scale survival datasets
Abstract
We propose a computationally and statistically efficient divide-and-conquer (DAC) algorithm to fit sparse Cox regression to massive datasets where the sample size is exceedingly large and the covariate dimension is not small but much smaller than the sample size. The proposed algorithm achieves computational efficiency through a one-step linear approximation followed by a least square approximation to the partial likelihood (PL). These sequences of linearization enable us to maximize the PL with only a small subset and perform penalized estimation via a fast approximation to the PL. The algorithm is applicable for the analysis of both time-independent and time-dependent survival data. Simulations suggest that the proposed DAC algorithm substantially outperforms the full sample-based estimators and the existing DAC algorithm with respect to the computational speed, while it achieves similar…
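The two stages described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names (`cox_score_info`, `newton_cox`, `dac_one_step`, `lsa_adaptive_lasso`), the adaptive-lasso penalty, the coordinate-descent solver, the tuning value `lam=0.01`, and the simulated data are all assumptions chosen for illustration. It handles time-independent covariates only and assumes no tied event times.

```python
import numpy as np

def cox_score_info(beta, X, time, event):
    """Score and observed information of the Cox log partial likelihood (no ties)."""
    order = np.argsort(-time)               # descending time: prefix sums = risk sets
    X, d = X[order], event[order].astype(float)
    w = np.exp(X @ beta)
    S0 = np.cumsum(w)
    S1 = np.cumsum(w[:, None] * X, axis=0)
    # n x p x p array; acceptable only because each block is small
    S2 = np.cumsum(w[:, None, None] * X[:, :, None] * X[:, None, :], axis=0)
    xbar = S1 / S0[:, None]
    score = (d[:, None] * (X - xbar)).sum(axis=0)
    var = S2 / S0[:, None, None] - xbar[:, :, None] * xbar[:, None, :]
    info = (d[:, None, None] * var).sum(axis=0)
    return score, info

def newton_cox(X, time, event, beta0, n_iter=10):
    """Plain Newton-Raphson on one subset (no step halving, for brevity)."""
    beta = beta0.copy()
    for _ in range(n_iter):
        score, info = cox_score_info(beta, X, time, event)
        beta = beta + np.linalg.solve(info, score)
    return beta

def dac_one_step(blocks, beta_init, n_updates=2):
    """One-step updates using score and information summed over all blocks."""
    n_total = sum(len(blk[1]) for blk in blocks)
    beta = beta_init.copy()
    for _ in range(n_updates):
        parts = [cox_score_info(beta, *blk) for blk in blocks]
        score = sum(s for s, _ in parts)
        info = sum(a for _, a in parts)
        beta = beta + np.linalg.solve(info, score)
    # info was evaluated at the previous iterate, used here as an approximation
    return beta, info / n_total

def lsa_adaptive_lasso(beta_tilde, A, lam, gamma=1.0, n_sweeps=200):
    """Minimize 0.5*(b-bt)'A(b-bt) + lam*sum(|b_j|/|bt_j|^gamma) by coordinate descent."""
    weights = lam / np.abs(beta_tilde) ** gamma
    b = beta_tilde.copy()
    for _ in range(n_sweeps):
        for j in range(len(b)):
            r = b - beta_tilde
            z = A[j, j] * beta_tilde[j] - (A[j] @ r - A[j, j] * r[j])
            b[j] = np.sign(z) * max(abs(z) - weights[j], 0.0) / A[j, j]
    return b

# Simulated survival data (hypothetical; exponential event and censoring times).
rng = np.random.default_rng(0)
n, p, K = 4000, 6, 8
beta_true = np.array([0.8, -0.6, 0.4, 0.0, 0.0, 0.0])
X = rng.normal(size=(n, p))
T = rng.exponential(1.0 / np.exp(X @ beta_true))
C = rng.exponential(2.0, size=n)
time, event = np.minimum(T, C), T <= C

blocks = [(X[i::K], time[i::K], event[i::K]) for i in range(K)]
beta_init = newton_cox(*blocks[0], np.zeros(p))      # fit on one small subset
beta_tilde, A = dac_one_step(blocks, beta_init)      # aggregate across subsets
beta_sparse = lsa_adaptive_lasso(beta_tilde, A, lam=0.01)
print(np.round(beta_tilde, 3), np.round(beta_sparse, 3))
```

In this sketch the full iterative maximization runs only on the first block; every other block contributes a single score and information evaluation per update, and the penalized step works on a p-dimensional quadratic rather than on the raw data. In practice the tuning parameter would be selected by a criterion such as BIC rather than fixed.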
Taxonomy
Topics: Statistical Methods and Inference · Machine Learning and Algorithms · Bayesian Methods and Mixture Models
