Efficient and scalable clustering of survival curves
Nora M. Villanueva, Marta Sestelo, Luis Meira-Machado

TL;DR
This paper introduces a scalable method for clustering survival curves that combines k-means with the log-rank test. It substantially reduces computation time while maintaining accuracy, making it suitable for large-scale survival data analysis.
Contribution
The paper presents a novel, efficient clustering approach for survival curves that eliminates the need for bootstrap resampling, improving scalability and speed.
Findings
Achieves comparable accuracy to bootstrap methods
Dramatically reduces computational time
Effective for large-scale survival datasets
Abstract
Survival analysis encompasses a broad range of methods for analyzing time-to-event data, with one key objective being the comparison of survival curves across groups. Traditional approaches for identifying clusters of survival curves often rely on computationally intensive bootstrap techniques to approximate the distribution of the test statistic under the null hypothesis. While effective, these methods impose significant computational burdens. In this work, we propose a novel approach that leverages the k-means algorithm and the log-rank test to efficiently identify and cluster survival curves. Our method eliminates the need for computationally expensive resampling, significantly reducing processing time while maintaining statistical reliability. By systematically evaluating survival curves and determining the optimal number of clusters, the proposed method offers a practical and scalable alternative for large-scale survival data analysis.…
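The abstract does not give algorithmic details, so the following is only a minimal sketch of how the two ingredients it names could fit together, not the authors' implementation. It assumes that each group's Kaplan-Meier curve is evaluated on a shared time grid, that k-means runs on those curve vectors, and that a two-sample log-rank test is then used to compare groups. All function names, the deterministic farthest-point initialisation, the simulated exponential data, and the censoring time are illustrative assumptions.

```python
import math
import random

def kaplan_meier(times, events, grid):
    """Kaplan-Meier survival estimate evaluated at each time in `grid`."""
    steps, surv = [], 1.0
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - deaths / at_risk
        steps.append((t, surv))
    # Step function: value of the last step at or before each grid time.
    return [next((s for t, s in reversed(steps) if t <= g), 1.0) for g in grid]

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, iters=20):
    """Lloyd's k-means with farthest-point initialisation (an assumption)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old center if a cluster is empty
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def logrank_pvalue(times1, events1, times2, events2):
    """Two-sample log-rank test; p-value from the chi-square (1 df) limit."""
    pooled = ([(t, e, 0) for t, e in zip(times1, events1)]
              + [(t, e, 1) for t, e in zip(times2, events2)])
    diff, var = 0.0, 0.0
    for t in sorted({t for t, e, _ in pooled if e}):
        n1 = sum(1 for u, _, g in pooled if g == 0 and u >= t)
        n2 = sum(1 for u, _, g in pooled if g == 1 and u >= t)
        d1 = sum(1 for u, e, g in pooled if g == 0 and u == t and e)
        d = sum(1 for u, e, _ in pooled if u == t and e)
        n = n1 + n2
        diff += d1 - d * n1 / n              # observed minus expected
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    return math.erfc(math.sqrt(diff ** 2 / var / 2))

def simulate(rate, n, rng, censor_at=3.0):
    """Exponential event times with administrative censoring (toy data)."""
    raw = [rng.expovariate(rate) for _ in range(n)]
    return [min(t, censor_at) for t in raw], [t <= censor_at for t in raw]

rng = random.Random(1)
# Four hypothetical groups: two with a low hazard, two with a high hazard.
groups = [simulate(rate, 200, rng) for rate in (0.2, 0.2, 1.0, 1.0)]
grid = [0.25 * i for i in range(1, 13)]
curves = [kaplan_meier(t, e, grid) for t, e in groups]
labels = kmeans(curves, k=2)
p_same = logrank_pvalue(*groups[0], *groups[1])  # same hazard
p_diff = logrank_pvalue(*groups[0], *groups[2])  # different hazards
print(labels, p_same, p_diff)
```

In this toy setting the two low-hazard groups should share one label and the two high-hazard groups the other, with a very small log-rank p-value for groups drawn from different hazards. Because the p-value comes from the asymptotic chi-square distribution, no resampling is involved, which is the kind of saving over bootstrap-based procedures that the abstract describes. A full procedure would also need a multi-sample log-rank test and a rule for choosing the number of clusters, both of which are left out here.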
Taxonomy
Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Statistical Methods and Inference
