Efficient and scalable clustering of survival curves

Nora M. Villanueva; Marta Sestelo; Luis Meira-Machado

arXiv:2512.16481 · stat.ME · December 19, 2025

Open Access

TL;DR

This paper introduces a scalable method for clustering survival curves that combines k-means with log-rank tests. It substantially reduces computation time while maintaining accuracy, making it suitable for large-scale survival data analysis.

Contribution

The paper presents a novel, efficient clustering approach for survival curves that eliminates the need for bootstrap resampling, improving scalability and speed.

Findings

01 Achieves comparable accuracy to bootstrap methods

02 Dramatically reduces computational time

03 Effective for large-scale survival datasets

Abstract

Survival analysis encompasses a broad range of methods for analyzing time-to-event data, with one key objective being the comparison of survival curves across groups. Traditional approaches for identifying clusters of survival curves often rely on computationally intensive bootstrap techniques to approximate the null hypothesis distribution. While effective, these methods impose significant computational burdens. In this work, we propose a novel approach that leverages k-means and the log-rank test to efficiently identify and cluster survival curves. Our method eliminates the need for computationally expensive resampling, significantly reducing processing time while maintaining statistical reliability. By systematically evaluating survival curves and determining optimal clusters, the proposed method offers a practical and scalable alternative for large-scale survival data analysis.…
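To make the two ingredients named in the abstract concrete, here is a minimal sketch in plain Python, not the authors' implementation. It assumes that each group's curve is a Kaplan-Meier estimate evaluated on a common time grid, that k-means is run on those curve vectors, and that a two-sample log-rank test (chi-square approximation, no resampling) is used to compare the resulting clusters. The function names, the farthest-point initialisation, the synthetic data, and the choice of k = 2 are all illustrative assumptions; the paper's actual procedure for choosing the number of clusters is not reproduced here.

```python
import math
import random

def kaplan_meier(times, events, grid):
    """Kaplan-Meier survival estimate evaluated at each time point in `grid`."""
    data = sorted(zip(times, events))
    at_risk, surv, steps, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, removed = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:   # handle ties at time t
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1.0 - deaths / at_risk
            steps.append((t, surv))
        at_risk -= removed
    out = []
    for g in grid:
        s = 1.0
        for t, v in steps:                         # last step at or before g
            if t > g:
                break
            s = v
        out.append(s)
    return out

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(curves, k, iters=50):
    """Lloyd's k-means on curve vectors; deterministic farthest-point init (an assumption)."""
    centers = [curves[0]]
    while len(centers) < k:
        centers.append(max(curves, key=lambda c: min(sq_dist(c, m) for m in centers)))
    labels = None
    for _ in range(iters):
        new = [min(range(k), key=lambda j: sq_dist(c, centers[j])) for c in curves]
        if new == labels:
            break
        labels = new
        for j in range(k):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def logrank_pvalue(t1, e1, t2, e2):
    """Two-sample log-rank test p-value using the chi-square(1) approximation."""
    event_times = sorted({t for t, e in zip(t1 + t2, e1 + e2) if e})
    o_minus_e, var = 0.0, 0.0
    for t in event_times:
        n1 = sum(1 for x in t1 if x >= t)
        n2 = sum(1 for x in t2 if x >= t)
        d1 = sum(1 for x, e in zip(t1, e1) if x == t and e)
        d2 = sum(1 for x, e in zip(t2, e2) if x == t and e)
        n, d = n1 + n2, d1 + d2
        o_minus_e += d1 - d * n1 / n
        if n > 1:
            var += d * (n1 / n) * (n2 / n) * (n - d) / (n - 1)
    if var == 0:
        return 1.0
    return math.erfc(math.sqrt(o_minus_e ** 2 / var / 2))  # upper tail of chi-square(1)

# Synthetic example (hypothetical data): four groups, two hazard rates,
# administrative censoring at time 20.
rng = random.Random(0)
groups = []
for rate in (0.1, 0.1, 1.0, 1.0):
    raw = [rng.expovariate(rate) for _ in range(100)]
    groups.append(([min(t, 20.0) for t in raw], [t <= 20.0 for t in raw]))

grid = [0.5 * i for i in range(1, 41)]
curves = [kaplan_meier(t, e, grid) for t, e in groups]
labels = kmeans(curves, k=2)

# Pool the groups in each cluster and compare the two clusters with a log-rank test.
pooled = {0: ([], []), 1: ([], [])}
for (t, e), lab in zip(groups, labels):
    pooled[lab][0].extend(t)
    pooled[lab][1].extend(e)
p_between = logrank_pvalue(*pooled[0], *pooled[1])
print(labels, p_between)
```

Because the p-value comes from a closed-form reference distribution rather than from bootstrap replicates, each candidate clustering is assessed with a single pass over the event times, which is the source of the speed-up the abstract describes.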

Taxonomy

Topics: Advanced Clustering Algorithms Research · Bayesian Methods and Mixture Models · Statistical Methods and Inference