Scaling Survival Analysis in Healthcare with Federated Survival Forests: A Comparative Study on Heart Failure and Breast Cancer Genomics
Alberto Archetti, Francesca Ieva, Matteo Matteucci

TL;DR
This paper introduces FedSurF++, a federated survival forest algorithm that enhances privacy, efficiency, and scalability in distributed healthcare survival analysis, demonstrated on real-world datasets for heart failure and breast cancer.
Contribution
We develop FedSurF++, an improved federated survival forest method that requires only one communication round and performs well on heterogeneous healthcare data.
Findings
FedSurF++ achieves accuracy comparable to that of neural network models.
The algorithm requires only a single communication round.
Results on real-world datasets validate its effectiveness in healthcare applications.
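To make the single-round idea concrete, here is a minimal sketch of a one-shot federated ensemble for right-censored data. It is not the authors' FedSurF++ implementation. Bootstrapped Kaplan-Meier estimators stand in for survival trees, and all function names, the sampling rule (models drawn in proportion to client sample size), and the toy data are assumptions made only for illustration.

```python
import random


def kaplan_meier(times, events):
    """Fit a Kaplan-Meier step function S(t) on right-censored data.

    times: observed follow-up times; events: 1 = event observed, 0 = censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []  # (time, survival probability from that time onward)
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            surv *= 1.0 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= removed

    def S(t):
        value = 1.0
        for step_time, step_surv in steps:
            if step_time > t:
                break
            value = step_surv
        return value

    return S


def train_local_models(times, events, n_models, rng):
    """Client side: fit n_models estimators on bootstrap resamples of local data."""
    n = len(times)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(kaplan_meier([times[j] for j in idx],
                                   [events[j] for j in idx]))
    return models


def build_federated_ensemble(payloads, ensemble_size, rng):
    """Server side: one aggregation step over (models, n_samples) payloads.

    Assumption: a client is picked with probability proportional to its
    sample count, then one of its models is picked uniformly at random.
    """
    weights = [n for _, n in payloads]
    ensemble = []
    for _ in range(ensemble_size):
        models, _ = rng.choices(payloads, weights=weights, k=1)[0]
        ensemble.append(rng.choice(models))
    return ensemble


def predict_survival(ensemble, t):
    """Average the survival probability at time t over all ensemble members."""
    return sum(model(t) for model in ensemble) / len(ensemble)


rng = random.Random(0)
# Hypothetical toy data for two clients: (times, events).
client_data = [
    ([2, 3, 5, 8, 9, 12], [1, 0, 1, 1, 0, 1]),
    ([1, 4, 4, 6, 10, 11, 15, 20], [1, 1, 0, 1, 1, 0, 1, 0]),
]
# Single communication round: each client uploads its local models once.
payloads = [(train_local_models(t, e, 5, rng), len(t)) for t, e in client_data]
ensemble = build_federated_ensemble(payloads, ensemble_size=10, rng=rng)
curve = [predict_survival(ensemble, t) for t in range(0, 21, 5)]
```

Only fitted models and sample counts leave each client in this sketch, so raw patient records stay local, and no further rounds are needed after the upload. A real survival forest would additionally condition on patient covariates, which this covariate-free stand-in omits.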
Abstract
Survival analysis is a fundamental tool in medicine, modeling the time until an event of interest occurs in a population. However, in real-world applications, survival data are often incomplete, censored, distributed, and confidential, especially in healthcare settings where privacy is critical. The scarcity of data can severely limit the scalability of survival models to distributed applications that rely on large data pools. Federated learning is a promising technique that enables machine learning models to be trained on multiple datasets without compromising user privacy, making it particularly well-suited for addressing the challenges of survival data and large-scale survival applications. Despite significant developments in federated learning for classification and regression, many directions remain unexplored in the context of survival analysis. In this work, we propose an…
