Heterogeneous Datasets for Federated Survival Analysis Simulation
Alberto Archetti, Eugenio Lomurno, Francesco Lattari, André Martin, Matteo Matteucci

TL;DR
This paper introduces a novel method to create realistic, heterogeneous datasets for federated survival analysis by splitting existing datasets using Dirichlet-based algorithms, facilitating benchmarking and development.
Contribution
It proposes two Dirichlet-based algorithms for generating heterogeneous federated survival datasets with adjustable heterogeneity levels, addressing the lack of standard benchmarks.
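To make the idea of Dirichlet-based splitting concrete, here is a minimal Python sketch of two generic ways a centralized survival dataset could be partitioned across clients. It is an illustration under stated assumptions, not the authors' algorithms: the function names, the use of survival-time quantile bins, and the parameters `alpha` and `n_bins` are all hypothetical choices made for this example. In both functions a smaller `alpha` gives more uneven (more heterogeneous) splits, and a larger `alpha` gives splits closer to uniform.

```python
import numpy as np


def dirichlet_quantity_split(n_samples, n_clients, alpha, rng):
    """Hypothetical sketch: skew the NUMBER of samples held by each client.

    Client proportions are drawn once from a symmetric Dirichlet(alpha),
    then every sample is assigned to a client according to them.
    """
    proportions = rng.dirichlet(alpha * np.ones(n_clients))
    clients = rng.choice(n_clients, size=n_samples, p=proportions)
    return [np.flatnonzero(clients == k) for k in range(n_clients)]


def dirichlet_time_bin_split(times, n_clients, alpha, n_bins, rng):
    """Hypothetical sketch: skew the survival-TIME distribution per client.

    Times are grouped into quantile bins (an assumption of this example);
    each bin draws its own Dirichlet(alpha) distribution over clients.
    """
    times = np.asarray(times)
    # Interior quantile edges, so np.digitize yields bin ids 0..n_bins-1.
    edges = np.quantile(times, np.linspace(0.0, 1.0, n_bins + 1)[1:-1])
    bins = np.digitize(times, edges)
    clients = np.empty(len(times), dtype=int)
    for b in range(n_bins):
        idx = np.flatnonzero(bins == b)
        if len(idx) == 0:
            continue  # nothing to assign for an empty bin
        p = rng.dirichlet(alpha * np.ones(n_clients))
        clients[idx] = rng.choice(n_clients, size=len(idx), p=p)
    return [np.flatnonzero(clients == k) for k in range(n_clients)]


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
times = rng.exponential(scale=10.0, size=500)   # follow-up times
events = rng.integers(0, 2, size=500)           # 1 = event, 0 = censored

quantity_parts = dirichlet_quantity_split(len(times), 5, alpha=0.5, rng=rng)
time_parts = dirichlet_time_bin_split(times, 5, alpha=0.5, n_bins=4, rng=rng)

# Each part is an index array, so it selects times and event flags together.
client0_times = times[time_parts[0]]
client0_events = events[time_parts[0]]
```

Both functions return index arrays rather than copies of the data, so the same partition can be applied to covariates, survival times, and censoring indicators without them drifting out of alignment. Passing a seeded `rng` keeps the split reproducible.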
Findings
The algorithms produce client splits at different, adjustable levels of heterogeneity.
Generated datasets enable reproducible federated survival analysis experiments.
Quantitative and qualitative evaluations validate the dataset splitting methods.
Abstract
Survival analysis studies time-modeling techniques for an event of interest occurring for a population. Survival analysis found widespread applications in healthcare, engineering, and social sciences. However, the data needed to train survival models are often distributed, incomplete, censored, and confidential. In this context, federated learning can be exploited to tremendously improve the quality of the models trained on distributed data while preserving user privacy. However, federated survival analysis is still in its early development, and there is no common benchmarking dataset to test federated survival models. This work provides a novel technique for constructing realistic heterogeneous datasets by starting from existing non-federated datasets in a reproducible way. Specifically, we propose two dataset-splitting algorithms based on the Dirichlet distribution to assign each data…
Taxonomy
Topics: Privacy-Preserving Technologies in Data · demographic modeling and climate adaptation · Insurance, Mortality, Demography, Risk Management
Methods: Test
