Distributed learning optimisation of Cox models can leak patient data: Risks and solutions
Carsten Brink (1,2), Christian Rønn Hansen (1,2), Matthew Field (3,4), Gareth Price (5), David Thwaites (6), Nis Sarup (1), Uffe Bernchou (1,2) and Lois Holloway (3,4,6,7) ((1) Laboratory of Radiation Physics, Department of Oncology, Odense University Hospital, Odense

TL;DR
This paper reveals that distributed optimization of Cox models can compromise patient privacy through data leakage and proposes a secure, validated method to prevent this while handling missing data.
Contribution
It identifies privacy risks in distributed Cox model training and introduces a novel secure optimization approach with validation methods.
Findings
Distributed Cox model optimization can leak patient data.
Proposed method prevents data leakage during model training.
Feasibility demonstrated with a MATLAB implementation.
Abstract
Medical data are often highly sensitive, and data are frequently missing. Because of this sensitivity, there is interest in modelling methods that keep the data at each local centre to preserve privacy, yet still allow a model to be trained on and learn from data across multiple centres. One such approach is distributed machine learning (federated learning, collaborative learning), in which a model is calculated iteratively from aggregated local model information supplied by each centre. However, even though no specific patient data leave the centre, there is a potential risk that the exchanged information is sufficient to reconstruct all or part of the patient data, which would undermine the privacy-protecting rationale of distributed learning. This paper demonstrates that the optimisation of a Cox survival model can lead to patient data leakage.…
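The kind of risk the abstract describes can be illustrated with a minimal sketch. It assumes, hypothetically, that a centre shares the Cox risk-set sums S0(t) = Σ exp(β·x_j) and S1(t) = Σ x_j exp(β·x_j) at each of its observed times; this is not necessarily the exchange scheme or the leakage route analysed in the paper. The function names, the synthetic data and the choice of time grid are all assumptions made for illustration.

```python
import numpy as np

def risk_set_sums(times, X, beta, grid):
    """Hypothetical per-time aggregates: sums over patients still at risk at t."""
    w = np.exp(X @ beta)                      # relative hazard of each patient
    s0 = np.array([w[times >= t].sum() for t in grid])
    s1 = np.array([w[times >= t] @ X[times >= t] for t in grid])
    return s0, s1

def reconstruct(s0, s1):
    """Difference consecutive risk-set sums.

    If exactly one patient leaves the risk set between two grid times,
    the differences are w_i and w_i * x_i, so their ratio is x_i.
    """
    d0 = s0[:-1] - s0[1:]
    d1 = s1[:-1] - s1[1:]
    return d1 / d0[:, None]

# Synthetic example: 6 patients, 2 covariates, all times distinct (assumption).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 2))
times = np.arange(1.0, 7.0)
beta = np.array([0.3, -0.2])

s0, s1 = risk_set_sums(times, X, beta, grid=times)
recovered = reconstruct(s0, s1)

# Covariates of every patient except the last are recovered from aggregates alone.
print(np.allclose(recovered, X[:-1]))
```

In this toy setting the aggregates look harmless individually, but differencing them isolates single patients whenever the risk set shrinks by one. Coarser time grids or larger tied groups would weaken this particular reconstruction.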
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Radiomics and Machine Learning in Medical Imaging
