Generation of Synthetic Electronic Health Records Using a Federated GAN
John Weldon, Tomas Ward, Eoin Brophy

TL;DR
This paper presents a federated GAN approach to generating synthetic electronic health records. Hospitals train jointly without exposing sensitive patient data to one another, and the resulting synthetic data is comparable in quality to that of a centrally trained model.
Contribution
It introduces a federated learning framework for GANs to generate high-quality synthetic EHR data from multiple data silos without data sharing.
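To make the "combine local GANs into one central GAN" step concrete, here is a minimal sketch of one aggregation round. It assumes FedAvg-style weighted averaging of parameters, which this summary does not confirm as the authors' exact method. The function name `federated_average`, the flat `Weights` layout, and the silo sizes are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical layout: each model is a mapping from parameter name to a flat
# list of floats. A real GAN would hold tensors; lists keep this sketch
# dependency-free.
Weights = Dict[str, List[float]]


def federated_average(local_weights: List[Weights], silo_sizes: List[int]) -> Weights:
    """Average per-silo parameters, weighting each silo by its record count.

    Assumption: FedAvg-style aggregation. Only parameters leave a silo;
    patient records never do.
    """
    if not local_weights or len(local_weights) != len(silo_sizes):
        raise ValueError("need one non-empty weight set per silo size")
    total = sum(silo_sizes)
    averaged: Weights = {}
    for name, reference in local_weights[0].items():
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(local_weights, silo_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


# Two hypothetical hospitals with 1 and 3 records respectively.
hospital_a = {"generator": [1.0, 3.0]}
hospital_b = {"generator": [3.0, 5.0]}
global_weights = federated_average([hospital_a, hospital_b], [1, 3])
print(global_weights)
```

In a full training loop the averaged weights would be sent back to every silo as the starting point for the next local round, for both the generator and the discriminator.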
Findings
Synthetic data quality is maintained across federated and centralized training.
Statistical evaluation shows minimal RMSE difference between methods.
Medical professionals' review confirms comparable data realism.
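The summary does not say which quantities the RMSE compares. One common choice for binary tabular EHR data is a dimension-wise comparison: compute the prevalence of each diagnosis column in the real and synthetic sets, then take the RMSE between the two prevalence vectors. The sketch below illustrates that assumed metric. The function names and the toy rows are hypothetical, not taken from the paper.

```python
import math
from typing import List, Sequence


def column_means(rows: Sequence[Sequence[int]]) -> List[float]:
    """Per-column mean of a binary table, i.e. the prevalence of each code."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def dimension_wise_rmse(real: Sequence[Sequence[int]],
                        synthetic: Sequence[Sequence[int]]) -> float:
    """RMSE between real and synthetic per-column prevalences.

    Assumption: this is one plausible reading of the statistical evaluation,
    not necessarily the authors' exact metric.
    """
    real_p = column_means(real)
    synth_p = column_means(synthetic)
    if len(real_p) != len(synth_p):
        raise ValueError("real and synthetic tables need the same columns")
    squared = [(r - s) ** 2 for r, s in zip(real_p, synth_p)]
    return math.sqrt(sum(squared) / len(squared))


# Toy, made-up rows: two patients, two diagnosis flags each.
real_rows = [[1, 0], [1, 1]]
synthetic_rows = [[1, 0], [0, 0]]
print(dimension_wise_rmse(real_rows, synthetic_rows))
```

Under this reading, a lower value means the synthetic data reproduces the per-diagnosis frequencies of the real data more closely, so the federated and centralized models can be compared by the score each achieves against the same real data.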
Abstract
Sensitive medical data is often subject to strict usage constraints. In this paper, we trained a generative adversarial network (GAN) on real-world electronic health records (EHR). It was then used to create a data-set of "fake" patients through synthetic data generation (SDG) to circumvent usage constraints. This real-world data was tabular, binary, intensive care unit (ICU) patient diagnosis data. The entire data-set was split into separate data silos to mimic real-world scenarios where multiple ICU units across different hospitals may have similarly structured data-sets within their own organisations but do not have access to each other's data-sets. We implemented federated learning (FL) to train separate GANs locally at each organisation, using their unique data silo and then combining the GANs into a single central GAN, without any siloed data ever being exposed. This global,…
Taxonomy
Topics: Machine Learning in Healthcare · Generative Adversarial Networks and Image Synthesis · AI in cancer detection
