Privacy-Preserving Generative Modeling and Clinical Validation of Longitudinal Health Records for Chronic Disease
Benjamin D. Ballyk, Ankit Gupta, Sujay Konda, Kavitha Subramanian, Chris Landon, Ahmed Ammar Naseer, Georg Maierhofer, Sumanth Swaminathan, Vasudevan Venkateshwaran

TL;DR
This paper develops a privacy-preserving generative model for longitudinal health records, enabling realistic synthetic data creation for chronic disease research while ensuring data privacy and maintaining clinical utility.
Contribution
It introduces DP-TimeGAN, a novel differentially private time-series generative model that improves privacy-utility trade-offs for clinical longitudinal data.
Findings
Non-private Augmented TimeGAN outperforms other models on statistical metrics.
DP-TimeGAN maintains high data authenticity under privacy constraints.
Synthetic data achieves clinician-rated realism comparable to real data.
Abstract
Data privacy is a critical challenge in modern medical workflows as the adoption of electronic patient records has grown rapidly. Stringent data protection regulations limit access to clinical records for training and integrating machine learning models that have shown promise in improving diagnostic accuracy and personalized care outcomes. Synthetic data offers a promising alternative; however, current generative models either struggle with time-series data or lack formal privacy guarantees. In this paper, we enhance a state-of-the-art time-series generative model to better handle longitudinal clinical data while incorporating quantifiable privacy safeguards. Using real data from chronic kidney disease and ICU patients, we evaluate our method through statistical tests, a Train-on-Synthetic-Test-on-Real (TSTR) setup, and expert clinical review. Our non-private model (Augmented TimeGAN)…
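The Train-on-Synthetic-Test-on-Real (TSTR) setup named in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation pipeline: the nearest-centroid classifier, the function names, and the toy data are all hypothetical, and each patient's longitudinal record is assumed to have already been summarized as a fixed-length feature row. The only idea taken from the source is the protocol itself: fit a model on synthetic records only, then score it on real records only.

```python
from statistics import fmean


def fit_centroids(X, y):
    """Return {label: centroid} from feature rows X and labels y."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    # Column-wise mean of the rows belonging to each label.
    return {
        label: [fmean(col) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict(centroids, row):
    """Return the label whose centroid is closest to row (squared Euclidean)."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def tstr_accuracy(synth_X, synth_y, real_X, real_y):
    """Train on synthetic rows only, then report accuracy on real rows only."""
    centroids = fit_centroids(synth_X, synth_y)
    hits = sum(
        predict(centroids, row) == label
        for row, label in zip(real_X, real_y)
    )
    return hits / len(real_y)


# Hypothetical toy data: two features per patient, binary outcome label.
synth_X = [[0.1, 0.2], [0.0, 0.1], [1.0, 0.9], [0.9, 1.1]]
synth_y = [0, 0, 1, 1]
real_X = [[0.2, 0.0], [1.1, 1.0], [0.8, 0.9]]
real_y = [0, 1, 1]

score = tstr_accuracy(synth_X, synth_y, real_X, real_y)
```

A TSTR score close to the score of the same model trained on real data suggests the synthetic records preserve the signal needed for the downstream task. In practice the classifier would be whatever model the downstream task calls for.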
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Machine Learning in Healthcare · Artificial Intelligence in Healthcare
