Protect and Extend -- Using GANs for Synthetic Data Generation of Time-Series Medical Records

Navid Ashrafi; Vera Schmitt; Robert P. Spang; Sebastian Möller; Jan-Niklas Voigt-Antons

arXiv:2402.14042 · cs.LG · March 4, 2024 · 1 citation

TL;DR

This paper evaluates GAN-based models for generating synthetic time-series medical records, focusing on privacy preservation and data quality, to enable safe data sharing in sensitive healthcare contexts.

Contribution

It compares state-of-the-art GAN models for privacy-preserving synthetic medical data generation and introduces a model that balances privacy with data quality.

Findings

01. PPGAN outperforms other models in privacy preservation

02. Generated data maintains acceptable quality for predictive tasks

03. Membership inference attacks reveal reduced data leakage risks

Abstract

Preservation of private user data is of paramount importance for high Quality of Experience (QoE) and acceptability, particularly for services that handle sensitive data, such as IT-based health services. Whereas anonymization techniques have been shown to be prone to data re-identification, synthetic data generation has gradually replaced anonymization because it is less time- and resource-consuming and more robust to data leakage. Generative Adversarial Networks (GANs) have been used to generate synthetic datasets, especially GAN frameworks that adhere to differential privacy. This research compares state-of-the-art GAN-based models for generating synthetic time-series medical records of dementia patients, which can be distributed without privacy concerns. Predictive modeling, autocorrelation, and distribution analysis are used to assess the…
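As an illustration of the autocorrelation check named in the abstract, here is a minimal sketch of how a real and a synthetic time series might be compared. It is not the authors' evaluation code: the function names `autocorrelation` and `acf_gap`, the lag range, and the toy sine-wave data are all assumptions made for this example.

```python
import numpy as np


def autocorrelation(series, max_lag):
    """Sample autocorrelation of a 1-D series for lags 0..max_lag."""
    x = np.asarray(series, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    if denom == 0.0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Lag-k term: inner product of the series with itself shifted by k steps.
    return np.array(
        [np.dot(x[: len(x) - k], x[k:]) / denom for k in range(max_lag + 1)]
    )


def acf_gap(real, synthetic, max_lag=10):
    """Mean absolute difference between two autocorrelation curves (hypothetical metric)."""
    diff = autocorrelation(real, max_lag) - autocorrelation(synthetic, max_lag)
    return float(np.mean(np.abs(diff)))


# Toy data (assumed, not from the paper): a noisy periodic signal, a
# synthetic series with the same temporal structure, and pure noise.
rng = np.random.default_rng(0)
t = np.arange(200)
real = np.sin(0.2 * t) + 0.1 * rng.standard_normal(200)
good_synth = np.sin(0.2 * t + 0.3) + 0.1 * rng.standard_normal(200)
noise_synth = rng.standard_normal(200)

print("gap, structured synthetic:", acf_gap(real, good_synth))
print("gap, white-noise synthetic:", acf_gap(real, noise_synth))
```

A smaller gap suggests the synthetic series reproduces the temporal dependence of the real one; in this toy setup the structured series should score well below the white-noise one.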

Taxonomy

Topics: Machine Learning in Healthcare