Synthesizing Multimodal Electronic Health Records via Predictive Diffusion Models
Yuan Zhong, Xiaochen Wang, Jiaqi Wang, Xiaokun Zhang, Yaqing Wang, Mengdi Huai, Cao Xiao, Fenglong Ma

TL;DR
This paper introduces EHRPD, a diffusion-based model for synthesizing electronic health records that better captures temporal dependencies and time information, improving data quality and diversity over existing methods.
Contribution
The paper presents a novel diffusion model with time-aware embeddings and a predictive U-Net to enhance EHR data generation quality and diversity.
Findings
EHRPD outperforms existing methods in fidelity and utility.
The model effectively captures temporal dependencies in EHR data.
EHRPD maintains privacy while providing high-quality synthetic records.
Abstract
Synthesizing electronic health records (EHR) data has become a preferred strategy to address data scarcity, improve data quality, and model fairness in healthcare. However, existing approaches for EHR data generation predominantly rely on state-of-the-art generative techniques like generative adversarial networks, variational autoencoders, and language models. These methods typically replicate input visits, resulting in inadequate modeling of temporal dependencies between visits and overlooking the generation of time information, a crucial element in EHR data. Moreover, their ability to learn visit representations is limited due to simple linear mapping functions, thus compromising generation quality. To address these limitations, we propose a novel EHR data generation model called EHRPD. It is a diffusion-based model designed to predict the next visit based on the current one while…
Taxonomy
Topics: Machine Learning in Healthcare
Methods: Concatenated Skip Connection · Convolution · Max Pooling · U-Net · Diffusion
