TCDiff: Triplex Cascaded Diffusion for High-fidelity Multimodal EHRs Generation with Incomplete Clinical Data
Yandong Yan, Chenxi Li, Yu Huang, Dexuan Xu, Jiaqi Zhu, Zhongyan Chai, Huamin Zhang

TL;DR
TCDiff is a novel cascaded diffusion framework designed to generate high-fidelity, multimodal EHR data, effectively handling data heterogeneity and incompleteness, with superior performance demonstrated on public and new TCM datasets.
Contribution
The paper introduces TCDiff, a triplex cascaded diffusion model that models complex dependencies in multimodal EHRs and handles data incompleteness, advancing EHR synthesis methods.
Findings
Outperforms state-of-the-art baselines by 10% in data fidelity.
Robustly handles various levels of data missingness.
Validated on public datasets and a new TCM-specific dataset.
Abstract
The scarcity of large-scale and high-quality electronic health records (EHRs) remains a major bottleneck in biomedical research, especially as large foundation models become increasingly data-hungry. Synthesizing substantial volumes of de-identified and high-fidelity data from existing datasets has emerged as a promising solution. However, existing methods suffer from a series of limitations: they struggle to model the intrinsic properties of heterogeneous multimodal EHR data (e.g., continuous, discrete, and textual modalities), capture the complex dependencies among them, and robustly handle pervasive data incompleteness. These challenges are particularly acute in Traditional Chinese Medicine (TCM). To this end, we propose TCDiff (Triplex Cascaded Diffusion Network), a novel EHR generation framework that cascades three diffusion networks to learn the features of real-world EHR data,…
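The abstract's point about handling incomplete records in a diffusion model can be illustrated with a small sketch. This is not TCDiff's implementation: it assumes a standard Gaussian (DDPM-style) forward process on continuous features and a loss that ignores missing entries. The names `forward_noise`, `masked_mse`, `alpha_bar_t`, and the toy `record` are all hypothetical and chosen for illustration only.

```python
import math
import random

def forward_noise(x0, alpha_bar_t, rng):
    """Sample x_t from q(x_t | x_0) for a Gaussian forward process (assumed, not from the paper)."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    scale_signal = math.sqrt(alpha_bar_t)
    scale_noise = math.sqrt(1.0 - alpha_bar_t)
    xt = [scale_signal * x + scale_noise * e for x, e in zip(x0, eps)]
    return xt, eps

def masked_mse(eps_pred, eps_true, observed):
    """Mean squared error over observed entries only; missing entries contribute nothing."""
    n_obs = sum(observed)
    if n_obs == 0:
        return 0.0
    total = sum((p - t) ** 2 for p, t, m in zip(eps_pred, eps_true, observed) if m)
    return total / n_obs

# Toy continuous record with one missing value (None marks missingness).
record = [0.5, None, 1.2, -0.3]
observed = [v is not None for v in record]
# Placeholder fill for missing slots; the mask keeps them out of the loss.
x0 = [v if v is not None else 0.0 for v in record]

rng = random.Random(0)
xt, eps = forward_noise(x0, alpha_bar_t=0.7, rng=rng)

# Stand-in for a denoiser's output; a real model would predict eps from xt.
eps_pred = [0.0] * len(eps)
loss = masked_mse(eps_pred, eps, observed)
```

Masking the loss lets training use partially observed records without treating the placeholder values as ground truth. How TCDiff itself treats missingness across its three cascaded networks is not specified in the truncated abstract above.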
Taxonomy
Topics: Machine Learning in Healthcare · Traditional Chinese Medicine Studies · Generative Adversarial Networks and Image Synthesis
