Collaborative Synthesis of Patient Records through Multi-Visit Health State Inference
Hongda Sun, Hongzhan Lin, Rui Yan

TL;DR
This paper introduces MSIC, a probabilistic model for synthesizing multi-visit electronic health records that respects medical event combinations, infers health states, and generates descriptive medical reports, improving data quality and privacy.
Contribution
MSIC is the first to model health states across multiple visits for realistic EHR synthesis and to generate detailed medical reports, enhancing utility and privacy.
Findings
MSIC outperforms existing methods in synthetic data quality.
Generated EHRs maintain low privacy risks.
Medical reports add valuable descriptive information.
Abstract
Electronic health records (EHRs) have become the foundation of machine learning applications in healthcare, but the utility of real patient records is often limited by privacy and security concerns. Synthetic EHR generation offers a way to compensate for this limitation. Most existing methods synthesize new records from real EHR data without distinguishing the different types of events it contains, so they cannot keep event combinations consistent with medical common sense. In this paper, we propose MSIC, a Multi-visit health Status Inference model for Collaborative EHR synthesis, to address these limitations. First, we formulate the synthetic EHR generation process as a probabilistic graphical model and tightly connect different types of events by modeling the latent health states. Then, we derive a health state inference method tailored for the multi-visit…
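
To make the idea of a latent health state linking event types across visits more concrete, here is a minimal toy sketch in Python. It is not the MSIC model or the authors' code. The state names, vocabularies, transition weights, and the function `sample_patient` are all hypothetical and chosen only for illustration. The sketch assumes a simple Markov chain over health states, with each visit's diagnosis and medication drawn conditionally on the current state, so that event combinations stay tied to one shared state.

```python
import random

# Hypothetical toy setup; none of these values come from the paper.
STATES = ["stable", "acute"]
# Assumed transition weights between latent health states across visits.
TRANSITION = {"stable": [0.8, 0.2], "acute": [0.4, 0.6]}
# Each event type is emitted conditionally on the latent state, so
# diagnoses and medications within a visit remain mutually consistent.
DIAGNOSES = {"stable": ["hypertension", "checkup"], "acute": ["pneumonia", "sepsis"]}
MEDICATIONS = {"stable": ["lisinopril", "none"], "acute": ["antibiotic", "iv_fluids"]}


def sample_patient(num_visits, rng):
    """Sample one synthetic patient as a list of visit dictionaries."""
    state = rng.choice(STATES)  # initial latent health state
    visits = []
    for _ in range(num_visits):
        visits.append({
            "state": state,
            "diagnosis": rng.choice(DIAGNOSES[state]),
            "medication": rng.choice(MEDICATIONS[state]),
        })
        # Move to the next visit's latent state.
        state = rng.choices(STATES, weights=TRANSITION[state])[0]
    return visits


patient = sample_patient(3, random.Random(0))
for visit in patient:
    print(visit)
```

Because both event types are sampled from the same latent state, a visit in this toy setup never pairs an "acute" diagnosis with a "stable" medication. This is the kind of constraint on event combinations that the abstract describes, shown here only in a heavily simplified form.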
Taxonomy
Topics: Machine Learning in Healthcare · Topic Modeling · Biomedical Text Mining and Ontologies
