Predictive Multimodal Modeling of Diagnoses and Treatments in EHR
Cindy Shih-Ting Huang, Clarence Boon Liang Ng, Marek Rei

TL;DR
This paper introduces a multimodal predictive model that combines clinical notes and tabular EHR data to forecast diagnoses and treatments early during patient stays, improving upon existing methods.
Contribution
The paper proposes a fusion approach that combines pre-trained encoders, feature pooling, and cross-modal attention with a weighted temporal loss. The goal is early prediction from EHRs, when only limited information is available at the start of a patient stay.
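To make the cross-modal attention idea concrete, here is a minimal NumPy sketch of one modality (clinical note embeddings) attending over another (tabular event embeddings). It is an illustration under stated assumptions, not the authors' architecture: the function name `cross_modal_attention`, the single-head scaled dot-product form, the random projection matrices, and all dimensions are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, context, w_q, w_k, w_v):
    """Single-head scaled dot-product attention across two modalities.

    queries: (n_q, d_q) embeddings from one modality (e.g. note chunks)
    context: (n_c, d_c) embeddings from the other (e.g. tabular events)
    w_q, w_k, w_v: projections into a shared d-dimensional space
    """
    q = queries @ w_q                         # (n_q, d)
    k = context @ w_k                         # (n_c, d)
    v = context @ w_v                         # (n_c, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n_q, n_c)
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

# Hypothetical sizes: 4 note chunks (16-dim), 6 tabular events (8-dim).
rng = np.random.default_rng(0)
note_emb = rng.normal(size=(4, 16))
event_emb = rng.normal(size=(6, 8))
d = 12
w_q = rng.normal(size=(16, d))
w_k = rng.normal(size=(8, d))
w_v = rng.normal(size=(8, d))

fused, attn = cross_modal_attention(note_emb, event_emb, w_q, w_k, w_v)
print(fused.shape, attn.shape)
```

Each note chunk ends up with a representation built from the tabular events it attends to most. In a trained model the projections would be learned rather than random.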
Findings
The model outperforms current state-of-the-art early prediction models.
Multimodal fusion of clinical notes and tabular events improves the learned representations.
Early forecasting of diagnoses and treatments becomes more accurate.
Abstract
While the ICD code assignment problem has been widely studied, most works have focused on post-discharge document classification. Models for early forecasting of this information could be used for identifying health risks, suggesting effective treatments, or optimizing resource allocation. To address the challenge of predictive modeling using the limited information at the beginning of a patient stay, we propose a multimodal system to fuse clinical notes and tabular events captured in electronic health records. The model integrates pre-trained encoders, feature pooling, and cross-modal attention to learn optimal representations across modalities and balance their presence at every temporal point. Moreover, we present a weighted temporal loss that adjusts its contribution at each point in time. Experiments show that these strategies enhance the early prediction model, outperforming the…
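One way to read "a weighted temporal loss that adjusts its contribution at each point in time" is as a weighted average of per-time-point classification losses. The sketch below assumes multi-label binary cross-entropy and a caller-supplied weight per time point. The function names, the weights, and the example values are hypothetical, and the paper's actual weighting scheme is not given in this summary.

```python
import math

def bce(probs, labels, eps=1e-7):
    # Mean binary cross-entropy over labels for a single time point.
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)

def weighted_temporal_loss(probs_per_step, labels, weights):
    # Weighted average of the per-time-point losses; the weights are an
    # assumption here, not the schedule used in the paper.
    if len(probs_per_step) != len(weights):
        raise ValueError("need exactly one weight per time point")
    weighted = sum(w * bce(p, labels) for w, p in zip(weights, probs_per_step))
    return weighted / sum(weights)

# Hypothetical stay with 3 time points and 3 candidate codes.
labels = [1, 0, 1]
probs_per_step = [
    [0.6, 0.4, 0.5],  # early in the stay: little evidence
    [0.7, 0.3, 0.7],
    [0.9, 0.1, 0.9],  # late in the stay: confident
]
early_heavy = weighted_temporal_loss(probs_per_step, labels, [3.0, 2.0, 1.0])
uniform = weighted_temporal_loss(probs_per_step, labels, [1.0, 1.0, 1.0])
print(early_heavy > uniform)
```

With these example values the early predictions are the least accurate, so weighting early time points more heavily raises the loss relative to uniform weighting and pushes training toward better early forecasts.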
