Multi-lingual Multi-institutional Electronic Health Record based Predictive Model
Kyunghoon Hur, Heeyoung Kwak, Jinsu Jang, Nakhwan Kim, Edward Choi

TL;DR
This study explores multilingual multi-institutional EHR prediction, demonstrating that translation-based language alignment outperforms multilingual encoders across diverse ICU datasets, enabling scalable, language-agnostic clinical prediction models.
Contribution
The paper introduces a scalable, text-based framework for multilingual EHR prediction that surpasses traditional methods requiring manual standardization, facilitating global multi-institutional research.
Findings
Translation-based lingual alignment outperforms multilingual encoders.
The model surpasses baselines with manual feature selection.
Text-based framework enables effective transfer learning with few-shot fine-tuning.
Abstract
Large-scale EHR prediction across institutions is hindered by substantial heterogeneity in schemas and code systems. Although Common Data Models (CDMs) can standardize records for multi-institutional learning, the manual harmonization and vocabulary mapping they require are costly and difficult to scale. Text-based harmonization provides an alternative by converting raw EHR data into a unified textual form, enabling pooled learning without explicit standardization. However, applying this paradigm to multinational datasets introduces an additional layer of heterogeneity, language, which must be addressed for truly scalable EHR learning. In this work, we investigate multilingual multi-institutional learning for EHR prediction, aiming to enable pooled training across multinational ICU datasets without manual standardization. We compare two practical strategies for handling language barriers: (i)…
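To make the idea of text-based harmonization concrete, here is a minimal sketch of how rows from differently structured EHR sources could be flattened into one textual form. It is an illustration only, not the authors' implementation: the function name `serialize_event`, the table names, and the column names are all hypothetical, and the "table column value" layout is an assumed convention.

```python
from typing import Any, Mapping


def serialize_event(table: str, row: Mapping[str, Any]) -> str:
    """Flatten one EHR row into 'table col value col value ...' text.

    Hypothetical helper: the output layout is an assumption made for
    illustration, not a format confirmed by the paper.
    """
    parts = [table]
    for column, value in row.items():
        # Skip missing values so they do not add noise to the text.
        if value is None or value == "":
            continue
        parts.append(f"{column} {value}")
    return " ".join(parts)


# Two made-up rows with different schemas describing a similar lab event.
event_a = serialize_event(
    "labevents", {"itemid": "Glucose", "valuenum": 95, "valueuom": "mg/dL"}
)
event_b = serialize_event(
    "lab", {"labname": "glucose", "labresult": 95.0, "unit": None}
)

print(event_a)
print(event_b)
```

Because both rows end up as plain text, a single text encoder could in principle consume them without a shared schema or vocabulary mapping. When the source text is in different languages, the strings would still need to be aligned, which is the problem the abstract raises.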
