Graph-Text Multi-Modal Pre-training for Medical Representation Learning

Sungjin Park; Seongsu Bae; Jiho Kim; Tackeun Kim; Edward Choi

arXiv:2203.09994 · cs.CL · March 21, 2022 · 1 citation

TL;DR

MedGTX is a multi-modal pre-trained model that learns joint representations of structured and unstructured EHR data using a graph encoder, a text encoder, and a cross-modal encoder, and reports improved performance on clinical tasks.

Contribution

This work introduces MedGTX, a pre-trained model with a graph encoder and cross-modal learning for EHR data, addressing the challenge of combining structured and unstructured information.

Findings

01 Consistent improvement on clinical benchmarks

02 Effective joint representation of EHR modalities

03 Pre-training benefits downstream tasks

Abstract

As the volume of Electronic Health Records (EHR) grows sharply, there has been emerging interest in learning representations of EHR for healthcare applications. Representation learning of EHR requires appropriate modeling of the two dominant modalities in EHR: structured data and unstructured text. In this paper, we present MedGTX, a pre-trained model for multi-modal representation learning of structured and textual EHR data. MedGTX uses a novel graph encoder to exploit the graphical nature of structured EHR data, a text encoder to handle unstructured text, and a cross-modal encoder to learn a joint representation space. We pre-train our model through four proxy tasks on MIMIC-III, an open-source EHR dataset, and evaluate it on two clinical benchmarks and three novel downstream tasks that tackle real-world problems in EHR data. The results consistently show the…
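The three-encoder layout named in the abstract (graph encoder, text encoder, cross-modal encoder) can be pictured with a toy sketch. This is not the MedGTX implementation: every function name, the mean-aggregation graph step, the embedding lookup, and the single unparameterised cross-attention step are assumptions chosen only to illustrate how two modality-specific encoders might feed a shared space. It uses only the Python standard library.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def graph_encode(node_feats, edges):
    """Hypothetical graph encoder: one round of mean aggregation over
    each node and its neighbours (edges are treated as undirected)."""
    neighbours = {i: [i] for i in range(len(node_feats))}
    for src, dst in edges:
        neighbours[src].append(dst)
        neighbours[dst].append(src)
    dim = len(node_feats[0])
    return [
        [sum(node_feats[j][d] for j in neighbours[i]) / len(neighbours[i])
         for d in range(dim)]
        for i in range(len(node_feats))
    ]


def text_encode(token_ids, embedding_table):
    """Hypothetical text encoder: a plain embedding lookup."""
    return [embedding_table[t] for t in token_ids]


def cross_attend(queries, keys_values):
    """Scaled dot-product attention with no learned projections:
    each query becomes a weighted average of the other modality."""
    dim = len(queries[0])
    fused = []
    for q in queries:
        weights = softmax([dot(q, k) / math.sqrt(dim) for k in keys_values])
        fused.append([sum(w * kv[d] for w, kv in zip(weights, keys_values))
                      for d in range(dim)])
    return fused


def joint_encode(node_feats, edges, token_ids, embedding_table):
    """Encode each modality separately, then let each attend to the other."""
    g = graph_encode(node_feats, edges)
    t = text_encode(token_ids, embedding_table)
    return cross_attend(g, t), cross_attend(t, g)


# Toy inputs (made up): 3 graph nodes, a 5-word vocabulary, 4 text tokens.
rng = random.Random(0)
DIM = 4
node_feats = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(3)]
edges = [(0, 1), (1, 2)]
embedding_table = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(5)]
token_ids = [0, 3, 1, 4]

graph_fused, text_fused = joint_encode(node_feats, edges, token_ids, embedding_table)
```

In this sketch, `graph_fused` has one vector per graph node and `text_fused` one per token, each expressed as a mixture of the other modality's encodings. A real model would add learned projections, multiple layers, and pre-training objectives, none of which are specified in the text above.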

Code & Models

Repositories

sjpark9503/kg_txt_multimodal (PyTorch, official)

Taxonomy

Topics: Machine Learning in Healthcare · Advanced Graph Neural Networks · Topic Modeling