Multimodal BEHRT: transformers for multimodal electronic health records to predict breast cancer prognosis
Ndèye Maguette Mbaye, Michael M. Danziger, Michal Rosen-Zvi, Aullène Toussaint, Elise Dumas, Julien Guerin, Anne-Sophie Hamy-Petit, Fabien Reyal, Chloé-Agathe Azencott

TL;DR
This paper introduces M-BEHRT, a deep learning model that uses electronic health records to predict breast cancer prognosis more accurately than the Nottingham Prognostic Index and random forests.
Contribution
M-BEHRT is a novel multimodal transformer model for EHR data that achieves better prognosis prediction with fewer records than typical deep learning studies.
Findings
M-BEHRT achieved an AUC-ROC of 0.77, outperforming the Nottingham Prognostic Index and random forests.
The model performs particularly well for older patients with at least one affected lymph node.
M-BEHRT demonstrates the potential of EHR data and transformer models for improving breast cancer prognosis.
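For context on the quantities compared in the findings, here is a minimal sketch that is not taken from the paper. It shows the Nottingham Prognostic Index in its standard form (0.2 × tumor size in cm + lymph node stage + histological grade) and AUC-ROC computed as the fraction of (positive, negative) pairs that a score ranks correctly. The function names and toy inputs are illustrative assumptions; the paper's evaluation code and data are not reproduced here.

```python
from itertools import product


def nottingham_prognostic_index(tumor_size_cm, node_stage, grade):
    """Standard NPI: 0.2 * size (cm) + lymph node stage (1-3) + grade (1-3)."""
    if node_stage not in (1, 2, 3) or grade not in (1, 2, 3):
        raise ValueError("node_stage and grade must each be 1, 2 or 3")
    return 0.2 * tumor_size_cm + node_stage + grade


def auc_roc(labels, scores):
    """AUC-ROC as the share of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p, n in product(pos, neg)
    )
    return wins / (len(pos) * len(neg))


# Toy, made-up patients: (tumor size in cm, node stage, grade) and outcome label.
patients = [(2.0, 1, 2), (3.5, 2, 3), (1.0, 1, 1), (5.0, 3, 3)]
labels = [0, 1, 0, 1]
npi_scores = [nottingham_prognostic_index(*p) for p in patients]
toy_auc = auc_roc(labels, npi_scores)
```

The pairwise definition is quadratic in the number of patients, which is fine for illustration; a library routine such as scikit-learn's `roc_auc_score` would be the usual choice on real cohorts.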
Abstract
Electronic Health Records (EHRs) contain a wealth of information about patients that could be useful toward improving treatment outcomes for breast cancer patients, but remain mostly unexploited. Recent methodological developments in deep learning, however, open the way to developing new methods to leverage this information to improve patient care. We propose M-BEHRT, a Multimodal BERT for EHR data based on BEHRT, itself an architecture based on the popular natural language architecture BERT (Bidirectional Encoder Representations from Transformers). M-BEHRT models multimodal patient trajectories as a sequence of medical visits, comprising a variety of information such as clinical features, results from biological lab tests, medical department and procedure, and the content of free-text medical reports. M-BEHRT uses a pretraining task analogous to a masked language model to learn a…
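The idea of representing a patient trajectory as a sequence of visits and pretraining by masking tokens can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the token names, the `[SEP]` visit separator, the 15% masking rate (BERT's usual default), and the simplification of always substituting `[MASK]` rather than using BERT's mixed replacement scheme.

```python
import random

MASK, SEP = "[MASK]", "[SEP]"


def flatten_trajectory(visits):
    """Flatten a list of visits (each a list of tokens) into one sequence.

    Every visit is closed with SEP, and a parallel list records which visit
    each token belongs to.
    """
    tokens, visit_ids = [], []
    for i, visit in enumerate(visits):
        for tok in list(visit) + [SEP]:
            tokens.append(tok)
            visit_ids.append(i)
    return tokens, visit_ids


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Replace a random subset of non-separator tokens with MASK.

    Returns (masked_tokens, targets); targets holds the original token at
    masked positions and None elsewhere, so a model can be trained to
    recover only the hidden tokens.
    """
    rng = random.Random(seed)
    masked, targets = [], []
    for tok in tokens:
        if tok != SEP and rng.random() < mask_prob:
            masked.append(MASK)
            targets.append(tok)
        else:
            masked.append(tok)
            targets.append(None)
    return masked, targets


# Hypothetical two-visit trajectory with made-up token names.
visits = [["age_50_60", "grade_2"], ["lab_marker_high"]]
tokens, visit_ids = flatten_trajectory(visits)
masked, targets = mask_tokens(tokens, mask_prob=0.15, seed=0)
```

Keeping the visit index alongside each token mirrors the general practice in BEHRT-style models of giving the network positional information about visits, though the exact embeddings used by M-BEHRT are not specified in the text available here.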
Taxonomy
Topics: Machine Learning in Healthcare · Biomedical Text Mining and Ontologies · Artificial Intelligence in Healthcare
