Multimodal Fusion of Histopathology Images and Electronic Health Records for Early Breast Cancer Diagnosis

Aditya Shribhagwan Khandelwal; Mohammad Samar Ansari; Asra Aslam

arXiv:2604.17122·cs.CV·April 21, 2026


TL;DR

This study develops a multimodal framework that combines histopathology images with electronic health records for early breast cancer diagnosis. The authors report that the fused model improves on unimodal models in both accuracy and interpretability.

Contribution

The paper introduces an intermediate-fusion approach that outperforms the unimodal image and tabular models, demonstrating the value of combining image and clinical data for breast cancer diagnosis.

Findings

01  ResNet-18 achieves near-perfect accuracy and AUC on image classification.

02  XGBoost attains 98% accuracy on EHR prediction.

03  Fusion model achieves a macro-AUC of 0.997, surpassing unimodal baselines.
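
For readers unfamiliar with the metric, the following is a minimal sketch of how a macro-averaged AUC can be computed, assuming a one-vs-rest scheme with an unweighted mean over classes. The averaging protocol actually used in the paper is not stated on this page, and the function names `binary_auc` and `macro_auc` and the toy labels and probabilities are hypothetical.

```python
import numpy as np

def binary_auc(y_true, scores):
    """AUC from the Mann-Whitney rank statistic; tied scores share an average rank."""
    y_true = np.asarray(y_true, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores))
    i = 0
    while i < len(scores):
        j = i
        # Extend j over the run of scores tied with position i.
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1  # average 1-based rank
        i = j + 1
    n_pos = y_true.sum()
    n_neg = len(y_true) - n_pos
    return (ranks[y_true].sum() - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

def macro_auc(y, probs):
    """Unweighted mean of one-vs-rest AUCs (assumed averaging scheme)."""
    y = np.asarray(y)
    probs = np.asarray(probs)
    per_class = [binary_auc(y == k, probs[:, k]) for k in range(probs.shape[1])]
    return float(np.mean(per_class))

# Toy data, not results from the paper.
y = [0, 1, 2, 0, 1, 2]
probs = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
]
print(binary_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(macro_auc(y, probs))                               # 1.0
```

Because each class contributes equally to the mean, a macro average does not let a dominant class mask poor ranking on a rare one.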

Abstract

Breast cancer is a leading cause of cancer-related mortality worldwide, and timely, accurate diagnosis is critical to improving survival outcomes. While convolutional neural networks (CNNs) have demonstrated strong performance on histopathology image classification, and machine learning models on structured electronic health records (EHR) have shown utility for clinical risk stratification, most existing work treats these modalities in isolation. This paper presents a systematic multimodal framework that integrates patch-level histopathology features from the BreCaHAD dataset with structured clinical data from MIMIC-IV. We train and evaluate unimodal image models (a simple CNN baseline and ResNet-18 with transfer learning), unimodal tabular models (XGBoost and a multilayer perceptron), and an intermediate-fusion model that concatenates latent representations from both modalities.…
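
The intermediate-fusion idea described in the abstract can be sketched as a forward pass that projects each modality into a latent vector, concatenates the two, and classifies the result. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the class `IntermediateFusion`, the latent width of 32, the 20 EHR features, the 3 output classes, and the random weights are all hypothetical. The only grounded number is 512, the width of ResNet-18's pooled feature vector. How image patches are paired with EHR records is not specified here, so the example simply assumes row-aligned inputs.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

class IntermediateFusion:
    """Hypothetical fusion head: project each modality, concatenate, classify."""

    def __init__(self, img_dim, ehr_dim, latent_dim, n_classes, seed=0):
        rng = np.random.default_rng(seed)
        # Untrained random weights; a real model would learn these.
        self.W_img = rng.normal(0.0, 0.1, (img_dim, latent_dim))
        self.W_ehr = rng.normal(0.0, 0.1, (ehr_dim, latent_dim))
        self.W_out = rng.normal(0.0, 0.1, (2 * latent_dim, n_classes))
        self.b_out = np.zeros(n_classes)

    def forward(self, img_feats, ehr_feats):
        z_img = relu(img_feats @ self.W_img)             # image latent
        z_ehr = relu(ehr_feats @ self.W_ehr)             # tabular latent
        fused = np.concatenate([z_img, z_ehr], axis=1)   # the fusion step
        return softmax(fused @ self.W_out + self.b_out)

rng = np.random.default_rng(1)
img_feats = rng.normal(size=(4, 512))  # stand-in for pooled CNN features
ehr_feats = rng.normal(size=(4, 20))   # stand-in for structured EHR features
model = IntermediateFusion(img_dim=512, ehr_dim=20, latent_dim=32, n_classes=3)
probs = model.forward(img_feats, ehr_feats)
print(probs.shape)  # (4, 3)
```

Fusing at the latent level, rather than averaging the final predictions of two separate models, lets the classifier weigh image and clinical evidence jointly.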
