Attention-Based Multimodal Survival Prediction with Cross-Modal Bilinear Fusion

Hassan Keshvarikhojasteh; Josien P.W. Pluim; Mitko Veta

arXiv:2605.13897·q-bio.QM·May 15, 2026

TL;DR

This paper introduces a multimodal deep learning framework combining histology, RNA-seq, and clinical data for patient survival prediction, emphasizing interpretability and efficiency.

Contribution

It presents a novel fusion architecture using cross-modal bilinear interactions for improved multimodal survival prediction.
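To make the idea of cross-modal bilinear interaction concrete, here is a minimal sketch of pairwise low-rank bilinear fusion over three modality embeddings. It is not the authors' implementation: the embedding sizes, the rank, the output size, the random initialisation, and the helper names (`init_pair`, `low_rank_bilinear`, `fuse`) are all assumptions chosen for illustration. It uses only the standard library so it runs anywhere; a real model would use a tensor library and learn the factors.

```python
import random

def matvec(M, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]

def random_matrix(rows, cols, rng):
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def init_pair(d_a, d_b, d_out, rank, rng):
    """Hypothetical helper: rank-`rank` factors for one modality pair."""
    U = [random_matrix(d_out, d_a, rng) for _ in range(rank)]
    V = [random_matrix(d_out, d_b, rng) for _ in range(rank)]
    return U, V

def low_rank_bilinear(a, b, U, V):
    """z = sum_r (U_r a) * (V_r b), elementwise product per rank term.

    Equivalent to a full bilinear map whose weight tensor has rank <= len(U),
    without ever materialising that tensor.
    """
    d_out = len(U[0])
    z = [0.0] * d_out
    for U_r, V_r in zip(U, V):
        pa, pb = matvec(U_r, a), matvec(V_r, b)
        for k in range(d_out):
            z[k] += pa[k] * pb[k]
    return z

def fuse(embeddings, params):
    """Concatenate independent pairwise interactions (assumed combination rule)."""
    fused = []
    for (m1, m2), (U, V) in params.items():
        fused.extend(low_rank_bilinear(embeddings[m1], embeddings[m2], U, V))
    return fused

rng = random.Random(0)
dims = {"histology": 32, "rna": 16, "clinical": 8}   # assumed embedding sizes
pairs = [("histology", "rna"), ("histology", "clinical"), ("rna", "clinical")]
d_out, rank = 4, 2                                    # assumed hyperparameters

params = {p: init_pair(dims[p[0]], dims[p[1]], d_out, rank, rng) for p in pairs}
embeddings = {m: [rng.gauss(0.0, 1.0) for _ in range(d)] for m, d in dims.items()}
fused = fuse(embeddings, params)

# Parameter counts: low-rank factors versus a full bilinear tensor per pair.
n_low_rank = sum(rank * d_out * (dims[a] + dims[b]) for a, b in pairs)
n_full = sum(d_out * dims[a] * dims[b] for a, b in pairs)
print(len(fused), n_low_rank, n_full)
```

Because each pair has its own block in the fused vector, the contribution of, say, the histology and RNA interaction can be inspected separately from the others, which is one plausible reading of the structural interpretability the paper claims.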

Findings

01. Outperforms concatenation-based baselines in predictive accuracy.

02. Demonstrates competitive generalization on unseen cohorts.

03. Provides a structurally interpretable and parameter-efficient fusion method.

Abstract

We propose a novel multimodal deep learning framework for patient-level survival prediction that integrates whole-slide histology features, RNA-seq expression profiles, and clinical variables. Our architecture combines an attention-based multiple instance learning (ABMIL) module~\cite{ilse2018attention} for slide-level representation with feedforward encoders for RNA and clinical data. These embeddings are then integrated through low-rank bilinear cross-modal fusion~\cite{liu2018efficient} to model conditional interactions across modalities while controlling parameter growth. The model outputs continuous risk scores that are subsequently mapped to survival times using a nonparametric calibration procedure based on the Kaplan-Meier estimator~\cite{kaplan1958nonparametric}. By decomposing multimodal reasoning into independent pairwise interactions, the proposed fusion design promotes structural interpretability and parameter…
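The slide-level step in the abstract can be illustrated with a small sketch of attention-based MIL pooling in the style of Ilse et al.: each patch feature receives a score, the scores are softmax-normalised, and the slide embedding is the weighted sum of patch features. The patch count, feature sizes, random weights, and the function name `attention_pool` are assumptions for illustration, and the paper may use a gated or otherwise different variant. Only the standard library is used here.

```python
import math
import random

def attention_pool(instances, V, w):
    """Attention pooling over a bag of instance feature vectors.

    score_k  = w . tanh(V h_k)
    weight_k = softmax(score)_k
    pooled   = sum_k weight_k * h_k
    """
    scores = []
    for h in instances:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wi * hi for wi, hi in zip(w, hidden)))

    # Numerically stable softmax over the bag.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    dim = len(instances[0])
    pooled = [sum(a * h[j] for a, h in zip(weights, instances)) for j in range(dim)]
    return pooled, weights

rng = random.Random(0)
n_patches, feat_dim, attn_dim = 50, 16, 8   # assumed sizes
patches = [[rng.gauss(0.0, 1.0) for _ in range(feat_dim)] for _ in range(n_patches)]
V = [[rng.gauss(0.0, 0.1) for _ in range(feat_dim)] for _ in range(attn_dim)]
w = [rng.gauss(0.0, 0.1) for _ in range(attn_dim)]

slide_embedding, attn = attention_pool(patches, V, w)
print(len(slide_embedding), round(sum(attn), 6))
```

The attention weights sum to one and do not depend on patch order, so they can be read as a per-patch relevance map over the slide.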

Code & Models

Repositories

hassancpu/ChimeraChallenge2025_Task_3 (GitHub)
