# Dual-stream cross-modal fusion alignment network for survival analysis

**Authors:** Jinmiao Song, Yatong Hao, Shuang Zhao, Peng Zhang, Qilin Feng, Qiguo Dai, Xiaodong Duan

PMC · DOI: 10.1093/bib/bbaf103 · 2025-03-21

## TL;DR

This paper introduces DSCASurv, a framework that predicts cancer patient survival by fusing histopathological images with genomic data through dual parallel mixers and a cross-modal attention module.

## Contribution

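DSCASurv targets two limitations of existing multimodal survival models: they favor global representations over fine-grained local features, and their fusion strategies either neglect intrinsic cross-modal correlations or discard modality-specific information.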

## Key findings

- DSCASurv outperforms existing methods on five benchmark cancer datasets.
- The framework effectively integrates local and global features across modalities.
- Cross-modal attention enhances complementary information transfer for survival analysis.

## Abstract

Survival prediction serves as a pivotal component in precision oncology, enabling the optimization of treatment strategies through mortality risk assessment. While the integration of histopathological images and genomic profiles offers enhanced potential for patient stratification, existing methodologies are constrained by two fundamental limitations: (i) insufficient attention to fine-grained local features in favor of global representations, and (ii) suboptimal cross-modal fusion strategies that either neglect intrinsic correlations or discard modality-specific information. To address these challenges, we propose DSCASurv, a novel cross-modal fusion alignment framework designed to explore and integrate intrinsic correlations across multimodal data, thereby improving the accuracy of survival prediction. Specifically, DSCASurv leverages the local feature extraction capabilities of convolutional layers and the long-range dependency modeling of scanning state space models to extract intra-modal representations, while generating cross-modal representations through dual parallel mixer architectures. A cross-modal attention module functions as a bridge for inter-modal information exchange and complementary information transfer. The framework ultimately integrates all intra-modal representations to generate survival predictions by enhancing and recalibrating complementary information. Extensive experiments on five benchmark cancer datasets demonstrate the superior performance of our approach compared to existing methods.
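
To make the idea of a cross-modal attention "bridge" more concrete, the following is a minimal sketch of bidirectional scaled dot-product cross-attention between two token sets. It is not the authors' module: the abstract does not specify its design, so the function names (`cross_attention`, `exchange`), the residual addition, the absence of learned projections, and the toy token values are all assumptions made for illustration.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(x - peak) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def cross_attention(queries, context):
    """Each query token attends over all context tokens.

    Hypothetical simplification: keys and values are the raw context
    tokens, with no learned projection matrices.
    """
    dim = len(queries[0])
    keys_t = [list(col) for col in zip(*context)]  # transpose: dim x n_context
    scores = matmul(queries, keys_t)               # n_queries x n_context
    weights = [softmax([s / math.sqrt(dim) for s in row]) for row in scores]
    attended = matmul(weights, context)            # n_queries x dim
    return attended, weights

def exchange(path_tokens, gene_tokens):
    """Bidirectional exchange: each modality queries the other.

    The attended context is added back to the original tokens (an assumed
    residual connection) so modality-specific information is kept.
    """
    path_ctx, path_w = cross_attention(path_tokens, gene_tokens)
    gene_ctx, gene_w = cross_attention(gene_tokens, path_tokens)

    def add(xs, ys):
        return [[x + y for x, y in zip(r, s)] for r, s in zip(xs, ys)]

    return add(path_tokens, path_ctx), add(gene_tokens, gene_ctx), path_w, gene_w

# Toy inputs (made up): 3 pathology patch tokens and 2 genomic tokens, dim 4.
path_tokens = [[1.0, 0.0, 0.5, 0.2], [0.1, 0.9, 0.3, 0.0], [0.4, 0.4, 0.4, 0.4]]
gene_tokens = [[0.2, 0.1, 0.0, 1.0], [0.7, 0.3, 0.6, 0.1]]

path_out, gene_out, path_w, gene_w = exchange(path_tokens, gene_tokens)
print(len(path_out), len(path_out[0]))  # pathology tokens keep their shape
print(len(gene_out), len(gene_out[0]))  # genomic tokens keep their shape
```

Each row of `path_w` and `gene_w` is a probability distribution over the other modality's tokens, which is what lets one stream pull in complementary information from the other while the residual path preserves its own features.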

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Diseases:** cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11926988/full.md

---
Source: https://tomesphere.com/paper/PMC11926988