# Leveraging single-cell foundation models for accurate survival outcome prediction

**Authors:** Wei Liu, Qiang Wang, Lin Long, Wei Wang

PMC · DOI: 10.1093/bioadv/vbag076 · 2026-03-16

## TL;DR

This paper evaluates whether embeddings from the single-cell foundation model scFoundation improve cancer survival prediction from bulk RNA-seq data across 25 TCGA cancer types.

## Contribution

The Embedding–Gene–Survival Prediction (EGSP) model integrates foundation model embeddings with gene expression and clinical variables to improve survival prediction accuracy.

## Key findings

- The EGSP model achieved a mean concordance index of 0.724 across 25 cancer types.
- Embeddings derived from pretrained scFoundation weights showed lower redundancy with gene expression than pan-cancer fine-tuned embeddings, while retaining complementary prognostic signals.
- Prognostic embeddings captured interpretable biological programs like tumor differentiation and immune activity.
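
The concordance index reported above measures how often a model ranks pairs of patients in the correct order of survival. As an illustration only, a minimal sketch of Harrell's C-index follows; the function name and the handling of ties are assumptions made here, not the paper's evaluation code. Pairs with tied survival times are skipped for simplicity, and tied risk scores count as half-concordant.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Harrell's C-index: share of comparable patient pairs ranked correctly.

    A pair is comparable when the patient with the shorter follow-up time
    had an observed event (events[i] is truthy). A higher risk score is
    expected to correspond to shorter survival.
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient `a` has the shorter time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] == times[b] or not events[a]:
            continue  # tied times, or the earlier patient was censored
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a mean of 0.724 indicates that roughly 72% of comparable pairs are ordered correctly.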

## Abstract

Foundation models trained on large-scale single-cell transcriptomes can capture rich molecular representations of cellular states, yet their potential for cancer survival prediction from bulk RNA-seq data remains largely unexplored.

We applied the single-cell foundation model scFoundation to derive patient-level embeddings across 25 cancer types from TCGA and systematically evaluated their prognostic value under both cancer-specific and pan-cancer settings. To leverage complementary information, we developed an Embedding–Gene–Survival Prediction (EGSP) model that integrates foundation model embeddings with gene expression and clinical variables. EGSP achieved a mean concordance index (C-index) of 0.724 across cancers and exceeded 0.8 in seven cancer types, consistently outperforming single-modality models and existing multi-omics survival approaches. Comparative analyses showed that embeddings derived from pretrained scFoundation weights exhibited lower redundancy with gene expression while retaining complementary prognostic signals relative to pan-cancer fine-tuned embeddings. Explainable AI analyses further revealed that prognostic embeddings capture interpretable biological programs related to tumor differentiation, immune activity, and tumor-intrinsic growth, enabling transparent survival prediction at both cohort and patient levels. Overall, single-cell foundation model embeddings provide biologically meaningful and partially non-redundant survival signals that substantially improve bulk RNA-seq–based prognostic modeling.
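
To make the idea of combining modalities concrete, the sketch below shows one simple way to fuse per-patient feature blocks before fitting a survival model. This summary does not describe the EGSP architecture, so everything here is hypothetical: the function names, the z-scoring step, the plain concatenation, and the linear risk score are assumptions for illustration, not the authors' method.

```python
import numpy as np


def zscore(block):
    """Standardize each column to zero mean and unit variance."""
    block = np.asarray(block, dtype=float)
    std = block.std(axis=0)
    std[std == 0] = 1.0  # constant columns become all zeros instead of NaN
    return (block - block.mean(axis=0)) / std


def fuse_features(embeddings, expression, clinical):
    """Concatenate 2-D feature blocks (rows = patients) after z-scoring.

    Hypothetical fusion step: each argument is an array of shape
    (n_patients, n_features_in_block).
    """
    blocks = [zscore(b) for b in (embeddings, expression, clinical)]
    if len({b.shape[0] for b in blocks}) != 1:
        raise ValueError("all blocks must have the same number of patients")
    return np.hstack(blocks)


def linear_risk(features, weights):
    """Per-patient risk score as a Cox-style linear predictor."""
    return features @ np.asarray(weights, dtype=float)
```

The resulting risk scores could then be evaluated with a concordance index against observed survival times. In practice the weights would be learned by a survival model rather than supplied by hand.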

Code is available at https://github.com/weiliu123/EGSP

## Full-text entities

- **Diseases:** cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC13032892/full.md

---
Source: https://tomesphere.com/paper/PMC13032892