# SciSt: single-cell reference-informed spatial gene expression prediction from pathological images

**Authors:** Yixin Li, Fan Zhong, Lei Liu

PMC · DOI: 10.1093/bib/bbaf613 · 2025-11-20

## TL;DR

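SciSt is a deep learning framework that predicts spatial gene expression from H&E-stained images by combining pathological image features with initial expression estimates derived from cell segmentation and single-cell reference data, improving both accuracy and interpretability.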

## Contribution

SciSt introduces a novel framework that integrates pathological features with biologically informed gene expressions for spatial transcriptomics prediction.

## Key findings

- SciSt achieved state-of-the-art performance on benchmark datasets, outperforming the second-best models by 21.4% and 13.7%.
- The model demonstrated robust generalization on TCGA-BRCA and TCGA-LIHC cohorts.
- SciSt enables cross-modal translation between morphology and gene expression.

## Abstract

The widespread application of spatial transcriptomics in uncovering disease mechanisms remains limited by the scarcity of samples and the high experimental costs, which have not declined substantially in recent years. Unlocking the vast resources of clinical H&E-stained images could provide an efficient and cost-effective alternative for large-scale spatial analysis. However, predicting spatial gene expression from histopathological images remains challenging, as existing end-to-end frameworks often fail to capture the intrinsic transcriptomic structures observed in real transcriptomics data. To address this, we developed SciSt, a deep learning framework that predicts spatial gene expression by integrating pathological features with biologically informed initial gene expressions. These initial expressions are generated through a weighted strategy combining cell segmentation and single-cell reference data, thereby enhancing biological interpretability. SciSt achieved state-of-the-art performance across three benchmark datasets, outperforming the second-best models by 21.4% and 13.7%, respectively, and demonstrated robust generalization on the TCGA-BRCA and TCGA-LIHC cohorts. Beyond accurate prediction, SciSt enables cross-modal translation between morphology and gene expression, offering new avenues for mining the untapped potential of clinical image archives. This work highlights how prior biological knowledge can substantially advance the interpretability and scalability of biomedical AI models.
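The abstract does not spell out the weighting scheme, so the following is only a minimal sketch of one plausible reading: each spot's initial expression is a weighted average of per-cell-type reference profiles, with weights taken from the cell-type composition found by segmentation. The function name `initial_expression`, the count-proportion weighting, and the toy inputs are all assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def initial_expression(cell_type_counts, reference_profiles):
    """Hypothetical sketch: composition-weighted initial expression per spot.

    cell_type_counts: (n_spots, n_types) cells of each type segmented in a spot.
    reference_profiles: (n_types, n_genes) mean expression per cell type,
        e.g. averaged from a single-cell reference (assumed input format).
    Returns an (n_spots, n_genes) array.
    """
    counts = np.asarray(cell_type_counts, dtype=float)
    profiles = np.asarray(reference_profiles, dtype=float)

    # Turn counts into per-spot proportions; spots with no cells keep zero weights.
    totals = counts.sum(axis=1, keepdims=True)
    weights = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)

    # Weighted average of the reference profiles for every spot.
    return weights @ profiles

# Toy example: 2 spots, 2 cell types, 3 genes.
counts = [[3, 1], [0, 0]]
profiles = [[1, 0, 2], [5, 4, 0]]
print(initial_expression(counts, profiles))
```

In this toy case the first spot is 75% type A and 25% type B, so its estimate is the corresponding blend of the two profiles, while the empty second spot falls back to zeros. A real pipeline would presumably pass such an estimate to the network together with the image features rather than use it as the final prediction.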

## Full-text entities

- **Genes:** BRCA1 (BRCA1 DNA repair associated) [NCBI Gene 672] {aka BRCAI, BRCC1, BROVCA1, FANCS, IRIS, PNCA4}
- **Chemicals:** H&E (MESH:D006371)

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12632197/full.md

---
Source: https://tomesphere.com/paper/PMC12632197