# Transcriptomic-guided whole-slide image classification for molecular subtype identification

**Authors:** Weiwen Wang, Xiwen Zhang, Yuanyan Xiong

PMC · DOI: 10.1371/journal.pcbi.1013950 · PLOS Computational Biology · 2026-02-09

## TL;DR

This paper introduces TEMI, a framework that classifies cancer molecular subtypes from whole-slide histopathology images, using transcriptomic data to guide training. Its results suggest that tissue morphology and gene expression are closely linked.

## Contribution

TEMI is a multimodal framework that uses transcriptomic data during training to guide whole-slide image representations, improving molecular subtype classification of cancers.

## Key findings

- TEMI outperforms existing methods in molecular subtype classification by effectively integrating transcriptomic information.
- Morphological features learned by TEMI enhance gene expression prediction and are stable across imaging techniques.
- The study highlights the interplay between tumor morphology and transcriptomics, suggesting histological features encode latent molecular signals.

## Abstract

Recent advancements in computational pathology have greatly improved automated histopathological analysis. A compelling question in the field is how morphological traits are associated with genetic characteristics or molecular phenotypes. Here we propose TEMI, a novel framework for molecular subtype classification of cancers using whole-slide images (WSIs), augmented with transcriptomic data during training. TEMI aims to extract molecular-level signals from WSIs and make efficient use of available multimodal data. To this end, TEMI introduces a patch fusion network that captures dependencies among local patches of gigapixel WSIs to produce global representations and aligns them with transcriptomic embeddings obtained from a masked transcriptomic autoencoder. TEMI achieves superior performance compared with existing methods in molecular subtype classification, owing to its effective integration of transcriptomic information through two alignment strategies. Guided by discriminative transcriptomic data, TEMI learns invariant WSI representations, while morphological features also enhance gene expression prediction. These findings suggest that histological features encode latent molecular signals, highlighting the interplay between the tumor microenvironment and cancer transcriptomics. Our study demonstrates how multimodal learning can bridge morphology and molecular biology, providing an effective tool to advance precision medicine.

Cancer’s intrinsic heterogeneity poses significant challenges to effective treatment. Molecular subtyping, which stratifies patients into subgroups based on molecular and genetic distinctions, therefore serves as a cornerstone of precision medicine and personalized healthcare. Because artificial intelligence (AI), particularly deep learning, is highly capable of discovering patterns, AI models that identify molecular subtypes directly from routinely available haematoxylin–eosin (H&E)-stained histopathology slides can improve the efficiency of patient stratification, enable timely treatment, and reduce medical costs. A major challenge lies in uncovering how morphological features revealed in H&E-stained slides relate to variation in molecular signals, bridging the gap between phenotypes at different layers of the biological hierarchy. To address this issue, we propose TEMI, a method for inferring molecular subtypes from H&E-stained histopathology slides, in which the discriminative representation learning of morphological characteristics is guided by transcriptomic profiles. The experiments showed that the proposed method achieved superior performance compared with existing tools, and that guidance from transcriptomic profiles helped it learn representations of morphological features that remain stable when imaging techniques change. Moreover, morphological features were shown to benefit gene expression prediction. These results suggest that the developed tool is effective for molecular subtype identification from histopathology slides and underscore the links between cancer morphology and transcriptomics.
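
The abstract describes fusing patch-level features into a slide-level representation and aligning it with a transcriptomic embedding produced by a masked autoencoder. The sketch below illustrates that general idea only. This summary does not specify TEMI's architecture or its two alignment strategies, so everything here is an assumption: dot-product attention pooling stands in for the patch fusion network, a cosine-distance loss stands in for alignment, random zero-masking stands in for the autoencoder's input corruption, and all function names (`attention_pool`, `alignment_loss`, `mask_expression`) are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(patches, query):
    """Fuse patch embeddings into one slide-level vector.

    Assumption: a single dot-product attention query stands in for
    the paper's patch fusion network.
    """
    scores = [sum(p_i * q_i for p_i, q_i in zip(p, query)) for p in patches]
    weights = softmax(scores)
    dim = len(patches[0])
    return [sum(w * p[d] for w, p in zip(weights, patches)) for d in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def alignment_loss(slide_vec, rna_vec):
    """Cosine distance: near 0 when the two embeddings point the same way.

    Assumption: chosen for illustration, not taken from the paper.
    """
    return 1.0 - cosine_similarity(slide_vec, rna_vec)


def mask_expression(expr, mask_ratio, rng):
    """Zero out a random subset of genes, as a masked autoencoder's input.

    Returns the masked profile and the set of masked gene indices.
    """
    n_mask = int(round(mask_ratio * len(expr)))
    masked_idx = set(rng.sample(range(len(expr)), n_mask))
    masked = [0.0 if i in masked_idx else v for i, v in enumerate(expr)]
    return masked, masked_idx


# Toy data: 5 patches with 4-dimensional embeddings (made-up values).
rng = random.Random(0)
patches = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(5)]
query = [0.5, -0.2, 0.1, 0.3]            # hypothetical learned query
slide_vec = attention_pool(patches, query)

rna_vec = [rng.gauss(0, 1) for _ in range(4)]  # stand-in transcriptomic embedding
loss = alignment_loss(slide_vec, rna_vec)
```

In a trained model, the query and both encoders would be learned jointly so that this loss decreases for matched slide and expression pairs. Consistent with the abstract, transcriptomic data is needed only during training, so inference would use the slide branch alone.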

## Linked entities

- **Diseases:** cancer (MONDO:0004992)

## Full-text entities

- **Diseases:** cancer (MESH:D009369)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12900446/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12900446/full.md

## References

47 references — full list in the complete paper: https://tomesphere.com/paper/PMC12900446/full.md

---
Source: https://tomesphere.com/paper/PMC12900446