# S2potAE: multimodal spatial spot autoencoder integrating image and transcriptomic features for deconvolution

**Authors:** Tianyi Chen, Wen Xue, Yunfei Zhang, Yongcan Luo, Cheng Liu, Wenjun Shen, Si Wu, Hau-San Wong

PMC · DOI: 10.1093/bib/bbag020 · 2026-01-31

## TL;DR

S2potAE is a deconvolution method that combines gene expression, spatial coordinates, and histology image features to estimate cell-type proportions within spatial transcriptomics spots.

## Contribution

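S2potAE introduces a multimodal autoencoder framework for spot-level deconvolution that integrates transcriptomic data, spatial coordinates, and morphological features from histology images, with an auxiliary pathological classification task to aid interpretability.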

## Key findings

- S2potAE outperforms state-of-the-art deconvolution methods in accuracy and robustness on simulated and real datasets (human breast cancer, mouse brain anterior, and human dorsolateral prefrontal cortex).
- The method accurately identifies tumor boundaries and captures nuanced cell-type distributions.
- It integrates spatial coordinates and histological features through a graph-based encoder and perceptual image embeddings.

## Abstract

Spatial transcriptomics (ST) technologies have significantly advanced our ability to discern gene expression patterns within intact tissue structures, enabling unprecedented insights into cellular heterogeneity and tissue architecture. However, accurately determining cell-type proportions within spatially aggregated transcriptomic spots remains challenging due to inherent granularity discrepancies, batch effects, and spatial heterogeneity. To address these challenges, we introduce S$^{2}$potAE, a novel spatial spot autoencoder framework that integrates gene expression data, spatial coordinates, and morphological features from histology images for precise spot-level deconvolution. S$^{2}$potAE employs a multilevel feature aggregation strategy, systematically extracting and fusing spatially-aware features through a graph-based spatial encoder and perceptual image embeddings from histological patches. Furthermore, an auxiliary pathological classification task enhances biological relevance and model interpretability. Comprehensive benchmarking across multiple simulated and real datasets (including human breast cancer, mouse brain anterior, and human dorsolateral prefrontal cortex) demonstrates that S$^{2}$potAE consistently surpasses state-of-the-art methods in accuracy, robustness, and biological interpretability. Our approach effectively resolves complex cellular compositions, accurately identifies tumor boundaries, and captures nuanced cell-type distributions, significantly enhancing the utility of ST in biological research and clinical applications.
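
To make the general idea in the abstract concrete, the following is a minimal NumPy sketch of how spatial coordinates, expression, and image embeddings might be fused into per-spot cell-type proportions. It is not the authors' architecture: the k-nearest-neighbor graph, the mean aggregation, the single linear layer, the random weights, and every function and variable name (`knn_adjacency`, `predict_proportions`, `w`, and so on) are assumptions chosen for illustration, and the inputs are synthetic.

```python
import numpy as np


def knn_adjacency(coords, k):
    """Row-normalized adjacency over each spot's k + 1 closest spots.

    Assumes distinct spot coordinates, so each spot's closest spot is itself.
    """
    n = coords.shape[0]
    # Pairwise Euclidean distances between spot centers.
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    # Indices of the k + 1 smallest distances per row (self plus k neighbors).
    nearest = np.argsort(dists, axis=1)[:, : k + 1]
    adj = np.zeros((n, n))
    adj[np.repeat(np.arange(n), k + 1), nearest.ravel()] = 1.0
    # Normalize rows so that multiplying by adj takes a neighborhood mean.
    return adj / adj.sum(axis=1, keepdims=True)


def softmax(z):
    """Numerically stable row-wise softmax."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def predict_proportions(expr, img_emb, coords, w, k=4):
    """Hypothetical forward pass: smooth, fuse, project, normalize."""
    adj = knn_adjacency(coords, k)
    smoothed = adj @ expr                                # neighbor-averaged expression
    fused = np.concatenate([smoothed, img_emb], axis=1)  # add image features
    return softmax(fused @ w)                            # rows sum to 1


# Synthetic inputs standing in for real spots (illustrative sizes only).
rng = np.random.default_rng(0)
n_spots, n_genes, n_img, n_types = 50, 20, 8, 5
coords = rng.uniform(0, 10, size=(n_spots, 2))
expr = rng.poisson(3.0, size=(n_spots, n_genes)).astype(float)
img_emb = rng.normal(size=(n_spots, n_img))
w = rng.normal(scale=0.1, size=(n_genes + n_img, n_types))  # untrained weights

props = predict_proportions(expr, img_emb, coords, w)
```

The softmax output is one simple way to guarantee that each spot's predicted proportions are non-negative and sum to one, which is the constraint any deconvolution output has to satisfy. A trained model would learn `w` (and typically far deeper encoders) from reference data rather than sampling it at random.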

## Linked entities

- **Diseases:** breast cancer (MONDO:0004989)
- **Species:** Homo sapiens (taxon 9606), Mus musculus (taxon 10090)

## Full-text entities

- **Diseases:** tumor (MESH:D009369), breast cancer (MESH:D001943)
- **Species:** Homo sapiens (human, species) [taxon 9606], Mus musculus (house mouse, species) [taxon 10090]

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12860387/full.md

---
Source: https://tomesphere.com/paper/PMC12860387