ST-Align: A Multimodal Foundation Model for Image-Gene Alignment in Spatial Transcriptomics

Yuxiang Lin; Ling Luo; Ying Chen; Xushi Zhang; Zihui Wang; Wenxian Yang; Mengsha Tong; Rongshan Yu

arXiv:2411.16793·cs.CV·November 27, 2024·3 cites

Open Access

TL;DR

ST-Align is a novel multimodal foundation model that aligns pathological images with genomic data in spatial transcriptomics by incorporating spatial context and multi-scale alignment, improving analysis and reducing costs.

Contribution

It introduces the first foundation model for spatial transcriptomics that deeply integrates image and gene data with spatial context through a novel pretraining framework.

Findings

01 Outperforms existing methods in zero-shot and few-shot tasks

02 Pretrained on 1.3 million spot-niche pairs

03 Enhances understanding of tissue architecture

Abstract

Spatial transcriptomics (ST) provides high-resolution pathological images and whole-transcriptome expression profiles at individual spots across whole slides, which makes it an ideal data source for developing multimodal foundation models. Although recent studies have attempted to fine-tune visual encoders together with trainable gene encoders at the spot level, the absence of a wider slide-level perspective and of intrinsic spatial relationships limits their ability to capture ST-specific insights. Here, we introduce ST-Align, the first foundation model designed for ST that deeply aligns image-gene pairs by incorporating spatial context, effectively bridging pathological imaging with genomic features. We design a novel pretraining framework with a three-target alignment strategy for ST-Align, enabling (1) multi-scale alignment across image-gene pairs, capturing both spot- and…
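To make the idea of aligning image-gene pairs concrete, here is a minimal sketch of a symmetric contrastive (CLIP-style) objective, a common choice for this kind of alignment. The abstract does not state which loss ST-Align uses, so the loss form, the function names, and the `temperature` value are all assumptions for illustration, not the authors' method.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit length (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def cross_entropy_diagonal(logits):
    """Mean of -log softmax(row_i)[i]: the matching pair sits on the diagonal."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the row max for numerical stability
        log_z = m + math.log(sum(math.exp(v - m) for v in row))
        total += log_z - row[i]
    return total / len(logits)

def symmetric_contrastive_loss(image_emb, gene_emb, temperature=0.07):
    """Hypothetical alignment loss: row i of each input is one image-gene pair."""
    img = [l2_normalize(v) for v in image_emb]
    gene = [l2_normalize(v) for v in gene_emb]
    # Cosine similarity between every image and every gene embedding.
    logits = [[sum(a * b for a, b in zip(u, w)) / temperature for w in gene]
              for u in img]
    logits_t = [list(col) for col in zip(*logits)]
    # Average the image-to-gene and gene-to-image directions.
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))
```

The loss is small when each image embedding is most similar to its own gene embedding and large when pairs are mismatched. A multi-scale variant could apply the same function once to spot-level embeddings and once to embeddings of a larger spatial neighbourhood, though how ST-Align combines scales is not recoverable from the truncated abstract.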

Taxonomy

Topics: Single-cell and spatial transcriptomics