A Large-Scale Benchmark of Cross-Modal Learning for Histology and Gene Expression in Spatial Transcriptomics
Rushin H. Gindra, Giovanni Palla, Mathias Nguyen, Sophia J. Wagner, Manuel Tran, Fabian J Theis, Dieter Saur, Lorin Crawford, Tingying Peng

TL;DR
This paper introduces HESCAPE, a comprehensive benchmark for cross-modal learning in spatial transcriptomics, revealing insights into model performance, batch effects, and the need for robust multimodal methods.
Contribution
We present HESCAPE, the first large-scale benchmark dataset and evaluation framework for cross-modal contrastive learning in spatial transcriptomics.
Findings
Gene expression encoders are key for representational alignment.
Pretrained gene models outperform non-spatial baselines.
Contrastive pretraining improves mutation classification but harms gene expression prediction.
Abstract
Spatial transcriptomics enables simultaneous measurement of gene expression and tissue morphology, offering unprecedented insights into cellular organization and disease mechanisms. However, the field lacks comprehensive benchmarks for evaluating multimodal learning methods that leverage both histology images and gene expression data. Here, we present HESCAPE, a large-scale benchmark for cross-modal contrastive pretraining in spatial transcriptomics, built on a curated pan-organ dataset spanning 6 different gene panels and 54 donors. We systematically evaluated state-of-the-art image and gene expression encoders across multiple pretraining strategies and assessed their effectiveness on two downstream tasks: gene mutation classification and gene expression prediction. Our benchmark demonstrates that gene expression encoders are the primary determinant of strong representational…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSingle-cell and spatial transcriptomics · Domain Adaptation and Few-Shot Learning · Cell Image Analysis Techniques
