SAGE-FM: A lightweight and interpretable spatial transcriptomics foundation model
Xianghao Zhan, Jingyu Xu, Yuanning Zheng, Zinaida Good, Olivier Gevaert

TL;DR
SAGE-FM is a lightweight, interpretable foundation model built on graph convolutional networks. It captures spatial gene expression relationships, outperforms existing methods in unsupervised clustering and in preserving biological heterogeneity, and generalizes to downstream tasks in spatial transcriptomics.
Contribution
Introduces SAGE-FM, a novel spatial transcriptomics foundation model based on GCNs that is lightweight, interpretable, and effective across multiple biological and computational tasks.
Findings
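Recovers masked genes robustly: 91% of masked genes show significant correlations (p < 0.05).
Outperforms MOFA and existing spatial transcriptomics methods in unsupervised clustering and preservation of biological heterogeneity.
Generalizes to downstream tasks, including pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma (81% accuracy) and tumor subtype prediction.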
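The masked-gene recovery figure is a share of genes, not a correlation coefficient. A minimal sketch of how such a share could be computed follows. It assumes that each gene's predicted and true expression are compared across spots, and it swaps in a permutation test for the p-value because the paper's actual test is not stated here. All function names and the toy data are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant input: treat as uncorrelated
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=500, rng=None):
    """Two-sided permutation p-value for the Pearson correlation.

    Assumption: a permutation test stands in for whatever test the paper used.
    """
    if rng is None:
        rng = random.Random(0)
    observed = abs(pearson_r(x, y))
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(x, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


def fraction_significant(true_by_gene, pred_by_gene, alpha=0.05):
    """Share of genes whose predicted and true values correlate with p < alpha."""
    rng = random.Random(0)
    significant = sum(
        permutation_p_value(true_by_gene[g], pred_by_gene[g], rng=rng) < alpha
        for g in true_by_gene
    )
    return significant / len(true_by_gene)


# Toy data (hypothetical): 4 genes measured at 40 masked spots.
toy_rng = random.Random(1)
truth = {f"gene{i}": [toy_rng.gauss(0, 1) for _ in range(40)] for i in range(4)}
# Three genes are recovered with small noise; one prediction is unrelated noise.
pred = {g: [v + toy_rng.gauss(0, 0.3) for v in vals] for g, vals in truth.items()}
pred["gene3"] = [toy_rng.gauss(0, 1) for _ in range(40)]

share = fraction_significant(truth, pred)
print(f"share of genes with p < 0.05: {share:.2f}")
```

Read this way, "91%" means that roughly nine in ten masked genes pass the significance threshold, regardless of how large each individual correlation is.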
Abstract
Spatial transcriptomics enables spatial gene expression profiling, motivating computational models that capture spatially conditioned regulatory relationships. We introduce SAGE-FM, a lightweight spatial transcriptomics foundation model based on graph convolutional networks (GCNs) trained with a masked central spot prediction objective. Trained on 416 human Visium samples spanning 15 organs, SAGE-FM learns spatially coherent embeddings that robustly recover masked genes, with 91% of masked genes showing significant correlations (p < 0.05). The embeddings generated by SAGE-FM outperform MOFA and existing spatial transcriptomics methods in unsupervised clustering and preservation of biological heterogeneity. SAGE-FM generalizes to downstream tasks, enabling 81% accuracy in pathologist-defined spot annotation in oropharyngeal squamous cell carcinoma and improving glioblastoma subtype…
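To make the masked central spot objective concrete, here is a minimal NumPy sketch of a forward pass and loss. It is not the authors' implementation. The two-layer architecture, the hidden size, zero-masking, the mean squared error loss, and all function names are assumptions chosen for illustration. The toy graph uses one centre spot with six neighbours, mirroring the hexagonal layout of Visium spots.

```python
import numpy as np


def normalized_adjacency(adj):
    """Symmetric GCN normalization: D^-1/2 (A + I) D^-1/2."""
    a_hat = adj + np.eye(adj.shape[0])  # add self-loops
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(x, a_norm, w1, w2):
    """Two graph-convolution layers: returns (spot embeddings, reconstruction)."""
    h = np.maximum(a_norm @ x @ w1, 0.0)  # ReLU hidden embedding per spot
    return h, a_norm @ h @ w2


def masked_center_loss(x, adj, center, w1, w2):
    """Hide the centre spot's expression, then score how well it is rebuilt."""
    x_masked = x.copy()
    x_masked[center] = 0.0  # assumption: masking by zeroing the expression vector
    _, recon = gcn_forward(x_masked, normalized_adjacency(adj), w1, w2)
    return float(np.mean((recon[center] - x[center]) ** 2))


# Toy neighbourhood (hypothetical): spot 0 is the centre, spots 1..6 surround it.
n_spots, n_genes, n_hidden = 7, 5, 8
adj = np.zeros((n_spots, n_spots))
for i in range(1, n_spots):
    adj[0, i] = adj[i, 0] = 1.0          # centre to each neighbour
    j = i % 6 + 1                        # next neighbour around the ring
    adj[i, j] = adj[j, i] = 1.0

rng = np.random.default_rng(0)
x = rng.random((n_spots, n_genes))       # stand-in expression matrix
w1 = rng.normal(scale=0.1, size=(n_genes, n_hidden))
w2 = rng.normal(scale=0.1, size=(n_hidden, n_genes))

loss = masked_center_loss(x, adj, center=0, w1=w1, w2=w2)
print(f"masked-centre reconstruction loss: {loss:.4f}")
```

In training, a loss of this kind would be minimized over many neighbourhoods so that the weights learn to infer a spot's expression from its spatial context. The sketch stops at the forward pass and omits optimization.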
Taxonomy
Topics: Single-cell and spatial transcriptomics · Gene expression and cancer classification · Bioinformatics and Genomic Networks
