TL;DR
HEIST is a hierarchical graph transformer model that integrates spatial transcriptomics and proteomics data, capturing cellular heterogeneity and tissue context to improve biological insights and generalize across data types.
Contribution
The paper introduces HEIST, a novel hierarchical graph transformer that models tissue data at multiple levels, enabling cross-level message passing and generalization to unseen data types.
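To make the two levels concrete, here is a minimal sketch of one way to build a spatial cell graph and a gene co-expression graph from a spatial-omics matrix. This summary does not say how HEIST constructs either graph, so the k-nearest-neighbor rule, the Pearson-correlation threshold, and the names `spatial_knn_graph`, `coexpression_graph`, `k`, and `threshold` are all assumptions made for illustration, not the authors' method.

```python
import numpy as np


def spatial_knn_graph(coords, k=3):
    """Directed edges (i, j) where cell j is one of the k nearest cells to cell i.

    Assumption: Euclidean kNN over spatial coordinates; the paper may differ.
    """
    diffs = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diffs ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # exclude self-loops
    neighbors = np.argsort(dist, axis=1)[:, :k]
    return [(i, int(j)) for i in range(len(coords)) for j in neighbors[i]]


def coexpression_graph(expr, threshold=0.5):
    """Undirected gene-gene edges where |Pearson r| across cells exceeds threshold.

    Assumption: a single global correlation cutoff; the paper may differ.
    """
    corr = np.nan_to_num(np.corrcoef(expr, rowvar=False))  # genes are columns
    n_genes = corr.shape[0]
    return [
        (a, b)
        for a in range(n_genes)
        for b in range(a + 1, n_genes)
        if abs(corr[a, b]) > threshold
    ]


# Synthetic toy data: 20 cells with 2-D coordinates and counts for 6 genes.
rng = np.random.default_rng(0)
coords = rng.uniform(0, 100, size=(20, 2))
expr = rng.poisson(5.0, size=(20, 6)).astype(float)
expr[:, 1] = 2 * expr[:, 0] + 1  # force gene 1 to track gene 0 exactly

cell_edges = spatial_knn_graph(coords, k=3)             # upper level: cells in tissue
gene_edges = coexpression_graph(expr, threshold=0.9)    # lower level: genes within cells
```

In a hierarchical model, the cell-level edges would carry messages between neighboring cells and the gene-level edges between co-expressed genes, with cross-level message passing linking the two.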
Findings
Reveals spatially informed cell subpopulations missed by prior models.
Achieves state-of-the-art performance in clinical outcome prediction.
Demonstrates strong generalization to proteomics data.
Abstract
Single-cell transcriptomics and proteomics have become a great source for data-driven insights into biology, enabling the use of advanced deep learning methods to understand cellular heterogeneity and gene expression at the single-cell level. With the advent of spatial-omics data, we have the promise of characterizing cells within their tissue context as it provides both spatial coordinates and intra-cellular transcriptional or protein counts. Proteomics offers a complementary view by directly measuring proteins, which are the primary effectors of cellular function and key therapeutic targets. However, existing models either ignore the spatial information or the complex genetic and proteomic programs within cells. Thus they cannot infer how cell internal regulation adapts to microenvironmental cues. Furthermore, these models often utilize fixed gene vocabularies, hindering their…
Peer Reviews
Decision: ICLR 2026 Poster
- Modeling spatial cell graphs with gene co-expression networks seems well-motivated, as previous work emphasized either one
- The inference time speed-up over other methods seems good
- Downstream results seem promising
Weaknesses/Questions:
- How is $\tau$ chosen? Is it different across cell types? How does varying $\tau$ affect the results?
- What is $\sigma$ in L240?
- It would be interesting to see whether the model can identify known inter-cell interactions. Currently, the biological interpretability analysis seems limited.
- While the loss seems interesting, I am slightly skeptical that MAE + CL work together; how does performance change on tasks if you train on either alone?
Overall, I think the paper pres
1. A graph foundation model is proposed for analysing spatial transcriptomics and proteomics data.
2. The method incorporates two hierarchical graph levels (the spatial cell graph and the gene co-expression graph) within a hierarchical graph transformer framework for effective embedding learning.
3. Extensive experiments on multiple datasets are conducted to evaluate the effectiveness of the proposed approach.
1. The detailed descriptions of the spatial cell graph and the gene co-expression graph are insufficient and should be clarified.
2. Figure 2 does not effectively illustrate the main concept or workflow of the HEIST architecture.
3. Regarding the contrastive learning design, the c–c, g–g, and c–g objectives are formulated separately. It is unclear whether combining these formulations in a joint loss function would reduce performance.
4. Concerning the weighted sum of the
- A clear hierarchical formulation that couples intra-cell gene programs with inter-cell spatial context via cross-level message passing.
- Ambitious pretraining scope (22.3M cells across multiple organs).
- Demonstrates SOTA-level results on clustering, annotation, imputation, and outcome prediction, with ablations highlighting the contributions of hierarchy, cross-level passing, and loss design.
Although the idea is interesting, several components are not clearly motivated, and some key method and experiment settings are missing.
Taxonomy
Methods: Contrastive Learning
