Linking spatial biology and clinical histology via Haiku

Yan Cui; Jacob S. Leiby; Wenhui Lei; Dokyoon Kim; Yanxiang Deng; Aaron T. Mayer; Zhenqin Wu; Alexandro E. Trevino; Zhi Huang

arXiv:2605.00925 · cs.LG · May 5, 2026

TL;DR

Haiku is a tri-modal contrastive learning model that integrates spatial proteomics, histology, and clinical data to enhance biomedical analysis and enable zero-shot biomarker inference.

Contribution

The paper introduces Haiku, a model that aligns spatial proteomics (mIF), H&E histology, and clinical metadata in a shared embedding space, improving cross-modal retrieval, classification, and biomarker prediction in spatial biology.
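To make "aligns three modalities in a shared space" concrete, here is a minimal sketch of one common way to do it: a CLIP-style symmetric InfoNCE loss averaged over the three modality pairs. This listing does not state Haiku's actual objective, so the pairwise formulation, the temperature value, and the names `info_nce` and `tri_modal_loss` are assumptions made for illustration only.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(a, b, temperature=0.07):
    """Symmetric InfoNCE between two batches of paired embeddings.

    Row i of `a` and row i of `b` are assumed to describe the same sample.
    """
    a = [l2_normalize(v) for v in a]
    b = [l2_normalize(v) for v in b]
    n = len(a)
    logits = [
        [sum(x * y for x, y in zip(a[i], b[j])) / temperature for j in range(n)]
        for i in range(n)
    ]

    def cross_entropy_diag(rows):
        # Cross-entropy where the correct class for row i is column i.
        total = 0.0
        for i, row in enumerate(rows):
            m = max(row)  # subtract the max for numerical stability
            log_z = m + math.log(sum(math.exp(x - m) for x in row))
            total += log_z - row[i]
        return total / len(rows)

    transposed = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diag(logits) + cross_entropy_diag(transposed))


def tri_modal_loss(mif, he, clin, temperature=0.07):
    """Average the pairwise losses over the three modality pairs (an assumption)."""
    return (
        info_nce(mif, he, temperature)
        + info_nce(mif, clin, temperature)
        + info_nce(he, clin, temperature)
    ) / 3.0
```

In this sketch the loss approaches zero when matched samples from all three modalities point in the same direction and grows when any modality is mismatched, which is the property that makes three-way retrieval possible.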

Findings

01

Haiku achieves high cross-modal retrieval accuracy (Recall@50 up to 0.611).

02

It improves survival prediction with a C-index of 0.737.

03

It enables zero-shot biomarker inference, reaching a mean Pearson correlation of 0.718.
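For readers unfamiliar with the three metrics quoted above, the following sketch shows their textbook definitions. It is not the paper's evaluation code: tie handling, the way pairs are matched for retrieval, and how the Pearson correlation is averaged across markers are not given in this listing, so the choices below (and the function names) are assumptions.

```python
import math


def recall_at_k(similarity, k):
    """Fraction of queries whose true match (same index) ranks in the top k.

    `similarity[i][j]` is the score between query i and candidate j.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Rank of the true match = number of candidates scoring strictly higher.
        rank = sum(1 for s in row if s > row[i])
        hits += rank < k
    return hits / len(similarity)


def concordance_index(times, events, risks):
    """Harrell's C-index: share of comparable pairs ordered correctly by risk.

    A pair (i, j) is comparable when i had an observed event before time j.
    Tied risk scores count as half-concordant.
    """
    concordant = 0.0
    comparable = 0
    for i in range(len(times)):
        for j in range(len(times)):
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    return concordant / comparable


def pearson_r(x, y):
    """Pearson correlation between predicted and measured values."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)
```

Under these definitions, a Recall@50 of 0.611 would mean the correct match appears among the 50 highest-scoring candidates for about 61% of queries, and a C-index of 0.737 would mean about 74% of comparable patient pairs are ranked in the right order (0.5 is chance).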

Abstract

Integrating molecular, morphological, and clinical data is essential for basic and translational biomedical research, yet systematic frameworks for jointly modeling these modalities remain limited. Here we present Haiku, a tri-modal contrastive learning model trained on multiplexed immunofluorescence (mIF) data comprising 26.7 million spatial proteomics patches from 3,218 tissue sections across 1,606 patients spanning 11 organ types, with matched hematoxylin and eosin (H&E) histology and clinical metadata aligned in a shared embedding space. Haiku enables three-way cross-modal retrieval, improves downstream classification and clinical prediction tasks over unimodal baselines, and supports zero-shot biomarker inference through fusion retrieval conditioned on clinical metadata-only text descriptions. Across tasks, Haiku outperforms competing approaches, achieving cross-modal retrieval…

Code & Models

Repositories

zhihuanglab/Haiku (GitHub)
