Multi-Modal Foundation Models for Computational Pathology: A Survey
Dong Li, Guihong Wan, Xintao Wu, Xinyu Wu, Xiaohui Chen, Yi He, Christine G. Lian, Peter K. Sorger, Yevgeniy R. Semenov, Chen Zhao

TL;DR
This survey reviews recent advances in multi-modal foundation models for computational pathology, focusing on integrating visual data with text, knowledge, and molecular profiles to improve analysis of histopathological images.
Contribution
It categorizes 32 state-of-the-art multi-modal foundation models, analyzes the datasets and tasks associated with them, and provides a comprehensive taxonomy along with future research directions.
Findings
Classification of models into vision-language, vision-knowledge graph, and vision-gene expression paradigms.
Analysis of 28 multi-modal datasets for pathology applications.
Identification of key challenges and future research directions.
Abstract
Foundation models have emerged as a powerful paradigm in computational pathology (CPath), enabling scalable and generalizable analysis of histopathological images. While early developments centered on uni-modal models trained solely on visual data, recent advances have highlighted the promise of multi-modal foundation models that integrate heterogeneous data sources such as textual reports, structured domain knowledge, and molecular profiles. In this survey, we provide a comprehensive and up-to-date review of multi-modal foundation models in CPath, with a particular focus on models built upon hematoxylin and eosin (H&E) stained whole slide images (WSIs) and tile-level representations. We categorize 32 state-of-the-art multi-modal foundation models into three major paradigms: vision-language, vision-knowledge graph, and vision-gene expression. We further divide vision-language models…
Taxonomy
Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging · Medical Image Segmentation Techniques
