CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

Yuxuan Sun; Yixuan Si; Chenglu Zhu; Xuan Gong; Kai Zhang; Pingyi Chen; Ye Zhang; Zhongyi Shui; Tao Lin; Lin Yang

arXiv:2412.12077 · cs.CV · December 17, 2024

Open Access

TL;DR

CPath-Omni is a large multimodal foundation model that unifies patch and whole-slide image analysis in pathology, achieving state-of-the-art results across multiple tasks and datasets.

Contribution

CPath-Omni is presented as the first 15-billion-parameter LMM to consolidate patch-level and WSI-level analysis in a single model. The authors also develop a specialized CLIP-based visual processor for pathology.

Findings

01. Achieves SOTA on 39 out of 42 datasets across seven tasks.

02. Outperforms or matches task-specific models trained for individual tasks.

03. First to integrate diverse vision models with a large language model in pathology.

Abstract

The emergence of large multimodal models (LMMs) has brought significant advancements to pathology. Previous research has primarily focused on separately training patch-level and whole-slide image (WSI)-level models, limiting the integration of learned knowledge across patches and WSIs, and resulting in redundant models. In this work, we introduce CPath-Omni, the first 15-billion-parameter LMM designed to unify both patch and WSI level image analysis, consolidating a variety of tasks at both levels, including classification, visual question answering, captioning, and visual referring prompting. Extensive experiments demonstrate that CPath-Omni achieves state-of-the-art (SOTA) performance across seven diverse tasks on 39 out of 42 datasets, outperforming or matching task-specific models trained for individual tasks. Additionally, we develop a specialized pathology CLIP-based visual…

Taxonomy

Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging · Digital Imaging for Blood Diseases

Methods: Contrastive Language-Image Pre-training
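
For readers unfamiliar with the listed method, the following is a minimal sketch of the generic symmetric contrastive objective used in CLIP-style pre-training. It is not taken from the CPath-Omni paper: the function names, the toy embeddings, and the temperature of 0.07 are illustrative assumptions, and the page above does not describe how the authors' pathology visual processor was actually trained.

```python
import math

def l2_normalize(v):
    # Scale a non-zero vector to unit length so dot products are cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cross_entropy_row(logits, target):
    # Numerically stable -log softmax(logits)[target].
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]

def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    # Hypothetical helper: assumes image_embs[k] and text_embs[k] form a matched pair.
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [[sum(a * b for a, b in zip(i, t)) / temperature for t in txts]
              for i in imgs]
    # Image-to-text direction: each row should pick out its own caption.
    loss_i2t = sum(cross_entropy_row(logits[k], k) for k in range(n)) / n
    # Text-to-image direction: same computation on the transposed matrix.
    cols = [[logits[r][c] for r in range(n)] for c in range(n)]
    loss_t2i = sum(cross_entropy_row(cols[k], k) for k in range(n)) / n
    return 0.5 * (loss_i2t + loss_t2i)

# Toy check: aligned pairs should score a lower loss than swapped pairs.
matched = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
swapped = clip_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(matched < swapped)
```

In a real pipeline the embeddings would come from trained image and text encoders and the loss would be computed with a tensor library over large batches; the pure-Python version here only illustrates the shape of the objective.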