Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction

Hao Li; Ying Chen; Yifei Chen; Wenxian Yang; Bowen Ding; Yuchen Han; Liansheng Wang; Rongshan Yu

arXiv:2402.19326·cs.CV·April 8, 2024·2 cites

TL;DR

This paper introduces the FiVE framework, which enhances whole slide image classification by leveraging fine-grained visual-semantic interactions and localized pathological descriptions, improving generalization and transferability across diverse tasks.

Contribution

The paper proposes FiVE, a framework that uses fine-grained visual-semantic interaction and language-driven descriptions to improve the generalization of whole slide image (WSI) classification.

Findings

01. Outperforms existing methods on the TCGA Lung Cancer dataset by at least 9.19% accuracy in few-shot experiments.

02. Uses language models to extract detailed pathological descriptions from pathology reports.

03. Sampling a subset of visual instances during training improves training efficiency and robustness.

Abstract

Whole Slide Image (WSI) classification is often formulated as a Multiple Instance Learning (MIL) problem. Recently, Vision-Language Models (VLMs) have demonstrated remarkable performance in WSI classification. However, existing methods leverage coarse-grained pathogenetic descriptions for visual representation supervision, which are insufficient to capture the complex visual appearance of pathogenetic images, hindering the generalizability of models on diverse downstream tasks. Additionally, processing high-resolution WSIs can be computationally expensive. In this paper, we propose a novel "Fine-grained Visual-Semantic Interaction" (FiVE) framework for WSI classification. It is designed to enhance the model's generalizability by leveraging the interaction between localized visual patterns and fine-grained pathological semantics. Specifically, with meticulously designed queries, we start…
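The MIL formulation mentioned in the abstract can be illustrated with a minimal sketch: a slide is treated as a bag of patch embeddings, a subset of instances is sampled, and the subset is pooled into one slide-level vector that is compared against per-class vectors. This is not the FiVE implementation. The function names, the attention vector `w`, the class vectors, and the toy embeddings are all hypothetical and chosen only to show the general pattern of attention-based MIL pooling with instance sampling.

```python
import math
import random


def sample_instances(bag, k, seed=None):
    """Randomly keep at most k patch embeddings from a bag (one slide)."""
    rng = random.Random(seed)
    if len(bag) <= k:
        return list(bag)
    return rng.sample(bag, k)


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(bag, w):
    """Aggregate instance embeddings into one bag embedding.

    Each instance gets a scalar score (dot product with the hypothetical
    attention vector w); softmax turns the scores into weights summing to 1.
    """
    scores = [sum(x_i * w_i for x_i, w_i in zip(x, w)) for x in bag]
    weights = softmax(scores)
    dim = len(bag[0])
    pooled = [sum(a * x[d] for a, x in zip(weights, bag)) for d in range(dim)]
    return pooled, weights


def classify_bag(bag, w, class_vectors):
    """Return the index of the class vector closest (by dot product) to the pooled bag."""
    pooled, _ = attention_pool(bag, w)
    logits = [sum(p * c for p, c in zip(pooled, cv)) for cv in class_vectors]
    return max(range(len(logits)), key=logits.__getitem__)


# Toy slide: 6 patches with 2-D embeddings (made-up numbers, not real features).
slide = [[0.1, 0.9], [0.2, 0.8], [0.9, 0.1], [0.0, 1.0], [0.3, 0.7], [0.8, 0.2]]
subset = sample_instances(slide, k=4, seed=0)
pooled, weights = attention_pool(subset, w=[0.0, 1.0])
label = classify_bag(subset, w=[0.0, 1.0], class_vectors=[[1.0, 0.0], [0.0, 1.0]])
```

In a vision-language setting, the class vectors would typically be text embeddings of class descriptions rather than hand-written vectors; that substitution is an assumption of this sketch and is not taken from the paper.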

Code & Models

Repositories

ls1rius/wsi_five (PyTorch, official)

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Machine Learning and Data Classification