Generalizable Whole Slide Image Classification with Fine-Grained Visual-Semantic Interaction
Hao Li, Ying Chen, Yifei Chen, Wenxian Yang, Bowen Ding, Yuchen Han, Liansheng Wang, Rongshan Yu

TL;DR
This paper introduces the FiVE framework, which enhances whole slide image classification by leveraging fine-grained visual-semantic interactions and localized pathological descriptions, improving generalization and transferability across diverse tasks.
Contribution
The paper proposes a novel FiVE framework that uses fine-grained visual-semantic interaction and language-driven descriptions to improve WSI classification generalization.
Findings
Outperforms existing methods on the TCGA Lung Cancer dataset, with at least 9.19% higher accuracy in few-shot experiments.
Utilizes language models to extract detailed pathological descriptions from reports.
Sampling of visual instances improves training efficiency and robustness.
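As a rough illustration of the instance-sampling finding, the sketch below draws a fixed-size subset of patch instances from a slide's bag before each training step. It assumes uniform random sampling without replacement; this summary does not say which sampling strategy the paper uses, and the names `sample_instances`, `bag`, and `k` are hypothetical.

```python
import random
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def sample_instances(bag: Sequence[T], k: int, seed: Optional[int] = None) -> List[T]:
    """Return up to k instances drawn uniformly without replacement from a bag.

    ASSUMPTION: uniform sampling is an illustrative choice, not necessarily
    the strategy used by FiVE.
    """
    rng = random.Random(seed)
    if k >= len(bag):
        # Small bags are returned whole, in their original order.
        return list(bag)
    return rng.sample(bag, k)


# Hypothetical bag: identifiers for 10,000 patches cut from one slide.
bag = [f"patch_{i}" for i in range(10000)]
subset = sample_instances(bag, k=512, seed=0)
```

Since a slide can yield many thousands of patches, training on a subset like this keeps the per-step cost bounded regardless of slide size, and drawing a fresh subset each step exposes the model to different views of the same slide.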
Abstract
Whole Slide Image (WSI) classification is often formulated as a Multiple Instance Learning (MIL) problem. Recently, Vision-Language Models (VLMs) have demonstrated remarkable performance in WSI classification. However, existing methods leverage coarse-grained pathogenetic descriptions for visual representation supervision, which are insufficient to capture the complex visual appearance of pathogenetic images, hindering the generalizability of models on diverse downstream tasks. Additionally, processing high-resolution WSIs can be computationally expensive. In this paper, we propose a novel "Fine-grained Visual-Semantic Interaction" (FiVE) framework for WSI classification. It is designed to enhance the model's generalizability by leveraging the interaction between localized visual patterns and fine-grained pathological semantics. Specifically, with meticulously designed queries, we start…
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques · Machine Learning and Data Classification
