GNN-ViTCap: GNN-Enhanced Multiple Instance Learning with Vision Transformers for Whole Slide Image Classification and Captioning
S M Taslim Uddin Raju, Md. Milon Islam, Md Rezwanul Haque, Hamdi Altaheri, and Fakhri Karray

TL;DR
GNN-ViTCap is a framework that combines graph neural networks and vision transformers for whole slide image classification and captioning, targeting two problems in histopathology analysis: redundant patches and unknown patch positions.
Contribution
The paper presents a method that integrates GNNs with vision transformers for whole slide image classification and captioning in histopathology, and reports outperforming existing approaches.
Findings
Achieved F1 score of 0.934 in classification
Attained BLEU-4 score of 0.811 in captioning
Outperformed state-of-the-art methods on the evaluated datasets
Abstract
Microscopic assessment of histopathology images is vital for accurate cancer diagnosis and treatment. Whole Slide Image (WSI) classification and captioning have become crucial tasks in computer-aided pathology. However, microscopic WSIs face challenges such as redundant patches and unknown patch positions due to subjective pathologist captures. Moreover, generating automatic pathology captions remains a significant challenge. To address these issues, we introduce a novel GNN-ViTCap framework for classification and caption generation from histopathological microscopic images. First, a visual feature extractor generates patch embeddings. Redundant patches are then removed by dynamically clustering these embeddings using deep embedded clustering and selecting representative patches via a scalar dot attention mechanism. We build a graph by connecting each node to its nearest neighbors in the…
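The patch-selection and graph-construction steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Several pieces are assumptions made for illustration: cluster labels are taken as given (the paper uses deep embedded clustering, which is not reproduced here), the attention score is approximated by a scaled dot product between each patch and its cluster mean, and neighbors are found with Euclidean distance because the excerpt is cut off before it names the similarity measure. The function names `select_representatives` and `knn_adjacency` are hypothetical.

```python
import numpy as np


def select_representatives(emb, labels):
    """Keep one patch per cluster.

    Assumption: the representative is the patch whose embedding has the
    highest scaled dot product with its cluster mean. This stands in for
    the paper's scalar dot attention, whose exact form is not given here.
    """
    d = emb.shape[1]
    keep = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)          # patches in cluster c
        centroid = emb[idx].mean(axis=0)           # cluster mean embedding
        scores = emb[idx] @ centroid / np.sqrt(d)  # scaled dot-product score
        keep.append(idx[np.argmax(scores)])
    return np.array(sorted(keep))


def knn_adjacency(emb, k):
    """Symmetric 0/1 adjacency linking each node to its k nearest neighbors.

    Assumption: Euclidean distance in embedding space; requires k < n.
    """
    n = emb.shape[0]
    diff = emb[:, None, :] - emb[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)                 # exclude self-loops
    nbrs = np.argsort(dist, axis=1)[:, :k]         # k closest nodes per row
    adj = np.zeros((n, n), dtype=int)
    rows = np.repeat(np.arange(n), k)
    adj[rows, nbrs.ravel()] = 1
    return np.maximum(adj, adj.T)                  # make the graph undirected


# Toy data: 12 patch embeddings of dimension 8, pre-assigned to 4 clusters.
rng = np.random.default_rng(0)
emb = rng.normal(size=(12, 8))
labels = np.arange(12) % 4

reps = select_representatives(emb, labels)
adj = knn_adjacency(emb[reps], k=2)
```

In a full pipeline the resulting adjacency matrix would be passed, together with the selected patch embeddings, to a graph neural network; that stage is omitted here.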
Taxonomy
Topics: Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis · AI in cancer detection
Methods: Graph Neural Network · Linear Layer
