Language models are good pathologists: using attention-based sequence reduction and text-pretrained transformers for efficient WSI classification
Juan I. Pisula, Katarzyna Bozek

TL;DR
This paper introduces SeqShort, a sequence shortening layer for transformer-based WSI classification that reduces computational costs and improves performance by leveraging text-pretrained transformers with minimal fine-tuning.
Contribution
The paper proposes SeqShort for efficient WSI processing and demonstrates that pre-trained text transformers can be effectively adapted for pathology image classification.
Findings
SeqShort reduces memory and compute requirements for WSI classification.
Pre-trained text transformers improve WSI classification accuracy.
Fine-tuning less than 0.1% of the pre-trained transformer's parameters suffices for effective model adaptation.
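To give a feel for how small a sub-0.1% trainable share is, the following sketch counts the parameters of a hypothetical 12-layer transformer encoder with hidden size 768 and assumes that only the layer-normalization weights are left trainable. Both the model size and the choice of which parameters to train are illustrative assumptions; this summary does not state which parameters the authors fine-tune.

```python
def transformer_param_counts(d_model: int, num_layers: int) -> dict:
    """Count parameters per group in a standard transformer encoder stack.

    Assumes the usual layout (hypothetical, not taken from the paper):
    four d x d attention projections with biases, an MLP with a 4x hidden
    expansion, and two layer norms per layer.
    """
    attention = 4 * (d_model * d_model + d_model)            # Q, K, V, output
    mlp = 2 * (d_model * 4 * d_model) + 4 * d_model + d_model  # two linear layers
    layer_norm = 2 * (2 * d_model)                            # scale and shift, twice
    return {
        "attention": attention * num_layers,
        "mlp": mlp * num_layers,
        "layer_norm": layer_norm * num_layers,
    }


counts = transformer_param_counts(d_model=768, num_layers=12)

# Assumption for illustration: freeze everything except the layer norms.
trainable = counts["layer_norm"]
total = sum(counts.values())
fraction = trainable / total

print(f"{trainable} of {total} parameters trainable ({fraction:.4%})")
```

Under these assumptions the trainable share stays below 0.1%, which is the order of magnitude the finding refers to. Any task-specific input or output layers added around the frozen stack would be counted separately.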
Abstract
In digital pathology, Whole Slide Image (WSI) analysis is usually formulated as a Multiple Instance Learning (MIL) problem. Although transformer-based architectures have been used for WSI classification, these methods require modifications to adapt them to specific challenges of this type of image data. Among these challenges is the amount of memory and compute required by deep transformer models to process long inputs, such as the thousands of image patches that can compose a WSI at or magnification. We introduce SeqShort, a multi-head attention-based sequence shortening layer to summarize each WSI in a fixed- and short-sized sequence of instances, that allows us to reduce the computational costs of self-attention on long sequences, and to include positional information that is unavailable in other MIL approaches. Furthermore, we show that WSI…
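The sequence shortening idea described in the abstract can be sketched as multi-head cross-attention in which a small, fixed set of query vectors attends over all patch embeddings of a slide. The code below is a minimal NumPy sketch of that general technique, not the authors' SeqShort implementation: the function name `shorten_sequence`, the random weights standing in for learned parameters, and the sizes used are all assumptions, and positional information is omitted.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def shorten_sequence(patches, queries, w_q, w_k, w_v, w_o, num_heads):
    """Summarize N patch embeddings into S vectors with cross-attention.

    patches: (N, d) patch features; queries: (S, d) query vectors
    (learned in a real model, random here); w_*: (d, d) projections.
    """
    n, d = patches.shape
    s = queries.shape[0]
    head_dim = d // num_heads

    # Project, then split the feature axis into heads: (heads, length, head_dim).
    q = (queries @ w_q).reshape(s, num_heads, head_dim).transpose(1, 0, 2)
    k = (patches @ w_k).reshape(n, num_heads, head_dim).transpose(1, 0, 2)
    v = (patches @ w_v).reshape(n, num_heads, head_dim).transpose(1, 0, 2)

    # Each of the S queries attends over all N patches, so the score matrix
    # has S * N entries per head instead of the N * N of self-attention.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(head_dim)  # (heads, S, N)
    attn = softmax(scores, axis=-1)
    out = attn @ v                                          # (heads, S, head_dim)

    # Merge the heads back into (S, d) and apply the output projection.
    out = out.transpose(1, 0, 2).reshape(s, d)
    return out @ w_o, attn


rng = np.random.default_rng(0)
d, num_heads, s = 64, 4, 16  # hypothetical sizes
queries = rng.normal(size=(s, d))
w_q, w_k, w_v, w_o = (rng.normal(size=(d, d)) / np.sqrt(d) for _ in range(4))

# Two slides with different numbers of patches yield outputs of the same length.
short_a, attn_a = shorten_sequence(
    rng.normal(size=(3000, d)), queries, w_q, w_k, w_v, w_o, num_heads)
short_b, attn_b = shorten_sequence(
    rng.normal(size=(500, d)), queries, w_q, w_k, w_v, w_o, num_heads)
print(short_a.shape, short_b.shape)
```

Because the output length depends only on the number of queries, a downstream transformer always sees a short sequence of fixed length regardless of how many patches the slide contains.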
Taxonomy
Topics: AI in cancer detection · Radiomics and Machine Learning in Medical Imaging · Biomedical Text Mining and Ontologies
