Improving mitosis detection on histopathology images using large vision-language models
Ruiwen Ding, James Hall, Neil Tenenholtz, Kristen Severson

TL;DR
This paper improves mitosis detection in histopathology images by using pre-trained large vision-language models that combine visual features with natural language. Detection is formulated as image captioning and visual question answering (VQA) tasks, and the approach outperforms baseline models.
Contribution
It introduces an approach that applies pre-trained vision-language models to mitosis detection, supplying metadata such as tumor and scanner types as context to improve accuracy.
Findings
Improved detection accuracy over baseline models
Effective use of metadata like tumor and scanner types
Demonstrated on the MIDOG22 dataset with a large number of samples
Abstract
In certain types of cancerous tissue, mitotic count has been shown to be associated with tumor proliferation, poor prognosis, and therapeutic resistance. Due to the high inter-rater variability of mitotic counting by pathologists, convolutional neural networks (CNNs) have been employed to reduce the subjectivity of mitosis detection in hematoxylin and eosin (H&E)-stained whole slide images. However, most existing models have performance that lags behind expert panel review and only incorporate visual information. In this work, we demonstrate that pre-trained large-scale vision-language models that leverage both visual features and natural language improve mitosis detection accuracy. We formulate the mitosis detection task as an image captioning task and a visual question answering (VQA) task by including metadata such as tumor and scanner types as context. The effectiveness of our…
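The captioning and VQA formulations described in the abstract can be pictured as text templates that fold metadata into the model's input. The following is a minimal sketch under stated assumptions, not the authors' implementation: the template wording, the `PatchMetadata` class, the helper names, and the example metadata values are all hypothetical, and the truncated abstract does not specify the actual prompts or models used.

```python
from dataclasses import dataclass


@dataclass
class PatchMetadata:
    """Hypothetical container for per-patch context (tumor and scanner type)."""
    tumor_type: str
    scanner_type: str


def build_caption(meta: PatchMetadata, is_mitotic: bool) -> str:
    """Captioning formulation: the label is embedded in a target caption.

    The template wording is an assumption made for illustration.
    """
    label = "a mitotic figure" if is_mitotic else "a non-mitotic figure"
    return (
        f"An H&E image patch of {meta.tumor_type} "
        f"scanned with {meta.scanner_type} showing {label}."
    )


def build_vqa_example(meta: PatchMetadata, is_mitotic: bool) -> tuple[str, str]:
    """VQA formulation: metadata goes in the question, the label is the answer."""
    question = (
        f"This H&E image patch is from {meta.tumor_type} "
        f"scanned with {meta.scanner_type}. Does it show a mitotic figure?"
    )
    answer = "yes" if is_mitotic else "no"
    return question, answer


def answer_to_label(generated_text: str) -> bool:
    """Map a model's free-text answer back to a binary mitosis label."""
    return generated_text.strip().lower().startswith("yes")
```

In a setup like this, each candidate image patch would be paired with one of these strings when fine-tuning or querying a vision-language model, and `answer_to_label` would convert the generated text into a detection decision.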
Taxonomy
Topics: AI in cancer detection · Multimodal Machine Learning Applications · Digital Imaging for Blood Diseases
