Automated Histopathology Report Generation via Pyramidal Feature Extraction and the UNI Foundation Model
Ahmet Halici, Ece Tugba Cebeci, Musa Balci, Mustafa Cini, and Serkan Sokmen

TL;DR
This paper introduces a hierarchical vision-language framework for automated histopathology report generation from gigapixel whole slide images, combining a foundation model, multi-resolution patch selection, and retrieval verification to enhance accuracy and reliability.
Contribution
It presents a novel multi-resolution patch selection and retrieval-based verification method integrated with a foundation model for improved report generation from WSIs.
Findings
Effective multi-resolution patch selection reduces computational load.
Retrieval verification improves report accuracy.
The framework outperforms existing methods in diagnostic report quality.
Abstract
Generating diagnostic text from histopathology whole slide images (WSIs) is challenging due to the gigapixel scale of the input and the requirement for precise, domain-specific language. We propose a hierarchical vision-language framework that combines a frozen pathology foundation model with a Transformer decoder for report generation. To make WSI processing tractable, we perform multi-resolution pyramidal patch selection (downsampling factors 2^3 to 2^6) and remove background and artifacts using Laplacian-variance and HSV-based criteria. Patch features are extracted with the UNI Vision Transformer and projected into a 6-layer Transformer decoder that generates diagnostic text via cross-attention. To better represent biomedical terminology, we tokenize the output using the BioGPT tokenizer. Finally, we add a retrieval-based verification step that compares generated reports with a reference corpus…
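The patch-filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 4-neighbour Laplacian kernel, and all three thresholds (`min_lap_var`, `min_sat`, `min_tissue_frac`) are assumptions chosen for illustration. Only the downsampling factors 2^3 to 2^6 and the use of Laplacian variance and HSV criteria come from the abstract.

```python
import numpy as np

# Downsampling factors stated in the abstract: 2^3 through 2^6.
DOWNSAMPLE_FACTORS = [2 ** k for k in range(3, 7)]


def laplacian_variance(gray):
    """Variance of a 4-neighbour Laplacian response (assumed kernel).

    Low values suggest a blurred or featureless patch."""
    g = gray.astype(np.float64)
    lap = (g[:-2, 1:-1] + g[2:, 1:-1] + g[1:-1, :-2] + g[1:-1, 2:]
           - 4.0 * g[1:-1, 1:-1])
    return float(lap.var())


def saturation(rgb):
    """HSV saturation channel in [0, 1], computed from an RGB uint8 array."""
    x = rgb.astype(np.float64) / 255.0
    mx = x.max(axis=-1)
    mn = x.min(axis=-1)
    # Guard against division by zero for pure-black pixels.
    return np.where(mx > 0, (mx - mn) / np.maximum(mx, 1e-12), 0.0)


def keep_patch(rgb, min_lap_var=50.0, min_sat=0.05, min_tissue_frac=0.25):
    """Return True if the patch looks like in-focus tissue.

    All thresholds are hypothetical placeholders, not values from the paper."""
    gray = rgb.astype(np.float64).mean(axis=-1)
    tissue_frac = float((saturation(rgb) > min_sat).mean())
    sharp_enough = laplacian_variance(gray) >= min_lap_var
    return bool(sharp_enough and tissue_frac >= min_tissue_frac)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    white = np.full((64, 64, 3), 255, dtype=np.uint8)  # blank glass
    noise = rng.integers(-40, 41, size=(64, 64, 3))
    textured = (np.array([200, 100, 150]) + noise).astype(np.uint8)  # pink, textured
    print(DOWNSAMPLE_FACTORS)
    print(keep_patch(white), keep_patch(textured))
```

In this sketch a patch is kept only if it passes both tests, so a uniformly white background patch is rejected on both sharpness and saturation, while a saturated, textured patch passes. In practice the same check would be applied to patches read at each of the four pyramid levels before feature extraction.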
Taxonomy
Topics: AI in cancer detection · Multimodal Machine Learning Applications · Topic Modeling
