PathReasoning: A multimodal reasoning agent for query-based ROI navigation on whole-slide images
Kunpeng Zhang, Hanwen Xu, Sheng Wang

TL;DR
PathReasoning is a multimodal reasoning agent that navigates gigapixel whole-slide images by iterative question-guided reasoning, significantly improving ROI selection and diagnostic report accuracy without dense annotations.
Contribution
It introduces a novel multimodal reasoning framework that mimics how pathologists navigate slides, enabling efficient, interpretable, and annotation-free ROI identification in large-scale pathology images.
Findings
Outperforms existing ROI-selection methods by 6.7% and 3.1% AUROC on key tasks.
Achieves 10% higher accuracy in breast cancer report generation compared to GPT-4o.
Effectively constructs interpretable reasoning chains for digital pathology analysis.
Abstract
Deciphering the tumor microenvironment from whole-slide images (WSIs) is of great interest because it is key to cancer diagnosis, prognosis, and treatment response. While these gigapixel images offer a comprehensive portrait of cancer, their extreme size, which can exceed 10 billion pixels, makes it challenging and time-consuming to navigate to the regions relevant to diverse clinical inspections. Inspired by pathologists, who navigate WSIs through a combination of sampling, reasoning, and self-reflection, we propose "PathReasoning", a multimodal reasoning agent that iteratively navigates across WSIs through multiple rounds of reasoning and refinement. Specifically, starting with randomly sampled candidate regions, PathReasoning reviews current selections with self-reflection, reasoning over the correspondence between visual observations and…
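The sample, review, and refine loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract is truncated and gives no algorithmic detail beyond random initial sampling followed by rounds of self-reflection and refinement. Everything named here is an assumption made for illustration, including the tile grid, the `navigate` function, its parameters, and especially `score_fn`, which is a hypothetical stand-in for whatever multimodal model judges how well a region matches the query.

```python
import random
from typing import Callable, List, Tuple

Region = Tuple[int, int]  # (row, col) of a tile on a hypothetical grid over the slide


def navigate(
    grid_shape: Tuple[int, int],
    query: str,
    score_fn: Callable[[Region, str], float],
    n_candidates: int = 4,
    n_rounds: int = 5,
    seed: int = 0,
) -> List[Region]:
    """Toy query-guided ROI search: random start, then repeated review and refinement."""
    rng = random.Random(seed)
    rows, cols = grid_shape
    unvisited = [(r, c) for r in range(rows) for c in range(cols)]
    rng.shuffle(unvisited)

    # Start from randomly sampled candidate regions.
    selected = [unvisited.pop() for _ in range(min(n_candidates, len(unvisited)))]

    for _ in range(n_rounds):
        if not unvisited:
            break  # every tile has already been inspected
        # Propose fresh, not-yet-inspected regions for this round.
        proposals = [unvisited.pop() for _ in range(min(n_candidates, len(unvisited)))]
        # "Review" step (assumed): re-score old and new regions against the query
        # and keep only the best n_candidates.
        pooled = selected + proposals
        pooled.sort(key=lambda region: score_fn(region, query), reverse=True)
        selected = pooled[:n_candidates]

    # Return the final selection, best match first.
    return sorted(selected, key=lambda region: score_fn(region, query), reverse=True)


def toy_score(region: Region, query: str) -> float:
    """Hypothetical scorer: closer to a hidden target tile means more relevant."""
    target = (7, 3)
    return -float(abs(region[0] - target[0]) + abs(region[1] - target[1]))


rois = navigate((10, 10), "where is the invasive front?", toy_score, n_rounds=30)
print(rois)
```

With a 10 x 10 grid, 4 candidates per round, and 30 rounds, the toy loop inspects every tile, so the top-ranked region is the hidden target `(7, 3)`. A real agent would stop far earlier and would replace `toy_score` with model-based reasoning over image content and the question text.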
Taxonomy
Topics: AI in cancer detection · Multimodal Machine Learning Applications · Generative Adversarial Networks and Image Synthesis
