PRS-Med: Position Reasoning Segmentation in Medical Imaging
Quoc-Huy Trinh, Minh-Van Nguyen, Jun Zeng, Debesh Jha, Ulas Bagci

TL;DR
PRS-Med introduces a clinical-first, position reasoning segmentation framework for medical imaging, leveraging a vision-language model and a large expert-validated dataset to improve accuracy and interpretability.
Contribution
It presents a novel, scalable approach combining a vision-language model with a new spatial reasoning dataset, enhancing medical image segmentation accuracy and clinical reliability.
Findings
Segmentation accuracy improves by up to +31.2% in Dice score.
The PosMed dataset contains 116,000 expert-validated spatial QA pairs.
PRS-Med outperforms state-of-the-art models in clinical reasoning and interpretability.
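For readers unfamiliar with the metric behind the first finding, the Dice score measures overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard formula in plain Python. The function name `dice_score` and the convention of returning 1.0 for two empty masks are assumptions made for illustration, not details taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as nested lists.

    Illustrative helper (not from the paper): any truthy value counts as
    foreground, and both masks must contain the same number of pixels.
    """
    flat_pred = [int(bool(v)) for row in pred for v in row]
    flat_target = [int(bool(v)) for row in target for v in row]
    if len(flat_pred) != len(flat_target):
        raise ValueError("masks must have the same number of pixels")

    # |A ∩ B|: pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(flat_pred, flat_target))
    # |A| + |B|: total foreground pixels across the two masks.
    total = sum(flat_pred) + sum(flat_target)

    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(dice_score(predicted, reference))
```

In this toy case the masks share one foreground pixel out of three in total, so the score is 2 × 1 / 3.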
Abstract
Prompt-based medical image segmentation has rapidly emerged, yet existing methods rely on explicit prompts like bounding boxes and struggle to reason about the spatial relationships essential for clinical diagnosis. While general-domain models attempt complex coordinate regression, these approaches often lack the structured reliability required for medical applications. In this work, we introduce PRS-Med, a unified framework that adopts an elegant, clinical-first approach to position reasoning segmentation. By utilizing a medical vision-language model integrated with a segmentation decoder, PRS-Med mimics the structured "search patterns" used by radiologists to identify pathologies within specific anatomical zones. To support this robust reasoning, we present the Medical Position Reasoning Segmentation (PosMed) dataset, comprising 116,000 expert-validated, spatially grounded…
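To make the idea of a spatially grounded question-answer pair concrete, here is a small hypothetical sketch that derives a coarse position phrase from a binary mask and fills a QA template. The 3×3 zone grid, the function names `describe_position` and `make_qa_pair`, and the template wording are all assumptions for illustration. The abstract does not describe how PosMed pairs were actually constructed, and this should not be read as the authors' pipeline.

```python
def describe_position(mask):
    """Map the centroid of a binary mask to a coarse 3x3 image zone.

    Hypothetical helper: zones are in image coordinates, so "left" means
    the left side of the image, not the patient's left.
    """
    rows, cols = len(mask), len(mask[0])
    points = [(r, c) for r, row in enumerate(mask)
              for c, v in enumerate(row) if v]
    if not points:
        return None  # nothing segmented, so no position to describe

    # Centroid of foreground pixels, measured at pixel centers.
    cy = sum(r for r, _ in points) / len(points) + 0.5
    cx = sum(c for _, c in points) / len(points) + 0.5

    vertical = ["top", "middle", "bottom"][min(int(3 * cy / rows), 2)]
    horizontal = ["left", "center", "right"][min(int(3 * cx / cols), 2)]

    if vertical == "middle" and horizontal == "center":
        return "center"
    return f"{vertical} {horizontal}"


def make_qa_pair(mask, target_name):
    """Fill an assumed question/answer template from the mask position."""
    position = describe_position(mask)
    if position is None:
        answer = f"No {target_name} is visible in the image."
    else:
        answer = f"The {target_name} is located in the {position} of the image."
    return {
        "question": f"Where is the {target_name} located in the image?",
        "answer": answer,
    }


if __name__ == "__main__":
    mask = [[0] * 6 for _ in range(6)]
    mask[4][0] = mask[5][1] = 1  # small region near the bottom-left corner
    print(make_qa_pair(mask, "tumor"))
```

A fixed grid is the simplest possible zoning scheme. A clinically meaningful version would presumably reference anatomical regions rather than image thirds.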
