RAP: Retrieve, Adapt, and Prompt-Fit for Training-Free Few-Shot Medical Image Segmentation
Zhihao Mao, Bangpu Chen

TL;DR
RAP is a training-free framework that leverages retrieval, structural adaptation, and prompt-based refinement of SAM2 for improved few-shot medical image segmentation, exploiting anatomical consistency across patients.
Contribution
RAP introduces a training-free approach to few-shot medical image segmentation (FSMIS) that combines retrieval, boundary-aware adaptation, and prompt-based refinement, and it is reported to outperform prior methods.
Findings
RAP outperforms existing FSMIS methods on multiple benchmarks.
It achieves state-of-the-art performance without training or fine-tuning.
Explicit structural fitting enhances robustness under domain shifts.
Abstract
Few-shot medical image segmentation (FSMIS) has achieved notable progress, yet most existing methods mainly rely on semantic correspondences from scarce annotations while under-utilizing a key property of medical imagery: anatomical targets exhibit repeatable high-frequency morphology (e.g., boundary geometry and spatial layout) across patients and acquisitions. We propose RAP, a training-free framework that retrieves, adapts, and prompts Segment Anything Model 2 (SAM2) for FSMIS. First, RAP retrieves morphologically compatible supports from an archive using DINOv3 features to reduce brittleness in single-support choice. Second, it adapts the retrieved support mask to the query by fitting boundary-aware structural cues, yielding an anatomy-consistent pre-mask under domain shifts. Third, RAP converts the pre-mask into prompts by sampling positive points via Voronoi partitioning and…
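The retrieval step described in the abstract can be illustrated with a minimal sketch. It assumes each archive image and the query have already been reduced to one global feature vector (for example by a DINOv3 encoder, which is not called here), and that supports are ranked by cosine similarity. The function name `retrieve_supports`, the parameter `k`, and the toy vectors are hypothetical; the abstract does not state which similarity measure or how many supports RAP actually uses.

```python
import numpy as np


def retrieve_supports(query_feat, archive_feats, k=3):
    """Rank archive entries by cosine similarity to the query (assumed metric).

    query_feat:    (D,) feature vector of the query image.
    archive_feats: (N, D) feature vectors of candidate support images.
    Returns the indices of the top-k entries and their similarities.
    """
    eps = 1e-12  # guards against division by zero for all-zero vectors
    q = query_feat / (np.linalg.norm(query_feat) + eps)
    a = archive_feats / (np.linalg.norm(archive_feats, axis=1, keepdims=True) + eps)
    sims = a @ q                      # cosine similarity per archive entry
    k = min(k, sims.shape[0])         # never request more than the archive holds
    top = np.argsort(-sims)[:k]       # highest similarity first
    return top, sims[top]


# Toy stand-ins for precomputed features (not real DINOv3 outputs).
archive = np.array([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [-1.0, 0.0]])
query = np.array([1.0, 0.05])
idx, scores = retrieve_supports(query, archive, k=2)
```

Retrieving several candidates rather than a single fixed support is one way to reduce the brittleness of single-support choice that the abstract mentions; the retrieved masks would then feed the adaptation and prompting stages, which are not sketched here.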
