SegRAG: Training-Free Retrieval-Augmented Semantic Segmentation

Abderrahmene Boudiaf; Irfan Hussain; Sajid Javed

arXiv:2605.17630·cs.CV·May 21, 2026

TL;DR

SegRAG is a training-free, retrieval-augmented segmentation framework that grounds open-vocabulary models such as SAM3 with class-specific point prompts drawn from a feature bank, improving mIoU over the text-only baseline on four benchmarks.

Contribution

SegRAG introduces a retrieval-augmented approach built on two techniques, Intra-Class Cohesion Distillation (ICCD) and Topographic Similarity Grounding (TSG), enabling training-free, zero-shot domain transfer for semantic segmentation.

Findings

01  Outperforms the text-only baseline on four benchmarks, by up to +3.92 mIoU on LVIS.

02  Significantly improves zero-shot domain transfer, raising mean IoU from 25.27 to 59.24.

03  Ablation studies confirm each component's contribution to overall performance.

Abstract

Open-vocabulary segmentation models such as SAM3 perform well across broad categories via text prompting, yet degrade when target classes are visually underrepresented in pretraining or depart from canonical depictions. These are limitations that text prompts cannot resolve spatially. We present SegRAG, a training-free retrieval-augmented segmentation framework that grounds SAM3 with class-specific point prompts derived from a curated DINOv3 feature bank. Offline, dense patch-level descriptors are extracted from annotated references and filtered by Intra-Class Cohesion Distillation (ICCD), retaining only prototypes that reliably retrieve within-class foreground. At inference, Topographic Similarity Grounding (TSG) computes a cosine-similarity landscape against retrieved prototypes, identifies coherent high-confidence regions via connected-component analysis, and extracts peak locations through…
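
The two stages described in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names, the nearest-neighbour purity test standing in for ICCD, the max over prototypes, the threshold, the minimum region size, and the choice of the highest-similarity patch as each region's peak are all assumptions made for illustration. The abstract is truncated at the point where it describes how peaks are extracted. Toy 2-D vectors stand in for DINOv3 patch descriptors.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na > 0 and nb > 0 else 0.0


def iccd_filter(feats, labels, target, k=2, min_purity=1.0):
    """Hypothetical stand-in for ICCD: keep a target-class descriptor only if
    at least `min_purity` of its k nearest neighbours share its class."""
    kept = []
    for i, f in enumerate(feats):
        if labels[i] != target:
            continue
        ranked = sorted(
            ((cosine(f, g), labels[j]) for j, g in enumerate(feats) if j != i),
            key=lambda t: t[0],
            reverse=True,
        )[:k]
        purity = sum(1 for _, lab in ranked if lab == target) / len(ranked)
        if purity >= min_purity:
            kept.append(f)
    return kept


def similarity_map(patch_feats, prototypes):
    """Per-patch similarity landscape; taking the max over prototypes is an assumption."""
    return [[max(cosine(p, q) for q in prototypes) for p in row] for row in patch_feats]


def peak_points(sim, threshold=0.8, min_size=2):
    """Threshold the map, group 4-connected patches with a BFS, and return the
    highest-similarity (row, col) of every region with at least `min_size` patches."""
    h, w = len(sim), len(sim[0])
    seen = [[False] * w for _ in range(h)]
    peaks = []
    for r in range(h):
        for c in range(w):
            if seen[r][c] or sim[r][c] < threshold:
                continue
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and not seen[ny][nx] and sim[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_size:
                peaks.append(max(region, key=lambda p: sim[p[0]][p[1]]))
    return peaks


# Toy reference bank: three clean foreground descriptors, two background,
# and one "fg"-labelled descriptor that actually looks like background.
bank = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0], [0.1, 0.9], [0.3, 0.7]]
bank_labels = ["fg", "fg", "fg", "bg", "bg", "fg"]
prototypes = iccd_filter(bank, bank_labels, target="fg")

# Toy 3x4 query image: a two-patch foreground blob plus one isolated hit.
FG, BG = [1.0, 0.0], [0.0, 1.0]
query = [
    [BG, BG, BG, BG],
    [BG, [0.8, 0.2], FG, BG],
    [BG, BG, BG, FG],
]
peaks = peak_points(similarity_map(query, prototypes))
print(len(prototypes), peaks)
```

In this toy run the ambiguous descriptor is filtered out of the bank, the isolated single-patch match is discarded as incoherent, and the remaining region yields one point. In a full pipeline such points would be mapped from patch indices to pixel coordinates and passed to the segmentation model as point prompts; that step is omitted here.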

Code & Models

Repositories

boudiafA/SegRAG (GitHub)
