Multi-Text Guided Few-Shot Semantic Segmentation

Qiang Jiao; Bin Yan; Yi Yang; Mengrui Shi; Qiang Zhang

arXiv:2511.15515 · cs.CV · November 20, 2025

TL;DR

This paper introduces MTGNet, a dual-branch framework that fuses multiple textual prompts and enhances cross-modal interaction to improve few-shot semantic segmentation, especially for complex categories with high intra-class variation.

Contribution

The paper proposes a novel multi-text guided framework with modules for textual prior refinement, semantic anchor fusion, and visual prior enhancement, addressing limitations of single-prompt methods.
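To make the single-prompt limitation concrete, the following is a minimal sketch of how several prompt embeddings could each produce a similarity map over image features and then be fused into one prior. It is an illustration only, not the paper's refinement or fusion modules: the function names, the toy one-hot "embeddings" (stand-ins for real CLIP text and image features), and the choice of element-wise max or mean as the fusion rule are all assumptions made here.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to (approximately) unit length along `axis`."""
    return x / (np.linalg.norm(x, axis=axis, keepdims=True) + eps)


def text_prior_maps(pixel_feats, text_embeds):
    """Cosine similarity between every pixel feature and every prompt embedding.

    pixel_feats: (H, W, D) array; text_embeds: (K, D) array.
    Returns a (K, H, W) array, one similarity map per prompt.
    """
    p = l2_normalize(pixel_feats)
    t = l2_normalize(text_embeds)
    return np.einsum("hwd,kd->khw", p, t)


def fuse_priors(maps, mode="max"):
    """Fuse K per-prompt maps into one (H, W) prior.

    Max and mean are simple stand-in fusion rules (an assumption),
    not the fusion used in the paper.
    """
    if mode == "max":
        return maps.max(axis=0)
    if mode == "mean":
        return maps.mean(axis=0)
    raise ValueError(f"unknown fusion mode: {mode}")


def coverage(prior, threshold=0.5):
    """Fraction of pixels whose prior value exceeds `threshold`."""
    return float((prior > threshold).mean())


# Toy setup: two hypothetical prompt embeddings describing different
# appearances of the same class, and a 2x4 feature map whose left half
# matches the first prompt and whose right half matches the second.
prompt_embeds = np.eye(4)[:2]
pixel_feats = np.zeros((2, 4, 4))
pixel_feats[:, :2] = prompt_embeds[0]
pixel_feats[:, 2:] = prompt_embeds[1]

maps = text_prior_maps(pixel_feats, prompt_embeds)
single = maps[0]                        # prior from the first prompt only
fused = fuse_priors(maps, mode="max")   # prior from both prompts

print("single-prompt coverage:", coverage(single))
print("fused coverage:", coverage(fused))
```

In this toy case the first prompt alone activates only the half of the object it describes, while the fused prior covers both halves. That is the kind of incomplete activation multi-prompt fusion is meant to reduce.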

Findings

1. Achieves 76.8% mIoU on PASCAL-5i in the 1-shot setting.

2. Attains 57.4% mIoU on COCO-20i in the 1-shot setting.

3. Shows significant improvements on categories with high intra-class variation.
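For readers unfamiliar with the metric, here is a minimal sketch of how mIoU is often computed in few-shot segmentation: intersections and unions of binary masks are accumulated per class over all test episodes, and the per-class IoUs are then averaged. This convention and the `mean_iou` helper are assumptions for illustration; the paper's exact evaluation protocol is not given on this page.

```python
from collections import defaultdict

import numpy as np


def mean_iou(episodes):
    """Mean IoU over classes.

    episodes: iterable of (class_id, pred_mask, gt_mask) tuples, where the
    masks are binary arrays of the same shape.
    """
    inter = defaultdict(int)
    union = defaultdict(int)
    for cls, pred, gt in episodes:
        pred = np.asarray(pred, dtype=bool)
        gt = np.asarray(gt, dtype=bool)
        # Accumulate pixel counts per class across episodes.
        inter[cls] += int(np.logical_and(pred, gt).sum())
        union[cls] += int(np.logical_or(pred, gt).sum())

    # Skip classes with an empty union to avoid dividing by zero.
    ious = [inter[c] / union[c] for c in union if union[c] > 0]
    if not ious:
        raise ValueError("no class has a non-empty union")
    return float(np.mean(ious))


# Two toy episodes: class 0 has IoU 1/2, class 1 is predicted perfectly.
episodes = [
    (0, [[1, 1], [0, 0]], [[1, 0], [0, 0]]),
    (1, [[1, 1], [1, 1]], [[1, 1], [1, 1]]),
]
print(mean_iou(episodes))
```

Because the average is taken over classes rather than over pixels or episodes, a class that is consistently segmented poorly lowers the score as much as any other class, regardless of how often it appears.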

Abstract

Recent CLIP-based few-shot semantic segmentation methods introduce class-level textual priors to assist segmentation by typically using a single prompt (e.g., "a photo of {class}"). However, these approaches often result in incomplete activation of target regions, as a single textual description cannot fully capture the semantic diversity of complex categories. Moreover, they lack explicit cross-modal interaction and are vulnerable to noisy support features, further degrading visual prior quality. To address these issues, we propose the Multi-Text Guided Few-Shot Semantic Segmentation Network (MTGNet), a dual-branch framework that enhances segmentation performance by fusing diverse textual prompts to refine textual priors and guide the cross-modal optimization of visual priors. Specifically, we design a Multi-Textual Prior Refinement (MTPR) module that suppresses interference and aggregates…

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Multimodal Machine Learning Applications