Visual and Textual Prior Guided Mask Assemble for Few-Shot Segmentation and Beyond

Chen Shuai; Meng Fanman; Zhang Runtong; Qiu Heqian; Li Hongliang; Wu Qingbo; Xu Linfeng

arXiv:2308.07539 · cs.CV · August 16, 2023 · 1 citation

TL;DR

This paper introduces PGMA-Net, a novel few-shot segmentation model that leverages visual and textual priors with a class-agnostic mask assembly process, achieving state-of-the-art results and versatility across multiple segmentation tasks.

Contribution

The paper proposes a class-agnostic mask assembly network with diverse, plug-and-play interactions, enabling improved generalization and multi-task capabilities without extra re-training.

Findings

1. Achieves a state-of-the-art mIoU of 77.6 on PASCAL-5^i in the 1-shot setting.

2. Is effective in cross-domain and zero-shot segmentation tasks.

3. Operates without class-specific information and handles these additional tasks without extra re-training.
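For readers unfamiliar with the metric in the first finding, the following is a minimal sketch of how a mean intersection-over-union (mIoU) score can be computed. It assumes one common few-shot segmentation convention: intersections and unions are accumulated per class over all test episodes, and the per-class IoUs are then averaged. The function names and the `(class_id, pred, target)` episode layout are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def mask_counts(pred, target):
    """Return (intersection, union) pixel counts for two binary masks.

    Masks are nested lists of 0/1 values with identical shapes.
    """
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    return inter, union


def mean_iou(episodes):
    """Compute mIoU (as a percentage) over (class_id, pred, target) episodes.

    Assumed convention: accumulate counts per class, then average class IoUs.
    """
    inter_sum = defaultdict(int)
    union_sum = defaultdict(int)
    for class_id, pred, target in episodes:
        inter, union = mask_counts(pred, target)
        inter_sum[class_id] += inter
        union_sum[class_id] += union
    # Skip classes whose union is empty to avoid dividing by zero.
    ious = [inter_sum[c] / union_sum[c] for c in union_sum if union_sum[c] > 0]
    return 100.0 * sum(ious) / len(ious) if ious else 0.0


# Toy usage with two hypothetical classes and 2x2 masks.
episodes = [
    ("cat", [[1, 1], [0, 0]], [[1, 0], [0, 0]]),  # IoU = 1/2
    ("dog", [[1, 0], [0, 0]], [[1, 0], [0, 0]]),  # IoU = 1/1
]
score = mean_iou(episodes)
```

Under this convention a reported value such as 77.6 is the class-averaged IoU expressed as a percentage; other papers may average per image instead, so the exact protocol should be checked against the paper itself.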

Abstract

Few-shot segmentation (FSS) aims to segment novel classes given only a few annotated images. Because CLIP aligns visual and textual information, integrating it can enhance the generalization ability of FSS models. However, even with CLIP, existing CLIP-based FSS methods remain biased towards base classes in their predictions, which is caused by class-specific feature-level interactions. To solve this issue, we propose a visual and textual Prior Guided Mask Assemble Network (PGMA-Net). It employs a class-agnostic mask assembly process to alleviate the bias, and formulates diverse tasks in a unified manner by assembling the prior through affinity. Specifically, the class-relevant textual and visual features are first transformed into class-agnostic priors in the form of probability maps. Then, a Prior-Guided Mask Assemble Module (PGMAM)…
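The two ideas named in the abstract, turning class-relevant features into a class-agnostic probability map and then assembling that prior through affinity, can be illustrated with a toy sketch. This is not the authors' PGMAM, whose details are truncated above. It assumes cosine similarity to a prototype vector (for example a text embedding or a support-mask average) rescaled to [0, 1] as the prior, and a softmax over cosine similarities as the affinity. All function names, the temperature value, and the feature layout are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def prior_map(pixel_feats, prototype):
    """Map each pixel feature to a probability-like score in [0, 1].

    Only this scalar map is passed on, which is what makes the prior
    class-agnostic in this sketch: no class-specific features survive.
    """
    return [(cosine(f, prototype) + 1.0) / 2.0 for f in pixel_feats]


def softmax(scores, temperature=0.1):
    """Numerically stable softmax with an assumed temperature."""
    top = max(scores)
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def assemble(query_feats, source_feats, source_prior, temperature=0.1):
    """For each query pixel, take an affinity-weighted average of the source prior."""
    assembled = []
    for q in query_feats:
        weights = softmax([cosine(q, s) for s in source_feats], temperature)
        assembled.append(sum(w * p for w, p in zip(weights, source_prior)))
    return assembled


# Toy usage: two "pixels" with 2-D features and a hypothetical prototype.
feats = [[1.0, 0.0], [0.0, 1.0]]
prototype = [1.0, 0.0]
prior = prior_map(feats, prototype)
mask_scores = assemble(feats, feats, prior)
```

Because the assembly step only mixes scalar prior values according to pixel-to-pixel affinity, the same routine could in principle take a prior derived from text, from a support mask, or from a bounding box, which is one way to read the abstract's claim of handling diverse tasks in a unified manner.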

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Image Processing Techniques and Applications · Advanced Neural Network Applications

Methods: Contrastive Language-Image Pre-training · Balanced Selection