Bridge the Points: Graph-based Few-shot Segment Anything Semantically

Anqi Zhang, Guangyu Gao, Jianbo Jiao, Chi Harold Liu, and Yunchao Wei

arXiv:2410.06964 · cs.CV · October 14, 2024 · 3 cites

TL;DR

This paper introduces a graph-based method for few-shot semantic segmentation with SAM that improves prompt selection, reduces scenario-specific hyperparameter tuning, and shortens inference time, achieving state-of-the-art performance on multiple datasets.

Contribution

The paper proposes a novel graph analysis approach with modules for prompt selection and mask clustering, significantly improving efficiency and accuracy in few-shot segmentation tasks.

Findings

01. Achieves 58.7% mIoU on COCO-20i dataset

02. Outperforms existing models in efficiency and accuracy

03. Effective in cross-domain and one-shot segmentation scenarios
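The mIoU figure above is a mean intersection-over-union score. As a minimal sketch of how such a score can be computed, the snippet below assumes one common few-shot segmentation convention: intersections and unions are accumulated per class over all test episodes, and the per-class IoUs are then averaged. The function names (`mask_overlap`, `mean_iou`) and the nested-list mask format are illustrative assumptions, not taken from the paper or its repository, and the paper's exact evaluation protocol may differ.

```python
from collections import defaultdict


def mask_overlap(pred, target):
    """Return (intersection, union) pixel counts for two binary masks.

    Masks are nested lists of 0/1 values with identical shapes
    (an illustrative format, assumed for this sketch).
    """
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += int(bool(p) and bool(t))
            union += int(bool(p) or bool(t))
    return inter, union


def mean_iou(episodes):
    """Average per-class IoU over (class_id, pred_mask, target_mask) episodes.

    Assumed convention: intersections and unions are summed per class
    across episodes before dividing, then the class IoUs are averaged.
    """
    inter_sum = defaultdict(int)
    union_sum = defaultdict(int)
    for class_id, pred, target in episodes:
        inter, union = mask_overlap(pred, target)
        inter_sum[class_id] += inter
        union_sum[class_id] += union

    # Skip classes whose union is empty to avoid dividing by zero.
    ious = [inter_sum[c] / union_sum[c] for c in union_sum if union_sum[c] > 0]
    if not ious:
        raise ValueError("no class has a non-empty mask union")
    return sum(ious) / len(ious)
```

For instance, a class with 3 intersecting pixels out of a 4-pixel union (IoU 0.75) and a class with no overlap (IoU 0) average to an mIoU of 0.375.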

Abstract

The recent advancements in large-scale pre-training techniques have significantly enhanced the capabilities of vision foundation models, notably the Segment Anything Model (SAM), which can generate precise masks based on point and box prompts. Recent studies extend SAM to Few-shot Semantic Segmentation (FSS), focusing on prompt generation for SAM-based automatic semantic segmentation. However, these methods struggle with selecting suitable prompts, require specific hyperparameter settings for different scenarios, and experience prolonged one-shot inference times due to the overuse of SAM, resulting in low efficiency and limited automation ability. To address these issues, we propose a simple yet effective approach based on graph analysis. In particular, a Positive-Negative Alignment module dynamically selects the point prompts for generating masks, especially uncovering the potential of…
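The abstract is truncated before it describes how point prompts are chosen, so the following is only a generic illustration of prompt selection by positive and negative similarity, not the paper's Positive-Negative Alignment module. It assumes each target-image patch has a feature vector, and that a foreground prototype and a background prototype have been derived from the support example; `select_point_prompts`, `fg_prototype`, `bg_prototype`, and `max_points` are all hypothetical names introduced for this sketch.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_point_prompts(patch_features, fg_prototype, bg_prototype, max_points=5):
    """Pick patch coordinates that look more like foreground than background.

    patch_features maps (row, col) -> feature vector. A patch is kept only
    when its foreground similarity exceeds its background similarity; the
    kept patches are ranked by that margin. The fixed max_points cap is an
    assumption of this sketch, not something stated in the source.
    """
    scored = []
    for coord, feature in patch_features.items():
        margin = cosine(feature, fg_prototype) - cosine(feature, bg_prototype)
        if margin > 0:
            scored.append((margin, coord))
    scored.sort(reverse=True)  # largest margin first
    return [coord for _, coord in scored[:max_points]]
```

The selected coordinates could then be passed to a promptable segmenter such as SAM as positive point prompts; that step is omitted here because it depends on the model interface in use.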

Code & Models

Repositories

ANDYZAQ/GF-SAM · PyTorch · Official

Videos

Bridge the Points: Graph-based Few-shot Segment Anything Semantically · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Video Analysis and Summarization

Methods: Softmax · Dense Connections · Layer Normalization · Linear Layer · Multi-Head Attention · Residual Connection · Attention Is All You Need · Vision Transformer · self-DIstillation with NO labels · Segment Anything Model