Semantic-aware SAM for Point-Prompted Instance Segmentation

Zhaoyang Wei; Pengfei Chen; Xuehui Yu; Guorong Li; Jianbin Jiao; Zhenjun Han

arXiv:2312.15895·cs.CV·May 28, 2024·1 cites

Open Access · 1 Repo

TL;DR

This paper presents SAPNet, a novel semantic-aware segmentation network that enhances SAM's class-specific segmentation using point prompts, MIL, and strategic proposals, achieving promising results on Pascal VOC and COCO.

Contribution

Introduces SAPNet, integrating MIL and point prompts with SAM for cost-effective, category-specific instance segmentation with novel strategies to address weak supervision challenges.

Findings

01  SAPNet outperforms baseline methods on Pascal VOC and COCO datasets.

02  The proposed strategies effectively mitigate 'group' and 'local' issues in weakly supervised segmentation.

03  Semantic matching capabilities are significantly improved with the new framework.

Abstract

Single-point annotation in visual tasks, with the goal of minimizing labelling costs, is becoming increasingly prominent in research. Recently, visual foundation models, such as Segment Anything (SAM), have gained widespread usage due to their robust zero-shot capabilities and exceptional annotation performance. However, SAM's class-agnostic output and high confidence in local segmentation introduce 'semantic ambiguity', posing a challenge for precise category-specific segmentation. In this paper, we introduce a cost-effective category-specific segmenter using SAM. To tackle this challenge, we have devised a Semantic-Aware Instance Segmentation Network (SAPNet) that integrates Multiple Instance Learning (MIL) with matching capability and SAM with point prompts. SAPNet strategically selects the most representative mask proposals generated by SAM to supervise segmentation, with a specific…
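The proposal-selection idea in the abstract can be illustrated with a minimal sketch. It assumes a generic two-branch MIL scoring scheme (a per-proposal class score weighted by a softmax over the proposals in the bag); this is not SAPNet's actual formulation, and the function names, the logit values, and the three-proposal bag are all hypothetical.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def select_proposal(class_logits, instance_logits, label):
    """Pick the most representative mask proposal in one bag.

    class_logits[i][c]  : hypothetical score of proposal i for category c
    instance_logits[i]  : hypothetical score of how well proposal i
                          represents the prompted object
    label               : index of the category given with the point
    Returns (index of the selected proposal, bag-level score).
    """
    # Classification branch: probability of `label` for each proposal.
    cls_probs = [softmax(row)[label] for row in class_logits]
    # Instance branch: softmax across the proposals in the bag.
    ins_probs = softmax(instance_logits)
    # Per-proposal score is the product of the two branches.
    scores = [c * w for c, w in zip(cls_probs, ins_probs)]
    # Bag score (sum over proposals) is what an image-level MIL loss
    # would compare against the known category label.
    bag_score = sum(scores)
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, bag_score


# One bag: three made-up candidate masks for a single point prompt,
# scored against two categories.
class_logits = [[2.0, 0.1], [0.5, 0.4], [0.2, 1.5]]
instance_logits = [1.0, 0.2, 0.3]
best, bag_score = select_proposal(class_logits, instance_logits, label=0)
```

In a setup like this, the selected proposal would serve as the pseudo-mask that supervises the segmentation branch, while the bag score lets the selector be trained from the category label alone.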

Code & Models

Repositories

zhaoyangwei123/sapnet
PyTorch · Official

Taxonomy

Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Advanced Neural Network Applications

Methods: Focus · Segment Anything Model