Foundation Model Assisted Weakly Supervised Semantic Segmentation

Xiaobo Yang; Xiaojin Gong

arXiv:2312.03585 · cs.CV · December 12, 2023 · 1 citation

TL;DR

This paper introduces a coarse-to-fine framework that uses the foundation models CLIP and SAM to generate high-quality seeds for weakly supervised semantic segmentation, achieving state-of-the-art results on PASCAL VOC 2012.

Contribution

The work proposes a coarse-to-fine seed generation method using frozen foundation models with learnable prompts, improving weakly supervised segmentation performance.

Findings

1. Achieves state-of-the-art results on PASCAL VOC 2012.
2. Achieves competitive results on MS COCO 2014.
3. Generates effective seeds with the CLIP- and SAM-based modules.

Abstract

This work aims to leverage pre-trained foundation models, such as contrastive language-image pre-training (CLIP) and segment anything model (SAM), to address weakly supervised semantic segmentation (WSSS) using image-level labels. To this end, we propose a coarse-to-fine framework based on CLIP and SAM for generating high-quality segmentation seeds. Specifically, we construct an image classification task and a seed segmentation task, which are jointly performed by CLIP with frozen weights and two sets of learnable task-specific prompts. A SAM-based seeding (SAMS) module is designed and applied to each task to produce either coarse or fine seed maps. Moreover, we design a multi-label contrastive loss supervised by image-level labels and a CAM activation loss supervised by the generated coarse seed map. These losses are used to learn the prompts, which are the only parts that need to be…
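The two supervision signals named in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation (the official code is in the HAL-42/FMA-WSSS repository), and the exact form of the paper's losses is not given in the text above. As stand-ins, the sketch assumes a sigmoid binary cross-entropy over image-to-class cosine similarities for the image-level loss, and a pixel-wise cross-entropy against a coarse seed map for the activation loss. The function names, the `temperature` value, and the `ignore` label are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def multilabel_loss(image_emb, class_embs, labels, temperature=0.1):
    """Stand-in image-level loss (assumption, not the paper's formula).

    Each class gets a logit from the cosine similarity between the image
    embedding and that class's text embedding; the loss is the mean binary
    cross-entropy against the 0/1 image-level labels.
    """
    total = 0.0
    for class_emb, y in zip(class_embs, labels):
        logit = cosine(image_emb, class_emb) / temperature
        # Numerically stable binary cross-entropy computed from the logit.
        total += max(logit, 0.0) - logit * y + math.log1p(math.exp(-abs(logit)))
    return total / len(labels)


def seed_activation_loss(cam, seed, ignore=-1):
    """Stand-in activation loss (assumption, not the paper's formula).

    cam:  one list of per-class scores for each pixel.
    seed: one class index per pixel, or `ignore` where the coarse seed
          map provides no label.
    Returns the mean softmax cross-entropy over the labeled pixels.
    """
    total, count = 0.0, 0
    for scores, target in zip(cam, seed):
        if target == ignore:
            continue
        peak = max(scores)
        log_z = peak + math.log(sum(math.exp(s - peak) for s in scores))
        total += log_z - scores[target]
        count += 1
    return total / count if count else 0.0


# Toy usage: the image embedding aligns with class 0 only.
image_emb = [1.0, 0.0]
class_embs = [[1.0, 0.0], [0.0, 1.0]]
cls_loss = multilabel_loss(image_emb, class_embs, labels=[1, 0])

# Three pixels, two classes; the last pixel has no seed label.
cam = [[5.0, 0.0], [0.0, 5.0], [0.0, 0.0]]
seed_loss = seed_activation_loss(cam, seed=[0, 1, -1])
```

In a setup like the one the abstract describes, the encoders would stay frozen and gradients from losses of this kind would update only the prompt parameters; the sketch computes the loss values alone and leaves optimization out.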

Code & Models

Repositories

HAL-42/FMA-WSSS (PyTorch, official)

Videos

Foundation Model Assisted Weakly Supervised Semantic Segmentation · YouTube

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Neural Network Applications

Methods: Segment Anything Model · Class-activation map · Contrastive Language-Image Pre-training