Task-Specific Adaptation of Segmentation Foundation Model via Prompt Learning

Hyung-Il Kim; Kimin Yun; Jun-Seok Yun; Yuseok Bae

arXiv:2403.09199 · cs.CV · October 14, 2024 · 1 citation

Open Access

TL;DR

This paper introduces a prompt learning-based adaptation method for the Segment Anything Model to improve its performance on task-specific instance segmentation, especially for out-of-distribution objects, with more efficient training.

Contribution

The paper proposes a prompt learning module and a point matching module to customize SAM for specific segmentation tasks, addressing the ambiguity of input prompts and the need for extensive additional training.
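This page does not describe the internals of either module. As a rough illustration of the general idea behind prompt learning (adjusting the prompt representation while the pretrained model stays frozen), the following minimal sketch may help. Everything in it is an assumption made for illustration: the linear `frozen_w` stand-in for a pretrained decoder, the residual offset `delta`, and the function `tune_prompt_offset` are hypothetical and are not the paper's architecture or SAM's API.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(probs, target, eps=1e-9):
    """Mean binary cross-entropy between predicted probabilities and a 0/1 mask."""
    return float(-np.mean(target * np.log(probs + eps)
                          + (1 - target) * np.log(1 - probs + eps)))

def tune_prompt_offset(frozen_w, prompt_emb, target_mask, steps=200, lr=0.5):
    """Learn a residual offset on the prompt embedding (hypothetical toy setup).

    frozen_w stands in for a pretrained model and is never updated;
    only delta, the learned adjustment to the prompt, changes.
    """
    delta = np.zeros_like(prompt_emb)
    losses = []
    for _ in range(steps):
        logits = frozen_w @ (prompt_emb + delta)   # frozen "decoder" forward pass
        probs = sigmoid(logits)
        losses.append(bce(probs, target_mask))
        # Gradient of mean BCE with respect to the logits is (probs - target) / n.
        grad_logits = (probs - target_mask) / target_mask.size
        # Chain rule back to the prompt offset only.
        delta -= lr * (frozen_w.T @ grad_logits)
    return delta, losses

rng = np.random.default_rng(0)
frozen_w = rng.normal(size=(64, 8))        # toy frozen weights: 64 "pixels", 8-dim prompt
prompt_emb = rng.normal(size=8)            # toy initial prompt embedding
target_mask = (rng.random(64) > 0.5).astype(float)  # toy task-specific mask
w_before = frozen_w.copy()

delta, losses = tune_prompt_offset(frozen_w, prompt_emb, target_mask)
print(f"loss before: {losses[0]:.4f}, after: {losses[-1]:.4f}")
```

The design point the sketch tries to convey is that only the small offset is trained, which is one reason prompt-level adaptation can be cheaper than fine-tuning the full model.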

Findings

01 Enhanced segmentation accuracy on customized scenarios

02 More efficient training process for task-specific adaptation

03 Improved boundary detail alignment in segmentation results

Abstract

Recently, foundation models trained on massive datasets to adapt to a wide range of tasks have attracted considerable attention and are actively being explored within the computer vision community. Among these, the Segment Anything Model (SAM) stands out for its remarkable progress in generalizability and flexibility for image segmentation tasks, achieved through prompt-based object mask generation. However, despite its strength, SAM faces two key limitations when applied to instance segmentation that segments specific objects or those in unique environments (e.g., task-specific adaptation for out-of-distribution objects) not typically present in the training data: 1) the ambiguity inherent in input prompts and 2) the necessity for extensive additional training to achieve optimal segmentation. To address these challenges, we propose a task-specific adaptation (i.e., customization) of…

Taxonomy

Topics: Image Processing and 3D Reconstruction · Web Data Mining and Analysis

Methods: ALIGN · Segment Anything Model