Multi-Scale and Detail-Enhanced Segment Anything Model for Salient Object Detection

Shixuan Gao; Pingping Zhang; Tianyu Yan; Huchuan Lu

arXiv:2408.04326 · cs.CV · August 9, 2024 · 5 cites

TL;DR

This paper introduces MDSAM, a novel model that enhances the Segment Anything Model for Salient Object Detection by integrating multi-scale, multi-level, and fine-grained detail information, achieving superior performance and generalization.

Contribution

The paper proposes a lightweight multi-scale adapter, a multi-level fusion module, and a detail enhancement module to improve SAM's performance in SOD tasks.

Findings

01. Outperforms existing SOD methods on multiple datasets.
02. Demonstrates strong generalization to other segmentation tasks.
03. Effectively incorporates multi-scale and fine-grained details.

Abstract

Salient Object Detection (SOD) aims to identify and segment the most prominent objects in images. Advanced SOD methods often use Convolutional Neural Networks (CNNs) or Transformers for deep feature extraction. However, these methods still deliver low performance and poor generalization in complex cases. Recently, the Segment Anything Model (SAM) has been proposed as a visual foundation model with strong segmentation and generalization capabilities. Nonetheless, SAM requires accurate prompts for the target objects, which are unavailable in SOD. Additionally, SAM does not exploit multi-scale and multi-level information, nor does it incorporate fine-grained details. To address these shortcomings, we propose a Multi-scale and Detail-enhanced SAM (MDSAM) for SOD. Specifically, we first introduce a Lightweight Multi-Scale Adapter (LMSA), which allows SAM to learn…
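
The abstract is cut off before it says how LMSA works, so the block below is not the paper's module. It is a minimal NumPy sketch of the general idea the abstract names: a small residual adapter that mixes features pooled at several spatial scales. Every name in it (`avg_pool_same`, `MultiScaleAdapter`, `dim`, `hidden`, `scales`), the bottleneck layout, the ReLU, and the zero-initialized up-projection are assumptions made for illustration.

```python
import numpy as np


def avg_pool_same(x, k):
    """Average-pool an (H, W, C) map with a k x k window (k odd), stride 1.

    Edge padding keeps the output the same H x W as the input.
    """
    if k == 1:
        return x
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)), mode="edge")
    h, w, _ = x.shape
    out = np.zeros_like(x)
    # Sum the k*k shifted views of the padded map, then normalize.
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + h, dx:dx + w, :]
    return out / (k * k)


class MultiScaleAdapter:
    """Hypothetical bottleneck adapter with multi-scale pooling (illustrative only)."""

    def __init__(self, dim, hidden, scales=(1, 3, 5), seed=0):
        rng = np.random.default_rng(seed)
        self.down = rng.normal(0.0, 0.02, size=(dim, hidden))  # dim -> hidden
        # Assumption: zero-init the up-projection so the adapter starts as identity.
        self.up = np.zeros((hidden, dim))                      # hidden -> dim
        self.scales = scales

    def __call__(self, x):
        # x: (H, W, dim) feature map, e.g. from a frozen backbone block.
        z = np.maximum(x @ self.down, 0.0)                     # project down + ReLU
        z = sum(avg_pool_same(z, k) for k in self.scales) / len(self.scales)
        return x + z @ self.up                                 # residual connection


features = np.random.default_rng(1).random((8, 8, 16))
adapter = MultiScaleAdapter(dim=16, hidden=4)
adapted = adapter(features)
```

In this sketch only `down` and `up` would be trained while the backbone stays frozen, which is what keeps an adapter of this kind lightweight. Whether MDSAM's LMSA follows this layout cannot be determined from the truncated abstract.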

Code & Models

Repositories

bellybeauty/mdsam · PyTorch · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Big Data and Business Intelligence

Methods: Adapter · Segment Anything Model