Boosting Segment Anything Model to Generalize Visually Non-Salient Scenarios

Guangqian Guo; Pengfei Chen; Yong Guo; Huafeng Chen; Boqiang Zhang; Shan Gao

arXiv:2601.00537 · cs.CV · January 5, 2026

TL;DR

This paper introduces VNS-SAM, an enhanced version of the Segment Anything Model that handles visually non-salient scenarios better by exploiting SAM's low-level features. It also introduces a new dataset, VNS-SEG, and reports superior zero-shot segmentation performance.

Contribution

The paper presents VNS-SAM, a novel extension of SAM with modules for non-salient feature mining, and introduces VNS-SEG, a comprehensive dataset for training and benchmarking in non-salient scenarios.

Findings

01. VNS-SAM outperforms baseline models in visually non-salient (VNS) segmentation tasks.

02. The proposed modules improve understanding of non-salient features with minimal additional computational cost.

03. VNS-SAM maintains zero-shot generalizability while enhancing performance in challenging scenarios.

Abstract

Segment Anything Model (SAM), known for its remarkable zero-shot segmentation capabilities, has garnered significant attention in the community. Nevertheless, its performance is challenged when dealing with what we refer to as visually non-salient scenarios, where there is low contrast between the foreground and background. In these cases, existing methods often cannot capture accurate contours and fail to produce promising segmentation results. In this paper, we propose Visually Non-Salient SAM (VNS-SAM), aiming to enhance SAM's perception of visually non-salient scenarios while preserving its original zero-shot generalizability. We achieve this by effectively exploiting SAM's low-level features through two designs: Mask-Edge Token Interactive decoder and Non-Salient Feature Mining module. These designs help the SAM decoder gain a deeper understanding of non-salient characteristics…

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Generative Adversarial Networks and Image Synthesis