Multi-scale Contrastive Adaptor Learning for Segmenting Anything in Underperformed Scenes

Ke Zhou; Zhongwei Qiu; Dongmei Fu

arXiv:2408.05936 · cs.CV · August 13, 2024

Open Access

TL;DR

This paper introduces MCA-SAM, a novel contrastive learning framework with multi-scale adaptors that significantly improves the performance of the Segment Anything Model in specialized segmentation tasks with limited data.

Contribution

The paper proposes a new multi-scale contrastive adaptor learning method, MCA-SAM, which enhances SAM's adaptability and performance in challenging segmentation domains.

Findings

01 Outperforms existing methods in camouflaged object detection, shadow segmentation, and polyp segmentation.

02 Achieves a 20.0% MAE improvement on the COD10K dataset.

03 Achieves a 7.9% mDice improvement on the Kvasir-SEG dataset.
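
This page does not define the two metrics quoted above. As a reading aid, here is a minimal sketch of how MAE and mean Dice are commonly computed for segmentation masks. The function names, the flat-list mask representation, and the 0.5 threshold are assumptions for illustration; the paper's exact evaluation protocol may differ.

```python
def mae(pred, target):
    """Mean absolute error between two equally sized flat masks with values in [0, 1]."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def dice(pred, target, threshold=0.5, eps=1e-8):
    """Dice coefficient after binarizing both masks (threshold is an assumption)."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    p = [1 if v >= threshold else 0 for v in pred]
    t = [1 if v >= threshold else 0 for v in target]
    intersection = sum(a * b for a, b in zip(p, t))
    # eps keeps the ratio defined when both masks are empty
    return (2 * intersection + eps) / (sum(p) + sum(t) + eps)


def mean_dice(pairs, threshold=0.5):
    """Average Dice over an iterable of (pred, target) mask pairs."""
    scores = [dice(p, t, threshold) for p, t in pairs]
    return sum(scores) / len(scores)
```

Lower MAE is better and higher mDice is better, so an "improvement" means a decrease for the former and an increase for the latter.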

Abstract

Foundational vision models, such as the Segment Anything Model (SAM), have achieved significant breakthroughs through extensive pre-training on large-scale visual datasets. Despite their general success, these models may fall short in specialized tasks with limited data, and fine-tuning such large-scale models is often not feasible. Current strategies involve incorporating adaptors into the pre-trained SAM to facilitate downstream task performance with minimal model adjustment. However, these strategies can be hampered by suboptimal learning approaches for the adaptors. In this paper, we introduce a novel Multi-scale Contrastive Adaptor learning method named MCA-SAM, which enhances adaptor performance through a meticulously designed contrastive learning framework at both token and sample levels. Our Token-level Contrastive adaptor (TC-adaptor) focuses on refining local representations…
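
The abstract is truncated here and does not give the contrastive objective itself. To show what a contrastive loss over embeddings generally looks like, the sketch below implements InfoNCE, a widely used formulation. It is an assumption chosen for illustration, not the loss defined in the paper; the names `info_nce`, `cosine`, and the `temperature` value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors given as lists of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)


def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE loss (illustrative, not the paper's formulation).

    anchors[i] is paired with positives[i]; every other entry of
    `positives` acts as a negative for anchors[i].
    """
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # log-sum-exp with max subtraction for numerical stability
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_denominator - logits[i]
    return total / n
```

The same function could in principle be fed per-token embeddings or pooled per-sample embeddings, which is one way to picture the token-level and sample-level split the abstract mentions; how MCA-SAM actually builds its positive and negative pairs is not stated in this excerpt.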

Taxonomy

Topics: Video Surveillance and Tracking Methods · Anomaly Detection Techniques and Applications · Human Pose and Action Recognition

Methods: Segment Anything Model · Contrastive Learning · Masked autoencoder