SAMwave: Wavelet-Driven Feature Enrichment for Effective Adaptation of Segment Anything Model

Saurabh Yadav; Avi Gupta; Koteswar Rao Jerripothula

arXiv:2507.20186·cs.CV·July 29, 2025

TL;DR

SAMwave introduces a wavelet-based feature enrichment technique that improves the adaptation of the Segment Anything Model (SAM) to complex low-level vision tasks, outperforming existing adaptation methods on four such tasks.

Contribution

It proposes a novel wavelet transform-based approach with complex-valued adapters for better feature extraction and adaptation of SAM in dense prediction tasks.

Findings

01. Outperforms existing adaptation methods on four low-level vision tasks

02. Effective across both SAM and SAM2 backbones

03. Works with real and complex-valued adapters

Abstract

The emergence of large foundation models has propelled significant advances in various domains. The Segment Anything Model (SAM), a leading model for image segmentation, exemplifies these advances, outperforming traditional methods. However, such foundation models often suffer from performance degradation when applied to complex tasks for which they were not trained. Existing methods typically employ adapter-based fine-tuning strategies to adapt SAM to such tasks and leverage high-frequency features extracted from the Fourier domain. However, our analysis reveals that these approaches offer limited benefits due to constraints in their feature extraction techniques. To overcome this, we propose SAMwave, a novel and interpretable approach that utilizes the wavelet transform to extract richer, multi-scale high-frequency features from input data. Extending this, we introduce…
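
To make the idea of multi-scale high-frequency features concrete, here is a minimal sketch of a multi-level 2D wavelet decomposition in plain Python. It is an illustration only, not the authors' implementation: the Haar wavelet, the function names `haar_dwt2` and `multiscale_high_freq`, and the band names `row_diff`, `col_diff`, and `diag` are all assumptions made for this example, and the abstract does not say which wavelet family SAMwave uses or how the extracted bands are fed to the adapters.

```python
def haar_dwt2(image):
    """One level of an orthonormal 2D Haar transform (illustrative helper).

    `image` is a list of rows with even height and width. Returns four
    half-size grids: a low-frequency approximation and three high-frequency
    detail bands.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    approx, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, rows, 2):
        out = ([], [], [], [])
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            out[0].append((a + b + d + e) / 2.0)  # local average (low-pass)
            out[1].append((a + b - d - e) / 2.0)  # change between the two rows
            out[2].append((a - b + d - e) / 2.0)  # change between the two columns
            out[3].append((a - b - d + e) / 2.0)  # diagonal change
        approx.append(out[0])
        row_diff.append(out[1])
        col_diff.append(out[2])
        diag.append(out[3])
    return approx, row_diff, col_diff, diag


def multiscale_high_freq(image, levels):
    """Repeat the transform on the approximation to collect detail bands per scale."""
    bands = []
    approx = image
    for _ in range(levels):
        approx, row_diff, col_diff, diag = haar_dwt2(approx)
        bands.append({"row_diff": row_diff, "col_diff": col_diff, "diag": diag})
    return approx, bands


# A 4x4 image whose left half is 0 and right half is 1 (a vertical edge).
image = [[0, 0, 1, 1] for _ in range(4)]
approx, bands = multiscale_high_freq(image, levels=2)
print(bands[0]["col_diff"])  # [[0.0, 0.0], [0.0, 0.0]]
print(bands[1]["col_diff"])  # [[-2.0]]
```

In this toy input the edge lies on the boundary between 2x2 blocks, so the finest scale sees nothing while the second scale responds strongly. That is the kind of scale-dependent detail a single global frequency filter does not separate out, which is the motivation the abstract gives for moving from Fourier-domain features to wavelets.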
