A Hybrid Mamba-SAM Architecture for Efficient 3D Medical Image Segmentation
Mohammadreza Gholipour Shahraki, Mehdi Rezaeian, Mohammad Ghasemzadeh

TL;DR
This paper introduces Mamba-SAM, a hybrid 3D medical image segmentation architecture combining a frozen SAM encoder with Mamba-based models, achieving high accuracy and efficiency on MRI datasets.
Contribution
It proposes a novel hybrid architecture integrating SAM with Mamba-based SSMs and introduces Multi-Frequency Gated Convolution for improved volumetric feature representation.
Findings
Achieves a Dice score of 0.906 on the ACDC dataset
Outperforms baselines on Myocardium and Left Ventricle segmentation
Offers superior inference speed with the TP MFGC variant
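For readers unfamiliar with the metric reported above, the Dice score measures the overlap between a predicted mask and a reference mask as 2|A ∩ B| / (|A| + |B|). The following is a minimal sketch of that standard definition in plain Python, not the paper's evaluation code. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
def dice_score(pred, target):
    """Dice coefficient between two binary masks given as flat sequences of 0/1.

    Illustrative helper (hypothetical name); not taken from the paper.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of voxels")

    # |A ∩ B|: voxels labelled 1 in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    # |A| + |B|: total foreground voxels in each mask.
    total = sum(pred) + sum(target)

    if total == 0:
        # Assumed convention: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total


# A 3D volume would be flattened to one sequence of voxels before the call.
predicted = [1, 1, 0, 0]
reference = [1, 0, 0, 0]
print(dice_score(predicted, reference))  # 2 * 1 / (2 + 1)
```

A score of 1.0 means the two masks coincide exactly and 0.0 means they share no foreground voxels.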
Abstract
Accurate segmentation of 3D medical images such as MRI and CT is essential for clinical diagnosis and treatment planning. Foundation models like the Segment Anything Model (SAM) provide powerful general-purpose representations but struggle in medical imaging due to domain shift, their inherently 2D design, and the high computational cost of fine-tuning. To address these challenges, we propose Mamba-SAM, a novel and efficient hybrid architecture that combines a frozen SAM encoder with the linear-time efficiency and long-range modeling capabilities of Mamba-based State Space Models (SSMs). We investigate two parameter-efficient adaptation strategies. The first is a dual-branch architecture that explicitly fuses general features from a frozen SAM encoder with domain-specific representations learned by a trainable VMamba encoder using cross-attention. The second is an adapter-based approach…
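The cross-attention fusion mentioned for the dual-branch strategy can be illustrated with a small NumPy sketch. This is a generic single-head scaled dot-product cross-attention, not the authors' implementation. The choice of the trainable-branch tokens as queries and the frozen-branch tokens as keys and values, the residual connection, all shapes, and the names `cross_attention_fuse`, `w_q`, `w_k`, and `w_v` are assumptions made for illustration. The random arrays stand in for encoder outputs.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def cross_attention_fuse(domain_tokens, general_tokens, w_q, w_k, w_v):
    """Fuse two token sets with single-head cross-attention (hypothetical helper).

    domain_tokens:  (N, d_dom) stand-in for trainable-branch features (queries)
    general_tokens: (M, d_gen) stand-in for frozen-branch features (keys, values)
    w_q: (d_dom, d_attn), w_k: (d_gen, d_attn), w_v: (d_gen, d_dom)
    """
    q = domain_tokens @ w_q    # (N, d_attn)
    k = general_tokens @ w_k   # (M, d_attn)
    v = general_tokens @ w_v   # (M, d_dom)

    # Scaled dot-product scores between every query and every key.
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (N, M)
    weights = softmax(scores, axis=-1)        # each row sums to 1

    attended = weights @ v                    # (N, d_dom)
    # Assumed residual connection: keep the domain features and add the
    # information gathered from the general features.
    return domain_tokens + attended, weights


rng = np.random.default_rng(0)
domain = rng.normal(size=(6, 8))      # 6 tokens, 8 channels (made-up sizes)
general = rng.normal(size=(10, 16))   # 10 tokens, 16 channels (made-up sizes)
w_q = rng.normal(size=(8, 4))
w_k = rng.normal(size=(16, 4))
w_v = rng.normal(size=(16, 8))

fused, weights = cross_attention_fuse(domain, general, w_q, w_k, w_v)
print(fused.shape, weights.shape)  # (6, 8) (6, 10)
```

In a real model the projection matrices would be learned, and only the trainable branch and the fusion layers would receive gradients while the frozen encoder stays fixed.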
Taxonomy
Topics: Advanced Neural Network Applications · Medical Image Segmentation Techniques · Generative Adversarial Networks and Image Synthesis
