SAMba-UNet: SAM2-Mamba UNet for Cardiac MRI in Medical Robotic Perception
Guohao Huo, Ruiting Dai, Ling Shao, Hao Tang

TL;DR
SAMba-UNet is a novel dual-encoder architecture combining SAM2, Mamba, and UNet for improved cardiac MRI segmentation, addressing domain shifts and enhancing boundary precision for robotic medical perception.
Contribution
Introduces SAMba-UNet together with two modules, a Dynamic Feature Fusion Refiner and a Heterogeneous Omni-Attention Convergence Module (HOACM), which enable cross-modal feature learning and improve segmentation accuracy on cardiac MRI.
Findings
Achieves a Dice score of 0.9103 on the ACDC benchmark
Significantly improves boundary localization of small structures
Demonstrates robustness for clinical robotic perception applications
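For readers unfamiliar with the metric behind the reported 0.9103, the Dice coefficient for one structure is 2|A∩B| / (|A| + |B|), where A is the predicted mask and B the reference mask. The following is a minimal sketch in plain Python, not the authors' evaluation code. The function names `dice_score` and `mean_dice`, the flattened-list mask format, the toy masks, and the use of labels 1 to 3 for the three foreground structures are all assumptions made for illustration.

```python
def dice_score(pred, target, label):
    """Dice = 2|A∩B| / (|A| + |B|) for one label over flattened masks.

    `pred` and `target` are equal-length lists of integer class labels
    (an assumed, simplified mask format).
    """
    pred_hits = [p == label for p in pred]
    target_hits = [t == label for t in target]
    intersection = sum(1 for p, t in zip(pred_hits, target_hits) if p and t)
    total = sum(pred_hits) + sum(target_hits)
    if total == 0:
        # Label absent from both masks: treat as perfect agreement (a convention).
        return 1.0
    return 2.0 * intersection / total


def mean_dice(pred, target, labels):
    """Average the per-label Dice scores over the given foreground labels."""
    return sum(dice_score(pred, target, c) for c in labels) / len(labels)


# Toy flattened masks; 0 is background, 1-3 are hypothetical foreground labels.
pred = [0, 1, 1, 2, 2, 3, 3, 3]
target = [0, 1, 1, 2, 3, 3, 3, 3]
print(mean_dice(pred, target, labels=[1, 2, 3]))
```

Whether a paper averages per structure, per volume, or per patient changes the final number, so a single headline Dice value is only comparable across papers that share the same protocol.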
Abstract
To address complex pathological feature extraction in automated cardiac MRI segmentation, we propose SAMba-UNet, a novel dual-encoder architecture that combines the vision foundation model SAM2, the linear-complexity state-space model Mamba, and the classical UNet to achieve cross-modal collaborative feature learning. To overcome domain shifts between natural images and medical scans, we introduce a Dynamic Feature Fusion Refiner that employs multi-scale pooling and channel-spatial dual-path calibration to strengthen small-lesion and fine-structure representation. We also design a Heterogeneous Omni-Attention Convergence Module (HOACM) that fuses SAM2's local positional semantics with Mamba's long-range dependency modeling via global contextual attention and branch-selective emphasis, yielding substantial gains in both global consistency and boundary precision. On the ACDC…
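The abstract does not give the equations behind "branch-selective emphasis", so the following is only a generic illustration of the idea of softly weighting two encoder branches, not the HOACM itself. It assumes each branch outputs a list of channels, each a flattened list of spatial values, and it scores a branch by the global average of its channel. In a trained network those scores would come from learned layers. The name `branch_selective_fuse` is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a short list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def branch_selective_fuse(feat_a, feat_b):
    """Fuse two branches channel by channel with softmax branch weights.

    feat_a, feat_b: lists of channels; each channel is a flattened list of
    spatial activations (an assumed toy layout, not the paper's tensors).
    """
    fused = []
    for ch_a, ch_b in zip(feat_a, feat_b):
        # Global average pooling: one scalar descriptor per branch and channel.
        desc_a = sum(ch_a) / len(ch_a)
        desc_b = sum(ch_b) / len(ch_b)
        # Assumption: raw descriptors stand in for learned branch scores.
        w_a, w_b = softmax([desc_a, desc_b])
        fused.append([w_a * a + w_b * b for a, b in zip(ch_a, ch_b)])
    return fused


# Toy features standing in for the two encoder branches (hypothetical values).
branch_one = [[2.0, 2.0], [1.0, 1.0]]
branch_two = [[0.0, 0.0], [1.0, 1.0]]
fused = branch_selective_fuse(branch_one, branch_two)
print(fused)
```

Because the weights for each channel sum to one, the fused value always lies between the two branch values, and the branch with the stronger pooled response dominates.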
Taxonomy
Topics: Advanced Neural Network Applications · Soft Robotics and Applications · Medical Image Segmentation Techniques
Methods: Softmax · Attention Is All You Need · Mamba: Linear-Time Sequence Modeling with Selective State Spaces
