RobustMedSAM: Degradation-Resilient Medical Image Segmentation via Robust Foundation Model Adaptation
Jieru Li, Matthew Chen, Micky C. Nnamdi, J. Ben Tamo, Benoit L. Marteau, May D. Wang

TL;DR
RobustMedSAM makes medical image segmentation more robust to image corruption by fusing a MedSAM image encoder with a RobustSAM mask decoder and fine-tuning only the mask decoder across diverse datasets and corruption types, raising the Dice score on degraded images from 0.613 to 0.719.
Contribution
This work introduces module-wise checkpoint fusion, which combines a pretrained medical model (MedSAM) with a corruption-robust model (RobustSAM) under a shared architecture, and then selectively fine-tunes the mask decoder to improve the resilience of medical image segmentation.
Findings
RobustMedSAM increases Dice score from 0.613 to 0.719 on degraded images.
Structured fusion of pretrained modules effectively improves robustness.
The method performs well across multiple imaging modalities and corruption types.
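The Dice score quoted in the findings measures overlap between a predicted mask and a reference mask, computed as 2|A ∩ B| / (|A| + |B|). The snippet below is a minimal sketch of that metric for binary masks, not the paper's evaluation code. The function name `dice_score` and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np


def dice_score(pred, target):
    """Dice overlap between two binary masks of the same shape.

    Assumption: two empty masks count as a perfect match (1.0).
    Other evaluation pipelines may skip or score that case differently.
    """
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    if pred.shape != target.shape:
        raise ValueError("pred and target must have the same shape")

    intersection = np.logical_and(pred, target).sum()
    total = pred.sum() + target.sum()
    if total == 0:
        return 1.0
    return float(2.0 * intersection / total)


if __name__ == "__main__":
    predicted = [[1, 1], [0, 0]]
    reference = [[1, 0], [0, 0]]
    print(round(dice_score(predicted, reference), 3))
```

A per-dataset or per-corruption figure would typically be the mean of this value over all evaluated images, although the exact aggregation used in the paper is not stated in this summary.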
Abstract
Medical image segmentation models built on the Segment Anything Model (SAM) achieve strong performance on clean benchmarks, yet their reliability often degrades under realistic image corruptions such as noise, blur, motion artifacts, and modality-specific distortions. Existing approaches address either medical-domain adaptation or corruption robustness, but not both jointly. In SAM, we find that these capabilities are concentrated in complementary modules: the image encoder preserves medical priors, while the mask decoder governs corruption robustness. Motivated by this observation, we propose RobustMedSAM, which adopts module-wise checkpoint fusion by initializing the image encoder from MedSAM and the mask decoder from RobustSAM under a shared ViT-B architecture. We then fine-tune only the mask decoder on 35 medical datasets from MedSegBench, spanning six imaging modalities and 12…
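Module-wise checkpoint fusion, as described in the abstract, can be pictured as merging two parameter dictionaries by module prefix. The following is a rough sketch under stated assumptions, not the authors' implementation. It assumes both checkpoints are flat key-to-weight mappings whose keys begin with `image_encoder.` or `mask_decoder.`, and that every other module (for example a prompt encoder) is taken from the medical checkpoint; the abstract specifies neither. The names `fuse_state_dicts` and `trainable_keys` are hypothetical.

```python
from typing import Any, Dict, List, Mapping


def fuse_state_dicts(
    medical: Mapping[str, Any],
    robust: Mapping[str, Any],
    decoder_prefix: str = "mask_decoder.",
) -> Dict[str, Any]:
    """Build one parameter dictionary from two checkpoints.

    Mask-decoder entries come from `robust`; everything else, including
    the image encoder, comes from `medical` (an assumption for modules
    the abstract does not mention).
    """
    fused: Dict[str, Any] = {}
    for key, value in medical.items():
        if not key.startswith(decoder_prefix):
            fused[key] = value
    for key, value in robust.items():
        if key.startswith(decoder_prefix):
            fused[key] = value
    return fused


def trainable_keys(
    state: Mapping[str, Any], decoder_prefix: str = "mask_decoder."
) -> List[str]:
    """Names of the parameters left unfrozen when only the decoder is tuned."""
    return sorted(k for k in state if k.startswith(decoder_prefix))


if __name__ == "__main__":
    # Toy stand-ins for real checkpoints; values would normally be tensors.
    medical_ckpt = {"image_encoder.w": "med", "mask_decoder.w": "med"}
    robust_ckpt = {"image_encoder.w": "rob", "mask_decoder.w": "rob"}
    fused_ckpt = fuse_state_dicts(medical_ckpt, robust_ckpt)
    print(fused_ckpt)
    print(trainable_keys(fused_ckpt))
```

In a real training setup the fused dictionary would be loaded into a model, and every parameter whose name is not returned by `trainable_keys` would be frozen, so that gradient updates reach only the mask decoder. If the robust decoder has extra keys that the target architecture lacks, the load step would need to handle that mismatch explicitly.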
