Customize Segment Anything Model for Multi-Modal Semantic Segmentation with Mixture of LoRA Experts
Chenyang Zhu, Bin Xiao, Lin Shi, Shoukun Xu, Xu Zheng

TL;DR
This paper adapts the Segment Anything Model for multi-modal semantic segmentation by introducing a Mixture of LoRA Experts, enabling effective multi-modal feature integration and significantly improving performance on various benchmarks.
Contribution
It proposes a novel MoE-LoRA adaptation for SAM, with a new routing strategy and multi-scale feature fusion, to enhance multi-modal segmentation performance.
Findings
Outperforms state-of-the-art methods on the DELIVER, MUSES, and MCubeS benchmarks.
Achieves a 32.15% performance gain in missing-modality scenarios.
Effectively integrates multi-modal features with a new adaptive routing strategy.
Abstract
The recent Segment Anything Model (SAM) represents a significant breakthrough in scaling segmentation models, delivering strong performance across various downstream applications in the RGB modality. However, directly applying SAM to emerging visual modalities, such as depth and event data, results in suboptimal performance in multi-modal segmentation tasks. In this paper, we make the first attempt to adapt SAM for multi-modal semantic segmentation by proposing a Mixture of Low-Rank Adaptation Experts (MoE-LoRA) tailored for different input visual modalities. By training only the MoE-LoRA layers while keeping SAM's weights frozen, SAM's strong generalization and segmentation capabilities can be preserved for downstream tasks. Specifically, to address cross-modal inconsistencies, we propose a novel MoE routing strategy that adaptively generates weighted features across modalities,…
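To make the idea of a frozen layer with a gated mixture of LoRA experts concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the class name MoELoRALinear, the single-vector input, the layer sizes, and the plain softmax router are all assumptions chosen for illustration, and the paper's own routing strategy and multi-scale fusion are not reproduced here. The sketch only shows the general pattern the abstract describes: a frozen weight W, several low-rank expert updates B_i A_i, and a router that weights the experts per input.

```python
import math
import random


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(t - m) for t in z]
    s = sum(e)
    return [t / s for t in e]


class MoELoRALinear:
    """Frozen linear layer plus a softmax-gated mixture of LoRA experts.

    Hypothetical illustration only; names and router design are assumptions.
    """

    def __init__(self, d_in, d_out, n_experts, rank, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.gauss(0.0, 0.02) for _ in range(cols)] for _ in range(rows)]

        # Frozen pretrained weight (stand-in for one SAM projection).
        self.W = rand(d_out, d_in)
        # Trainable down-projections A_i, one per expert (rank x d_in).
        self.A = [rand(rank, d_in) for _ in range(n_experts)]
        # Trainable up-projections B_i (d_out x rank), zero-initialised as is
        # common for LoRA, so the layer starts out equal to the frozen layer.
        self.B = [[[0.0] * rank for _ in range(d_out)] for _ in range(n_experts)]
        # Trainable router: one score per expert, computed from the input.
        self.G = rand(n_experts, d_in)

    def forward(self, x):
        base = matvec(self.W, x)            # frozen path
        gates = softmax(matvec(self.G, x))  # expert weights, sum to 1
        out = list(base)
        for g, A, B in zip(gates, self.A, self.B):
            delta = matvec(B, matvec(A, x))  # low-rank update B_i (A_i x)
            out = [o + g * d for o, d in zip(out, delta)]
        return out, gates


layer = MoELoRALinear(d_in=4, d_out=3, n_experts=2, rank=2)
x = [1.0, 2.0, 3.0, 4.0]
out, gates = layer.forward(x)
```

In a training setup following this pattern, only A, B, and G would receive gradient updates while W stays fixed, which is what lets the pretrained model's behaviour be preserved at initialisation.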
Taxonomy
Topics: Big Data and Business Intelligence · Big Data Technologies and Applications
Methods: Segment Anything Model · Mixture of Experts
