Segment Any RGB-Thermal Model with Language-aided Distillation
Dong Xing, Xianxun Zhu, Wei Zhou, Qika Lin, Hang Yang, Yuqing Wang

TL;DR
This paper introduces SARTM, a framework that adapts the Segment Anything Model (SAM) to RGB-Thermal (RGB-T) semantic segmentation by fine-tuning with LoRA, incorporating language guidance, and using cross-modal knowledge distillation, outperforming state-of-the-art methods on three benchmarks.
Contribution
The paper presents a new method for customizing SAM for RGB-T segmentation that combines LoRA fine-tuning, language guidance, and cross-modal knowledge distillation, a combination that had not previously been explored.
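To make the distillation component concrete, here is a minimal sketch of a generic temperature-softened distillation loss between per-pixel class logits. It is an illustration only, not the loss used in the paper: the function names `softmax` and `distillation_loss`, the temperature value, and the choice of which branch acts as teacher or student are all assumptions, since the text above does not specify them.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over the last (class) axis."""
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) over pixels, scaled by temperature**2.

    Both inputs have shape (num_pixels, num_classes). Hypothetical helper:
    the source does not state which modality branch plays which role.
    """
    p = softmax(teacher_logits, temperature)  # soft targets from the teacher
    q = softmax(student_logits, temperature)  # student predictions
    eps = 1e-12                               # avoids log(0)
    kl = (p * (np.log(p + eps) - np.log(q + eps))).sum(axis=-1)
    return float(kl.mean() * temperature ** 2)


# Toy example: 3 pixels, 4 classes.
teacher = np.array([[2.0, 0.5, 0.1, -1.0],
                    [0.0, 3.0, 0.2, 0.1],
                    [1.0, 1.0, 1.0, 1.0]])
student = np.array([[0.1, 0.2, 2.0, 0.0],
                    [0.0, 0.1, 0.2, 3.0],
                    [1.0, 0.0, 0.0, 0.0]])

loss_same = distillation_loss(teacher, teacher)  # identical predictions
loss_diff = distillation_loss(student, teacher)  # mismatched predictions
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge, which is the sense in which distillation can pull two modality branches toward a shared prediction.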
Findings
SARTM outperforms state-of-the-art methods on three benchmarks.
The framework effectively reduces modality gaps and semantic ambiguity.
Extensive experiments validate the robustness of SARTM across conditions.
Abstract
The recent Segment Anything Model (SAM) demonstrates strong instance segmentation performance across various downstream tasks. However, SAM is trained solely on RGB data, limiting its direct applicability to RGB-thermal (RGB-T) semantic segmentation. Given that RGB-T provides a robust solution for scene understanding in adverse weather and lighting conditions, such as low light and overexposure, we propose a novel framework, SARTM, which customizes the powerful SAM for RGB-T semantic segmentation. Our key idea is to unleash the potential of SAM while introducing semantic understanding modules for RGB-T data pairs. Specifically, our framework first involves fine-tuning the original SAM by adding extra LoRA layers, aiming to preserve SAM's strong generalization and segmentation capabilities for downstream tasks. Secondly, we introduce language information as guidance for training our…
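As a rough illustration of the LoRA step described in the abstract, the sketch below wraps a frozen weight matrix with a trainable low-rank update, following the standard LoRA formulation W + (alpha / r) · B · A. The class name `LoRALinear`, the rank, the scaling value, and the layer sizes are assumptions chosen for the example; the abstract does not say which SAM layers receive LoRA or which hyperparameters are used.

```python
import numpy as np


class LoRALinear:
    """Frozen linear map W plus a trainable low-rank update (alpha / r) * B @ A.

    Hypothetical illustration of standard LoRA, not the paper's code.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight                                 # frozen pretrained weight
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))  # trainable, small random init
        self.B = np.zeros((out_dim, rank))                   # trainable, zero init
        self.scale = alpha / rank

    def __call__(self, x):
        # x has shape (batch, in_dim); the low-rank path is added to the frozen path.
        frozen = x @ self.weight.T
        update = self.scale * (x @ self.A.T) @ self.B.T
        return frozen + update

    def num_trainable(self):
        # Only A and B would be updated during fine-tuning.
        return self.A.size + self.B.size


rng = np.random.default_rng(1)
W = rng.normal(size=(16, 32))   # stand-in for one pretrained projection matrix
layer = LoRALinear(W, rank=4)
x = rng.normal(size=(2, 32))
y_initial = layer(x)            # equals the frozen output because B starts at zero
```

Because B is initialized to zero, the adapted layer initially reproduces the pretrained layer exactly, and only the small A and B matrices (192 values here, versus 512 in W) would be trained. That is the mechanism by which LoRA can retain the base model's behavior while adapting it to a new task.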
Taxonomy
Topics: Machine Learning in Materials Science
Methods: Segment Anything Model
