DualSwinFusionSeg: Multimodal Martian Landslide Segmentation via Dual Swin Transformer with Multi-Scale Fusion and UNet++
Shahriar Kabir, Abdullah Muhammed Amimul Ehsan, Istiak Ahmmed Rifti, Md Kaykobad Reza

TL;DR
This paper introduces DualSwinFusionSeg, a multimodal segmentation model that uses dual Swin Transformer encoders and multi-scale fusion. It reaches 0.867 mIoU and 0.905 F1 on the development set for Martian landslide segmentation with limited labeled data.
Contribution
It proposes a dual-encoder architecture with multi-scale fusion and a UNet++ decoder for improved Martian landslide segmentation from heterogeneous data modalities.
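As a rough illustration of what fusing two encoders at several scales can look like, the sketch below concatenates the two feature vectors at each scale and applies a linear projection. The fusion operator, the function names (`fuse_scale`, `fuse_pyramid`), and the weights are all assumptions made for illustration; this summary does not specify how DualSwinFusionSeg combines its encoder features.

```python
from typing import List

Vector = List[float]
Matrix = List[Vector]


def fuse_scale(feat_a: Vector, feat_b: Vector, weights: Matrix) -> Vector:
    """Fuse two feature vectors from one scale (hypothetical operator).

    Concatenates the vectors channel-wise, then applies a linear projection,
    the per-location equivalent of a 1x1 convolution.
    """
    concat = feat_a + feat_b  # channel-wise concatenation
    if any(len(row) != len(concat) for row in weights):
        raise ValueError("each weight row must match the concatenated length")
    return [sum(w * x for w, x in zip(row, concat)) for row in weights]


def fuse_pyramid(
    feats_a: List[Vector], feats_b: List[Vector], weights_per_scale: List[Matrix]
) -> List[Vector]:
    """Apply fuse_scale independently at every scale of two feature pyramids."""
    if not (len(feats_a) == len(feats_b) == len(weights_per_scale)):
        raise ValueError("both encoders must provide one feature per scale")
    return [
        fuse_scale(a, b, w)
        for a, b, w in zip(feats_a, feats_b, weights_per_scale)
    ]
```

In a full model each scale would hold a spatial feature map rather than a single vector, and the fused pyramid would be passed on to the decoder; the sketch keeps one vector per scale so the data flow stays visible.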
Findings
Achieves 0.867 mIoU and 0.905 F1 on the development set
Outperforms baseline models with modality-specific encoders
Effective with limited training samples
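For readers unfamiliar with the two reported metrics, here is a minimal sketch of how mIoU and F1 are commonly computed for a binary mask. It assumes mIoU is averaged over the background and landslide classes and that F1 is taken on the landslide class; this summary does not state the exact averaging convention, and the function names are hypothetical.

```python
from typing import List, Tuple

Mask = List[List[int]]  # 2D binary mask: 1 = landslide, 0 = background


def confusion_counts(pred: Mask, target: Mask) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for the positive (landslide) class."""
    tp = fp = fn = tn = 0
    for p_row, t_row in zip(pred, target):
        for p, t in zip(p_row, t_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def _iou(hit: int, false_pos: int, false_neg: int) -> float:
    """Intersection over union; defined as 1.0 when the class is absent."""
    denom = hit + false_pos + false_neg
    return hit / denom if denom else 1.0


def mean_iou(pred: Mask, target: Mask) -> float:
    """Average the IoU of the landslide class and the background class."""
    tp, fp, fn, tn = confusion_counts(pred, target)
    iou_landslide = _iou(tp, fp, fn)
    # For the background class, the roles of fp and fn are swapped.
    iou_background = _iou(tn, fn, fp)
    return (iou_landslide + iou_background) / 2


def f1_score(pred: Mask, target: Mask) -> float:
    """F1 (Dice) score of the landslide class."""
    tp, fp, fn, _ = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0
```

Other conventions exist, such as averaging IoU per image before averaging over the dataset, and they can give different numbers on the same predictions.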
Abstract
Automated segmentation of Martian landslides, particularly in tectonically active regions such as Valles Marineris, is important for planetary geology, hazard assessment, and future robotic exploration. However, detecting landslides from planetary imagery is challenging due to the heterogeneous nature of available sensing modalities and the limited number of labeled samples. Each observation combines RGB imagery with geophysical measurements such as digital elevation models, slope maps, thermal inertia, and contextual grayscale imagery, which differ significantly in resolution and statistical properties. To address these challenges, we propose DualSwinFusionSeg, a multimodal segmentation architecture that separates modality-specific feature extraction and performs multi-scale cross-modal fusion. The model employs two parallel Swin Transformer V2 encoders to independently process RGB and…
Taxonomy
Topics: Planetary Science and Exploration · 3D Surveying and Cultural Heritage · Robotics and Sensor-Based Localization
