TL;DR
This paper introduces GSCNet, a graph-based network for semantic segmentation of unaligned UAV RGBT (RGB-thermal) images. It targets two problems, cross-modal spatial misalignment and semantic confusion among fine-grained ground objects, and the paper also presents a large-scale benchmark dataset.
Contribution
The paper proposes a new graph-based semantic calibration network and constructs the largest fine-grained unaligned UAV RGBT segmentation benchmark.
Findings
GSCNet outperforms state-of-the-art methods on the URTF benchmark.
The Feature Decoupling and Alignment Module (FDAM) improves spatial correction between the RGB and thermal modalities.
The Semantic Graph Calibration Module (SGCM) enhances accuracy for similar and rare categories.
Abstract
Fine-grained RGBT image semantic segmentation is crucial for all-weather unmanned aerial vehicle (UAV) scene understanding. However, UAV RGBT image semantic segmentation faces two coupled challenges: cross-modal spatial misalignment caused by sensor parallax and platform vibration, and severe semantic confusion among fine-grained ground objects under top-down aerial views. To address these issues, we propose a Graph-based Semantic Calibration Network (GSCNet) for unaligned UAV RGBT image semantic segmentation. Specifically, we design a Feature Decoupling and Alignment Module (FDAM) that decouples each modality into shared structural and private perceptual components and performs deformable alignment in the shared subspace, enabling robust spatial correction with reduced modality appearance interference. Moreover, we propose a Semantic Graph Calibration Module (SGCM) that explicitly…
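The abstract describes FDAM as performing deformable alignment in a shared subspace. As a rough illustration of the resampling step that deformable alignment generally relies on, the sketch below warps one feature map toward another using per-pixel offsets and bilinear interpolation. It is not the authors' implementation: the function names (`bilinear_sample`, `warp_with_offsets`), the toy 3x4 grids, and the hand-written offsets are all assumptions made for illustration. In a learned model the offsets would be predicted by the network rather than fixed by hand.

```python
from typing import List, Tuple

Grid = List[List[float]]


def bilinear_sample(feat: Grid, y: float, x: float) -> float:
    """Sample feat at fractional (y, x); coordinates are clamped to the border."""
    h, w = len(feat), len(feat[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    wy, wx = y - y0, x - x0
    # Interpolate along x on the two neighbouring rows, then along y.
    top = feat[y0][x0] * (1 - wx) + feat[y0][x1] * wx
    bottom = feat[y1][x0] * (1 - wx) + feat[y1][x1] * wx
    return top * (1 - wy) + bottom * wy


def warp_with_offsets(feat: Grid, offsets: List[List[Tuple[float, float]]]) -> Grid:
    """Output pixel (i, j) reads feat at (i + dy, j + dx), where (dy, dx) = offsets[i][j]."""
    h, w = len(feat), len(feat[0])
    return [
        [bilinear_sample(feat, i + offsets[i][j][0], j + offsets[i][j][1]) for j in range(w)]
        for i in range(h)
    ]


# Toy example (hypothetical data): the "thermal" map is the "rgb" map shifted
# one pixel to the right, with the left border replicated.
rgb: Grid = [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0], [8.0, 9.0, 10.0, 11.0]]
thermal: Grid = [[row[max(j - 1, 0)] for j in range(len(row))] for row in rgb]

# Hand-specified offsets: every output pixel looks one pixel to the right,
# which undoes the shift everywhere except the last column (lost at the border).
offsets = [[(0.0, 1.0) for _ in row] for row in rgb]
aligned = warp_with_offsets(thermal, offsets)
```

After warping, `aligned` matches `rgb` in every column except the last, where the shifted content fell outside the image and cannot be recovered by resampling alone.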
