MGCR-Net: Multimodal Graph-Conditioned Vision-Language Reconstruction Network for Remote Sensing Change Detection
Chengming Wang, Guodong Fan, Jinjiang Li, Min Gan, C. L. Philip Chen

TL;DR
MGCR-Net introduces a multimodal graph-conditioned approach utilizing vision-language reconstruction and large language models to enhance change detection in remote sensing imagery, achieving superior results over existing methods.
Contribution
The paper presents the first integration of multimodal graph-conditioned vision-language reconstruction in remote sensing change detection, leveraging large language models for semantic feature enhancement.
Findings
MGCR-Net outperforms mainstream change detection methods on four datasets.
The proposed model effectively fuses visual and textual features for improved accuracy.
Experimental results demonstrate the superiority of multimodal interaction in RSCD tasks.
Abstract
With the advancement of remote sensing satellite technology and the rapid progress of deep learning, remote sensing change detection (RSCD) has become a key technique for regional monitoring. Traditional change detection (CD) methods and deep learning-based approaches have made significant contributions to change analysis and detection; however, many outstanding methods still face limitations in the exploration and application of multimodal data. To address this, we propose the multimodal graph-conditioned vision-language reconstruction network (MGCR-Net) to further explore the semantic interaction capabilities of multimodal data. Multimodal large language models (MLLMs) have attracted widespread attention for their outstanding performance in computer vision, particularly due to their powerful visual-language understanding and dialogic interaction capabilities. Specifically, we design a…
