MGCR-Net: Multimodal Graph-Conditioned Vision-Language Reconstruction Network for Remote Sensing Change Detection

Chengming Wang; Guodong Fan; Jinjiang Li; Min Gan; C. L. Philip Chen

arXiv:2508.01555 · eess.IV · March 11, 2026

TL;DR

MGCR-Net is a multimodal graph-conditioned vision-language reconstruction network that uses multimodal large language models to improve change detection in remote sensing imagery. The authors report that it outperforms mainstream change detection methods on four datasets.

Contribution

The paper presents the first integration of multimodal graph-conditioned vision-language reconstruction in remote sensing change detection, leveraging large language models for semantic feature enhancement.

Findings

01. MGCR-Net outperforms mainstream change detection methods on four datasets.

02. The proposed model effectively fuses visual and textual features for improved accuracy.

03. Experimental results demonstrate the superiority of multimodal interaction in RSCD tasks.

Abstract

With the advancement of remote sensing satellite technology and the rapid progress of deep learning, remote sensing change detection (RSCD) has become a key technique for regional monitoring. Traditional change detection (CD) methods and deep learning-based approaches have made significant contributions to change analysis and detection; however, many outstanding methods still face limitations in the exploration and application of multimodal data. To address this, we propose the multimodal graph-conditioned vision-language reconstruction network (MGCR-Net) to further explore the semantic interaction capabilities of multimodal data. Multimodal large language models (MLLMs) have attracted widespread attention for their outstanding performance in computer vision, particularly due to their powerful vision-language understanding and dialogue interaction capabilities. Specifically, we design a…
