Semantic-CC: Boosting Remote Sensing Image Change Captioning via Foundational Knowledge and Semantic Guidance
Yongshuo Zhu, Lu Li, Keyan Chen, Chenyang Liu, Fugen Zhou, Zhenwei Shi

TL;DR
Semantic-CC is a remote sensing image change captioning method that draws on the latent knowledge of foundation models and on pixel-level semantic guidance from change detection to produce more accurate and comprehensive change descriptions than existing methods.
Contribution
The paper proposes a new change captioning approach using foundation models and semantic guidance, with a multi-task training strategy for improved accuracy and robustness.
Findings
Semantic-CC outperforms existing methods on the LEVIR-CC and LEVIR-CD datasets.
The approach effectively combines change detection and captioning tasks.
Joint training enhances the quality of change descriptions.
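To make the idea of joint training concrete, here is a minimal sketch of one common way to combine a change detection objective with a captioning objective: a weighted sum of a pixel-level binary cross-entropy and a token-level cross-entropy. This is an illustration under stated assumptions, not the paper's actual loss. The function names, the `cd_weight` parameter, and the specific loss terms are all hypothetical; the summary above does not specify how Semantic-CC balances the two tasks.

```python
import math

EPS = 1e-7  # guards against log(0)

def cd_bce_loss(pred_probs, target_mask):
    """Mean binary cross-entropy over flattened pixels (change detection).

    pred_probs: predicted change probabilities in [0, 1], one per pixel.
    target_mask: ground-truth labels, 1 for changed and 0 for unchanged.
    """
    total = 0.0
    for p, t in zip(pred_probs, target_mask):
        p = min(max(p, EPS), 1.0 - EPS)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
    return total / len(pred_probs)

def caption_ce_loss(token_probs, target_ids):
    """Mean negative log-likelihood of the reference caption tokens.

    token_probs: one probability distribution over the vocabulary per step.
    target_ids: index of the reference token at each step.
    """
    total = 0.0
    for dist, token_id in zip(token_probs, target_ids):
        total += -math.log(max(dist[token_id], EPS))
    return total / len(target_ids)

def joint_loss(pred_probs, target_mask, token_probs, target_ids, cd_weight=0.5):
    """Hypothetical multi-task objective: captioning loss plus weighted CD loss."""
    cd = cd_bce_loss(pred_probs, target_mask)
    cc = caption_ce_loss(token_probs, target_ids)
    return cc + cd_weight * cd

# Toy usage: two pixels and a one-token caption over a two-word vocabulary.
loss = joint_loss([0.5, 0.5], [1, 0], [[0.5, 0.5]], [0])
```

In a setup like this, the change detection term acts as pixel-level supervision on shared features, which is one plausible reading of how semantic guidance could help the captioning branch. The weighting between the two terms is a tunable choice.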
Abstract
Remote sensing image change captioning (RSICC) aims to articulate the changes in objects of interest within bi-temporal remote sensing images using natural language. Given the limitations of current RSICC methods in expressing general features across multi-temporal and spatial scenarios, and their deficiency in providing granular, robust, and precise change descriptions, we introduce a novel change captioning (CC) method based on the foundational knowledge and semantic guidance, which we term Semantic-CC. Semantic-CC alleviates the dependency of high-generalization algorithms on extensive annotations by harnessing the latent knowledge of foundation models, and it generates more comprehensive and accurate change descriptions guided by pixel-level semantics from change detection (CD). Specifically, we propose a bi-temporal SAM-based encoder for dual-image feature extraction; a multi-task…
