Robust Change Captioning in Remote Sensing: SECOND-CC Dataset and MModalCC Framework
Ali Can Karaca, M. Enes Ozelbas, Saadettin Berber, Orkhan Karimli, Turabi Yildirim, M. Fatih Amasyali

TL;DR
This paper introduces a new high-resolution dataset and a multimodal framework for remote sensing change captioning, significantly improving caption accuracy amidst challenges like illumination and viewpoint variations.
Contribution
The paper presents SECOND-CC, a comprehensive dataset, and MModalCC, a novel multimodal attention-based framework, advancing the state-of-the-art in remote sensing change captioning.
Findings
MModalCC outperforms existing methods, improving BLEU4 by 4.6% and CIDEr by 9.6%.
The dataset contains 6,041 image pairs and 30,205 descriptive sentences.
Attention mechanisms effectively address challenges like illumination and registration errors.
Abstract
Remote sensing image change captioning (RSICC) aims to describe changes between bitemporal images in natural language. Existing methods often fail under challenges such as illumination differences, viewpoint changes, and blur effects, leading to inaccuracies, especially in no-change regions. Moreover, images that are acquired at different spatial resolutions and have registration errors tend to degrade the captions. To address these issues, we introduce SECOND-CC, a novel RSICC dataset featuring high-resolution RGB image pairs, semantic segmentation maps, and diverse real-world scenarios. SECOND-CC contains 6,041 pairs of bitemporal RS images and 30,205 sentences describing the differences between the images. Additionally, we propose MModalCC, a multimodal framework that integrates semantic and visual data using advanced attention mechanisms, including Cross-Modal Cross Attention (CMCA) and Multimodal…
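To make the idea of attending across modalities concrete, here is a minimal sketch of generic scaled dot-product cross-attention, in which tokens from one modality (the queries) attend over tokens from another (the keys and values). It is not the paper's CMCA module: the function names, the toy feature values, and the omission of learned projection matrices are all assumptions made for illustration.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross-attention (illustrative only).

    Each query token attends over all key tokens; the output is the
    attention-weighted sum of the value tokens. Learned projections
    are omitted to keep the sketch short.
    """
    d = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        w = softmax(scores)
        weights.append(w)
        outputs.append([sum(wj * v[c] for wj, v in zip(w, values))
                        for c in range(len(values[0]))])
    return outputs, weights

# Hypothetical toy features: 2 visual tokens and 3 semantic-map tokens, dim 2.
visual = [[1.0, 0.0], [0.0, 1.0]]
semantic = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

# Visual tokens query the semantic tokens.
fused, attn = cross_attention(visual, semantic, semantic)
```

Here `fused` has one output vector per visual token, and each row of `attn` is a probability distribution over the semantic tokens. A real framework would typically add learned query, key, and value projections and multiple heads.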
Taxonomy
Topics: Geographic Information Systems Studies · Genomics and Phylogenetic Studies · Image Retrieval and Classification Techniques
Methods: Softmax · Attention Is All You Need
