Diffusion-RSCC: Diffusion Probabilistic Model for Change Captioning in Remote Sensing Images
Xiaofei Yu, Yitong Li, Jie Ma

TL;DR
This paper introduces Diffusion-RSCC, a diffusion probabilistic model for change captioning in remote sensing images. It describes semantic changes between image pairs while mitigating the impact of pixel-level differences, in support of environmental analysis.
Contribution
The paper proposes a novel diffusion-based approach for remote sensing change captioning, incorporating cross-modal fusion and self-attention modules to enhance caption accuracy and robustness.
Findings
Outperforms existing methods on the LEVIR-CC dataset
Achieves better scores on both traditional and new evaluation metrics
Mitigates the impact of pixel-level differences on change localization
Abstract
Remote sensing image change captioning (RSICC) aims at generating human-like language to describe the semantic changes between bi-temporal remote sensing image pairs. It provides valuable insights into environmental dynamics and land management. Unlike conventional change captioning tasks, RSICC involves not only retrieving relevant information across different modalities and generating fluent captions, but also mitigating the impact of pixel-level differences on terrain change localization. These pixel-level differences, caused by the long time span between acquisitions, decrease the accuracy of the generated captions. Inspired by the remarkable generative power of diffusion models, we propose a probabilistic diffusion model for RSICC to solve the aforementioned problems. In the training process, we construct a noise predictor conditioned on cross-modal features to learn the distribution from the real caption distribution to the…
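To make the idea of a conditioned noise predictor concrete, here is a minimal sketch of a generic DDPM-style training objective in Python. It is not the authors' implementation: the linear beta schedule, the continuous caption embeddings, the array shapes, and the names `make_alpha_bar`, `q_sample`, `noise_prediction_loss`, and `toy_predictor` are all assumptions made for illustration, and the toy predictor merely stands in for a network conditioned on cross-modal features.

```python
import numpy as np

def make_alpha_bar(num_steps=1000, beta_start=1e-4, beta_end=0.02):
    """Cumulative product of (1 - beta_t) for a linear beta schedule (assumed)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    return np.cumprod(1.0 - betas)

def q_sample(x0, t, alpha_bar, noise):
    """Forward process: x_t = sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    a = alpha_bar[t]
    return np.sqrt(a) * x0 + np.sqrt(1.0 - a) * noise

def noise_prediction_loss(predictor, x0, cond, t, alpha_bar, rng):
    """Mean squared error between the sampled noise and the predictor's estimate."""
    noise = rng.standard_normal(x0.shape)
    x_t = q_sample(x0, t, alpha_bar, noise)
    pred = predictor(x_t, cond, t)
    return float(np.mean((pred - noise) ** 2))

def toy_predictor(x_t, cond, t):
    """Hypothetical stand-in for a trained network.

    x_t:  (seq_len, dim) noisy caption embeddings (assumed continuous).
    cond: (num_regions, dim) cross-modal features (assumed shape).
    The timestep t is accepted but unused in this toy version.
    """
    pooled = cond.mean(axis=0, keepdims=True)  # (1, dim), broadcast over tokens
    return 0.1 * x_t + 0.1 * pooled

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    alpha_bar = make_alpha_bar()
    caption_emb = rng.standard_normal((5, 8))  # 5 tokens, 8-dim embeddings
    image_feats = rng.standard_normal((3, 8))  # 3 hypothetical feature vectors
    loss = noise_prediction_loss(toy_predictor, caption_emb, image_feats,
                                 t=500, alpha_bar=alpha_bar, rng=rng)
    print(f"noise-prediction loss at t=500: {loss:.4f}")
```

In a real system the predictor would be a trained network and the loss would be averaged over randomly sampled timesteps and caption pairs; how the paper embeds discrete caption tokens and fuses the bi-temporal image features is not specified in the text above.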
Taxonomy
Topics: Remote-Sensing Image Classification
Methods: Diffusion
