Towards Temporal Change Explanations from Bi-Temporal Satellite Images

Ryo Tsujimoto; Hiroki Ouchi; Hidetaka Kamigaito; Taro Watanabe

arXiv:2407.09548·cs.CV·July 16, 2024

TL;DR

This paper explores using large-scale vision-language models to explain temporal changes in satellite images, proposing prompting methods to handle image pairs and demonstrating the effectiveness of step-by-step reasoning prompts.

Contribution

It introduces three prompting methods for LVLMs to analyze bi-temporal satellite images and shows the effectiveness of step-by-step reasoning prompts through human evaluation.

Findings

01. Step-by-step reasoning prompts improve explanation quality.

02. LVLMs can be adapted for bi-temporal satellite image analysis.

03. Prompting methods enhance human-AI collaboration in change explanation.

Abstract

Explaining temporal changes between satellite images taken at different times is important for urban planning and environmental monitoring. However, manually constructing datasets for this task is costly, so human-AI collaboration is promising. As a step in this direction, we investigate the ability of Large-scale Vision-Language Models (LVLMs) to explain temporal changes between satellite images. While LVLMs are known to generate good image captions, they receive only a single image as input. To handle a pair of satellite images as input, we propose three prompting methods. Human evaluation shows that our prompting method based on step-by-step reasoning is effective.
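
The abstract does not spell out the three prompting methods, so the following is only a minimal sketch of one plausible way to drive a single-image model with an image pair in a step-by-step fashion: describe each image separately, then reason over the two descriptions. The `describe` and `reason` callables, the prompt wording, and the function names are all hypothetical and are not taken from the paper.

```python
from typing import Callable

# Hypothetical interfaces (assumptions, not the paper's API):
#   Describer: (image_path, prompt) -> text, standing in for a single-image LVLM call
#   Reasoner:  (prompt) -> text, standing in for a text-only follow-up call
Describer = Callable[[str, str], str]
Reasoner = Callable[[str], str]

# Illustrative prompt wording; the paper's actual prompts are not given in the abstract.
DESCRIBE_PROMPT = (
    "Describe the land cover, buildings, and roads visible in this satellite image."
)


def build_comparison_prompt(before_desc: str, after_desc: str) -> str:
    """Combine two single-image descriptions into one step-by-step comparison prompt."""
    return (
        "Description of the earlier satellite image:\n"
        f"{before_desc}\n\n"
        "Description of the later satellite image:\n"
        f"{after_desc}\n\n"
        "Step by step, list what differs between the two descriptions, "
        "then summarize the temporal changes in one paragraph."
    )


def explain_change(
    before_path: str,
    after_path: str,
    describe: Describer,
    reason: Reasoner,
) -> str:
    """Describe each image on its own, then ask for a change explanation."""
    # Step 1: the model only ever sees one image per call.
    before_desc = describe(before_path, DESCRIBE_PROMPT)
    after_desc = describe(after_path, DESCRIBE_PROMPT)
    # Step 2: the comparison is done over text, so no image pair is needed.
    return reason(build_comparison_prompt(before_desc, after_desc))
```

Passing the model calls in as plain callables keeps the sketch independent of any particular LVLM library; in practice each callable would wrap whatever model endpoint is available.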

Taxonomy

Topics: Geochemistry and Geologic Mapping