ChangeChat: An Interactive Model for Remote Sensing Change Analysis via Multimodal Instruction Tuning

Pei Deng; Wenqian Zhou; Hanlin Wu

arXiv:2409.08582 · cs.CV · September 16, 2024

TL;DR

ChangeChat is a vision-language model tailored to remote sensing change analysis. It supports interactive, multimodal queries, and the authors report that it surpasses existing methods in performance and versatility.

Contribution

The paper introduces the first bitemporal vision-language model for remote sensing (RS) change analysis and develops ChangeChat-87k, a large multimodal dataset for training and evaluation.

Findings

01. ChangeChat achieves state-of-the-art performance on change captioning and localization tasks.

02. The model outperforms GPT-4 on specific RS change analysis benchmarks.

03. The ChangeChat-87k dataset enhances model training and generalization.

Abstract

Remote sensing (RS) change analysis is vital for monitoring Earth's dynamic processes by detecting alterations in images over time. Traditional change detection excels at identifying pixel-level changes but lacks the ability to contextualize these alterations. While recent advancements in change captioning offer natural language descriptions of changes, they do not support interactive, user-specific queries. To address these limitations, we introduce ChangeChat, the first bitemporal vision-language model (VLM) designed specifically for RS change analysis. ChangeChat utilizes multimodal instruction tuning, allowing it to handle complex queries such as change captioning, category-specific quantification, and change localization. To enhance the model's performance, we developed the ChangeChat-87k dataset, which was generated using a combination of rule-based methods and GPT-assisted…
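
The abstract mentions rule-based generation of instruction data and category-specific quantification, but does not say how either is done. The following is a minimal sketch of what one rule-based sample might look like, not the authors' pipeline. It assumes a hypothetical per-pixel change mask with made-up class ids; the function name `make_quantification_sample`, the `CATEGORIES` mapping, and the question template are all illustrative.

```python
from collections import Counter

# Hypothetical class ids for a per-pixel change mask (an assumption, not from the paper).
CATEGORIES = {1: "building", 2: "road"}


def make_quantification_sample(change_mask, category_id):
    """Build one instruction/answer pair from a 2D change mask (a list of rows)."""
    # Count how many pixels carry each class id across the whole mask.
    counts = Counter(pixel for row in change_mask for pixel in row)
    total = sum(counts.values())
    changed = counts.get(category_id, 0)
    # Guard against an empty mask to avoid dividing by zero.
    share = 100.0 * changed / total if total else 0.0
    name = CATEGORIES[category_id]
    return {
        "instruction": f"What percentage of the image area shows {name} change?",
        "answer": f"About {share:.1f}% of the image area shows {name} change.",
    }


# Toy 4x4 mask: 0 = no change, 1 = building change, 2 = road change.
mask = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 0, 0],
    [0, 2, 0, 0],
]
sample = make_quantification_sample(mask, 1)
print(sample["instruction"])
print(sample["answer"])
```

Pairs of this kind could be combined with a bitemporal image pair to form instruction-tuning records; how the actual ChangeChat-87k samples are templated is not stated in the text above.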

Code & Models

Repositories

hanlinwu/changechat (PyTorch, official)

Taxonomy

Topics: Geographic Information Systems Studies

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Label Smoothing · Byte Pair Encoding · Absolute Position Encodings · Softmax · Layer Normalization · Position-Wise Feed-Forward Layer · Dropout