DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response

Junjue Wang; Weihao Xuan; Heli Qi; Zhihao Liu; Kunyi Liu; Yuhan Wu; Hongruixuan Chen; Jian Song; Junshi Xia; Zhuo Zheng; Naoto Yokoya

arXiv:2505.21089 · cs.CV · October 22, 2025 · 2 cites

TL;DR

DisasterM3 is a comprehensive remote sensing vision-language dataset designed for global disaster assessment, enabling improved multi-task understanding and response to diverse natural and man-made disasters across various sensors and regions.

Contribution

We created a large-scale, multi-hazard, multi-sensor, multi-task dataset for disaster assessment, and demonstrated its effectiveness in fine-tuning models for better disaster understanding and response.

Findings

1. State-of-the-art VLMs perform poorly on disaster tasks.

2. Fine-tuning improves model performance and generalization.

3. DisasterM3 enhances cross-sensor and cross-disaster robustness.

Abstract

Large vision-language models (VLMs) have made great achievements in Earth vision. However, complex disaster scenes with diverse disaster types, geographic regions, and satellite sensors have posed new challenges for VLM applications. To fill this gap, we curate a remote sensing vision-language dataset (DisasterM3) for global-scale disaster assessment and response. DisasterM3 includes 26,988 bi-temporal satellite images and 123k instruction pairs across 5 continents, with three characteristics: 1) Multi-hazard: DisasterM3 involves 36 historical disaster events with significant impacts, which are categorized into 10 common natural and man-made disasters. 2) Multi-sensor: Extreme weather during disasters often hinders optical sensor imaging, making it necessary to combine Synthetic Aperture Radar (SAR) imagery for post-disaster scenes. 3) Multi-task: Based on real-world scenarios,…

Videos

DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response (SlidesLive)

Taxonomy

Topics: Remote-Sensing Image Classification