Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models
Jie Wei, Erika Ardiles-Cruz, Aleksey Panasyuk, Erik Blasch

TL;DR
This paper proposes a novel method that leverages vision-language models to fuse imagery with human knowledge, generating diverse damage data to improve classification accuracy in disaster response scenarios.
Contribution
It introduces a new damage data generation approach using vision-language models to address data scarcity and labeling issues in humanitarian disaster assessment.
Findings
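Initial experiments suggest encouraging data generation quality.
The generated data improves classification of scenes with different levels of structural damage.
The approach targets class imbalance, scarce moderate-damage examples, and inaccurate human pixel labeling.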
Abstract
Prompt and accurate damage assessment is crucial in humanitarian assistance and disaster response (HADR). Current deep learning approaches struggle to generalize because of imbalanced data classes, scarce moderate-damage examples, and inaccurate human pixel labeling during HADR situations. To address these limitations, state-of-the-art vision-language models (VLMs) can be used to fuse imagery with human knowledge and generate a diversified set of image-based damage data effectively. Our initial experimental results suggest encouraging data generation quality and an improvement in classifying scenes with different levels of structural damage to buildings, roads, and infrastructure.
Taxonomy
Topics: Infrastructure Maintenance and Monitoring · Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning
