DisasterInsight: A Multimodal Benchmark for Function-Aware and Grounded Disaster Assessment
By: Sara Tehrani, Yonghao Xu, Leif Haglund and more
Potential Business Impact:
Helps computers understand disaster damage from pictures.
Timely interpretation of satellite imagery is critical for disaster response, yet existing vision-language benchmarks for remote sensing largely focus on coarse labels and image-level recognition, overlooking the functional understanding and instruction robustness required in real humanitarian workflows. We introduce DisasterInsight, a multimodal benchmark designed to evaluate vision-language models (VLMs) on realistic disaster analysis tasks. DisasterInsight restructures the xBD dataset into approximately 112K building-centered instances and supports instruction-diverse evaluation across multiple tasks, including building-function classification, damage-level and disaster-type classification, counting, and structured report generation aligned with humanitarian assessment guidelines. To establish domain-adapted baselines, we propose DI-Chat, obtained by fine-tuning existing VLM backbones on disaster-specific instruction data using parameter-efficient Low-Rank Adaptation (LoRA). Extensive experiments on state-of-the-art generic and remote-sensing VLMs reveal substantial performance gaps across tasks, particularly in damage understanding and structured report generation. DI-Chat achieves significant improvements on damage-level and disaster-type classification as well as report generation quality, while building-function classification remains challenging for all evaluated models. DisasterInsight provides a unified benchmark for studying grounded multimodal reasoning in disaster imagery.
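As a rough illustration of the parameter-efficient adaptation mentioned in the abstract, the sketch below shows the generic LoRA formulation: a frozen weight matrix W is augmented with a trainable low-rank update, giving W + (alpha / r) * B A. The abstract does not state the rank, scaling factor, or which layers DI-Chat adapts, so the sizes and the helper names (`matmul`, `lora_effective_weight`, `trainable_params`) are assumptions made purely for illustration, not the authors' configuration.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, the weight used in the forward pass."""
    r = len(a)  # rank r = number of rows of A
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update with the same shape as W
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Number of trainable values in B (d_out x r) and A (r x d_in)."""
    return r * (d_out + d_in)


random.seed(0)
d_out, d_in, r, alpha = 6, 8, 2, 4  # hypothetical sizes, not from the paper

# Frozen pretrained weight (random stand-in for a real backbone layer).
w = [[random.gauss(0, 1) for _ in range(d_in)] for _ in range(d_out)]
# A starts with small random values, B starts at zero, so the update is
# initially zero and the adapted layer behaves exactly like the frozen one.
a = [[random.gauss(0, 0.01) for _ in range(d_in)] for _ in range(r)]
b = [[0.0] * r for _ in range(d_out)]

w_eff = lora_effective_weight(w, a, b, alpha)
print("unchanged at init:", w_eff == w)
print("trainable:", trainable_params(d_out, d_in, r), "of", d_out * d_in)
```

Only A and B would be updated during fine-tuning, which is why the trainable parameter count grows with r * (d_out + d_in) rather than with d_out * d_in. The saving is modest at these toy sizes and becomes large for the layer widths of real VLM backbones.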
Similar Papers
DisasterM3: A Remote Sensing Vision-Language Dataset for Disaster Damage Assessment and Response
CV and Pattern Recognition
Helps computers understand disaster damage from space.
Effective Damage Data Generation by Fusing Imagery with Human Knowledge Using Vision-Language Models
CV and Pattern Recognition
Helps rescue teams quickly see disaster damage.
DisasterVQA: A Visual Question Answering Benchmark Dataset for Disaster Scenes
CV and Pattern Recognition
Helps computers understand disaster damage from photos.