EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues
Sagar Soni, Akshay Dudhane, Hiyam Debary, Mustansar Fiaz, Muhammad Akhtar Munir, Muhammad Sohail Danish, Paolo Fraccaro, Campbell D Watson, Levente J Klein, Fahad Shahbaz Khan, Salman Khan

TL;DR
EarthDial is a specialized conversational AI for Earth Observation data that supports multi-sensory, multi-temporal, and multi-resolution analysis, enabling diverse remote sensing tasks through extensive instruction tuning.
Contribution
The paper introduces EarthDial, a novel Earth Observation-specific Vision-Language Model with a large instruction tuning dataset covering multiple modalities and resolutions, surpassing existing models in EO tasks.
Findings
Outperforms existing models on 44 EO datasets
Supports multi-spectral, multi-temporal, and multi-resolution data
Enables diverse remote sensing applications
Abstract
Automated analysis of vast Earth observation data via interactive Vision-Language Models (VLMs) can unlock new opportunities for environmental monitoring, disaster response, and resource management. Existing generic VLMs do not perform well on Remote Sensing data, while the recent Geo-spatial VLMs remain restricted to a fixed resolution and few sensor modalities. In this paper, we introduce EarthDial, a conversational assistant specifically designed for Earth Observation (EO) data, transforming complex, multi-sensory Earth observations into interactive, natural language dialogues. EarthDial supports multi-spectral, multi-temporal, and multi-resolution imagery, enabling a wide range of remote sensing tasks, including classification, detection, captioning, question answering, visual reasoning, and visual grounding. To achieve this, we introduce an extensive instruction tuning dataset…
Taxonomy
Topics: Geographic Information Systems Studies · Speech and dialogue systems · Semantic Web and Ontologies
