Geospatial Chain of Thought Reasoning for Enhanced Visual Question Answering on Satellite Imagery
Shambhavi Shanker, Manikandan Padmanaban, Jagabondhu Hazra

TL;DR
This paper introduces a geospatial chain of thought reasoning framework combined with direct preference optimization to significantly improve the interpretability, robustness, and accuracy of visual question answering on satellite imagery for climate-related applications.
Contribution
It presents a novel VQA framework that integrates CoT reasoning with DPO, enhancing reasoning capabilities and performance on complex geospatial tasks.
Findings
CoT supervision improves accuracy by 34.9% over baselines
DPO further enhances accuracy and reasoning quality
Enables richer geospatial reasoning for climate applications
Abstract
Geospatial chain of thought (CoT) reasoning is essential for advancing Visual Question Answering (VQA) on satellite imagery, particularly in climate-related applications such as disaster monitoring, infrastructure risk assessment, urban resilience planning, and policy support. Existing VQA models enable scalable interpretation of remote sensing data but often lack the structured reasoning required for complex geospatial queries. We propose a VQA framework that integrates CoT reasoning with Direct Preference Optimization (DPO) to improve interpretability, robustness, and accuracy. By generating intermediate rationales, the model better handles tasks involving detection, classification, spatial relations, and comparative analysis, which are critical for reliable decision support in high-stakes climate domains. Experiments show that CoT supervision improves accuracy by 34.9% over direct…
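The abstract names DPO but this excerpt does not describe the training setup. As a rough illustration only, the sketch below computes the standard DPO loss for a single preference pair, where the "chosen" and "rejected" responses could be a preferred and a dispreferred rationale for the same question. The function name `dpo_loss`, its argument names, and the default `beta=0.1` are assumptions for this example, not details taken from the paper.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) pair.

    Inputs are summed token log-probabilities of each response under the
    trainable policy and under a frozen reference model. All names and the
    default beta are illustrative assumptions.
    """
    # How much more the policy favors each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)

    # loss = -log(sigmoid(logits)), written in a numerically stable form.
    if logits >= 0:
        return math.log1p(math.exp(-logits))
    return -logits + math.log1p(math.exp(logits))


# Hypothetical log-probabilities: the policy prefers the chosen rationale
# more strongly than the reference does, so the loss falls below log(2).
example = dpo_loss(-1.0, -5.0, -2.0, -2.0)
```

When the policy and the reference agree exactly, the loss equals log 2; it shrinks as the policy shifts probability toward the preferred response relative to the reference.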
Taxonomy
Topics: Multimodal Machine Learning Applications · Constraint Satisfaction and Optimization · Advanced Image and Video Retrieval Techniques
