Observing Health Outcomes Using Remote Sensing Imagery and Geo-Context Guided Visual Transformer
Yu Li, Guilherme N. DeSouza, Praveen Rao, Chi-Ren Shyu

TL;DR
This paper introduces a geospatially guided visual transformer that integrates remote sensing imagery with auxiliary geospatial data to improve the accuracy of disease prevalence prediction.
Contribution
It presents a new geospatial embedding and guided attention mechanism tailored for multimodal remote sensing analysis, enhancing interpretability and performance.
Findings
Outperforms existing geospatial models in disease prediction
Effective integration of diverse geospatial data types
Improved interpretability of model predictions
Abstract
Visual transformers have driven major progress in remote sensing image analysis, particularly in object detection and segmentation. Recent vision-language and multimodal models further extend these capabilities by incorporating auxiliary information, including captions, question and answer pairs, and metadata, which broadens applications beyond conventional computer vision tasks. However, these models are typically optimized for semantic alignment between visual and textual content rather than geospatial understanding, and therefore are not suited for representing or reasoning with structured geospatial layers. In this study, we propose a novel model that enhances remote sensing imagery processing with guidance from auxiliary geospatial information. Our approach introduces a geospatial embedding mechanism that transforms diverse geospatial data into embedding patches that are spatially…
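The idea of turning geospatial layers into embedding patches can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract is truncated before the details, so the grid layout, the random linear projections, the array sizes, and the names `patchify`, `image`, `geo`, `w_img`, and `w_geo` are all assumptions made for illustration. It shows only one plausible reading, in which an image and a co-registered stack of geospatial layers are cut on the same patch grid so that token i of each modality covers the same ground area.

```python
import numpy as np


def patchify(raster: np.ndarray, patch: int) -> np.ndarray:
    """Split an (H, W, C) raster into non-overlapping patch vectors.

    Patches are ordered row-major over the patch grid, so index i refers
    to the same spatial cell for any raster with the same H, W and patch.
    """
    h, w, c = raster.shape
    if h % patch or w % patch:
        raise ValueError("raster size must be a multiple of the patch size")
    gh, gw = h // patch, w // patch
    x = raster.reshape(gh, patch, gw, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)  # (grid_row, grid_col, patch, patch, c)
    return x.reshape(gh * gw, patch * patch * c)


rng = np.random.default_rng(0)

# Hypothetical inputs: a 3-band image and 2 co-registered geospatial layers
# (for example rasterized land cover and elevation), both 64 x 64 pixels.
image = rng.random((64, 64, 3))
geo = rng.random((64, 64, 2))

patch, dim = 16, 32  # assumed patch size and embedding width

# Random matrices stand in for learned linear projections.
w_img = rng.normal(scale=0.02, size=(patch * patch * 3, dim))
w_geo = rng.normal(scale=0.02, size=(patch * patch * 2, dim))

img_tokens = patchify(image, patch) @ w_img  # one token per image patch
geo_tokens = patchify(geo, patch) @ w_geo    # one token per geospatial patch

# Both token sequences share the same 4 x 4 patch grid.
print(img_tokens.shape, geo_tokens.shape)
```

Because both modalities share one patch grid, a later attention layer could pair each image token with the geospatial token at the same index. How the paper actually uses that alignment for guidance is not stated in the text available here.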
Taxonomy
Topics: Multimodal Machine Learning Applications · Geographic Information Systems Studies · Remote-Sensing Image Classification
