On the Opportunities and Challenges of Foundation Models for Geospatial Artificial Intelligence
Gengchen Mai, Weiming Huang, Jin Sun, Suhang Song, Deepak Mishra, Ninghao Liu, Song Gao, Tianming Liu, Gao Cong, Yingjie Hu, Chris Cundy, Ziyuan Li, Rui Zhu, Ni Lao

TL;DR
This paper explores the potential and challenges of developing multimodal foundation models for geospatial AI, evaluating existing models across diverse tasks and emphasizing the importance of multimodal reasoning.
Contribution
It provides an empirical assessment of foundation models on geospatial tasks and discusses the challenges of multimodality in GeoAI, proposing directions for future multimodal foundation models.
Findings
LLMs outperform task-specific models on text-only geospatial tasks in zero-shot/few-shot settings.
Foundation models underperform on multimodal geospatial tasks compared to specialized models.
Addressing multimodality is crucial for developing effective foundation models for GeoAI.
Abstract
Large pre-trained models, also known as foundation models (FMs), are trained in a task-agnostic manner on large-scale data and can be adapted to a wide range of downstream tasks by fine-tuning, few-shot, or even zero-shot learning. Despite their successes in language and vision tasks, we have yet to see an attempt to develop foundation models for geospatial artificial intelligence (GeoAI). In this work, we explore the promises and challenges of developing multimodal foundation models for GeoAI. We first investigate the potential of many existing FMs by testing their performance on seven tasks across multiple geospatial subdomains, including Geospatial Semantics, Health Geography, Urban Geography, and Remote Sensing. Our results indicate that on several geospatial tasks that involve only the text modality, such as toponym recognition, location description recognition, and US…
Taxonomy
Topics: Data-Driven Disease Surveillance · Human Mobility and Location-Based Analysis
