GEO-Bench-2: From Performance to Capability, Rethinking Evaluation in Geospatial AI
Naomi Simumba, Nils Lehmann, Paolo Fraccaro, Hamed Alemohammad, Geeth De Mel, Salman Khan, Manil Maskey, Nicolas Longepe, Xiao Xiang Zhu, Hannah Kerner, Juan Bernabe-Moreno, Alexandre Lacoste

TL;DR
GEO-Bench-2 introduces a comprehensive, standardized evaluation framework for Geospatial Foundation Models, enabling nuanced performance assessment across diverse tasks and datasets, and highlighting the importance of task-specific model selection.
Contribution
It provides a flexible, capability-based benchmarking protocol and a diverse dataset suite, facilitating fair comparison and research into model adaptation strategies for GeoFMs.
Findings
No single model dominates across all tasks
Pretraining on natural images benefits high-resolution tasks
EO-specific models excel in multispectral applications
Abstract
Geospatial Foundation Models (GeoFMs) are transforming Earth Observation (EO), but evaluation lacks standardized protocols. GEO-Bench-2 addresses this with a comprehensive framework spanning classification, segmentation, regression, object detection, and instance segmentation across 19 permissively-licensed datasets. We introduce "capability" groups to rank models on datasets that share common characteristics (e.g., resolution, bands, temporality). This enables users to identify which models excel in each capability and determine which areas need improvement in future work. To support both fair comparison and methodological innovation, we define a prescriptive yet flexible evaluation protocol. This not only ensures consistency in benchmarking but also facilitates research into model adaptation strategies, a key and open challenge in advancing GeoFMs for downstream tasks. Our…
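The capability-group idea described in the abstract can be illustrated with a small sketch. Everything here is an assumption made for illustration: the model names, dataset names, scores, and group memberships are hypothetical, and the plain mean used for aggregation is a stand-in, not necessarily the aggregation GEO-Bench-2 itself uses.

```python
from statistics import mean

# Hypothetical per-dataset scores in [0, 1], higher is better.
# These are illustrative numbers, not results from the paper.
scores = {
    "model_a": {"ds1": 0.80, "ds2": 0.60, "ds3": 0.70},
    "model_b": {"ds1": 0.70, "ds2": 0.75, "ds3": 0.65},
}

# Hypothetical capability groups: each maps to the datasets that share
# a characteristic. A dataset may belong to more than one group.
capabilities = {
    "high_resolution": ["ds1"],
    "multispectral": ["ds2", "ds3"],
    "multitemporal": ["ds3"],
}


def rank_by_capability(scores, capabilities):
    """Return, for each capability, model names sorted best-first.

    Aggregation is a simple mean over the group's datasets (an assumption).
    """
    rankings = {}
    for capability, datasets in capabilities.items():
        group_score = {
            model: mean(per_dataset[d] for d in datasets)
            for model, per_dataset in scores.items()
        }
        rankings[capability] = sorted(group_score, key=group_score.get, reverse=True)
    return rankings


rankings = rank_by_capability(scores, capabilities)
for capability, order in rankings.items():
    print(capability, order)
```

With these made-up numbers, the top-ranked model differs between groups, which is the kind of pattern a per-capability ranking is meant to expose and a single overall score would hide.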
Taxonomy
Topics: Remote-Sensing Image Classification · Geographic Information Systems Studies · Remote Sensing in Agriculture
