Geospatial foundation-model embeddings improve population estimation unevenly across space and scale

Wenbin Zhang; Eimear Cleary; Francisco Rowe; Somnath Chaudhuri; Maksym Bondarenko; Shengjie Lai; Andrew J. Tatem

arXiv:2605.01650·cs.LG·May 5, 2026

TL;DR

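This study benchmarks embeddings from a geospatial foundation model against conventional geospatial covariates for subnational population estimation. The embeddings improve accuracy, most clearly in data-sparse areas, but the gains are uneven across regions and are limited by mismatches in spatial scale.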

Contribution

The paper demonstrates both the effectiveness and the limitations of foundation-model embeddings, relative to traditional covariates, in subnational population estimation.

Findings

01

Population Dynamics Foundation Model (PDFM) embeddings increased predictive fit by a median of 20.1% across country-model comparisons.

02

PDFM reduced Kullback-Leibler divergence by 23.2%.

03

The advantage of PDFM was largest in areas where traditional covariates are weak.
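To make the Kullback-Leibler finding concrete, the following is a minimal sketch of how such a comparison could be computed. It assumes the divergence is taken between observed and predicted population shares across spatial units, which this summary does not confirm. The function names (`population_shares`, `kl_divergence`, `percent_reduction`) and the example counts are hypothetical illustrations, not the authors' code or data.

```python
import math


def population_shares(counts):
    """Normalise non-negative unit counts so they sum to 1."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive value")
    return [c / total for c in counts]


def kl_divergence(observed, predicted, eps=1e-12):
    """KL(P || Q) in nats, where P and Q are observed and predicted shares.

    Assumption: shares are compared unit by unit; eps guards against
    a zero predicted share in a unit with observed population.
    """
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have equal length")
    p = population_shares(observed)
    q = population_shares(predicted)
    return sum(
        pi * math.log(pi / max(qi, eps))
        for pi, qi in zip(p, q)
        if pi > 0  # terms with zero observed share contribute nothing
    )


def percent_reduction(baseline, candidate):
    """Relative reduction of candidate against baseline, in percent."""
    return 100.0 * (baseline - candidate) / baseline


# Made-up counts for four spatial units (illustrative only).
observed = [1200, 300, 4500, 800]
covariate_pred = [900, 500, 4000, 1400]   # hypothetical covariate-based model
embedding_pred = [1100, 350, 4400, 950]   # hypothetical embedding-based model

kl_cov = kl_divergence(observed, covariate_pred)
kl_emb = kl_divergence(observed, embedding_pred)
reduction = percent_reduction(kl_cov, kl_emb)
print(f"KL covariates: {kl_cov:.4f}, KL embeddings: {kl_emb:.4f}")
print(f"Reduction: {reduction:.1f}%")
```

A lower divergence means the predicted spatial distribution of population is closer to the observed one, so a positive reduction favours the embedding-based model in this toy comparison.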

Abstract

Reliable subnational population estimates are essential for applications, yet remain difficult where censuses are sparse, outdated or spatially coarse. Existing population-mapping workflows rely on hand-built geospatial covariates, such as settlement extent, night-time lights, and environmental conditions, which must be assembled and harmonised across scales and geographies. Geospatial foundation models offer an alternative by learning reusable representations of place from more multifaceted and heterogeneous data sources. Here, we benchmark Population Dynamics Foundation Model (PDFM) embeddings against the harmonised geospatial covariates for subnational population estimation in Brazil, Nigeria and the United States. Under geographically structured validation, PDFM increased predictive fit by a median of 20.1% (IQR: 10.0-33.2%, across country-model comparisons) reduction in unexplained…
