MoRA: Mobility as the Backbone for Geospatial Representation Learning at Scale
Ya Wen, Jixuan Cai, Qiyao Ma, Linyan Li, Xinhua Chen, Chris Webster, Yulun Zhou

TL;DR
MoRA introduces a human-centric geospatial embedding framework that uses a mobility graph to fuse multiple data modalities, improving predictions of socio-economic and functional location characteristics by a reported 12.9% over state-of-the-art models.
Contribution
The paper presents MoRA, a novel framework integrating mobility graphs with diverse data sources for scalable, human-centric geospatial representation learning, outperforming existing models.
Findings
MoRA achieves 12.9% better predictive performance than state-of-the-art models.
The model demonstrates predictable scaling behavior similar to LLMs.
It effectively fuses multiple modalities into a compact 128-dimensional embedding.
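The scaling finding above can be made concrete with a small sketch. LLM scaling laws are usually written as power laws, so one common way to check for "predictable scaling" is to fit a straight line in log-log space. The functional form, the helper name `fit_power_law`, and the synthetic numbers below are all assumptions for illustration; this summary does not state which form or data the authors used.

```python
import math

def fit_power_law(sizes, losses):
    """Fit loss ~ a * size**(-b) by ordinary least squares in log-log space.

    Hypothetical helper for illustration; not taken from the paper.
    """
    xs = [math.log(s) for s in sizes]
    ys = [math.log(l) for l in losses]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log(loss) = log(a) - b * log(size), so a = exp(intercept) and b = -slope
    return math.exp(intercept), -slope

# Synthetic, noise-free data generated from a known power law (not paper results).
sizes = [1e4, 1e5, 1e6, 1e7]
losses = [2.0 * s ** -0.25 for s in sizes]
a, b = fit_power_law(sizes, losses)
```

On real measurements the fit would be noisy, and the quality of the straight-line fit in log-log space is what indicates whether scaling is predictable.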
Abstract
Representation learning of geospatial locations remains a core challenge in achieving general geospatial intelligence, with increasingly diverging philosophies and techniques. While Earth observation paradigms excel at depicting locations in their physical states, we claim that a location's comprehensive "meaning" is better grounded in its internal human activity patterns and, crucially, its functional relationships with other locations, as revealed by human movement. We present MoRA, a human-centric geospatial framework that leverages a mobility graph as its core backbone to fuse various data modalities, aiming to learn embeddings that represent the socio-economic context and functional role of a location. MoRA achieves this through the integration of spatial tokenization, GNNs, and asymmetric contrastive learning to align 100M+ POIs, massive remote sensing imagery, and structured…
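To illustrate the contrastive alignment step the abstract mentions, here is a minimal sketch of a one-directional InfoNCE loss. It assumes that "asymmetric" means the loss is computed in one direction only (graph-side anchors scored against modality-side targets); that reading, the function names, and the temperature value are assumptions, not details confirmed by this summary.

```python
import math

def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def one_way_info_nce(anchors, targets, temperature=0.1):
    """One-directional InfoNCE: anchors[i] should match targets[i].

    Hypothetical sketch. Only the anchor -> target direction is scored,
    which is one possible reading of "asymmetric" contrastive learning.
    """
    anchors = [l2_normalize(a) for a in anchors]
    targets = [l2_normalize(t) for t in targets]
    total = 0.0
    for i, a in enumerate(anchors):
        logits = [sum(x * y for x, y in zip(a, t)) / temperature for t in targets]
        # Numerically stable log-sum-exp over all candidate targets.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_denom - logits[i]
    return total / len(anchors)

# Toy 2-D embeddings: matched pairs should score a lower loss than mismatched ones.
graph_side = [[1.0, 0.0], [0.0, 1.0]]
aligned_loss = one_way_info_nce(graph_side, [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = one_way_info_nce(graph_side, [[0.0, 1.0], [1.0, 0.0]])
```

In a real pipeline the anchors and targets would be encoder outputs for the same location, and the loss would be minimized with a deep learning framework rather than evaluated in plain Python.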
Peer Reviews
Decision·ICLR 2026 Poster
* Clear and well-justified framework for graph-based location representation learning via relative transitions across hexagonal grid cells.
* Strong empirical performance on spatial and socioeconomic downstream tasks.
* Well-written and clearly presented.
* Fairly complex methodology, including single-modality encoders and a separate graph neural network. A joint unifying architecture would be less engineering-focused. However, ablations justify the individual components well.
* It would have been nice to also capture natural tasks where mobility matters. For instance, species distribution modelling datasets like iNaturalist or BirdSnap would be good choices here.
- Strong empirical results. MoRA outperforms all baselines across several experiments.
- Ablation studies validate the additional complexity of the method.
- Experiments are done across several training runs, informing reproducibility. Means and standard deviations are provided.
- The introduction of the new geospatial benchmark is noteworthy.
- The experiments around scaling laws are useful for the GeoAI community.
- Heavy reliance on proprietary mobility data: This causes concerns related to reproducibility. I don't believe this resource is publicly available.
- What about rural areas where human activity is sparse?
- Single-country mobility data: The mobility data is unique to China, casting fairly large doubts on the generalizability of the method. For example, does this method generalize to Europe or the US?
- The new benchmark primarily describes human-centric socio-economic tasks. It would
STRENGTHS:
- I like the framing of two paradigms for geo representation learning, the human and EO centric ones. Not sure I necessarily agree but this is an interesting framing to motivate the paper! The argument about the relative nature of a "place" (i.e. a location being majorly defined by its relationship with other locations) is thought provoking and interesting. Overall, really great motivation section!
- Exploring scaling laws for geospatial representation learning is a critical researc
- I would love to have seen a comparison to a simple neighborhood graph; the way I understand it, edges between cells/nodes are based on a-priori known WeChat Pay interactions. This is great if you have access, but what if you didn't have that data? How would this method perform if you simply built your graph based on direct neighborhood/adjacency of cells? This seems like a crucial ablation missing, unless I am missing it. This also would allow the authors to com
Taxonomy
Topics: Human Mobility and Location-Based Analysis · Geographic Information Systems Studies · Data-Driven Disease Surveillance
