Around the World in 24 Hours: Probing LLM Knowledge of Time and Place
Carolin Holtermann, Paul Röttger, Anne Lauscher

TL;DR
This paper evaluates how well large language models understand and reason about time and place, revealing strengths in temporal reasoning but limitations in integrating geographic and temporal knowledge.
Contribution
It introduces GeoTemp, a dataset of 320k prompts for joint reasoning over time and space, and presents the first evaluation of how well language models reason over both together.
Findings
Most models perform well on reasoning tasks that involve only temporal knowledge.
Overall performance improves with model scale.
Connecting temporal and geographic knowledge remains challenging.
Abstract
Reasoning over time and space is essential for understanding our world. However, the abilities of language models in this area are largely unexplored as previous work has tested their abilities for logical reasoning in terms of time and space in isolation or only in simple or artificial environments. In this paper, we present the first evaluation of the ability of language models to jointly reason over time and space. To enable our analysis, we create GeoTemp, a dataset of 320k prompts covering 289 cities in 217 countries and 37 time zones. Using GeoTemp, we evaluate eight open chat models of three different model families for different combinations of temporal and geographic knowledge. We find that most models perform well on reasoning tasks involving only temporal knowledge and that overall performance improves with scale. However, performance remains constrained in tasks that require…
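To make the idea of joint reasoning over time and place concrete, the following is a minimal sketch of how a ground-truth answer for a time-conversion question between two cities might be computed. It is an illustration only and not GeoTemp's actual prompt format or construction method: the city-to-offset table, the prompt wording, and the function names `convert_local_time` and `make_prompt` are all assumptions. The three cities were chosen because they do not observe daylight saving time, so fixed UTC offsets suffice.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical lookup table (not taken from GeoTemp): fixed UTC offsets
# for cities that do not observe daylight saving time.
CITY_UTC_OFFSET = {
    "Tokyo": timedelta(hours=9),
    "New Delhi": timedelta(hours=5, minutes=30),
    "Reykjavik": timedelta(hours=0),
}


def convert_local_time(hhmm: str, source_city: str, target_city: str) -> str:
    """Return the wall-clock time in target_city when it is hhmm in source_city."""
    hour, minute = map(int, hhmm.split(":"))
    source_tz = timezone(CITY_UTC_OFFSET[source_city])
    target_tz = timezone(CITY_UTC_OFFSET[target_city])
    # The date is arbitrary; only the time of day is reported.
    source_dt = datetime(2024, 1, 1, hour, minute, tzinfo=source_tz)
    return source_dt.astimezone(target_tz).strftime("%H:%M")


def make_prompt(hhmm: str, source_city: str, target_city: str) -> str:
    """Build an illustrative question (the wording is an assumption)."""
    return f"It is {hhmm} in {source_city}. What time is it in {target_city}?"


if __name__ == "__main__":
    print(make_prompt("12:00", "Tokyo", "New Delhi"))
    print(convert_local_time("12:00", "Tokyo", "New Delhi"))
```

Answering such a question requires both geographic knowledge (which time zone a city lies in) and temporal arithmetic (applying the offset difference), which is the combination the abstract describes as difficult for models.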
Taxonomy
Topics: Topic Modeling · Multimodal Machine Learning Applications · Human Mobility and Location-Based Analysis
