The World As Large Language Models See It: Exploring the reliability of LLMs in representing geographical features
Omid Reza Abbasi, Franz Welscher, Georg Weinberger, Johannes Scholz

TL;DR
This study evaluates GPT-4o and Gemini 2.0 Flash on three geospatial tasks: geocoding, elevation estimation, and reverse geocoding. Both models can approximate geographic information, but their accuracy is inconsistent, which points to the need for specialized fine-tuning.
Contribution
The paper systematically assesses how LLMs represent geographic features, identifying their limitations and their potential for improvement in GIScience applications.
Findings
Gemini 2.0 Flash outperforms GPT-4o in accuracy and consistency.
Both models tend to underestimate elevations in Austria.
Neither model accurately reconstructs Austria's federal states.
Abstract
As large language models (LLMs) continue to evolve, questions about their trustworthiness in delivering factual information have become increasingly important. This concern also applies to their ability to accurately represent the geographic world. With recent advancements in this field, it is relevant to consider whether and to what extent LLMs' representations of the geographical world can be trusted. This study evaluates the performance of GPT-4o and Gemini 2.0 Flash in three key geospatial tasks: geocoding, elevation estimation, and reverse geocoding. In the geocoding task, both models exhibited systematic and random errors in estimating the coordinates of St. Anne's Column in Innsbruck, Austria, with GPT-4o showing greater deviations and Gemini 2.0 Flash demonstrating more precision but a significant systematic offset. For elevation estimation, both models tended to underestimate…
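The abstract's distinction between systematic and random geocoding error can be illustrated with a minimal sketch. This is not the authors' procedure. It assumes that a model is queried repeatedly for the same landmark, that the systematic component is the distance from the reference point to the centroid of the estimates, and that the random component is the root-mean-square distance of the estimates from that centroid. The function names `haversine_m` and `error_components` are hypothetical, and the centroid is a plain average of latitudes and longitudes, which is only reasonable when the estimates lie close together.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius in metres (spherical approximation)


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def error_components(reference, estimates):
    """Split repeated coordinate estimates into two error terms (assumed definitions).

    reference: (lat, lon) of the ground-truth location.
    estimates: list of (lat, lon) pairs returned by a model across repeated queries.
    Returns (systematic_m, random_rms_m):
      systematic_m  - distance from the reference to the centroid of the estimates
      random_rms_m  - RMS distance of the estimates from their own centroid
    """
    n = len(estimates)
    mean_lat = sum(lat for lat, _ in estimates) / n
    mean_lon = sum(lon for _, lon in estimates) / n
    systematic_m = haversine_m(reference[0], reference[1], mean_lat, mean_lon)
    squared = [haversine_m(mean_lat, mean_lon, lat, lon) ** 2
               for lat, lon in estimates]
    random_rms_m = math.sqrt(sum(squared) / n)
    return systematic_m, random_rms_m
```

Under these definitions, a model whose answers cluster tightly but far from the reference has a large systematic term and a small random term, which matches the pattern the abstract describes for Gemini 2.0 Flash. Widely scattered answers give a large random term regardless of where their centroid falls.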
Taxonomy
Topics: Semantic Web and Ontologies · Geographic Information Systems Studies · Natural Language Processing Techniques
