Measuring Geographic Diversity of Foundation Models with a Natural Language–based Geo-guessing Experiment on GPT-4
Zilong Liu, Krzysztof Janowicz, Kitty Currier, Meilin Shi

TL;DR
This study evaluates GPT-4's geographic knowledge using a natural language geo-guessing experiment, revealing regional disparities and limitations in representing geographic features globally and locally.
Contribution
It introduces a novel natural language-based geo-guessing method to assess the geographic diversity of GPT-4's knowledge, highlighting regional disparities and modality differences.
Findings
GPT-4 may encode insufficient knowledge about several geographic feature types at the global level.
Regional disparities exist in GPT-4's geo-guessing performance.
Inter-model performance varies between unimodal and multimodal GPT-4 variants.
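One way to make the regional-disparity finding concrete is to score guesses separately for each region. The following is a minimal sketch, not the authors' protocol: the record layout (region, true name, guessed name), the case-insensitive exact-match scoring, and the function name `accuracy_by_region` are all assumptions made for illustration, and the toy records are invented placeholders rather than data from the study.

```python
from collections import defaultdict


def accuracy_by_region(records):
    """Return the fraction of correct guesses per region.

    `records` is an iterable of (region, true_name, guessed_name) tuples.
    This layout and the exact-match rule below are illustrative assumptions.
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for region, truth, guess in records:
        totals[region] += 1
        # Assumed scoring rule: case-insensitive match after trimming whitespace.
        if guess.strip().casefold() == truth.strip().casefold():
            hits[region] += 1
    return {region: hits[region] / totals[region] for region in totals}


# Invented placeholder records, not results from the paper.
records = [
    ("Region A", "Lake Example", "Lake Example"),
    ("Region A", "Mount Sample", "mount sample "),
    ("Region B", "Placeholder River", "Other River"),
    ("Region B", "Demo Island", "Demo Island"),
]

scores = accuracy_by_region(records)
for region, score in sorted(scores.items()):
    print(f"{region}: {score:.2f}")
```

A gap between the per-region scores would be the kind of inter-regional disparity the findings describe. A real evaluation would likely need a more forgiving match rule (aliases, transliterations) than the exact match assumed here.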
Abstract
Generative AI based on foundation models provides a first glimpse into the world represented by machines trained on vast amounts of multimodal data ingested by these models during training. If we consider the resulting models as knowledge bases in their own right, this may open up new avenues for understanding places through the lens of machines. In this work, we adopt this thinking and select GPT-4, a state-of-the-art representative in the family of multimodal large language models, to study its geographic diversity regarding how well geographic features are represented. Using DBpedia abstracts as a ground-truth corpus for probing, our natural language–based geo-guessing experiment shows that GPT-4 may currently encode insufficient knowledge about several geographic feature types on a global level. On a local level, we observe not only this insufficiency but also inter-regional…
Taxonomy
Topics: Geographic Information Systems Studies · Semantic Web and Ontologies · Multimodal Machine Learning Applications
