Exploring Geographic Relative Space in Large Language Models through Activation Patching
Stef De Sabbata, Rahul Baiju, Stefano Mizzaro, Kevin Roitero

TL;DR
This extended abstract uses activation patching, a mechanistic interpretability technique, to examine how large language models process relative geographic space. It is motivated by concerns about the safety of using these models in geographic applications.
Contribution
It applies activation patching to the question of how LLMs process relative geographic space, extending mechanistic interpretability to geographic tasks.
Findings
Activation patching reveals how LLMs encode geographic relations.
The study provides insights into the internal representations of geographic concepts.
Results suggest pathways for safer and more reliable geographic LLM applications.
Abstract
The increased use of Large Language Models (LLMs) in geography raises substantial questions about the safety of integrating these tools across a wide range of processes and analyses, given our very limited understanding of their inner workings. In this extended abstract, we examine how LLMs process relative geographic space using activation patching, an emerging tool for mechanistic interpretability.
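For readers unfamiliar with activation patching, the general idea can be sketched on a toy network: run the model on a clean input and cache its internal activations, run it on a corrupted input, then copy one cached activation at a time into the corrupted run and measure how much of the clean output is recovered. The sketch below is only an illustration of that generic procedure. The weights, inputs, and the names forward, W1, W2, and recovery are hypothetical and are not taken from the paper, which works with LLMs rather than a three-unit network.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy model (not from the paper):
# 2 inputs -> 3 hidden units with ReLU -> 1 scalar output.
W1: List[List[float]] = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
W2: List[float] = [2.0, 0.0, 1.0]


def forward(
    x: List[float], patch: Optional[Dict[int, float]] = None
) -> Tuple[float, List[float]]:
    """Run the toy model, optionally overwriting selected hidden units."""
    hidden = [max(0.0, sum(w * xi for w, xi in zip(row, x))) for row in W1]
    if patch:
        for idx, value in patch.items():
            hidden[idx] = value  # the "patch": replace this unit's activation
    output = sum(w * h for w, h in zip(W2, hidden))
    return output, hidden


# Assumed clean and corrupted inputs, chosen only for illustration.
clean_x = [1.0, 0.0]
corrupt_x = [0.0, 1.0]

# 1) Clean run: cache the hidden activations.
clean_out, clean_hidden = forward(clean_x)
# 2) Corrupted run: baseline output without any patching.
corrupt_out, _ = forward(corrupt_x)

# 3) Patch one hidden unit at a time from the clean cache into the
#    corrupted run, and record the fraction of the clean-vs-corrupted
#    output gap that the patch restores.
recovery: List[float] = []
for i in range(len(clean_hidden)):
    patched_out, _ = forward(corrupt_x, patch={i: clean_hidden[i]})
    recovery.append((patched_out - corrupt_out) / (clean_out - corrupt_out))

for i, r in enumerate(recovery):
    print(f"hidden unit {i}: recovered fraction = {r:.2f}")
```

In this toy setup, a unit with a recovered fraction near 1 carries the information that distinguishes the clean input from the corrupted one, while a fraction near 0 means the unit is irrelevant to that difference. In an LLM the same loop would typically run over layers, token positions, or attention heads instead of individual hidden units.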
