From Human Cognition to Neural Activations: Probing the Computational Primitives of Spatial Reasoning in LLMs
Jiyuan An, Liner Yang, Mengyan Wang, Luming Lu, Weihua An, Erhong Yang

TL;DR
This paper investigates how large language models represent and process spatial reasoning, revealing that their internal spatial representations are limited, transient, and context-dependent, which challenges assumptions about their spatial reasoning capabilities.
Contribution
The study introduces a mechanistic framework based on human spatial cognition primitives and evaluates multilingual LLMs, uncovering their limited and fragmented internal spatial representations.
Findings
Spatial information is encoded in intermediate layers and influences behavior.
Representations are transient, fragmented, and weakly integrated into final predictions.
Similar performance can arise from different internal pathways across languages.
Abstract
As spatial intelligence becomes an increasingly important capability for foundation models, it remains unclear whether the performance of large language models (LLMs) on spatial reasoning benchmarks reflects structured internal spatial representations or reliance on linguistic heuristics. We address this question from a mechanistic perspective by examining how spatial information is internally represented and used. Drawing on computational theories of human spatial cognition, we decompose spatial reasoning into three primitives (relational composition, representational transformation, and stateful spatial updating) and design controlled task families for each. We evaluate multilingual LLMs in English, Chinese, and Arabic under single-pass inference, and analyze internal representations using linear probing, sparse-autoencoder-based feature analysis, and causal interventions. We find that task…
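As a rough illustration of the linear probing mentioned in the abstract, the sketch below fits a linear readout per "layer" and compares held-out accuracy. It is not the authors' pipeline: the activations are synthetic Gaussian vectors, the per-layer signal strengths are made up, the binary label (for example "left of" vs. "right of") is an assumption, and the probe is a simple closed-form ridge regression where the paper may well use something different. All function names are hypothetical.

```python
import numpy as np

def make_synthetic_activations(n, dim, signal, rng):
    """Hypothetical stand-in for hidden states: Gaussian noise plus a
    label-dependent shift of size `signal` along one random direction."""
    labels = rng.integers(0, 2, size=n)           # assumed binary spatial relation
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)
    acts = rng.normal(size=(n, dim))
    acts += signal * (2 * labels[:, None] - 1) * direction
    return acts, labels

def fit_linear_probe(x, y, ridge=1e-2):
    """Closed-form ridge regression onto +/-1 targets, with a bias column."""
    xb = np.hstack([x, np.ones((len(x), 1))])
    targets = 2.0 * y - 1.0
    gram = xb.T @ xb + ridge * np.eye(xb.shape[1])
    return np.linalg.solve(gram, xb.T @ targets)

def probe_accuracy(w, x, y):
    """Fraction of examples whose sign-thresholded readout matches the label."""
    xb = np.hstack([x, np.ones((len(x), 1))])
    preds = (xb @ w > 0).astype(int)
    return float((preds == y).mean())

def probe_layers(signals, n_train=400, n_test=400, dim=32, seed=0):
    """Fit one probe per synthetic layer and return held-out accuracies."""
    rng = np.random.default_rng(seed)
    accs = []
    for s in signals:
        acts, labels = make_synthetic_activations(n_train + n_test, dim, s, rng)
        w = fit_linear_probe(acts[:n_train], labels[:n_train])
        accs.append(probe_accuracy(w, acts[n_train:], labels[n_train:]))
    return accs

# Assumed signal strengths: none early, strong in the middle, weak late.
layer_signals = [0.0, 2.0, 0.5]
accuracies = probe_layers(layer_signals)
for layer, acc in enumerate(accuracies):
    print(f"layer {layer}: probe accuracy = {acc:.2f}")
```

With real models, the synthetic activations would be replaced by hidden states extracted at each layer. A probe that is near chance at one layer and accurate at another suggests where the information is linearly decodable, though decodability alone does not show that the model uses it, which is why the abstract pairs probing with causal interventions.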
