LLM world models are mental: Output layer evidence of brittle world model use in LLM mechanical reasoning

Cole Robertson; Philip Wolff

arXiv:2507.15521 · cs.AI · July 22, 2025


TL;DR

This study investigates whether large language models construct internal world models for mechanical reasoning. It finds evidence that they manipulate internal representations to some extent, but that they may lack a detailed understanding of system structure.

Contribution

The paper adapts cognitive science methods from human mental models research to evaluate LLMs' internal world models, and finds that the models make partial use of internal representations in mechanical reasoning tasks.

Findings

01 LLMs estimate mechanical advantage slightly above chance

02 Models can differentiate functional pulley systems from jumbled ones

03 Models struggle to identify systems with no force transfer, indicating limits in structural reasoning

Abstract

Do large language models (LLMs) construct and manipulate internal world models, or do they rely solely on statistical associations represented as output layer token probabilities? We adapt cognitive science methodologies from human mental models research to test LLMs on pulley system problems using TikZ-rendered stimuli. Study 1 examines whether LLMs can estimate mechanical advantage (MA). State-of-the-art models performed marginally but significantly above chance, and their estimates correlated significantly with ground-truth MA. Significant correlations between number of pulleys and model estimates suggest that models employed a pulley counting heuristic, without necessarily simulating pulley systems to derive precise values. Study 2 tested this by probing whether LLMs represent global features crucial to MA estimation. Models evaluated a functionally connected pulley system against a…
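The distinction the abstract draws between a pulley counting heuristic and precise simulation can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy systems, the `count_heuristic` function, and the scoring helpers are hypothetical and are not the paper's stimuli, models, or analysis code. It only shows how an estimator that counts pulleys can correlate strongly with ground-truth MA while still missing exact values.

```python
import math

def count_heuristic(n_pulleys: int) -> float:
    """Hypothetical heuristic: guess that MA equals the number of pulleys."""
    return float(n_pulleys)

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def exact_match_rate(estimates, truth):
    """Fraction of estimates that equal the ground-truth value exactly."""
    hits = sum(1 for e, t in zip(estimates, truth) if e == t)
    return hits / len(truth)

# Toy systems, invented for illustration: (number of pulleys, ideal MA).
# In an ideal block and tackle, MA is the number of rope segments supporting
# the load, which depends on how the rope is rigged, not only on pulley count.
toy_systems = [(1, 1.0), (2, 2.0), (2, 3.0), (3, 3.0), (4, 4.0), (3, 4.0)]

estimates = [count_heuristic(n) for n, _ in toy_systems]
truth = [ma for _, ma in toy_systems]

r = pearson(estimates, truth)
acc = exact_match_rate(estimates, truth)
print(f"correlation with ground truth: {r:.2f}")
print(f"exact-match rate: {acc:.2f}")
```

On these toy values the heuristic tracks ground truth closely without reproducing it exactly, which is why a significant correlation alone does not show that a system is being simulated.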
