General agents contain world models

Jonathan Richens; David Abel; Alexis Bellot; Tom Everitt

arXiv:2506.01622·cs.AI·October 21, 2025

TL;DR

This paper shows that any agent able to generalize to multi-step, goal-directed tasks must have learned a predictive world model of its environment, and that this model can be extracted from the agent's policy.

Contribution

It gives a formal proof that general agents require learned world models for multi-step goal-directed behavior, and it provides algorithms for eliciting these models from agents.

Findings

01

Agents need accurate world models for complex, goal-directed tasks.

02

World models can be extracted from an agent's policy.

03

Higher agent performance, or more complex achievable goals, requires an increasingly accurate world model.

Abstract

Are world models a necessary ingredient for flexible, goal-directed behaviour, or is model-free learning sufficient? We provide a formal answer to this question, showing that any agent capable of generalizing to multi-step goal-directed tasks must have learned a predictive model of its environment. We show that this model can be extracted from the agent's policy, and that increasing the agent's performance or the complexity of the goals it can achieve requires learning increasingly accurate world models. This has a number of consequences: from developing safe and general agents, to bounding agent capabilities in complex environments, and providing new algorithms for eliciting world models from agents.
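The claim that a world model can be read off a policy can be made concrete with a toy sketch. This is an illustration under strong assumptions, not the paper's procedure: the agent is assumed to be a perfect goal-directed chooser, the environment has a single unknown transition probability, and the "reference lottery" with known success probability is a hypothetical device introduced here. The names `make_agent`, `prefers_env_goal`, and `estimate_transition_prob` are likewise invented for this example. The point is only that the agent's choices between goals, observed from outside, pin down the probability it must be using internally.

```python
def make_agent(p_true):
    """Hypothetical optimal agent (an assumption, not the paper's agent).

    The agent privately knows p_true, the probability that its action
    reaches the target state. Offered two goals, it picks the one it is
    more likely to achieve. We only ever observe its choice.
    """
    def prefers_env_goal(q):
        # True  -> agent pursues the environment goal (success prob p_true)
        # False -> agent takes the reference lottery (success prob q)
        return p_true >= q
    return prefers_env_goal


def estimate_transition_prob(prefers_env_goal, tol=1e-6):
    """Recover the hidden transition probability by bisection on q.

    Each query asks the agent to choose between the environment goal and
    a reference lottery with known success probability q. The point at
    which its preference flips brackets the hidden probability.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if prefers_env_goal(mid):
            lo = mid  # hidden probability is at least mid
        else:
            hi = mid  # hidden probability is below mid
    return (lo + hi) / 2


agent = make_agent(0.73)
estimate = estimate_transition_prob(agent)
```

The estimate comes purely from observed choices; `p_true` is never read directly. A less capable agent, whose choices are sometimes wrong, would yield a correspondingly noisier estimate, which matches the abstract's link between agent performance and world-model accuracy in spirit only.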
