Better World Models Can Lead to Better Post-Training Performance

Prakhar Gupta; Henry Conklin; Sarah-Jane Leslie; Andrew Lee

arXiv:2512.03400·cs.LG·December 4, 2025

TL;DR

This paper investigates how explicit world-modeling objectives during pretraining affect the internal state representations of Transformers and their performance after reinforcement learning post-training, using a controlled 2x2x2 Rubik's Cube task.

Contribution

It demonstrates that explicit world-model pretraining improves internal representations and post-training performance, especially on difficult tasks, compared to standard next-token prediction.

Findings

01. Explicit world-modeling yields more decodable state representations.

02. Better representations lead to higher gains in post-training performance.

03. Improved state representations particularly benefit harder cube states.
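
To make "more decodable" concrete, here is a minimal sketch of a linear probe on synthetic data. It is not the paper's probing setup: the "hidden states" are random vectors, the decoded property is a single binary feature rather than a full cube state, and every name (`make_hidden_states`, `train_probe`, `held_out_accuracy`, the `signal` parameter) is a hypothetical choice for illustration. The idea it shows is that the same probe reaches higher held-out accuracy when the feature is written more strongly along a linear direction.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_hidden_states(n, dim, signal, rng):
    # Hypothetical stand-in for model activations: unit Gaussian noise, with a
    # binary state feature written along axis 0 at strength `signal`.
    xs, ys = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        x[0] += signal * (2 * y - 1)
        xs.append(x)
        ys.append(y)
    return xs, ys


def train_probe(xs, ys, epochs=200, lr=0.5):
    # Logistic regression fitted by full-batch gradient descent.
    dim, n = len(xs[0]), len(xs)
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for j in range(dim):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def probe_accuracy(w, b, xs, ys):
    # Fraction of examples whose thresholded linear readout matches the label.
    hits = 0
    for x, y in zip(xs, ys):
        pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
        hits += int(pred == y)
    return hits / len(xs)


def held_out_accuracy(signal, dim=8, n=200, seed=0):
    # Train on one synthetic sample, evaluate on a fresh one.
    rng = random.Random(seed)
    train_x, train_y = make_hidden_states(n, dim, signal, rng)
    test_x, test_y = make_hidden_states(n, dim, signal, rng)
    w, b = train_probe(train_x, train_y)
    return probe_accuracy(w, b, test_x, test_y)


if __name__ == "__main__":
    print("strongly encoded:", held_out_accuracy(3.0))
    print("weakly encoded:", held_out_accuracy(0.3))
```

Probe accuracy is only a measure of linear readability; it does not by itself show that the model uses the feature, which is why the abstract pairs probes with causal interventions.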

Abstract

In this work we study how explicit world-modeling objectives affect the internal representations and downstream capability of Transformers across different training stages. We use a controlled 2x2x2 Rubik's Cube and ask: (1) how does explicitly pretraining a world model affect the model's latent representations, and (2) how does world-model quality affect the model's performance after reinforcement learning post-training? We compare standard next-token prediction to two explicit world-modeling strategies -- (i) state-prediction pretraining and (ii) a joint state-prediction + next-token objective -- and assess task performance after Group Relative Policy Optimization (GRPO) is applied as post-training. We evaluate the representation quality with linear probes and causal interventions. We find that explicit world-modeling yields more linearly decodable and causally steerable state…
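
For readers unfamiliar with GRPO, the following is a minimal sketch of its group-relative advantage step as it is commonly described: several completions are sampled for the same prompt, and each reward is standardized against the others in that group. The function name, the `eps` constant, and the use of the population standard deviation are assumptions made here for illustration; the clipped policy-gradient loss and any KL term are omitted, and nothing below is taken from the paper's code.

```python
import statistics


def group_relative_advantages(rewards, eps=1e-8):
    # Standardize each reward against the other samples for the same prompt.
    # Assumption: population standard deviation; eps guards against a group
    # in which every sample received the same reward.
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


if __name__ == "__main__":
    # Hypothetical group of four rollouts: 1.0 if the cube was solved, else 0.0.
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

Under this scheme a group whose rollouts all fail, or all succeed, yields zero advantage for every sample and so contributes no learning signal for that prompt.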

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Artificial Intelligence in Games