From Word to World: Can Large Language Models be Implicit Text-based World Models?
Yixia Li, Hongru Wang, Jiahao Qiu, Zhenfei Yin, Dongdong Zhang, Cheng Qian, Zeping Li, Pony Ma, Guanhua Chen, Heng Ji

TL;DR
This paper investigates whether large language models can serve as effective implicit world models in text-based environments, demonstrating their potential to improve agent learning under certain conditions.
Contribution
It introduces a three-level framework for evaluating LLM-based world models and shows their ability to enhance agent performance in specific scenarios.
Findings
Sufficiently trained LLM world models maintain coherent latent state in text environments
World models scale predictably with data and size
They improve agent learning through various methods
Abstract
Agentic reinforcement learning increasingly relies on experience-driven scaling, yet real-world environments remain non-adaptive, limited in coverage, and difficult to scale. World models offer a potential way to improve learning efficiency through simulated experience, but it remains unclear whether large language models can reliably serve this role and under what conditions they meaningfully benefit agents. We study these questions in text-based environments, which provide a controlled setting to reinterpret language modeling as next-state prediction under interaction. We introduce a three-level framework for evaluating LLM-based world models: (i) fidelity and consistency, (ii) scalability and robustness, and (iii) agent utility. Across five representative environments, we find that sufficiently trained world models maintain coherent latent state, scale predictably with data and model…
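To make the "language modeling as next-state prediction under interaction" framing concrete, here is a minimal sketch in Python. It is not the authors' implementation: the prompt layout (`Observation:` / `Action:` lines), the `TextWorldModel` class, and the `toy_complete` stand-in for an LLM call are all assumptions introduced purely for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class TextWorldModel:
    """Hypothetical wrapper: treats any text-completion function as a next-state predictor."""

    # Assumed interface: takes a prompt string, returns the predicted next observation.
    complete: Callable[[str], str]
    # Interaction history as (action, predicted observation) pairs.
    history: List[Tuple[str, str]] = field(default_factory=list)

    def build_prompt(self, initial_obs: str, action: str) -> str:
        # Serialize the whole interaction so far, ending where the next observation should go.
        lines = [f"Observation: {initial_obs}"]
        for past_action, past_obs in self.history:
            lines.append(f"Action: {past_action}")
            lines.append(f"Observation: {past_obs}")
        lines.append(f"Action: {action}")
        lines.append("Observation:")
        return "\n".join(lines)

    def step(self, initial_obs: str, action: str) -> str:
        # Predict the next observation and record it, so later predictions condition on it.
        prompt = self.build_prompt(initial_obs, action)
        next_obs = self.complete(prompt).strip()
        self.history.append((action, next_obs))
        return next_obs


def toy_complete(prompt: str) -> str:
    """Stand-in for an LLM call (assumption): reacts only to the last action in the prompt."""
    last_action = prompt.rsplit("Action: ", 1)[1].split("\n", 1)[0]
    if last_action == "open door":
        return " The door is now open."
    return " Nothing happens."
```

In practice `toy_complete` would be replaced by a call to a trained world model. Because the history is re-serialized on every step, each prediction conditions on the full simulated trajectory, which is what lets an agent roll out experience without touching the real environment.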
Code & Models
- 🤗 X1AOX1A/WorldModel-Textworld-Llama3.1-8B (model, 153 downloads)
- 🤗 X1AOX1A/WorldModel-Webshop-Llama3.1-8B (model, 433 downloads)
- 🤗 X1AOX1A/WorldModel-Alfworld-Qwen2.5-7B (model, 12 downloads)
- 🤗 X1AOX1A/WorldModel-Sciworld-Qwen2.5-7B (model)
- 🤗 X1AOX1A/WorldModel-Textworld-Qwen2.5-7B (model, 348 downloads)
- 🤗 X1AOX1A/WorldModel-Webshop-Qwen2.5-7B (model, 756 downloads)
- 🤗 X1AOX1A/WorldModel-Alfworld-Llama3.1-8B (model, 1 download)
- 🤗 X1AOX1A/WorldModel-Sciworld-Llama3.1-8B (model)
- 🤗 X1AOX1A/WorldModel-Stabletoolbench-Qwen2.5-7B (model, 3 downloads, 1 like)
- 🤗 X1AOX1A/WorldModel-Stabletoolbench-Llama3.1-8B (model, 4 downloads, 1 like)
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Explainable Artificial Intelligence (XAI)
