SmallWorlds: Assessing Dynamics Understanding of World Models in Isolated Environments
Xinyi Li, Zaishuo Xia, Weyl Lu, Chenjie Hao, Yubei Chen

TL;DR
This paper introduces the SmallWorld Benchmark, a controlled test environment for evaluating world models' ability to learn environment dynamics, providing insights into their strengths and limitations across various architectures.
Contribution
The paper presents the SmallWorld Benchmark for systematic evaluation of world models, enabling controlled assessment of how well they understand environment dynamics without relying on reward signals.
Findings
Models capture environment structure to varying degrees
Prediction quality deteriorates over long rollouts
Different architectures exhibit distinct strengths and weaknesses
Abstract
Current world models lack a unified and controlled setting for systematic evaluation, making it difficult to assess whether they truly capture the underlying rules that govern environment dynamics. In this work, we address this open challenge by introducing the SmallWorld Benchmark, a testbed designed to assess world model capability under isolated and precisely controlled dynamics without relying on handcrafted reward signals. Using this benchmark, we conduct comprehensive experiments in the fully observable state space on representative architectures including Recurrent State Space Model, Transformer, Diffusion model, and Neural ODE, examining their behavior across six distinct domains. The experimental results reveal how effectively these models capture environment structure and how their predictions deteriorate over extended rollouts, highlighting both the strengths and limitations…
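To make the idea of predictions deteriorating over extended rollouts concrete, here is a minimal sketch of an open-loop rollout error measurement. It is not taken from the paper: the pendulum environment, the function names (`true_step`, `model_step`, `rollout_errors`), and the mis-estimated gravity constant `g_hat` are all assumptions chosen for illustration, with `model_step` standing in for a learned world model.

```python
import math

def true_step(state, dt=0.05):
    """Ground-truth dynamics (assumed toy environment): frictionless pendulum."""
    theta, omega = state
    omega = omega - 9.81 * math.sin(theta) * dt
    theta = theta + omega * dt
    return (theta, omega)

def model_step(state, dt=0.05, g_hat=9.6):
    """Hypothetical stand-in for a learned model: same form, slightly wrong gravity."""
    theta, omega = state
    omega = omega - g_hat * math.sin(theta) * dt
    theta = theta + omega * dt
    return (theta, omega)

def rollout(step_fn, state, horizon):
    """Open-loop rollout: each prediction is fed back in as the next input."""
    states = []
    for _ in range(horizon):
        state = step_fn(state)
        states.append(state)
    return states

def rollout_errors(init_state, horizon):
    """Per-step Euclidean distance between the model rollout and the true rollout."""
    truth = rollout(true_step, init_state, horizon)
    pred = rollout(model_step, init_state, horizon)
    return [math.dist(p, t) for p, t in zip(pred, truth)]

errors = rollout_errors((1.0, 0.0), horizon=200)
for h in (1, 10, 50, 200):
    print(f"step {h:3d}: error = {errors[h - 1]:.4f}")
```

Because the system oscillates, the per-step error need not grow monotonically, but a small one-step mismatch compounds into a much larger discrepancy at long horizons. A benchmark of this kind would presumably report such curves per architecture and per domain; the exact metric used by the authors is not stated in the text above.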
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)
