SmallWorlds: Assessing Dynamics Understanding of World Models in Isolated Environments

Xinyi Li; Zaishuo Xia; Weyl Lu; Chenjie Hao; Yubei Chen

arXiv:2511.23465·cs.LG·December 1, 2025

TL;DR

This paper introduces the SmallWorld Benchmark, a controlled testbed for evaluating how well world models learn environment dynamics, and uses it to characterize the strengths and limitations of several representative architectures.

Contribution

The paper presents the SmallWorld Benchmark for systematic evaluation of world models, enabling controlled assessment of their dynamics understanding without relying on handcrafted reward signals.

Findings

01 Models capture environment structure to varying degrees

02 Prediction quality deteriorates over long rollouts

03 Different architectures exhibit distinct strengths and weaknesses

Abstract

Current world models lack a unified and controlled setting for systematic evaluation, making it difficult to assess whether they truly capture the underlying rules that govern environment dynamics. In this work, we address this open challenge by introducing the SmallWorld Benchmark, a testbed designed to assess world model capability under isolated and precisely controlled dynamics without relying on handcrafted reward signals. Using this benchmark, we conduct comprehensive experiments in the fully observable state space on representative architectures including Recurrent State Space Model, Transformer, Diffusion model, and Neural ODE, examining their behavior across six distinct domains. The experimental results reveal how effectively these models capture environment structure and how their predictions deteriorate over extended rollouts, highlighting both the strengths and limitations…
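To make the idea of predictions deteriorating over extended rollouts concrete, here is a minimal sketch of how per-step rollout error could be measured in a fully observable state space. It is not the benchmark's code: the toy spring environment, the function names (`true_step`, `model_step`, `rollout_error`), and the "model" with a slightly wrong stiffness are all assumptions made for illustration, standing in for a learned world model.

```python
import numpy as np

def true_step(state, dt=0.1):
    """Hypothetical ground-truth dynamics: unit-mass spring, semi-implicit Euler."""
    pos, vel = state
    vel = vel - dt * pos
    pos = pos + dt * vel
    return np.array([pos, vel])

def model_step(state, dt=0.1, stiffness=1.05):
    """Stand-in for a learned world model: same form, slightly wrong stiffness."""
    pos, vel = state
    vel = vel - dt * stiffness * pos
    pos = pos + dt * vel
    return np.array([pos, vel])

def rollout(step_fn, state, horizon):
    """Autoregressive rollout: each prediction is fed back in as the next input."""
    states = []
    for _ in range(horizon):
        state = step_fn(state)
        states.append(state)
    return np.stack(states)

def rollout_error(initial_state, horizon):
    """Mean squared error between true and predicted states at each rollout step."""
    truth = rollout(true_step, initial_state, horizon)
    pred = rollout(model_step, initial_state, horizon)
    return np.mean((truth - pred) ** 2, axis=1)

errors = rollout_error(np.array([1.0, 0.0]), horizon=50)
print("error at step 1: ", errors[0])
print("error at step 50:", errors[-1])
```

Because the model consumes its own outputs, a small one-step mismatch compounds with the horizon, so the error at the last step is much larger than at the first. No reward signal is involved; only predicted and true states are compared.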

Taxonomy

Topics: Generative Adversarial Networks and Image Synthesis · Machine Learning in Healthcare · Explainable Artificial Intelligence (XAI)