Probing the Impact of Scale on Data-Efficient, Generalist Transformer World Models for Atari
Jooyeon Kim

TL;DR
This paper investigates how scaling transformer-based world models affects data efficiency and performance across Atari games, revealing environment-specific regimes and the stabilizing effect of joint training.
Contribution
It provides a detailed analysis of scaling behaviors in transformer world models on Atari, highlighting the importance of scaling strategies and joint training for generalist agents.
Findings
Environments fall into distinct scaling regimes with different fidelity trends.
Joint training stabilizes scaling dynamics across multiple environments.
Policies learned in simulation achieve high scores, close to expert performance.
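The distinction between scaling regimes in the findings above can be illustrated with a minimal sketch. It assumes one held-out world-model error per model size for a single environment and labels the trend from smallest to largest model. The function name, the labels, and the tolerance parameter are hypothetical and are not taken from the paper, which may measure fidelity differently.

```python
from typing import Sequence


def classify_scaling_regime(
    param_counts: Sequence[float],
    val_errors: Sequence[float],
    tol: float = 0.0,
) -> str:
    """Label a (hypothetical) error-vs-model-size curve by its trend.

    Returns "improves-with-scale" if error never rises as size grows,
    "degrades-with-scale" if error never falls, and "mixed" otherwise.
    """
    if len(param_counts) != len(val_errors) or len(param_counts) < 2:
        raise ValueError("need at least two (size, error) pairs of equal length")

    # Sort by model size so the trend is read from small to large models.
    pairs = sorted(zip(param_counts, val_errors))
    errors = [err for _, err in pairs]

    # Change in error between consecutive model sizes.
    diffs = [b - a for a, b in zip(errors, errors[1:])]

    if all(d <= tol for d in diffs):
        return "improves-with-scale"
    if all(d >= -tol for d in diffs):
        return "degrades-with-scale"
    return "mixed"
```

Under this sketch, a curve whose error keeps falling with size would correspond to the monotonic improvement the abstract attributes to the overparameterized regime, and a curve whose error keeps rising would correspond to the classical regime where larger world models degrade fidelity.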
Abstract
Developing generalist systems that retain human-like data efficiency is a central challenge. While world models (WMs) offer a promising path, existing research often conflates architectural mechanisms with the independent impact of model scale. In this work, we use a minimalist transformer world model to analyze scaling behaviors on the Atari 100k benchmark, using fixed offline datasets derived from a presupposed expert policy. Our results reveal that environments fundamentally fall into distinct scaling regimes, even when constrained by identical offline data budgets and model capacities. For individual tasks, some environments naturally allow models to pass the interpolation threshold, yielding monotonic improvements in the overparameterized regime, while others remain trapped in the classical regime, where larger world models degrade fidelity. In the unified setting, i.e., a…
