Scaling Behavior Cloning Improves Causal Reasoning: An Open Model for Real-Time Video Game Playing

Yuguang Yue; Irakli Salia; Samuel Hunt; Chris Green; Wenzhe Shi; Jonathan J Hunt

arXiv:2601.04575·cs.AI·January 30, 2026

Open Access · 1 model · 2 datasets

TL;DR

This paper presents an open foundation model for real-time video game playing that leverages behavior cloning, demonstrating that larger models and more data enhance causal reasoning and performance across various 3D games.

Contribution

Introduces an open recipe and dataset for training a real-time video game foundation model, and investigates how scaling affects causal reasoning in behavior cloning.

Findings

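01 Larger models and more data improve causal reasoning.

02 The best model achieves performance competitive with human players across a variety of 3D games.

03 Scaling trends first observed in a controlled toy setting are validated at scale, in models of up to 1.2 billion parameters.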

Abstract

Behavior cloning has seen a resurgence as scaling model and data sizes has demonstrated strong performance. In this work, we introduce an open recipe for training a video game playing foundation model designed for real-time inference on a consumer GPU. We release all data (8300+ hours of high-quality human gameplay), training and inference code, and pretrained checkpoints under an open license. Empirically, we show that our best model achieves performance competitive with human players across a variety of 3D games. We use this recipe to investigate the scaling laws of behavior cloning, with a focus on causal reasoning. In a controlled toy setting, we first demonstrate that increasing training data and network depth leads to the model learning a more causal policy. We then validate these findings at scale, analyzing models up to 1.2 billion parameters. We observe that the causal…
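
For readers unfamiliar with the term, behavior cloning treats control as supervised learning: a policy is fit to recorded (state, action) pairs from an expert. The sketch below is only an illustration of that idea under stated assumptions. It is not the paper's recipe, model, or toy setting: the one-dimensional state, the `expert_action` rule, and the logistic-regression policy are all hypothetical stand-ins chosen so the example runs with the Python standard library alone.

```python
import math
import random


def expert_action(x):
    """Hypothetical expert: move right (1) if the target offset x is positive, else left (0)."""
    return 1 if x > 0.0 else 0


def make_demonstrations(n, seed):
    """Sample n states uniformly from [-1, 1] and label each with the expert's action."""
    rng = random.Random(seed)
    states = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    return [(x, expert_action(x)) for x in states]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_bc(demos, epochs=200, lr=0.5):
    """Fit a logistic policy p(a=1 | x) = sigmoid(w*x + b) by full-batch
    gradient descent on the cross-entropy between policy and expert actions."""
    w, b = 0.0, 0.0
    n = len(demos)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, a in demos:
            p = sigmoid(w * x + b)
            grad_w += (p - a) * x  # d(cross-entropy)/dw for one sample
            grad_b += p - a        # d(cross-entropy)/db for one sample
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def policy(w, b, x):
    """Greedy action of the cloned policy."""
    return 1 if sigmoid(w * x + b) > 0.5 else 0


def agreement(w, b, demos):
    """Fraction of states on which the cloned policy matches the expert."""
    return sum(policy(w, b, x) == a for x, a in demos) / len(demos)


demos = make_demonstrations(500, seed=0)
w, b = train_bc(demos)
held_out = make_demonstrations(500, seed=1)
print(f"held-out agreement with expert: {agreement(w, b, held_out):.2f}")
```

The same structure (collect demonstrations, minimize a supervised loss on the expert's actions, evaluate on held-out states) carries over to larger policies; only the model and the data change. Note that agreement with the expert on held-out states does not by itself show that a policy has learned the causally correct features, which is the question the abstract says the paper studies.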

Code & Models

Models

🤗 elefantai/open-p2p · model · ♡ 8

Datasets

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI)