Playing DOOM with 1.3M Parameters: Specialized Small Models vs Large Language Models for Real-Time Game Control

David Golchinfar; Daryoush Vaziri; Alexander Marquardt

arXiv:2604.07385·cs.LG·April 10, 2026

TL;DR

A specialized model with only 1.3 million parameters outperforms large language models at real-time DOOM gameplay, scoring 17.8 frags per episode in the defend_the_center scenario against 13 frags in total for all tested LLMs combined.

Contribution

Introduces SauerkrautLM-Doom-MultiVec, a compact model that surpasses much larger LLMs at real-time game control through domain-specific training and architecture design.

Findings

01

The model achieves 17.8 frags per episode (178 frags over 10 episodes) in the defend_the_center scenario, more than all tested LLMs combined (13 frags total).

02

It is the only tested agent that actively engages enemies rather than purely evading them.

03

The model takes 31 ms per decision, fast enough for real-time gameplay.
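
The headline numbers can be cross-checked with a few lines of arithmetic. The sketch below uses only figures quoted in the abstract (178 frags, 10 episodes, 1.3 million parameters, 31 ms per decision). The 120e9 parameter count for Nemotron-120B is an assumption taken from the model's name, and the function names are hypothetical.

```python
# Figures quoted in the abstract.
TOTAL_FRAGS = 178          # frags over all evaluation episodes
EPISODES = 10              # number of evaluation episodes
MODEL_PARAMS = 1.3e6       # SauerkrautLM-Doom-MultiVec parameter count
LATENCY_MS = 31            # time per decision in milliseconds

# Assumption: nominal size implied by the name "Nemotron-120B".
NEMOTRON_PARAMS = 120e9


def frags_per_episode(total_frags: int, episodes: int) -> float:
    """Average number of frags scored in one episode."""
    return total_frags / episodes


def size_ratio(large_params: float, small_params: float) -> float:
    """How many times larger the big model is than the small one."""
    return large_params / small_params


def decisions_per_second(latency_ms: float) -> float:
    """Upper bound on decisions per second at a fixed per-decision latency."""
    return 1000.0 / latency_ms


if __name__ == "__main__":
    print(f"frags per episode:    {frags_per_episode(TOTAL_FRAGS, EPISODES):.1f}")
    print(f"parameter ratio:      {size_ratio(NEMOTRON_PARAMS, MODEL_PARAMS):,.0f}x")
    print(f"decisions per second: {decisions_per_second(LATENCY_MS):.1f}")
```

Under these assumptions the parameter ratio comes out slightly above 92,000, consistent with the "92,000x" figure, and a 31 ms latency allows roughly 32 decisions per second.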

Abstract

We present SauerkrautLM-Doom-MultiVec, a 1.3 million parameter model that plays the classic first-person shooter DOOM in real time, outperforming large language models up to 92,000x its size, including Nemotron-120B, Qwen3.5-27B, and GPT-4o-mini. Our model combines a ModernBERT encoder with hash embeddings, depth-aware token representations, and an attention pooling classification head to select game actions from ASCII frame representations at 31ms per decision. Trained on just 31,000 human gameplay demonstrations, it achieves 178 frags in 10 episodes (17.8 per episode) in the defend_the_center scenario, more than all tested LLMs combined (13 frags total). All agents receive equivalent input: ASCII frames and depth maps. Despite having 92,000x fewer parameters than Nemotron-120B, our model is the only agent that actively engages enemies rather than purely evading them. These results…
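
To make the "attention pooling classification head" concrete, here is a minimal sketch of the general technique in plain Python. It is not the authors' implementation: the single learned query vector, the dimensions, the random stand-in weights, and the three-action set are all assumptions chosen for illustration, and the random token vectors stand in for encoder outputs.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(token_vectors, query):
    """Collapse a sequence of token vectors into one vector.

    Each token is scored by its scaled dot product with a query vector,
    the scores are normalized with softmax, and the tokens are averaged
    using those weights.
    """
    dim = len(query)
    scores = [
        sum(q * x for q, x in zip(query, vec)) / math.sqrt(dim)
        for vec in token_vectors
    ]
    weights = softmax(scores)
    pooled = [
        sum(w * vec[i] for w, vec in zip(weights, token_vectors))
        for i in range(dim)
    ]
    return pooled, weights


def classify(pooled, weight_rows, bias):
    """Linear layer over the pooled vector; returns (argmax index, logits)."""
    logits = [
        sum(w * x for w, x in zip(row, pooled)) + b
        for row, b in zip(weight_rows, bias)
    ]
    best = max(range(len(logits)), key=logits.__getitem__)
    return best, logits


# Hypothetical action set and sizes, for illustration only.
ACTIONS = ["TURN_LEFT", "TURN_RIGHT", "ATTACK"]
DIM, SEQ_LEN = 8, 16

rng = random.Random(0)
# Random vectors standing in for encoder outputs and learned parameters.
tokens = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(SEQ_LEN)]
query = [rng.gauss(0, 1) for _ in range(DIM)]
head_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in ACTIONS]
head_bias = [0.0] * len(ACTIONS)

pooled, weights = attention_pool(tokens, query)
action_index, logits = classify(pooled, head_weights, head_bias)
print("chosen action:", ACTIONS[action_index])
```

Compared with mean pooling, the learned weights let the head emphasize the few tokens that matter for the decision, which is one plausible reason to prefer this design for frame inputs where most tokens are background.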
