Mastering Board Games by External and Internal Planning with Language Models

John Schultz; Jakub Adamek; Matej Jusup; Marc Lanctot; Michael Kaisers; Sarah Perrin; Daniel Hennes; Jeremy Shar; Cannada Lewis; Anian Ruoss; Tom Zahavy; Petar Veličković; Laurel Prince; Satinder Singh; Eric Malmi; Nenad Tomašev

arXiv:2412.12119·cs.AI·May 26, 2025·2 cites

TL;DR

This paper demonstrates that search-based planning methods, both external and internal, significantly enhance the game-playing strength of Large Language Models across various board games, achieving near-human and Grandmaster-level performance.

Contribution

The paper introduces and compares two novel search-based planning approaches for LLMs in board games, achieving state-of-the-art performance without external game engines.

Findings

01. LLMs with search outperform base models in multiple board games.

02. Achieved Grandmaster-level performance in chess with LLM-based planning.

03. Both approaches operate efficiently with minimal hallucinations.

Abstract

Advancing planning and reasoning capabilities of Large Language Models (LLMs) is one of the key prerequisites towards unlocking their potential for performing reliably in complex and impactful domains. In this paper, we aim to demonstrate this across board games (Chess, Fischer Random / Chess960, Connect Four, and Hex), and we show that search-based planning can yield significant improvements in LLM game-playing strength. We introduce, compare and contrast two major approaches: In external search, the model guides Monte Carlo Tree Search (MCTS) rollouts and evaluations without calls to an external game engine, and in internal search, the model is trained to generate in-context a linearized tree of search and a resulting final choice. Both build on a language model pre-trained on relevant domain knowledge, reliably capturing the transition and value functions in the respective…
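To make the external-search idea concrete, the following is a minimal sketch, not the authors' implementation. It runs a PUCT-style MCTS in which every query (legal moves, state transitions, move priors, position values) is answered by a single model object, so no separate game engine is called. Everything here is an assumption for illustration: `ToyModel` is a hypothetical hand-coded stand-in for the language model, the game is a toy subtraction game (take 1 to 3 stones, whoever takes the last stone wins) rather than any game from the paper, and the names `Node`, `simulate`, `search`, and `c_puct` are invented for this example.

```python
import math


class ToyModel:
    """Hypothetical stand-in for the language model (assumption, not the paper's model).

    The search asks it for everything: legal moves, transitions, priors, values.
    A state is the number of stones left; the player to move is implicit.
    """

    def legal_moves(self, stones):
        return [m for m in (1, 2, 3) if m <= stones]

    def next_state(self, stones, move):
        return stones - move

    def is_terminal(self, stones):
        return stones == 0

    def prior(self, stones):
        # Uniform prior over legal moves; a trained model would be more informative.
        moves = self.legal_moves(stones)
        return {m: 1.0 / len(moves) for m in moves}

    def value(self, stones):
        # Value for the player to move: with no stones left, the opponent
        # took the last stone, so the player to move has lost.
        # Non-terminal states get a neutral guess of 0.0.
        return -1.0 if stones == 0 else 0.0


class Node:
    def __init__(self, state, prior):
        self.state = state
        self.prior = prior
        self.children = {}  # move -> Node
        self.visits = 0
        self.value_sum = 0.0

    def q(self):
        # Mean value from the perspective of the player to move at this node.
        return self.value_sum / self.visits if self.visits else 0.0


def simulate(node, model, c_puct):
    """Run one simulation; return the value for the player to move at `node`."""
    if model.is_terminal(node.state):
        v = model.value(node.state)
    elif not node.children:
        # Expansion: the model supplies both the transitions and the priors.
        for move, p in model.prior(node.state).items():
            node.children[move] = Node(model.next_state(node.state, move), p)
        v = model.value(node.state)
    else:
        # Selection: PUCT score; a child's value is negated because the
        # opponent is to move there.
        def score(child):
            u = c_puct * child.prior * math.sqrt(node.visits) / (1 + child.visits)
            return -child.q() + u

        best = max(node.children.values(), key=score)
        v = -simulate(best, model, c_puct)
    node.visits += 1
    node.value_sum += v
    return v


def search(state, model, simulations=2000, c_puct=1.5):
    """Return the most-visited root move after a fixed simulation budget."""
    root = Node(state, 1.0)
    for _ in range(simulations):
        simulate(root, model, c_puct)
    return max(root.children, key=lambda m: root.children[m].visits)


if __name__ == "__main__":
    print(search(5, ToyModel()))
```

In this toy game the winning strategy is to leave the opponent a multiple of four stones, so from five stones the winning move is to take one; the search should recover this given enough simulations. In the setting the abstract describes, the hand-coded methods of `ToyModel` would instead be answered by the pre-trained language model.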

Taxonomy

Topics: Educational Games and Gamification · BIM and Construction Integration · Artificial Intelligence in Games

Methods: Balanced Selection