Can Large Language Models Play Games? A Case Study of A Self-Play Approach
Hongyi Guo, Zhihan Liu, Yufeng Zhang, Zhaoran Wang

TL;DR
This paper combines Large Language Models (LLMs) with Monte-Carlo Tree Search (MCTS) self-play to play deterministic turn-based zero-sum games such as chess and Go without additional training, making decisions more reliable than applying the LLM directly.
Contribution
The work introduces a self-play approach that uses LLMs as action pruners and value proxies inside MCTS, and it provides a theoretical bound on the suboptimality of the resulting value estimates.
Findings
Successfully applied to chess and Go
Improves decision-making over direct LLM application
Theoretically bounds value estimation errors
Abstract
Large Language Models (LLMs) harness extensive data from the Internet, storing a broad spectrum of prior knowledge. While LLMs have proven beneficial as decision-making aids, their reliability is hampered by limitations in reasoning, hallucination phenomenon, and so on. On the other hand, Monte-Carlo Tree Search (MCTS) is a heuristic search algorithm that provides reliable decision-making solutions, achieved through recursive rollouts and self-play. However, the effectiveness of MCTS relies heavily on heuristic pruning and external value functions, particularly in complex decision scenarios. This work introduces an innovative approach that bolsters LLMs with MCTS self-play to efficiently resolve deterministic turn-based zero-sum games (DTZG), such as chess and go, without the need for additional training. Specifically, we utilize LLMs as both action pruners and proxies for value…
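The idea of using a model as both action pruner and value proxy inside MCTS can be illustrated with a minimal sketch. This is not the authors' implementation: the game is a toy single-pile Nim chosen for brevity, and `prune_actions` and `value_proxy` are hypothetical hand-written stand-ins for the LLM calls, which the abstract does not specify. The search only expands the pruned actions and evaluates new leaves with the proxy instead of a random rollout.

```python
import math

# Toy deterministic turn-based zero-sum game: single-pile Nim.
# The state is the number of stones left; the player to move removes 1-3
# stones, and whoever takes the last stone wins.

def legal_actions(stones):
    return [a for a in (1, 2, 3) if a <= stones]

def prune_actions(stones, k=2):
    """Hypothetical stand-in for an LLM action pruner: keep at most k actions."""
    # Rank moves that leave a multiple of 4 first (sorted() is stable).
    ranked = sorted(legal_actions(stones), key=lambda a: (stones - a) % 4 != 0)
    return ranked[:k]

def value_proxy(stones):
    """Hypothetical stand-in for an LLM value proxy.

    Returns a value in [-1, 1] from the perspective of the player to move.
    """
    return -0.5 if stones % 4 == 0 else 0.5

class Node:
    def __init__(self, stones):
        self.stones = stones
        self.children = {}    # action -> Node
        self.visits = 0
        self.value_sum = 0.0  # summed values, from the mover's perspective

def simulate(node, c):
    """Run one simulation; return the value for the player to move at node."""
    if node.stones == 0:
        value = -1.0  # the opponent took the last stone, so the mover lost
    elif node.visits == 0:
        value = value_proxy(node.stones)  # proxy replaces a random rollout
    else:
        if not node.children:
            # Expand only the actions that survive pruning.
            for a in prune_actions(node.stones):
                node.children[a] = Node(node.stones - a)

        def uct(child):
            if child.visits == 0:
                return float("inf")
            # The child's value is from the opponent's perspective: negate it.
            mean = -child.value_sum / child.visits
            return mean + c * math.sqrt(math.log(node.visits) / child.visits)

        action = max(node.children, key=lambda a: uct(node.children[a]))
        value = -simulate(node.children[action], c)
    node.visits += 1
    node.value_sum += value
    return value

def search(root_stones, n_simulations=200, c=1.4):
    """Return the most-visited pruned action at the root."""
    root = Node(root_stones)
    for _ in range(n_simulations):
        simulate(root, c)
    return max(root.children, key=lambda a: root.children[a].visits)

if __name__ == "__main__":
    print(search(10))
```

In this toy setting the stand-in pruner and proxy already encode the known winning strategy (leave the opponent a multiple of four stones), so the search mainly confirms it. With a real LLM both components would be noisy, which is presumably where the paper's bound on value-estimation suboptimality becomes relevant.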
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Pruning · Monte-Carlo Tree Search
