PokéAI: A Goal-Generating, Battle-Optimizing Multi-agent System for Pokemon Red
Zihao Liu, Xinhang Sui, Yueran Song, Siwen Wang

TL;DR
PokéAI is a multi-agent framework using large language models to autonomously play and improve at Pokémon Red, demonstrating near-human battle performance and unique strategic behaviors.
Contribution
This work introduces the first text-based multi-agent LLM system for autonomous gameplay in Pokémon Red, integrating planning, execution, and critique modules.
Findings
The battle AI achieves an average win rate of 80.8%, close to that of experienced human players.
Battle performance correlates with language task scores, linking linguistic and strategic skills.
Models develop distinct playstyles, indicating personalized strategic behaviors.
Abstract
We introduce PokéAI, the first text-based, multi-agent large language model (LLM) framework designed to autonomously play and progress through Pokémon Red. Our system consists of three specialized agents (Planning, Execution, and Critique), each with its own memory bank, role, and skill set. The Planning Agent functions as the central brain, generating tasks to progress through the game. These tasks are then delegated to the Execution Agent, which carries them out within the game environment. Upon task completion, the Critique Agent evaluates the outcome to determine whether the objective was successfully achieved. Once verification is complete, control returns to the Planning Agent, forming a closed-loop decision-making system. As a preliminary step, we developed a battle module within the Execution Agent. Our results show that the battle AI achieves an average win rate of 80.8%…
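The closed loop described in the abstract (plan, execute, critique, return to planning) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Agent` class, the `run_closed_loop` function, and the three stub policies are hypothetical names introduced here, the memory bank is assumed to be a plain list of strings, and the stubs stand in for what would be LLM calls and game-environment interaction in the real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Agent:
    """A role plus its own memory bank (assumed here to be a list of notes)."""
    role: str
    memory: List[str] = field(default_factory=list)


def run_closed_loop(
    propose_task: Callable[[Agent], str],
    carry_out: Callable[[Agent, str], str],
    judge: Callable[[Agent, str, str], bool],
    max_tasks: int = 3,
) -> List[Tuple[str, bool]]:
    """Run plan -> execute -> critique for max_tasks rounds (hypothetical sketch)."""
    planner = Agent("planning")
    executor = Agent("execution")
    critic = Agent("critique")
    history: List[Tuple[str, bool]] = []

    for _ in range(max_tasks):
        task = propose_task(planner)            # Planning Agent generates a task
        outcome = carry_out(executor, task)     # Execution Agent carries it out
        success = judge(critic, task, outcome)  # Critique Agent verifies the result

        # Each agent records the round in its own memory bank.
        planner.memory.append(f"{task}: {'done' if success else 'failed'}")
        executor.memory.append(outcome)
        critic.memory.append(f"{task} -> {success}")

        # Control returns to the planner on the next iteration.
        history.append((task, success))
    return history


# Stub policies standing in for LLM calls (assumptions for illustration only).
def stub_propose(planner: Agent) -> str:
    return f"task-{len(planner.memory) + 1}"


def stub_carry_out(executor: Agent, task: str) -> str:
    return f"completed {task}"


def stub_judge(critic: Agent, task: str, outcome: str) -> bool:
    return outcome == f"completed {task}"


history = run_closed_loop(stub_propose, stub_carry_out, stub_judge)
```

Passing the three policies in as callables keeps the loop itself independent of any particular model or emulator, so the stubs could be swapped for real calls without changing the control flow.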
Taxonomy
Topics: Artificial Intelligence in Games · Multimodal Machine Learning Applications · Reinforcement Learning in Robotics
