PokeLLMon: A Human-Parity Agent for Pokemon Battles with Large Language Models
Sihao Hu, Tiansheng Huang, Ling Liu

TL;DR
PokeLLMon is an LLM-based agent that reaches human parity in Pokemon battles by combining in-context reinforcement learning, knowledge-augmented generation, and consistent action generation. In online battles against human players it wins 49% of Ladder competitions and 56% of invited battles.
Contribution
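This paper introduces PokeLLMon, which the authors describe as the first LLM-embodied agent to reach human-parity performance in tactical battle games, demonstrated in Pokemon battles. It integrates three strategies: in-context reinforcement learning from text-based battle feedback, knowledge-augmented generation to counteract hallucination, and consistent action generation to mitigate panic switching.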
Findings
Achieves a 49% win rate in Ladder competitions against human players
Achieves a 56% win rate in invited battles
Demonstrates human-like battle strategies and just-in-time decision-making
Abstract
We introduce PokeLLMon, the first LLM-embodied agent that achieves human-parity performance in tactical battle games, as demonstrated in Pokemon battles. The design of PokeLLMon incorporates three key strategies: (i) In-context reinforcement learning that instantly consumes text-based feedback derived from battles to iteratively refine the policy; (ii) Knowledge-augmented generation that retrieves external knowledge to counteract hallucination and enables the agent to act in a timely and appropriate manner; (iii) Consistent action generation to mitigate the panic switching phenomenon, which occurs when the agent faces a powerful opponent and wants to elude the battle. We show that online battles against humans demonstrate PokeLLMon's human-like battle strategies and just-in-time decision making, achieving a 49% win rate in the Ladder competitions and a 56% win rate in the invited battles. Our implementation and…
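The abstract names the three strategies but does not spell out their mechanics. As a rough illustration only, the sketch below shows one way they could fit together: text feedback from earlier turns and retrieved facts are folded into the prompt, and the final action is chosen by majority vote over several sampled answers. The majority vote, the function names `build_prompt` and `consistent_action`, the `sample_action` callable, and the default `k=3` are all assumptions made for this example and are not taken from the listing above.

```python
from collections import Counter
from typing import Callable, List


def build_prompt(state: str, feedback: List[str], knowledge: List[str]) -> str:
    """Assemble a text prompt from the battle state, prior-turn feedback, and retrieved facts.

    Hypothetical helper: the layout of the prompt is an assumption.
    """
    parts = ["Battle state:", state]
    if feedback:
        # In-context feedback: plain-text outcomes of earlier turns.
        parts += ["Feedback from previous turns:"] + [f"- {f}" for f in feedback]
    if knowledge:
        # Knowledge augmentation: externally retrieved facts, e.g. type matchups.
        parts += ["Relevant knowledge:"] + [f"- {k}" for k in knowledge]
    parts.append("Choose one action.")
    return "\n".join(parts)


def consistent_action(sample_action: Callable[[str], str], prompt: str, k: int = 3) -> str:
    """Sample k candidate actions and return the most frequent one.

    Assumed mechanism: majority voting, so that a single outlier answer
    (such as an abrupt switch) is outvoted. Ties go to the action seen first.
    """
    votes = Counter(sample_action(prompt) for _ in range(k))
    return votes.most_common(1)[0][0]
```

Here `sample_action` stands in for any function that sends the prompt to a language model and returns one action string; swapping in a different voting rule or a larger `k` changes only `consistent_action`.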
Taxonomy
Topics: Network Security and Intrusion Detection · Advanced Malware Detection Techniques
