VGC-Bench: Towards Mastering Diverse Team Strategies in Competitive Pokémon
Cameron Angliss, Jiaxun Cui, Jiaheng Hu, Arrasy Rahman, Peter Stone

TL;DR
This paper introduces VGC-Bench, a comprehensive benchmark for developing and evaluating AI agents capable of mastering diverse team strategies in Pokémon VGC, a domain with an extremely large configuration space.
Contribution
The paper presents VGC-Bench, a new benchmark with datasets, evaluation protocols, and baseline agents for multi-agent learning in Pokémon VGC, addressing the challenge of strategy generalization across diverse teams.
Findings
Methods can beat professional players in single-team mirror matches.
As team diversity increases, agent performance degrades, but generalization to unseen teams improves.
Baseline agents include heuristics, language models, and reinforcement learning approaches.
Abstract
Developing AI agents that can robustly adapt to varying strategic landscapes without retraining is a central challenge in multi-agent learning. Pokémon Video Game Championships (VGC) is a domain with a vast space of approximately team configurations, far larger than those of other games such as Chess, Go, Poker, StarCraft, or Dota. The combinatorial nature of team building in Pokémon VGC causes optimal strategies to vary substantially depending on both the controlled team and the opponent's team, making generalization uniquely challenging. To advance research on this problem, we introduce VGC-Bench: a benchmark that provides critical infrastructure, standardizes evaluation protocols, and supplies a human-play dataset of over 700,000 battle logs and a range of baseline agents based on heuristics, large language models, behavior cloning, and multi-agent reinforcement…
Taxonomy
Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Digital Games and Media
