VGC-Bench: Towards Mastering Diverse Team Strategies in Competitive Pokémon

Cameron Angliss; Jiaxun Cui; Jiaheng Hu; Arrasy Rahman; Peter Stone

arXiv:2506.10326·cs.AI·January 14, 2026

TL;DR

This paper introduces VGC-Bench, a benchmark for developing and evaluating AI agents that generalize across diverse team strategies in Pokémon VGC, a domain with approximately $10^{139}$ possible team configurations.

Contribution

The paper presents VGC-Bench, a new benchmark with datasets, evaluation protocols, and baseline agents for multi-agent learning in Pokémon VGC, addressing the challenge of strategy generalization across diverse teams.

Findings

01

Methods can beat professional players in single-team mirror matches.

02

Performance degrades as team diversity increases, though agents trained on more diverse teams generalize better to unseen teams.

03

Baseline agents include heuristics, language models, and reinforcement learning approaches.

Abstract

Developing AI agents that can robustly adapt to varying strategic landscapes without retraining is a central challenge in multi-agent learning. Pokémon Video Game Championships (VGC) is a domain with a vast space of approximately $10^{139}$ team configurations, far larger than those of other games such as Chess, Go, Poker, StarCraft, or Dota. The combinatorial nature of team building in Pokémon VGC causes optimal strategies to vary substantially depending on both the controlled team and the opponent's team, making generalization uniquely challenging. To advance research on this problem, we introduce VGC-Bench: a benchmark that provides critical infrastructure, standardizes evaluation protocols, and supplies a human-play dataset of over 700,000 battle logs and a range of baseline agents based on heuristics, large language models, behavior cloning, and multi-agent reinforcement…

Code & Models

Repositories

cameronangliss/vgc-bench
PyTorch · Official

Models

🤗 cameronangliss/vgc-bench-models
model

Datasets

cameronangliss/vgc-battle-logs
dataset · 142 downloads

Taxonomy

Topics: Artificial Intelligence in Games · Reinforcement Learning in Robotics · Digital Games and Media