Efficient Reinforcement Learning for Zero-Shot Coordination in Evolving Games
Bingyu Hui, Lebin Yu, Quanming Yao, Yunpeng Qu, Xudong Zhang, Jian Wang

TL;DR
This paper introduces ScaPT, a scalable reinforcement learning framework that enhances zero-shot coordination in evolving games by efficiently managing population diversity and size, leading to improved generalization with less computational cost.
Contribution
It proposes a scalable RL training method that combines a meta-agent with a mutual information regularizer to improve zero-shot coordination in complex evolving games.
Findings
ScaPT outperforms existing methods in the cooperative game Hanabi.
The approach effectively scales population size without increased computational costs.
The approach generalizes better to unseen partners in evolving game scenarios.
Abstract
Zero-shot coordination (ZSC), a key challenge in multi-agent game theory, has recently become a hot topic in reinforcement learning (RL) research, especially in complex evolving games. It focuses on the generalization ability of agents, requiring them to coordinate well, without any fine-tuning, with previously unseen collaborators drawn from a diverse, potentially evolving pool of partners. Population-based training, which approximates such an evolving partner pool, has been shown to provide good zero-shot coordination performance; nevertheless, existing methods are limited by computational resources and mainly focus on optimizing diversity in small populations while neglecting the potential performance gains from scaling population size. To address this issue, this paper proposes Scalable Population Training (ScaPT), an efficient RL training framework comprising two key…
Taxonomy
Topics: Reinforcement Learning in Robotics · Evolutionary Game Theory and Cooperation · Adaptive Dynamic Programming Control
