Improving Zero-Shot Coordination Performance Based on Policy Similarity
Lebin Yu, Yunbo Qiu, Quanming Yao, Xudong Zhang, Jian Wang

TL;DR
This paper introduces a similarity-based training method that improves zero-shot coordination in multi-agent reinforcement learning by leveraging policy similarity, achieving better performance in less time and with better generalizability than prior methods.
Contribution
It reveals a strong linear correlation between policy similarity and zero-shot coordination performance, and proposes a novel Similarity-Based Robust Training (SBRT) scheme that disturbs training partners' actions based on this similarity.
Findings
SBRT improves zero-shot coordination across multiple frameworks.
Zero-shot coordination performance is strongly linearly correlated with the similarity between an agent's training partner and testing partner.
The method outperforms previous approaches in efficiency and effectiveness.
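One way to probe the linear-correlation finding above is to compute a Pearson coefficient between partner similarity and coordination score. The sketch below is illustrative only: `action_agreement` is a hypothetical similarity measure (the fraction of states on which two deterministic policies pick the same action), since this summary does not give the paper's definition. The numbers are synthetic placeholders, not results from the paper.

```python
import math

def action_agreement(policy_a, policy_b):
    """Hypothetical similarity: share of states where both policies pick the same action."""
    matches = sum(a == b for a, b in zip(policy_a, policy_b))
    return matches / len(policy_a)

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Synthetic placeholder policies: one action index per state (assumption, not paper data).
train_partner = [0, 1, 2, 1, 0, 2, 1, 0]
test_partners = [
    [0, 1, 2, 1, 0, 2, 1, 0],
    [0, 1, 2, 1, 0, 2, 0, 1],
    [0, 1, 2, 0, 2, 1, 0, 1],
    [2, 0, 1, 0, 2, 1, 0, 1],
]
similarities = [action_agreement(train_partner, p) for p in test_partners]

# Made-up scores built to be exactly linear in similarity, only to exercise pearson_r.
toy_scores = [10.0 * s + 2.0 for s in similarities]
r = pearson_r(similarities, toy_scores)
```

With real data, `toy_scores` would be replaced by measured returns when the agent is paired with each test partner. A coefficient near 1 would be consistent with the reported linear trend.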
Abstract
In recent years, multi-agent reinforcement learning has achieved remarkable performance in multi-agent planning and scheduling tasks. It typically follows the self-play setting, where agents are trained by playing with a fixed group of agents. However, in zero-shot coordination, where an agent must coordinate with unseen partners, self-play agents may fail. Several methods have been proposed to handle this problem, but they either take a lot of time or lack generalizability. In this paper, we first reveal an important phenomenon: zero-shot coordination performance is strongly linearly correlated with the similarity between an agent's training partner and testing partner. Inspired by this, we put forward a Similarity-Based Robust Training (SBRT) scheme that improves agents' zero-shot coordination performance by disturbing their partners' actions during training according…
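The idea of disturbing a partner's actions during training can be illustrated with a generic random perturbation. This is a minimal sketch and not the SBRT rule itself, because the abstract is truncated before it says how the disturbance is chosen. The names `disturb_action`, `disturbed_rollout`, and `epsilon` are hypothetical, and the sketch assumes a discrete action space with at least two actions.

```python
import random

def disturb_action(action, num_actions, epsilon, rng):
    """With probability epsilon, replace the partner's action with a different one chosen uniformly."""
    if rng.random() < epsilon:
        alternatives = [a for a in range(num_actions) if a != action]
        return rng.choice(alternatives)
    return action

def disturbed_rollout(partner_actions, num_actions, epsilon, seed=0):
    """Apply the random disturbance independently to each action in a sequence."""
    rng = random.Random(seed)
    return [disturb_action(a, num_actions, epsilon, rng) for a in partner_actions]

# Placeholder partner behaviour: cycles through 4 discrete actions (assumption).
original = [i % 4 for i in range(1000)]
disturbed = disturbed_rollout(original, num_actions=4, epsilon=0.2)

# Fraction of steps where the disturbed partner still matches the original one.
agreement = sum(o == d for o, d in zip(original, disturbed)) / len(original)
```

Under this simple rule the expected agreement between the original and disturbed partner is `1 - epsilon`. A single knob therefore controls how far the training partner drifts from its undisturbed behaviour.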
Taxonomy
Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques
