RPM: Generalizable Behaviors for Multi-Agent Reinforcement Learning
Wei Qiu, Xiao Ma, Bo An, Svetlana Obraztsova, Shuicheng Yan, Zhongwen Xu

TL;DR
This paper introduces RPM, a method that enhances the generalizability of multi-agent reinforcement learning policies by maintaining a ranked memory of past policies to promote diverse interactions during training.
Contribution
The paper proposes RPM, a novel self-play framework that improves MARL generalization by leveraging a ranked policy memory to diversify training interactions.
Findings
RPM significantly improves generalization to unseen agents.
RPM boosts performance by up to 402% on average in the reported experiments.
Diverse multi-agent trajectories enhance policy robustness.
Abstract
Despite the recent advancement in multi-agent reinforcement learning (MARL), the MARL agents easily overfit the training environment and perform poorly in the evaluation scenarios where other agents behave differently. Obtaining generalizable policies for MARL agents is thus necessary but challenging mainly due to complex multi-agent interactions. In this work, we model the problem with Markov Games and propose a simple yet effective method, ranked policy memory (RPM), to collect diverse multi-agent trajectories for training MARL policies with good generalizability. The main idea of RPM is to maintain a look-up memory of policies. In particular, we try to acquire various levels of behaviors by saving policies via ranking the training episode return, i.e., the episode return of agents in the training environment; when an episode starts, the learning agent can then choose a policy from…
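The look-up memory described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class name `RankedPolicyMemory`, the fixed-width binning of returns into ranks (`bin_width`), the per-rank cap (`max_per_rank`), and the uniform sampling rule are all assumptions made for illustration, since the abstract above is truncated before it states how a policy is chosen.

```python
import math
import random
from collections import defaultdict


class RankedPolicyMemory:
    """Hypothetical look-up memory: policies are stored under a rank
    derived from the training episode return (assumed binning scheme)."""

    def __init__(self, bin_width=1.0, max_per_rank=10, seed=None):
        self.bin_width = bin_width        # assumed: width of one return bin
        self.max_per_rank = max_per_rank  # assumed: cap on policies per rank
        self._memory = defaultdict(list)  # rank -> list of saved policies
        self._rng = random.Random(seed)

    def rank_of(self, episode_return):
        # Discretize the return so that similar returns share a rank.
        return math.floor(episode_return / self.bin_width)

    def save(self, policy, episode_return):
        # Store the policy under its rank, evicting the oldest entry if full.
        rank = self.rank_of(episode_return)
        bucket = self._memory[rank]
        bucket.append(policy)
        if len(bucket) > self.max_per_rank:
            bucket.pop(0)
        return rank

    def ranks(self):
        # Ranks that currently hold at least one policy, in ascending order.
        return sorted(self._memory)

    def sample(self):
        # Assumed rule: pick a rank uniformly, then a policy within it,
        # so that low- and high-return behaviors are both encountered.
        if not self._memory:
            raise LookupError("memory is empty")
        rank = self._rng.choice(self.ranks())
        return rank, self._rng.choice(self._memory[rank])


# Usage: policies are plain strings here; in practice they would be
# network parameters or checkpoints.
memory = RankedPolicyMemory(bin_width=10.0, seed=0)
memory.save("policy_a", 3.0)
memory.save("policy_b", 27.5)
memory.save("policy_c", 29.0)
rank, policy = memory.sample()  # called when a training episode starts
```

Sampling by rank first, rather than uniformly over all stored policies, keeps rarely visited return levels represented even when most saved policies cluster at one level. This is a design choice of the sketch, not a claim about the paper.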
Taxonomy
Topics: Reinforcement Learning in Robotics · Machine Learning and Data Classification · Data Stream Mining Techniques
