Off-Beat Multi-Agent Reinforcement Learning

Wei Qiu; Weixun Wang; Rundong Wang; Bo An; Yujing Hu; Svetlana Obraztsova; Zinovi Rabinovich; Jianye Hao; Yingfeng Chen; Changjie Fan

arXiv:2205.13718·cs.MA·June 22, 2022

TL;DR

This paper introduces LeGEM, a novel episodic memory framework for multi-agent reinforcement learning in environments with off-beat actions, improving coordination and sample efficiency.

Contribution

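The paper develops an algorithmic framework for MARL with off-beat actions and an episodic memory scheme, LeGEM, that targets the temporal credit assignment problem such actions create. Most MARL methods do not handle this setting because they assume actions are executed immediately after inference.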

Findings

01. LeGEM significantly improves multi-agent coordination.

02. LeGEM achieves leading performance in various scenarios.

03. LeGEM enhances sample efficiency in MARL.

Abstract

We investigate model-free multi-agent reinforcement learning (MARL) in environments where off-beat actions are prevalent, i.e., all actions have pre-set execution durations. During execution durations, the environment changes are influenced by, but not synchronised with, action execution. Such a setting is ubiquitous in many real-world problems. However, most MARL methods assume actions are executed immediately after inference, which is often unrealistic and can lead to catastrophic failure for multi-agent coordination with off-beat actions. In order to fill this gap, we develop an algorithmic framework for MARL with off-beat actions. We then propose a novel episodic memory, LeGEM, for model-free MARL algorithms. LeGEM builds agents' episodic memories by utilizing agents' individual experiences. It boosts multi-agent learning by addressing the challenging temporal credit assignment…
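To make the off-beat setting concrete, here is a minimal toy sketch in Python. It is not the paper's environment or the LeGEM algorithm: the class `OffBeatToyEnv`, the `DURATIONS` table, and the reward rule are all hypothetical, chosen only to show how pre-set execution durations separate the step at which an action is committed from the step at which its effect (and reward) appears.

```python
from collections import defaultdict

# Hypothetical pre-set execution durations, in environment steps.
DURATIONS = {"fast": 1, "slow": 3}


class OffBeatToyEnv:
    """Toy environment (an assumption for illustration only).

    An action committed at time t with duration d completes at time t + d.
    The team earns reward 1.0 on a step where every agent has an action
    completing; otherwise the reward is 0.0.
    """

    def __init__(self, n_agents=2):
        self.n_agents = n_agents
        self.t = 0
        # completion time -> ids of agents whose action finishes then
        self.pending = defaultdict(list)

    def step(self, actions):
        """actions maps agent id -> action name, or None for no new action."""
        for agent_id, action in actions.items():
            if action is not None:
                self.pending[self.t + DURATIONS[action]].append(agent_id)
        self.t += 1
        completed = self.pending.pop(self.t, [])
        return 1.0 if len(set(completed)) == self.n_agents else 0.0


def rollout(schedule):
    """Run one episode over a list of joint actions and collect rewards."""
    env = OffBeatToyEnv()
    return [env.step(joint_action) for joint_action in schedule]


# Both agents act at t=0: the effects land on different steps, so no reward.
naive = [
    {0: "slow", 1: "fast"},
    {0: None, 1: None},
    {0: None, 1: None},
]

# Agent 1 waits until t=2 so that both actions complete at t=3.
coordinated = [
    {0: "slow", 1: None},
    {0: None, 1: None},
    {0: None, 1: "fast"},
]

print(rollout(naive))
print(rollout(coordinated))
```

In the coordinated schedule the reward arrives on the third step, although agent 0 committed its action on the first. Attributing that delayed reward to the earlier decision is a small instance of the temporal credit assignment problem the abstract refers to.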

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural and Behavioral Psychology Studies