Parameter Sharing Deep Deterministic Policy Gradient for Cooperative Multi-agent Reinforcement Learning
Xiangxiang Chu, Hangjun Ye

TL;DR
This paper introduces a parameter sharing approach for multi-agent deep deterministic policy gradient methods, significantly improving scalability, learning speed, and memory efficiency in cooperative multi-agent reinforcement learning tasks.
Contribution
It proposes a novel parameter sharing deterministic policy gradient method with three variants, enhancing scalability and efficiency over existing multi-agent DRL approaches.
Findings
Outperforms existing methods in multi-agent games
Scales well with increasing number of agents
Improves learning speed and memory efficiency
Abstract
Deep reinforcement learning for multi-agent cooperation and competition has recently been a hot topic. This paper focuses on cooperative multi-agent problems, using actor-critic methods under local-observation settings. Multi-agent deep deterministic policy gradient has obtained state-of-the-art results on some multi-agent games, but it does not scale well as the number of agents grows. To improve scalability, we propose a parameter sharing deterministic policy gradient method with three neural-network-based variants: actor-critic sharing, actor sharing, and actor sharing with a partially shared critic. Benchmarks from rllab show that the proposed method has advantages in learning speed and memory efficiency, scales well with a growing number of agents, and can make full use of reward sharing and exchangeability when they are available.
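To make the memory argument concrete, the sketch below contrasts one actor per agent with a single actor whose weights are shared by all agents, each acting on its own local observation. It is an illustration only, not the authors' implementation: the two-layer network, the hidden size, the dimensions, and the helper names (`init_actor`, `act`, `n_params`) are all assumptions, and the critic and the training loop are omitted.

```python
import numpy as np

def init_actor(obs_dim, act_dim, hidden=64, seed=0):
    """Create weights for a small deterministic actor (hypothetical architecture)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, (obs_dim, hidden)),
        "b1": np.zeros(hidden),
        "W2": rng.normal(0.0, 0.1, (hidden, act_dim)),
        "b2": np.zeros(act_dim),
    }

def act(params, obs):
    """Map local observations to bounded actions in [-1, 1]."""
    h = np.tanh(obs @ params["W1"] + params["b1"])
    return np.tanh(h @ params["W2"] + params["b2"])

def n_params(params):
    """Count the scalar parameters in one actor."""
    return sum(p.size for p in params.values())

# Illustrative sizes, not taken from the paper.
n_agents, obs_dim, act_dim = 8, 10, 2

# Baseline without sharing: one actor per agent, so memory grows with n_agents.
independent = [init_actor(obs_dim, act_dim, seed=i) for i in range(n_agents)]
independent_count = sum(n_params(p) for p in independent)

# Actor sharing: every agent uses the same weights on its own local observation.
shared = init_actor(obs_dim, act_dim)
shared_count = n_params(shared)

# One batched forward pass yields an action for every agent.
obs = np.random.default_rng(1).normal(size=(n_agents, obs_dim))
actions = act(shared, obs)

print("independent actors:", independent_count, "parameters")
print("shared actor:      ", shared_count, "parameters")
print("actions shape:     ", actions.shape)
```

In this sketch the shared actor's parameter count stays fixed as `n_agents` increases, while the per-agent baseline grows linearly. Sharing weights like this presumes the agents are interchangeable enough to use the same policy, which is the exchangeability condition the abstract mentions.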
Taxonomy
Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Autonomous Vehicle Technology and Safety
