Efficient Distributed Framework for Collaborative Multi-Agent Reinforcement Learning
Shuhan Qi, Shuhao Zhang, Xiaohan Hou, Jiajia Zhang, Xuan Wang, Jing Xiao

TL;DR
This paper introduces an efficient distributed multi-agent reinforcement learning framework that enhances sample collection, diversity, and training speed by leveraging asynchronous modules and decoupling model updates from environment interactions.
Contribution
The paper proposes a novel distributed MARL framework based on an actor-work-learner architecture. It targets multi-agent environments with incomplete information and aims to improve training efficiency and stability.
Findings
Framework accelerates sample collection and policy iteration.
Effective in military simulation and real-time strategy environments.
Improves training stability and efficiency in multi-agent settings.
Abstract
Multi-agent reinforcement learning for incomplete information environments has attracted extensive attention from researchers. However, slow sample collection and poor sample exploration still cause problems in multi-agent reinforcement learning, such as unstable model iteration and low training efficiency. Moreover, most existing distributed frameworks are designed for single-agent reinforcement learning and are not suitable for multi-agent settings. In this paper, we design a distributed MARL framework based on the actor-work-learner architecture. In this framework, multiple asynchronous environment interaction modules can be deployed simultaneously, which greatly improves the sample collection speed and sample diversity. Meanwhile, to make full use of computing resources, we decouple the model iteration from environment interaction, and thus accelerate the policy…
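The decoupling described in the abstract can be illustrated with a generic actor/learner pattern. The sketch below is not the paper's implementation: the names `actor`, `learner`, and `run`, the toy environment, and the shared queue are all assumptions made for illustration, and the intermediate "work" layer is omitted because the excerpt does not describe it. It only shows several asynchronous interaction threads feeding samples to a learner that updates independently of them.

```python
import queue
import random
import threading


def actor(actor_id, sample_queue, policy_version, num_steps):
    """Hypothetical actor: steps a toy environment and pushes samples."""
    rng = random.Random(actor_id)  # a different seed per actor diversifies samples
    for _ in range(num_steps):
        observation = rng.random()  # stand-in for an environment observation
        reward = 1.0 if observation > 0.5 else 0.0
        # Tag each sample with the policy version that was visible when it was made.
        sample_queue.put((actor_id, policy_version["value"], observation, reward))


def learner(sample_queue, policy_version, total_samples, batch_size):
    """Hypothetical learner: consumes batches without pausing the actors."""
    consumed = 0
    seen_actors = set()
    while consumed < total_samples:
        size = min(batch_size, total_samples - consumed)
        batch = [sample_queue.get() for _ in range(size)]
        consumed += len(batch)
        seen_actors.update(sample[0] for sample in batch)
        policy_version["value"] += 1  # stand-in for one model update
    return consumed, seen_actors


def run(num_actors=4, steps_per_actor=50, batch_size=20):
    sample_queue = queue.Queue()
    policy_version = {"value": 0}
    threads = [
        threading.Thread(
            target=actor,
            args=(i, sample_queue, policy_version, steps_per_actor),
        )
        for i in range(num_actors)
    ]
    for thread in threads:
        thread.start()
    consumed, seen_actors = learner(
        sample_queue, policy_version, num_actors * steps_per_actor, batch_size
    )
    for thread in threads:
        thread.join()
    return consumed, sorted(seen_actors), policy_version["value"]


if __name__ == "__main__":
    print(run())
```

In this toy setup the actors never wait for a model update and the learner never waits for any particular actor, which is the property the abstract attributes to separating model iteration from environment interaction. A real system would replace the threads with processes or machines and the version counter with actual parameter synchronization.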
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings
