Message-Dropout: An Efficient Training Method for Multi-Agent Deep Reinforcement Learning

Woojun Kim; Myungsik Cho; Youngchul Sung

arXiv:1902.06527·cs.LG·February 19, 2019·6 cites

TL;DR

This paper introduces message-dropout, a training method for multi-agent deep reinforcement learning that randomly drops the messages received from other agents, block by block, during training to improve robustness and performance.

Contribution

The paper proposes message-dropout, a technique that makes multi-agent reinforcement learning more robust to communication errors and improves training efficiency.

Findings

01. Message-dropout improves training speed.

02. It enhances steady-state performance.

03. It effectively manages communication errors.

Abstract

In this paper, we propose a new learning technique named message-dropout to improve the performance of multi-agent deep reinforcement learning under two application scenarios: 1) classical multi-agent reinforcement learning with direct message communication among agents and 2) centralized training with decentralized execution. In the first application scenario, in which direct message communication among agents is allowed, the message-dropout technique drops out the messages received from other agents in a block-wise manner with a certain probability in the training phase and compensates for this effect by multiplying the weights of the dropped-out block units by a correction probability. The applied message-dropout technique effectively handles the increased input dimension in multi-agent reinforcement learning with communication and makes learning robust…
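The block-wise dropping described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the argument layout, and the choice of `1 - drop_prob` as the correction probability (by analogy with standard dropout) are assumptions made here for illustration. The sketch also scales the message inputs at evaluation time instead of the first-layer weights; for a linear first layer the two give the same pre-activations.

```python
import numpy as np

def message_dropout_train(own_obs, messages, drop_prob, rng):
    """Training-time input: zero each received message block with probability drop_prob.

    One Bernoulli draw is made per message (per block), not per scalar unit,
    so a message is either kept whole or dropped whole.
    """
    keep = rng.random(len(messages)) >= drop_prob
    blocks = [m if k else np.zeros_like(m) for m, k in zip(messages, keep)]
    return np.concatenate([own_obs, *blocks]), keep

def message_dropout_eval(own_obs, messages, drop_prob):
    """Evaluation-time input: keep every message, scaled by the assumed
    correction factor (1 - drop_prob). The agent's own observation is untouched."""
    scaled = [(1.0 - drop_prob) * m for m in messages]
    return np.concatenate([own_obs, *scaled])

# Hypothetical example: one agent with a 3-dim observation and two 2-dim messages.
rng = np.random.default_rng(0)
own_obs = np.ones(3)
messages = [np.full(2, 1.0), np.full(2, 2.0)]

x_train, keep_mask = message_dropout_train(own_obs, messages, drop_prob=0.5, rng=rng)
x_eval = message_dropout_eval(own_obs, messages, drop_prob=0.5)
print(keep_mask, x_train, x_eval)
```

Dropping whole blocks rather than individual units means the agent's network regularly sees inputs in which some agents' messages are entirely absent, which is the situation a lost or corrupted message would produce.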

Taxonomy

Topics: Reinforcement Learning in Robotics · Adaptive Dynamic Programming Control · Neural Networks and Reservoir Computing

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Dropout