Remember and Forget Experience Replay for Multi-Agent Reinforcement Learning

Pascal Weber; Daniel Wälchli; Mustafa Zeqiri; Petros Koumoutsakos

arXiv:2203.13319 · cs.LG · September 5, 2022

TL;DR

This paper extends the ReF-ER algorithm to multi-agent reinforcement learning. Using a single feed-forward neural network for the policy and the value function, with values estimated from individual rewards, it reports better performance than existing algorithms on the SISL benchmark environments.

Contribution

The paper introduces ReF-ER for MARL, showing its effectiveness and simplicity compared to complex neural network approaches in multi-agent settings.

Findings

01 ReF-ER MARL outperforms existing algorithms on the SISL benchmarks.

02 In collaborative environments, estimating the value from individual rewards gives the best performance.

03 A single feed-forward neural network suffices for both the policy and the value function.

Abstract

We present the extension of the Remember and Forget for Experience Replay (ReF-ER) algorithm to Multi-Agent Reinforcement Learning (MARL). ReF-ER was shown to outperform state-of-the-art algorithms for continuous control in problems ranging from the OpenAI Gym to complex fluid flows. In MARL, the dependencies between the agents are included in the state-value estimator, and the environment dynamics are modeled via the importance weights used by ReF-ER. In collaborative environments, we find the best performance when the value is estimated using individual rewards and we ignore the effects of other actions on the transition map. We benchmark the performance of ReF-ER MARL on the Stanford Intelligent Systems Laboratory (SISL) environments. We find that employing a single feed-forward neural network for the policy and the value function in ReF-ER MARL outperforms state of the art…
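The role of the importance weights mentioned in the abstract can be illustrated with a minimal sketch. It assumes the rule from the original single-agent ReF-ER description: a replayed sample has weight rho = pi(a|s) / mu(a|s), and it contributes to the gradient only while rho stays inside (1/c_max, c_max). The function names, the threshold name `c_max`, and the product form of the joint weight are illustrative assumptions; the abstract does not state the exact formulas used in the multi-agent setting.

```python
import math

def importance_weight(logp_current, logp_behavior):
    """rho = pi(a|s) / mu(a|s) for one agent, from log-probabilities."""
    return math.exp(logp_current - logp_behavior)

def joint_importance_weight(logps_current, logps_behavior):
    """Assumed joint variant: product of the per-agent weights.

    This is one hypothetical way to let the other agents' actions
    enter the weight; it is not taken from the abstract.
    """
    return math.exp(sum(logps_current) - sum(logps_behavior))

def is_near_policy(rho, c_max):
    """A sample counts as near-policy when 1/c_max < rho < c_max."""
    return 1.0 / c_max < rho < c_max

def gradient_scale(rho, c_max):
    """Far-policy samples are skipped: their gradient is scaled by zero."""
    return 1.0 if is_near_policy(rho, c_max) else 0.0

if __name__ == "__main__":
    # Two agents, each now twice as likely to pick its stored action.
    current = [math.log(0.5), math.log(0.5)]
    behavior = [math.log(0.25), math.log(0.25)]
    c_max = 3.0  # hypothetical threshold value

    rho_individual = importance_weight(current[0], behavior[0])
    rho_joint = joint_importance_weight(current, behavior)
    print("individual:", rho_individual, gradient_scale(rho_individual, c_max))
    print("joint:", rho_joint, gradient_scale(rho_joint, c_max))
```

Under these assumptions, a joint weight drifts out of the trust interval faster than an individual one because per-agent ratios multiply. Ignoring the other agents' actions, as the abstract reports works best in collaborative environments, would correspond to using the individual weight.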

Taxonomy

Topics: Reinforcement Learning in Robotics · Data Stream Mining Techniques

Methods: Experience Replay