MAMBPO: Sample-efficient multi-robot reinforcement learning using learned world models

Daniël Willemsen; Mario Coppola; Guido C.H.E. de Croon

arXiv:2103.03662·cs.RO·March 8, 2021


TL;DR

This paper introduces MAMBPO, a multi-agent model-based reinforcement learning algorithm that significantly improves sample efficiency in multi-robot systems by using learned world models within a centralized training and decentralized execution framework.

Contribution

The paper presents MAMBPO, a novel multi-agent model-based RL algorithm that uses learned world models to improve sample efficiency over model-free methods such as MASAC.

Findings

01 MAMBPO reaches performance similar to MASAC with far fewer samples.

02 MAMBPO learns effectively on two simulated multi-robot tasks.

03 The improved sample efficiency makes learning on real multi-robot systems more feasible.

Abstract

Multi-robot systems can benefit from reinforcement learning (RL) algorithms that learn behaviours in a small number of trials, a property known as sample efficiency. This research thus investigates the use of learned world models to improve sample efficiency. We present a novel multi-agent model-based RL algorithm: Multi-Agent Model-Based Policy Optimization (MAMBPO), utilizing the Centralized Learning for Decentralized Execution (CLDE) framework. CLDE algorithms allow a group of agents to act in a fully decentralized manner after training. This is a desirable property for many systems comprising multiple robots. MAMBPO uses a learned world model to improve sample efficiency compared to model-free Multi-Agent Soft Actor-Critic (MASAC). We demonstrate this on two simulated multi-robot tasks, where MAMBPO achieves similar performance to MASAC but requires far fewer samples to do so.…
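
The abstract does not spell out how the world model is used, so the following is only a minimal sketch of the general idea, assuming an MBPO-style scheme: collect a few real transitions, fit a model to them centrally, then generate many short model rollouts that start from real states. The toy environment, the one-parameter model, the noisy proportional policy, and all names (`true_step`, `model_step`, `gain`, `N_ROLLOUTS`, `ROLLOUT_LEN`) are hypothetical and are not taken from the paper or its repository. The actor-critic update that would consume the model-generated data is omitted.

```python
import random

random.seed(0)
N_AGENTS = 2

def true_step(states, actions):
    """Hypothetical toy environment: each robot's scalar state shifts by 0.1 * its action."""
    next_states = [s + 0.1 * a for s, a in zip(states, actions)]
    reward = -sum(s * s for s in next_states)  # shared team reward
    return next_states, reward

def policy(own_state):
    """Decentralized policy: each agent acts on its own state only (noisy, clipped)."""
    return max(-1.0, min(1.0, -own_state + random.gauss(0.0, 0.3)))

# 1. Collect a small number of real transitions.
real_buffer = []
states = [random.uniform(-1.0, 1.0) for _ in range(N_AGENTS)]
for _ in range(50):
    actions = [policy(s) for s in states]
    next_states, reward = true_step(states, actions)
    real_buffer.append((states, actions, reward, next_states))
    states = next_states

# 2. Fit a world model centrally, using data from all agents.
#    Assumed model form: s' = s + gain * a, with gain found by least squares.
num = sum(a * (ns - s)
          for st, ac, _, nst in real_buffer
          for s, a, ns in zip(st, ac, nst))
den = sum(a * a for _, ac, _, _ in real_buffer for a in ac)
gain = num / den

def model_step(states, actions):
    """Learned dynamics; the reward function is assumed known in this sketch."""
    next_states = [s + gain * a for s, a in zip(states, actions)]
    return next_states, -sum(s * s for s in next_states)

# 3. Generate short model rollouts branching from real states.
N_ROLLOUTS, ROLLOUT_LEN = 200, 2
model_buffer = []
for _ in range(N_ROLLOUTS):
    states = random.choice(real_buffer)[0]
    for _step in range(ROLLOUT_LEN):
        actions = [policy(s) for s in states]
        next_states, reward = model_step(states, actions)
        model_buffer.append((states, actions, reward, next_states))
        states = next_states

print(f"real transitions: {len(real_buffer)}, model transitions: {len(model_buffer)}")
```

In this sketch the policy update would train on `model_buffer`, which is several times larger than `real_buffer` while costing no additional environment interaction. That is the sense in which a learned model can reduce the number of real samples needed.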


Code & Models

Repositories

danielwillemsen/mambpo
PyTorch · Official


Taxonomy

Topics: Reinforcement Learning in Robotics · Simulation Techniques and Applications · Software Engineering Research