MAMBPO: Sample-efficient multi-robot reinforcement learning using learned world models
Daniël Willemsen, Mario Coppola, Guido C.H.E. de Croon

TL;DR
This paper introduces MAMBPO, a multi-agent model-based reinforcement learning algorithm that significantly improves sample efficiency in multi-robot systems by using learned world models within a centralized training and decentralized execution framework.
Contribution
The paper presents MAMBPO, a novel multi-agent model-based RL algorithm that uses learned world models to improve sample efficiency over model-free methods such as MASAC.
Findings
MAMBPO achieves similar performance to MASAC with fewer samples.
MAMBPO demonstrates effective learning in simulated multi-robot tasks.
The approach advances the feasibility of real-world multi-robot learning.
Abstract
Multi-robot systems can benefit from reinforcement learning (RL) algorithms that learn behaviours in a small number of trials, a property known as sample efficiency. This research thus investigates the use of learned world models to improve sample efficiency. We present a novel multi-agent model-based RL algorithm: Multi-Agent Model-Based Policy Optimization (MAMBPO), utilizing the Centralized Learning for Decentralized Execution (CLDE) framework. CLDE algorithms allow a group of agents to act in a fully decentralized manner after training. This is a desirable property for many systems comprising multiple robots. MAMBPO uses a learned world model to improve sample efficiency compared to model-free Multi-Agent Soft Actor-Critic (MASAC). We demonstrate this on two simulated multi-robot tasks, where MAMBPO achieves a similar performance to MASAC, but requires far fewer samples to do so.…
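The abstract does not spell out how the learned world model is used, so the following is only a minimal sketch of the general idea, assuming a rollout scheme in the spirit of model-based policy optimization: real transitions train a model of the joint dynamics, and short imagined rollouts branched from real states add extra training data. Everything here is hypothetical and not taken from the paper: the toy two-agent line environment (`true_step`), the tabular `TabularWorldModel` (standing in for a learned neural network model), and the random `decentralized_policy` (standing in for learned actors). The policy update itself is omitted.

```python
import random
from collections import defaultdict


class TabularWorldModel:
    """Toy stand-in for a learned world model: memorizes observed joint transitions."""

    def __init__(self):
        # (state, joint_action) -> list of observed (next_state, reward) outcomes
        self.outcomes = defaultdict(list)

    def update(self, state, joint_action, next_state, reward):
        self.outcomes[(state, joint_action)].append((next_state, reward))

    def predict(self, state, joint_action, rng):
        """Sample a remembered outcome, or None if this pair was never observed."""
        seen = self.outcomes.get((state, joint_action))
        return rng.choice(seen) if seen else None


def true_step(state, joint_action):
    """Hypothetical environment: two agents on a line of 5 cells, each moving -1, 0 or +1."""
    next_state = tuple(min(4, max(0, p + a)) for p, a in zip(state, joint_action))
    reward = 1.0 if next_state[0] == next_state[1] else 0.0  # shared reward for meeting
    return next_state, reward


def decentralized_policy(own_position, rng):
    """Each agent acts on its own observation only (random here; the input is unused)."""
    return rng.choice((-1, 0, 1))


def collect_real(model, real_buffer, n_steps, rng):
    """Interact with the real environment; centralized training sees joint transitions."""
    state = (0, 4)
    for _ in range(n_steps):
        joint_action = tuple(decentralized_policy(p, rng) for p in state)
        next_state, reward = true_step(state, joint_action)
        model.update(state, joint_action, next_state, reward)
        real_buffer.append((state, joint_action, reward, next_state))
        state = next_state


def generate_model_rollouts(model, real_buffer, n_rollouts, horizon, rng):
    """Branch short imagined rollouts from real states using the world model."""
    model_buffer = []
    for _ in range(n_rollouts):
        state = rng.choice(real_buffer)[0]
        for _ in range(horizon):
            joint_action = tuple(decentralized_policy(p, rng) for p in state)
            prediction = model.predict(state, joint_action, rng)
            if prediction is None:  # the model has no data for this pair; stop early
                break
            next_state, reward = prediction
            model_buffer.append((state, joint_action, reward, next_state))
            state = next_state
    return model_buffer


rng = random.Random(0)
model = TabularWorldModel()
real_buffer = []
collect_real(model, real_buffer, n_steps=200, rng=rng)
model_buffer = generate_model_rollouts(model, real_buffer, n_rollouts=500, horizon=3, rng=rng)
print(len(real_buffer), len(model_buffer))
```

In this sketch the imagined transitions in `model_buffer` cost no additional environment interaction, which is the sense in which a world model can reduce the number of real samples needed. A learned model would generalize to unseen state-action pairs but could also introduce prediction error, which the short rollout horizon is meant to limit.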
Taxonomy
Topics: Reinforcement Learning in Robotics · Simulation Techniques and Applications · Software Engineering Research
