Decentralized Multi-Agent Reinforcement Learning with Global State Prediction
Joshua Bloom, Pranjal Paliwal, Apratim Mukherjee, Carlo Pinciroli

TL;DR
This paper investigates decentralized multi-agent reinforcement learning for robot swarms. It introduces a global state prediction network that improves performance without relying on global information, and demonstrates the approach on collective transport tasks.
Contribution
The paper proposes Global State Prediction (GSP), a novel network that predicts swarm states, enhancing decentralized learning and robustness in multi-agent systems without global knowledge.
Findings
GSP improves success rates in collective transport tasks.
Including GSP increases robustness against obstacles.
Decentralized methods with GSP outperform those without global information.
Abstract
Deep reinforcement learning (DRL) has seen remarkable success in the control of single robots. However, applying DRL to robot swarms presents significant challenges. A critical challenge is non-stationarity, which occurs when two or more robots update individual or shared policies concurrently, thereby engaging in an interdependent training process with no guarantees of convergence. Circumventing non-stationarity typically involves training the robots with global information about other agents' states and/or actions. In contrast, in this paper we explore how to remove the need for global information. We pose our problem as a Partially Observable Markov Decision Process, due to the absence of global knowledge on other agents. Using collective transport as a testbed scenario, we study two approaches to multi-agent training. In the first, the robots exchange no messages, and are trained to…
Taxonomy
Topics: Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing · Domain Adaptation and Few-Shot Learning
