Multiagent Model-based Credit Assignment for Continuous Control

Dongge Han; Chris Xiaoxuan Lu; Tomasz Michalak; Michael Wooldridge

arXiv:2112.13937·cs.AI·December 30, 2021·5 cites

Open Access

TL;DR

This paper introduces a decentralized multiagent reinforcement learning framework for continuous control in robotics, combining cooperative PPO, game-theoretic credit assignment, and model-based RL to improve sample efficiency and enable decentralized operation.

Contribution

It presents a novel decentralized multiagent RL framework that combines game-theoretic credit assignment with model-based components for continuous control tasks.

Findings

01 Effective in MuJoCo locomotion tasks

02 Improves sample efficiency significantly

03 Enables decentralized control without communication

Abstract

Deep reinforcement learning (RL) has recently shown great promise in robotic continuous control tasks. Nevertheless, prior research in this vein centers on the centralized learning setting, which largely relies on the availability of communication among all the components of a robot. However, agents in the real world often operate in a decentralised fashion without communication due to latency requirements, limited power budgets and safety concerns. By formulating robotic components as a system of decentralised agents, this work presents a decentralised multiagent reinforcement learning framework for continuous control. To this end, we first develop a cooperative multiagent PPO framework that allows for centralized optimisation during training and decentralised operation during execution. However, the system only receives a global reward signal which is not attributed to individual agents.…
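The credit assignment problem mentioned in the abstract, splitting one global reward among cooperating agents, can be illustrated with a small sketch. The version below uses Shapley values, one standard game-theoretic attribution scheme. It is an assumption made for illustration, not necessarily the scheme used in the paper. The agent names (`hip`, `knee`, `ankle`), the function `shapley_credits`, and the reward function `toy_reward` are all hypothetical.

```python
from itertools import permutations


def shapley_credits(agents, global_reward):
    """Average each agent's marginal contribution over all join orders.

    `global_reward` maps a frozenset of active agents to a scalar reward.
    Exact enumeration is only practical for a handful of agents.
    """
    credits = {agent: 0.0 for agent in agents}
    orders = list(permutations(agents))
    for order in orders:
        coalition = frozenset()
        for agent in order:
            with_agent = coalition | {agent}
            # Marginal contribution of `agent` when joining `coalition`.
            credits[agent] += global_reward(with_agent) - global_reward(coalition)
            coalition = with_agent
    return {agent: total / len(orders) for agent, total in credits.items()}


def toy_reward(coalition):
    """Hypothetical global reward (not from the paper).

    The hip alone earns 1.0, hip and knee together earn a further 2.0,
    and the ankle contributes nothing.
    """
    reward = 0.0
    if "hip" in coalition:
        reward += 1.0
    if {"hip", "knee"} <= coalition:
        reward += 2.0
    return reward


credits = shapley_credits(["hip", "knee", "ankle"], toy_reward)
print(credits)  # {'hip': 2.0, 'knee': 1.0, 'ankle': 0.0}
```

The per-agent credits sum to the reward of the full team (3.0), so the global signal is fully attributed. In a learning setting, evaluating the reward for coalitions that were never actually executed would need some form of estimate, which is one place a learned model could plausibly be used.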

Taxonomy

Topics: Reinforcement Learning in Robotics · Robotic Locomotion and Control · Zebrafish Biomedical Research Applications

Methods: Entropy Regularization · Proximal Policy Optimization