Centralized Training with Hybrid Execution in Multi-Agent Reinforcement Learning

Pedro P. Santos; Diogo S. Carvalho; Miguel Vasco; Alberto Sardinha; Pedro A. Santos; Ana Paiva; Francisco S. Melo

arXiv:2210.06274 · cs.LG · June 6, 2023 · 1 citation

TL;DR

This paper introduces hybrid execution in multi-agent reinforcement learning, enabling agents to adapt to varying communication levels at runtime using a centralized predictive model, improving cooperation under partial observability.

Contribution

The paper proposes hybrid-POMDPs to model variable communication levels and introduces MARO, a centralized auto-regressive model that estimates missing observations during execution.
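The idea of estimating missing observations auto-regressively can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`fill_missing`, `rollout`) and the `persistence_predictor` are hypothetical, and the predictor is a trivial stand-in ("nothing changed") for the learned model the paper describes. The sketch only shows the data flow: observations that arrive are kept, missing ones are replaced by predictions, and each prediction is conditioned on the previous (possibly estimated) joint observation.

```python
from typing import Callable, List, Optional, Sequence

Obs = List[float]
JointObs = List[Obs]


def fill_missing(
    received: Sequence[Optional[Obs]],
    prev_estimate: JointObs,
    predictor: Callable[[JointObs], JointObs],
) -> JointObs:
    """Build a full joint observation: keep what arrived, predict the rest."""
    predicted = predictor(prev_estimate)
    return [
        list(obs) if obs is not None else predicted[i]
        for i, obs in enumerate(received)
    ]


def persistence_predictor(prev: JointObs) -> JointObs:
    # Hypothetical stand-in for a learned predictive model:
    # it simply predicts that every observation stays the same.
    return [list(o) for o in prev]


def rollout(
    stream: Sequence[Sequence[Optional[Obs]]],
    initial: JointObs,
    predictor: Callable[[JointObs], JointObs],
) -> List[JointObs]:
    """Auto-regressive loop: each estimate feeds the next prediction."""
    estimate = initial
    history = []
    for received in stream:
        estimate = fill_missing(received, estimate, predictor)
        history.append(estimate)
    return history


if __name__ == "__main__":
    # Two agents; None marks an observation that was not communicated.
    stream = [[[1.0], None], [None, None], [None, [5.0]]]
    for step, est in enumerate(rollout(stream, [[0.0], [0.0]], persistence_predictor)):
        print(step, est)
```

In the second step of the example no observation is received at all, so the estimate is built entirely from the previous estimate; this is the case in which the auto-regressive structure matters.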

Findings

01 MARO outperforms baselines in standard benchmarks.

02 Agents effectively cooperate with faulty communication.

03 Hybrid execution improves robustness in partial observability.

Abstract

We introduce hybrid execution in multi-agent reinforcement learning (MARL), a new paradigm in which agents aim to successfully complete cooperative tasks with arbitrary communication levels at execution time by taking advantage of information-sharing among the agents. Under hybrid execution, the communication level can range from a setting in which no communication is allowed between agents (fully decentralized), to a setting featuring full communication (fully centralized), but the agents do not know beforehand which communication level they will encounter at execution time. To formalize our setting, we define a new class of multi-agent partially observable Markov decision processes (POMDPs) that we name hybrid-POMDPs, which explicitly model a communication process between the agents. We contribute MARO, an approach that makes use of an auto-regressive predictive model, trained in a…
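As a rough illustration of the range of communication levels described in the abstract, the sketch below emulates a communication level with a single probability `p`. This is an assumption made for illustration, not the paper's formal hybrid-POMDP definition: each directed link between two agents is available with probability `p` at a given step, so `p = 0` corresponds to fully decentralized execution and `p = 1` to fully centralized execution. The names `sample_comm_matrix` and `visible_observations` are hypothetical.

```python
import random
from typing import List, Optional

Obs = List[float]


def sample_comm_matrix(n_agents: int, p: float, rng: random.Random) -> List[List[bool]]:
    """comm[i][j] is True when agent i can access agent j's observation."""
    comm = [[False] * n_agents for _ in range(n_agents)]
    for i in range(n_agents):
        for j in range(n_agents):
            # An agent always sees its own observation; other links
            # are available with probability p (illustrative assumption).
            comm[i][j] = True if i == j else rng.random() < p
    return comm


def visible_observations(
    joint_obs: List[Obs], comm: List[List[bool]], agent: int
) -> List[Optional[Obs]]:
    """Observations available to `agent`; None marks a missing one."""
    return [obs if comm[agent][j] else None for j, obs in enumerate(joint_obs)]


if __name__ == "__main__":
    rng = random.Random(0)
    joint_obs = [[0.1], [0.2], [0.3]]
    for p in (0.0, 0.5, 1.0):
        comm = sample_comm_matrix(3, p, rng)
        print(p, visible_observations(joint_obs, comm, agent=0))
```

Under this emulation, an agent that does not know `p` in advance must act on whatever subset of observations it happens to receive, which is the situation hybrid execution is meant to handle.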

Code & Models

Repositories

PPSantos/hybrid-marl (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics