Heterogeneous Multi-Agent Reinforcement Learning via Mirror Descent Policy Optimization

Mohammad Mehdi Nasiri; Mansoor Rezghi

arXiv:2308.06741 · cs.LG · August 15, 2023

TL;DR

This paper introduces HAMDPO, a novel mirror descent-based algorithm for heterogeneous multi-agent reinforcement learning that improves policy stability and performance across diverse tasks.

Contribution

It extends mirror descent methods to heterogeneous MARL, enabling efficient, stable policy updates for agents with different capabilities and action spaces.
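For orientation, a single-agent mirror descent policy update with a KL proximity term is commonly written as below. This is a generic form rather than the paper's exact objective: the symbols $\pi_k$ (current policy), $A^{\pi_k}$ (advantage function), $\rho_{\pi_k}$ (state distribution) and $t_k$ (step size) are assumed notation that does not appear on this page.

```latex
\pi_{k+1} \;=\; \arg\max_{\pi}\;
\mathbb{E}_{s \sim \rho_{\pi_k}}\!\left[
  \mathbb{E}_{a \sim \pi(\cdot \mid s)}\!\left[ A^{\pi_k}(s, a) \right]
  \;-\; \frac{1}{t_k}\,
  \mathrm{KL}\!\left( \pi(\cdot \mid s) \,\middle\|\, \pi_k(\cdot \mid s) \right)
\right]
```

The KL term plays the role of a soft trust region: a larger $t_k$ allows the new policy to move further from $\pi_k$.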

Findings

01. HAMDPO outperforms HATRPO and HAPPO on MuJoCo and StarCraftII tasks.

02. The algorithm effectively handles both continuous and discrete action spaces.

03. Results demonstrate improved stability and performance in cooperative MARL settings.

Abstract

This paper presents an extension of the Mirror Descent method to overcome challenges in cooperative Multi-Agent Reinforcement Learning (MARL) settings, where agents have varying abilities and individual policies. The proposed Heterogeneous-Agent Mirror Descent Policy Optimization (HAMDPO) algorithm utilizes the multi-agent advantage decomposition lemma to enable efficient policy updates for each agent while ensuring overall performance improvements. By iteratively updating agent policies through an approximate solution of the trust-region problem, HAMDPO guarantees stability and improves performance. Moreover, the HAMDPO algorithm is capable of handling both continuous and discrete action spaces for heterogeneous agents in various MARL problems. We evaluate HAMDPO on Multi-Agent MuJoCo and StarCraftII tasks, demonstrating its superiority over state-of-the-art algorithms such as HATRPO…
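The sequential, per-agent update described in the abstract can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes a single-state cooperative game with a hypothetical reward table, two agents with different numbers of discrete actions, and a tabular closed-form step (new policy proportional to old policy times `exp(step_size * advantage)`) in place of the approximate trust-region solution the paper uses. All function names and parameters are made up for illustration.

```python
import math


def md_step(old_policy, advantages, step_size):
    """Closed-form KL mirror descent step for a tabular policy:
    new(a) is proportional to old(a) * exp(step_size * advantage(a))."""
    weights = [p * math.exp(step_size * adv)
               for p, adv in zip(old_policy, advantages)]
    total = sum(weights)
    return [w / total for w in weights]


def expected_reward(reward, pi_a, pi_b):
    """Expected shared reward under independent policies pi_a and pi_b."""
    return sum(pi_a[i] * pi_b[j] * reward[i][j]
               for i in range(len(pi_a))
               for j in range(len(pi_b)))


def sequential_update(reward, pi_a, pi_b, step_size=1.0):
    """Update agent A first, then agent B against A's *updated* policy."""
    # Agent A: local advantage, averaging over B's old policy.
    q_a = [sum(pi_b[j] * reward[i][j] for j in range(len(pi_b)))
           for i in range(len(pi_a))]
    base_a = sum(p * q for p, q in zip(pi_a, q_a))
    new_a = md_step(pi_a, [q - base_a for q in q_a], step_size)

    # Agent B: local advantage, averaging over A's new policy.
    q_b = [sum(new_a[i] * reward[i][j] for i in range(len(new_a)))
           for j in range(len(pi_b))]
    base_b = sum(p * q for p, q in zip(pi_b, q_b))
    new_b = md_step(pi_b, [q - base_b for q in q_b], step_size)
    return new_a, new_b


# Hypothetical shared reward: agent A has 2 actions, agent B has 3.
reward = [[1.0, 0.0, 0.2],
          [0.0, 2.0, 0.1]]
pi_a = [0.5, 0.5]
pi_b = [1 / 3, 1 / 3, 1 / 3]

before = expected_reward(reward, pi_a, pi_b)
for _ in range(20):
    pi_a, pi_b = sequential_update(reward, pi_a, pi_b)
after = expected_reward(reward, pi_a, pi_b)
print(f"expected reward: {before:.3f} -> {after:.3f}")
```

In this toy setting each agent's step cannot lower the expected shared reward, because the old policy is always a feasible candidate with zero objective value. That mirrors, in a much simpler form, the improvement property the abstract attributes to the per-agent updates.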

Code & Models

Repositories

mehdinasiri/mirror-descent-in-marl (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics