Assigning Credit with Partial Reward Decoupling in Multi-Agent Proximal Policy Optimization

Aditya Kapoor; Benjamin Freed; Howie Choset; Jeff Schneider

arXiv:2408.04295·cs.MA·February 10, 2025·2 cites

TL;DR

This paper introduces PRD-MAPPO, a multi-agent reinforcement learning algorithm that improves credit assignment in MAPPO. It uses a learned attention mechanism to dynamically decompose large groups of agents into smaller subgroups, which improves data efficiency and asymptotic performance.

Contribution

The paper applies partial reward decoupling (PRD), driven by a learned attention mechanism, to credit assignment in MAPPO, and includes a version for shared-reward scenarios.

Findings

01. PRD-MAPPO outperforms MAPPO and other methods in multi-agent tasks.

02. It improves data efficiency and asymptotic performance.

03. The shared reward version of PRD-MAPPO is effective.

Abstract

Multi-agent proximal policy optimization (MAPPO) has recently demonstrated state-of-the-art performance on challenging multi-agent reinforcement learning tasks. However, MAPPO still struggles with the credit assignment problem, wherein the sheer difficulty in ascribing credit to individual agents' actions scales poorly with team size. In this paper, we propose a multi-agent reinforcement learning algorithm that adapts recent developments in credit assignment to improve upon MAPPO. Our approach leverages partial reward decoupling (PRD), which uses a learned attention mechanism to estimate which of a particular agent's teammates are relevant to its learning updates. We use this estimate to dynamically decompose large groups of agents into smaller, more manageable subgroups. We empirically demonstrate that our approach, PRD-MAPPO, decouples agents from teammates that do not influence their…
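The decoupling idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the threshold value, and the convention that `attention[i][j]` is the weight agent i places on teammate j are all assumptions made for illustration. The sketch treats a teammate as relevant to agent i when its attention weight exceeds a threshold, and then sums rewards only over that relevant subgroup.

```python
# Illustrative sketch only; names, threshold, and the attention convention
# are hypothetical and not taken from the paper.

def relevant_sets(attention, threshold=0.01):
    """Return, for each agent i, the set of teammates treated as relevant.

    attention[i][j] is assumed to be the learned attention weight that
    agent i places on teammate j. An agent is always relevant to itself.
    """
    n = len(attention)
    return [
        {j for j in range(n) if j == i or attention[i][j] > threshold}
        for i in range(n)
    ]


def decoupled_rewards(rewards, sets):
    """Sum rewards over each agent's relevant subgroup instead of the full team."""
    return [sum(rewards[j] for j in subgroup) for subgroup in sets]


# Three agents: agents 0 and 1 attend to each other, agent 2 is isolated.
attention = [
    [0.0, 0.6, 0.0],
    [0.5, 0.0, 0.005],  # 0.005 falls below the threshold, so agent 2 is dropped
    [0.0, 0.0, 0.0],
]
rewards = [1.0, 2.0, 4.0]

sets = relevant_sets(attention)
per_agent = decoupled_rewards(rewards, sets)
```

In this toy example, agents 0 and 1 form one subgroup and agent 2 forms its own, so each agent's learning signal excludes rewards from teammates it is decoupled from. A full method would feed such subgroup-restricted signals into the advantage estimate of a PPO-style update.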

Code & Models

Repositories

uoe-agents/pressureplate
noneOfficial

Taxonomy

Topics: Auction Theory and Applications · Efficiency Analysis Using DEA · Supply Chain and Inventory Management

Methods: Softmax · Attention Is All You Need