Shapley Counterfactual Credits for Multi-Agent Reinforcement Learning

Jiahui Li; Kun Kuang; Baoxiang Wang; Furui Liu; Long Chen; Fei Wu; Jun Xiao

arXiv:2106.00285·cs.AI·January 25, 2022

TL;DR

This paper introduces a novel credit assignment method for multi-agent reinforcement learning using Shapley values, improving cooperation and performance by explicitly considering agent interactions.

Contribution

The paper proposes Shapley Counterfactual Credit Assignment, leveraging Shapley values with Monte Carlo approximation to better assign credit among agents in MARL.
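The Monte Carlo approximation of Shapley values mentioned above can be illustrated with a minimal sketch based on random permutation sampling. This is not the authors' implementation: `monte_carlo_shapley`, `toy_value`, and the three-agent game are hypothetical names and data chosen for illustration, and `value_fn` merely stands in for whatever coalition value a MARL method would use during training.

```python
import random
from typing import Callable, FrozenSet, List


def monte_carlo_shapley(
    n_agents: int,
    value_fn: Callable[[FrozenSet[int]], float],
    n_samples: int = 1000,
    seed: int = 0,
) -> List[float]:
    """Estimate each agent's Shapley value by sampling random agent orderings.

    For every sampled ordering, agents join the coalition one at a time and
    each is credited with the change in coalition value caused by joining.
    """
    rng = random.Random(seed)
    credits = [0.0] * n_agents
    order = list(range(n_agents))
    for _ in range(n_samples):
        rng.shuffle(order)
        coalition = set()
        prev_value = value_fn(frozenset(coalition))
        for agent in order:
            coalition.add(agent)
            cur_value = value_fn(frozenset(coalition))
            credits[agent] += cur_value - prev_value  # marginal contribution
            prev_value = cur_value
    return [c / n_samples for c in credits]


def toy_value(coalition: FrozenSet[int]) -> float:
    """Hypothetical game: agents 0 and 1 earn 1.0 only together; agent 2 adds 0.5 alone."""
    value = 0.0
    if 0 in coalition and 1 in coalition:
        value += 1.0
    if 2 in coalition:
        value += 0.5
    return value


if __name__ == "__main__":
    print(monte_carlo_shapley(3, toy_value, n_samples=2000))
```

In this toy game the exact Shapley values are 0.5 for every agent, so the estimates for agents 0 and 1 should approach 0.5 as `n_samples` grows, while agent 2 is credited exactly 0.5 in every ordering. Sampling orderings avoids enumerating all coalitions, whose number grows exponentially with the number of agents.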

Findings

01. Outperforms existing MARL algorithms on StarCraft II benchmarks.

02. Achieves state-of-the-art results, especially in complex scenarios.

03. Effectively estimates individual agent contributions through coalition-based credit assignment.

Abstract

Centralized Training with Decentralized Execution (CTDE) has been a popular paradigm in cooperative Multi-Agent Reinforcement Learning (MARL) settings and is widely used in many real applications. One of the major challenges in the training process is credit assignment, which aims to deduce the contributions of each agent according to the global rewards. Existing credit assignment methods focus on either decomposing the joint value function into individual value functions or measuring the impact of local observations and actions on the global value function. These approaches lack a thorough consideration of the complicated interactions among multiple agents, leading to an unsuitable assignment of credit and subsequently mediocre results on MARL. We propose Shapley Counterfactual Credit Assignment, a novel method for explicit credit assignment which accounts for the coalition of agents.…
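For reference, the textbook Shapley value that coalition-based credit assignment builds on can be written as follows. The notation is an assumption introduced here, not taken from the paper: $N$ is the set of $n$ agents, $v(S)$ is the value of coalition $S$, and $\phi_i(v)$ is the credit of agent $i$.

```latex
\phi_i(v) = \sum_{S \subseteq N \setminus \{i\}}
  \frac{|S|!\,(n - |S| - 1)!}{n!}
  \bigl[\, v(S \cup \{i\}) - v(S) \,\bigr]
```

Each term weights the marginal contribution of agent $i$ to a coalition $S$ by the fraction of agent orderings in which exactly the members of $S$ precede $i$. The credits sum to $v(N) - v(\emptyset)$, which is the property that makes the value usable for splitting a global reward.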
