Value-Decomposition Multi-Agent Actor-Critics

Jianyu Su; Stephen Adams; Peter A. Beling

arXiv:2007.12306 · cs.AI · December 21, 2020 · 5 cites

TL;DR

This paper introduces VDACs, a novel value-decomposition actor-critic framework for multi-agent reinforcement learning that balances training efficiency and performance, demonstrated on StarCraft II benchmarks.

Contribution

The paper proposes value-decomposition actor-critics (VDACs), which extend value decomposition to actor-critic methods that are compatible with A2C, aiming for a reasonable trade-off between training efficiency and performance in multi-agent tasks.

Findings

01

VDACs outperform other actor-critic methods on StarCraft II tasks.

02

Ablation experiments identify key factors influencing VDACs' performance.

03

VDACs improve median performance over other actor-critic methods on the StarCraft II micromanagement testbed.

Abstract

The exploitation of extra state information has been an active research area in multi-agent reinforcement learning (MARL). QMIX represents the joint action-value using a non-negative function approximator and achieves the best performance, by far, on the StarCraft II micromanagement tasks, a multi-agent benchmark. However, our experiments show that, in some cases, QMIX is incompatible with A2C, a training paradigm that promotes training efficiency. To obtain a reasonable trade-off between training efficiency and algorithm performance, we extend value-decomposition to actor-critics that are compatible with A2C and propose a novel actor-critic framework, value-decomposition actor-critics (VDACs). We evaluate VDACs on the testbed of StarCraft II micromanagement tasks and demonstrate that the proposed framework improves median performance over other actor-critic methods. Furthermore,…
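
The idea of combining per-agent values through a non-negative function can be illustrated with a small sketch. This is not the authors' implementation: it assumes a simple linear mixer whose weights are made non-negative with an absolute value, and a one-step temporal-difference advantage of the kind commonly used with A2C. The names `mix_values` and `td_advantage`, and all parameter names, are hypothetical.

```python
def mix_values(local_values, weights, bias=0.0):
    """Combine per-agent values into one joint value.

    Assumption: a linear mixer. abs() makes every weight non-negative,
    so raising any single agent's value can never lower the joint value.
    """
    if len(local_values) != len(weights):
        raise ValueError("need exactly one weight per agent")
    return sum(abs(w) * v for w, v in zip(weights, local_values)) + bias


def td_advantage(reward, v_tot, v_tot_next, gamma=0.99, done=False):
    """One-step TD advantage computed from the joint value (assumed form)."""
    # No bootstrapping from the next state once the episode has ended.
    bootstrap = 0.0 if done else gamma * v_tot_next
    return reward + bootstrap - v_tot


# Two agents; the negative raw weight is mapped to a positive one.
v_tot = mix_values([1.0, 2.0], [0.5, -0.5])
advantage = td_advantage(reward=1.0, v_tot=v_tot, v_tot_next=2.0, gamma=0.5)
```

Because the joint value is non-decreasing in each local value, a single advantage computed from it can be shared by all agents' policy updates; whether this matches the paper's exact mixing function is an assumption of this sketch.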

Code & Models

Repositories

hahayonghuming/VDACs (PyTorch, official)

Videos

Value-Decomposition Multi-Agent Actor-Critics · Underline

Taxonomy

Topics

Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · Metaheuristic Optimization Algorithms Research

Methods

A2C