Boosting Value Decomposition via Unit-Wise Attentive State Representation for Cooperative Multi-Agent Reinforcement Learning

Qingpeng Zhao; Yuanyang Zhu; Zichuan Liu; Zhi Wang; Chunlin Chen

arXiv:2305.07182·cs.MA·May 15, 2023·2 cites

Open Access

TL;DR

This paper introduces UNSR, a transformer-based unit-wise state representation method that improves value decomposition and coordination in cooperative multi-agent reinforcement learning, especially under environmental uncertainties.

Contribution

The paper proposes UNSR, a novel transformer-based approach for compact state representation that enhances value decomposition and coordination in multi-agent RL.

Findings

01 Outperforms baselines on StarCraft II micromanagement tasks

02 Achieves higher data efficiency in cooperative MARL

03 Ablation studies identify key factors for UNSR's success

Abstract

In cooperative multi-agent reinforcement learning (MARL), environmental stochasticity and uncertainty increase exponentially with the number of agents, which makes it hard to derive a compact latent representation from partial observations for boosting value decomposition. To tackle these issues, we propose a simple yet powerful method that alleviates partial observability and efficiently promotes coordination by introducing the UNit-wise attentive State Representation (UNSR). In UNSR, each agent learns a compact and disentangled unit-wise state representation output by transformer blocks and produces its local action-value function. UNSR boosts value decomposition with a multi-head attention mechanism that produces efficient credit assignment in the mixing network, providing an efficient reasoning path between the…
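The pipeline the abstract describes (attention over per-unit features, a local action-value head per agent, and attention-weighted mixing) can be illustrated with a toy sketch. This is not the authors' architecture: it assumes a single attention head, no layer normalization or feed-forward sublayers, mean pooling over unit tokens, and a softmax-weighted sum as the mixer. All function names, parameter names, and dimensions below are hypothetical and chosen only for illustration.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    total = sum(es)
    return [e / total for e in es]


def matvec(vec, mat):
    """Row vector (length d_in) times matrix (d_in x d_out)."""
    return [sum(v * row[j] for v, row in zip(vec, mat))
            for j in range(len(mat[0]))]


def unit_attention(units, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over unit tokens.

    units: list of per-unit feature vectors seen by one agent.
    Returns one attended vector per unit.
    """
    qs = [matvec(u, w_q) for u in units]
    ks = [matvec(u, w_k) for u in units]
    vs = [matvec(u, w_v) for u in units]
    scale = math.sqrt(len(ks[0]))
    out = []
    for q in qs:
        attn = softmax([sum(a * b for a, b in zip(q, k)) / scale for k in ks])
        out.append([sum(a * v[j] for a, v in zip(attn, vs))
                    for j in range(len(vs[0]))])
    return out


def agent_q_values(units, params):
    """Assumed design: mean-pool attended tokens, then a linear Q head."""
    tokens = unit_attention(units, params["w_q"], params["w_k"], params["w_v"])
    n = len(tokens)
    state_repr = [sum(t[j] for t in tokens) / n for j in range(len(tokens[0]))]
    return state_repr, matvec(state_repr, params["w_out"])


def attentive_mix(chosen_qs, state_reprs, w_score):
    """Assumed mixer: softmax credit weights computed from the agents'
    representations. The weights are positive, so the joint value is
    monotone in every agent's local Q-value."""
    scores = [sum(s * w for s, w in zip(rep, w_score)) for rep in state_reprs]
    weights = softmax(scores)
    return sum(w * q for w, q in zip(weights, chosen_qs)), weights


def random_matrix(rng, rows, cols):
    """Hypothetical helper: small random weights for the demo."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


if __name__ == "__main__":
    rng = random.Random(0)
    d_in, d_model, n_actions = 4, 8, 6  # illustrative sizes only
    params = {
        "w_q": random_matrix(rng, d_in, d_model),
        "w_k": random_matrix(rng, d_in, d_model),
        "w_v": random_matrix(rng, d_in, d_model),
        "w_out": random_matrix(rng, d_model, n_actions),
    }
    w_score = [rng.uniform(-0.5, 0.5) for _ in range(d_model)]
    reprs, chosen = [], []
    for _ in range(3):  # three agents, five observed units each
        rep, q = agent_q_values(random_matrix(rng, 5, d_in), params)
        reprs.append(rep)
        chosen.append(max(q))  # greedy local action value
    q_tot, weights = attentive_mix(chosen, reprs, w_score)
    print("credit weights:", weights)
    print("Q_tot:", q_tot)
```

Using softmax weights in the mixer is one simple way to keep the joint value non-decreasing in each local value, which is the usual requirement in value decomposition; the paper's actual mixing network may enforce this differently.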

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Softmax · Linear Layer