Multi-Agent Reinforcement Learning is a Sequence Modeling Problem

Muning Wen; Jakub Grudzien Kuba; Runji Lin; Weinan Zhang; Ying Wen; Jun Wang; Yaodong Yang

arXiv:2205.14953 · cs.MA · October 31, 2022 · 79 cites

TL;DR

This paper introduces Multi-Agent Transformer (MAT), a sequence modeling approach to cooperative multi-agent reinforcement learning that outperforms baselines such as MAPPO and HAPPO on multiple benchmarks while showing high data efficiency and few-shot learning ability.

Contribution

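The paper presents MAT, a new architecture that reformulates cooperative MARL as a sequence modeling problem. The formulation gives time complexity that is linear in the number of agents and a monotonic performance improvement guarantee. MAT is trained through online interaction with the environment rather than fitted to pre-collected offline data.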

Findings

01

MAT outperforms baselines like MAPPO and HAPPO on multiple benchmarks.

02

MAT demonstrates high data efficiency and few-shot learning ability.

03

The approach provides a new perspective linking MARL with sequence modeling.

Abstract

Large sequence models (SMs), such as the GPT series and BERT, have displayed outstanding performance and generalization capabilities on vision, language, and, recently, reinforcement learning tasks. A natural follow-up question is how to abstract multi-agent decision making into an SM problem and benefit from the prosperous development of SMs. In this paper, we introduce a novel architecture named Multi-Agent Transformer (MAT) that effectively casts cooperative multi-agent reinforcement learning (MARL) into an SM problem, wherein the task is to map the agents' observation sequence to the agents' optimal action sequence. Our goal is to build a bridge between MARL and SMs so that the modeling power of modern sequence models can be unleashed for MARL. Central to MAT is an encoder-decoder architecture which leverages the multi-agent advantage decomposition theorem to transform the joint policy search…
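One way to picture the mapping from an observation sequence to an action sequence is autoregressive decoding over agents: each agent picks its action given its own observation and the actions already chosen by the agents before it. The snippet below is a minimal sketch of that idea only. It is not the authors' implementation: `toy_scorer` and `decode_actions` are hypothetical names, the scorer is a hand-written stand-in for a learned decoder, and greedy selection is an assumption made for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical type: maps (observation, actions chosen so far) to one score
# per candidate action. A learned decoder would play this role; here it is
# a hand-written stand-in.
Scorer = Callable[[int, Sequence[int]], List[float]]


def toy_scorer(obs: int, prev_actions: Sequence[int], n_actions: int = 3) -> List[float]:
    """Illustrative scores only: favour the action equal to
    (obs + sum of earlier agents' actions) mod n_actions."""
    target = (obs + sum(prev_actions)) % n_actions
    return [1.0 if a == target else 0.0 for a in range(n_actions)]


def decode_actions(observations: Sequence[int], scorer: Scorer) -> List[int]:
    """Greedy autoregressive decoding over agents.

    Agent i is scored once, conditioned on the actions of agents 0..i-1,
    so the number of scorer calls equals the number of agents.
    """
    actions: List[int] = []
    for obs in observations:
        scores = scorer(obs, actions)
        best = max(range(len(scores)), key=scores.__getitem__)
        actions.append(best)
    return actions


if __name__ == "__main__":
    print(decode_actions([0, 1, 2], toy_scorer))  # prints [0, 1, 0]
```

With three agents and three actions each, this loop makes three scorer calls, whereas enumerating every joint action would require scoring 3**3 = 27 combinations. That gap is the intuition behind per-agent sequential selection scaling linearly with the number of agents.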

Code & Models

Repositories

pku-marl/multi-agent-transformer
PyTorch · Official

Videos

Multi-Agent Reinforcement Learning is a Sequence Modeling Problem · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics · Domain Adaptation and Few-Shot Learning · Human Pose and Action Recognition

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Absolute Position Encodings · WordPiece · Position-Wise Feed-Forward Layer · Weight Decay · Byte Pair Encoding · Discriminative Fine-Tuning