An Initial Introduction to Cooperative Multi-Agent Reinforcement Learning

Christopher Amato

arXiv:2405.06161 · cs.LG · May 22, 2025 · 2 cites

TL;DR

This paper introduces cooperative multi-agent reinforcement learning, explaining core concepts, main methods, and recent advances like value function factorization and centralized critic approaches, providing a foundational overview.

Contribution

It offers an accessible introduction to cooperative MARL, covering key methods, concepts, and recent developments, serving as a foundational resource for understanding the field.

Findings

1. Overview of cooperative MARL settings and methods

2. Explanation of value function factorization techniques like QMIX and VDN

3. Discussion of centralized critic methods such as MADDPG and MAPPO

Abstract

Multi-agent reinforcement learning (MARL) has exploded in popularity in recent years. While numerous approaches have been developed, they can be broadly categorized into three main types: centralized training and execution (CTE), centralized training for decentralized execution (CTDE), and decentralized training and execution (DTE). CTE methods assume centralization during both training and execution (e.g., with fast, free, and perfect communication) and have the most information during execution. CTDE methods are the most common, as they leverage centralized information during training while enabling decentralized execution, with each agent using only the information available to it during execution. DTE methods make the fewest assumptions and are often simple to implement. This text is an introduction to cooperative MARL -- MARL in which all agents share a single,…
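As a rough illustration of the CTDE idea, the sketch below uses a VDN-style additive factorization, in which the joint value is the sum of per-agent values. Because the joint value is a sum, each agent choosing its own greedy action also maximizes the joint value, so no centralized information is needed at execution time. The numbers and the names `agent_q`, `q_tot`, `decentralized_greedy`, and `centralized_greedy` are hypothetical and chosen only for illustration; in practice the per-agent values would be learned and would depend on each agent's local history.

```python
from itertools import product

# Hypothetical per-agent action values for one fixed situation.
# In a real VDN-style method these would come from learned per-agent
# networks; here they are hard-coded numbers for illustration only.
agent_q = [
    [0.2, 1.5, -0.3],  # agent 0: values for its 3 actions
    [0.9, 0.1],        # agent 1: values for its 2 actions
]


def q_tot(joint_action):
    """VDN-style joint value: the sum of the per-agent values."""
    return sum(q[a] for q, a in zip(agent_q, joint_action))


def decentralized_greedy():
    """Each agent picks its own argmax using only its own values."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in agent_q)


def centralized_greedy():
    """Brute-force argmax over all joint actions (uses every agent's values)."""
    joint_actions = product(*(range(len(q)) for q in agent_q))
    return max(joint_actions, key=q_tot)


dec = decentralized_greedy()
cen = centralized_greedy()
print("decentralized choice:", dec)
print("centralized choice:  ", cen)
```

With these made-up values the two choices coincide. That agreement is a consequence of the additive form assumed here and would not hold for an arbitrary joint value function.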

Taxonomy

Topics: Traffic control and management

Methods: Batch Normalization · Experience Replay · Adam · Weight Decay · MADDPG · Convolution · Dense Connections · REINFORCE · Deep Q-Network