Rethinking the Implementation Tricks and Monotonicity Constraint in Cooperative Multi-Agent Reinforcement Learning

Jian Hu; Siyang Jiang; Seth Austin Harding; Haibin Wu; Shih-wei Liao

arXiv:2102.03479·cs.LG·June 9, 2023·35 cites

TL;DR

This paper critically examines the implementation details and the role of the monotonicity constraint in QMIX-based multi-agent reinforcement learning, revealing that code optimizations and the constraint itself significantly impact performance and sample efficiency.

Contribution

It provides a detailed analysis of code-level optimizations and the effects of the monotonicity constraint in QMIX, challenging common assumptions and offering theoretical insights.

Findings

01. Normalized code optimizations improve QMIX performance in SMAC.

02. Monotonicity constraint enhances sample efficiency in cooperative tasks.

03. Code-level factors significantly influence the effectiveness of MARL algorithms.

Abstract

Many complex multi-agent systems, such as robot swarm control and autonomous vehicle coordination, can be modeled as Multi-Agent Reinforcement Learning (MARL) tasks. QMIX, a widely used MARL algorithm, serves as a baseline for benchmark environments such as the StarCraft Multi-Agent Challenge (SMAC) and Difficulty-Enhanced Predator-Prey (DEPP). Recent variants of QMIX aim to relax its monotonicity constraint, which allows for performance improvements in SMAC. In this paper, we investigate the code-level optimizations of these variants and the monotonicity constraint. (1) We find that the improvements reported for these variants are significantly affected by various code-level optimizations. (2) The experimental results show that QMIX with normalized optimizations outperforms other works in SMAC; (3) beyond the common wisdom from these works, the monotonicity constraint can improve sample…
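For readers unfamiliar with the monotonicity constraint mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It mixes per-agent Q-values into a joint value using non-negative weights (via an absolute value), so the joint value never decreases when any single agent's Q-value increases. In QMIX the mixing weights are produced by hypernetworks conditioned on the global state; here fixed random weights stand in for them, and the names `monotonic_mix`, `w1`, `b1`, `w2`, and `b2` are hypothetical.

```python
import math
import random


def elu(x):
    """ELU activation; monotonically increasing in x."""
    return x if x > 0 else math.exp(x) - 1.0


def monotonic_mix(agent_qs, w1, b1, w2, b2):
    """Mix per-agent Q-values into one joint value Q_tot.

    Shapes (plain Python lists):
      agent_qs: [n_agents]
      w1:       [n_agents][hidden]
      b1:       [hidden]
      w2:       [hidden]
      b2:       float
    abs() keeps every mixing weight non-negative, which is what makes
    Q_tot non-decreasing in each agent's Q-value. Biases are unconstrained.
    """
    n_agents = len(agent_qs)
    hidden = [
        elu(sum(abs(w1[i][j]) * agent_qs[i] for i in range(n_agents)) + b1[j])
        for j in range(len(b1))
    ]
    return sum(abs(w2[j]) * hidden[j] for j in range(len(hidden))) + b2


# Hypothetical fixed weights standing in for hypernetwork outputs.
rng = random.Random(0)
n_agents, hidden_dim = 3, 4
w1 = [[rng.uniform(-1, 1) for _ in range(hidden_dim)] for _ in range(n_agents)]
b1 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
w2 = [rng.uniform(-1, 1) for _ in range(hidden_dim)]
b2 = rng.uniform(-1, 1)

qs = [0.5, -1.0, 2.0]
q_tot = monotonic_mix(qs, w1, b1, w2, b2)

# Raising one agent's Q-value should never lower Q_tot.
bumped = list(qs)
bumped[1] += 0.5
q_tot_bumped = monotonic_mix(bumped, w1, b1, w2, b2)
print(q_tot, q_tot_bumped)
```

Because the activation is increasing and all weights are non-negative, each agent can pick its own greedy action and the joint action still maximizes the mixed value. Removing the `abs()` calls drops that guarantee, which corresponds to the kind of relaxation the abstract attributes to recent QMIX variants.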

Taxonomy

Topics: Reinforcement Learning in Robotics · Autonomous Vehicle Technology and Safety · Robotic Path Planning Algorithms