Coordination Failure in Cooperative Offline MARL

Callum Rhys Tilbury; Claude Formanek; Louise Beyers; Jonathan P. Shock; Arnu Pretorius

arXiv:2407.01343·cs.LG·July 2, 2024

Open Access

TL;DR

This paper investigates coordination failure in offline multi-agent reinforcement learning (MARL). It reveals a failure mode of 'Best Response Under Data' (BRUD) algorithms and proposes a sample prioritization method, based on joint-action similarity, to improve coordination, supported by analysis of two-player polynomial games and by experiments.

Contribution

The paper identifies a simple yet previously overlooked failure mode in offline MARL and introduces a sample prioritization approach to mitigate coordination failure, grounded in game-theoretic analysis of two-player polynomial games.

Findings

01. Identified a catastrophic coordination failure mode in BRUD-based algorithms.

02. Proposed a sample prioritization method based on joint-action similarity.

03. Demonstrated the effectiveness of the approach through detailed experiments.

Abstract

Offline multi-agent reinforcement learning (MARL) leverages static datasets of experience to learn optimal multi-agent control. However, learning from static data presents several unique challenges to overcome. In this paper, we focus on coordination failure and investigate the role of joint actions in multi-agent policy gradients with offline data, focusing on a common setting we refer to as the 'Best Response Under Data' (BRUD) approach. By using two-player polynomial games as an analytical tool, we demonstrate a simple yet overlooked failure mode of BRUD-based algorithms, which can lead to catastrophic coordination failure in the offline setting. Building on these insights, we propose an approach to mitigate such failure, by prioritising samples from the dataset based on joint-action similarity during policy learning and demonstrate its effectiveness in detailed experiments. More…
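To make the idea of prioritising dataset samples by joint-action similarity more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the Euclidean distance, the exponential weighting, the `temperature` parameter, and the function names `joint_action_priorities` and `sample_batch` are all assumptions chosen for illustration. The sketch gives higher sampling probability to dataset joint actions that lie close to the joint action the current policies would take.

```python
import math
import random


def joint_action_priorities(dataset_joint_actions, policy_joint_action, temperature=1.0):
    """Return one sampling probability per dataset sample.

    Assumption (not from the paper): similarity is measured with Euclidean
    distance and converted to a weight with exp(-distance / temperature).
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    distances = [math.dist(a, policy_joint_action) for a in dataset_joint_actions]
    # Shift by the smallest distance so the largest weight is exactly 1.0,
    # which avoids underflow when every distance is large.
    d_min = min(distances)
    weights = [math.exp(-(d - d_min) / temperature) for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def sample_batch(num_samples, probabilities, batch_size, rng=None):
    """Draw dataset indices (with replacement) according to the priorities."""
    rng = rng or random.Random()
    return rng.choices(range(num_samples), weights=probabilities, k=batch_size)


# Hypothetical two-player, continuous-action example: each dataset entry is
# the joint action (action of player 1, action of player 2).
dataset = [(1.0, 1.0), (0.9, 1.1), (-1.0, -1.0)]
current_policy_joint_action = (-0.8, -0.9)

probs = joint_action_priorities(dataset, current_policy_joint_action, temperature=0.5)
batch = sample_batch(len(dataset), probs, batch_size=4, rng=random.Random(0))
```

In this toy setting the third dataset entry is closest to the current joint policy, so it receives the largest probability. A lower `temperature` concentrates sampling on the most similar joint actions, while a higher one approaches uniform sampling.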

Taxonomy

Topics: Energy Efficient Wireless Sensor Networks · Mobile Agent-Based Network Management · Petri Nets in System Modeling
