TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems

Matteo Gallici; Mario Martin; Ivan Masmitja

arXiv:2301.05334 · cs.LG · January 16, 2023

TL;DR

TransfQMix introduces a transformer-based approach that leverages latent graph structures in multi-agent reinforcement learning to improve coordination, transferability, and performance in complex environments like StarCraft II.

Contribution

The paper presents TransfQMix, a transformer-based method that exploits latent graph structures to improve coordination and transferability in multi-agent reinforcement learning.

Findings

1. Outperforms state-of-the-art Q-Learning models in Spread and StarCraft II environments.

2. Demonstrates effectiveness in zero-shot transfer and curriculum learning scenarios.

3. Solves complex coordination problems that other methods cannot.

Abstract

Coordination is one of the most difficult aspects of multi-agent reinforcement learning (MARL). One reason is that agents normally choose their actions independently of one another. To see coordination strategies emerge from the combination of independent policies, recent research has focused on the use of a centralized function (CF) that learns each agent's contribution to the team reward. However, the structure in which the environment is presented to the agents and to the CF is typically overlooked. We have observed that the features used to describe the coordination problem can be represented as vertex features of a latent graph structure. Here, we present TransfQMix, a new approach that uses transformers to leverage this latent structure and learn better coordination policies. Our transformer agents perform graph reasoning over the state of the observable entities.…
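The idea of treating entity features as vertex features of a latent graph can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes a hypothetical observation made of three entities with made-up features, and it runs one pass of single-head scaled dot-product self-attention with identity projections (a real transformer would learn query, key, and value projections). The attention weights can be read as soft edge weights of a fully connected graph over the entities.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(entities):
    """Single-head scaled dot-product self-attention.

    Assumption for illustration: queries, keys, and values are the raw
    entity features (identity projections), so nothing is learned here.
    Returns (weights, outputs), where weights[i][j] is how much entity i
    attends to entity j, and outputs[i] is the updated embedding of entity i.
    """
    dim = len(entities[0])
    scale = math.sqrt(dim)
    weights, outputs = [], []
    for query in entities:
        scores = [sum(q * k for q, k in zip(query, key)) / scale
                  for key in entities]
        row = softmax(scores)
        weights.append(row)
        outputs.append([sum(w * value[j] for w, value in zip(row, entities))
                        for j in range(dim)])
    return weights, outputs


# Hypothetical observation: one row per observable entity (graph vertex).
# Columns are made-up features: relative x, relative y, is_agent flag.
observation = [
    [0.0, 0.0, 1.0],   # the observing agent itself
    [1.0, 0.5, 1.0],   # a teammate
    [-0.5, 2.0, 0.0],  # a non-agent entity, e.g. a landmark
]

weights, embeddings = self_attention(observation)
for i, row in enumerate(weights):
    print(f"entity {i} attends with weights {[round(w, 3) for w in row]}")
```

Because the function operates on a list of entity rows rather than a fixed-length flat vector, the same code accepts any number of entities, which is one plausible reading of why an entity-wise representation helps with transfer across team sizes.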

Code & Models

Repositories

mttga/pymarl_transformers (PyTorch, official)

Taxonomy

Topics: Reinforcement Learning in Robotics

Methods: Q-Learning