TransfQMix: Transformers for Leveraging the Graph Structure of Multi-Agent Reinforcement Learning Problems
Matteo Gallici, Mario Martin, Ivan Masmitja

TL;DR
TransfQMix introduces a transformer-based approach that leverages latent graph structures in multi-agent reinforcement learning to improve coordination, transferability, and performance in complex environments like StarCraft II.
Contribution
It presents a novel transformer-based method, TransfQMix, that exploits latent graph structures for better coordination and transferability in multi-agent reinforcement learning.
Findings
Outperforms state-of-the-art Q-Learning models in Spread and StarCraft II environments.
Demonstrates effectiveness in zero-shot transfer and curriculum learning scenarios.
Solves complex coordination problems that other methods cannot.
Abstract
Coordination is one of the most difficult aspects of multi-agent reinforcement learning (MARL). One reason is that agents normally choose their actions independently of one another. To see coordination strategies emerge from the combination of independent policies, recent research has focused on the use of a centralized function (CF) that learns each agent's contribution to the team reward. However, the structure in which the environment is presented to the agents and to the CF is typically overlooked. We have observed that the features used to describe the coordination problem can be represented as vertex features of a latent graph structure. Here, we present TransfQMix, a new approach that uses transformers to leverage this latent structure and learn better coordination policies. Our transformer agents perform graph reasoning over the state of the observable entities.…
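To make the idea of "graph reasoning over the state of the observable entities" more concrete, the following is a minimal sketch of single-head scaled dot-product self-attention applied to a matrix of entity (vertex) features. It is an illustration only, not the authors' implementation: the function names, the dimensions, and the random projection matrices are all assumptions, and a trained model would learn those projections rather than sample them.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(entities, w_q, w_k, w_v):
    """Single-head self-attention: every entity attends to every other entity.

    The attention weights can be read as soft edge weights of a latent,
    fully connected graph whose vertices are the entities.
    """
    q = matmul(entities, w_q)
    k = matmul(entities, w_k)
    v = matmul(entities, w_v)
    scale = math.sqrt(len(q[0]))
    k_t = [list(col) for col in zip(*k)]  # transpose of k
    scores = matmul(q, k_t)
    weights = [softmax([s / scale for s in row]) for row in scores]
    return matmul(weights, v), weights


def random_matrix(rows, cols, rng):
    """Hypothetical stand-in for learned projection weights."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Assumed toy sizes: 4 observed entities, 3 features each, 5-dim embeddings.
rng = random.Random(0)
n_entities, feat_dim, embed_dim = 4, 3, 5
entities = random_matrix(n_entities, feat_dim, rng)
w_q = random_matrix(feat_dim, embed_dim, rng)
w_k = random_matrix(feat_dim, embed_dim, rng)
w_v = random_matrix(feat_dim, embed_dim, rng)

embeddings, weights = self_attention(entities, w_q, w_k, w_v)
```

Here `embeddings` holds one vector per entity and each row of `weights` sums to 1. In a setup like the one the abstract describes, an agent could feed such entity embeddings into a Q-value head, but how TransfQMix does this exactly is not specified in the text above.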
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Q-Learning
