Transformer World Model for Sample Efficient Multi-Agent Reinforcement Learning

Azad Deihim; Eduardo Alonso; Dimitra Apostolopoulou

arXiv:2506.18537·cs.LG·June 24, 2025

TL;DR

This paper introduces MATWM (Multi-Agent Transformer World Model), a transformer-based world model that improves sample efficiency and coordination in multi-agent reinforcement learning by predicting teammates' behavior and adapting to the non-stationarity caused by agents' evolving policies.

Contribution

The paper proposes a transformer-based world model for multi-agent RL that combines a decentralized imagination framework, a semi-centralized critic, a teammate prediction module, and a prioritized replay mechanism that trains the world model on recent experiences.

Findings

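01 Achieves state-of-the-art results across a benchmark suite that includes the StarCraft Multi-Agent Challenge, PettingZoo, and MeltingPot, outperforming both model-free and prior world-model approaches.

02 Demonstrates strong sample efficiency, reaching near-optimal performance with 50K interactions.

03 Ablation studies indicate that each component of the model contributes to its performance.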

Abstract

We present the Multi-Agent Transformer World Model (MATWM), a novel transformer-based world model designed for multi-agent reinforcement learning in both vector- and image-based environments. MATWM combines a decentralized imagination framework with a semi-centralized critic and a teammate prediction module, enabling agents to model and anticipate the behavior of others under partial observability. To address non-stationarity, we incorporate a prioritized replay mechanism that trains the world model on recent experiences, allowing it to adapt to agents' evolving policies. We evaluated MATWM on a broad suite of benchmarks, including the StarCraft Multi-Agent Challenge, PettingZoo, and MeltingPot. MATWM achieves state-of-the-art performance, outperforming both model-free and prior world model approaches, while demonstrating strong sample efficiency, achieving near-optimal performance in…
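The abstract describes a prioritized replay mechanism that trains the world model on recent experiences. The exact prioritization scheme is not given in this listing, so the following is only a minimal sketch of one common way to bias sampling toward recent data: geometric recency weighting. The function names (`recency_weights`, `sample_recent`) and the `decay` parameter are hypothetical and are not taken from the paper or its repository.

```python
import random


def recency_weights(num_items, decay=0.99):
    """Return one sampling weight per buffer slot (index 0 = oldest).

    The newest item gets weight 1.0; each step back in time multiplies
    the weight by `decay`. This geometric scheme is an assumption made
    for illustration, not the paper's stated method.
    """
    return [decay ** (num_items - 1 - i) for i in range(num_items)]


def sample_recent(buffer, batch_size, decay=0.99, rng=None):
    """Sample `batch_size` items (with replacement), favoring recent ones."""
    rng = rng or random.Random()
    weights = recency_weights(len(buffer), decay)
    return rng.choices(buffer, weights=weights, k=batch_size)


if __name__ == "__main__":
    # Toy buffer: integers stand in for stored transitions, oldest first.
    replay_buffer = list(range(100))
    batch = sample_recent(replay_buffer, batch_size=8, decay=0.9)
    print(batch)
```

With a smaller `decay`, sampled batches concentrate more heavily on the newest transitions, which is the kind of behavior one would want when the data distribution shifts as other agents' policies change. A `decay` of 1.0 recovers uniform sampling.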

Code & Models

Repositories

azaddeihim/matwm (Official)

Taxonomy

Methods: Layer Normalization · Dropout · Absolute Position Encodings · Dense Connections · Byte Pair Encoding · Softmax · Label Smoothing · Transformer