MARL-GPT: Foundation Model for Multi-Agent Reinforcement Learning

Maria Nesterova; Mikhail Kolosov; Anton Andreychuk; Egor Cherepanov; Oleg Bulichev; Alexey Kovalev; Konstantin Yakovlev; Aleksandr Panov; Alexey Skrynnik

arXiv:2604.05943·cs.AI·April 8, 2026

TL;DR

This paper introduces MARL-GPT, a transformer-based model trained via offline reinforcement learning on expert trajectories. A single model performs competitively with specialized baselines across diverse multi-agent environments, without task-specific tuning.

Contribution

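The paper presents a unified, multi-task MARL model: a single GPT architecture with one transformer-based observation encoder, trained on large expert datasets from StarCraft Multi-Agent Challenge (SMACv2), Google Research Football (GRF), and POGEMA.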

Findings

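01. MARL-GPT achieves performance competitive with specialized baselines in all tested environments (SMACv2, GRF, and POGEMA).

02. A single transformer-based observation encoder serves every environment, so the model requires no task-specific tuning.

03. Training at scale on expert trajectories (400M for SMACv2, 100M for GRF, and 1B for POGEMA) lets one model handle multiple tasks.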

Abstract

Recent advances in multi-agent reinforcement learning (MARL) have demonstrated success in numerous challenging domains and environments, but typically require specialized models for each task. In this work, we propose a coherent methodology that makes it possible for a single GPT-based model to learn and perform well across diverse MARL environments and tasks, including StarCraft Multi-Agent Challenge, Google Research Football and POGEMA. Our method, MARL-GPT, applies offline reinforcement learning to train at scale on the expert trajectories (400M for SMACv2, 100M for GRF, and 1B for POGEMA) combined with a single transformer-based observation encoder that requires no task-specific tuning. Experiments show that MARL-GPT achieves competitive performance compared to specialized baselines in all tested environments. Thus, our findings suggest that it is, indeed, possible to build a…
