Multi-Agent Craftax: Benchmarking Open-Ended Multi-Agent Reinforcement Learning at the Hyperscale

Bassel Al Omari; Michael Matthews; Alexander Rutherford; Jakob Nicolaus Foerster

arXiv:2511.04904 · cs.LG · November 10, 2025

Open Access

TL;DR

This paper introduces Craftax-MA and Craftax-Coop, two fast, open-ended multi-agent RL benchmarks designed to test long-term dependencies, cooperation, and generalization. Both expose where current algorithms struggle.

Contribution

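The paper presents two scalable benchmarks for multi-agent RL: Craftax-MA, a multi-agent extension of Craftax, and Craftax-Coop, which adds heterogeneous agents and trading. Both target the long-term dependencies and generalization that existing, mostly short-horizon benchmarks do not adequately stress.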

Findings

01. Existing algorithms struggle with long-horizon credit assignment.

02. Current methods have difficulty with exploration and cooperation.

03. The Craftax benchmarks reveal limitations of state-of-the-art MARL algorithms.

Abstract

Progress in multi-agent reinforcement learning (MARL) requires challenging benchmarks that assess the limits of current methods. However, existing benchmarks often target narrow short-horizon challenges that do not adequately stress the long-term dependencies and generalization capabilities inherent in many multi-agent systems. To address this, we first present Craftax-MA: an extension of the popular open-ended RL environment, Craftax, that supports multiple agents and evaluates a wide range of general abilities within a single environment. Written in JAX, Craftax-MA is exceptionally fast, with a training run using 250 million environment interactions completing in under an hour. To provide a more compelling challenge for MARL, we also present Craftax-Coop, an extension introducing heterogeneous agents, trading and more mechanics that require complex…
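
The speed claim in the abstract implies a minimum average throughput, which the short calculation below makes explicit. This is only a back-of-the-envelope sketch: the two figures come from the abstract, "under an hour" is treated as exactly one hour (so the result is a lower bound), and the helper name `min_steps_per_second` is hypothetical. The excerpt does not state the hardware or the number of parallel environments, so neither appears here.

```python
# Lower bound on end-to-end training throughput implied by the abstract.
ENV_INTERACTIONS = 250_000_000   # environment interactions, from the abstract
WALL_CLOCK_SECONDS = 60 * 60     # "under an hour", taken as an upper bound


def min_steps_per_second(interactions: int, seconds: float) -> float:
    """Return the minimum average throughput needed to finish within `seconds`."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return interactions / seconds


implied_sps = min_steps_per_second(ENV_INTERACTIONS, WALL_CLOCK_SECONDS)
print(f"at least {implied_sps:,.0f} environment interactions per second")
```

Because the quoted time covers a full training run, this figure includes the cost of learning updates as well as environment stepping; the simulator on its own would need to run at least this fast.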

Taxonomy

Topics: Reinforcement Learning in Robotics · Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI