Sable: a Performant, Efficient and Scalable Sequence Model for MARL

Omayma Mahjoub; Sasha Abramowitz; Ruan de Kock; Wiem Khlifi; Simon du Toit; Jemma Daniel; Louay Ben Nessir; Louise Beyers; Claude Formanek; Liam Clark; Arnu Pretorius

arXiv:2410.01706 · cs.LG · May 27, 2025

TL;DR

Sable is a new sequence modeling approach for multi-agent reinforcement learning that offers high performance, memory efficiency, and scalability, effectively handling large numbers of agents and long temporal contexts.

Contribution

This paper introduces Sable, a novel, scalable, and memory-efficient sequence model for MARL that adapts retention mechanisms for improved temporal reasoning and performance.

Findings

01 Outperforms state-of-the-art methods on 34 of the 45 tasks tested, spanning six environments

02 Maintains performance with over a thousand agents

03 Exhibits linear memory growth as the number of agents increases

Abstract

As multi-agent reinforcement learning (MARL) progresses towards solving larger and more complex problems, it becomes increasingly important that algorithms exhibit the key properties of (1) strong performance, (2) memory efficiency, and (3) scalability. In this work, we introduce Sable, a performant, memory-efficient, and scalable sequence modeling approach to MARL. Sable works by adapting the retention mechanism in Retentive Networks (Sun et al., 2023) to achieve computationally efficient processing of multi-agent observations with long context memory for temporal reasoning. Through extensive evaluations across six diverse environments, we demonstrate how Sable is able to significantly outperform existing state-of-the-art methods in a large number of diverse tasks (34 out of 45 tested). Furthermore, Sable maintains performance as we scale the number of agents, handling environments…
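To make the retention mechanism mentioned in the abstract concrete, here is a minimal sketch of plain single-head retention as described for Retentive Networks (Sun et al., 2023). It is not Sable's adapted variant: the function names, the toy inputs, and the decay value gamma = 0.9 are illustrative assumptions, and details such as positional rotations, normalization, and multi-head scaling are left out. The sketch shows that the parallel form (a decayed, causally masked attention-like product) and the recurrent form (a fixed-size state updated once per step) give the same outputs, which is why a long context can be carried in constant memory at execution time.

```python
def retention_parallel(Q, K, V, gamma):
    """Parallel form: o_n = sum over m <= n of gamma**(n - m) * (q_n . k_m) * v_m."""
    d_v = len(V[0])
    out = []
    for n in range(len(Q)):
        o = [0.0] * d_v
        for m in range(n + 1):  # causal: only current and past steps contribute
            score = sum(q * k for q, k in zip(Q[n], K[m])) * gamma ** (n - m)
            for j in range(d_v):
                o[j] += score * V[m][j]
        out.append(o)
    return out


def retention_recurrent(Q, K, V, gamma):
    """Recurrent form: S_n = gamma * S_{n-1} + k_n^T v_n, then o_n = q_n S_n."""
    d_k, d_v = len(K[0]), len(V[0])
    S = [[0.0] * d_v for _ in range(d_k)]  # fixed-size state, independent of sequence length
    out = []
    for q, k, v in zip(Q, K, V):
        for i in range(d_k):
            for j in range(d_v):
                S[i][j] = gamma * S[i][j] + k[i] * v[j]
        out.append([sum(q[i] * S[i][j] for i in range(d_k)) for j in range(d_v)])
    return out


if __name__ == "__main__":
    # Toy inputs (hypothetical): 3 steps, key dimension 2, value dimension 1.
    Q = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
    K = [[1.0, 2.0], [0.0, 1.0], [1.0, 1.0]]
    V = [[1.0], [2.0], [3.0]]
    print(retention_parallel(Q, K, V, 0.9))
    print(retention_recurrent(Q, K, V, 0.9))
```

The parallel form suits training, where whole sequences are available at once; the recurrent form suits acting, where only the state S has to be kept between steps.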


Videos

Sable: a Performant, Efficient and Scalable Sequence Model for MARL · SlidesLive

Taxonomy

Topics: Reinforcement Learning in Robotics