Scalable Offline Reinforcement Learning for Mean Field Games

Axel Brunnbauer, Julian Lemmel, Zahra Babaiee, Sophie Neubauer, Radu Grosu

arXiv:2410.17898·cs.LG·October 24, 2024

TL;DR

This paper introduces Off-MMD, a scalable offline reinforcement learning algorithm for mean-field games that estimates equilibrium policies solely from static datasets, avoiding the need for online interactions or environment models.

Contribution

The paper proposes an offline mean-field RL method that combines iterative mirror descent with importance sampling to estimate the mean-field distribution from static data, and it uses offline RL techniques to counter Q-value overestimation, with practical multi-agent applications in mind.
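To make the mirror descent ingredient concrete, here is a minimal sketch of a single update for one state with a discrete action set. It is a generic textbook form (multiply the old policy by exponentiated Q-values and renormalize), not the authors' implementation; the function name `mirror_descent_step`, the temperature `tau`, and the toy numbers are assumptions made for illustration.

```python
import math

def mirror_descent_step(policy, q_values, tau=1.0):
    """One mirror descent update: pi_new(a) is proportional to pi_old(a) * exp(Q(a) / tau).

    Assumes every entry of `policy` is strictly positive (hypothetical helper).
    """
    # Work in log space and subtract the max logit for numerical stability.
    logits = [math.log(p) + q / tau for p, q in zip(policy, q_values)]
    max_logit = max(logits)
    unnormalized = [math.exp(l - max_logit) for l in logits]
    normalizer = sum(unnormalized)
    return [u / normalizer for u in unnormalized]

# Toy example: two actions, uniform starting policy, action 0 has the higher Q-value.
policy = [0.5, 0.5]
q_values = [1.0, 0.0]
new_policy = mirror_descent_step(policy, q_values, tau=1.0)
```

The update shifts probability mass toward the higher-valued action while staying close to the previous policy; a larger `tau` makes each step more conservative.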

Findings

01

Off-MMD performs well on benchmark tasks like crowd exploration.

02

The algorithm is robust to low-quality datasets.

03

Sensitivity analysis shows stability across hyperparameters.

Abstract

Reinforcement learning algorithms for mean-field games offer a scalable framework for optimizing policies in large populations of interacting agents. Existing methods often depend on online interactions or access to system dynamics, limiting their practicality in real-world scenarios where such interactions are infeasible or difficult to model. In this paper, we present Offline Munchausen Mirror Descent (Off-MMD), a novel mean-field RL algorithm that approximates equilibrium policies in mean-field games using purely offline data. By leveraging iterative mirror descent and importance sampling techniques, Off-MMD estimates the mean-field distribution from static datasets without relying on simulation or environment dynamics. Additionally, we incorporate techniques from offline reinforcement learning to address common issues like Q-value overestimation, ensuring robust policy learning even…
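The idea of estimating a state distribution from static data with importance sampling can be sketched in a deliberately simplified one-step setting. The code below reweights logged transitions by the ratio of target-policy to behavior-policy action probabilities and builds a self-normalized histogram over next states. It is an illustration under stated assumptions, not the estimator from the paper: the function name, the tabular policies, the known behavior policy, and the toy dataset are all hypothetical.

```python
def estimate_next_state_distribution(transitions, target_policy, behavior_policy, num_states):
    """Self-normalized importance-sampling estimate of the next-state distribution.

    transitions: list of (state, action, next_state) tuples logged under behavior_policy.
    target_policy / behavior_policy: nested lists indexed as policy[state][action].
    Assumes the behavior policy gives nonzero probability to every logged action.
    """
    mass = [0.0] * num_states
    for state, action, next_state in transitions:
        # Importance weight: how much more (or less) likely the target policy
        # is to take this action than the policy that collected the data.
        weight = target_policy[state][action] / behavior_policy[state][action]
        mass[next_state] += weight
    total = sum(mass)
    return [m / total for m in mass]

# Toy dataset: from state 0, action 0 stays in state 0 and action 1 moves to state 1.
transitions = [(0, 0, 0), (0, 0, 0), (0, 1, 1), (0, 1, 1)]
behavior_policy = [[0.5, 0.5], [0.5, 0.5]]   # uniform data-collection policy
target_policy = [[0.9, 0.1], [0.5, 0.5]]     # policy whose induced distribution we want
estimate = estimate_next_state_distribution(
    transitions, target_policy, behavior_policy, num_states=2
)
```

Although the data was collected with a uniform policy, the reweighted histogram places roughly 90% of the mass on state 0, matching what the target policy would induce. A multi-step version would need weights accumulated along trajectories, which this sketch does not attempt.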

Code & Models

Repositories

axelbr/offline-mmd (JAX, official)

Taxonomy

Topics: Reinforcement Learning in Robotics · Advanced Bandit Algorithms Research