Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning

Matej Jusup; Barna Pásztor; Tadeusz Janik; Kenan Zhang; Francesco Corman; Andreas Krause; Ilija Bogunovic

arXiv:2306.17052 · cs.LG · December 29, 2023 · 1 citation

TL;DR

This paper introduces Safe-M3-UCRL, a model-based mean-field reinforcement learning algorithm that satisfies safety constraints with high probability even when the transition dynamics are unknown, demonstrated on mobility and vehicle repositioning tasks.

Contribution

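The paper presents the first model-based mean-field RL algorithm that attains safe policies under unknown transitions. It uses epistemic uncertainty in the transition model within a log-barrier approach so that constraints are satisfied pessimistically with high probability.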
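
As a rough illustration of the log-barrier idea named above, the following Python sketch shrinks a constraint margin by an uncertainty term and then adds a log-barrier on the shrunken margin to the reward. It is not taken from the paper or its repository: the constraint form h(mu) >= 0, the function names `pessimistic_margin` and `log_barrier_objective`, and the parameters `lipschitz` and `barrier_weight` are all assumptions made for this example.

```python
import math
from typing import Sequence


def pessimistic_margin(nominal_margin: float, uncertainty: float,
                       lipschitz: float = 1.0) -> float:
    """Lower bound on a constraint margin h(mu) >= 0 under model uncertainty.

    nominal_margin: margin computed with the estimated model (hypothetical).
    uncertainty: width of the epistemic confidence region (hypothetical).
    lipschitz: assumed sensitivity of the margin to model error.
    """
    return nominal_margin - lipschitz * uncertainty


def log_barrier_objective(reward: float, margins: Sequence[float],
                          barrier_weight: float) -> float:
    """Reward plus a log-barrier term on each pessimistic margin.

    Returns -inf when any margin is non-positive, so a maximizer never
    prefers a point outside the pessimistic safe set.
    """
    if any(m <= 0.0 for m in margins):
        return -math.inf
    return reward + barrier_weight * sum(math.log(m) for m in margins)
```

In this sketch, a larger `uncertainty` reduces the margin and therefore lowers the objective near the constraint boundary, which is one simple way to express pessimism about an unknown model.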

Findings

1. Successfully enforces safety constraints in synthetic and real-world scenarios.

2. Effectively manages demand and service accessibility in mobility applications.

3. Demonstrates high-probability constraint satisfaction with unknown transition models.

Abstract

Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite population of identical agents instead of considering individual pairwise interactions. In this paper, we address an important generalization where there exist global constraints on the distribution of agents (e.g., requiring capacity constraints or minimum coverage requirements to be met). We propose Safe-M$^{3}$-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions. As a key ingredient, it uses epistemic uncertainty in the transition model within a log-barrier approach to ensure pessimistic constraint satisfaction with high probability. Beyond the…
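
The mean-field view described in the abstract can be illustrated on a toy finite-state example: instead of simulating individual agents, one propagates the population distribution under the representative agent's policy and checks a global constraint on that distribution. The sketch below is an assumption-laden toy, not the paper's model; the two-state setting, the names `mean_field_step`, `congestion_transition`, and `capacity_margin`, and the congestion rule are hypothetical.

```python
from typing import Callable, List, Sequence


def mean_field_step(mu: Sequence[float],
                    policy: Sequence[Sequence[float]],
                    transition: Callable[[int, int, Sequence[float]], List[float]]
                    ) -> List[float]:
    """Propagate the population distribution by one step.

    mu[s] is the fraction of agents in state s, policy[s][a] is the
    probability of action a in state s, and transition(s, a, mu) returns
    the next-state probabilities, which may depend on mu itself.
    """
    n = len(mu)
    nxt = [0.0] * n
    for s in range(n):
        for a, p_a in enumerate(policy[s]):
            probs = transition(s, a, mu)
            for s2 in range(n):
                nxt[s2] += mu[s] * p_a * probs[s2]
    return nxt


def congestion_transition(s: int, a: int, mu: Sequence[float]) -> List[float]:
    """Toy two-state dynamics: action 0 stays, action 1 tries to move.

    A move succeeds with probability 1 - mu[target], so crowded states
    are harder to enter (a hypothetical congestion effect).
    """
    probs = [0.0, 0.0]
    if a == 0:
        probs[s] = 1.0
    else:
        target = 1 - s
        probs[target] = 1.0 - mu[target]
        probs[s] = mu[target]
    return probs


def capacity_margin(mu: Sequence[float], capacity: float) -> float:
    """Margin of a global capacity constraint max_s mu[s] <= capacity."""
    return capacity - max(mu)
```

A call such as `mean_field_step([0.8, 0.2], [[0.0, 1.0], [1.0, 0.0]], congestion_transition)` moves part of the population from state 0 to state 1, and `capacity_margin` then reports how far the resulting distribution is from violating the capacity limit.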

Code & Models

Repositories

mjusup1501/safe-m3-ucrl (PyTorch, official)

Taxonomy

Topics: Traffic control and management · Transportation and Mobility Innovations · Transportation Planning and Optimization

Methods: travel james