Safe Model-Based Multi-Agent Mean-Field Reinforcement Learning
Matej Jusup, Barna Pásztor, Tadeusz Janik, Kenan Zhang, Francesco Corman, Andreas Krause, Ilija Bogunovic

TL;DR
This paper introduces Safe-M3-UCRL, a model-based mean-field reinforcement learning algorithm that satisfies safety constraints with high probability even when the transition dynamics are unknown, demonstrated on mobility and vehicle repositioning tasks.
Contribution
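It presents the first safe, model-based mean-field RL algorithm. The algorithm uses epistemic uncertainty in the transition model within a log-barrier approach, so that constraints on the agent distribution are satisfied pessimistically even when the transitions are unknown.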
Findings
Successfully enforces safety constraints in synthetic and real-world scenarios.
Effectively manages demand and service accessibility in mobility applications.
Demonstrates high-probability constraint satisfaction with unknown transition models.
Abstract
Many applications, e.g., in shared mobility, require coordinating a large number of agents. Mean-field reinforcement learning addresses the resulting scalability challenge by optimizing the policy of a representative agent interacting with the infinite population of identical agents instead of considering individual pairwise interactions. In this paper, we address an important generalization where there exist global constraints on the distribution of agents (e.g., requiring capacity constraints or minimum coverage requirements to be met). We propose Safe-M3-UCRL, the first model-based mean-field reinforcement learning algorithm that attains safe policies even in the case of unknown transitions. As a key ingredient, it uses epistemic uncertainty in the transition model within a log-barrier approach to ensure pessimistic constraint satisfaction with high probability. Beyond the…
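The abstract's idea of combining epistemic uncertainty with a log barrier can be illustrated with a minimal sketch. This is not the paper's formulation: the capacity constraint, the function names, the scalar `uncertainty` bound, the `lipschitz` constant, and `barrier_weight` are all hypothetical choices made for illustration. The sketch tightens a constraint slack by a margin proportional to model uncertainty and then penalizes the reward with the logarithm of the tightened slack, so any distribution that might violate the constraint receives an objective of negative infinity.

```python
import math


def capacity_slack(mu, capacity):
    """Hypothetical constraint h(mu) = capacity - max_i mu[i]; h(mu) >= 0 means safe."""
    return capacity - max(mu)


def pessimistic_slack(mu, capacity, uncertainty, lipschitz=1.0):
    """Shrink the slack by a margin that grows with model uncertainty (assumed bound)."""
    return capacity_slack(mu, capacity) - lipschitz * uncertainty


def log_barrier_objective(reward, mu, capacity, uncertainty, barrier_weight=0.1):
    """Reward plus a log-barrier term on the pessimistic slack.

    Returns -inf when the pessimistic slack is not strictly positive, which
    rules out distributions that cannot be certified as safe.
    """
    slack = pessimistic_slack(mu, capacity, uncertainty)
    if slack <= 0.0:
        return -math.inf
    return reward + barrier_weight * math.log(slack)


# Illustrative distribution of agents over three regions (made-up numbers).
mu = [0.2, 0.3, 0.5]
confident = log_barrier_objective(1.0, mu, capacity=0.6, uncertainty=0.05)
uncertain = log_barrier_objective(1.0, mu, capacity=0.6, uncertainty=0.2)
```

In this toy setup the same distribution is acceptable when the model is fairly certain and rejected when the uncertainty margin exceeds the remaining capacity, which is the pessimistic behavior the abstract describes in words.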
Taxonomy
Topics: Traffic control and management · Transportation and Mobility Innovations · Transportation Planning and Optimization
