Quantitative propagation of chaos for mean field Markov decision process with common noise
Médéric Motte (LPSM), Huyên Pham (LPSM, UPCité - UFR Mathématiques)

TL;DR
This paper establishes a quantitative rate of convergence for the propagation of chaos in mean field Markov decision processes with common noise, linking finite-agent models to their mean field limits and constructing near-optimal policies.
Contribution
It provides explicit convergence rates and a method to derive near-optimal policies for finite-agent systems from the mean field control problem.
Findings
Convergence rate of order M_N^γ for value functions
Explicit construction of near-optimal policies
Sharp comparison of Bellman operators
Abstract
We investigate propagation of chaos for mean field Markov Decision Processes with common noise (CMKV-MDP), when the optimization is performed over randomized open-loop controls on an infinite horizon. We first state a rate of convergence of order M_N^γ, where M_N is the mean rate of convergence in Wasserstein distance of the empirical measure and γ is an explicit constant, for the limit of the value functions of the N-agent control problem with asymmetric open-loop controls towards the value function of the CMKV-MDP. Furthermore, we show how to explicitly construct (ε + O(M_N^γ))-optimal policies for the N-agent model from ε-optimal policies for the CMKV-MDP. Our approach relies on a sharp comparison between the Bellman operators of the N-agent problem and the CMKV-MDP, and on a fine coupling of empirical measures.
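To make the quantity M_N concrete, the sketch below estimates by Monte Carlo the mean Wasserstein-1 distance between the empirical measure of N i.i.d. draws and the law they are drawn from. It is an illustration only, not the paper's construction: the three-point state space, the ground metric |i - j|, the distribution `p`, and the function names are all assumptions chosen for the example.

```python
import numpy as np

def wasserstein1_on_grid(p, q):
    """W1 distance between two probability vectors on states 0, 1, ..., K-1
    with ground metric |i - j|, computed as the sum of absolute CDF differences."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    return float(np.abs(np.cumsum(p - q))[:-1].sum())

def mean_empirical_rate(p, n_agents, n_trials=2000, seed=0):
    """Monte Carlo estimate of M_N = E[ W1(empirical measure of N i.i.d. draws, p) ].
    The trial count and seed are arbitrary choices for this illustration."""
    p = np.asarray(p, dtype=float)
    rng = np.random.default_rng(seed)
    # Each row holds the state counts of one population of n_agents agents.
    counts = rng.multinomial(n_agents, p, size=n_trials)
    empirical = counts / n_agents
    # Row-wise W1 distance to p via cumulative sums (same formula as above).
    dists = np.abs(np.cumsum(empirical - p, axis=1))[:, :-1].sum(axis=1)
    return float(dists.mean())

# Hypothetical law on three states; the estimated rate shrinks as N grows.
p = np.array([0.2, 0.5, 0.3])
rates = {n: mean_empirical_rate(p, n) for n in (10, 100, 1000)}
```

In this toy setting the estimates in `rates` decrease with N. The abstract's result says the gap between the N-agent value function and the CMKV-MDP value function is controlled by a power γ of this kind of quantity.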
Taxonomy
Topics: Markov Chains and Monte Carlo Methods · Diffusion and Search Dynamics · Reinforcement Learning in Robotics
