Quantitative propagation of chaos for mean field Markov decision process with common noise

Médéric Motte (LPSM); Huyên Pham (LPSM; UPCité - UFR; Mathématiques)

arXiv:2207.12738·math.OC·July 27, 2022

TL;DR

This paper establishes a quantitative rate of convergence for the propagation of chaos in mean field Markov decision processes with common noise, linking finite-agent models to their mean field limits and constructing near-optimal policies.

Contribution

It provides explicit convergence rates and a method to derive near-optimal policies for finite-agent systems from the mean field control problem.

Findings

01 Convergence rate of order $M_N^{\gamma}$ for value functions

02 Explicit construction of near-optimal policies

03 Sharp comparison of Bellman operators
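
Findings 01 and 02 can be written schematically as follows. The symbols $V_N$ (value function of the $N$-agent problem), $V$ (value function of the CMKV-MDP), $V_N^{\pi}$ (value of a policy $\pi$ in the $N$-agent problem), $\pi^{N,\epsilon}$ (the $N$-agent policy built from an $\epsilon$-optimal policy for the CMKV-MDP) and the constant $C$ are notation assumed here for illustration, not taken from the paper; the arguments of the value functions and the exact form of the bounds are left unspecified.

$$
|V_N - V| \le C \, M_N^{\gamma},
\qquad
|V_N - V_N^{\pi^{N,\epsilon}}| \le \epsilon + C \, M_N^{\gamma}.
$$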

Abstract

We investigate propagation of chaos for mean field Markov decision processes with common noise (CMKV-MDP) when the optimization is performed over randomized open-loop controls on an infinite horizon. We first establish a rate of convergence of order $M_N^{\gamma}$, where $M_N$ is the mean rate of convergence in Wasserstein distance of the empirical measure and $\gamma \in (0, 1]$ is an explicit constant, for the convergence of the value functions of the $N$-agent control problem with asymmetric open-loop controls towards the value function of the CMKV-MDP. Furthermore, we show how to explicitly construct $(\epsilon + O(M_N^{\gamma}))$-optimal policies for the $N$-agent model from $\epsilon$-optimal policies for the CMKV-MDP. Our approach relies on a sharp comparison between the Bellman operators of the $N$-agent problem and of the CMKV-MDP, and on a fine coupling of empirical measures.
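
To give a feel for the quantity $M_N$ in the abstract, the sketch below estimates the mean Wasserstein-1 distance between the empirical measure of $N$ i.i.d. samples and their true law by Monte Carlo. Everything specific here is an assumption made for illustration and is not taken from the paper: the law is Uniform[0, 1], the state space is one-dimensional, the distance is $W_1$, and the function names `w1_to_uniform` and `mean_w1` are hypothetical. In this one-dimensional setting the estimate should shrink roughly like $N^{-1/2}$.

```python
import random


def w1_to_uniform(samples):
    """Exact W1 distance between the empirical measure of `samples`
    (values in [0, 1]) and the Uniform[0, 1] law.

    In one dimension W1 is the integral over u in [0, 1] of
    |F_N^{-1}(u) - u|, and the empirical quantile function F_N^{-1}
    equals the i-th order statistic on the interval [i/n, (i+1)/n].
    """
    xs = sorted(samples)
    n = len(xs)
    total = 0.0
    for i, x in enumerate(xs):
        a, b = i / n, (i + 1) / n
        # Closed form of the integral of |x - u| for u in [a, b].
        if x <= a:
            total += ((b - x) ** 2 - (a - x) ** 2) / 2
        elif x >= b:
            total += ((x - a) ** 2 - (x - b) ** 2) / 2
        else:
            total += ((x - a) ** 2 + (b - x) ** 2) / 2
    return total


def mean_w1(n, trials=200, seed=0):
    """Monte Carlo estimate of E[W1(empirical measure of n samples, Uniform[0, 1])]."""
    rng = random.Random(seed)
    acc = 0.0
    for _ in range(trials):
        acc += w1_to_uniform([rng.random() for _ in range(n)])
    return acc / trials


if __name__ == "__main__":
    # Illustrative stand-in for M_N at a few population sizes.
    for n in (10, 100, 1000):
        print(n, mean_w1(n))
```

Raising such an estimate to a power $\gamma \in (0, 1]$ gives the shape of the bound stated in the abstract; the actual constant $\gamma$ and the setting in which $M_N$ is defined are those of the paper, not of this toy example.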

Taxonomy

Topics: Markov Chains and Monte Carlo Methods · Diffusion and Search Dynamics · Reinforcement Learning in Robotics