Continuous-time mean field Markov decision models
Nicole Bäuerle, Sebastian Höfer

TL;DR
This paper studies a large population of agents modeled by continuous-time Markov decision processes, analyzing their collective behavior as the number of agents grows infinitely large, and providing a new MDP-based approach with convergence guarantees.
Contribution
It introduces an MDP framework for mean field models, offering less restrictive assumptions and explicit convergence rates for large agent systems.
Findings
Convergence rate of 1/√N for the approximation
Application to machine replacement and epidemic models
Optimal policies from the limit may not be asymptotically optimal
Abstract
We consider a finite number N of statistically equal agents, each moving on a finite set of states according to a continuous-time Markov Decision Process (MDP). Transition intensities of the agents and generated rewards depend not only on the state and action of the agent itself, but also on the states of the other agents as well as the chosen action. Interactions like this are typical for a wide range of models in e.g. biology, epidemics, finance, social science and queueing systems among others. The aim is to maximize the expected discounted reward of the system, i.e. the agents have to cooperate as a team. Computationally this is a difficult task when N is large. Thus, we consider the limit for N → ∞. In contrast to other papers we treat this problem from an MDP perspective. This has the advantage that we need less regularity assumptions in order to construct…
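To make the large-N limit concrete, here is a minimal Python sketch. It is not the paper's model or method. It assumes a hypothetical two-state system (state 1 = "infected") with no control, where an agent moves 0 → 1 at intensity beta times the current infected fraction and 1 → 0 at intensity gamma. The names `simulate_fraction`, `mean_field_fraction`, `beta` and `gamma` are illustrative assumptions. The sketch compares the empirical infected fraction of N interacting agents with the deterministic mean field ODE dm/dt = beta·m·(1 − m) − gamma·m.

```python
import random


def simulate_fraction(n_agents, beta, gamma, m0, horizon, rng):
    """Gillespie simulation of N two-state agents (hypothetical toy model).

    Returns the fraction of agents in state 1 at time `horizon`.
    """
    k = round(m0 * n_agents)  # number of agents currently in state 1
    t = 0.0
    while True:
        # Intensities depend on the empirical distribution k / N.
        up = (n_agents - k) * beta * k / n_agents  # total 0 -> 1 intensity
        down = k * gamma                           # total 1 -> 0 intensity
        total = up + down
        if total == 0.0:
            break  # absorbing configuration: no agent can move
        t += rng.expovariate(total)  # waiting time until the next jump
        if t > horizon:
            break
        if rng.random() * total < up:
            k += 1
        else:
            k -= 1
    return k / n_agents


def mean_field_fraction(beta, gamma, m0, horizon, steps=10_000):
    """Explicit Euler scheme for dm/dt = beta*m*(1-m) - gamma*m."""
    m = m0
    dt = horizon / steps
    for _ in range(steps):
        m += dt * (beta * m * (1.0 - m) - gamma * m)
    return m


if __name__ == "__main__":
    rng = random.Random(0)
    limit = mean_field_fraction(2.0, 1.0, 0.1, 5.0)
    for n in (100, 1_000, 10_000):
        frac = simulate_fraction(n, 2.0, 1.0, 0.1, 5.0, rng)
        print(n, abs(frac - limit))
```

A single run per N only gives a noisy impression of the gap. Estimating a rate such as 1/√N empirically would require averaging the error over many independent runs for each N.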
Taxonomy
Topics: Healthcare Operations and Scheduling Optimization
