Singularly perturbed linear programs and Markov decision processes
Konstantin Avrachenkov (MAESTRO), Jerzy Filar, Vladimir Gaitsgory, Andrew Stillman

TL;DR
This paper demonstrates that the linear programming formulations for discounted and long-run average Markov decision processes are manifestations of general properties of singularly perturbed linear programs, unifying their theoretical understanding.
Contribution
It establishes that the LP formulations for discounted and long-run average MDPs are connected through the framework of singularly perturbed linear programs, confirming a conjecture made by E. Altman in 2006.
Findings
Unified the LP formulations for discounted and average MDPs
Confirmed Altman's 2006 conjecture on singular perturbations
Provided theoretical insights into the structure of MDP linear programs
Abstract
Linear programming formulations for the discounted and long-run average MDPs have evolved along separate trajectories. In 2006, E. Altman conjectured that the two linear programming formulations of discounted and long-run average MDPs are, most likely, a manifestation of general properties of singularly perturbed linear programs. In this note we demonstrate that this is, indeed, the case.
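For orientation, the two formulations referred to in the abstract can be sketched in standard textbook (occupation-measure) form. The notation here is an assumption made for illustration and is not taken from the paper: finite state space S, action sets A(s), transition probabilities p(s'|s,a), rewards r(s,a), initial distribution \alpha, discount factor \beta = 1 - \varepsilon, and \delta the Kronecker delta. All sums run over s in S and a in A(s).

```latex
% Discounted LP in normalized form, with \beta = 1 - \varepsilon:
\begin{align*}
\max_{x \ge 0} \quad & \sum_{s,a} r(s,a)\, x(s,a) \\
\text{s.t.} \quad & \sum_{s,a} \bigl(\delta(s,s') - (1-\varepsilon)\, p(s' \mid s,a)\bigr)\, x(s,a)
   = \varepsilon\, \alpha(s') \qquad \forall s' \in S .
\end{align*}

% Long-run average LP in its multichain form, with auxiliary variables y:
\begin{align*}
\max_{x,\, y \ge 0} \quad & \sum_{s,a} r(s,a)\, x(s,a) \\
\text{s.t.} \quad & \sum_{s,a} \bigl(\delta(s,s') - p(s' \mid s,a)\bigr)\, x(s,a) = 0
   \qquad \forall s' \in S , \\
 & \sum_{a} x(s',a) + \sum_{s,a} \bigl(\delta(s,s') - p(s' \mid s,a)\bigr)\, y(s,a) = \alpha(s')
   \qquad \forall s' \in S .
\end{align*}
```

Summing the discounted constraints over s' gives \sum_{s,a} x(s,a) = 1 for every \varepsilon > 0, but this normalization disappears if one simply sets \varepsilon = 0. In that sense the perturbation is singular: the naive \varepsilon = 0 program is not, in general, the limit of the discounted programs, and the second block of constraints with the extra variables y is what the average-reward formulation uses to recover the lost information.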
