Inexact Policy Iteration Methods for Large-Scale Markov Decision Processes
Matilde Gargiani, Robin Sieber, Efe Balta, Dominic Liao-McPherson, John Lygeros

TL;DR
This paper introduces inexact policy iteration methods for large-scale Markov decision processes, analyzing their convergence and performance with various iterative solvers, and demonstrating their effectiveness in epidemiological health policy design.
Contribution
The paper develops a general framework for inexact policy iteration using semismooth Newton-inspired stopping conditions, analyzing convergence and applying it to large-scale MDPs.
Findings
Contraction guarantees depend on the stopping condition parameter.
Iterative solvers' contraction properties are enhanced by problem structure.
Numerical experiments show improved performance in epidemiological health policy design.
Abstract
We consider inexact policy iteration (iPI) methods for large-scale infinite-horizon discounted MDPs with finite spaces. These are variants of policy iteration in which the policy evaluation step is carried out inexactly using an iterative solver for linear systems. In the classical dynamic programming literature, a similar principle is used in optimistic policy iteration, where an a priori fixed number of value iteration steps is used to inexactly solve the policy evaluation step. Inspired by the connection between policy iteration and semismooth Newton's method, we investigate a class of iPI methods that mimic the inexact variants of semismooth Newton's method by adopting a parametric stopping condition to regulate the level of inexactness of the policy evaluation step. For this class of methods we discuss local and global convergence properties and derive a practical range of values for the…
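The scheme described in the abstract can be illustrated with a minimal sketch. Several things here are assumptions rather than the paper's formulation: the names `P`, `g`, `gamma`, and `alpha`, the cost-minimization convention, the choice of Richardson iteration as the inner solver, and the exact form of the stopping condition. The sketch stops the policy evaluation once the linear-system residual falls below `alpha` times the Bellman residual at the start of the outer iteration, which is one plausible reading of a "parametric stopping condition".

```python
import numpy as np

def inexact_policy_iteration(P, g, gamma, alpha=0.1, tol=1e-8,
                             max_outer=1000, max_inner=10000):
    """Illustrative inexact policy iteration (assumed formulation).

    P: (A, S, S) array, P[a, s, t] = probability of moving s -> t under action a.
    g: (S, A) array of stage costs (costs are minimized).
    alpha: inexactness parameter in (0, 1); smaller means more exact evaluation.
    """
    n_states = P.shape[1]
    idx = np.arange(n_states)
    v = np.zeros(n_states)
    for _ in range(max_outer):
        # Policy improvement: greedy policy with respect to the current v.
        q = g + gamma * np.einsum("aij,j->ia", P, v)  # shape (S, A)
        pi = q.argmin(axis=1)
        bellman_res = np.max(np.abs(q[idx, pi] - v))
        if bellman_res <= tol:
            break
        P_pi = P[pi, idx, :]  # row s is P[pi[s], s, :]
        g_pi = g[idx, pi]
        # Inexact policy evaluation of (I - gamma * P_pi) v = g_pi,
        # here with Richardson iteration (an assumed choice of inner solver).
        for _ in range(max_inner):
            v = g_pi + gamma * (P_pi @ v)
            lin_res = np.max(np.abs(g_pi + gamma * (P_pi @ v) - v))
            if lin_res <= alpha * bellman_res:  # assumed stopping condition
                break
    return v, pi
```

With Richardson iteration as the inner solver, each inner step is one value iteration sweep under the fixed policy, so this sketch behaves like optimistic policy iteration with an adaptively chosen number of sweeps instead of a fixed one. Another iterative linear solver could be substituted in the inner loop without changing the outer structure.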
Taxonomy
Topics: Access Control and Trust · Reinforcement Learning in Robotics · Simulation Techniques and Applications
