Fast Approximate Dynamic Programming for Infinite-Horizon Markov Decision Processes
M. A. S. Kolarijani, G. F. Max, P. Mohajerin Esfahani

TL;DR
This paper introduces a numerical scheme for infinite-horizon, discounted-cost stochastic control that runs value iteration in the conjugate domain, reducing the per-iteration time complexity from O(XU) to O(X+U) for state and input discretizations of size X and U.
Contribution
It uses the linear-time Legendre transform to implement value iteration for constrained nonlinear stochastic systems, replacing the minimization in the primal domain with an addition in the conjugate domain.
Findings
Reduces per-iteration complexity from O(XU) to O(X+U), where X and U are the sizes of the state and input discretizations
Provides convergence, time complexity, and error analysis
Applicable to nonlinear stochastic control with constraints
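For context on the O(XU) baseline, here is a minimal sketch of one standard (primal-domain) value-iteration sweep on a toy gridded problem. It is not the paper's algorithm. The toy dynamics, the disturbance set, the cost functions, and all names (`bellman_sweep`, `stage_cost`, `step`) are assumptions made for illustration. The point is only that every sweep evaluates each state-input pair once, which gives X times U evaluations.

```python
def bellman_sweep(J, states, inputs, noise, stage_cost, step, gamma):
    """One primal value-iteration sweep: minimize over all inputs for every state."""
    new_J, evaluations = [], 0
    for x in states:
        best = float("inf")
        for u in inputs:
            # Expected cost-to-go over the (assumed) finite disturbance set.
            expected = sum(p * J[step(x, u, w)] for w, p in noise)
            best = min(best, stage_cost(x, u) + gamma * expected)
            evaluations += 1  # one (state, input) pair evaluated
        new_J.append(best)
    return new_J, evaluations


# Hypothetical toy problem: a clipped random walk on 21 grid points.
X = 21
states = list(range(X))
inputs = [-2, -1, 0, 1, 2]
noise = [(-1, 0.25), (0, 0.5), (1, 0.25)]  # (disturbance, probability)
gamma = 0.9


def stage_cost(x, u):
    # Separable cost: a state term plus an input term.
    return (x - 10) ** 2 + abs(u)


def step(x, u, w):
    # Next grid index, clipped to the state constraint [0, X - 1].
    return min(max(x + u + w, 0), X - 1)


J = [0.0] * X
for _ in range(300):
    J, evaluations = bellman_sweep(J, states, inputs, noise, stage_cost, step, gamma)

# Bellman residual after convergence: one more sweep should barely change J.
J_next, _ = bellman_sweep(J, states, inputs, noise, stage_cost, step, gamma)
residual = max(abs(a - b) for a, b in zip(J, J_next))
```

In this sketch `evaluations` equals `X * len(inputs)` after each sweep; the expectation over the disturbance adds a further constant factor per pair.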
Abstract
In this study, we consider the infinite-horizon, discounted cost, optimal control of stochastic nonlinear systems with separable cost and constraints in the state and input variables. Using the linear-time Legendre transform, we propose a novel numerical scheme for implementation of the corresponding value iteration (VI) algorithm in the conjugate domain. Detailed analyses of the convergence, time complexity, and error of the proposed algorithm are provided. In particular, with a discretization of size X and U for the state and input spaces, respectively, the proposed approach reduces the time complexity of each iteration in the VI algorithm from O(XU) to O(X+U), by replacing the minimization operation in the primal domain with a simple addition in the conjugate domain.
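The phrase "replacing the minimization operation in the primal domain with a simple addition in the conjugate domain" rests on a standard convex-analysis identity: the conjugate of an infimal convolution is the sum of the conjugates, (f □ g)* = f* + g*. The sketch below checks that identity numerically on a grid. It is an illustration under stated assumptions, not the paper's scheme: the functions, grids, and helper names are hypothetical, and the conjugate here is computed by brute force rather than by the linear-time Legendre transform the abstract refers to.

```python
def discrete_conjugate(xs, fs, slopes):
    """Brute-force discrete Legendre-Fenchel transform: f*(s) = max_x (s*x - f(x))."""
    return [max(s * x - f for x, f in zip(xs, fs)) for s in slopes]


def inf_convolution(fs, gs):
    """h[k] = min over i + j = k of fs[i] + gs[j], for samples on a shared uniform grid."""
    h = [float("inf")] * (len(fs) + len(gs) - 1)
    for i, fi in enumerate(fs):
        for j, gj in enumerate(gs):
            h[i + j] = min(h[i + j], fi + gj)
    return h


grid_step = 0.1
xs = [k * grid_step for k in range(-20, 21)]    # grid on [-2, 2]
fs = [0.5 * x ** 2 for x in xs]                 # f(x) = x^2 / 2
gs = [abs(x) for x in xs]                       # g(x) = |x|

hs = inf_convolution(fs, gs)                    # minimization in the primal domain
xh = [k * grid_step for k in range(-40, 41)]    # sum of the two grids, on [-4, 4]

slopes = [k * 0.1 for k in range(-10, 11)]      # dual grid on [-1, 1]
lhs = discrete_conjugate(xh, hs, slopes)        # (f inf-conv g)*
f_conj = discrete_conjugate(xs, fs, slopes)
g_conj = discrete_conjugate(xs, gs, slopes)
rhs = [a + b for a, b in zip(f_conj, g_conj)]   # f* + g*: a plain addition

max_gap = max(abs(a - b) for a, b in zip(lhs, rhs))
```

On these grids the two sides agree up to floating-point rounding, so `max_gap` is tiny. The nested minimization in `inf_convolution` costs a product of grid sizes, while the addition on the dual grid costs only its length, which is the kind of saving the abstract describes.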
Taxonomy
Topics: Advanced Control Systems Optimization · Reinforcement Learning in Robotics · Risk and Portfolio Optimization
