An Analysis of Primal-Dual Algorithms for Discounted Markov Decision Processes

Randy Cogill

arXiv:1601.04175·math.OC·January 19, 2016

Open Access

TL;DR

This paper applies the primal-dual method to a linear programming formulation of discounted Markov decision processes. It derives an optimal solution to the dual of the restricted primal (DRP), which yields an algorithm that reaches optimality in a finite number of iterations and is closely related to policy iteration.

Contribution

The paper introduces a primal-dual algorithm for discounted MDPs built on a closed-form optimal solution to the dual of the restricted primal, ensuring convergence to the optimal value function in a finite number of iterations.

Findings

01

The derived algorithm is guaranteed to reach optimality in a finite number of iterations.

02

The primal-dual method can be interpreted as repeated policy iteration.

03

The paper connects primal-dual algorithms to the computational complexity of policy iteration.
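Since the findings lean on policy iteration, a minimal sketch of that building block may help. This is the textbook policy iteration loop for a discounted-cost MDP, not the paper's primal-dual algorithm. The two-state instance, the names `P`, `c`, `evaluate_policy`, and `policy_iteration`, and the use of iterative (rather than exact) policy evaluation are all assumptions made for illustration.

```python
# Textbook policy iteration for a discounted-cost MDP (illustrative sketch).
# P[s][a][t] = probability of moving from state s to state t under action a.
# c[s][a]    = one-step cost of taking action a in state s.

def evaluate_policy(P, c, policy, gamma, tol=1e-12):
    """Approximate the value of a fixed policy by repeated Bellman updates."""
    n = len(c)
    v = [0.0] * n
    while True:
        new_v = [
            c[s][policy[s]]
            + gamma * sum(P[s][policy[s]][t] * v[t] for t in range(n))
            for s in range(n)
        ]
        if max(abs(new_v[s] - v[s]) for s in range(n)) < tol:
            return new_v
        v = new_v

def policy_iteration(P, c, gamma):
    """Alternate evaluation and greedy improvement until the policy is stable."""
    n = len(c)
    policy = [0] * n
    while True:
        v = evaluate_policy(P, c, policy, gamma)
        improved = []
        for s in range(n):
            q = [
                c[s][a] + gamma * sum(P[s][a][t] * v[t] for t in range(n))
                for a in range(len(c[s]))
            ]
            # Keep the current action unless another is strictly cheaper,
            # so ties cannot cause cycling.
            best = policy[s]
            for a in range(len(q)):
                if q[a] < q[best] - 1e-9:
                    best = a
            improved.append(best)
        if improved == policy:
            return policy, v
        policy = improved

# Hypothetical two-state, two-action instance (not taken from the paper).
P = [
    [[1.0, 0.0], [0.0, 1.0]],  # state 0: action 0 stays, action 1 moves to state 1
    [[0.0, 1.0], [1.0, 0.0]],  # state 1: action 0 stays, action 1 moves to state 0
]
c = [
    [1.0, 2.0],  # costs in state 0
    [0.0, 3.0],  # costs in state 1
]
policy, v = policy_iteration(P, c, gamma=0.9)
print(policy, [round(x, 6) for x in v])
```

On this toy instance the loop stabilizes after one improvement step: staying in state 0 forever costs 1/(1 - 0.9) = 10, while paying 2 once to reach the zero-cost state 1 costs only 2.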

Abstract

Several well-known algorithms in the field of combinatorial optimization can be interpreted in terms of the primal-dual method for solving linear programs. For example, Dijkstra's algorithm, the Ford-Fulkerson algorithm, and the Hungarian algorithm can all be viewed as the primal-dual method applied to the linear programming formulations of their respective optimization problems. Roughly speaking, successfully applying the primal-dual method to an optimization problem that can be posed as a linear program relies on the ability to find a simple characterization of the optimal solutions to a related linear program, called the 'dual of the restricted primal' (DRP). This paper is motivated by the following question: What is the algorithm we obtain if we apply the primal-dual method to a linear programming formulation of a discounted cost Markov decision process? We will first show that…
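For orientation, the standard linear programming pair for a discounted-cost MDP can be written as below. This is the common textbook formulation, given here as an assumption about the kind of LP the abstract refers to; the paper's own notation, and which of the two programs it treats as the primal, may differ. The symbols are illustrative: $c(s,a)$ is the one-step cost, $P(s' \mid s,a)$ the transition probability, $\gamma \in (0,1)$ the discount factor, and $\alpha(s) > 0$ arbitrary positive state weights.

```latex
% Value-function LP: the optimal v is the optimal cost-to-go function.
\begin{align*}
\max_{v}\quad & \sum_{s} \alpha(s)\, v(s) \\
\text{s.t.}\quad & v(s) \le c(s,a) + \gamma \sum_{s'} P(s' \mid s,a)\, v(s')
  \qquad \forall\, s, a
\end{align*}

% Its LP dual, over discounted state-action occupation measures x(s,a).
\begin{align*}
\min_{x}\quad & \sum_{s,a} c(s,a)\, x(s,a) \\
\text{s.t.}\quad & \sum_{a} x(s',a) - \gamma \sum_{s,a} P(s' \mid s,a)\, x(s,a) = \alpha(s')
  \qquad \forall\, s' \\
& x(s,a) \ge 0 \qquad \forall\, s, a
\end{align*}
```

In this pairing, complementary slackness says that $x(s,a) > 0$ only where the corresponding constraint on $v$ holds with equality, that is, only for actions that are greedy with respect to $v$. This is the link between LP optimality and optimal policies that a primal-dual scheme can exploit.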

Taxonomy

Topics: Reinforcement Learning in Robotics · Simulation Techniques and Applications · Optimization and Search Problems