# Generalized Second Order Value Iteration in Markov Decision Processes

**Authors:** Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar

arXiv: 1905.03927 · 2021-09-21

## TL;DR

The paper introduces a second order value iteration method for discounted reward Markov Decision Processes (MDPs), obtained by applying the Newton-Raphson method to the successive relaxation value iteration scheme. The authors prove that the algorithm converges globally to the optimal solution asymptotically, show that the convergence is second order, and evaluate the approach experimentally.

## Contribution

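The paper proposes a second order value iteration algorithm, derived by applying the Newton-Raphson method to the successive relaxation value iteration scheme. The aim is to reach the optimal value function in fewer iterations than standard (first order) value iteration.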
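For orientation, the two ingredients named above can be written out as follows. This is a sketch under assumptions not stated in this summary: the symbols $r(i,a)$, $p(j \mid i,a)$, $\gamma$, the relaxation parameter $w$, the bound $w^*$, and the residual map $F$ are notation introduced here, and the non-smooth max would need a smooth surrogate $\tilde{T}_w$ before a Newton-Raphson step is well defined. The exact construction used by the authors is in the full paper.

```latex
% Successive relaxation Bellman operator (assumed notation), for 0 < w <= w^*:
(T_w V)(i) = \max_{a} \Big[ w \Big( r(i,a) + \gamma \sum_{j} p(j \mid i,a)\, V(j) \Big) + (1 - w)\, V(i) \Big],
\qquad
w^* = \min_{i,a} \frac{1}{1 - \gamma\, p(i \mid i,a)}.

% Newton-Raphson update on the residual F(V) = V - \tilde{T}_w V,
% where \tilde{T}_w is a smooth surrogate of T_w and J_F is the Jacobian of F:
V_{n+1} = V_n - J_F(V_n)^{-1}\, F(V_n).
```

Setting $w = 1$ recovers the standard Bellman optimality operator, so the relaxed scheme contains ordinary value iteration as a special case.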

## Key findings

- Proves that the algorithm converges globally to the optimal solution asymptotically
- Shows that the convergence is second order
- Demonstrates the effectiveness of the approach through experiments
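To make the second order claim concrete, the following is a minimal sketch of a Newton-Raphson iteration on a smoothed, relaxed Bellman equation. It is not taken from the paper: the log-sum-exp surrogate for the max, the smoothing parameter `N`, the random test MDP, and all function names are assumptions made for illustration. The fixed point of the smoothed equation only approximates the optimal value function, with the gap shrinking as `N` grows.

```python
import numpy as np

def smooth_relaxed_operator(V, P, R, gamma, w, N):
    """Log-sum-exp surrogate of the relaxed Bellman operator and its Jacobian (assumed form)."""
    n = V.shape[0]
    Q = R + gamma * np.einsum("aij,j->ia", P, V)    # Q[i, a] = r(i, a) + gamma * E[V(next)]
    Z = w * Q + (1.0 - w) * V[:, None]              # relaxed action values
    m = Z.max(axis=1, keepdims=True)                # shift by the row max for numerical stability
    E = np.exp(N * (Z - m))
    U = m[:, 0] + np.log(E.sum(axis=1)) / N         # (1/N) * log sum_a exp(N * Z[i, a])
    S = E / E.sum(axis=1, keepdims=True)            # softmax weights over actions
    # dU_i/dV_j = sum_a S[i, a] * (w * gamma * P[a, i, j]) + (1 - w) * [i == j]
    J_U = w * gamma * np.einsum("ia,aij->ij", S, P) + (1.0 - w) * np.eye(n)
    return U, J_U

def newton_value_iteration(P, R, gamma, w=1.0, N=50.0, tol=1e-8, max_iter=100):
    """Newton-Raphson on F(V) = V - U(V); returns the iterate and the residual history."""
    n = R.shape[0]
    V = np.zeros(n)
    residuals = []
    for _ in range(max_iter):
        U, J_U = smooth_relaxed_operator(V, P, R, gamma, w, N)
        F = V - U
        residuals.append(np.max(np.abs(F)))
        if residuals[-1] < tol:
            break
        V = V - np.linalg.solve(np.eye(n) - J_U, F)  # Newton step with J_F = I - J_U
    return V, residuals

# Hypothetical test problem: random MDP with strictly positive transition probabilities.
rng = np.random.default_rng(0)
P = rng.random((3, 5, 5)) + 0.1                      # P[a, i, j]
P /= P.sum(axis=2, keepdims=True)
R = rng.random((5, 3))                               # R[i, a]
gamma = 0.9
w = 1.0 / (1.0 - gamma * np.diagonal(P, axis1=1, axis2=2).min())  # assumed relaxation bound

V, residuals = newton_value_iteration(P, R, gamma, w=w)
print(len(residuals), residuals[-1])
```

Printing the full `residuals` list is a quick way to inspect how fast the error shrinks from one step to the next on a given problem.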

## Abstract

Value iteration is a fixed point iteration technique utilized to obtain the optimal value function and policy in a discounted reward Markov Decision Process (MDP). Here, a contraction operator is constructed and applied repeatedly to arrive at the optimal solution. Value iteration is a first order method and therefore it may take a large number of iterations to converge to the optimal solution. Successive relaxation is a popular technique that can be applied to solve a fixed point equation. It has been shown in the literature that, under a special structure of the MDP, the successive over-relaxation technique computes the optimal value function faster than standard value iteration. In this work, we propose a second order value iteration procedure that is obtained by applying the Newton-Raphson method to the successive relaxation value iteration scheme. We prove the global convergence of our algorithm to the optimal solution asymptotically and show the second order convergence. Through experiments, we demonstrate the effectiveness of our proposed approach.
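The first order baselines described in the abstract can be sketched as below. This is an illustrative example, not the authors' code: the random MDP, the function names, and the relaxation bound `w_star` are assumptions. The "special structure" is modeled here as every state having a positive self-transition probability under every action, which is what allows a relaxation parameter above 1.

```python
import numpy as np

def make_mdp(n_states=5, n_actions=3, seed=0):
    """Random MDP with strictly positive transition probabilities (hypothetical test problem)."""
    rng = np.random.default_rng(seed)
    P = rng.random((n_actions, n_states, n_states)) + 0.1  # P[a, i, j] > 0, so self-loops exist
    P /= P.sum(axis=2, keepdims=True)                      # normalize each row to a distribution
    R = rng.random((n_states, n_actions))                  # R[i, a]
    return P, R

def relaxation_bound(P, gamma):
    """Assumed bound w* = min over (i, a) of 1 / (1 - gamma * p(i | i, a))."""
    self_loops = np.diagonal(P, axis1=1, axis2=2)          # shape (n_actions, n_states)
    return 1.0 / (1.0 - gamma * self_loops.min())

def relaxed_value_iteration(P, R, gamma, w=1.0, tol=1e-10, max_iter=100_000):
    """Fixed point iteration on the relaxed Bellman operator; w = 1 is standard value iteration."""
    V = np.zeros(R.shape[0])
    for k in range(1, max_iter + 1):
        Q = R + gamma * np.einsum("aij,j->ia", P, V)       # one-step lookahead values Q[i, a]
        V_new = np.max(w * Q + (1.0 - w) * V[:, None], axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, k
        V = V_new
    return V, max_iter

P, R = make_mdp()
gamma = 0.9
w_star = relaxation_bound(P, gamma)
V_vi, iters_vi = relaxed_value_iteration(P, R, gamma, w=1.0)
V_sor, iters_sor = relaxed_value_iteration(P, R, gamma, w=w_star)
print(iters_vi, iters_sor, np.max(np.abs(V_vi - V_sor)))
```

Both runs target the same fixed point, so the printed difference should be close to zero. The iteration counts give a rough sense of how much the relaxation helps on this particular random instance.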

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.03927/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1905.03927/full.md

## References

19 references — full list in the complete paper: https://tomesphere.com/paper/1905.03927/full.md

---
Source: https://tomesphere.com/paper/1905.03927