Rank-One Modified Value Iteration

Arman Sharifi Kolarijani; Tolga Ok; Peyman Mohajerin Esfahani; Mohamad Amin Sharif Kolarijani

arXiv:2505.01828 · math.OC · October 23, 2025

Open Access

TL;DR

This paper introduces a rank-one modified value iteration algorithm for Markov decision processes. It matches the convergence rate and computational complexity of value iteration in planning and of Q-learning in learning, and it outperforms first-order algorithms in numerical simulations.

Contribution

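It proposes a policy iteration-type algorithm whose policy evaluation step uses a rank-one approximation of the transition probability matrix. The approximation is built from the matrix's stationary distribution, which is estimated with the power method. The algorithm comes with proven convergence guarantees and shows improved empirical performance.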
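The two ingredients named above can be illustrated on a toy Markov chain. The following is a minimal sketch, not code from the paper: the function name `stationary_distribution`, the 2-state matrix `P`, and the iteration limits are assumptions chosen for illustration. It runs the power method `d <- P^T d` to estimate the stationary distribution `d`, then forms the rank-one matrix `1 d^T`, which is the limit of `P^k` for an ergodic chain.

```python
import numpy as np

def stationary_distribution(P, num_iters=1000, tol=1e-12):
    """Power method for d with d^T P = d^T, where P is row-stochastic.

    Hypothetical helper for illustration; assumes the chain is ergodic
    so that the iteration converges to a unique distribution.
    """
    n = P.shape[0]
    d = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(num_iters):
        d_next = P.T @ d             # one power-method step
        d_next /= d_next.sum()       # guard against round-off drift
        if np.abs(d_next - d).sum() < tol:
            return d_next
        d = d_next
    return d

# Toy 2-state chain (illustrative numbers, not from the paper).
P = np.array([[0.9, 0.1],
              [0.5, 0.5]])

d = stationary_distribution(P)

# Rank-one approximation of P: every row equals the stationary distribution.
rank_one = np.outer(np.ones(P.shape[0]), d)

print(d)  # approximately [0.8333, 0.1667]
```

For this chain the exact stationary distribution is (5/6, 1/6), and high powers of `P` approach `rank_one`, which is why the stationary distribution is a natural choice for a rank-one surrogate of the transition matrix.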

Findings

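01 Converges to the optimal (action-)value function at the same rate as value iteration in planning and Q-learning in learning.

02 Consistently outperforms first-order algorithms in numerical simulations.

03 Has the same computational complexity as value iteration in planning and Q-learning in learning.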

Abstract

In this paper, we provide a novel algorithm for solving planning and learning problems of Markov decision processes. The proposed algorithm follows a policy iteration-type update by using a rank-one approximation of the transition probability matrix in the policy evaluation step. This rank-one approximation is closely related to the stationary distribution of the corresponding transition probability matrix, which is approximated using the power method. We provide theoretical guarantees for the convergence of the proposed algorithm to the optimal (action-)value function with the same rate and computational complexity as the value iteration algorithm in the planning problem and as the Q-learning algorithm in the learning problem. Through our extensive numerical simulations, however, we show that the proposed algorithm consistently outperforms first-order algorithms and their accelerated…
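One way to read the planning update described in the abstract is sketched below. This is an interpretation under stated assumptions, not the authors' reference implementation. Policy evaluation can be written as v+ = v + (I - γ P_π)^{-1} (T v - v), where T is the Bellman operator and P_π is the transition matrix of the greedy policy. Replacing P_π by the rank-one matrix 1 d^T, with d a probability vector, and using (I - γ 1 d^T)^{-1} = I + (γ / (1 - γ)) 1 d^T gives v+ = T v + (γ / (1 - γ)) d^T (T v - v) 1. The cost-minimization convention, the array layout, the single power-method step per iteration, and all function and variable names are assumptions made for this sketch.

```python
import numpy as np

def bellman(P, c, gamma, v):
    """Return (T v, greedy policy) for costs c[s, a] and transitions P[a, s, s']."""
    q = c + gamma * np.einsum("asn,n->sa", P, v)   # Q-values, shape (S, A)
    return q.min(axis=1), q.argmin(axis=1)

def value_iteration(P, c, gamma, num_iters=500):
    """Standard value iteration, used here only as a baseline."""
    v = np.zeros(c.shape[0])
    for _ in range(num_iters):
        v, _ = bellman(P, c, gamma, v)
    return v

def rank_one_value_iteration(P, c, gamma, num_iters=500):
    """Sketch of a rank-one modified value iteration (assumed form, see lead-in)."""
    S = c.shape[0]
    v = np.zeros(S)
    d = np.full(S, 1.0 / S)                        # stationary-distribution estimate
    for _ in range(num_iters):
        Tv, pi = bellman(P, c, gamma, v)
        P_pi = P[pi, np.arange(S), :]              # row s is P[pi[s], s, :]
        d = P_pi.T @ d                             # one power-method step
        d /= d.sum()                               # guard against round-off drift
        # Rank-one correction: a constant shift added to the value iteration step.
        v = Tv + gamma / (1.0 - gamma) * (d @ (Tv - v))
    return v, pi

# Small random MDP (illustrative data, not from the paper).
rng = np.random.default_rng(0)
S, A, gamma = 6, 3, 0.9
P = rng.random((A, S, S))
P /= P.sum(axis=2, keepdims=True)                  # make each row a distribution
c = rng.random((S, A))

v_r1, pi_r1 = rank_one_value_iteration(P, c, gamma)
v_vi = value_iteration(P, c, gamma)
print(np.max(np.abs(v_r1 - v_vi)))                 # gap between the two solutions
```

In this form the correction only adds a multiple of the all-ones vector, so each iteration costs one Bellman backup plus one matrix-vector product, which is in line with the abstract's claim of matching the per-iteration complexity of value iteration.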

Videos

Rank-One Modified Value Iteration · SlidesLive

Taxonomy

Topics: Neural Networks and Applications · Matrix Theory and Algorithms

Methods: Q-Learning