Value Iteration with Guessing for Markov Chains and Markov Decision Processes

Krishnendu Chatterjee; Mahdi JafariRaviz; Raimundo Saona; Jakub Svoboda

arXiv:2505.06769·cs.AI·May 13, 2025

TL;DR

This paper introduces a value iteration (VI) approach based on guessing values for Markov chains (MCs) and Markov decision processes (MDPs). For MCs, it needs only subexponentially many Bellman updates after an almost-linear-time preprocessing step, and it shows practical improvements over existing VI approaches on benchmarks.

Contribution

The paper proposes a new VI method based on guessing values. It includes an almost-linear-time preprocessing algorithm for MCs, an improved analysis of the convergence speed of VI for MDPs, and a practical algorithm.

Findings

01. Subexponentially many Bellman updates after almost-linear-time preprocessing for MCs.

02. Improved analysis of the convergence speed of VI for MDPs.

03. The practical algorithm outperforms existing VI approaches on benchmarks.

Abstract

Two standard models for probabilistic systems are Markov chains (MCs) and Markov decision processes (MDPs). Classic objectives for such probabilistic models for control and planning problems are reachability and stochastic shortest path. The widely studied algorithmic approach for these problems is the Value Iteration (VI) algorithm which iteratively applies local updates called Bellman updates. There are many practical approaches for VI in the literature but they all require exponentially many Bellman updates for MCs in the worst case. A preprocessing step is an algorithm that is discrete, graph-theoretical, and requires linear space. An important open question is whether, after a polynomial-time preprocessing, VI can be achieved with sub-exponentially many Bellman updates. In this work, we present a new approach for VI based on guessing values. Our theoretical contributions are…
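To make the notion of a Bellman update concrete, the following is a minimal sketch of classical value iteration for reachability in a Markov chain. It is not the guessing-based method of the paper, whose details are not given in this summary. The chain representation, the state names, the function names, and the tolerance are all assumptions chosen for illustration. The difference-based stopping rule is a common heuristic and does not by itself bound the error.

```python
from typing import Dict, List, Tuple

# Hypothetical representation (not from the paper):
# chain[s] is a list of (successor, probability) pairs for state s.
Chain = Dict[str, List[Tuple[str, float]]]


def bellman_update(chain: Chain, values: Dict[str, float], target: str) -> Dict[str, float]:
    """Apply one Bellman update to every state of a Markov chain.

    The target keeps value 1; every other state takes the
    probability-weighted average of its successors' current values.
    """
    new_values = {}
    for state, successors in chain.items():
        if state == target:
            new_values[state] = 1.0
        else:
            new_values[state] = sum(p * values[t] for t, p in successors)
    return new_values


def value_iteration(chain: Chain, target: str, tol: float = 1e-10,
                    max_iters: int = 100_000) -> Tuple[Dict[str, float], int]:
    """Iterate Bellman updates from below until successive iterates differ by < tol."""
    values = {s: 1.0 if s == target else 0.0 for s in chain}
    for iteration in range(1, max_iters + 1):
        new_values = bellman_update(chain, values, target)
        diff = max(abs(new_values[s] - values[s]) for s in chain)
        values = new_values
        if diff < tol:
            return values, iteration
    return values, max_iters


# Toy chain (illustrative only): from s0 move to s1 or to an absorbing sink,
# from s1 move to the goal or back to s0.
chain: Chain = {
    "s0": [("s1", 0.5), ("sink", 0.5)],
    "s1": [("goal", 0.5), ("s0", 0.5)],
    "goal": [("goal", 1.0)],
    "sink": [("sink", 1.0)],
}

values, iterations = value_iteration(chain, "goal")
print(values, iterations)
```

In this toy chain the exact reachability probabilities solve v(s0) = 0.5 v(s1) and v(s1) = 0.5 + 0.5 v(s0), giving v(s0) = 1/3 and v(s1) = 2/3, which the iterates approach from below.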
