Weighted Difference Approximation of Value Functions for Slow-Discounting Markov Decision Processes

Yin-Lam Chow; Junjie Qin

arXiv:1412.4908·math.OC·December 17, 2014·CDC

TL;DR

This paper introduces a weighted difference method for approximating value functions in slow-discounting MDPs. The method converges faster than classical value iteration and comes with explicit error bounds, which matters most when the discount factor approaches one.

Contribution

The paper proposes a novel weighted difference approximation for value functions in slow-discounting MDPs, with rigorous error bounds and convergence analysis under ergodicity assumptions.

Findings

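01 When the discount factor is close to one, the weighted difference approximates the value function better than the estimate returned directly by classical value iteration.

02 The convergence rate is geometric and linked to the mixing time of the system.

03 Numerical experiments confirm the theoretical convergence properties.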

Abstract

Applications of Markov Decision Processes (MDPs) often require frequent decision making, that is, taking an action every microsecond, second, or minute. The infinite-horizon discounted-reward formulation remains relevant for a large portion of these applications, because the actual time span of these problems can be months or years, during which discounting due to, e.g., interest rates is of practical concern. In this paper, we show that, for such MDPs with discount rate $α$ close to $1$, under a common ergodicity assumption, a weighted difference between two successive value function estimates obtained from classical value iteration (VI) is a better approximation than the value function obtained directly from VI. Rigorous error bounds are established, which in turn show that the approximation converges to the actual value function at a rate $(α β)^{k}$ with $β < 1$. This indicates a geometric…
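
The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the weights, so the form used here, $(V_{k+1} - α V_k)/(1 - α)$, is an assumption chosen because it is one natural weighted difference of two successive VI iterates; it should not be read as the authors' exact construction. The two-state, two-action MDP (`P`, `r`), the iteration count `k`, and all function names are hypothetical and exist only for illustration.

```python
def bellman(V, P, r, alpha):
    """Apply the Bellman optimality operator once (P[a][s][t], r[a][s])."""
    n = len(V)
    return [
        max(
            r[a][s] + alpha * sum(P[a][s][t] * V[t] for t in range(n))
            for a in range(len(P))
        )
        for s in range(n)
    ]


def value_iteration(P, r, alpha, k):
    """Run k steps of classical value iteration starting from V_0 = 0."""
    V = [0.0] * len(r[0])
    for _ in range(k):
        V = bellman(V, P, r, alpha)
    return V


def weighted_difference(V_k, V_next, alpha):
    """ASSUMED weighting: (V_{k+1} - alpha * V_k) / (1 - alpha)."""
    return [(vn - alpha * vk) / (1.0 - alpha) for vk, vn in zip(V_k, V_next)]


def sup_error(U, V):
    """Sup-norm distance between two value vectors."""
    return max(abs(u - v) for u, v in zip(U, V))


# Hypothetical ergodic MDP: every transition probability is positive.
P = [
    [[0.9, 0.1], [0.2, 0.8]],  # action 0
    [[0.5, 0.5], [0.6, 0.4]],  # action 1
]
r = [
    [1.0, 0.0],  # rewards under action 0
    [0.5, 0.3],  # rewards under action 1
]
alpha = 0.99  # slow discounting: discount factor close to one
k = 50

V_k = value_iteration(P, r, alpha, k)
V_next = bellman(V_k, P, r, alpha)            # V_{k+1}
W_k = weighted_difference(V_k, V_next, alpha)

# Reference value function: VI run long enough that alpha**5000 is negligible.
V_star = value_iteration(P, r, alpha, 5000)

vi_err = sup_error(V_next, V_star)
wd_err = sup_error(W_k, V_star)
print("VI error after k+1 steps:  ", vi_err)
print("weighted difference error: ", wd_err)
```

On this toy problem the plain VI iterate is still far from the reference after `k + 1` steps, because its error shrinks only like $α^{k}$, while the weighted difference built from the same two iterates is much closer. That matches the qualitative claim in the abstract, though a single example is not evidence for the stated error bounds.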

Taxonomy

Topics: Advanced Queuing Theory Analysis · Transportation and Mobility Innovations · Stochastic processes and financial applications