Weighted Difference Approximation of Value Functions for Slow-Discounting Markov Decision Processes
Yin-Lam Chow, Junjie Qin

TL;DR
This paper introduces a weighted difference method for approximating value functions in slow-discounting MDPs. The method converges faster than classical value iteration and comes with explicit error bounds, especially when the discount factor approaches one.
Contribution
The paper proposes a novel weighted difference approximation for value functions in slow-discounting MDPs, with rigorous error bounds and convergence analysis under ergodicity assumptions.
Findings
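A weighted difference of two successive value iteration estimates approximates the value function better than the value iteration estimate itself when the discount factor is close to one.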
Convergence rate is geometric and linked to system mixing time.
Numerical experiments confirm theoretical convergence properties.
Abstract
Markov Decision Processes (MDPs) often require frequent decision making, that is, taking an action every microsecond, second, or minute. The infinite-horizon discounted-reward formulation is still relevant for a large portion of these applications, because the actual time span of these problems can be months or years, during which discounting factors due to, e.g., interest rates are of practical concern. In this paper, we show that, for such MDPs with discount rate close to 1, under a common ergodicity assumption, a weighted difference between two successive value function estimates obtained from the classical value iteration (VI) is a better approximation than the value function obtained directly from VI. Rigorous error bounds are established, which in turn show that the approximation converges to the actual value function in a rate with . This indicates a geometric…
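The idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the weights, so the combination used below, (V_{k+1} - gamma * V_k) / (1 - gamma), is an assumption: it is one natural choice that cancels the slowly decaying constant offset of value iteration on a fast-mixing chain, and it may differ from the paper's exact formula. The random MDP, the function names, and all parameters are hypothetical and chosen only for illustration.

```python
import numpy as np

def bellman_update(P, r, gamma, v):
    """One step of classical value iteration.
    P: (A, S, S) transition probabilities, r: (A, S) rewards, v: (S,) values."""
    q = r + gamma * (P @ v)      # action values, shape (A, S)
    return q.max(axis=0)         # greedy maximum over actions

def value_iteration(P, r, gamma, num_iters):
    """Return the list of iterates V_0, ..., V_num_iters, starting from zero."""
    v = np.zeros(P.shape[1])
    iterates = [v]
    for _ in range(num_iters):
        v = bellman_update(P, r, gamma, v)
        iterates.append(v)
    return iterates

def weighted_difference(v_prev, v_next, gamma):
    """ASSUMED weighting (not taken from the abstract):
    (V_{k+1} - gamma * V_k) / (1 - gamma)."""
    return (v_next - gamma * v_prev) / (1.0 - gamma)

# Hypothetical small MDP with strictly positive (hence ergodic) transitions.
rng = np.random.default_rng(0)
num_actions, num_states, gamma = 2, 5, 0.999
P = rng.random((num_actions, num_states, num_states)) + 0.1
P /= P.sum(axis=2, keepdims=True)
r = rng.random((num_actions, num_states))

# Reference value function from a very long value iteration run.
v_ref = value_iteration(P, r, gamma, 50_000)[-1]

# Compare both approximations after only 50 iterations.
iterates = value_iteration(P, r, gamma, 50)
err_vi = np.max(np.abs(iterates[-1] - v_ref))
err_wd = np.max(np.abs(weighted_difference(iterates[-2], iterates[-1], gamma) - v_ref))
print(f"plain VI error:            {err_vi:.3e}")
print(f"weighted difference error: {err_wd:.3e}")
```

On this toy instance the plain iterate is still far from the reference after 50 steps because gamma^50 is close to one, while the weighted difference depends mainly on how quickly the chain mixes. This is only an illustration on one random instance, not a reproduction of the paper's experiments.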
Taxonomy
Topics: Advanced Queuing Theory Analysis · Transportation and Mobility Innovations · Stochastic processes and financial applications
