General limit value in Dynamic Programming
Jérôme Renault (GREMAQ)

TL;DR
This paper establishes conditions under which a unique limit value exists in dynamic programming problems as decision-makers become infinitely patient, unifying various payoff models and providing a comprehensive theoretical framework.
Contribution
It introduces a general condition for the uniform convergence of value functions in dynamic programming as patience tends to infinity, identifying a unique limit value independent of evaluation sequences.
Findings
As impatience vanishes, the value functions converge uniformly if and only if the set of these value functions, viewed as a metric space, is totally bounded.
A unique limit value function $v^*$ exists, independent of the evaluation sequence.
The results apply to discounted, average, and stochastic transition models.
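To make the last two findings concrete, here is a minimal Python sketch on a hypothetical two-state deterministic problem (the states, actions, and rewards are invented for illustration and do not come from the paper). It computes the n-stage average value and the discounted value by backward recursion, assuming the discounted evaluation puts weight lam * (1 - lam)^(t - 1) on stage t, so that patience grows as lam tends to 0. In this toy problem both families of values approach the same limit.

```python
# Hypothetical toy problem (not from the paper).
# TRANSITIONS[state][action] = (reward, next_state); rewards lie in [0, 1].
TRANSITIONS = {
    0: {"stay": (0.5, 0), "move": (0.0, 1)},
    1: {"stay": (1.0, 1)},
}


def average_value(n):
    """Value of the n-stage problem where each stage has weight 1/n."""
    v = {s: 0.0 for s in TRANSITIONS}
    for m in range(1, n + 1):
        # v_m(s) = max_a [ r/m + ((m-1)/m) * v_{m-1}(next) ]
        v = {
            s: max(r / m + (m - 1) / m * v[nxt] for r, nxt in acts.values())
            for s, acts in TRANSITIONS.items()
        }
    return v


def discounted_value(lam, iterations=10_000):
    """Approximate fixed point of v(s) = max_a [ lam*r + (1-lam)*v(next) ]."""
    v = {s: 0.0 for s in TRANSITIONS}
    for _ in range(iterations):
        v = {
            s: max(lam * r + (1 - lam) * v[nxt] for r, nxt in acts.values())
            for s, acts in TRANSITIONS.items()
        }
    return v


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print("average   ", n, average_value(n))
    for lam in (0.1, 0.01, 0.001):
        # Use enough iterations for the (1 - lam) contraction to settle.
        print("discounted", lam, discounted_value(lam, iterations=int(50 / lam)))
```

From state 0 the patient decision-maker gives up one stage of reward to reach state 1, so both values tend to 1 from either state. A finite state space like this one is the easy case; the paper's condition concerns arbitrary state spaces.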
Abstract
We consider a dynamic programming problem with arbitrary state space and bounded rewards. Is it possible to define in a unique way a limit value for the problem, where the "patience" of the decision-maker tends to infinity? We consider, for each evaluation $\theta$ (a probability distribution over positive integers), the value function $v_\theta$ of the problem where the weight of any stage $t$ is given by $\theta_t$, and we investigate the uniform convergence of a sequence $(v_{\theta^k})_k$ when the "impatience" of the evaluations vanishes, in the sense that $\sum_t |\theta^k_t - \theta^k_{t+1}| \to 0$ as $k \to \infty$. We prove that this uniform convergence happens if and only if the metric space $\{v_{\theta^k}, k \geq 1\}$ is totally bounded. Moreover there exists a particular function $v^*$, independent of the particular chosen sequence $(\theta^k)_k$, such that any…
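The notion of vanishing impatience can be illustrated numerically. The sketch below assumes that the impatience of an evaluation is the total variation of its stage weights, that is, the sum over stages t of |weight(t) - weight(t+1)|. It evaluates this quantity for two standard families: the n-stage average (weight 1/n on each of the first n stages) and the discounted evaluation (weight lam * (1 - lam)^(t - 1) on stage t, truncated at a finite horizon purely for computation). The function names are illustrative, not taken from the paper.

```python
def impatience(theta):
    """Sum of |theta_t - theta_{t+1}| for a finitely supported evaluation.

    theta[i] is the weight of stage i + 1; later stages have weight 0.
    """
    padded = list(theta) + [0.0]
    return sum(abs(a - b) for a, b in zip(padded, padded[1:]))


def average_evaluation(n):
    """Uniform weights 1/n on stages 1..n."""
    return [1.0 / n] * n


def discounted_evaluation(lam, horizon):
    """Weights lam * (1 - lam)**(t - 1) for t = 1..horizon.

    Truncation is an assumption made here for computation only; the
    dropped tail mass is (1 - lam)**horizon.
    """
    return [lam * (1 - lam) ** t for t in range(horizon)]


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print("average   ", n, impatience(average_evaluation(n)))
    for lam in (0.1, 0.01, 0.001):
        print("discounted", lam, impatience(discounted_evaluation(lam, 5000)))
```

Under these assumptions the impatience equals 1/n for the n-stage average and lam for the discounted evaluation (the sum telescopes because the weights are non-increasing), so both families have vanishing impatience as n grows or lam tends to 0.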
Taxonomy
Topics: Economic theories and models · Risk and Portfolio Optimization · Supply Chain and Inventory Management
