Contractivity of Bellman Operator in Risk Averse Dynamic Programming with Infinite Horizon
Martin Šmíd, Miloš Kopa

TL;DR
This paper proves the contraction property of the Bellman operator in risk-averse infinite horizon dynamic programming, ensuring convergence of solution algorithms like value iteration in reinforcement learning contexts.
Contribution
It establishes the contraction property of the Bellman operator under risk aversion assumptions, providing theoretical guarantees for convergence in infinite horizon problems.
Findings
Bellman operator is a contraction under specified assumptions
Convergence of value iteration is guaranteed
Framework applicable to risk-averse reinforcement learning
Abstract
The paper deals with a risk-averse dynamic programming problem with infinite horizon. First, the assumptions required for the problem to be well defined are formulated. Then the Bellman equation is derived, which may also be seen as a standalone reinforcement learning problem. The Bellman operator is proved to be a contraction, which guarantees convergence of various solution algorithms used for dynamic programming as well as reinforcement learning problems; we demonstrate this on the value iteration algorithm.
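The role of the contraction property can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the 3-state, 2-action cost MDP, the discount factor, and the choice of a mean-CVaR mixture as the one-step risk mapping. Because such a mapping is monotone and translation-equivariant, the resulting operator T should satisfy ||Tv - Tw|| <= gamma * ||v - w|| in the sup norm, so value iteration converges to its unique fixed point.

```python
import random

GAMMA = 0.9  # discount factor (assumed)
ALPHA = 0.2  # CVaR tail probability (assumed)
LAM = 0.5    # weight on CVaR versus expectation (assumed)

# Hypothetical MDP: COST[s][a] is the stage cost, PROB[s][a][t] the
# probability of moving from state s to state t under action a.
COST = [[1.0, 2.0], [0.5, 1.5], [2.0, 0.2]]
PROB = [
    [[0.7, 0.2, 0.1], [0.1, 0.1, 0.8]],
    [[0.3, 0.4, 0.3], [0.0, 0.9, 0.1]],
    [[0.5, 0.5, 0.0], [0.2, 0.2, 0.6]],
]


def cvar(values, probs, alpha):
    """Average of the worst (largest-cost) alpha-fraction of outcomes."""
    remaining = alpha
    total = 0.0
    for v, p in sorted(zip(values, probs), reverse=True):
        w = min(p, remaining)
        total += w * v
        remaining -= w
        if remaining <= 0:
            break
    return total / alpha


def risk(values, probs):
    """Mean-CVaR mixture, a coherent one-step risk mapping."""
    mean = sum(p * v for v, p in zip(values, probs))
    return (1 - LAM) * mean + LAM * cvar(values, probs, ALPHA)


def bellman(v):
    """Risk-averse Bellman operator: minimize cost plus discounted risk."""
    return [
        min(COST[s][a] + GAMMA * risk(v, PROB[s][a]) for a in range(2))
        for s in range(3)
    ]


def sup_dist(v, w):
    return max(abs(x - y) for x, y in zip(v, w))


def value_iteration(tol=1e-10, max_iter=10_000):
    """Iterate v <- T v until successive iterates differ by less than tol."""
    v = [0.0] * 3
    for k in range(max_iter):
        new_v = bellman(v)
        if sup_dist(new_v, v) < tol:
            return new_v, k + 1
        v = new_v
    return v, max_iter


if __name__ == "__main__":
    rng = random.Random(0)
    v = [rng.uniform(-10, 10) for _ in range(3)]
    w = [rng.uniform(-10, 10) for _ in range(3)]
    print("ratio:", sup_dist(bellman(v), bellman(w)) / sup_dist(v, w))
    print("fixed point, iterations:", value_iteration())
```

The printed ratio should not exceed GAMMA for any pair of value vectors. Setting LAM to 0 recovers the usual risk-neutral Bellman operator, which makes it easy to compare the two fixed points.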
Taxonomy
Topics: Adaptive Dynamic Programming Control · Economic theories and models · Optimization and Variational Analysis
