Contractivity of Bellman Operator in Risk Averse Dynamic Programming with Infinite Horizon

Martin Šmíd; Miloš Kopa

arXiv:2208.01990·math.OC·August 4, 2022·Oper. Res. Lett.

Open Access

TL;DR

This paper proves the contraction property of the Bellman operator in risk-averse infinite horizon dynamic programming, ensuring convergence of solution algorithms like value iteration in reinforcement learning contexts.

Contribution

It establishes the contraction property of the Bellman operator under risk aversion assumptions, providing theoretical guarantees for convergence in infinite horizon problems.
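As a rough sketch of what the contraction property means, assuming the supremum norm and a modulus $\gamma \in (0,1)$ (this page does not state the paper's exact norm or assumptions, so the symbols $T$, $V$, $W$, $V^*$ and $\gamma$ here are illustrative):

```latex
\[
  \|TV - TW\|_\infty \;\le\; \gamma \,\|V - W\|_\infty
  \qquad \text{for all bounded } V, W .
\]
% By the Banach fixed-point theorem, T then has a unique fixed point V^*,
% and the iterates V_{k+1} = T V_k converge to it geometrically:
\[
  \|V_k - V^*\|_\infty \;\le\; \gamma^{k}\,\|V_0 - V^*\|_\infty .
\]
```

The second inequality is the standard reason a contraction result yields convergence guarantees for iterative schemes such as value iteration.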

Findings

1. Bellman operator is a contraction under specified assumptions
2. Convergence of value iteration is guaranteed
3. Framework applicable to risk-averse reinforcement learning

Abstract

The paper deals with a risk averse dynamic programming problem with infinite horizon. First, the assumptions required for the problem to be well defined are formulated. Then the Bellman equation is derived, which may also be seen as a standalone reinforcement learning problem. The Bellman operator is proved to be a contraction, which guarantees convergence of various solution algorithms used for dynamic programming as well as reinforcement learning problems; we demonstrate this on the value iteration algorithm.
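To illustrate the kind of algorithm the abstract refers to, here is a minimal sketch of risk-averse value iteration on a small finite cost-minimizing problem. It is not the authors' formulation: the choice of CVaR as the one-step risk measure, the random test problem, and all names (`cvar`, `bellman`, `value_iteration`, `cost`, `P`, `gamma`, `alpha`) are assumptions made for illustration only.

```python
import numpy as np

def cvar(values, probs, alpha):
    """CVaR of a discrete cost distribution: mean of the worst alpha-tail.

    Assumption: higher values are worse (costs), 0 < alpha <= 1.
    """
    order = np.argsort(values)[::-1]          # worst (largest) outcomes first
    remaining, total = alpha, 0.0
    for v, p in zip(values[order], probs[order]):
        w = min(p, remaining)                 # probability mass taken from this outcome
        total += w * v
        remaining -= w
        if remaining <= 1e-12:
            break
    return total / alpha

def bellman(V, cost, P, gamma, alpha):
    """One application of the (assumed) risk-averse Bellman operator.

    (TV)(s) = min_a [ cost[s, a] + gamma * CVaR_alpha(V(next state) | s, a) ]
    """
    n_states, n_actions = cost.shape
    Q = np.empty((n_states, n_actions))
    for s in range(n_states):
        for a in range(n_actions):
            Q[s, a] = cost[s, a] + gamma * cvar(V, P[s, a], alpha)
    return Q.min(axis=1)

def value_iteration(cost, P, gamma, alpha, tol=1e-10, max_iter=10_000):
    """Iterate V <- TV until successive iterates differ by less than tol."""
    V = np.zeros(cost.shape[0])
    for _ in range(max_iter):
        V_new = bellman(V, cost, P, gamma, alpha)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

# Hypothetical random test problem: 4 states, 2 actions.
rng = np.random.default_rng(0)
n_states, n_actions = 4, 2
cost = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] is a distribution
gamma, alpha = 0.9, 0.3

V_star = value_iteration(cost, P, gamma, alpha)
```

Because CVaR is monotone and translation-equivariant, this particular operator is a `gamma`-contraction in the supremum norm, so the loop terminates for any starting point; with `alpha = 1.0` the risk measure reduces to the plain expectation and the scheme becomes ordinary value iteration.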

Taxonomy

Topics: Adaptive Dynamic Programming Control · Economic theories and models · Optimization and Variational Analysis