Asymptotic Properties of Optimal Trajectories in Dynamic Programming

Sylvain Sorin (EC), Xavier Venel (SAF, C&O), Guillaume Vigeral (CEREMADE)

arXiv:1012.5149 · math.OC · December 24, 2010

TL;DR

This paper proves that, in dynamic programming, uniform convergence of the finite horizon values implies that the average accumulated payoff is asymptotically constant along optimal trajectories; possible extensions to two-person games are discussed.

Contribution

It links uniform convergence of the finite horizon values to the asymptotic constancy of the average accumulated payoff on optimal trajectories in dynamic programming, and discusses possible extensions to two-person games.

Findings

01

Uniform convergence of the finite horizon values implies that the average accumulated payoff is asymptotically constant on optimal trajectories.

02

The result ties the asymptotic behavior of the finite horizon values to the long-run behavior of payoffs along optimal trajectories.

03

Several possible extensions to two-person games are analyzed and discussed.

Abstract

We prove in a dynamic programming framework that uniform convergence of the finite horizon values implies that asymptotically the average accumulated payoff is constant on optimal trajectories. We analyze and discuss several possible extensions to two-person games.
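
As a rough illustration of the statement above, the following minimal sketch builds a small deterministic dynamic programming problem and checks that the payoff accumulated over the first fraction t of an optimal n-stage trajectory, divided by n, is close to t times the n-stage value. The three-state problem, the payoff convention (the payoff of the current state is collected before moving), and all names (`TRANSITIONS`, `PAYOFF`, `solve`, `optimal_trajectory`, `partial_average`) are assumptions made for illustration and are not taken from the paper.

```python
from typing import Dict, List

# Hypothetical toy problem (not from the paper).
# TRANSITIONS[z] lists the states reachable from z; PAYOFF[z] is the stage payoff in z.
TRANSITIONS: Dict[int, List[int]] = {0: [0, 1], 1: [2], 2: [2]}
PAYOFF: Dict[int, float] = {0: 0.5, 1: 0.0, 2: 1.0}


def solve(n: int) -> List[Dict[int, float]]:
    """Backward induction: totals[k][z] is the best total payoff over k stages from z."""
    totals: List[Dict[int, float]] = [{z: 0.0 for z in TRANSITIONS}]
    for _ in range(n):
        prev = totals[-1]
        totals.append(
            {z: PAYOFF[z] + max(prev[y] for y in TRANSITIONS[z]) for z in TRANSITIONS}
        )
    return totals


def value(start: int, n: int) -> float:
    """Finite horizon value v_n(start): best average payoff over n stages."""
    return solve(n)[n][start] / n


def optimal_trajectory(start: int, n: int) -> List[int]:
    """States visited at stages 1..n along one optimal n-stage trajectory."""
    totals = solve(n)
    path = [start]
    for remaining in range(n - 1, 0, -1):
        z = path[-1]
        # Move to a successor that maximizes the best total over the remaining stages.
        path.append(max(TRANSITIONS[z], key=lambda y: totals[remaining][y]))
    return path


def partial_average(path: List[int], t: float) -> float:
    """Payoff accumulated over the first floor(t * n) stages, divided by n."""
    n = len(path)
    m = int(t * n)
    return sum(PAYOFF[z] for z in path[:m]) / n


if __name__ == "__main__":
    n = 200
    path = optimal_trajectory(0, n)
    v_n = value(0, n)
    for t in (0.25, 0.5, 0.75, 1.0):
        print(t, partial_average(path, t), t * v_n)
```

In this toy problem the finite horizon values converge to 1 from every state, so the two printed columns should agree up to a small error that shrinks as n grows. This only illustrates the kind of statement made in the abstract on one example; it is not the paper's proof or its general setting.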

Taxonomy

Topics: Game Theory and Applications · Optimization and Variational Analysis · Reinforcement Learning in Robotics