Asymptotic Properties of Optimal Trajectories in Dynamic Programming
Sylvain Sorin (EC), Xavier Venel (SAF, C&O), Guillaume Vigeral (CEREMADE)

TL;DR
This paper proves that, in dynamic programming, uniform convergence of the finite horizon values implies that the average accumulated payoff is asymptotically constant along optimal trajectories. Possible extensions to two-person games are discussed.
Contribution
It links uniform convergence of the finite horizon values to the asymptotic behavior of the average accumulated payoff along optimal trajectories, and examines whether the result extends to two-person games.
Findings
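Uniform convergence of the finite horizon values implies that the average accumulated payoff is asymptotically constant on optimal trajectories.
The result connects the finite horizon values to how payoff accumulates along an optimal trajectory over a long horizon.
Several possible extensions to two-person games are analyzed and discussed.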
Abstract
We prove in a dynamic programming framework that uniform convergence of the finite horizon values implies that asymptotically the average accumulated payoff is constant on optimal trajectories. We analyze and discuss several possible extensions to two-person games.
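One way to write this statement down, as a sketch in notation assumed here rather than quoted from the paper: take a state space $Z$, a transition correspondence $\Phi$ with nonempty values, a payoff $g : Z \to [0,1]$, and let $v_n$ be the value of the $n$-stage problem with average payoff. The error term $\delta(\varepsilon)$ is a placeholder for a bound that vanishes with $\varepsilon$; the paper's exact constants are not reproduced.

```latex
% Assumed notation (not taken from the source page):
% a feasible play from z_0 satisfies z_m \in \Phi(z_{m-1}) for all m \ge 1.
\[
  v_n(z_0) \;=\; \sup_{(z_m)\ \text{feasible}} \; \frac{1}{n} \sum_{m=1}^{n} g(z_m).
\]
% Hedged form of the claim: if v_n \to v uniformly on Z, then for every
% \varepsilon > 0 there is N such that for all n \ge N, all z_0 \in Z,
% every \varepsilon-optimal play (z_m) of the n-stage problem from z_0,
% and every t \in [0,1],
\[
  \left| \frac{1}{n} \sum_{m=1}^{\lfloor tn \rfloor} g(z_m) \;-\; t\, v(z_0) \right|
  \;\le\; \delta(\varepsilon),
  \qquad \delta(\varepsilon) \to 0 \ \text{as } \varepsilon \to 0.
\]
```

Read this way, "constant" means that payoff accumulates at the rate $v(z_0)$ throughout the horizon: after a fraction $t$ of the stages, the accumulated average is close to $t\,v(z_0)$.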
Taxonomy
Topics: Game Theory and Applications · Optimization and Variational Analysis · Reinforcement Learning in Robotics
