Positive Dynamic Programming: A Critique
Aaqib Peerzada

TL;DR
This paper critically examines the existence of optimal stationary strategies in positive dynamic programming, highlighting conditions for weak optimality and providing counterexamples where optimal plans do not exist.
Contribution
It establishes conditions for weakly optimal stationary plans in positive dynamic programming and demonstrates limitations of extending discounted dynamic programming results.
Findings
Existence of weakly optimal stationary plans under bounded return
Counterexample showing no optimal plan exists in some cases
Limitations of previous results on discounted dynamic programming
Abstract
In the article "Positive Dynamic Programming," David Blackwell addresses the question of whether optimal stationary strategies exist for a positive dynamic programming problem. The principal results of the paper point to the existence of weakly optimal stationary plans in settings where the optimal return need not be Borel measurable. More specifically, the main theorem gives a condition for the existence of a weakly optimal stationary plan when the optimal return is bounded. Blackwell also gives an example in which no optimal plan exists, showing that the results obtained in his earlier work on discounted dynamic programming cannot, in general, be extended to the positive case.
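To make the non-existence claim concrete, here is a minimal sketch of a textbook-style positive (nonnegative-reward, undiscounted) problem in which the supremum of the return is not attained. It is an illustrative assumption and is not claimed to be the example from Blackwell's article: the states are 1, 2, 3, ...; "continue" moves from n to n + 1 with reward 0, and "stop" pays 1 - 1/n and ends the process. The function name `plan_return` and the threshold-plan encoding are hypothetical.

```python
from fractions import Fraction
from typing import Optional


def plan_return(start: int, stop_at: Optional[int]) -> Fraction:
    """Total undiscounted return from `start` under a threshold plan.

    stop_at=k    : continue (reward 0) until state k is reached, then stop
                   and collect 1 - 1/k.
    stop_at=None : never stop, so the total reward is 0.
    """
    if stop_at is None:
        return Fraction(0)
    # If the start is already past the threshold, the plan stops immediately.
    k = max(start, stop_at)
    return 1 - Fraction(1, k)


# Returns from state 1 approach 1 as the threshold grows but never reach it.
returns = [plan_return(1, k) for k in (1, 2, 10, 100, 1000)]
never_stop = plan_return(1, None)
```

In this toy problem every plan that stops earns strictly less than 1 and the plan that never stops earns 0, so the supremum 1 is approached but no plan achieves it.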
Taxonomy
Topics: Adaptive Dynamic Programming Control · Reinforcement Learning in Robotics
