Killed Markov Decision Processes on Finite Time Interval for Countable Models
Nestor Parolya, Yaroslav Yeleyko

TL;DR
This paper studies killed Markov decision processes on finite intervals for countable models, establishing existence of near-optimal policies, fundamental equations, and a dynamic programming approach for constructing simple optimal policies.
Contribution
It introduces a framework for killed MDPs on finite intervals with countable states, proving existence of uniform epsilon-optimal policies and deriving fundamental optimality equations.
Findings
Existence of uniform epsilon-optimal policies.
Correctness of the fundamental optimality equation.
Method for constructing simple optimal policies.
Abstract
We consider killed Markov decision processes for countable models on a finite time interval. Existence of a uniform ε-optimal policy is proven. We show the correctness of the fundamental equation. The optimal control problem is reduced to a similar problem for the derived model. We obtain an optimality equation and a method for the construction of simple optimal policies. The sufficiency of simple policies for countable models is proven. We show the correctness of the Markovian property. Additionally, a dynamic programming principle is considered.
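The dynamic programming idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the paper's construction: it assumes a finite truncation of the countable state space, additive running rewards, and a terminal reward, and it models killing as transition rows whose probabilities sum to less than 1 (the missing mass is the chance that the process is killed and earns nothing further). The names `backward_induction`, `p`, `q`, and `terminal` are hypothetical and chosen only for this illustration.

```python
def backward_induction(states, actions, p, q, terminal, horizon):
    """Backward induction for a finite-horizon MDP with killing.

    states   : list of states (assumed finite here, a truncation)
    actions  : dict state -> list of admissible actions
    p        : dict (state, action) -> dict next_state -> probability;
               rows may sum to less than 1, the deficit being the
               probability that the process is killed at this step
    q        : dict (state, action) -> running reward
    terminal : dict state -> terminal reward at the end of the interval
    horizon  : number of decision steps

    Returns the value at time 0 and a time-indexed policy that depends
    only on the current time and state.
    """
    v = dict(terminal)  # value at the final time
    policy = []
    for _ in range(horizon):
        new_v, decision = {}, {}
        for x in states:
            best_a, best_val = None, float("-inf")
            for a in actions[x]:
                # A killed trajectory contributes nothing after the kill,
                # so only the surviving probability mass is summed.
                val = q[(x, a)] + sum(
                    prob * v[y] for y, prob in p[(x, a)].items()
                )
                if val > best_val:
                    best_a, best_val = a, val
            new_v[x], decision[x] = best_val, best_a
        v = new_v
        policy.append(decision)
    policy.reverse()  # policy[t] is the decision rule used at time t
    return v, policy


# Toy example (assumed data): one state, two actions, two steps.
# "safe" earns 1 and survives with probability 0.5;
# "risky" earns 1.5 and is killed with certainty.
states = [0]
actions = {0: ["safe", "risky"]}
p = {(0, "safe"): {0: 0.5}, (0, "risky"): {}}
q = {(0, "safe"): 1.0, (0, "risky"): 1.5}
terminal = {0: 0.0}

value, policy = backward_induction(states, actions, p, q, terminal, horizon=2)
print(value, policy)
```

In this toy model the computed policy plays "safe" first and "risky" at the last step, since being killed costs nothing once no future reward remains.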
Taxonomy
Topics: Reinforcement Learning in Robotics · Formal Methods in Verification
