Maximizing the probability of attaining a target prior to extinction
Debasish Chatterjee, Eugenio Cinquemani, John Lygeros

TL;DR
This paper gives a dynamic programming approach to maximizing the probability that a discrete-time Markov control process reaches a target set before extinction (hitting a cemetery set), and establishes the existence and characterization of optimal policies.
Contribution
The paper shows how to compute the maximum probability of reaching a target set before hitting an absorbing cemetery set by dynamic programming, and characterizes optimal policies with martingale techniques.
Findings
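Under mild hypotheses, there exists a deterministic stationary policy that attains the maximum probability.
The maximization problem reduces to maximizing an expected total reward accumulated until the first hitting time of either the target or the cemetery set.
Martingale characterizations of thrifty, equalizing, and optimal policies are established.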
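As an illustration of the reduction to an expected total reward, the identity below is a sketch in notation that is assumed here and not taken from the paper: K is the target set, Δ is the cemetery set (taken to be disjoint from K), x_t is the state at time t, π is a policy, and τ is the first time the state enters K ∪ Δ. Before τ the state lies outside K, so the only term of the sum that can be nonzero is the one at time τ.

```latex
% Assumed notation: K = target set, \Delta = cemetery set, K \cap \Delta = \emptyset,
% \tau = \inf\{ t \ge 0 : x_t \in K \cup \Delta \}.
\[
  \mathbb{P}^{\pi}_{x}\bigl( x_\tau \in K,\ \tau < \infty \bigr)
  \;=\;
  \mathbb{E}^{\pi}_{x}\bigl[ \mathbf{1}_{K}(x_\tau)\, \mathbf{1}_{\{\tau < \infty\}} \bigr]
  \;=\;
  \mathbb{E}^{\pi}_{x}\Bigl[ \sum_{t=0}^{\tau} \mathbf{1}_{K}(x_t) \Bigr].
\]
```

Read this way, the reward is the indicator of the target set, collected until the process stops at τ, and maximizing the probability over policies is the same as maximizing this expected total reward.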
Abstract
We present a dynamic programming-based solution to the problem of maximizing the probability of attaining a target set before hitting a cemetery set for a discrete-time Markov control process. Under mild hypotheses we establish that there exists a deterministic stationary policy that achieves the maximum value of this probability. We demonstrate how the maximization of this probability can be computed through the maximization of an expected total reward until the first hitting time to either the target or the cemetery set. Martingale characterizations of thrifty, equalizing, and optimal policies in the context of our problem are also established.
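To make the dynamic programming idea concrete, here is a minimal sketch for the special case of finitely many states and actions. It is not the authors' algorithm or setting (the abstract does not restrict the state space to be finite); it is standard value iteration applied to a reach-before-absorption problem. The function name `max_reach_probability`, the array layout `P[a, s, s']`, and the three-state example are all assumptions made for illustration.

```python
import numpy as np


def max_reach_probability(P, target, cemetery, tol=1e-10, max_iter=10_000):
    """Value iteration for the maximal probability of hitting `target`
    before `cemetery` in a finite Markov control process.

    P        : array of shape (A, S, S); P[a, s, s2] is the probability of
               moving from state s to state s2 under action a (hypothetical layout).
    target   : boolean array of shape (S,), True on target states.
    cemetery : boolean array of shape (S,), True on cemetery states.
    """
    target = np.asarray(target, dtype=bool)
    cemetery = np.asarray(cemetery, dtype=bool)

    # Start from the indicator of the target set and iterate upward.
    V = target.astype(float)
    for _ in range(max_iter):
        Q = P @ V                  # shape (A, S): expected next value per action
        V_new = Q.max(axis=0)      # best action in each state
        V_new[target] = 1.0        # already at the target: success
        V_new[cemetery] = 0.0      # already extinct: failure
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break

    # Greedy deterministic stationary policy with respect to the final values.
    policy = (P @ V).argmax(axis=0)
    return V, policy


# Hypothetical example: state 0 = cemetery, state 1 = transient, state 2 = target.
P = np.array([
    # action 0: from state 1, jump to the target w.p. 0.3, else die
    [[1.0, 0.0, 0.0],
     [0.7, 0.0, 0.3],
     [0.0, 0.0, 1.0]],
    # action 1: from state 1, stay w.p. 0.7, target w.p. 0.2, die w.p. 0.1
    [[1.0, 0.0, 0.0],
     [0.1, 0.7, 0.2],
     [0.0, 0.0, 1.0]],
])
target = np.array([False, False, True])
cemetery = np.array([True, False, False])

V, policy = max_reach_probability(P, target, cemetery)
```

In this toy example, always playing action 1 in state 1 gives success probability p = 0.2 + 0.7 p, that is p = 2/3, which beats the 0.3 obtained from action 0, so the iteration should settle near 2/3 in state 1 and select action 1 there. The policy entries on the target and cemetery states are not meaningful, since the process has already stopped.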
