Reach-avoid semi-Markov decision processes with time-varying obstacles
Yanyun Li, Xianping Guo

TL;DR
This paper develops a novel approach for calculating the maximum probability of reaching a target while avoiding obstacles in semi-Markov decision processes with time-varying obstacles, using a related two-dimensional model and an improved algorithm.
Contribution
It introduces a new two-dimensional model to handle non-homogeneous semi-Markov processes with time-varying obstacles and provides an efficient algorithm for computing reach-avoid probabilities.
Findings
Proves equivalence between original and two-dimensional models.
Develops an improved value-type algorithm for reach-avoid probability.
Obtains $\epsilon$-optimal policies for semi-Markov decision processes with time-varying obstacles.
Abstract
We consider the maximal reach-avoid probability to a target in finite horizon for semi-Markov decision processes with time-varying obstacles. Because the obstacle set varies over time, the model is non-homogeneous. To overcome this difficulty, we construct a related two-dimensional model and then prove the equivalence between the reach-avoid probability of the original model and that of the related two-dimensional one. For the related two-dimensional model, we analyze some special characteristics of the equivalent reach-avoid probability. On this basis, we provide an improved value-type algorithm to obtain the equivalent maximal reach-avoid probability and its $\epsilon$-optimal policy. Then, at the last step of the algorithm, by the equivalence between these two models, we obtain the original maximal reach-avoid probability and its…
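The idea of pairing each state with the elapsed time, so that a time-varying obstacle set becomes a fixed set in the enlarged state space, can be illustrated with a small sketch. This is not the paper's algorithm: it assumes a toy setting with integer sojourn times, a finite horizon, and a hypothetical kernel format `kernel[(x, a)] = [(sojourn, next_state, prob), ...]`. The names `reach_avoid_values`, `kernel`, and `obstacles` are invented for illustration. Under those assumptions, backward induction over the pairs (state, time) gives the maximal reach-avoid probability exactly.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical kernel format (an assumption, not from the paper):
# kernel[(x, a)] = [(sojourn_time, next_state, probability), ...], sojourn_time >= 1.
Kernel = Dict[Tuple[int, int], List[Tuple[int, int, float]]]


def reach_avoid_values(
    n_states: int,
    actions: List[int],
    kernel: Kernel,
    target: Set[int],
    obstacles: Callable[[int], Set[int]],
    horizon: int,
) -> Tuple[List[List[float]], Dict[Tuple[int, int], Optional[int]]]:
    """Backward induction on the augmented state (x, t).

    V[x][t] is the maximal probability of reaching `target` by `horizon`,
    starting in state x at time t, without occupying an obstacle state.
    """
    V = [[0.0] * (horizon + 1) for _ in range(n_states)]
    policy: Dict[Tuple[int, int], Optional[int]] = {}
    for t in range(horizon, -1, -1):
        for x in range(n_states):
            if x in target:
                V[x][t] = 1.0  # already at the target
                continue
            if x in obstacles(t) or t == horizon:
                continue  # hit an obstacle or ran out of time: value stays 0
            best, best_a = 0.0, None
            for a in actions:
                total = 0.0
                for sojourn, y, p in kernel.get((x, a), []):
                    arrive = t + sojourn
                    if arrive > horizon:
                        continue  # the jump happens after the horizon
                    # x must stay obstacle-free while the process sojourns in it
                    if any(x in obstacles(u) for u in range(t + 1, arrive)):
                        continue
                    total += p * V[y][arrive]
                if total > best:
                    best, best_a = total, a
            V[x][t] = best
            policy[(x, t)] = best_a
    return V, policy


# Toy instance: state 2 is the target; state 1 is an obstacle only at time 1.
kernel: Kernel = {
    (0, 0): [(1, 1, 1.0)],               # move to state 1 after one time unit
    (0, 1): [(2, 2, 0.5), (2, 0, 0.5)],  # after two units: target or back to 0
    (1, 0): [(1, 2, 1.0)],               # move to the target after one time unit
}
V, policy = reach_avoid_values(
    n_states=3,
    actions=[0, 1],
    kernel=kernel,
    target={2},
    obstacles=lambda t: {1} if t == 1 else set(),
    horizon=4,
)
```

In this toy instance, action 0 from state 0 at time 0 would land in state 1 exactly when it is an obstacle, so the computed policy prefers action 1 there. The same action 0 becomes safe from time 1 onward, which is the kind of time dependence a homogeneous model cannot express.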
Taxonomy
Topics: Fault Detection and Control Systems · AI-based Problem Solving and Planning · Reinforcement Learning in Robotics
