On the Detection of Markov Decision Processes
Xiaoming Duan, Yagiz Savas, Rui Yan, Zhe Xu, Ufuk Topcu

TL;DR
This paper investigates the conditions under which one can asymptotically identify the true Markov decision process from a set of candidates using observed data, and develops algorithms for policy synthesis to achieve perfect detection.
Contribution
The paper gives a necessary and sufficient condition for perfect detection between two MDPs, then extends the result to multiple MDPs, with efficient algorithms for policy synthesis in both cases.
Findings
Established a necessary and sufficient condition for perfect detection between two MDPs.
Developed polynomial-time algorithms for policy synthesis to enable perfect detection.
Extended the detection framework to multiple MDPs with recursive algorithms.
Abstract
We study the detection problem for a finite set of Markov decision processes (MDPs) where the MDPs have the same state and action spaces but possibly different probabilistic transition functions. Any one of these MDPs could be the model for some underlying controlled stochastic process, but it is unknown a priori which MDP is the ground truth. We investigate whether it is possible to asymptotically detect the ground truth MDP model perfectly based on a single observed history (state-action sequence). Since the generation of histories depends on the policy adopted to control the MDPs, we discuss the existence and synthesis of policies that allow for perfect detection. We start with the case of two MDPs and establish a necessary and sufficient condition for the existence of policies that lead to perfect detection. Based on this condition, we then develop an algorithm that efficiently (in…
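The dependence of detection on the control policy, as described in the abstract, can be illustrated with a small simulation. The sketch below is not the paper's algorithm: it uses a plain maximum-likelihood comparison over one observed history, and the two candidate MDPs (`MDP_A`, `MDP_B`), their transition probabilities, and the helper names `simulate`, `log_likelihood`, and `detect` are all hypothetical choices made for illustration. The two candidates differ only in the transition out of state 0 under action 1, so a policy that never takes that action produces histories that carry no information about which model is true.

```python
import math
import random

# Two hypothetical candidate MDPs over states {0, 1} and actions {0, 1}.
# mdp[s][a] is the next-state distribution; the numbers are made up.
# The candidates differ only at (state 0, action 1).
MDP_A = {0: {0: [0.9, 0.1], 1: [0.5, 0.5]},
         1: {0: [0.5, 0.5], 1: [0.2, 0.8]}}
MDP_B = {0: {0: [0.9, 0.1], 1: [0.1, 0.9]},
         1: {0: [0.5, 0.5], 1: [0.2, 0.8]}}


def simulate(mdp, policy, start, steps, rng):
    """Generate one history [(s, a, s_next), ...] under a deterministic policy."""
    history, s = [], start
    for _ in range(steps):
        a = policy(s)
        s_next = rng.choices([0, 1], weights=mdp[s][a])[0]
        history.append((s, a, s_next))
        s = s_next
    return history


def log_likelihood(mdp, history):
    """Log-probability of the observed transitions under a candidate MDP."""
    total = 0.0
    for s, a, s_next in history:
        p = mdp[s][a][s_next]
        if p == 0.0:
            return -math.inf  # this candidate cannot have produced the history
        total += math.log(p)
    return total


def detect(candidates, history):
    """Return the name of the candidate with the highest log-likelihood."""
    return max(candidates, key=lambda name: log_likelihood(candidates[name], history))


if __name__ == "__main__":
    candidates = {"A": MDP_A, "B": MDP_B}
    rng = random.Random(0)

    # Informative policy: always take action 1, which exercises the difference.
    informative = simulate(MDP_A, lambda s: 1, 0, 2000, rng)
    print("informative policy picks:", detect(candidates, informative))

    # Uninformative policy: always take action 0, where both models agree.
    blind = simulate(MDP_A, lambda s: 0, 0, 2000, rng)
    print("log-likelihood gap under blind policy:",
          log_likelihood(MDP_A, blind) - log_likelihood(MDP_B, blind))
```

Under the always-action-0 policy the two log-likelihoods coincide for every history, so no amount of data separates the candidates; under the always-action-1 policy the gap grows with the history length. This is only a finite-sample illustration of why policy choice matters, not a statement of the paper's condition for perfect detection.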
Taxonomy
Topics: Formal Methods in Verification · Petri Nets in System Modeling · Flexible and Reconfigurable Manufacturing Systems
