Computing Policies That Account For The Effects Of Human Agent Uncertainty During Execution In Markov Decision Processes
Sriram Gopalakrishnan, Mudit Verma, Subbarao Kambhampati

TL;DR
This paper introduces a framework for computing MDP policies that consider human errors and uncertainties during execution, improving real-world decision-making where humans are involved.
Contribution
It presents a novel model of human behavior under uncertainty and algorithms to compute policies that account for human execution errors in MDPs.
Findings
The proposed algorithms effectively find policies that mitigate human errors.
Experimental results demonstrate improved performance in Gridworld and warehouse domains.
Human-subject studies support the validity of the human behavior model.
Abstract
When humans are given a policy to execute, there can be policy execution errors and deviations from the policy if there is uncertainty in identifying a state. This can happen due to the human agent's cognitive limitations and/or perceptual errors. An algorithm that computes a policy for a human to execute should therefore consider these effects in its computations. An optimal Markov Decision Process (MDP) policy that is poorly executed (because of a human agent) may be much worse than another policy that is suboptimal in the MDP but accounts for the human agent's execution behavior. In this paper we consider two problems that arise from state uncertainty: erroneous state inference, and extra sensing actions that a person might take as a result of their uncertainty. We present a framework that models the human agent's behavior with respect to state uncertainty and can be used to compute MDP…
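The abstract's central claim, that a nominally optimal policy can lose to a suboptimal one once execution errors are counted, can be illustrated with a very small sketch. Everything in it is an assumption made up for illustration: the two look-alike states, the reward numbers, the symmetric confusion probability `p_error`, and the helper names `confusion` and `executed_value`. It is a single-step toy, not the paper's behavior model or its algorithms, and it covers only erroneous state inference, not extra sensing actions.

```python
# Toy illustration only: states, rewards, and confusion probabilities are
# invented for this sketch and are not taken from the paper.

STATES = ["A", "B"]
START_PROB = {"A": 0.5, "B": 0.5}

# REWARD[state][action]: one-step reward for taking `action` in the true `state`.
REWARD = {
    "A": {"left": 10.0, "right": -100.0, "safe": 8.0},
    "B": {"left": -100.0, "right": 10.0, "safe": 8.0},
}


def confusion(p_error):
    """Return P(perceived state | true state) for two look-alike states."""
    return {
        "A": {"A": 1.0 - p_error, "B": p_error},
        "B": {"A": p_error, "B": 1.0 - p_error},
    }


def executed_value(policy, p_error):
    """Expected reward when the human applies `policy` to the perceived state."""
    perceive = confusion(p_error)
    total = 0.0
    for true_state in STATES:
        for perceived, prob in perceive[true_state].items():
            action = policy[perceived]  # the human acts on what they believe
            total += START_PROB[true_state] * prob * REWARD[true_state][action]
    return total


# Best policy if the states are never confused.
nominal_optimal = {"A": "left", "B": "right"}
# Same action in both look-alike states, so confusion cannot change the outcome.
cautious = {"A": "safe", "B": "safe"}

for p in (0.0, 0.1):
    print(p, executed_value(nominal_optimal, p), executed_value(cautious, p))
```

With no confusion the nominal policy scores higher, but under the assumed 10% confusion rate the cautious policy comes out ahead, because it never triggers the large penalty for acting on the wrong state.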
Taxonomy
Topics: Reinforcement Learning in Robotics
