Plan Explanations as Model Reconciliation -- An Empirical Study
Tathagata Chakraborti, Sarath Sreedharan, Sachin Grover, Subbarao Kambhampati

TL;DR
This paper empirically evaluates model-reconciliation explanation algorithms for autonomous systems in a mock search and rescue scenario with humans in the loop, focusing on how well the explanations work and on trust dynamics.
Contribution
It provides what is, to the best of the authors' knowledge, the first empirical assessment of these explanation algorithms with human participants, analyzing their effectiveness and their impact on trust in human-AI interactions.
Findings
Explanation properties hold to some extent in human evaluations
Trust between human and robot evolves during interactions
Certain explanation algorithms improve understanding and trust
Abstract
Recent work in explanation generation for decision making agents has looked at how unexplained behavior of autonomous systems can be understood in terms of differences in the model of the system and the human's understanding of the same, and how the explanation process as a result of this mismatch can be then seen as a process of reconciliation of these models. Existing algorithms in such settings, while having been built on contrastive, selective and social properties of explanations as studied extensively in the psychology literature, have not, to the best of our knowledge, been evaluated in settings with actual humans in the loop. As such, the applicability of such explanations to human-AI and human-robot interactions remains suspect. In this paper, we set out to evaluate these explanation generation algorithms in a series of studies in a mock search and rescue scenario with an…
