Experiential Explanations for Reinforcement Learning
Amal Alabdulkarim, Madhuri Singh, Gennie Mansi, Kaely Hall, Upol Ehsan, Mark O. Riedl

TL;DR
This paper introduces Experiential Explanations, a method that makes reinforcement learning agents more interpretable by generating counterfactual explanations of how different sources of reward influence the agent's decisions. In human studies, these explanations improved participants' understanding of the agent.
Contribution
The paper proposes a new technique using influence predictors to generate counterfactual explanations for RL agents, enhancing interpretability and user comprehension.
Findings
Participants better predicted agent actions with Experiential Explanations.
Participants found these explanations more understandable and useful.
Experiential Explanations improved perceived explanation quality.
Abstract
Reinforcement learning (RL) systems can be complex and non-interpretable, making it challenging for non-AI experts to understand or intervene in their decisions. This is due in part to the sequential nature of RL, in which actions are chosen because of their likelihood of obtaining future rewards. However, RL agents discard the qualitative features of their training, making it difficult to recover user-understandable information for "why" an action is chosen. We propose a technique, Experiential Explanations, to generate counterfactual explanations by training influence predictors along with the RL policy. Influence predictors are models that learn how different sources of reward affect the agent in different states, thus restoring information about how the policy reflects the environment. Two human evaluation studies revealed that participants presented with Experiential Explanations were…
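Illustrative sketch
To make the idea of an influence predictor concrete, the following is a minimal tabular sketch, not the authors' implementation. It assumes one value table per reward source, trained with TD(0) on the magnitude of that source's reward, and a simple comparison of mean influence along two paths. The environment, the names `train_influence_predictors` and `dominant_source`, and the choice of absolute reward and path means are all hypothetical and are not taken from the abstract.

```python
import numpy as np

# Hypothetical toy environment: two routes from state 0 to a goal at state 6.
# Route A (0 -> 1 -> 2 -> 6) crosses a hazard; route B (0 -> 3 -> 4 -> 5 -> 6) does not.
# Each transition is (state, next_state, {reward_source: reward}, done).
TRANSITIONS = [
    (0, 1, {}, False),
    (1, 2, {"hazard": -1.0}, False),
    (2, 6, {"goal": 1.0}, True),
    (0, 3, {}, False),
    (3, 4, {}, False),
    (4, 5, {}, False),
    (5, 6, {"goal": 1.0}, True),
]


def train_influence_predictors(transitions, n_states, sources,
                               gamma=0.9, alpha=0.5, epochs=200):
    """Learn one tabular value estimate per reward source with TD(0).

    Assumption: each predictor is trained on the absolute value of the
    reward from its own source, so it measures how strongly that source
    is "felt" from a state, regardless of sign.
    """
    tables = {src: np.zeros(n_states) for src in sources}
    for _ in range(epochs):
        for state, next_state, rewards, done in transitions:
            for src in sources:
                r = abs(rewards.get(src, 0.0))
                bootstrap = 0.0 if done else gamma * tables[src][next_state]
                td_error = r + bootstrap - tables[src][state]
                tables[src][state] += alpha * td_error
    return tables


def dominant_source(tables, rejected_path, chosen_path):
    """Return the reward source whose mean influence is most elevated
    on the rejected path relative to the chosen path."""
    excess = {
        src: float(t[rejected_path].mean() - t[chosen_path].mean())
        for src, t in tables.items()
    }
    return max(excess, key=excess.get)


if __name__ == "__main__":
    tables = train_influence_predictors(TRANSITIONS, 7, ["goal", "hazard"])
    source = dominant_source(tables, rejected_path=[1, 2], chosen_path=[3, 4, 5])
    print(f"Route A differs from route B mostly in '{source}' influence.")
```

In this toy setting, a counterfactual question such as "why not take route A?" could be answered by pointing to the reward source whose influence differs most between the two routes. How the paper actually turns predictor outputs into explanations is not specified in the abstract above.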
Taxonomy
Topics: Explainable Artificial Intelligence (XAI)
