Decisions that Explain Themselves: A User-Centric Deep Reinforcement Learning Explanation System
Xiaoran Wu, Zihan Yan, Chongjie Zhang, Tongshuang Wu

TL;DR
This paper introduces a user-centric explanation system for deep reinforcement learning that improves understanding and debugging of RL agents through natural language explanations and interactive exploration, validated by user studies.
Contribution
The paper presents a novel counterfactual-inference-based explanation method and an interactive system tailored to RL developers' needs, addressing gaps in existing explainability tools.
Findings
Developers identified 20.9% more abnormal behaviors using the system.
End users improved actionability test performance by 25.1% in autonomous driving.
End users improved actionability test performance by 16.9% in StarCraft II.
Abstract
With deep reinforcement learning (RL) systems like autonomous driving being widely deployed but remaining largely opaque, developers frequently use explainable RL (XRL) tools to better understand and work with deep RL agents. However, previous XRL works employ a techno-centric research approach, ignoring how RL developers perceive the generated explanations. Through a pilot study, we identify major goals for RL practitioners to use XRL methods and four pitfalls that widen the gap between existing XRL methods and these goals. The pitfalls include inaccessible reasoning processes, inconsistent or unintelligible explanations, and explanations that cannot be generalized. To fill the discovered gap, we propose a counterfactual-inference-based explanation method that discovers the details of the reasoning process of RL agents and generates natural language explanations. Surrounding this…
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Ethics and Social Impacts of AI
