Interactive Explanations: Diagnosis and Repair of Reinforcement Learning Based Agent Behaviors
Christian Arzate Cruz, Takeo Igarashi

TL;DR
This paper introduces an interactive explanation method for reinforcement learning agents that enables transparent communication and user-guided behavior repair through natural language templates, demonstrated in a Super Mario Bros. clone.
Contribution
It presents a novel two-way interaction approach using natural language templates for explaining and repairing reinforcement learning agent behaviors.
Findings
Effective diagnosis of agent behaviors
Successful repair of behaviors via user feedback
Enhanced transparency of agent decision-making
Abstract
Reinforcement learning techniques successfully generate convincing agent behaviors, but it is still difficult to tailor the behavior to align with a user's specific preferences. What is missing is a communication method for the system to explain the behavior and for the user to repair it. In this paper, we present a novel interaction method that uses interactive explanations, built from natural language templates, as the communication channel. The main advantage of this method is that it enables two-way communication between users and the agent: the bot can explain its thinking procedure to the users, and the users can communicate their behavior preferences to the bot using the same interactive explanations. In this manner, the thinking procedure of the bot is transparent, and users can provide corrections to the bot that include a suggested action to take, a goal to…
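The two-way use of a single template described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the template wording, the `Explanation` fields, and the `render` and `parse` helpers are hypothetical and are not taken from the paper, whose actual templates are not shown in this listing. The point is only that one sentence pattern can serve both directions: the agent fills it in to explain itself, and the user edits the same sentence to state a preference.

```python
import re
from dataclasses import dataclass

# Hypothetical template; the paper's real templates are not given in this listing.
TEMPLATE = "I will {action} because I want to {goal}."
PATTERN = re.compile(
    r"^I will (?P<action>.+?) because I want to (?P<goal>.+?)\.$"
)


@dataclass
class Explanation:
    """Structured form shared by agent explanations and user corrections."""
    action: str
    goal: str


def render(explanation: Explanation) -> str:
    """Agent -> user: fill the template to explain a decision."""
    return TEMPLATE.format(action=explanation.action, goal=explanation.goal)


def parse(sentence: str) -> Explanation:
    """User -> agent: read a corrected sentence back into structured form."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        raise ValueError(f"Sentence does not follow the template: {sentence!r}")
    return Explanation(action=match["action"], goal=match["goal"])


# The agent explains its current choice.
agent_says = render(Explanation(action="jump", goal="avoid the enemy"))

# The user edits the same sentence to express a preferred behavior.
user_correction = parse("I will move right because I want to collect the coin.")
```

How a parsed correction would then change the agent's policy is outside this sketch; the listing does not describe that step.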
Taxonomy
Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics
