Autonomous Self-Explanation of Behavior for Interactive Reinforcement Learning Agents
Yosuke Fukuchi, Masahiko Osawa, Hiroshi Yamakawa, Michita Imai

TL;DR
This paper introduces Instruction-based Behavior Explanation (IBE), a method enabling autonomous agents to explain their behavior by reusing human instructions, thereby improving transparency and adaptability in human-robot cooperation.
Contribution
The paper presents a novel IBE method that allows agents to autonomously generate explanations of their behavior using human instructions, enhancing interpretability during learning and adaptation.
Findings
Enables agents to explain their behavior by reusing instructions received from humans.
Supports developmental agents whose policies change during cooperation.
Aims to improve human understanding of, and cooperation with, robots.
Abstract
In cooperative work, workers must know how their co-workers behave. However, an agent's policy, which is embedded in a statistical machine learning model, is hard to understand and takes considerable time and knowledge to comprehend. It is therefore difficult for people to predict the behavior of machine learning robots, which makes human-robot cooperation challenging. In this paper, we propose Instruction-based Behavior Explanation (IBE), a method to explain an autonomous agent's future behavior. In IBE, an agent autonomously acquires the expressions it uses to explain its own behavior by reusing the instructions that a human expert gives to accelerate the learning of the agent's policy. IBE also enables a developmental agent, whose policy may change during the cooperation, to explain its own behavior with sufficient time granularity.
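The abstract does not spell out how instructions are turned into explanations, so the following is only a toy sketch of the general idea of instruction reuse, not the authors' algorithm. It assumes that each instruction seen during learning can be paired with a numeric "outcome" vector (for example, a change in the agent's position), and that an explanation is chosen by finding the instruction whose average outcome is closest to the outcome predicted for the agent's future behavior. The class `InstructionExplainer`, its methods, the instruction strings, and the outcome vectors are all hypothetical.

```python
from collections import defaultdict
from math import dist


class InstructionExplainer:
    """Toy nearest-centroid mapping from predicted outcomes to instructions.

    Hypothetical illustration only; the paper's actual method may differ.
    """

    def __init__(self):
        self._sums = {}                   # instruction -> component-wise sum of outcomes
        self._counts = defaultdict(int)   # instruction -> number of recorded outcomes

    def record(self, instruction, outcome):
        """Store one (instruction, outcome) pair observed during instructed learning."""
        if instruction not in self._sums:
            self._sums[instruction] = [0.0] * len(outcome)
        self._sums[instruction] = [
            s + o for s, o in zip(self._sums[instruction], outcome)
        ]
        self._counts[instruction] += 1

    def _centroid(self, instruction):
        """Mean outcome vector recorded for one instruction."""
        n = self._counts[instruction]
        return [s / n for s in self._sums[instruction]]

    def explain(self, predicted_outcome):
        """Return the instruction whose mean outcome is nearest to the prediction."""
        if not self._sums:
            raise ValueError("no instructions recorded yet")
        return min(
            self._sums,
            key=lambda instr: dist(self._centroid(instr), predicted_outcome),
        )


# Assumed example data: outcomes are (dx, dy) displacements.
explainer = InstructionExplainer()
explainer.record("go left", (-1.0, 0.0))
explainer.record("go left", (-0.8, 0.1))
explainer.record("go right", (1.0, 0.0))

# In a real agent the predicted outcome would come from rolling out the
# current policy; here it is supplied by hand.
label = explainer.explain((-0.7, 0.0))
print(label)
```

Because the lookup depends only on the recorded pairs and on whatever outcome the current policy is predicted to produce, a sketch like this keeps working when the policy changes, which is the property the abstract highlights for developmental agents.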
