On Generating Explanations for Reinforcement Learning Policies: An Empirical Study
Mikihisa Yuasa, Huy T. Tran, Ramavarapu S. Sreenivas

TL;DR
This paper presents an empirical study on generating explanations for reinforcement learning policies using linear temporal logic, aiming to improve human understanding of policy objectives and conditions in simulated environments.
Contribution
It introduces a set of linear temporal logic formulas and an algorithm to find the best explanation for RL policies, enhancing interpretability.
Findings
Explanations are generated for RL policies in two simulated environments: capture-the-flag and car parking
The approach identifies both the objectives a policy achieves and the conditions it upholds during execution
The approach is presented as applicable to complex RL scenarios, with improved interpretability
Abstract
Understanding a reinforcement learning policy, which guides state-to-action mappings to maximize rewards, necessitates an accompanying explanation for human comprehension. In this paper, we introduce a set of linear temporal logic formulae designed to provide explanations for policies, and an algorithm for searching through those formulae for the one that best explains a given policy. Our focus is on explanations that elucidate both the ultimate objectives accomplished by the policy and the prerequisite conditions it upholds throughout its execution. The effectiveness of our proposed approach is illustrated through a simulated game of capture-the-flag and a car-parking environment.
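To make the idea in the abstract concrete, the following is a minimal sketch of searching a small family of linear temporal logic formulas for the one that best matches a policy's rollouts. It is not the paper's algorithm: the formula template `F(goal) & G(!avoid)` (eventually reach an objective while always avoiding a condition), the proposition names, the example traces, and the scoring rule (fraction of rollouts that satisfy the formula) are all assumptions chosen for illustration.

```python
from typing import Callable, Iterator, List, Set, Tuple

# A trace is a finite rollout; each step is the set of atomic
# propositions that hold at that step (hypothetical labels).
Trace = List[Set[str]]
Formula = Callable[[Trace], bool]


def eventually(p: str) -> Formula:
    """F(p): p holds at some step of the finite trace."""
    return lambda trace: any(p in step for step in trace)


def always_not(p: str) -> Formula:
    """G(!p): p holds at no step of the finite trace."""
    return lambda trace: all(p not in step for step in trace)


def conj(f: Formula, g: Formula) -> Formula:
    """Conjunction of two formulas."""
    return lambda trace: f(trace) and g(trace)


def candidates(props: List[str]) -> Iterator[Tuple[str, Formula]]:
    """Enumerate formulas of the assumed form F(goal) & G(!avoid)."""
    for goal in props:
        for avoid in props:
            if goal != avoid:
                name = f"F({goal}) & G(!{avoid})"
                yield name, conj(eventually(goal), always_not(avoid))


def score(formula: Formula, traces: List[Trace]) -> float:
    """Assumed scoring rule: fraction of rollouts satisfying the formula."""
    return sum(formula(t) for t in traces) / len(traces)


def best_explanation(props: List[str], traces: List[Trace]) -> Tuple[str, float]:
    """Return the name and score of the highest-scoring candidate."""
    scored = [(name, score(f, traces)) for name, f in candidates(props)]
    return max(scored, key=lambda item: item[1])


# Hypothetical rollouts from a capture-the-flag style policy.
props = ["flag_captured", "tagged", "in_base"]
traces = [
    [{"in_base"}, set(), {"flag_captured"}],
    [{"in_base"}, {"flag_captured"}],
    [set(), {"flag_captured"}],
]

best_name, best_score = best_explanation(props, traces)
print(best_name, best_score)
```

In this toy setting the selected formula reads as "the policy eventually captures the flag and is never tagged", which mirrors the abstract's split between an ultimate objective and a condition upheld throughout execution. Scoring by satisfaction rate alone can favor trivially true formulas, so a realistic search would need a stronger criterion than the one assumed here.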
Taxonomy
Topics: Reinforcement Learning in Robotics
Methods: Focus
