Integrating Policy Summaries with Reward Decomposition for Explaining Reinforcement Learning Agents

Yael Septon; Tobias Huber; Elisabeth André; Ofra Amir

arXiv:2210.11825 · cs.LG · February 28, 2024 · 1 citation

Open Access

TL;DR

This paper combines reward decomposition, a local explanation method, with HIGHLIGHTS, a global policy summary, to help users understand reinforcement learning agents. It evaluates the combination in two user studies and reports improved interpretability.

Contribution

It introduces an explanation framework for RL agents that integrates reward decomposition, which explains individual decisions, with HIGHLIGHTS summaries of the agent's overall behavior.

Findings

01. Reward decomposition helps identify agent priorities.

02. Global highlights improve understanding when preferences are similar.

03. Combined explanations enhance interpretability in RL agents.

Abstract

Explaining the behavior of reinforcement learning agents operating in sequential decision-making settings is challenging, as their behavior is affected by a dynamic environment and delayed rewards. Methods that help users understand the behavior of such agents can roughly be divided into local explanations that analyze specific decisions of the agents and global explanations that convey the general strategy of the agents. In this work, we study a novel combination of local and global explanations for reinforcement learning agents. Specifically, we combine reward decomposition, a local explanation method that exposes which components of the reward function influenced a specific decision, and HIGHLIGHTS, a global explanation method that shows a summary of the agent's behavior in decisive states. We conducted two user studies to evaluate the integration of these explanation methods and…
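The two methods named in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes the agent exposes one Q-value table per reward component, that the overall Q-value is the sum over components, and that a state's importance is the gap between its best and worst action values, a simplified form of the HIGHLIGHTS selection criterion. All function names, the array layout, and the component names are hypothetical.

```python
import numpy as np


def total_q(q_components):
    """Sum per-component Q-values into overall action values.

    q_components has shape (n_components, n_actions) for one state
    (an assumed layout, chosen for this sketch).
    """
    return q_components.sum(axis=0)


def explain_decision(q_components, component_names):
    """Local explanation: how much each reward component
    contributes to the Q-value of the greedy action."""
    action = int(np.argmax(total_q(q_components)))
    contributions = {
        name: float(q_components[i, action])
        for i, name in enumerate(component_names)
    }
    return action, contributions


def state_importance(q_components):
    """Simplified importance score: best minus worst action value."""
    q = total_q(q_components)
    return float(q.max() - q.min())


def summarize(trajectory_q, k):
    """Global summary: indices of the k most important states,
    returned in the order they occurred."""
    scores = [state_importance(q) for q in trajectory_q]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


if __name__ == "__main__":
    names = ["progress", "safety"]  # hypothetical reward components
    trajectory = [
        np.array([[1.0, 0.0], [0.0, 0.5]]),
        np.array([[0.0, 3.0], [0.0, 0.0]]),
        np.array([[1.0, 1.0], [1.0, 1.2]]),
    ]
    for idx in summarize(trajectory, k=2):
        action, parts = explain_decision(trajectory[idx], names)
        print(f"state {idx}: action {action}, contributions {parts}")
```

The combination studied in the paper corresponds to the final loop: the summary picks which states to show, and the decomposition explains the decision in each of them. The full HIGHLIGHTS method also shows surrounding context for each selected state, which this sketch omits.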

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning · Data Stream Mining Techniques