BET: Explaining Deep Reinforcement Learning through The Error-Prone Decisions

Xiao Liu; Jie Zhao; Wubing Chen; Mao Tan; Yongxing Su

arXiv:2401.07263 · cs.LG · January 17, 2024 · 1 cites


Open Access

TL;DR

This paper introduces BET, a novel interpretable model that identifies error-prone states in deep reinforcement learning agents by analyzing decision consistency and state neighborhoods, improving explanation fidelity especially in complex environments.

Contribution

BET is a new self-interpretable structure that pinpoints error-prone states in DRL by modeling state neighborhoods and decision uniformity, advancing explainability in complex scenarios.

Findings

01. BET outperforms existing models in explanation fidelity.

02. BET effectively identifies error-prone states in various RL environments.

03. BET is the first to transparently explain complex multi-agent scenarios such as StarCraft II.

Abstract

Despite the impressive capabilities of Deep Reinforcement Learning (DRL) agents in many challenging scenarios, their black-box decision-making process significantly limits their deployment in safety-sensitive domains. Several previous self-interpretable works focus on revealing the critical states of the agent's decision. However, they cannot pinpoint the error-prone states. To address this issue, we propose a novel self-interpretable structure, named Backbone Extract Tree (BET), to better explain the agent's behavior by identifying the error-prone states. At a high level, BET hypothesizes that states in which the agent consistently executes uniform decisions exhibit a reduced propensity for errors. To effectively model this phenomenon, BET expresses these states within neighborhoods, each defined by a curated set of representative states. Therefore, states positioned at a greater distance…
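The neighborhood idea in the abstract can be illustrated with a minimal sketch. It is not the BET algorithm, whose details are not given in the text above. It assumes, purely for illustration, that states are numeric vectors, that Euclidean distance is a reasonable metric, and that a state's distance to its nearest representative state can be used as a rough error-proneness score. The function names `nearest_representative` and `rank_by_distance` are hypothetical.

```python
import math


def nearest_representative(state, representatives):
    """Return (index, distance) of the representative closest to `state`.

    Assumption: Euclidean distance; the paper's metric may differ.
    """
    best_idx, best_dist = -1, math.inf
    for i, rep in enumerate(representatives):
        d = math.dist(state, rep)
        if d < best_dist:
            best_idx, best_dist = i, d
    return best_idx, best_dist


def rank_by_distance(states, representatives):
    """Sort states from farthest to nearest representative.

    Assumption: a larger distance is treated as a proxy for a more
    error-prone state; this is an illustrative reading, not BET itself.
    """
    scored = [(nearest_representative(s, representatives)[1], s) for s in states]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [s for _, s in scored]


# Hypothetical toy data: two representative states and three observed states.
representatives = [(0.0, 0.0), (10.0, 0.0)]
states = [(1.0, 0.0), (5.0, 0.0), (9.0, 1.0)]
ranking = rank_by_distance(states, representatives)
```

In this toy setup the state halfway between the two representatives is ranked first, since it lies farthest from both neighborhoods.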

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Topic Modeling · Adversarial Robustness in Machine Learning

Methods: Sparse Evolutionary Training · Focus