Exploratory Not Explanatory: Counterfactual Analysis of Saliency Maps for Deep Reinforcement Learning

Akanksha Atrey; Kaleigh Clary; David Jensen

arXiv:1912.05743 · cs.LG · February 24, 2020 · 31 cites

TL;DR

This paper introduces a counterfactual approach for evaluating the explanations that saliency maps are used to support in deep reinforcement learning, and argues that saliency maps are better suited to exploration than to explanation.

Contribution

The paper proposes an empirical framework, grounded in counterfactual reasoning, for testing hypotheses generated from saliency maps in deep RL and for assessing how well those hypotheses correspond to the semantics of the RL environment.

Findings

1. Explanations derived from saliency maps are often unfalsifiable and can be highly subjective.

2. Counterfactual analysis reveals the limitations of saliency maps as explanations of RL agent behavior.

3. Saliency maps are better suited to exploration than to definitive explanation.

Abstract

Saliency maps are frequently used to support explanations of the behavior of deep reinforcement learning (RL) agents. However, a review of how saliency maps are used in practice indicates that the derived explanations are often unfalsifiable and can be highly subjective. We introduce an empirical approach grounded in counterfactual reasoning to test the hypotheses generated from saliency maps and assess the degree to which they correspond to the semantics of RL environments. We use Atari games, a common benchmark for deep RL, to evaluate three types of saliency maps. Our results show the extent to which existing claims about Atari games can be evaluated and suggest that saliency maps are best viewed as an exploratory tool rather than an explanatory tool.
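
The idea of testing a saliency-derived hypothesis with a counterfactual can be illustrated with a small toy sketch. This is not the paper's implementation and does not use Atari: the frame layout, the policy, and every name below (make_frame, toy_policy, perturbation_saliency, action_changes) are hypothetical and chosen only for illustration. The sketch computes a simple occlusion-based saliency map, then intervenes on one object in the state and checks whether the agent's greedy action changes.

```python
# Toy sketch only: every function and name here is hypothetical, not from the paper.

def make_frame(ball_col, enemy_col, width=5):
    """Build a two-row frame: row 0 holds an enemy, row 1 holds a ball."""
    frame = [[0.0] * width for _ in range(2)]
    frame[0][enemy_col] = 1.0
    frame[1][ball_col] = 1.0
    return frame

def toy_policy(frame):
    """Return scores for [left, right]; by construction it reads only the ball row."""
    ball_row = frame[1]
    center = (len(ball_row) - 1) / 2
    ball_pos = sum(col * value for col, value in enumerate(ball_row))
    offset = ball_pos - center * sum(ball_row)
    return [-offset, offset]

def perturbation_saliency(policy, frame):
    """Occlude each pixel in turn and record how much the policy output moves."""
    base = policy(frame)
    saliency = []
    for r, row in enumerate(frame):
        saliency_row = []
        for c in range(len(row)):
            perturbed = [list(x) for x in frame]
            perturbed[r][c] = 0.0
            out = policy(perturbed)
            saliency_row.append(sum(abs(a - b) for a, b in zip(base, out)))
        saliency.append(saliency_row)
    return saliency

def greedy_action(policy, frame):
    scores = policy(frame)
    return scores.index(max(scores))

def action_changes(policy, original, counterfactual):
    """Counterfactual test: does intervening on the state change the chosen action?"""
    return greedy_action(policy, original) != greedy_action(policy, counterfactual)

original = make_frame(ball_col=3, enemy_col=1)
enemy_moved = make_frame(ball_col=3, enemy_col=4)  # intervention on the enemy
ball_moved = make_frame(ball_col=0, enemy_col=1)   # intervention on the ball

saliency = perturbation_saliency(toy_policy, original)
enemy_matters = action_changes(toy_policy, original, enemy_moved)
ball_matters = action_changes(toy_policy, original, ball_moved)

print("saliency:", saliency)
print("action changes when enemy moves:", enemy_matters)
print("action changes when ball moves:", ball_matters)
```

In this toy setting, a hypothesis such as "the agent reacts to the enemy" becomes falsifiable: it predicts that moving the enemy changes the action, and the intervention can confirm or refute that prediction directly instead of relying on a visual reading of the map.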

Code & Models

Repositories

KDL-umass/saliency_maps (official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning

Methods: Test