Benchmarking Perturbation-based Saliency Maps for Explaining Atari Agents

Tobias Huber; Benedikt Limmer; Elisabeth André

arXiv:2101.07312·cs.LG·February 3, 2022

TL;DR

This paper evaluates and compares five perturbation-based saliency map methods for explaining Deep Reinforcement Learning agents in Atari games, addressing challenges in fidelity assessment and proposing solutions for improved evaluation.

Contribution

It provides a systematic comparison of saliency map approaches for DRL agents, introduces a fix for sanity check issues, and analyzes factors influencing method selection.

Findings

1. Identified issues with one saliency approach during sanity checks
2. Proposed a solution to fix sanity check issues
3. Determined key factors affecting saliency method effectiveness

Abstract

One of the most prominent methods for explaining the behavior of Deep Reinforcement Learning (DRL) agents is the generation of saliency maps that show how much each pixel contributed to the agent's decision. However, there is no work that computationally evaluates and compares the fidelity of different saliency map approaches specifically for DRL agents. Evaluating saliency maps computationally is particularly challenging for DRL agents because their decisions are part of an overarching policy. For instance, the output neurons of value-based DRL algorithms encode both the value of the current state and the value of taking each action in this state. This ambiguity should be considered when evaluating saliency maps for such agents. In this paper, we compare five popular perturbation-based approaches to create saliency maps for DRL agents trained on four different Atari 2600 games.…
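
The ambiguity described in the abstract can be illustrated with a small sketch. The code below is not taken from the paper or its repository: it uses a generic occlusion-style perturbation (not necessarily one of the five approaches the paper compares) on a hypothetical hand-written stand-in for a Q-network, `toy_q_values`. The function names, the `score` parameter, and the mean-subtracted "relative" score are all assumptions made for illustration. The point is only that a pixel which shifts every output equally (a change in state value) looks important under a raw Q-value difference but not under a score that compares the chosen action against the others.

```python
import numpy as np


def toy_q_values(frame):
    """Hypothetical stand-in for a trained Q-network (an assumption, not the paper's agent).

    The top row acts like a score display that raises the value of the state
    for all actions alike; the remaining rows favor "left" or "right".
    Returns Q-values for the actions [left, right, noop].
    """
    w = frame.shape[1]
    state_value = frame[0, :].sum()
    left_pref = frame[1:, : w // 2].sum()
    right_pref = frame[1:, w // 2 :].sum()
    return np.array([state_value + left_pref, state_value + right_pref, state_value])


def occlusion_saliency(q_fn, frame, patch=1, baseline=0.0, score="raw"):
    """Occlude one patch at a time and record how the chosen action's score changes.

    score="raw":      change in the Q-value of the originally chosen action.
    score="relative": change in that Q-value minus the mean Q-value, which
                      cancels shifts common to all actions (illustrative choice).
    """
    q_orig = q_fn(frame)
    action = int(np.argmax(q_orig))
    h, w = frame.shape
    saliency = np.zeros((h, w), dtype=float)
    for top in range(0, h, patch):
        for left in range(0, w, patch):
            perturbed = frame.copy()
            perturbed[top : top + patch, left : left + patch] = baseline
            q_pert = q_fn(perturbed)
            if score == "raw":
                diff = q_orig[action] - q_pert[action]
            else:
                diff = (q_orig[action] - q_orig.mean()) - (q_pert[action] - q_pert.mean())
            saliency[top : top + patch, left : left + patch] = abs(diff)
    return saliency, action


frame = np.zeros((4, 4))
frame[0, 3] = 1.0  # "score display" pixel: affects the state value only
frame[2, 0] = 1.0  # object on the left: affects the preference for "left"

raw_map, chosen = occlusion_saliency(toy_q_values, frame, score="raw")
rel_map, _ = occlusion_saliency(toy_q_values, frame, score="relative")
print(chosen)
print(raw_map)
print(rel_map)
```

In this toy setting the raw map marks both the score pixel and the object as equally relevant, while the relative map marks only the object. Which reading is appropriate depends on whether the explanation should cover the state value or the action choice, which is the kind of ambiguity the abstract says an evaluation should take into account.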

Code & Models

Repositories

belimmer/PerturbationSaliencyEvaluation (TensorFlow, official)

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Adversarial Robustness in Machine Learning

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network