Graying the black box: Understanding DQNs

Tom Zahavy; Nir Ben Zrihem; Shie Mannor

arXiv:1602.02658 · cs.LG · April 25, 2017 · 58 cites

Open Access

TL;DR

This paper introduces tools and a new model to analyze Deep Q-networks, revealing hierarchical feature aggregation and improving interpretability, debugging, and optimization in reinforcement learning.

Contribution

It presents a methodology for analyzing DQNs and proposes the SAMDP model to understand their learned representations and policies.

Findings

01. Features learned by DQNs aggregate the state space hierarchically.

02. The tools reveal how DQNs develop policies in Atari games.

03. SAMDP helps interpret and debug deep reinforcement learning models.

Abstract

In recent years there has been growing interest in using deep representations for reinforcement learning. In this paper, we present a methodology and tools to analyze Deep Q-networks (DQNs) in a non-blind manner. Moreover, we propose a new model, the Semi Aggregated Markov Decision Process (SAMDP), and an algorithm that learns it automatically. The SAMDP model allows us to identify spatio-temporal abstractions directly from features and may be used as a sub-goal detector in future work. Using our tools, we reveal that the features learned by DQNs aggregate the state space in a hierarchical fashion, which explains their success. Moreover, we are able to understand and describe the policies learned by DQNs for three different Atari 2600 games and suggest ways to interpret, debug, and optimize deep neural networks in reinforcement learning.
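The abstract does not spell out how an SAMDP is learned, so the following is only a minimal sketch of the general idea of aggregating states and summarizing the temporal structure between aggregates. It assumes each time step of a recorded trajectory has already been assigned a cluster label by some feature-based clustering step (not shown). The function names `aggregate_trajectory`, `estimate_transitions`, and `mean_dwell` are hypothetical and do not come from the paper.

```python
from collections import Counter, defaultdict


def aggregate_trajectory(labels):
    """Collapse per-step cluster labels into (cluster, dwell_length) segments.

    Assumption: `labels` is a sequence of hashable cluster ids, one per time
    step, produced by a clustering step that is outside this sketch.
    """
    segments = []
    for label in labels:
        if segments and segments[-1][0] == label:
            segments[-1][1] += 1  # still inside the same aggregated state
        else:
            segments.append([label, 1])  # entered a new aggregated state
    return [(cluster, length) for cluster, length in segments]


def estimate_transitions(segments):
    """Estimate empirical transition probabilities between distinct clusters."""
    counts = defaultdict(Counter)
    for (src, _), (dst, _) in zip(segments, segments[1:]):
        counts[src][dst] += 1
    probs = {}
    for src, row in counts.items():
        total = sum(row.values())
        probs[src] = {dst: n / total for dst, n in row.items()}
    return probs


def mean_dwell(segments):
    """Average number of consecutive steps spent in each cluster per visit."""
    lengths = defaultdict(list)
    for cluster, length in segments:
        lengths[cluster].append(length)
    return {cluster: sum(v) / len(v) for cluster, v in lengths.items()}


# Illustrative labels only; these are not data from the paper.
example_labels = [0, 0, 0, 1, 1, 2, 0, 0, 2, 2, 1]
example_segments = aggregate_trajectory(example_labels)
example_transitions = estimate_transitions(example_segments)
example_dwell = mean_dwell(example_segments)
```

In this sketch the clusters play the role of aggregated states, and each move from one cluster to the next stands in for a temporally extended behavior. The dwell time gives a rough sense of how long the agent stays in an aggregated state before leaving it.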

Taxonomy

Topics: Reinforcement Learning in Robotics · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)