Towards Better Interpretability in Deep Q-Networks

Raghuram Mandyam Annasamy; Katia Sycara

arXiv:1809.05630 · cs.LG · November 16, 2018

TL;DR

This paper introduces an interpretable neural network architecture for deep Q-learning that gives a global explanation of the agent's behavior. Testing shows, however, that the learned features are shallow and that the agent overfits to trajectories seen during training.

Contribution

The paper proposes an interpretable deep Q-network architecture built on key-value memories, attention, and reconstructible embeddings, so that the learned policy can be inspected directly.

Findings

01 With a directed exploration strategy, the model reaches training rewards comparable to state-of-the-art deep Q-learning models.

02 The features extracted by the network are extremely shallow.

03 The agent easily overfits to trajectories seen during training, as shown by testing on out-of-sample examples.

Abstract

Deep reinforcement learning techniques have demonstrated superior performance in a wide variety of environments. As improvements in training algorithms continue at a brisk pace, theoretical or empirical studies on understanding what these networks seem to learn are far behind. In this paper we propose an interpretable neural network architecture for Q-learning which provides a global explanation of the model's behavior using key-value memories, attention and reconstructible embeddings. With a directed exploration strategy, our model can reach training rewards comparable to the state-of-the-art deep Q-learning models. However, results suggest that the features extracted by the neural network are extremely shallow and subsequent testing using out-of-sample examples shows that the agent can easily overfit to trajectories seen during training.
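
To make the phrase "key-value memories and attention" concrete, here is a minimal sketch of how a Q-value could be read out of a per-action key-value memory with dot-product attention. It is a generic illustration and not the authors' implementation: the function names, the toy memory contents, the two-dimensional state embedding, and the choice of scalar values are all assumptions made for this example, and the reconstructible-embedding part of the architecture is not modeled.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys, values):
    """Dot-product attention over one memory.

    Returns the attention weights and the weighted sum of the values.
    The weights are what makes the read-out inspectable: they show
    which memory slots the state embedding matched.
    """
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    read = sum(w * v for w, v in zip(weights, values))
    return weights, read


def q_values(state_embedding, memory):
    """Compute one Q-value per action from its own key-value memory."""
    return {
        action: attend(state_embedding, keys, values)[1]
        for action, (keys, values) in memory.items()
    }


def greedy_action(state_embedding, memory):
    """Pick the action with the highest attention-weighted value."""
    qs = q_values(state_embedding, memory)
    return max(qs, key=qs.get)


# Hypothetical toy memory: each action owns two keys, and each key is
# paired with a scalar value (an assumed stand-in for a return estimate).
memory = {
    "left": ([[1.0, 0.0], [0.0, 1.0]], [0.0, 10.0]),
    "right": ([[0.0, 1.0], [1.0, 0.0]], [0.0, 10.0]),
}

# Hypothetical state embedding, as if produced by an encoder network.
state = [2.0, 0.0]

qs = q_values(state, memory)
best = greedy_action(state, memory)
```

In this toy setup the state embedding aligns with the key that "right" pairs with the high value, so `greedy_action` selects "right". In a trained model the keys would be learned parameters and the state embedding would come from the network's encoder; inspecting the attention weights per action is what gives this kind of design its explanatory appeal.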

Code & Models

Repositories

maraghuram/I-DQN (PyTorch)

Taxonomy

Methods: Q-Learning