Interpretable Option Discovery using Deep Q-Learning and Variational Autoencoders

Per-Arne Andersen; Ole-Christoffer Granmo; Morten Goodwin

arXiv:2210.01231·cs.LG·October 5, 2022

TL;DR

This paper introduces the Deep Variational Q-Network (DVQN), a novel method combining deep generative models and reinforcement learning to automatically discover options with interpretable initiation and termination conditions, improving sample efficiency and stability.

Contribution

The paper presents DVQN, a new algorithm that automates option discovery in deep RL using a variational autoencoder framework, enhancing interpretability and performance.

Findings

01. DVQN achieves performance comparable to Rainbow.

02. DVQN remains stable during extended training.

03. Automatically discovered initiation and termination conditions improve option learning.

Abstract

Deep Reinforcement Learning (RL) is unquestionably a robust framework to train autonomous agents in a wide variety of disciplines. However, traditional deep and shallow model-free RL algorithms suffer from low sample efficiency and inadequate generalization for sparse state spaces. The options framework with temporal abstractions is perhaps the most promising method to solve these problems, but it still has noticeable shortcomings. It only guarantees local convergence, and it is challenging to automate initiation and termination conditions, which in practice are commonly hand-crafted. Our proposal, the Deep Variational Q-Network (DVQN), combines deep generative- and reinforcement learning. The algorithm finds good policies from a Gaussian distributed latent-space, which is especially useful for defining options. The DVQN algorithm uses MSE with KL-divergence as regularization,…
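The abstract is cut off here, so the exact DVQN objective is not available on this page. As a rough illustration of what "MSE with KL-divergence as regularization" could look like, the sketch below adds a KL term, which pulls a diagonal-Gaussian latent code toward a standard normal, to a mean-squared error. The function names, the `beta` weight, and the choice to compute the MSE between predicted and target Q-values are assumptions for illustration, not details taken from the paper.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var)
    )


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def regularized_loss(q_pred, q_target, mu, log_var, beta=1.0):
    """Hypothetical objective: MSE term plus a beta-weighted KL regularizer.

    Assumption: the MSE is taken between predicted and target Q-values;
    the source page does not say what the MSE is computed over.
    """
    return mse(q_pred, q_target) + beta * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    # Latent code already at the prior (mu = 0, log_var = 0): the KL term vanishes.
    print(regularized_loss([1.0, 2.0], [1.0, 4.0], [0.0, 0.0], [0.0, 0.0]))
```

With the latent code at the prior the loss reduces to the plain MSE; moving `mu` away from zero or `log_var` away from zero adds a penalty scaled by `beta`.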

Taxonomy

Methods: Q-Learning