Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis

Arunselvan Ramaswamy; Eyke Hüllermeier

arXiv:2008.10870·cs.LG·April 13, 2021

TL;DR

This paper analyzes a popular version of Deep Q-Learning from a dynamical systems perspective, proving its convergence and characterizing the asymptotic behavior of the learning process under realistic and verifiable assumptions. The result narrows the gap between the algorithm's empirical success and its theoretical understanding.

Contribution

It offers the first convergence proof for a version of Deep Q-Learning in which the underlying Markov process may have multiple stationary distributions.

Findings

01 Proves convergence of a popular version of Deep Q-Learning under realistic and verifiable assumptions

02 Characterizes the asymptotic behavior of the learning process

03 Helps explain empirical observations such as performance inconsistencies even after training

Abstract

Deep Q-Learning is an important reinforcement learning algorithm, which involves training a deep neural network, called Deep Q-Network (DQN), to approximate the well-known Q-function. Although wildly successful under laboratory conditions, serious gaps between theory and practice as well as a lack of formal guarantees prevent its use in the real world. Adopting a dynamical systems perspective, we provide a theoretical analysis of a popular version of Deep Q-Learning under realistic and verifiable assumptions. More specifically, we prove an important result on the convergence of the algorithm, characterizing the asymptotic behavior of the learning process. Our result sheds light on hitherto unexplained properties of the algorithm and helps understand empirical observations, such as performance inconsistencies even after training. Unlike previous theories, our analysis accommodates state…
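For readers unfamiliar with the update the abstract refers to, the following is a minimal sketch of a single Q-learning step with function approximation. It is not the exact variant analyzed in the paper: a linear approximator stands in for the deep network, and the separate `target_weights`, the learning rate `lr`, the discount `gamma`, and all function names are illustrative assumptions.

```python
def q_values(weights, state):
    """Linear stand-in for a DQN (assumption): one weight vector per action."""
    return [sum(w * s for w, s in zip(w_a, state)) for w_a in weights]


def td_target(reward, next_state, done, target_weights, gamma=0.99):
    """Bellman target r + gamma * max_a' Q_target(s', a'); just r at episode end."""
    if done:
        return reward
    return reward + gamma * max(q_values(target_weights, next_state))


def q_update(weights, target_weights, transition, lr=0.1, gamma=0.99):
    """One semi-gradient step on the squared TD error for a single transition.

    Mutates `weights` in place and returns the TD error.
    """
    state, action, reward, next_state, done = transition
    target = td_target(reward, next_state, done, target_weights, gamma)
    td_error = target - q_values(weights, state)[action]
    # For a linear model, the gradient of Q(s, a) with respect to the
    # chosen action's weights is the state feature vector itself.
    weights[action] = [w + lr * td_error * s
                       for w, s in zip(weights[action], state)]
    return td_error
```

In a deep variant the list comprehension would be replaced by a gradient step on the network parameters; the target computation keeps the same shape.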

Taxonomy

Methods: Q-Learning