Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target

J. Fernando Hernandez-Garcia; Richard S. Sutton

arXiv:1901.07510·cs.LG·February 11, 2019·41 cites

TL;DR

This paper systematically studies how multi-step deep reinforcement learning algorithms, such as Retrace and n-step Q-learning, perform when combined with a DQN-like architecture, using statistical analysis of runs in the mountain car environment.

Contribution

It analyzes how three algorithmic details, the off-policy correction, the backup length n, and the target network update frequency, influence the performance of multi-step RL algorithms.

Findings

1. Increasing backup length n improves performance.

2. Off-policy correction can negatively impact Sarsa and Q(σ).

3. Sarsa and Q-learning are more robust to target network update frequency.

Abstract

Multi-step methods such as Retrace($\lambda$) and $n$-step $Q$-learning have become a crucial component of modern deep reinforcement learning agents. These methods are often evaluated as a part of bigger architectures and their evaluations rarely include enough samples to draw statistically significant conclusions about their performance. This type of methodology makes it difficult to understand how particular algorithmic details of multi-step methods influence learning. In this paper we combine the $n$-step action-value algorithms Retrace, $Q$-learning, Tree Backup, Sarsa, and $Q(\sigma)$ with an architecture analogous to DQN. We test the performance of all these algorithms in the mountain car environment; this choice of environment allows for faster training times and larger sample sizes. We present statistical analyses on the effects of the off-policy correction, the backup length…
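To make the role of the backup length $n$ concrete, the following is a minimal sketch of an uncorrected $n$-step $Q$-learning target, the kind of quantity a DQN-like agent would regress toward. It is not taken from the paper: the function name `n_step_q_target` and its parameters (`rewards`, `bootstrap_q_values`, `gamma`, `terminal`) are hypothetical, and the sketch assumes no off-policy correction and that the bootstrap values come from a separate target network.

```python
from typing import Sequence


def n_step_q_target(
    rewards: Sequence[float],
    bootstrap_q_values: Sequence[float],
    gamma: float = 0.99,
    terminal: bool = False,
) -> float:
    """Illustrative n-step Q-learning target (hypothetical helper, not the paper's code).

    rewards: the n rewards observed after taking the action at time t.
    bootstrap_q_values: target-network action values at the state reached
        after n steps (one entry per action); ignored if the episode ended.
    gamma: discount factor.
    terminal: True if the episode terminated within these n steps.
    """
    # Discounted sum of the n observed rewards.
    target = 0.0
    for k, reward in enumerate(rewards):
        target += (gamma ** k) * reward

    # Bootstrap from the greedy action value after n steps,
    # unless the episode has already terminated.
    if not terminal:
        target += (gamma ** len(rewards)) * max(bootstrap_q_values)
    return target


# Example with n = 3 and a reward of -1 per step (an assumed reward scheme).
example = n_step_q_target([-1.0, -1.0, -1.0], [-10.0, -8.0, -9.0], gamma=0.5)
```

With `n = 1` this reduces to the usual one-step DQN target. Larger `n` replaces more of the bootstrapped estimate with observed rewards, which is the design dimension the backup-length analysis varies.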

Code & Models

Repositories

kochlisGit/autonomous-vehicles-agent (tf)

Taxonomy

Topics: Reinforcement Learning in Robotics · Evolutionary Algorithms and Applications · VLSI and FPGA Design Techniques

Methods: Convolution · Dense Connections · Q-Learning · Deep Q-Network · Retrace · Sarsa