Diagnosing Bottlenecks in Deep Q-learning Algorithms
Justin Fu, Aviral Kumar, Matthew Soh, Sergey Levine

TL;DR
This paper investigates what bottlenecks deep Q-learning, showing how neural network size, overfitting, and sampling strategy affect learning stability, and proposes a new sampling method to improve performance.
Contribution
The paper introduces a "unit testing" framework for Q-learning that uses oracles to separate sources of error, analyzes the effects of neural network architecture and sampling, and proposes a new sampling method to mitigate function approximation errors.
Findings
Large neural networks improve stability
Overfitting can be mitigated with practical techniques
A new sampling method improves performance in continuous control
Abstract
Q-learning methods represent a commonly used class of algorithms in reinforcement learning: they are generally efficient and simple, and can be combined readily with function approximators for deep reinforcement learning (RL). However, the behavior of Q-learning methods with function approximation is poorly understood, both theoretically and empirically. In this work, we aim to experimentally investigate potential issues in Q-learning, by means of a "unit testing" framework where we can utilize oracles to disentangle sources of error. Specifically, we investigate questions related to function approximation, sampling error and nonstationarity, and where available, verify if trends found in oracle settings hold true with modern deep RL methods. We find that large neural network architectures have many benefits with regards to learning stability; offer several practical compensations for…
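To make the "oracle" idea in the abstract concrete, here is a minimal sketch, not the authors' framework. It assumes a small random tabular MDP and compares an exact Bellman backup, which reads the true dynamics, with Q-learning that sees only sampled transitions. All names here (`make_random_mdp`, `exact_q_iteration`, `sampled_q_learning`), the toy problem, and the uniform sampling scheme are hypothetical choices made for illustration.

```python
import numpy as np


def make_random_mdp(n_states=5, n_actions=2, seed=0):
    """Build a small random MDP (hypothetical toy problem, not from the paper)."""
    rng = np.random.default_rng(seed)
    # P[s, a] is a probability distribution over next states.
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    # R[s, a] is a deterministic reward in [0, 1).
    R = rng.random((n_states, n_actions))
    return P, R


def exact_q_iteration(P, R, gamma=0.9, n_iters=500):
    """Oracle backup: uses the true P and R, so there is no sampling error."""
    Q = np.zeros_like(R)
    for _ in range(n_iters):
        V = Q.max(axis=1)        # greedy value of each state
        Q = R + gamma * (P @ V)  # expected Bellman optimality backup
    return Q


def sampled_q_learning(P, R, gamma=0.9, n_steps=20000, lr=0.1, seed=1):
    """Tabular Q-learning that only sees sampled transitions."""
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros_like(R)
    for _ in range(n_steps):
        # Uniform sampling over state-action pairs (an assumed sampling scheme).
        s = rng.integers(n_states)
        a = rng.integers(n_actions)
        s_next = rng.choice(n_states, p=P[s, a])
        target = R[s, a] + gamma * Q[s_next].max()
        Q[s, a] += lr * (target - Q[s, a])
    return Q


P, R = make_random_mdp()
Q_oracle = exact_q_iteration(P, R)
Q_sampled = sampled_q_learning(P, R)
# Both runs are tabular, so this gap comes from sampling noise and the
# constant step size rather than from function approximation.
print("max |Q_sampled - Q_oracle| =", np.abs(Q_sampled - Q_oracle).max())
```

Because both runs share the same table representation, the oracle result serves as a reference against which one source of error can be measured in isolation. Replacing the table with a neural network would add function approximation error on top, which is the kind of separation the abstract describes.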
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Reinforcement Learning in Robotics · Advancements in Semiconductor Devices and Circuit Design
Methods: Q-Learning
