Importance of using appropriate baselines for evaluation of data-efficiency in deep reinforcement learning for Atari
Kacper Kielak

TL;DR
This paper highlights the importance of using fair baselines in evaluating data-efficiency in deep reinforcement learning for Atari, demonstrating that recent improvements are often due to unfair experimental setups rather than novel methods.
Contribution
The study shows that recently reported data-efficiency gains in RL stem largely from unfair baselines, and it proposes a DQN modified to update its network more frequently as a fair baseline for future research.
Findings
Allowing more frequent network updates per data sample improves DQN's data efficiency.
Recent efficiency claims are often due to unfair experimental setups.
Modified DQN can match or outperform recent methods at lower complexity.
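To make the first finding concrete, the sketch below counts how many gradient updates an agent performs within a fixed budget of environment steps. It is only an illustration, not the paper's code: the function name `run_budget`, the parameter `steps_per_update`, the batch size, and the step budget are all hypothetical, and the transitions are random placeholders rather than Atari frames.

```python
import random


def run_budget(env_steps, steps_per_update, batch_size=32, seed=0):
    """Count updates made within a fixed budget of environment steps.

    Hypothetical helper for illustration only. `steps_per_update` is the
    number of environment steps between consecutive network updates, so a
    smaller value means more updates for the same amount of data.
    """
    if steps_per_update <= 0:
        raise ValueError("steps_per_update must be positive")

    rng = random.Random(seed)
    buffer = []          # stand-in for a replay buffer
    updates = 0
    samples_replayed = 0

    for step in range(1, env_steps + 1):
        buffer.append(rng.random())  # placeholder for one collected transition

        # Update only once enough data exists to fill a batch.
        if len(buffer) >= batch_size and step % steps_per_update == 0:
            batch = rng.sample(buffer, batch_size)
            updates += 1             # a real agent would take a gradient step here
            samples_replayed += len(batch)

    return {
        "updates": updates,
        "replays_per_env_step": samples_replayed / env_steps,
    }


if __name__ == "__main__":
    # Same data budget, two update schedules (values chosen for illustration).
    for k in (4, 1):
        print(k, run_budget(env_steps=1000, steps_per_update=k))
```

With the same number of environment steps, the schedule that updates every step performs roughly four times as many updates as the one that updates every fourth step. A comparison between two methods is only fair in the sense discussed here if this ratio is matched, or at least reported, for both.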
Abstract
Reinforcement learning (RL) has seen great advancements in the past few years. Nevertheless, the consensus among the RL community is that currently used methods, despite all their benefits, suffer from extreme data inefficiency, especially in rich visual domains like Atari. To circumvent this problem, novel approaches were introduced that often claim to be much more efficient than popular variations of the state-of-the-art DQN algorithm. In this paper, however, we demonstrate that the newly proposed techniques simply used unfair baselines in their experiments. Namely, we show that the actual improvement in efficiency came from allowing the algorithm to perform more training updates per data sample, and not from employing the new methods. By allowing DQN to execute network updates more frequently, we manage to reach similar or better results than the recently proposed advancement,…
Taxonomy
Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Evolutionary Algorithms and Applications
Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network
