Importance of using appropriate baselines for evaluation of data-efficiency in deep reinforcement learning for Atari

Kacper Kielak

arXiv:2003.10181 · cs.LG · April 1, 2020 · 1 citation

TL;DR

This paper highlights the importance of using fair baselines in evaluating data-efficiency in deep reinforcement learning for Atari, demonstrating that recent improvements are often due to unfair experimental setups rather than novel methods.

Contribution

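The study shows that recently reported data-efficiency gains in RL stem from unfair baselines, specifically from allowing more training updates per data sample, rather than from the new methods themselves. It proposes a modified DQN as a fair baseline for future research.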

Findings

01 Allowing more frequent updates improves DQN performance.

02 Recent efficiency claims are often due to unfair experimental setups.

03 Modified DQN can match or outperform recent methods at lower complexity.

Abstract

Reinforcement learning (RL) has seen great advancements in the past few years. Nevertheless, the consensus in the RL community is that currently used methods, despite all their benefits, suffer from extreme data inefficiency, especially in rich visual domains like Atari. To circumvent this problem, novel approaches were introduced that often claim to be much more efficient than popular variations of the state-of-the-art DQN algorithm. In this paper, however, we demonstrate that the newly proposed techniques simply used unfair baselines in their experiments. Namely, we show that the actual improvement in efficiency came from allowing the algorithm to perform more training updates per data sample, and not from employing the new methods. By allowing DQN to execute network updates more frequently, we manage to reach similar or better results than the recently proposed advancement,…
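
The effect described in the abstract, more training updates per collected sample, comes down to simple arithmetic on a fixed interaction budget. The sketch below is illustrative only: the function names, parameter names, and the budget of 100,000 environment steps are assumptions made for this example and are not taken from the paper. One update every 4 steps is used as a commonly seen DQN setting, not as the configuration of any specific baseline.

```python
def gradient_updates(env_steps: int, train_every: int, updates_per_train: int = 1) -> int:
    """Count gradient updates performed within a fixed environment-step budget.

    env_steps:         total environment interactions allowed (hypothetical budget)
    train_every:       number of environment steps between training phases
    updates_per_train: gradient updates executed in each training phase
    """
    if env_steps < 0:
        raise ValueError("env_steps must be non-negative")
    if train_every <= 0 or updates_per_train <= 0:
        raise ValueError("train_every and updates_per_train must be positive")
    return (env_steps // train_every) * updates_per_train


def updates_per_sample(env_steps: int, train_every: int, updates_per_train: int = 1) -> float:
    """Average number of gradient updates per collected environment sample."""
    if env_steps <= 0:
        raise ValueError("env_steps must be positive")
    return gradient_updates(env_steps, train_every, updates_per_train) / env_steps


# Hypothetical budget, chosen only for illustration.
BUDGET = 100_000

# Baseline-style schedule: one update every 4 environment steps.
sparse = gradient_updates(BUDGET, train_every=4)

# More frequent schedule: one update after every environment step.
dense = gradient_updates(BUDGET, train_every=1)

print(f"updates with train_every=4: {sparse}")
print(f"updates with train_every=1: {dense}")
print(f"ratio: {dense / sparse:.1f}x")
```

Under these assumed numbers, both agents see exactly the same amount of data, yet one performs four times as many gradient updates. A comparison is only fair if the baseline is given the same update budget as the method it is compared against.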

Taxonomy

Topics: Reinforcement Learning in Robotics · Neural dynamics and brain function · Evolutionary Algorithms and Applications

Methods: Q-Learning · Dense Connections · Convolution · Deep Q-Network