A Deeper Look at Experience Replay
Shangtong Zhang, Richard S. Sutton

TL;DR
This paper systematically studies experience replay in deep reinforcement learning, showing that a large replay buffer can harm performance and proposing a simple O(1) method to mitigate this, validated on a grid world and on Atari games.
Contribution
It provides a comprehensive empirical analysis of experience replay, highlighting the importance of buffer size and introducing an effective O(1) remedy.
Findings
Large replay buffers can significantly degrade performance.
A simple O(1) method effectively mitigates negative effects of large buffers.
The proposed method improves results in both simple and complex RL environments.
Abstract
Experience replay is now widely used in deep reinforcement learning (RL) algorithms; in this paper we rethink its utility. Experience replay introduces a new hyper-parameter, the memory buffer size, which needs careful tuning. Unfortunately, the importance of this hyper-parameter has long been underestimated in the community. We conduct a systematic empirical study of experience replay under various function representations and show that a large replay buffer can significantly hurt performance. Moreover, we propose a simple O(1) method to remedy the negative influence of a large replay buffer, and we demonstrate its utility in both a simple grid world and challenging domains such as Atari games.
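The abstract does not spell out the O(1) method. As a rough illustration only, the sketch below assumes the remedy is the one the paper calls combined experience replay: every sampled minibatch always includes the most recent transition, and the remaining slots are drawn uniformly from the buffer. The class name `ReplayBuffer`, the `include_latest` flag, and the ring-buffer layout are hypothetical choices made for this example, not the authors' code.

```python
import random


class ReplayBuffer:
    """Fixed-capacity ring buffer with uniform sampling (illustrative sketch)."""

    def __init__(self, capacity, seed=None):
        self.capacity = capacity      # the buffer-size hyper-parameter under study
        self.storage = []             # transitions, overwritten oldest-first once full
        self.next_idx = 0             # slot the next transition will be written to
        self.latest_idx = None        # slot holding the most recent transition
        self.rng = random.Random(seed)

    def add(self, transition):
        if len(self.storage) < self.capacity:
            self.storage.append(transition)
        else:
            self.storage[self.next_idx] = transition
        self.latest_idx = self.next_idx
        self.next_idx = (self.next_idx + 1) % self.capacity

    def sample(self, batch_size, include_latest=False):
        """Return a minibatch, sampled uniformly with replacement.

        Assumption: with include_latest=True, one slot is reserved for the
        newest transition. That costs one extra list lookup per batch, so the
        overhead is O(1) regardless of buffer size.
        """
        if not self.storage:
            raise ValueError("cannot sample from an empty buffer")
        n_uniform = batch_size - 1 if include_latest else batch_size
        batch = [self.storage[self.rng.randrange(len(self.storage))]
                 for _ in range(n_uniform)]
        if include_latest:
            batch.append(self.storage[self.latest_idx])
        return batch
```

With a very large buffer, a freshly collected transition may wait a long time before uniform sampling picks it; reserving one slot for it guarantees that each new transition is used for learning at least once, immediately, without changing how the rest of the batch is drawn.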
Taxonomy
Topics: Reinforcement Learning in Robotics · Mind wandering and attention · Advanced Bandit Algorithms Research
Methods: Experience Replay
