Investigating the Interplay of Prioritized Replay and Generalization
Parham Mohammad Panahi, Andrew Patterson, Martha White, Adam White

TL;DR
This paper examines how prioritized replay influences learning in reinforcement learning. It finds that the benefits seen in tabular settings do not reliably carry over to neural networks, and it proposes improvements for specific settings.
Contribution
It provides a detailed analysis of prioritized experience replay's effectiveness across different scenarios and introduces enhancements for its application in tabular and noisy environments.
Findings
PER improves value propagation in tabular settings
PER does not consistently outperform uniform replay with neural networks
Mitigations can reduce error spikes but often do not surpass uniform replay
Abstract
Experience replay, the reuse of past data to improve sample efficiency, is ubiquitous in reinforcement learning. Though a variety of smart sampling schemes have been introduced to improve performance, uniform sampling by far remains the most common approach. One exception is Prioritized Experience Replay (PER), where sampling is done proportionally to TD errors, inspired by the success of prioritized sweeping in dynamic programming. The original work on PER showed improvements in Atari, but follow-up results were mixed. In this paper, we investigate several variations on PER, to attempt to understand where and when PER may be useful. Our findings in prediction tasks reveal that while PER can improve value propagation in tabular settings, behavior is significantly different when combined with neural networks. Certain mitigations like delaying target network updates to control…
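To make "sampling is done proportionally to TD errors" concrete, here is a minimal sketch of proportional prioritization using only the Python standard library. The exponent `alpha`, the small constant `eps`, and the function names are illustrative assumptions taken from the commonly used proportional variant of PER; they are not details confirmed by this summary, and the paper's own variants may differ.

```python
import random

def per_probabilities(td_errors, alpha=1.0, eps=1e-6):
    """Return sampling probabilities proportional to (|TD error| + eps) ** alpha.

    alpha and eps are assumed hyperparameters: alpha=0 recovers uniform
    sampling, and eps keeps zero-error transitions sampleable.
    """
    priorities = [(abs(delta) + eps) ** alpha for delta in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]

def sample_indices(td_errors, batch_size, rng, alpha=1.0):
    """Draw buffer indices (with replacement) according to the priorities."""
    probs = per_probabilities(td_errors, alpha=alpha)
    return rng.choices(range(len(td_errors)), weights=probs, k=batch_size)

# Hypothetical TD errors for a four-transition buffer.
rng = random.Random(0)
td_errors = [0.0, 0.1, 0.5, 2.0]

probs = per_probabilities(td_errors)                      # prioritized
uniform_probs = per_probabilities(td_errors, alpha=0.0)   # uniform baseline
batch = sample_indices(td_errors, batch_size=8, rng=rng)
```

Setting `alpha=0.0` gives every transition the same probability, which is the uniform-replay baseline the abstract compares against. Transitions with larger absolute TD error are drawn more often as `alpha` grows.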
Taxonomy
Topics: Communication in Education and Healthcare
Methods: Prioritized Sweeping · Prioritized Experience Replay · Experience Replay
