Understanding and Mitigating the Limitations of Prioritized Experience Replay

Yangchen Pan; Jincheng Mei; Amir-massoud Farahmand; Martha White; Hengshuai Yao; Mohsen Rohani; Jun Luo

arXiv:2007.09569 · cs.AI · June 14, 2022 · 6 cites

TL;DR

This paper analyzes the theoretical foundations of Prioritized Experience Replay, revealing its benefits and limitations, and proposes a model-based sampling method to address these issues, with experiments demonstrating improved performance.

Contribution

It provides a theoretical understanding of prioritized ER, identifies its limitations, and introduces a model-based sampling approach to mitigate these issues.

Findings

01

Prioritized ER converges faster than uniform sampling during early learning.

02

Outdated priorities and insufficient coverage of the sample space are its two key limitations.

03

Model-based sampling closely approximates ideal prioritized sampling.
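
The abstract names stochastic gradient Langevin dynamics as the basis of the model-based sampling method. As a rough illustration of the underlying update only, and not the authors' algorithm, the sketch below runs plain (unadjusted) Langevin dynamics on a toy 1-D Gaussian target. The function names, the step size, and the Gaussian target are all assumptions chosen for illustration.

```python
import numpy as np


def grad_log_density(x, mean=2.0, std=1.0):
    """Gradient of log N(mean, std^2) at x (toy target, an assumption)."""
    return -(x - mean) / std**2


def langevin_samples(n_chains=2000, n_steps=500, step_size=0.05, seed=0):
    """Run independent unadjusted Langevin chains; return their final states."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_chains)  # arbitrary starting points
    for _ in range(n_steps):
        noise = rng.normal(size=n_chains)
        # Langevin step: drift up the log-density, then add Gaussian noise.
        x = x + step_size * grad_log_density(x) + np.sqrt(2.0 * step_size) * noise
    return x


samples = langevin_samples()
print(samples.mean(), samples.std())
```

With a small step size, the final states should be distributed roughly like the target, here a Gaussian centered at 2 with unit spread. A finite step size introduces a small bias in the spread.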

Abstract

Prioritized Experience Replay (ER) has been empirically shown to improve sample efficiency across many domains and has attracted great attention; however, there is little theoretical understanding of why such prioritized sampling helps or of its limitations. In this work, we take a deep look at prioritized ER. In a supervised learning setting, we show the equivalence between the error-based prioritized sampling method for mean squared error and uniform sampling for cubic power loss. We then provide theoretical insight into why it improves the convergence rate over uniform sampling during early learning. Based on this insight, we further point out two limitations of the prioritized ER method: 1) outdated priorities and 2) insufficient coverage of the sample space. To mitigate these limitations, we propose a model-based stochastic gradient Langevin dynamics sampling method. We show that our…
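
The equivalence mentioned in the abstract can be checked numerically on a small example. The sketch below assumes a linear model, priorities proportional to the absolute error |δ|, and per-sample losses scaled as (1/2)δ² and (1/3)|δ|³; these scalings and all function names are illustrative assumptions, not taken from the paper. Under those assumptions, the expected mean-squared-error gradient under prioritized sampling equals the uniform cubic-loss gradient up to a positive scalar factor.

```python
import numpy as np


def prioritized_mse_gradient(theta, X, y):
    """Expected gradient of (1/2)*delta^2 when sampling with p_i ∝ |delta_i|."""
    delta = X @ theta - y
    probs = np.abs(delta) / np.abs(delta).sum()
    per_sample = delta[:, None] * X  # gradient of each sample's MSE loss
    return probs @ per_sample


def uniform_cubic_gradient(theta, X, y):
    """Mean gradient of (1/3)*|delta|^3 under uniform sampling."""
    delta = X @ theta - y
    per_sample = (np.abs(delta) * delta)[:, None] * X
    return per_sample.mean(axis=0)


def compare_gradients(seed=0, n=50, d=3):
    """Return both gradients, with the cubic one rescaled by n / sum|delta|."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    y = rng.normal(size=n)
    theta = rng.normal(size=d)
    scale = n / np.abs(X @ theta - y).sum()
    return (prioritized_mse_gradient(theta, X, y),
            scale * uniform_cubic_gradient(theta, X, y))


g_prioritized, g_cubic_rescaled = compare_gradients()
print(np.allclose(g_prioritized, g_cubic_rescaled))
```

The rescaling factor n / Σ|δ| is positive, so the two gradients always point in the same direction and differ only in effective step size.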

Taxonomy

Topics: Age of Information Optimization · Neural dynamics and brain function · Functional Brain Connectivity Studies

Methods: Prioritized Experience Replay · Experience Replay