Attention Loss Adjusted Prioritized Experience Replay
Zhuoying Chen, Huiping Li, Rizhong Wang

TL;DR
This paper introduces ALAP, an enhanced experience replay method that uses self-attention and double-sampling to reduce estimation errors in deep reinforcement learning, improving training efficiency.
Contribution
The paper proposes ALAP, a novel prioritized experience replay algorithm combining self-attention and double-sampling to regulate importance weights and mitigate sampling bias.
Findings
ALAP improves training efficiency across various RL algorithms.
ALAP reduces estimation error caused by non-uniform sampling.
ALAP demonstrates versatility in different RL environments.
Abstract
Prioritized Experience Replay (PER) is a deep reinforcement learning technique that selects the most informative experience samples to speed up neural network training. However, the non-uniform sampling used in PER inevitably shifts the state-action distribution and introduces estimation error in the Q-value function. This paper proposes an Attention Loss Adjusted Prioritized (ALAP) Experience Replay algorithm, which integrates an improved Self-Attention network with a Double-Sampling mechanism to fit the hyperparameter that regulates the importance sampling weights, with the aim of eliminating the estimation error caused by PER. To verify the effectiveness and generality of the algorithm, ALAP is tested with value-function-based, policy-gradient-based and multi-agent reinforcement learning algorithms in OpenAI Gym, and comparison studies verify the…
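To make the role of the importance sampling weights concrete, the following is a minimal sketch of standard PER sampling, not of ALAP itself. It assumes the usual PER formulation, in which a transition with TD error delta_i is sampled with probability proportional to (|delta_i| + eps)^alpha and its loss is scaled by (N * P(i))^(-beta). The function names and the values of alpha, beta, and eps are illustrative assumptions. In particular, beta is passed in as a fixed number here, whereas the abstract describes ALAP as fitting this hyperparameter with a Self-Attention network and a Double-Sampling mechanism.

```python
import random


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Sampling probabilities P(i) = p_i^alpha / sum_k p_k^alpha,
    where p_i = |td_error_i| + eps (proportional prioritization)."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def importance_weights(probs, beta):
    """Importance sampling weights w_i = (N * P(i))^(-beta),
    divided by the largest weight so they only scale losses down."""
    n = len(probs)
    raw = [(n * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


def sample_batch(td_errors, batch_size, beta, alpha=0.6, rng=None):
    """Draw indices non-uniformly and return them with their weights.

    beta is a plain argument in this sketch (an assumption); ALAP is
    described as learning it rather than fixing or annealing it.
    """
    rng = rng or random.Random(0)
    probs = per_probabilities(td_errors, alpha)
    weights = importance_weights(probs, beta)
    indices = rng.choices(range(len(td_errors)), weights=probs, k=batch_size)
    return indices, [weights[i] for i in indices]


if __name__ == "__main__":
    errors = [0.1, 0.5, 2.0, 0.05]  # hypothetical TD errors
    idx, w = sample_batch(errors, batch_size=8, beta=0.4)
    print(idx, w)
```

With beta = 0 every weight is 1 and the sampling bias is left uncorrected. With beta = 1 the weights fully compensate for the non-uniform sampling probabilities, which is why the choice of this hyperparameter controls how much estimation error remains.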
Taxonomy
Topics: Age of Information Optimization
Methods: Experience Replay
