DQN Performance with Epsilon Greedy Policies and Prioritized Experience Replay
Daniel Perkins, Oscar J. Escobar, Luke Green

TL;DR
This paper investigates how epsilon-greedy exploration schedules and prioritized experience replay influence the performance of Deep Q-Networks, providing insights and practical recommendations for improving reinforcement learning efficiency.
Contribution
It systematically analyzes the effects of exploration decay and replay strategies on DQN learning, highlighting their interactions and practical implications.
Findings
Prioritized experience replay accelerates convergence.
Epsilon decay schedules significantly impact learning efficiency.
Trade-offs exist between exploration strategies and replay methods.
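To make the epsilon decay finding concrete, here is a minimal sketch of two common decay schedules paired with epsilon-greedy action selection. The page does not say which schedules the paper evaluates, so the function names, the linear and exponential forms, and every default value (eps_start, eps_end, decay_steps, decay_rate) are illustrative assumptions.

```python
import random


def linear_epsilon(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps (assumed values)."""
    frac = min(step / decay_steps, 1.0)
    return eps_end + (1.0 - frac) * (eps_start - eps_end)


def exponential_epsilon(step, eps_start=1.0, eps_end=0.05, decay_rate=0.999):
    """Decay epsilon geometrically toward eps_end (assumed values)."""
    return eps_end + (eps_start - eps_end) * decay_rate ** step


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

A slower decay keeps the agent exploring longer at the cost of delayed exploitation, which is the kind of trade-off the findings refer to.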
Abstract
We present a detailed study of Deep Q-Networks in finite environments, emphasizing the impact of epsilon-greedy exploration schedules and prioritized experience replay. Through systematic experimentation, we evaluate how variations in epsilon decay schedules affect learning efficiency, convergence behavior, and reward optimization. We investigate how prioritized experience replay leads to faster convergence and higher returns, and we present empirical results comparing no-replay, uniform, and prioritized strategies across multiple simulations. Our findings illuminate the trade-offs and interactions between exploration strategies and memory management in DQN training, offering practical recommendations for robust reinforcement learning in resource-constrained settings.
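The replay strategies compared in the abstract can be illustrated with a small buffer. The sketch below assumes the widely used proportional variant of prioritized replay, in which transition i is sampled with probability p_i^alpha / sum_k p_k^alpha and corrected with importance-sampling weights (N * P(i))^(-beta). The abstract does not state which variant the paper uses, so the class name, the alpha, beta, and eps defaults, and the per-batch weight normalization are all assumptions. Setting alpha=0 recovers uniform replay.

```python
import random


class PrioritizedReplayBuffer:
    """Hypothetical proportional prioritized replay buffer (list-based, O(N) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha      # 0 = uniform sampling, 1 = fully proportional
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current max priority so they are seen at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the batch maximum (an assumption).
        n = len(self.data)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, the returned weights would scale each sample's loss, and update_priorities would be called with the fresh TD errors after each gradient step. Practical implementations usually replace the lists with a sum tree so that sampling runs in logarithmic time.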
Taxonomy
Topics: Age of Information Optimization · Software-Defined Networks and 5G · Reinforcement Learning in Robotics
