DQN Performance with Epsilon Greedy Policies and Prioritized Experience Replay

Daniel Perkins; Oscar J. Escobar; Luke Green

arXiv:2511.03670·cs.LG·November 6, 2025

TL;DR

This paper investigates how epsilon-greedy exploration schedules and prioritized experience replay influence the performance of Deep Q-Networks, providing insights and practical recommendations for improving reinforcement learning efficiency.

Contribution

The paper systematically analyzes how exploration decay and replay strategies affect DQN learning, highlighting their interactions and practical implications.

Findings

01 Prioritized experience replay accelerates convergence.

02 Epsilon decay schedules significantly impact learning efficiency.

03 Trade-offs exist between exploration strategies and replay methods.
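
To make the second finding concrete, the following is a minimal sketch of two common epsilon decay schedules and of epsilon-greedy action selection. The schedule shapes (linear and exponential), the function names, and all default values (`eps_start`, `eps_end`, `decay_steps`, `decay_rate`) are illustrative assumptions and are not taken from the paper.

```python
import math
import random


def linear_epsilon(step, eps_start=1.0, eps_end=0.05, decay_steps=10_000):
    """Linearly anneal epsilon from eps_start to eps_end over decay_steps (assumed values)."""
    frac = min(step / decay_steps, 1.0)
    return eps_start + frac * (eps_end - eps_start)


def exponential_epsilon(step, eps_start=1.0, eps_end=0.05, decay_rate=1e-3):
    """Exponentially decay epsilon toward eps_end (assumed values)."""
    return eps_end + (eps_start - eps_end) * math.exp(-decay_rate * step)


def epsilon_greedy_action(q_values, epsilon, rng=random):
    """With probability epsilon pick a uniformly random action, otherwise the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice: index of the largest estimated Q-value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

A faster decay commits to the current Q-estimates sooner, while a slower decay spends more steps on random actions; comparing schedules like these under a fixed step budget is one way to study that trade-off.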

Abstract

We present a detailed study of Deep Q-Networks in finite environments, emphasizing the impact of epsilon-greedy exploration schedules and prioritized experience replay. Through systematic experimentation, we evaluate how variations in epsilon decay schedules affect learning efficiency, convergence behavior, and reward optimization. We investigate how prioritized experience replay leads to faster convergence and higher returns, and we show empirical results comparing no-replay, uniform-replay, and prioritized-replay strategies across multiple simulations. Our findings illuminate the trade-offs and interactions between exploration strategies and memory management in DQN training, offering practical recommendations for robust reinforcement learning in resource-constrained settings.
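
As an illustration of the replay strategy the abstract refers to, here is a minimal sketch of a prioritized replay buffer. It assumes the widely used proportional variant, where transition i is sampled with probability p_i^alpha / sum_k p_k^alpha and corrected with importance weights (N * P(i))^(-beta); the abstract does not say which variant the authors used. The class name, the defaults for `alpha`, `beta`, and `eps`, and the simple list storage (instead of a sum-tree) are all assumptions made for readability.

```python
import random


class PrioritizedReplayBuffer:
    """List-based proportional prioritized replay (hypothetical sketch, O(N) sampling)."""

    def __init__(self, capacity, alpha=0.6, eps=1e-5):
        self.capacity = capacity
        self.alpha = alpha      # how strongly priorities skew sampling (0 = uniform)
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0            # next slot to overwrite once the buffer is full

    def add(self, transition):
        # New transitions get the current max priority so they are sampled at least once.
        max_p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(max_p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = max_p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4, rng=random):
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        # Importance-sampling weights, normalized by the largest weight in this batch.
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        max_w = max(weights)
        weights = [w / max_w for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # Priority is the absolute TD error plus a small constant.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

In a training loop, the sampled weights would typically scale each transition's loss term, and `update_priorities` would be called with the fresh TD errors after the gradient step. Setting `alpha=0` recovers uniform replay, which gives a convenient baseline for the kind of comparison the abstract describes.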

Taxonomy

Topics: Age of Information Optimization · Software-Defined Networks and 5G · Reinforcement Learning in Robotics