DQN Performance with Epsilon Greedy Policies and Prioritized Experience Replay
By: Daniel Perkins, Oscar J. Escobar, Luke Green
Potential Business Impact:
Teaches computers to learn faster and better.
We present a detailed study of Deep Q-Networks (DQNs) in finite environments, emphasizing the impact of epsilon-greedy exploration schedules and prioritized experience replay. Through systematic experimentation, we evaluate how variations in the epsilon decay schedule affect learning efficiency, convergence behavior, and reward optimization. We investigate how prioritized experience replay leads to faster convergence and higher returns, presenting empirical results that compare uniform, prioritized, and no-replay strategies across multiple simulations. Our findings illuminate the trade-offs and interactions between exploration strategies and memory management in DQN training, offering practical recommendations for robust reinforcement learning in resource-constrained settings.
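To make the two components compared in the abstract concrete, below is a minimal Python sketch (not the authors' implementation) of an exponentially decaying epsilon-greedy schedule and a proportional prioritized replay buffer with importance-sampling weights, in the style of Schaul et al. (2016). The hyperparameter names (eps_start, eps_end, decay_rate, alpha, beta) are illustrative assumptions, not values taken from the paper.

```python
# Illustrative sketch only: epsilon decay schedule + proportional prioritized replay.
import numpy as np


def epsilon_by_step(step, eps_start=1.0, eps_end=0.05, decay_rate=1e-3):
    """Exponential decay: epsilon starts at eps_start and anneals toward eps_end."""
    return eps_end + (eps_start - eps_end) * np.exp(-decay_rate * step)


class PrioritizedReplayBuffer:
    """Proportional prioritization with a flat priority array (assumed structure)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha          # how strongly priorities skew sampling (0 = uniform)
        self.buffer = []            # stored (state, action, reward, next_state, done) tuples
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so they are replayed at least once.
        max_prio = self.priorities.max() if self.buffer else 1.0
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
        else:
            self.buffer[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[: len(self.buffer)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.buffer), batch_size, p=probs)
        # Importance-sampling weights correct the bias from non-uniform sampling.
        weights = (len(self.buffer) * probs[idx]) ** (-beta)
        weights /= weights.max()
        batch = [self.buffer[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors, eps=1e-6):
        # Priority is the TD-error magnitude plus a small constant to keep it nonzero.
        self.priorities[idx] = np.abs(td_errors) + eps
```

Setting alpha to 0 recovers uniform sampling, which is one way the uniform and prioritized strategies in the abstract can be compared under otherwise identical training conditions.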
Similar Papers
Uncertainty Prioritized Experience Replay
Machine Learning (CS)
Teaches robots to learn from important mistakes.
Optimization of Epsilon-Greedy Exploration
Machine Learning (CS)
Finds best way to show new things to people.
Sample Efficient Experience Replay in Non-stationary Environments
Machine Learning (CS)
Teaches robots to learn faster when things change.