Ensemble Elastic DQN: A novel multi-step ensemble approach to address overestimation in deep value-based reinforcement learning
By: Adrian Ly, Richard Dazeley, Peter Vamplew, and more
Potential Business Impact:
Helps AI agents learn to play games faster and more reliably.
While many algorithmic extensions to Deep Q-Networks (DQN) have been proposed, there remains limited understanding of how different improvements interact. In particular, multi-step and ensemble-style extensions have shown promise in reducing overestimation bias, thereby improving sample efficiency and algorithmic stability. In this paper, we introduce a novel algorithm called Ensemble Elastic Step DQN (EEDQN), which unifies ensembles with elastic step updates to stabilise algorithmic performance. EEDQN is designed to address two major challenges in deep reinforcement learning: overestimation bias and sample inefficiency. We evaluated EEDQN against standard and ensemble DQN variants across the MinAtar benchmark, a set of environments that emphasise behavioural learning while reducing representational complexity. Our results show that EEDQN achieves consistently robust performance across all tested environments, outperforming baseline DQN methods and matching or exceeding state-of-the-art ensemble DQNs in final returns on most MinAtar environments. These findings highlight the potential of systematically combining algorithmic improvements and provide evidence that ensemble and multi-step methods, when carefully integrated, can yield substantial gains.
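To make the two ingredients the abstract combines more concrete, the sketch below shows an ensemble of Q-networks whose aggregated estimate forms a multi-step bootstrap target. This is an illustrative sketch, not the authors' implementation: the paper's elastic step schedule is not described in the abstract, so the fixed step count n, the ensemble size, and the choice of averaging the heads' estimates are all assumptions made here for illustration.

```python
# Illustrative sketch only: an ensemble of Q-networks combined with an
# n-step bootstrap target. EEDQN's elastic step schedule is NOT
# reproduced; the fixed n, ensemble size, and mean aggregation are
# assumptions chosen for this example.
import torch
import torch.nn as nn


class QNet(nn.Module):
    """A small fully connected Q-network head."""

    def __init__(self, obs_dim, n_actions, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, x):
        return self.net(x)


def ensemble_nstep_target(ensemble, rewards, next_obs, done, gamma=0.99):
    """n-step target: discounted reward sum plus an ensemble-averaged
    bootstrap value, which dampens the overestimation of any single head.

    rewards:  (batch, n) rewards along the sampled n-step segment.
    next_obs: (batch, obs_dim) observation after the n-th step.
    done:     (batch,) 1.0 if the segment ended in a terminal state.
    """
    n = rewards.shape[1]
    discounts = gamma ** torch.arange(n, dtype=rewards.dtype)
    nstep_return = (rewards * discounts).sum(dim=1)
    with torch.no_grad():
        # Average the heads' greedy values (one simple aggregation rule;
        # the paper's exact rule may differ).
        q_next = torch.stack(
            [q(next_obs).max(dim=1).values for q in ensemble]
        )
        bootstrap = q_next.mean(dim=0)
    return nstep_return + (gamma ** n) * (1.0 - done) * bootstrap


if __name__ == "__main__":
    obs_dim, n_actions, batch, n_steps = 4, 3, 8, 3
    heads = [QNet(obs_dim, n_actions) for _ in range(5)]
    rewards = torch.rand(batch, n_steps)
    next_obs = torch.randn(batch, obs_dim)
    done = torch.zeros(batch)
    print(ensemble_nstep_target(heads, rewards, next_obs, done).shape)
```

Averaging (or taking a pessimistic statistic of) several independently trained heads is the standard way ensembles counteract the max-operator's upward bias, while the multi-step return shortens the bootstrap horizon; the abstract's "elastic" step presumably adapts n rather than fixing it as done above.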
Similar Papers
Two is Better than One: Efficient Ensemble Defense for Robust and Compact Models
CV and Pattern Recognition
Makes AI smarter and faster on phones.
Dual Ensembled Multiagent Q-Learning with Hypernet Regularizer
Multiagent Systems
Makes AI teams learn and work together better.
Double Q-learning for Value-based Deep Reinforcement Learning, Revisited
Machine Learning (CS)
Helps AI agents learn games faster and more reliably.