Approximation to Deep Q-Network by Stochastic Delay Differential Equations
By: Jianya Lu, Yingjun Mo
Potential Business Impact:
Explains why a widely used AI training method stays stable, making such learning systems more stable and predictable.
Despite the significant breakthroughs that the Deep Q-Network (DQN) has brought to reinforcement learning, its theoretical analysis remains limited. In this paper, we construct a stochastic delay differential equation (SDDE) based on the DQN algorithm and estimate the Wasserstein-1 distance between the DQN iterates and the SDDE solution. We derive an upper bound on this distance and prove that it converges to zero as the step size approaches zero. This result lets us understand DQN's two key techniques, experience replay and the target network, from the perspective of continuous systems. In particular, the delay term in the equation, which corresponds to the target network, contributes to the stability of the system. Our approach leverages a refined Lindeberg principle and an operator comparison to establish these results.
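To make the delay correspondence concrete, the toy sketch below shows a DQN-style update in which the temporal-difference target is computed from a frozen target network that is synchronized only every few steps, so each update depends on a lagged copy of the parameters. The linear Q-function, reward, noise scale, and all constants here are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Toy sketch of a DQN-style parameter update (illustrative assumptions only).
rng = np.random.default_rng(0)
dim, eta, sync_every, n_steps = 4, 0.05, 20, 200

theta = rng.normal(size=dim)          # online network parameters
theta_target = theta.copy()           # target network: a lagged copy of theta
replay = rng.normal(size=(100, dim))  # toy "experience replay" buffer of features

for k in range(n_steps):
    x = replay[rng.integers(len(replay))]  # sample a stored transition
    y = 0.1 + x @ theta_target             # TD target uses the *frozen* copy
    td_error = x @ theta - y
    grad = td_error * x                    # gradient of 0.5 * td_error**2 in theta
    noise = 0.01 * rng.normal(size=dim)    # minibatch-style stochasticity
    theta = theta - eta * (grad + noise)   # noisy gradient step of size eta

    if (k + 1) % sync_every == 0:
        theta_target = theta.copy()        # periodic sync introduces the delay
```

Because `theta_target` always equals `theta` from up to `sync_every` steps earlier, the discrete dynamics are driven by a delayed state. The paper's result is that, as the step size shrinks, such dynamics stay close in Wasserstein-1 distance to an SDDE whose delay term plays exactly this stabilizing, target-network role.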
Similar Papers
Universal Approximation Theorem of Deep Q-Networks
Machine Learning (CS)
Makes AI learn better from continuous data.
DDEQs: Distributional Deep Equilibrium Models through Wasserstein Gradient Flows
Machine Learning (CS)
Helps computers understand point clouds, i.e., data given as groups of dots.
A new architecture of high-order deep neural networks that learn martingales
Machine Learning (CS)
Makes computer trading models more accurate.