The Influence of the Memory Capacity of Neural DDEs on the Universal Approximation Property
By: Christian Kuehn, Sara-Viola Kuntz
Potential Business Impact:
Shows how much memory a neural network needs before it can learn any function.
Neural Ordinary Differential Equations (Neural ODEs), which are the continuous-time analog of Residual Neural Networks (ResNets), have gained significant attention in recent years. Similarly, Neural Delay Differential Equations (Neural DDEs) can be interpreted as an infinite-depth limit of Densely Connected Residual Neural Networks (DenseResNets). In contrast to traditional ResNet architectures, DenseResNets are feed-forward networks that allow shortcut connections across all layers. These additional connections introduce memory into the network architecture, as is typical in many modern architectures. In this work, we explore how the memory capacity of Neural DDEs influences the universal approximation property. The key parameter for studying the memory capacity is the product $K\tau$ of the Lipschitz constant $K$ of the vector field and the delay $\tau$ of the DDE. In the case of non-augmented architectures, where the network width is not larger than the input and output dimensions, Neural ODEs and classical feed-forward neural networks cannot have the universal approximation property. We show that if the memory capacity $K\tau$ is sufficiently small, the dynamics of the Neural DDE can be approximated by a Neural ODE; consequently, non-augmented Neural DDEs with a small memory capacity also lack the universal approximation property. In contrast, if the memory capacity $K\tau$ is sufficiently large, we establish the universal approximation property of Neural DDEs for continuous functions. If the Neural DDE architecture is augmented, the parameter regions in which universal approximation is possible can be enlarged. Overall, our results show that the infinite-dimensional phase space of DDEs with positive delay $\tau>0$ is not by itself sufficient to guarantee universal approximation; rather, universal approximation only holds once the memory capacity $K\tau$ exceeds a certain threshold.
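To make the contrast in the abstract concrete, the following minimal Python sketch integrates a toy Neural DDE $\dot h(t) = f_\theta(h(t), h(t-\tau))$ with a constant initial history using explicit Euler. The vector field f_theta, the weight matrices W1 and W2, and the integration scheme are illustrative assumptions for this sketch, not the authors' construction; the Lipschitz constant $K$ of f_theta is governed by the norms of W1 and W2, so $K\tau$ plays the role of the memory capacity discussed above, and taking $\tau = 0$ (so the delayed state coincides with the current one) recovers an ordinary Neural ODE step.

```python
import numpy as np

def f_theta(h, h_delayed, W1, W2, b):
    # Toy vector field depending on the current state h(t) and the delayed
    # state h(t - tau); its Lipschitz constant K is controlled by the norms
    # of W1 and W2, so K * tau corresponds to the memory capacity.
    return np.tanh(W1 @ h + W2 @ h_delayed + b)

def neural_dde_forward(x, W1, W2, b, tau=0.5, T=1.0, dt=0.01):
    # Explicit Euler for h'(t) = f_theta(h(t), h(t - tau)) on [0, T],
    # with constant initial history h(s) = x for s in [-tau, 0].
    n_delay = int(round(tau / dt))                     # steps spanned by the delay
    history = [x.copy() for _ in range(n_delay + 1)]   # states at times -tau, ..., 0
    h = x.copy()
    for _ in range(int(round(T / dt))):
        h_delayed = history[-(n_delay + 1)]            # state at time t - tau
        h = h + dt * f_theta(h, h_delayed, W1, W2, b)
        history.append(h.copy())
    return h  # a linear read-out layer would typically be applied to h(T)

# Usage: a 2-dimensional (non-augmented) state with random weights.
rng = np.random.default_rng(0)
d = 2
W1, W2, b = rng.normal(size=(d, d)), rng.normal(size=(d, d)), rng.normal(size=d)
print(neural_dde_forward(np.array([1.0, -0.5]), W1, W2, b))
```

The history buffer is what distinguishes the DDE from the ODE case: the state of the system at any time is an entire function segment on $[t-\tau, t]$ rather than a single vector, which is the infinite-dimensional phase space the abstract refers to.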
Similar Papers
Approximation properties of neural ODEs
Numerical Analysis
Makes smart computer programs learn better.
Universal approximation property of neural stochastic differential equations
Probability
Neural networks can copy complex math problems.
Quantitative Flow Approximation Properties of Narrow Neural ODEs
Optimization and Control
Helps computers learn faster by simplifying how they think.