Dynamically Learning to Integrate in Recurrent Neural Networks
By: Blake Bordelon, Jordan Cotler, Cengiz Pehlevan, and more
Potential Business Impact:
Teaches computers to remember things for longer.
Learning to remember over long timescales is fundamentally challenging for recurrent neural networks (RNNs). While much prior work has explored why RNNs struggle to learn long timescales and how to mitigate this, we still lack a clear understanding of the dynamics involved when RNNs learn long timescales via gradient descent. Here we build a mathematical theory of the learning dynamics of linear RNNs trained to integrate white noise. We show that when the initial recurrent weights are small, the dynamics of learning are described by a low-dimensional system that tracks a single outlier eigenvalue of the recurrent weights. This reveals the precise manner in which the long timescale associated with white noise integration is learned. We extend our analyses to RNNs learning a damped oscillatory filter, and find rich dynamical equations for the evolution of a conjugate pair of outlier eigenvalues. Taken together, our analyses build a rich mathematical framework for studying dynamical learning problems salient for both machine learning and neuroscience.
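To make the setup concrete, here is a minimal sketch (not the authors' code) of the task the abstract describes: a linear RNN with small random initial recurrent weights is trained by gradient descent to output the running integral (cumulative sum) of a white-noise input, while the largest real part among the eigenvalues of the recurrent matrix is monitored as it approaches 1. The hidden size, learning rate, sequence length, and choice of training only the recurrent weights are illustrative assumptions, written in JAX.

```python
import jax
import jax.numpy as jnp
import numpy as np

# Illustrative hyperparameters (assumptions, not from the paper)
N = 32        # hidden units
T = 100       # sequence length
lr = 1e-2     # learning rate
steps = 500   # gradient-descent steps
g = 0.1       # small initial gain on the recurrent weights

key = jax.random.PRNGKey(0)
k1, k2, k3 = jax.random.split(key, 3)
W0 = g * jax.random.normal(k1, (N, N)) / jnp.sqrt(N)  # small initial recurrent weights
b = jax.random.normal(k2, (N,)) / jnp.sqrt(N)          # fixed input weights
c = jax.random.normal(k3, (N,)) / jnp.sqrt(N)          # fixed readout weights

def loss(W, x):
    """MSE between the RNN readout and the running sum of the white-noise input x."""
    def step(h, xt):
        h = W @ h + b * xt
        return h, c @ h
    _, y = jax.lax.scan(step, jnp.zeros(N), x)
    target = jnp.cumsum(x)  # white-noise integration target
    return jnp.mean((y - target) ** 2)

grad_loss = jax.jit(jax.grad(loss))

W = W0
for t in range(steps):
    x = jax.random.normal(jax.random.fold_in(key, t), (T,))  # fresh white-noise input
    W = W - lr * grad_loss(W, x)
    if t % 100 == 0:
        # Track the outlier eigenvalue of the recurrent weights during learning
        lam = np.max(np.real(np.linalg.eigvals(np.asarray(W))))
        print(f"step {t:4d}  max Re(eigenvalue) = {lam:.3f}")
```

In this sketch, integration over long timescales corresponds to an outlier eigenvalue of W moving toward 1; the paper's theory characterizes the low-dimensional gradient-descent dynamics that drive this eigenvalue when the initial weights are small.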
Similar Papers
Generative System Dynamics in Recurrent Neural Networks
Machine Learning (CS)
Makes computer memory remember longer and better.
Transient Dynamics in Lattices of Differentiating Ring Oscillators
Neural and Evolutionary Computing
Makes AI chips learn faster and use less power.
Dynamical Learning in Deep Asymmetric Recurrent Neural Networks
Disordered Systems and Neural Networks
Learns from examples without needing a teacher.