On the Stability of the Jacobian Matrix in Deep Neural Networks
By: Benjamin Dadoun, Soufiane Hayou, Hanan Salam and more
Potential Business Impact:
Makes smart computer programs learn better.
Deep neural networks are known to suffer from exploding or vanishing gradients as depth increases, a phenomenon closely tied to the spectral behavior of the input-output Jacobian. Prior work has identified critical initialization schemes that ensure Jacobian stability, but these analyses are typically restricted to fully connected networks with i.i.d. weights. In this work, we go significantly beyond these limitations: we establish a general stability theorem for deep neural networks that accommodates sparsity (such as that introduced by pruning) and non-i.i.d., weakly correlated weights (e.g., as induced by training). Our results rely on recent advances in random matrix theory, and provide rigorous guarantees for spectral stability in a much broader class of network models. This extends the theoretical foundation for initialization schemes in modern neural networks with structured and dependent randomness.
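The abstract's central object, the input-output Jacobian and how its singular values behave at different initialization scales, can be probed numerically. Below is a minimal sketch, not taken from the paper, that builds a deep fully connected ReLU network with i.i.d. Gaussian weights and inspects the Jacobian's singular values; the width, depth, activation choice, and the `gain` parameter are illustrative assumptions, and the paper's actual results cover broader settings (sparse and weakly correlated weights).

```python
# Minimal sketch (illustrative, not the paper's method): measure the singular values
# of the input-output Jacobian J = prod_l D_l W_l of a deep ReLU network,
# where W_l ~ N(0, gain/width) i.i.d. and D_l holds the ReLU derivatives.
import numpy as np

def jacobian_singular_values(depth=50, width=200, gain=2.0, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(width)        # random input
    J = np.eye(width)                     # accumulated Jacobian
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * np.sqrt(gain / width)
        pre = W @ x                       # pre-activations
        D = np.diag((pre > 0).astype(float))  # ReLU derivative (0/1 mask)
        J = D @ W @ J                     # chain rule: prepend this layer
        x = np.maximum(pre, 0.0)          # forward pass
    return np.linalg.svd(J, compute_uv=False)

# gain=2.0 is the commonly cited critical scale for ReLU ("He" initialization):
# singular values stay roughly O(1) with depth. Smaller gains make them vanish,
# larger gains make them explode.
for gain in (1.0, 2.0, 4.0):
    s = jacobian_singular_values(gain=gain)
    print(f"gain={gain}: median singular value ~ {np.median(s):.3e}")
```

Running this shows the vanishing/exploding behavior the abstract describes and how a critical initialization keeps the spectrum stable; the paper's contribution is a rigorous stability theorem that extends such guarantees beyond the i.i.d., fully connected setting sketched here.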
Similar Papers
Neural Networks with Orthogonal Jacobian
Machine Learning (CS)
Makes deep computer brains learn much faster.
An Analytical Characterization of Sloppiness in Neural Networks: Insights from Linear Models
Machine Learning (CS)
Finds simple patterns in how computer brains learn.
The stability of shallow neural networks on spheres: A sharp spectral analysis
Numerical Analysis
Makes AI learn better and work more reliably.