The Impact of Anisotropic Covariance Structure on the Training Dynamics and Generalization Error of Linear Networks
By: Taishi Watanabe, Ryo Karakida, Jun-nosuke Teramae
Potential Business Impact:
The shape of training data affects how quickly and how well computers learn.
The success of deep neural networks depends strongly on the statistical structure of the training data. While learning dynamics and generalization are well understood for isotropic data, the effects of pronounced anisotropy remain less clear. We examine how data anisotropy, modeled by a spiked covariance structure (a canonical yet tractable choice), shapes the learning dynamics and generalization error of a two-layer linear network in a linear regression setting. Our analysis reveals that learning proceeds in two distinct phases: an initial phase governed by the input-output correlation and a subsequent phase governed by the other principal directions of the data. Furthermore, we derive an analytical expression for the generalization error, quantifying how alignment between the spike structure of the data and the learning task improves performance. These findings offer theoretical insight into how data anisotropy shapes the learning trajectory and final performance, and provide a foundation for understanding complex interactions in more advanced network architectures.
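The setting described in the abstract is concrete enough to simulate directly. Below is a minimal NumPy sketch (not the authors' code): inputs are drawn from a spiked covariance N(0, I + θuuᵀ), a teacher w_star partially aligned with the spike direction u generates targets, and a two-layer linear network f(x) = aᵀWx is trained by gradient descent on the squared error. All names and parameter values here (spike_strength, the 0.3 misalignment, the learning rate) are illustrative assumptions; printing the alignment of the effective weights Wᵀa with both u and w_star lets one watch the dominant, correlation-driven direction being picked up before the remaining directions, in the spirit of the two-phase dynamics the paper describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions and spike strength (assumed values, not from the paper).
d, n, h = 50, 2000, 20            # input dim, samples, hidden width
spike_strength = 5.0              # theta in Sigma = I + theta * u u^T

# Spike direction u and a teacher w_star partially aligned with it.
u = rng.standard_normal(d)
u /= np.linalg.norm(u)
w_star = u + 0.3 * rng.standard_normal(d)
w_star /= np.linalg.norm(w_star)

# Draw X ~ N(0, I + theta * u u^T) by stretching isotropic samples along u.
X = rng.standard_normal((n, d))
X = X + (np.sqrt(1.0 + spike_strength) - 1.0) * np.outer(X @ u, u)
y = X @ w_star                    # noiseless linear regression targets

# Two-layer linear network f(x) = a^T W x with small random initialization.
W = 0.01 * rng.standard_normal((h, d))
a = 0.01 * rng.standard_normal(h)

lr = 0.05
for step in range(2001):
    hidden = X @ W.T              # (n, h)
    err = hidden @ a - y          # residuals, shape (n,)
    grad_a = hidden.T @ err / n   # gradient of 0.5*mean(err^2) w.r.t. a
    grad_W = np.outer(a, X.T @ err) / n
    a -= lr * grad_a
    W -= lr * grad_W
    if step % 200 == 0:
        w_eff = W.T @ a           # effective end-to-end linear map
        loss = 0.5 * np.mean(err ** 2)
        norm = np.linalg.norm(w_eff) + 1e-12
        print(f"step {step:4d}  loss {loss:.5f}  "
              f"align(u) {w_eff @ u / norm:+.3f}  "
              f"align(w_star) {w_eff @ w_star / norm:+.3f}")
```

In this sketch, alignment with u rises first because the spiked direction dominates the input covariance and correlates with the target, and alignment with w_star catches up as the flatter residual directions are learned; the exact timescales depend on the assumed spike strength and initialization scale.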
Similar Papers
Low Rank Gradients and Where to Find Them
Machine Learning (CS)
Teaches computers to learn better from messy data.
Exact Dynamics of Multi-class Stochastic Gradient Descent
Machine Learning (Stat)
Helps computers learn better from messy data.
On the Anisotropy of Score-Based Generative Models
Machine Learning (CS)
Predicts how well AI learns from data.