Understanding Incremental Learning with Closed-form Solution to Gradient Flow on Overparameterized Matrix Factorization
By: Hancheng Min, René Vidal
Potential Business Impact:
Teaches computers to learn things step-by-step.
Many theoretical studies on neural networks attribute their excellent empirical performance to the implicit bias or regularization induced by first-order optimization algorithms when training networks under certain initialization assumptions. One example is the incremental learning phenomenon in gradient flow (GF) on an overparameterized matrix factorization problem with small initialization: GF learns a target matrix by sequentially learning its singular values in decreasing order of magnitude over time. In this paper, we develop a quantitative understanding of this incremental learning behavior for GF on the symmetric matrix factorization problem, using its closed-form solution obtained by solving a Riccati-like matrix differential equation. We show that incremental learning emerges from a time-scale separation among the dynamics corresponding to learning different components of the target matrix; as the initialization scale decreases, these time-scale separations become more prominent, allowing one to find low-rank approximations of the target matrix. Lastly, we discuss possible avenues for extending this analysis to asymmetric matrix factorization problems.
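A minimal numerical sketch of the phenomenon the abstract describes, not the paper's closed-form analysis: simulating gradient flow on the symmetric factorization objective f(U) = (1/4)·||UU^T − X||_F^2 with a small-step Euler scheme and a small initialization. The target matrix, the initialization scale `alpha`, and the step size below are illustrative assumptions rather than values from the paper; with well-separated eigenvalues, the eigenvalues of UU^T should be seen emerging one at a time, largest first.

```python
# Illustrative simulation (assumed setup, not the paper's exact construction):
# gradient flow on f(U) = (1/4) * ||U U^T - X||_F^2, discretized by forward Euler.
import numpy as np

rng = np.random.default_rng(0)

n = 5
# Symmetric PSD target of rank 3 with well-separated eigenvalues 3, 2, 1.
eigvals = np.array([3.0, 2.0, 1.0, 0.0, 0.0])
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
X = Q @ np.diag(eigvals) @ Q.T

alpha = 1e-3           # small initialization scale
U = alpha * np.eye(n)  # overparameterized factor, U(0) = alpha * I
dt = 1e-3              # Euler step approximating continuous-time gradient flow

for step in range(1, 20001):
    grad = (U @ U.T - X) @ U   # gradient of f(U)
    U = U - dt * grad          # forward-Euler update
    if step % 2000 == 0:
        learned = np.sort(np.linalg.eigvalsh(U @ U.T))[::-1]
        print(f"t = {step * dt:5.1f}  top eigenvalues of UU^T: "
              + "  ".join(f"{v:6.3f}" for v in learned[:3]))
```

With a smaller `alpha`, the plateaus between successive eigenvalues becoming learned lengthen, which is the time-scale separation the abstract refers to; stopping the flow during such a plateau yields a low-rank approximation of X.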
Similar Papers
Global Convergence of Four-Layer Matrix Factorization under Random Initialization
Optimization and Control
Makes deep computer learning work better.
Learning Rate Scheduling with Matrix Factorization for Private Training
Machine Learning (CS)
Makes private computer learning more accurate.