Dynamic Rank Adjustment for Accurate and Efficient Neural Network Training
By: Hyuntak Shin, Aecheon Jung, Sungeun Hong, and more
Potential Business Impact:
Makes AI learn better without needing more power.
Low-rank training methods reduce the number of trainable parameters by re-parameterizing the weights with matrix decompositions (e.g., singular value decomposition). However, enforcing a fixed low-rank structure caps the rank of the weight matrices and can hinder the model's ability to learn complex patterns. Furthermore, the effective rank of the model's weights tends to decline during training, and this drop is accelerated when the model is re-parameterized into a low-rank structure. In this study, we argue that strategically interleaving full-rank training epochs within low-rank training can effectively restore the rank of the model's weights. Based on this finding, we propose a general dynamic-rank training framework that is readily applicable to a wide range of neural-network tasks. We first describe how to adjust the rank of the weight matrices to alleviate the inevitable rank collapse that arises during training, and then present extensive empirical results that validate our claims and demonstrate the efficacy of the proposed framework. Our empirical study shows that the proposed method incurs nearly the same computational cost as SVD-based low-rank training while achieving accuracy comparable to full-rank training across various benchmarks.
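The sketch below illustrates the core idea in PyTorch: a linear layer re-parameterized as a product of two low-rank factors, plus helpers that merge the factors into a dense weight for a full-rank training phase and then re-truncate it back to low rank via SVD. This is a minimal illustration under assumed names (`FactorizedLinear`, `to_full_rank`, `to_low_rank`, the chosen `rank`), not the authors' implementation or rank schedule.

```python
# Illustrative sketch (not the paper's code): SVD-based low-rank layer with
# periodic full-rank phases intended to restore the effective rank.
import torch
import torch.nn as nn


class FactorizedLinear(nn.Module):
    """Linear layer re-parameterized as W ~= U @ V with a fixed rank r."""

    def __init__(self, in_features, out_features, rank):
        super().__init__()
        self.U = nn.Parameter(torch.randn(out_features, rank) * 0.02)
        self.V = nn.Parameter(torch.randn(rank, in_features) * 0.02)
        self.bias = nn.Parameter(torch.zeros(out_features))

    def forward(self, x):
        # Reconstruct the (out, in) weight from the factors on the fly.
        return x @ (self.U @ self.V).T + self.bias


def to_full_rank(layer: FactorizedLinear) -> nn.Linear:
    """Merge the factors into a dense weight for a full-rank training epoch."""
    full = nn.Linear(layer.V.shape[1], layer.U.shape[0])
    with torch.no_grad():
        full.weight.copy_(layer.U @ layer.V)
        full.bias.copy_(layer.bias)
    return full


def to_low_rank(full: nn.Linear, rank: int) -> FactorizedLinear:
    """Truncated SVD of the dense weight re-parameterizes it back to rank r."""
    W = full.weight.detach()
    U, S, Vh = torch.linalg.svd(W, full_matrices=False)
    low = FactorizedLinear(W.shape[1], W.shape[0], rank)
    with torch.no_grad():
        low.U.copy_(U[:, :rank] * S[:rank])  # absorb singular values into U
        low.V.copy_(Vh[:rank, :])
        low.bias.copy_(full.bias.detach())
    return low
```

A training loop following this pattern would spend most epochs on the factorized layer and, on a small schedule of epochs, swap in the merged dense layer via `to_full_rank` before re-truncating with `to_low_rank`; how often to do so and at what rank are the framework's tunable choices.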
Similar Papers
Low-Rank Matrix Approximation for Neural Network Compression
Machine Learning (CS)
Makes smart computer programs run faster and smaller.
Low-Rank Prehab: Preparing Neural Networks for SVD Compression
Machine Learning (CS)
Prepares AI to shrink without losing smarts.