Low-Rank Matrix Approximation for Neural Network Compression
By: Kalyan Cherukuri, Aarav Lala
Potential Business Impact:
Makes smart computer programs smaller and faster.
Deep Neural Networks (DNNs) face a growing deployment challenge due to their large memory and computation requirements. In this paper, we present a new Adaptive-Rank Singular Value Decomposition (ARSVD) method that approximates the optimal rank for compressing weight matrices in neural networks using spectral entropy. Unlike conventional SVD-based methods that apply a fixed-rank truncation across all layers, ARSVD adaptively selects a rank for each layer based on the entropy distribution of its singular values. This approach ensures that each layer retains a prescribed amount of its informational content, thereby reducing redundancy. Our method enables efficient, layer-wise compression, yielding improved performance with reduced space and time complexity compared to static-rank reduction techniques.
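To make the layer-wise idea concrete, below is a minimal Python/NumPy sketch of entropy-guided rank selection. It assumes the rank of each weight matrix is the smallest k whose leading singular values carry a target fraction of the total spectral entropy; this criterion, the threshold value, and the helper names (spectral_entropy_rank, compress_layer) are illustrative assumptions, not the authors' exact implementation.

import numpy as np

def spectral_entropy_rank(W, entropy_threshold=0.90):
    # Pick a truncation rank from the entropy of the singular-value spectrum:
    # the smallest k whose leading singular values carry at least
    # `entropy_threshold` of the total spectral entropy (assumed criterion).
    s = np.linalg.svd(W, compute_uv=False)
    p = s / s.sum()                        # normalized singular-value distribution
    h = -p * np.log(p + 1e-12)             # per-component entropy contributions
    cumulative = np.cumsum(h) / h.sum()    # fraction of total entropy retained by top-k
    k = int(np.searchsorted(cumulative, entropy_threshold)) + 1
    return min(k, len(s))

def compress_layer(W, entropy_threshold=0.90):
    # Return a rank-k factorization (A, B) with W approximately equal to A @ B.
    k = spectral_entropy_rank(W, entropy_threshold)
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :k] * s[:k]                   # fold singular values into the left factor
    B = Vt[:k, :]
    return A, B

# Example: compress one 512x256 weight matrix; each layer gets its own rank.
W = np.random.randn(512, 256)
A, B = compress_layer(W, entropy_threshold=0.90)
print(A.shape, B.shape)                    # (512, k), (k, 256)

Storing A and B in place of W reduces parameters from m*n to k*(m+n), and the forward pass x @ W can be replaced by (x @ A) @ B, which is cheaper whenever the selected k is small relative to the layer dimensions.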
Similar Papers
Low-Rank Prehab: Preparing Neural Networks for SVD Compression
Machine Learning (CS)
Prepares AI to shrink without losing smarts.
A Precise Performance Analysis of the Randomized Singular Value Decomposition
Numerical Analysis
Makes computer math faster for big data.
ARA: Adaptive Rank Allocation for Efficient Large Language Model SVD Compression
Machine Learning (CS)
Makes big AI models smaller and faster.