Low-Rank Prehab: Preparing Neural Networks for SVD Compression
By: Haoran Qin, Shansita Sharma, Ali Abbasi, and more
Potential Business Impact:
Prepares AI to shrink without losing smarts.
Low-rank approximation methods such as singular value decomposition (SVD) and its variants (e.g., Fisher-weighted SVD, Activation SVD) have recently emerged as effective tools for neural network compression. In this setting, decomposition acts as a "surgical" intervention, followed by fine-tuning that serves as "rehab" to recover accuracy. Inspired by prehabilitation in surgery, we introduce a pre-compression fine-tuning stage, Low-Rank Prehab, that explicitly encourages low-rank structure in weight matrices while preserving task performance. By conditioning the model before SVD, Prehab steers weights toward spectrally compact regions of the parameter space, enabling smoother low-rank approximation and improved recovery. Experiments on large language models (LLMs) and other Transformer-based architectures, including Vision Transformers (ViTs), show that Prehab substantially reduces the immediate accuracy drop after compression and consistently improves post-fine-tuning performance. Across a wide range of compression ratios, our method outperforms state-of-the-art SVD-based techniques such as SVD-LLM, highlighting the importance of preparing models for compression rather than only improving the compression and recovery stages. Source code is available at https://github.com/niqretnuh/PREHAB-SVD.
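As a concrete illustration, the sketch below walks through a prehab-then-compress workflow in PyTorch: fine-tune with a penalty that encourages low-rank weights, then replace a linear layer with its truncated-SVD factorization. The nuclear-norm penalty and the helper names (prehab_step, svd_compress_linear, lam) are illustrative assumptions, not the paper's actual objective or released code.

```python
# Minimal sketch of "prehab then compress", assuming a nuclear-norm penalty
# as a stand-in for the paper's low-rank regularizer.
import torch
import torch.nn as nn

def nuclear_norm(weight: torch.Tensor) -> torch.Tensor:
    # Sum of singular values; penalizing it pushes the matrix toward low rank.
    return torch.linalg.svdvals(weight).sum()

def prehab_step(model: nn.Module, task_loss: torch.Tensor, lam: float = 1e-4) -> torch.Tensor:
    # Combine the task loss with a low-rank penalty on every 2-D weight matrix.
    penalty = sum(nuclear_norm(p) for p in model.parameters() if p.ndim == 2)
    return task_loss + lam * penalty

def svd_compress_linear(layer: nn.Linear, rank: int) -> nn.Sequential:
    # Replace a Linear layer with two thinner layers via truncated SVD.
    U, S, Vh = torch.linalg.svd(layer.weight.data, full_matrices=False)
    U_r, S_r, Vh_r = U[:, :rank], S[:rank], Vh[:rank, :]
    down = nn.Linear(layer.in_features, rank, bias=False)
    up = nn.Linear(rank, layer.out_features, bias=layer.bias is not None)
    down.weight.data = Vh_r.contiguous()   # (rank, in_features)
    up.weight.data = U_r * S_r             # (out_features, rank)
    if layer.bias is not None:
        up.bias.data = layer.bias.data.clone()
    return nn.Sequential(down, up)

# Toy usage: prehab fine-tuning, then SVD "surgery" on a single layer.
model = nn.Sequential(nn.Linear(64, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3)
x, y = torch.randn(32, 64), torch.randint(0, 10, (32,))
for _ in range(5):
    opt.zero_grad()
    loss = prehab_step(model, nn.functional.cross_entropy(model(x), y))
    loss.backward()
    opt.step()
model[0] = svd_compress_linear(model[0], rank=16)  # compress after prehab
```

Factoring an m-by-n weight at rank r stores (m + n) * r parameters instead of m * n, which is where the compression comes from; the prehab penalty is meant to concentrate the spectrum so that truncating to rank r discards less of the signal.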
Similar Papers
Low-Rank Matrix Approximation for Neural Network Compression
Machine Learning (CS)
Makes smart computer programs faster and smaller.
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Computation and Language
Makes smart computer programs smaller and faster.
Dynamic Rank Adjustment for Accurate and Efficient Neural Network Training
Machine Learning (CS)
Makes AI learn better without needing more power.