CPSVD: Enhancing Large Language Model Compression via Column-Preserving Singular Value Decomposition
By: Lin Xv, Jingsheng Gao, Xian Gao and more
Potential Business Impact:
Makes big AI models smaller without losing smarts.
The rapid advancement of Large Language Models (LLMs) faces a critical bottleneck in their immense size, necessitating efficient compression techniques. While Singular Value Decomposition (SVD) is a promising approach, existing SVD-based methods treat the entire parameter matrix uniformly, overlooking that SVD approximation errors vary significantly across different parts of the matrix, which often leads to suboptimal compression. To address this, we propose Column-Preserving Singular Value Decomposition (CPSVD), a novel method that refines SVD-based LLM compression by segmenting the parameter matrix. Unlike traditional SVD, CPSVD identifies and directly preserves the matrix columns with high decomposition errors and applies SVD only to the columns with low decomposition errors, determining the balance point between these two strategies that minimizes the overall error. Furthermore, because decomposition errors are heterogeneous across the different matrices within an LLM, CPSVD adaptively allocates non-uniform compression rates to the modules within each layer while adhering to a target layer-wise compression ratio, thereby further enhancing compression performance. Extensive experiments demonstrate that CPSVD consistently outperforms state-of-the-art SVD-based LLM compression methods, achieving lower perplexity and higher accuracy on zero-shot tasks.
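The column-preserving idea described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes a plain Frobenius-norm error (no activation-aware weighting), a fixed parameter budget, a brute-force search over the number of preserved columns, and it ignores the cost of storing column indices. The function names `truncated_svd` and `column_preserving_svd` and the `budget` parameter are hypothetical.

```python
import numpy as np


def truncated_svd(W, rank):
    """Best rank-`rank` approximation of W in the Frobenius norm."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return (U[:, :rank] * s[:rank]) @ Vt[:rank]


def column_preserving_svd(W, budget):
    """Sketch: keep the k highest-error columns exactly and apply SVD to the rest.

    `budget` is the number of stored parameters allowed (an assumption of
    this sketch). Returns (error, k, rank, approximation) for the best k found.
    """
    m, n = W.shape
    # Rank that a plain SVD could afford under the same budget.
    r0 = budget // (m + n)
    # Per-column reconstruction error of that plain SVD.
    col_err = np.linalg.norm(W - truncated_svd(W, r0), axis=0)
    order = np.argsort(-col_err)  # highest-error columns first

    best = (np.inf, 0, 0, None)
    for k in range(n):
        # Storing k full columns costs m * k parameters; the remainder
        # pays for a rank-`rank` factorization of the other n - k columns.
        rank = (budget - m * k) // (m + (n - k))
        if rank < 1:
            break
        keep, rest = order[:k], order[k:]
        approx = np.empty_like(W)
        approx[:, keep] = W[:, keep]  # preserved exactly
        approx[:, rest] = truncated_svd(W[:, rest], rank)
        err = np.linalg.norm(W - approx)
        if err < best[0]:
            best = (err, k, rank, approx)
    return best
```

Because `k = 0` reduces to an ordinary truncated SVD at the same budget, the search in this sketch never returns a larger Frobenius error than plain SVD, up to floating-point rounding. How CPSVD itself measures error and locates the balance point is described in the paper, not here.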
Similar Papers
Delta-SVD: Efficient Compression for Personalized Text-to-Image Models
CV and Pattern Recognition
Shrinks AI art models to save space.
Singular Value Few-shot Adaptation of Vision-Language Models
CV and Pattern Recognition
Teaches AI to learn new things with fewer examples.
Activation-Informed Pareto-Guided Low-Rank Compression for Efficient LLM/VLM
Computation and Language
Makes smart computer programs smaller and faster.