Towards Higher Effective Rank in Parameter-efficient Fine-tuning using Khatri-Rao Product
By: Paul Albert, Frederic Z. Zhang, Hemanth Saratchandran, and more
Potential Business Impact:
Makes AI learn better without needing more power.
Parameter-efficient fine-tuning (PEFT) has become a standard approach for adapting large pre-trained models. Amongst PEFT methods, low-rank adaptation (LoRA) has achieved notable success. However, recent studies have highlighted its limitations compared with full-rank alternatives, particularly when applied to multimodal and large language models. In this work, we present a quantitative comparison between full-rank and low-rank PEFT methods using a synthetic matrix approximation benchmark with controlled spectral properties. Our results confirm that LoRA struggles to approximate matrices with relatively flat spectra or high-frequency components -- signs of a high effective rank. To address this, we introduce KRAdapter, a novel PEFT algorithm that leverages the Khatri-Rao product to construct weight updates which, by construction, tend to have a high effective rank. We demonstrate performance gains with KRAdapter on vision-language models of up to 1B parameters and on large language models of up to 8B parameters, particularly on unseen common-sense reasoning tasks. In addition, KRAdapter maintains the memory and compute efficiency of LoRA, making it a practical and robust alternative for fine-tuning billion-scale models.
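To make the high-effective-rank intuition concrete, below is a minimal NumPy sketch (not the authors' implementation) that compares a standard LoRA-style low-rank update with an update built from a row-wise Khatri-Rao (face-splitting) product of two thin factors at a matched parameter count. The layer size, the rank r, and the exact factor shapes are illustrative assumptions; the paper's actual KRAdapter parameterization may differ.

# Hypothetical sketch: Khatri-Rao-style update vs. LoRA-style update.
# Not the authors' code; shapes and scaling are illustrative assumptions.
import numpy as np

def effective_rank(M, eps=1e-12):
    # Shannon-entropy effective rank of the singular value spectrum
    # (exp of the entropy of the normalized singular values).
    s = np.linalg.svd(M, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]
    return float(np.exp(-(p * np.log(p)).sum()))

rng = np.random.default_rng(0)
d_out, d_in, r = 1024, 1024, 32          # assumed layer size and LoRA rank

# LoRA-style update: the product B @ A has rank at most r by construction.
B = rng.standard_normal((d_out, r)) / np.sqrt(r)
A = rng.standard_normal((r, d_in)) / np.sqrt(d_in)
delta_lora = B @ A

# Khatri-Rao-style update: each row of the update is the Kronecker product of
# the matching rows of two thin factors U (d_out x r1) and V (d_out x r2),
# so the resulting d_out x (r1*r2) matrix is not rank-capped at r1 or r2.
r1 = r2 = int(np.sqrt(d_in))             # r1 * r2 == d_in == 1024
U = rng.standard_normal((d_out, r1)) / np.sqrt(r1)
V = rng.standard_normal((d_out, r2)) / np.sqrt(r2)
delta_kr = np.einsum("ij,ik->ijk", U, V).reshape(d_out, r1 * r2)

print("LoRA params:", B.size + A.size, "effective rank:", round(effective_rank(delta_lora), 1))
print("KR   params:", U.size + V.size, "effective rank:", round(effective_rank(delta_kr), 1))

With random factors of equal parameter count, the Khatri-Rao-style update typically reaches an effective rank far above the LoRA rank cap r, which is the behavior the abstract attributes to KRAdapter while keeping the number of trainable parameters comparable.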
Similar Papers
HyperAdaLoRA: Accelerating LoRA Rank Allocation During Training via Hypernetworks without Sacrificing Performance
Machine Learning (CS)
Makes AI learn faster without needing more power.
DropLoRA: Sparse Low-Rank Adaptation for Parameter-Efficient Fine-Tuning
Computation and Language
Makes AI smarter without more training.
1LoRA: Summation Compression for Very Low-Rank Adaptation
CV and Pattern Recognition
Makes big computer brains learn faster with less effort.