WeightLoRA: Keep Only Necessary Adapters
By: Andrey Veprikov, Vladimir Solodkin, Alexander Zyl, and more
Potential Business Impact:
Trains big computer brains with less memory.
The widespread utilization of language models in modern applications is inconceivable without Parameter-Efficient Fine-Tuning techniques, such as low-rank adaptation ($\texttt{LoRA}$), which adds trainable adapters to selected layers. Although $\texttt{LoRA}$ can reach accurate solutions, it still requires significant memory to train large models and relies on intuition about which layers should receive adapters. In this paper, we propose a novel method, $\texttt{WeightLoRA}$, which overcomes these issues by adaptively selecting the most critical $\texttt{LoRA}$ heads throughout the optimization process. As a result, we significantly reduce the number of trainable parameters while obtaining metric values that match or even exceed those of the baselines. We conduct experiments on a series of competitive benchmarks with DeBERTa, BART, and Llama models, comparing our method against other adaptive approaches. The results demonstrate the efficacy of $\texttt{WeightLoRA}$ and the superior performance of $\texttt{WeightLoRA+}$ in almost all cases.
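To make the idea concrete, here is a minimal sketch (not the authors' implementation) of how "weighted" LoRA heads could look in PyTorch: each adapter gets a trainable scalar gate, and after (or during) training only the heads with the largest gates are kept. The class and function names, the gate parameterization, and the top-K selection rule are illustrative assumptions; the paper's actual selection procedure may differ.

```python
# Illustrative sketch only: a LoRA adapter with a trainable importance gate,
# plus a helper that keeps the top-K most important heads. Details are assumed.
import torch
import torch.nn as nn


class GatedLoRALinear(nn.Module):
    """A frozen linear layer with a low-rank adapter scaled by a trainable gate."""

    def __init__(self, base: nn.Linear, rank: int = 8):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)  # only the adapter and its gate are trained
        self.lora_A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.gate = nn.Parameter(torch.ones(1))  # importance weight of this head

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.gate * (x @ self.lora_A.T @ self.lora_B.T)


def keep_top_k_heads(layers, k: int):
    """Disable (zero and freeze) the adapters whose gates are smallest."""
    gates = torch.tensor([layer.gate.abs().item() for layer in layers])
    keep = set(torch.topk(gates, k).indices.tolist())
    for i, layer in enumerate(layers):
        if i not in keep:
            layer.gate.data.zero_()
            layer.gate.requires_grad_(False)
            layer.lora_A.requires_grad_(False)
            layer.lora_B.requires_grad_(False)


# Usage: wrap several linear layers, train them, then prune to the k most
# important heads so that only a small subset of adapters stays trainable.
layers = [GatedLoRALinear(nn.Linear(64, 64)) for _ in range(6)]
keep_top_k_heads(layers, k=2)
```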
Similar Papers
Exploring Efficient Learning of Small BERT Networks with LoRA and DoRA
Machine Learning (CS)
Makes AI smarter, faster, and cheaper to train.
Less is More: Resource-Efficient Low-Rank Adaptation
Computation and Language
Makes AI learn faster and better with less effort.
TLoRA: Tri-Matrix Low-Rank Adaptation of Large Language Models
Machine Learning (CS)
Makes AI learn new things with less effort.