Merge before Forget: A Single LoRA Continual Learning via Continual Merging
By: Fuli Qiao, Mehrdad Mahdavi
Parameter-efficient continual learning has emerged as a promising approach for large language models (LLMs) to mitigate catastrophic forgetting while enabling adaptation to new tasks. Current Low-Rank Adaptation (LoRA) continual learning techniques often retain and freeze previously learned LoRAs or generate data representations to overcome forgetting, typically using them to help new LoRAs learn new tasks. However, these methods not only incur memory costs that grow with the number of tasks under limited storage, but also suffer from potential task interference due to the lack of an effective LoRA merging mechanism. In this paper, we propose a novel continual learning method that orthogonally initializes LoRA updates and sequentially merges them into a single unified LoRA. Our method leverages an orthogonal basis extracted from the previously learned LoRA to initialize learning of new tasks, and further exploits the intrinsic asymmetry of LoRA components through a time-aware scaling mechanism that balances new and old knowledge during continual merging. Our approach maintains constant memory complexity with respect to the number of tasks, minimizes interference between past and new tasks via orthogonal basis initialization, and improves performance over asymmetric LoRA merging via adaptive scaling. We provide theoretical analysis to justify our design and conduct extensive experiments across diverse continual learning benchmarks using various Llama models, demonstrating the effectiveness and efficiency of our method.
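To make the two ingredients of the abstract concrete, below is a minimal PyTorch sketch of (1) initializing a new task's LoRA factors from an orthogonal basis of the previously merged LoRA and (2) merging the new update into the single maintained LoRA with a time-aware scale. The function names, the 1/t-style scaling schedule, and the SVD-based re-factoring are illustrative assumptions, not the paper's exact formulas.

```python
# A minimal sketch of continual LoRA merging with orthogonal initialization.
# All specific choices (scaling schedule, rank handling, re-factoring) are
# assumptions for illustration; they are not taken from the paper itself.
import torch

def init_new_task_lora(A_merged: torch.Tensor, B_merged: torch.Tensor, rank: int):
    """Hypothetical orthogonal initialization of a new task's LoRA factors.

    An orthogonal basis is extracted (via SVD) from the previously merged
    LoRA update B_merged @ A_merged, and the new A factor is initialized in
    directions least used by earlier tasks to reduce interference.
    """
    delta_old = B_merged @ A_merged                      # (out_dim, in_dim)
    _, _, Vh = torch.linalg.svd(delta_old, full_matrices=True)
    A_new = Vh[-rank:, :].clone()                        # trailing right singular vectors
    B_new = torch.zeros(delta_old.shape[0], rank)        # zero update at initialization
    return A_new, B_new

def merge_lora(A_merged, B_merged, A_new, B_new, task_index: int):
    """Hypothetical time-aware merge of the new LoRA into the single LoRA.

    A simple running-average weight alpha = 1/(t+1) stands in for the paper's
    time-aware scaling; the merged update is re-factored to rank r by SVD so
    memory stays constant in the number of tasks.
    """
    alpha = 1.0 / (task_index + 1)                       # assumed scaling schedule
    delta = (1 - alpha) * (B_merged @ A_merged) + alpha * (B_new @ A_new)
    r = A_merged.shape[0]
    U, S, Vh = torch.linalg.svd(delta, full_matrices=False)
    B_out = U[:, :r] * S[:r]                             # scale columns by singular values
    A_out = Vh[:r, :]
    return A_out, B_out
```

In this sketch only one pair of LoRA factors is ever stored, which is what keeps memory constant as tasks accumulate; the orthogonal initialization and the scaled merge are where the paper's specific design (asymmetry-aware, time-aware scaling) would replace the placeholder choices above.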