Efficient Continual Learning in Neural Machine Translation: A Low-Rank Adaptation Approach
By: Salvador Carrión, Francisco Casacuberta
Continual learning in Neural Machine Translation (NMT) faces the dual challenges of catastrophic forgetting and the high computational cost of retraining. This study establishes Low-Rank Adaptation (LoRA) as a parameter-efficient framework for addressing both challenges in dedicated NMT architectures. We first demonstrate that LoRA-based fine-tuning adapts NMT models to new languages and domains with performance on par with full-parameter fine-tuning, while updating only a fraction of the parameters. Second, we propose an interactive adaptation method based on a calibrated linear combination of LoRA modules. This approach functions as a gate-free mixture of experts, enabling real-time, user-controllable adjustment of domain and style without retraining. Finally, to mitigate catastrophic forgetting, we introduce a novel gradient-based regularization strategy designed specifically for low-rank decomposition matrices. Unlike methods that regularize the full parameter set, our approach weights the penalty on the low-rank updates using historical gradient information. Experimental results indicate that this strategy efficiently preserves prior domain knowledge while facilitating the acquisition of new tasks, offering a scalable paradigm for interactive and continual NMT.
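To make the two central ideas of the abstract concrete, the following is a minimal sketch, not the authors' implementation: a linear layer that blends several LoRA adapters through user-set coefficients (a gate-free mixture of experts), plus a gradient-weighted penalty on the low-rank factors in the spirit of the proposed regularizer. All names here (LoRAMixLinear, set_mixture, lowrank_penalty, the importance dictionaries) are illustrative assumptions, not identifiers from the paper.

```python
import torch
import torch.nn as nn


class LoRAMixLinear(nn.Module):
    """Frozen base projection plus several LoRA adapters blended by user-set weights."""

    def __init__(self, in_features, out_features, num_adapters=2, rank=8, alpha=16.0):
        super().__init__()
        # Frozen base weight from the pretrained NMT model.
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)
        # One low-rank (A, B) pair per domain/style adapter.
        self.A = nn.ParameterList(
            [nn.Parameter(torch.randn(rank, in_features) * 0.01) for _ in range(num_adapters)]
        )
        self.B = nn.ParameterList(
            [nn.Parameter(torch.zeros(out_features, rank)) for _ in range(num_adapters)]
        )
        self.scaling = alpha / rank
        # User-controllable mixing coefficients (the calibrated linear combination).
        self.register_buffer("lambdas", torch.full((num_adapters,), 1.0 / num_adapters))

    def set_mixture(self, weights):
        # Real-time, retraining-free adjustment of each adapter's influence.
        w = torch.as_tensor(weights, dtype=self.lambdas.dtype, device=self.lambdas.device)
        self.lambdas.copy_(w / w.sum())

    def forward(self, x):
        out = self.base(x)
        for lam, A, B in zip(self.lambdas, self.A, self.B):
            # Each adapter contributes a low-rank update, scaled by its mixing weight.
            out = out + lam * self.scaling * (x @ A.t() @ B.t())
        return out


def lowrank_penalty(module, old_factors, grad_importance, strength=0.01):
    # Hypothetical regularizer: deviations of the low-rank factors from their
    # values after the previous task are weighted by accumulated (historical)
    # gradient magnitudes, so directions that mattered for earlier domains are
    # penalized more strongly than unimportant ones.
    penalty = torch.zeros((), device=module.base.weight.device)
    for i, (A, B) in enumerate(zip(module.A, module.B)):
        penalty = penalty + (grad_importance[f"A.{i}"] * (A - old_factors[f"A.{i}"]).pow(2)).sum()
        penalty = penalty + (grad_importance[f"B.{i}"] * (B - old_factors[f"B.{i}"]).pow(2)).sum()
    return strength * penalty
```

Under these assumptions, calling `layer.set_mixture([0.7, 0.3])` shifts inference toward the first adapter without any retraining, and adding `lowrank_penalty(layer, old_factors, grad_importance)` to the task loss during continual training discourages overwriting directions that earlier domains relied on. How the mixture weights are calibrated and how the gradient-importance estimates are accumulated is specified in the paper and not reproduced here.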