Conditions for Catastrophic Forgetting in Multilingual Translation

Published: October 22, 2025 | arXiv ID: 2510.19546v1

By: Danni Liu, Jan Niehues

Potential Business Impact:
Keeps AI models performing well across many languages after they are fine-tuned for specific ones.

Business Areas:
Language Learning Education

Fine-tuning multilingual foundation models on specific languages often induces catastrophic forgetting, degrading performance on languages unseen in fine-tuning. While this phenomenon is widely documented, the literature presents fragmented results about when forgetting occurs. To address this ambiguity, we conduct a systematic empirical study using machine translation as a testbed to identify the conditions that trigger catastrophic forgetting in multilingual fine-tuning. Through controlled experiments across different model architectures, data scales, and fine-tuning approaches, we reveal that the relative scale between model and data size is a primary determinant of forgetting. Moreover, we demonstrate that a model's instruction-following ability is more critical for retaining multilingual knowledge than its architecture. Contrary to common assumptions, parameter-efficient fine-tuning offers no clear advantage over full fine-tuning in mitigating forgetting. Lastly, we show that cross-lingual alignment can mitigate forgetting while also facilitating positive transfer to unseen target languages.
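To make the comparison in the abstract concrete, the sketch below sets up the two fine-tuning regimes the study contrasts: full fine-tuning, where every weight of a multilingual translation model is updated, and LoRA-style parameter-efficient fine-tuning, where only small adapter matrices are trained. It uses Hugging Face transformers and peft; the checkpoint name, target modules, and LoRA hyperparameters are illustrative assumptions, not the configuration reported in the paper.

# Minimal sketch (assumed setup, not the paper's exact configuration):
# compare full fine-tuning against LoRA-based parameter-efficient
# fine-tuning of a multilingual translation model.
from transformers import AutoModelForSeq2SeqLM
from peft import LoraConfig, TaskType, get_peft_model

CHECKPOINT = "facebook/nllb-200-distilled-600M"  # illustrative multilingual MT checkpoint

# Full fine-tuning: every parameter is trainable, so updates on a few
# language pairs can overwrite knowledge of languages absent from the data.
full_model = AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT)
full_trainable = sum(p.numel() for p in full_model.parameters() if p.requires_grad)

# Parameter-efficient fine-tuning: only low-rank adapters are trained.
# The paper reports this gives no clear advantage against forgetting.
lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=16,                                 # adapter rank (assumed value)
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in this architecture
)
peft_model = get_peft_model(AutoModelForSeq2SeqLM.from_pretrained(CHECKPOINT), lora_config)
peft_model.print_trainable_parameters()  # prints trainable vs. total parameter counts
print(f"full fine-tuning trainable params: {full_trainable:,}")

In either regime, the model would then be trained on parallel data for the selected languages and re-evaluated on language pairs held out from fine-tuning; the performance drop on those held-out pairs is what the abstract describes as catastrophic forgetting.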

Country of Origin
🇩🇪 Germany

Page Count
13 pages

Category
Computer Science: Computation and Language