Evaluating Cross-Lingual Unlearning in Multilingual Language Models
By: Tyler Lizzo, Larry Heck
Potential Business Impact:
Removes unwanted info from AI in all languages.
We present the first comprehensive evaluation of cross-lingual unlearning in multilingual LLMs. Using translated TOFU benchmarks across seven language/script variants, we test major unlearning algorithms and show that most fail to remove facts outside the training language, even when model utility remains high. However, subspace projection consistently outperforms the other methods, achieving strong cross-lingual forgetting with minimal degradation. Analysis of the learned task subspaces reveals a shared interlingua structure: removing this shared subspace harms all languages, while removing language-specific components selectively affects one. These results demonstrate that multilingual forgetting depends on geometry in weight space, motivating subspace-based approaches for future unlearning systems.
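The subspace intuition from the abstract can be illustrated with a small sketch. This is not the paper's implementation; it assumes per-language fine-tuning deltas are flattened into vectors, and uses SVD to find a shared ("interlingua"-like) direction that is then projected out. The function name `project_out_shared` and the toy data are illustrative only.

```python
import numpy as np

def project_out_shared(task_vectors, k=1):
    """Project out the top-k shared directions across per-language task vectors.

    task_vectors: array of shape (n_languages, d), e.g. flattened weight deltas.
    Returns the deltas with the shared subspace removed.
    """
    # The top right-singular vectors of the stacked deltas capture directions
    # common to all languages (a rough proxy for a shared subspace).
    _, _, vt = np.linalg.svd(task_vectors, full_matrices=False)
    shared = vt[:k]  # (k, d) orthonormal basis for the shared subspace
    # Orthogonal projection onto the complement: v - (v @ B^T) @ B
    return task_vectors - task_vectors @ shared.T @ shared

# Toy example: two "languages" whose deltas share a common component [1, 1, 0]
# plus small language-specific parts along the third axis.
common = np.array([1.0, 1.0, 0.0])
deltas = np.stack([common + np.array([0.0, 0.0, 0.5]),
                   common - np.array([0.0, 0.0, 0.5])])
cleaned = project_out_shared(deltas, k=1)
# After removing the shared direction, only the language-specific
# components along the third axis remain.
```

Removing the shared component affects both "languages" at once, while the language-specific residuals survive, mirroring the paper's finding that the shared subspace carries cross-lingual knowledge.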
Similar Papers
Multilingual Amnesia: On the Transferability of Unlearning in Multilingual LLMs
Computation and Language
Makes AI forget bad ideas in many languages.
A Survey on Unlearning in Large Language Models
Computation and Language
Lets AI forget private or bad information.
Uncovering the Potential Risks in Unlearning: Danger of English-only Unlearning in Multilingual LLMs
Computation and Language
Fixes AI that mixes up languages when forgetting.