Language-specific Neurons Do Not Facilitate Cross-Lingual Transfer
By: Soumen Kumar Mondal, Sayambhu Sen, Abhishek Singhania, and more
Potential Business Impact:
Helps computers understand less common languages better.
Multilingual large language models (LLMs) aim for robust natural language understanding across diverse languages, yet their performance degrades significantly on low-resource languages. This work explores whether existing techniques for identifying language-specific neurons can be leveraged to enhance cross-lingual task performance in low-resource languages. We conduct detailed experiments covering existing language-specific neuron identification techniques (such as Language Activation Probability Entropy and activation probability-based thresholding) and neuron-specific LoRA fine-tuning with models like Llama 3.1 and Mistral Nemo. We find that such neuron-specific interventions are insufficient to yield cross-lingual improvements on downstream tasks (XNLI, XQuAD) in low-resource languages. This study highlights the challenges in achieving cross-lingual generalization and provides critical insights for multilingual LLMs.
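To make the neuron-identification step concrete, below is a minimal sketch of Language Activation Probability Entropy (LAPE) style selection: for each neuron, estimate how often it fires on tokens of each language, normalize those firing probabilities into a distribution over languages, and treat low-entropy neurons as language-specific. The percentile cutoff and the per-language firing threshold here are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def lape_scores(activation_probs: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    """LAPE per neuron.

    activation_probs: shape (num_languages, num_neurons); entry (l, j) is the
    empirical probability that neuron j fires (activation > 0) on tokens of
    language l. Returns shape (num_neurons,): the entropy of each neuron's
    normalized per-language activation distribution. Lower entropy means the
    neuron fires mostly for a small set of languages.
    """
    # Normalize each neuron's firing probabilities across languages.
    p = activation_probs / (activation_probs.sum(axis=0, keepdims=True) + eps)
    return -(p * np.log(p + eps)).sum(axis=0)

def select_language_specific_neurons(
    activation_probs: np.ndarray,
    lape_percentile: float = 1.0,      # assumed: keep the lowest-entropy 1%
    min_activation_prob: float = 0.2,  # assumed per-language firing threshold
) -> dict:
    """Return {language_index: [neuron indices]} for low-entropy neurons."""
    scores = lape_scores(activation_probs)
    cutoff = np.percentile(scores, lape_percentile)
    candidates = np.where(scores <= cutoff)[0]
    per_language = {}
    for lang in range(activation_probs.shape[0]):
        mask = activation_probs[lang, candidates] >= min_activation_prob
        per_language[lang] = candidates[mask].tolist()
    return per_language

if __name__ == "__main__":
    # Toy usage: 3 languages, 8 neurons with random firing probabilities.
    rng = np.random.default_rng(0)
    probs = rng.uniform(0.0, 0.5, size=(3, 8))
    probs[0, 2] = 0.9  # make neuron 2 fire mostly for language 0
    print(select_language_specific_neurons(probs, lape_percentile=25.0))
```

In the paper's setting, the neurons selected this way would then be the targets of neuron-specific interventions such as LoRA fine-tuning restricted to those units; the finding reported above is that such targeting does not translate into cross-lingual gains on XNLI or XQuAD for low-resource languages.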
Similar Papers
How does Alignment Enhance LLMs' Multilingual Capabilities? A Language Neurons Perspective
Computation and Language
Helps computers learn many languages better.
Cross-Lingual Generalization and Compression: From Language-Specific to Shared Neurons
Computation and Language
Computers learn to understand words in many languages.
Language Arithmetics: Towards Systematic Language Neuron Identification and Manipulation
Computation and Language
Controls AI language use, making it better.