Semantic Pivots Enable Cross-Lingual Transfer in Large Language Models
By: Kaiyu He, Tong Zhou, Yubo Chen, et al.
Potential Business Impact:
Helps computers understand and translate across languages better.
Large language models (LLMs) demonstrate remarkable ability on cross-lingual tasks, and understanding how they acquire this ability is crucial for their interpretability. To quantify the cross-lingual ability of LLMs accurately, we propose a Word-Level Cross-Lingual Translation Task. To investigate how LLMs learn this ability, we trace the outputs of LLMs' intermediate layers during the word translation task and identify two distinct behaviors in the forward pass: co-occurrence behavior and semantic pivot behavior. We attribute these two behaviors to the co-occurrence frequency of word pairs, and we locate the semantic pivots in the pre-training dataset. Finally, to apply our findings to improving the cross-lingual ability of LLMs, we reconstruct a semantic pivot-aware pre-training dataset from documents with a high proportion of semantic pivots. Our experiments validate the effectiveness of this approach in enhancing cross-lingual ability. Our research contributes insights into the interpretability of LLMs and offers a practical method for improving their cross-lingual ability.
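To make the layer-tracing idea concrete, below is a minimal sketch, not the authors' released code, of inspecting what an LLM "predicts" at each intermediate layer during a word-translation prompt, in the spirit of the logit lens: every hidden state is projected through the final unembedding matrix to see which vocabulary token it favors at that depth. The model name, prompt format, and GPT-2-specific module names (`transformer.ln_f`) are assumptions; the paper's models and prompts may differ.

```python
# Sketch: trace intermediate-layer predictions on a word-translation prompt.
# Assumes a GPT-2-style causal LM from Hugging Face transformers.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # assumption: stand-in for whatever LLM is being probed
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
model.eval()

# Hypothetical word-level translation prompt (French -> English).
prompt = 'The French word "chat" means in English: "'
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    outputs = model(**inputs, output_hidden_states=True)

# outputs.hidden_states: tuple of (num_layers + 1) tensors, each [batch, seq, hidden]
unembed = model.get_output_embeddings().weight  # [vocab, hidden]
final_norm = model.transformer.ln_f             # GPT-2's final layer norm

for layer_idx, hidden in enumerate(outputs.hidden_states):
    last = final_norm(hidden[0, -1])   # normalize the last position's state
    logits = last @ unembed.T          # project into vocabulary space
    top_token = tokenizer.decode(logits.argmax().item())
    print(f"layer {layer_idx:2d}: {top_token!r}")
```

On a trace like this, the two behaviors the paper distinguishes would show up as which tokens dominate at intermediate depths, for example tokens that frequently co-occur with the source word versus an intermediate semantically pivotal representation that emerges before the target-language word.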
Similar Papers
Can you map it to English? The Role of Cross-Lingual Alignment in Multilingual Performance of LLMs
Computation and Language
Helps computers understand many languages without extra training.
Enhancing LLM Language Adaption through Cross-lingual In-Context Pre-training
Computation and Language
Helps computers understand many languages better.
Languages in Multilingual Speech Foundation Models Align Both Phonetically and Semantically
Computation and Language
Helps computers understand many languages by listening.