Can you map it to English? The Role of Cross-Lingual Alignment in Multilingual Performance of LLMs

Published: April 13, 2025 | arXiv ID: 2504.09378v2

By: Kartik Ravisankar, Hyojung Han, Marine Carpuat

Potential Business Impact:

Helps computers understand many languages without extra training.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large language models (LLMs) pre-trained predominantly on English text exhibit surprising multilingual capabilities, yet the mechanisms driving cross-lingual generalization remain poorly understood. This work investigates how the alignment of representations for text written in different languages correlates with LLM performance on natural language understanding tasks and translation tasks, both at the language and the instance level. For this purpose, we introduce cross-lingual alignment metrics such as the Discriminative Alignment Index (DALI) to quantify the alignment at an instance level for discriminative tasks. Through experiments on three natural language understanding tasks (Belebele, XStoryCloze, XCOPA), and machine translation, we find that while cross-lingual alignment metrics strongly correlate with task accuracy at the language level, the sample-level alignment often fails to distinguish correct from incorrect predictions, exposing alignment as a necessary but insufficient condition for success.
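The abstract describes instance-level alignment metrics for discriminative tasks, such as the Discriminative Alignment Index (DALI). The paper's exact formulation is not given in this summary, but the idea of a discriminative, instance-level alignment check can be sketched as follows: an instance counts as "aligned" if a sentence's representation is closer to its translation than to any distractor. All names and vectors below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def cosine(u, v):
    # Cosine similarity between two embedding vectors.
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

def discriminative_alignment(src_emb, tgt_emb, distractor_embs):
    """Hypothetical instance-level alignment check (in the spirit of DALI):
    returns 1 if the source embedding is closer to its translation than
    to every distractor, else 0."""
    target_sim = cosine(src_emb, tgt_emb)
    distractor_sims = [cosine(src_emb, d) for d in distractor_embs]
    return 1 if target_sim > max(distractor_sims) else 0

# Toy vectors standing in for a model's hidden states of parallel sentences.
en = np.array([1.0, 0.2, 0.1])
fr_aligned = np.array([0.9, 0.3, 0.1])    # translation of `en`
fr_distract = np.array([-0.5, 1.0, 0.4])  # unrelated sentence

print(discriminative_alignment(en, fr_aligned, [fr_distract]))  # 1
```

Averaging such binary scores per language gives a language-level alignment rate, which the paper finds correlates with task accuracy even though the sample-level signal alone does not separate correct from incorrect predictions.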

Country of Origin
🇺🇸 United States

Repos / Data Links

Page Count
33 pages

Category
Computer Science:
Computation and Language