Cross-Lingual Transfer of Cultural Knowledge: An Asymmetric Phenomenon
By: Chen Zhang, Zhiyuan Liao, Yansong Feng
Potential Business Impact:
Computers learn different cultures by reading many languages.
Despite substantial research efforts evaluating how well large language models (LLMs) handle global cultural diversity, the mechanisms behind their cultural knowledge acquisition, particularly in multilingual settings, remain unclear. We study this question by investigating how cultural knowledge transfers across languages during language adaptation of LLMs. We introduce an interpretable framework for studying this transfer, ensuring training data transparency and controlling transfer effects. Through a study of four non-Anglophonic cultures, we observe bidirectional cultural transfer between English and other high-resource languages, while low-resource languages primarily transfer knowledge to English with limited reverse flow. To explain this asymmetric phenomenon, we propose a frequency-based hypothesis: cultural knowledge appearing more frequently in the pretraining data transfers more easily, which is supported by empirical analysis of the training corpora.
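One way a frequency-based hypothesis of this kind could be checked is a rank correlation between how often a cultural fact appears in the training corpus and how much accuracy on that fact improves in the other language. The sketch below is illustrative only and is not the authors' analysis: the helper names `ranks` and `spearman` and every number in `toy_items` are hypothetical, chosen to show the computation rather than to report any result.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(xs), ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy values, NOT results from the paper.
# Each pair: (occurrences of a cultural fact in the training corpus,
#             accuracy gain on that fact when queried in the other language).
toy_items = [(3, 0.01), (12, 0.02), (40, 0.06), (150, 0.05), (900, 0.14)]
freqs = [f for f, _ in toy_items]
gains = [g for _, g in toy_items]

rho = spearman(freqs, gains)
print(f"Spearman rho on toy data: {rho:.2f}")
```

A rank correlation is used here because corpus frequencies are typically heavy-tailed, so only the ordering of facts by frequency is compared with the ordering by gain. A value near 1 on real measurements would be consistent with the hypothesis, though it would not by itself establish a causal link.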
Similar Papers
Localized Cultural Knowledge is Conserved and Controllable in Large Language Models
Computation and Language
Makes computers speak other languages like locals.
On the Acquisition of Shared Grammatical Representations in Bilingual Language Models
Computation and Language
Teaches computers to understand many languages better.
Cross-Cultural Transfer of Commonsense Reasoning in LLMs: Evidence from the Arab World
Artificial Intelligence
Teaches computers about different cultures easily.