Score: 1

Are the LLMs Capable of Maintaining at Least the Language Genus?

Published: October 24, 2025 | arXiv ID: 2510.21561v1

By: Sandra Mitrović, David Kletz, Ljiljana Dolamic and more

Potential Business Impact:

Language models tend to answer more consistently in genealogically related languages, though training data coverage remains the dominant factor.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

Large Language Models (LLMs) display notable variation in multilingual behavior, yet the role of genealogical language structure in shaping this variation remains underexplored. In this paper, we investigate whether LLMs exhibit sensitivity to linguistic genera by extending prior analyses on the MultiQ dataset. We first examine whether models prefer to switch to genealogically related languages when prompt-language fidelity is not maintained. Next, we investigate whether knowledge consistency is better preserved within than across genera. We show that genus-level effects are present but strongly conditioned by training resource availability. We further observe distinct multilingual strategies across LLM families. Our findings suggest that LLMs encode aspects of genus-level structure, but training data imbalances remain the primary factor shaping their multilingual performance.
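The first analysis can be illustrated with a small sketch (a toy genus mapping and example data, not the paper's actual setup): given prompt/response language pairs where a model failed to stay in the prompt language, count how often the switch lands in a language of the same genus.

```python
# Minimal sketch (not the paper's code): given (prompt_lang, response_lang) pairs
# where the model did NOT answer in the prompt language, measure how often the
# switch stays within the same language genus. The mapping below is a tiny
# illustrative subset; a real analysis would use a full WALS-style genus table.

GENUS = {  # ISO 639-1 code -> genus (illustrative, incomplete)
    "de": "Germanic", "en": "Germanic", "sv": "Germanic",
    "fr": "Romance",  "es": "Romance",  "it": "Romance",
    "ru": "Slavic",   "pl": "Slavic",   "cs": "Slavic",
}

def same_genus_switch_rate(pairs):
    """pairs: iterable of (prompt_lang, response_lang), response_lang != prompt_lang."""
    scored = [(p, r) for p, r in pairs if p in GENUS and r in GENUS]
    if not scored:
        return float("nan")
    same = sum(1 for p, r in scored if GENUS[p] == GENUS[r])
    return same / len(scored)

if __name__ == "__main__":
    # Toy switches: a German prompt answered in English, a Polish prompt in Russian, etc.
    switches = [("de", "en"), ("pl", "ru"), ("fr", "en"), ("es", "it")]
    print(f"Same-genus switch rate: {same_genus_switch_rate(switches):.2f}")
```

A rate well above what random language switching would produce would indicate genus-level sensitivity; the paper's finding is that such effects exist but are strongly modulated by how well each language is represented in training data.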

Country of Origin
🇨🇭 Switzerland

Repos / Data Links

Page Count
17 pages

Category
Computer Science:
Computation and Language