
Study of scaling laws in language families

Published: April 2, 2025 | arXiv ID: 2504.01681v1

By: Maelyson R. F. Santos, Marcelo A. F. Gomes

Potential Business Impact:

Finds patterns in how languages grow.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

This article investigates scaling laws within language families, using data from more than six thousand languages and analyzing the patterns that emerge in Zipf-like classification graphs. It examines both macroscopic aspects (the number of languages per family) and microscopic aspects (the number of speakers per language within a family) of these classifications. Most notably, it finds a distinct division among the fourteen largest contemporary language families once Afro-Asiatic and Nilo-Saharan are excluded: the remaining twelve families fall into three quadruplets, each characterized by a significantly different exponent in the Zipf graphs. This division sheds light on the structure and organization of the major language families and on how linguistic diversity is distributed.
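
As a rough illustration of what an exponent in a Zipf graph means, the sketch below ranks a list of sizes from largest to smallest and fits a straight line to log(size) against log(rank); the negated slope is the exponent. This is a minimal sketch under stated assumptions, not the authors' procedure: the abstract does not say how the exponents were fitted, the function name `zipf_exponent` is hypothetical, ordinary least squares is just one common choice, and the input is synthetic data generated from a known exponent rather than real language counts.

```python
import math


def zipf_exponent(sizes):
    """Estimate a Zipf-like exponent by least squares on a log-log rank plot.

    Hypothetical helper for illustration only; the fitting method used in
    the paper is not stated in the abstract.
    """
    # Rank the positive sizes from largest (rank 1) to smallest.
    ranked = sorted((s for s in sizes if s > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive sizes")

    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # size ~ rank ** (-exponent), so the exponent is the negated slope.
    return -sxy / sxx


# Synthetic sizes that follow an exact power law with exponent 1.5
# (assumed values, not data from the paper).
synthetic = [1000.0 * rank ** -1.5 for rank in range(1, 51)]
estimated = zipf_exponent(synthetic)
print(round(estimated, 3))
```

Running the same function on two different size lists, for example languages per family versus speakers per language within one family, would give exponents that can be compared directly, which is the kind of comparison the abstract describes between the three quadruplets.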

Page Count
10 pages

Category
Physics:
Physics and Society