Study of scaling laws in language families
By: Maelyson R. F. Santos, Marcelo A. F. Gomes
Potential Business Impact:
Finds patterns in how languages grow.
This article investigates scaling laws within language families, using data from over six thousand languages and analyzing the patterns that emerge in Zipf-like classification graphs. Both macroscopic aspects (the number of languages per family) and microscopic aspects (the number of speakers per language within a family) of these classifications are examined. Particularly noteworthy is the discovery of a distinct division among the fourteen largest contemporary language families, excluding the Afro-Asiatic and Nilo-Saharan families. These families are found to be distributed across three language family quadruplets, each characterized by a significantly different exponent in the Zipf graphs. This finding sheds light on the underlying structure and organization of major language families and on the nature of linguistic diversity and distribution.
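The exponent of a Zipf-like graph is commonly estimated by ranking items by size and fitting a straight line in log-log space. The following is a minimal sketch of that general technique, not the authors' procedure: the function name `zipf_exponent` is hypothetical, the input is synthetic rather than real language data, and an ordinary least-squares fit is assumed for simplicity.

```python
import math


def zipf_exponent(sizes):
    """Estimate alpha in size ~ C * rank**(-alpha) by least squares on log-log data.

    `sizes` could be, for example, speakers per language within one family
    (the microscopic view) or languages per family (the macroscopic view).
    """
    # Rank the positive sizes from largest (rank 1) to smallest.
    ranked = sorted((s for s in sizes if s > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive sizes")

    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(size) for size in ranked]

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    # The fitted slope is -alpha, so flip the sign.
    return -(sxy / sxx)


# Synthetic, illustrative data only: an exact power law with exponent 1.5.
synthetic_sizes = [1000.0 / rank ** 1.5 for rank in range(1, 21)]
print(zipf_exponent(synthetic_sizes))
```

Comparing the exponents obtained this way for different families is one way to see whether they cluster into groups. A least-squares fit on log-log data is only a rough estimator, so a careful analysis would likely also report uncertainties or use a maximum-likelihood fit.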
Similar Papers
From Zipf's Law to Neural Scaling through Heaps' Law and Hilberg's Hypothesis
Information Theory
Makes AI understand language better by finding patterns.
Zipf Distributions from Two-Stage Symbolic Processes: Stability Under Stochastic Lexical Filtering
Methodology
Explains why some words are common, others rare.
Random Text, Zipf's Law, Critical Length, and Implications for Large Language Models
Computation and Language
Explains why words appear often or rarely.