Score: 2

Quantifying Phonosemantic Iconicity Distributionally in 6 Languages

Published: October 15, 2025 | arXiv ID: 2510.14040v1

By: George Flint, Kaustubh Kislay

BigTech Affiliations: University of California, Berkeley

Potential Business Impact:

Words sound like what they mean.

Business Areas:
Semantic Web Internet Services

Language is, as commonly theorized, largely arbitrary. Yet systematic relationships between phonetics and semantics have been observed in many specific cases. To what degree do those systematic relationships manifest themselves in large-scale quantitative investigations, both in previously identified and in unidentified phenomena? This work takes a distributional approach to quantifying phonosemantic iconicity at scale across 6 diverse languages (English, Spanish, Hindi, Finnish, Turkish, and Tamil). In each language, we analyze the alignment of morphemes' phonetic and semantic similarity spaces with a suite of statistical measures, and discover an array of interpretable phonosemantic alignments not previously identified in the literature, along with crosslinguistic patterns. We also analyze 5 previously hypothesized phonosemantic alignments, finding support for some such alignments and mixed results for others.
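To make "alignment of phonetic and semantic similarity spaces" concrete, here is a minimal sketch of one generic way such alignment can be measured: rank-correlate the pairwise similarities from the two spaces and assess the result with a Mantel-style permutation test. This is not taken from the paper, whose specific statistical measures are not listed in the abstract. The item names, the toy vectors, and the function names (`pairwise_similarities`, `mantel_test`) are all hypothetical and serve only to illustrate the idea.

```python
import itertools
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def pairwise_similarities(vectors, order):
    """Cosine similarity for every unordered pair of items, in a fixed pair order."""
    return [cosine(vectors[order[i]], vectors[order[j]])
            for i, j in itertools.combinations(range(len(order)), 2)]


def ranks(values):
    """1-based ranks; tied values share their mean rank."""
    indexed = sorted(range(len(values)), key=lambda k: values[k])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(indexed):
        end = pos
        while end + 1 < len(indexed) and values[indexed[end + 1]] == values[indexed[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in indexed[pos:end + 1]:
            result[k] = mean_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def spearman(x, y):
    """Spearman correlation, computed as Pearson correlation of ranks."""
    return pearson(ranks(x), ranks(y))


def mantel_test(phonetic, semantic, permutations=999, seed=0):
    """Correlate the two similarity spaces; permute item labels for a one-sided p-value."""
    items = sorted(phonetic)
    sem_sims = pairwise_similarities(semantic, items)
    observed = spearman(pairwise_similarities(phonetic, items), sem_sims)
    rng = random.Random(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        shuffled = items[:]
        rng.shuffle(shuffled)  # relabel items in the phonetic space only
        permuted = spearman(pairwise_similarities(phonetic, shuffled), sem_sims)
        if permuted >= observed:
            at_least_as_large += 1
    return observed, (at_least_as_large + 1) / (permutations + 1)


# Invented toy vectors for five hypothetical items (not real linguistic data).
PHONETIC = {"m1": [1.0, 0.1, 0.0], "m2": [0.9, 0.2, 0.1], "m3": [0.1, 1.0, 0.2],
            "m4": [0.0, 0.9, 0.3], "m5": [0.2, 0.1, 1.0]}
SEMANTIC = {"m1": [0.8, 0.0, 0.2], "m2": [1.0, 0.1, 0.0], "m3": [0.2, 0.9, 0.1],
            "m4": [0.1, 1.0, 0.0], "m5": [0.0, 0.3, 1.0]}

rho, p_value = mantel_test(PHONETIC, SEMANTIC)
print(f"Spearman rho = {rho:.3f}, permutation p = {p_value:.3f}")
```

Permuting item labels, rather than shuffling individual pair similarities, is the usual choice here because the pairwise values are not independent: each item contributes to several pairs. With real data, the vectors would come from a phonetic feature encoding and a distributional semantic model, both of which are outside the scope of this sketch.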

Country of Origin
🇺🇸 United States

Page Count
19 pages

Category
Computer Science: Computation and Language