Unstable Grounds for Beautiful Trees? Testing the Robustness of Concept Translations in the Compilation of Multilingual Wordlists

Published: March 1, 2025 | arXiv ID: 2503.00464v1

By: David Snee, Luca Ciucci, Arne Rubehn and more

Potential Business Impact:

Makes sure word translations are accurate for language studies.

Business Areas:

A/B Testing, Data and Analytics

Multilingual wordlists play a crucial role in comparative linguistics. While many studies have been carried out to test the power of computational methods for language subgrouping or divergence time estimation, few studies have put the data upon which these studies are based to a rigorous test. Here, we conduct a first experiment that tests the robustness of concept translation as an integral part of the compilation of multilingual wordlists. Investigating the variation in concept translations in independently compiled wordlists from 10 dataset pairs covering 9 different language families, we find that on average, only 83% of all translations yield the same word form, while identical forms in terms of phonetic transcriptions can only be found in 23% of all cases. Our findings can prove important when trying to assess the uncertainty of phylogenetic studies and the conclusions derived from them.

Most over-representation of phonological features in basic vocabulary disappears when controlling for spatial and phylogenetic effects

Computation and Language

Makes language sound patterns less common than thought.

8 Dec 2025 0

86%

Resource-sensitive but language-blind: Community size and not grammatical complexity better predicts the accuracy of Large Language Models in a novel Wug Test

Computation and Language

Computers learn new words like people, but for data.

14 Oct 2025 0

85%

Facts are Harder Than Opinions -- A Multilingual, Comparative Analysis of LLM-Based Fact-Checking Reliability

Computers and Society

Helps computers spot fake news in many languages.

4 Jun 2025 1

Page Count

12 pages
