Factual Inconsistencies in Multilingual Wikipedia Tables
By: Silvia Cappa, Lingxiao Kong, Pille-Riin Peet and more
Potential Business Impact:
Finds conflicting facts in Wikipedia tables across languages.
Wikipedia serves as a globally accessible knowledge source with content in over 300 languages. Although they cover the same topics, the different language versions of Wikipedia are written and updated independently. This leads to factual inconsistencies that can undermine the neutrality and reliability of both the encyclopedia and the AI systems that often rely on Wikipedia as a main training source. This study investigates cross-lingual inconsistencies in Wikipedia's structured content, with a focus on tabular data. We develop a methodology to collect, align, and analyze tables from multilingual Wikipedia articles, and we define categories of inconsistency. We apply quantitative and qualitative metrics to assess multilingual alignment on a sample dataset. The resulting insights have implications for factual verification, multilingual knowledge interaction, and the design of reliable AI systems that leverage Wikipedia content.
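To make the idea of comparing aligned tables concrete, here is a minimal Python sketch. It is not the authors' method: it assumes the tables from two language versions have already been aligned on a shared row key and shared column names, and the labels `missing_row`, `missing_value`, and `value_mismatch` are hypothetical stand-ins for whatever inconsistency categories the study defines. The sample data is made-up toy data.

```python
from typing import Dict, List, Tuple

# Hypothetical representation: row key -> {column name -> cell value}.
Table = Dict[str, Dict[str, str]]


def normalize(value: str) -> str:
    """Collapse whitespace and ignore case so trivial differences are not flagged."""
    return " ".join(value.split()).casefold()


def compare_tables(a: Table, b: Table) -> List[Tuple[str, str, str]]:
    """Return (row key, column, label) for every difference between two aligned tables."""
    findings: List[Tuple[str, str, str]] = []
    for key in sorted(set(a) | set(b)):
        if key not in a or key not in b:
            # The row exists in only one language version.
            findings.append((key, "*", "missing_row"))
            continue
        for col in sorted(set(a[key]) | set(b[key])):
            va, vb = a[key].get(col), b[key].get(col)
            if va is None or vb is None:
                # The cell exists in only one language version.
                findings.append((key, col, "missing_value"))
            elif normalize(va) != normalize(vb):
                # Both versions state a value, but the values disagree.
                findings.append((key, col, "value_mismatch"))
    return findings


# Toy data standing in for the same table in two language versions.
version_a: Table = {
    "Alpha": {"population": "1000", "area": "12.5"},
    "Beta": {"population": "2000"},
}
version_b: Table = {
    "Alpha": {"population": "1100", "area": "12.5"},
    "Gamma": {"population": "3000"},
}

result = compare_tables(version_a, version_b)
for row_key, column, label in result:
    print(row_key, column, label)
```

Exact string comparison after light normalization is the simplest possible choice. Real tables would likely also need number and date parsing, unit handling, and translation of cell text before values can be compared fairly.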
Similar Papers
Utilizing citation index and synthetic quality measure to compare Wikipedia languages across various topics
Information Retrieval
Finds best Wikipedia articles across languages.
How Similar Are Grokipedia and Wikipedia? A Multi-Dimensional Textual and Structural Comparison
Computers and Society
AI encyclopedia shows right-wing bias in some topics.
How Similar Are Grokipedia and Wikipedia? A Multi-Dimensional Textual and Structural Comparison
Computers and Society
AI encyclopedia writes longer, fewer facts.