Diagnosing and Mitigating Semantic Inconsistencies in Wikidata's Classification Hierarchy
By: Shixiong Zhao, Hideaki Takeda
Potential Business Impact:
Finds and helps fix classification mistakes in a giant open knowledge graph.
Wikidata is currently the largest open knowledge graph on the web, encompassing over 120 million entities. It integrates data from various domain-specific databases and imports a substantial amount of content from Wikipedia, while also allowing users to freely edit its content. This openness has positioned Wikidata as a central resource in knowledge graph research and has enabled convenient knowledge access for users worldwide. However, its relatively loose editorial policy has also led to a degree of taxonomic inconsistency. Building on prior work, this study proposes and applies a novel validation method to confirm the presence of classification errors, over-generalized subclass links, and redundant connections in specific domains of Wikidata. We further introduce a new evaluation criterion for determining whether such issues warrant correction and develop a system that allows users to inspect the taxonomic relationships of arbitrary Wikidata entities, leveraging the platform's crowdsourced nature to its full potential.
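To make the notion of a "redundant connection" concrete, here is a minimal Python sketch. It is not the authors' validation method, which the abstract does not describe. It assumes one common reading of redundancy: a direct subclass link is redundant when the same parent is already reachable through a longer chain of subclass links. The function name `find_redundant_links` and the toy labels are hypothetical and are not taken from Wikidata.

```python
from collections import defaultdict


def find_redundant_links(edges):
    """Return direct (child, parent) links that are also implied by a longer path.

    Assumption: 'redundant' means the parent stays reachable from the child
    after the direct link itself is ignored.
    """
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    def reachable(start, target, skip_edge):
        # Iterative depth-first search that ignores the direct edge under test.
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            for nxt in parents.get(node, ()):
                if (node, nxt) == skip_edge:
                    continue
                if nxt == target:
                    return True
                stack.append(nxt)
        return False

    return sorted(e for e in set(edges) if reachable(e[0], e[1], e))


# Toy hierarchy with hypothetical labels (not real Wikidata data).
edges = [
    ("poodle", "dog"),
    ("dog", "mammal"),
    ("mammal", "animal"),
    ("poodle", "mammal"),  # also implied by poodle -> dog -> mammal
]
redundant = find_redundant_links(edges)
print(redundant)
```

On real data the edge list would come from Wikidata's "subclass of" statements; whether a flagged link actually warrants removal is a separate judgment, which is the role of the evaluation criterion the abstract mentions.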
Similar Papers
Factual Inconsistencies in Multilingual Wikipedia Tables
Computation and Language
Fixes Wikipedia facts across languages.
A Multi-Axial Mindset for Ontology Design: Lessons from Wikidata's Polyhierarchical Structure
Artificial Intelligence
Lets computers organize knowledge in many ways.
Wikontic: Constructing Wikidata-Aligned, Ontology-Aware Knowledge Graphs with Large Language Models
Computation and Language
Builds smarter computer brains from text.