Diagnosing and Mitigating Semantic Inconsistencies in Wikidata's Classification Hierarchy

Published: November 7, 2025 | arXiv ID: 2511.04926v1

By: Shixiong Zhao, Hideaki Takeda

Potential Business Impact:

Finds and helps fix classification mistakes in Wikidata, a giant open knowledge base.

Business Areas:
Semantic Web Internet Services

Wikidata is currently the largest open knowledge graph on the web, encompassing over 120 million entities. It integrates data from various domain-specific databases and imports a substantial amount of content from Wikipedia, while also allowing users to freely edit its content. This openness has positioned Wikidata as a central resource in knowledge graph research and has enabled convenient knowledge access for users worldwide. However, its relatively loose editorial policy has also led to a degree of taxonomic inconsistency. Building on prior work, this study proposes and applies a novel validation method to confirm the presence of classification errors, over-generalized subclass links, and redundant connections in specific domains of Wikidata. We further introduce a new evaluation criterion for determining whether such issues warrant correction and develop a system that allows users to inspect the taxonomic relationships of arbitrary Wikidata entities, leveraging the platform's crowdsourced nature to its full potential.
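
The abstract does not spell out the validation method, so the following is only a generic illustration of what a "redundant connection" can mean in a subclass hierarchy, not the authors' approach. It assumes a hierarchy given as (child, parent) pairs, as one might export from Wikidata's "subclass of" (P279) statements, and flags a direct link as redundant when the same parent is still reachable through a longer path. The function name `redundant_subclass_links` and the toy labels are hypothetical.

```python
from collections import defaultdict


def redundant_subclass_links(edges):
    """Return direct (child, parent) links that are also implied by a longer path.

    `edges` is an iterable of (child, parent) pairs. This is an assumed,
    simplified input format, not Wikidata's actual data model.
    """
    edges = set(edges)
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)

    def ancestors_without(start, skip_edge):
        # Collect every ancestor of `start` while ignoring one specific edge.
        seen, stack = set(), [start]
        while stack:
            node = stack.pop()
            for p in parents.get(node, ()):
                if (node, p) == skip_edge or p in seen:
                    continue
                seen.add(p)
                stack.append(p)
        return seen

    # A link is redundant if its parent is still reachable once the link is ignored.
    return sorted(
        (c, p) for c, p in edges if p in ancestors_without(c, (c, p))
    )


# Toy hierarchy (illustrative labels, not real Wikidata items).
toy_edges = [
    ("dog", "mammal"),
    ("mammal", "animal"),
    ("dog", "animal"),  # already implied by dog -> mammal -> animal
]
print(redundant_subclass_links(toy_edges))  # [('dog', 'animal')]
```

Whether a flagged link should actually be removed is a separate question, which is the kind of judgment the evaluation criterion mentioned in the abstract is meant to address.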

Page Count
13 pages

Category
Computer Science:
Computation and Language