Diagnosing and Mitigating Semantic Inconsistencies in Wikidata's Classification Hierarchy
Shixiong Zhao, Hideaki Takeda

TL;DR
This paper presents a new validation method and an inspection system for detecting semantic inconsistencies and errors in Wikidata's classification hierarchy, with the aim of improving data quality and reliability.
Contribution
It introduces a novel validation approach and an inspection system to identify and evaluate classification errors in Wikidata's taxonomy.
Findings
Detected numerous classification errors and redundant links
Developed a system enabling user inspection of taxonomic relationships
Proposed criteria to prioritize corrections in Wikidata
Abstract
Wikidata is currently the largest open knowledge graph on the web, encompassing over 120 million entities. It integrates data from various domain-specific databases and imports a substantial amount of content from Wikipedia, while also allowing users to freely edit its content. This openness has positioned Wikidata as a central resource in knowledge graph research and has enabled convenient knowledge access for users worldwide. However, its relatively loose editorial policy has also led to a degree of taxonomic inconsistency. Building on prior work, this study proposes and applies a novel validation method to confirm the presence of classification errors, over-generalized subclass links, and redundant connections in specific domains of Wikidata. We further introduce a new evaluation criterion for determining whether such issues warrant correction and develop a system that allows users…
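To make the notion of a "redundant connection" concrete, the sketch below flags a direct subclass link as redundant when the same parent is already reachable from the child through a longer chain of subclass links. This is only an illustration under stated assumptions, not the validation method proposed in the paper: the function name `redundant_links`, the dictionary representation, and the toy class labels are all hypothetical, and real Wikidata data (where subclass links are expressed with property P279) would need to be loaded separately.

```python
def redundant_links(subclass_of):
    """Return direct subclass links that are implied by a longer path.

    `subclass_of` maps each class to the set of its direct superclasses
    (a hypothetical in-memory stand-in for Wikidata subclass statements).
    """
    redundant = []
    for child, parents in subclass_of.items():
        for parent in parents:
            # Start from the child's other direct parents, so the direct
            # link child -> parent itself is never used in the search.
            stack = [p for p in parents if p != parent]
            # Mark the child as seen so a cycle cannot lead back to it
            # and reuse the direct link under test.
            seen = set(stack) | {child}
            while stack:
                node = stack.pop()
                if node == parent:
                    redundant.append((child, parent))
                    break
                for nxt in subclass_of.get(node, ()):
                    if nxt not in seen:
                        seen.add(nxt)
                        stack.append(nxt)
    return sorted(redundant)


# Toy hierarchy with made-up labels (not real Wikidata items).
example = {
    "poodle": {"dog", "animal"},
    "dog": {"mammal"},
    "mammal": {"animal"},
}

print(redundant_links(example))
```

In the toy hierarchy, the direct link from "poodle" to "animal" is flagged because "animal" is also reachable through "dog" and "mammal". Whether such a link should actually be removed is a separate question, which is what the evaluation criterion mentioned in the abstract addresses.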
Taxonomy
Topics: Advanced Graph Neural Networks · Wikis in Education and Collaboration · Semantic Web and Ontologies
