Enriching Wikidata with Linked Open Data
Bohui Zhang, Filip Ilievski, Pedro Szekely

TL;DR
This paper proposes a workflow for enriching Wikidata with external linked open data (LOD) sources. The workflow fills gaps in Wikidata with high-quality statements at large scale, and is demonstrated on DBpedia and Getty.
Contribution
A novel workflow for enriching Wikidata with external LOD sources, consisting of gap detection, source selection, schema alignment, and semantic validation, aimed at improving data completeness and quality.
Findings
Enriched Wikidata with millions of new statements
High-quality integration from DBpedia and Getty sources
Schema alignment and semantic validation are crucial for success
Abstract
Large public knowledge graphs, like Wikidata, contain billions of statements about tens of millions of entities, thus inspiring various use cases to exploit such knowledge graphs. However, practice shows that much of the relevant information that fits users' needs is still missing in Wikidata, while current linked open data (LOD) tools are not suitable to enrich large graphs like Wikidata. In this paper, we investigate the potential of enriching Wikidata with structured data sources from the LOD cloud. We present a novel workflow that includes gap detection, source selection, schema alignment, and semantic validation. We evaluate our enrichment method with two complementary LOD sources: a noisy source with broad coverage, DBpedia, and a manually curated source with a narrow focus on the art domain, Getty. Our experiments show that our workflow can enrich Wikidata with millions of novel…
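The four steps named in the abstract (gap detection, source selection, schema alignment, semantic validation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every identifier (`wd:E1`, `ext:birthDate`, and so on), every function name, and all of the data are hypothetical and exist only for illustration. Source selection is reduced here to checking that an external entity has a known link to a target entity, and semantic validation to a simple ISO-date check.

```python
from datetime import date

# A statement is a (subject, property, value) tuple. All identifiers are made up.
Triple = tuple[str, str, str]


def detect_gaps(target: list[Triple], entities: list[str], properties: list[str]) -> set[tuple[str, str]]:
    """Gap detection: (entity, property) pairs with no value in the target graph."""
    present = {(s, p) for s, p, _ in target}
    return {(e, p) for e in entities for p in properties if (e, p) not in present}


def align_schema(external: list[Triple], entity_map: dict[str, str], property_map: dict[str, str]) -> list[Triple]:
    """Source selection (simplified) plus schema alignment: keep only external
    triples whose entity and property both map to target identifiers."""
    return [
        (entity_map[s], property_map[p], v)
        for s, p, v in external
        if s in entity_map and p in property_map
    ]


def is_iso_date(value: str) -> bool:
    """Toy stand-in for semantic validation: accept only YYYY-MM-DD dates."""
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def enrich(target, external, entities, properties, entity_map, property_map, validators):
    """Return validated statements from the external source that fill a gap."""
    gaps = detect_gaps(target, entities, properties)
    aligned = align_schema(external, entity_map, property_map)
    candidates = [t for t in aligned if (t[0], t[1]) in gaps]
    return [t for t in candidates if validators[t[1]](t[2])]


# Hypothetical toy data.
target = [("wd:E1", "wd:P_birth", "1900-01-01")]
external = [
    ("ext:e1", "ext:birthDate", "1900-01-01"),  # target already has this property
    ("ext:e2", "ext:birthDate", "1950-05-17"),  # fills a gap
    ("ext:e3", "ext:birthDate", "unknown"),     # fails validation
    ("ext:e4", "ext:birthDate", "1960-02-02"),  # no link to a target entity
]
entity_map = {"ext:e1": "wd:E1", "ext:e2": "wd:E2", "ext:e3": "wd:E3"}
property_map = {"ext:birthDate": "wd:P_birth"}
validators = {"wd:P_birth": is_iso_date}

new_statements = enrich(
    target, external,
    entities=["wd:E1", "wd:E2", "wd:E3"],
    properties=["wd:P_birth"],
    entity_map=entity_map,
    property_map=property_map,
    validators=validators,
)
print(new_statements)
```

In this toy run only the statement for `wd:E2` survives: the other three candidates are dropped because the value already exists, the value fails the check, or the entity has no link. A real pipeline over a graph the size of Wikidata would need far richer alignment and validation than these dictionary lookups.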
Taxonomy
Topics: Semantic Web and Ontologies · Advanced Graph Neural Networks · Natural Language Processing Techniques
