Machine Knowledge: Creation and Curation of Comprehensive Knowledge Bases
Gerhard Weikum, Luna Dong, Simon Razniewski, Fabian Suchanek

TL;DR
This paper surveys fundamental concepts and practical methods for creating, organizing, and curating large-scale knowledge bases, and emphasizes their importance for AI applications such as question answering and data analytics.
Contribution
It provides a comprehensive overview of models and techniques for entity discovery, canonicalization, taxonomy organization, and knowledge curation, including case studies.
Findings
Methods for discovering and canonicalizing entities and organizing their semantic types into clean taxonomies.
Techniques for automatically extracting entity-centric properties.
Strategies for long-term knowledge base maintenance and quality assurance.
Abstract
Equipping machines with comprehensive knowledge of the world's entities and their relationships has been a long-standing goal of AI. Over the last decade, large-scale knowledge bases, also known as knowledge graphs, have been automatically constructed from web contents and text sources, and have become a key asset for search engines. This machine knowledge can be harnessed to semantically interpret textual phrases in news, social media and web tables, and contributes to question answering, natural language processing and data analytics. This article surveys fundamental concepts and practical methods for creating and curating large knowledge bases. It covers models and methods for discovering and canonicalizing entities and their semantic types and organizing them into clean taxonomies. On top of this, the article discusses the automatic extraction of entity-centric properties. To…
