Feature Engineering for Knowledge Base Construction
Christopher Ré, Amir Abbas Sadeghian, Zifei Shan, Jaeho Shin, Feiran Wang, Sen Wu, Ce Zhang

TL;DR
This paper describes a feature engineering approach to knowledge base construction (KBC) built on joint probabilistic inference and learning. It argues that features matter more than algorithms for end-to-end quality and introduces the DeepDive system to support this process.
Contribution
It presents a novel feature engineering methodology for KBC and introduces DeepDive, a system that enables systematic, loosely coupled knowledge base construction.
Findings
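Knowledge bases built with this approach are comparable in quality to those constructed by human volunteers, and sometimes better.
Many of the group's knowledge bases were each constructed by a single graduate student, whereas the human-built counterparts took experts a decade or more of human years.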
Feature engineering is crucial for end-to-end quality in KBC.
Abstract
Knowledge base construction (KBC) is the process of populating a knowledge base, i.e., a relational database together with inference rules, with information extracted from documents and structured sources. KBC blurs the distinction between two traditional database problems, information extraction and information integration. For the last several years, our group has been building knowledge bases with scientific collaborators. Using our approach, we have built knowledge bases that have comparable and sometimes better quality than those constructed by human volunteers. In contrast to these knowledge bases, which took experts a decade or more human years to construct, many of our projects are constructed by a single graduate student. Our approach to KBC is based on joint probabilistic inference and learning, but we do not see inference as either a panacea or a magic bullet: inference is…
Taxonomy
Topics: Topic Modeling · Data Quality and Management · Bayesian Modeling and Causal Inference
