Information Extraction From Co-Occurring Similar Entities

Nicolas Heist; Heiko Paulheim

arXiv:2102.05444 · cs.IR · February 16, 2021

TL;DR

This paper presents a rule-based method that extracts new entities and assertions from listings of co-occurring similar entities (such as tables and lists) and adds them to existing knowledge graphs like DBpedia, increasing DBpedia's entity coverage by approximately 50%.

Contribution

It introduces a descriptive rule mining approach using distant supervision to enhance knowledge graphs with entities and assertions from Wikipedia listings.

Findings

1. Extracted up to 3 million new entities and 30 million assertions.
2. Achieved an approximately 50% increase in entity coverage for DBpedia.
3. Demonstrated that the extracted information is of high enough quality to extend knowledge graphs.

Abstract

Knowledge about entities and their interrelations is a crucial factor of success for tasks like question answering or text summarization. Publicly available knowledge graphs like Wikidata or DBpedia are, however, far from being complete. In this paper, we explore how information extracted from similar entities that co-occur in structures like tables or lists can help to increase the coverage of such knowledge graphs. In contrast to existing approaches, we do not focus on relationships within a listing (e.g., between two entities in a table row) but on the relationship between a listing's subject entities and the context of the listing. To that end, we propose a descriptive rule mining approach that uses distant supervision to derive rules for these relationships based on a listing's context. Extracted from a suitable data corpus, the rules can be used to extend a knowledge graph with…
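
To make the idea in the abstract concrete, the following is a minimal sketch of mining context-based rules with distant supervision, and not the authors' implementation. It assumes a simplified setting in which each listing is a pair of a context label (for example a section heading) and its subject entities, and the knowledge graph is a dictionary from entity to a set of (predicate, object) pairs. The names `mine_rules`, `apply_rules`, `min_support`, and `min_confidence`, along with the toy data, are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

def mine_rules(listings, kg, min_support=2, min_confidence=0.8):
    """Derive rules of the form context -> (predicate, object).

    Distant supervision: subject entities already present in the knowledge
    graph act as labeled examples for the listings they appear in.
    """
    counts = defaultdict(lambda: defaultdict(int))  # context -> assertion -> count
    known = defaultdict(int)  # context -> number of subject entities found in kg
    for context, entities in listings:
        for entity in entities:
            if entity not in kg:
                continue  # unknown entities give no supervision signal
            known[context] += 1
            for assertion in kg[entity]:
                counts[context][assertion] += 1

    rules = defaultdict(set)
    for context, assertion_counts in counts.items():
        for assertion, n in assertion_counts.items():
            confidence = n / known[context]
            if n >= min_support and confidence >= min_confidence:
                rules[context].add(assertion)
    return dict(rules)

def apply_rules(listings, kg, rules):
    """Return (entity, predicate, object) triples the rules add to the graph."""
    new_assertions = set()
    for context, entities in listings:
        for entity in entities:
            for predicate, obj in rules.get(context, ()):
                if (predicate, obj) not in kg.get(entity, set()):
                    new_assertions.add((entity, predicate, obj))
    return new_assertions

# Toy data (hypothetical): two listings that share the same context label.
listings = [
    ("Discography", ["Album_A", "Album_B", "Album_C"]),
    ("Discography", ["Album_D", "Album_E"]),
]
kg = {
    "Album_A": {("type", "Album")},
    "Album_B": {("type", "Album")},
    "Album_D": {("type", "Album")},
}

rules = mine_rules(listings, kg)
new_assertions = apply_rules(listings, kg, rules)
print(rules)
print(sorted(new_assertions))
```

In this toy setup, the three entities already in the graph all carry the same type assertion, so a rule for the "Discography" context is learned and then applied to the two entities the graph does not yet cover. The thresholds here are arbitrary placeholders; the paper's rules are described as being based on a listing's context more generally, which this sketch reduces to a single label.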
