The Links Have It: Infobox Generation by Summarization over Linked Entities

Kezun Zhang; Yanghua Xiao; Hanghang Tong; Haixun Wang; Wei Wang

arXiv:1406.6449 · cs.IR · June 26, 2014 · 2 cites

Open Access

TL;DR

This paper introduces a novel approach to infobox generation by summarizing relationships among linked entities in Wikipedia, reducing reliance on complex natural language understanding and supervised learning.

Contribution

The paper presents a method that combines rank aggregation, clustering, and labeling to extract structured knowledge from the linked entities in Wikipedia articles.
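
To make the rank aggregation step more concrete, here is a minimal sketch in Python. This page does not say which aggregation scheme the paper uses, so the sketch substitutes a Borda count, a generic and well-known technique, purely as an illustration. The function name `borda_aggregate` and the example entity lists are hypothetical and do not come from the paper.

```python
from collections import defaultdict


def borda_aggregate(rankings):
    """Merge several ranked lists of linked entities with a Borda count.

    Assumption: each ranking is a list ordered from most to least relevant.
    An entity missing from a ranking receives no points from that ranking.
    """
    # Collect every entity that appears in at least one ranking.
    candidates = set()
    for ranking in rankings:
        candidates.update(ranking)
    n = len(candidates)

    # The top position earns n points, the next n - 1, and so on.
    scores = defaultdict(float)
    for ranking in rankings:
        for position, entity in enumerate(ranking):
            scores[entity] += n - position

    # Highest score first; ties are broken alphabetically for determinism.
    return sorted(candidates, key=lambda e: (-scores[e], e))


# Hypothetical rankings of entities linked from one article,
# each produced by a different (imagined) relevance signal.
by_link_frequency = ["Physics", "Germany", "Nobel Prize"]
by_co_occurrence = ["Physics", "Nobel Prize", "Violin"]
by_category_overlap = ["Nobel Prize", "Physics", "Germany"]

merged = borda_aggregate(
    [by_link_frequency, by_co_occurrence, by_category_overlap]
)
print(merged)  # ['Physics', 'Nobel Prize', 'Germany', 'Violin']
```

An entity that only one signal ranks highly (here "Violin") falls to the bottom of the merged list, which is the kind of noise reduction that rank aggregation is meant to provide.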

Findings

01. Effective noise reduction through rank aggregation

02. Successful extraction of knowledge via clustering and labeling

03. Improved infobox generation accuracy

Abstract

Online encyclopedias such as Wikipedia have become one of the best sources of knowledge. Much effort has been devoted to expanding and enriching the structured data by automatic information extraction from unstructured text in Wikipedia. Although remarkable progress has been made, the effectiveness and efficiency of these approaches are still limited, as they try to tackle extremely difficult natural language understanding problems and rely heavily on supervised learning approaches, which require a large amount of effort to label the training data. In this paper, instead of performing information extraction over unstructured natural language text directly, we focus on a rich set of semi-structured data in Wikipedia articles: linked entities. The idea of this paper is the following: if we can summarize the relationship between the entity and its linked entities, we immediately harvest some of the most…

Taxonomy

Topics: Semantic Web and Ontologies · Natural Language Processing Techniques · Advanced Text Analysis Techniques