Ontology Guided Information Extraction from Unstructured Text

Raghu Anantharangachar; Srinivasan Ramani; S Rajagopalan

arXiv:1302.1335·cs.IR·February 7, 2013

TL;DR

This paper presents an ontology-guided method that extracts information from unstructured text and uses it to populate an existing domain ontology, so that the extracted data can be queried semantically. The reported extraction accuracy is 95%.

Contribution

It introduces heuristics for selecting the most appropriate ontology for a given text and for extracting semantic triples from that text, guided by the concepts in the selected ontology.

Findings

1. Achieved 95% accuracy in information extraction
2. Successfully integrated extracted data into RDF and existing ontologies
3. Improved semantic query capabilities over domain-specific data

Abstract

In this paper, we describe an approach to populate an existing ontology with instance information present in the natural language text provided as input. An ontology is defined as an explicit conceptualization of a shared domain. This approach starts with a list of relevant domain ontologies created by human experts, and techniques for identifying the most appropriate ontology to be extended with information from a given text. Then we demonstrate heuristics to extract information from the unstructured text and for adding it as structured information to the selected ontology. This identification of the relevant ontology is critical, as it is used in identifying relevant information in the text. We extract information in the form of semantic triples from the text, guided by the concepts in the ontology. We then convert the extracted information about the semantic class instances into…
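The pipeline outlined in the abstract (pick an ontology, extract triples guided by it, emit structured output) can be illustrated with a small Python sketch. The paper's actual heuristics are not given in this excerpt, so everything here is an assumption made for illustration: the two toy ontologies, their keyword sets, the regular-expression patterns, the property names `locatedIn` and `numberOfRooms`, the `http://example.org/` namespace, and the sample sentence are all hypothetical. Keyword overlap stands in for ontology selection and hand-written patterns stand in for triple extraction.

```python
import re

# Hypothetical toy ontologies, each reduced to a set of concept keywords.
# These are illustrative only and are not taken from the paper.
ONTOLOGIES = {
    "tourism": {"hotel", "city", "beach", "located", "rooms"},
    "medicine": {"drug", "dose", "patient", "symptom", "treats"},
}

# A capitalized name made of one or more capitalized words.
NAME = r"[A-Z]\w*(?: [A-Z]\w*)*"

# Hypothetical surface patterns mapped to ontology properties.
PATTERNS = {
    "tourism": [
        (re.compile(rf"(?P<s>{NAME}) is located in (?P<o>{NAME})"), "locatedIn"),
        (re.compile(rf"(?P<s>{NAME}) has (?P<o>\d+) rooms"), "numberOfRooms"),
    ],
    "medicine": [],
}


def select_ontology(text):
    """Pick the ontology whose keywords overlap most with the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return max(ONTOLOGIES, key=lambda name: len(ONTOLOGIES[name] & tokens))


def extract_triples(text, ontology):
    """Return (subject, property, object) triples matched by the patterns."""
    triples = []
    for pattern, prop in PATTERNS[ontology]:
        for match in pattern.finditer(text):
            triples.append((match.group("s"), prop, match.group("o")))
    return triples


def to_ntriples(triples, base="http://example.org/"):
    """Serialize triples as N-Triples lines; numeric objects become literals."""
    lines = []
    for s, p, o in triples:
        subj = f"<{base}{s.replace(' ', '_')}>"
        pred = f"<{base}{p}>"
        obj = f'"{o}"' if o.isdigit() else f"<{base}{o.replace(' ', '_')}>"
        lines.append(f"{subj} {pred} {obj} .")
    return "\n".join(lines)


# Made-up input sentence, used only to exercise the sketch.
sample = "Hotel Taj is located in Mumbai. Hotel Taj has 560 rooms."
chosen = select_ontology(sample)
print(chosen)
print(to_ntriples(extract_triples(sample, chosen)))
```

Selecting the ontology first matters in this sketch for the same reason the abstract gives: the chosen ontology determines which patterns, and therefore which facts, are looked for in the text.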
