Ontologies and Information Extraction

Claire Nédellec (MIG); Adeline Nazarenko (LIPN)

arXiv:cs/0609137·cs.AI·August 16, 2016

Open Access

TL;DR

This paper emphasizes that information extraction (IE) fundamentally relies on ontologies for interpreting text within a domain, especially in biology, where content-based literature exploration is crucial.

Contribution

It highlights the importance of ontology-driven approaches in IE and discusses how varying levels of domain knowledge influence extraction processes.

Findings

01

IE is inherently ontology-driven, even in simple cases.

02

Deeper interpretation in IE requires more domain knowledge.

03

Biology exemplifies the critical need for content-based literature exploration.

Abstract

This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. The report shows that the nature and depth of the interpretation required to extract the information determine how much knowledge must be involved. The discussion is illustrated mainly with examples from biology, a domain with critical needs for content-based exploration of the scientific literature that is becoming a major application domain for IE.
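The contrast the abstract draws between keyword filtering and interpretation against a partial domain model can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the report: the concept names, the lexicon entries (ProtA, ProtB), the single relation interacts_with, and the two sample sentences are all hypothetical.

```python
import re

# Hypothetical partial domain model: an is-a hierarchy, a lexicon that maps
# terms to concepts, and one relation with typed arguments.
IS_A = {"Protein": "BiologicalEntity", "Gene": "BiologicalEntity"}
LEXICON = {"ProtA": "Protein", "ProtB": "Protein", "geneC": "Gene"}
RELATIONS = {
    "interacts_with": {
        "trigger": "interacts with",
        "domain": "Protein",
        "range": "Protein",
    }
}


def is_a(concept, ancestor):
    """Return True if `concept` equals `ancestor` or descends from it."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def keyword_filter(sentences, keyword):
    """Plain text filtering: keep every sentence that contains the keyword."""
    return [s for s in sentences if keyword in s]


def extract(sentences):
    """Fill relation templates only when both arguments satisfy the type constraints."""
    facts = []
    for sentence in sentences:
        for name, rel in RELATIONS.items():
            pattern = r"(\w+) " + re.escape(rel["trigger"]) + r" (\w+)"
            for left, right in re.findall(pattern, sentence):
                left_type = LEXICON.get(left)
                right_type = LEXICON.get(right)
                if is_a(left_type, rel["domain"]) and is_a(right_type, rel["range"]):
                    facts.append((name, left, right))
    return facts


# Hypothetical sample sentences, not drawn from any corpus.
SENTENCES = [
    "ProtA interacts with ProtB in vitro.",
    "This result interacts with earlier findings.",
]

filtered = keyword_filter(SENTENCES, "interacts")
facts = extract(SENTENCES)
```

In this sketch the keyword filter keeps both sentences, while the typed extraction keeps only the one whose arguments are known to the domain model as proteins. A deeper interpretation would need a richer model, which is the report's point about the amount of knowledge involved.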

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques