Ontologies and Information Extraction
Claire Nédellec (MIG), Adeline Nazarenko (LIPN)

TL;DR
This paper argues that information extraction (IE) relies on ontologies to interpret text within a domain. It illustrates the point with biology, where content-based exploration of the scientific literature is a critical need.
Contribution
It argues that IE is an ontology-driven process rather than text filtering by keywords and simple patterns, and it shows that the amount of domain knowledge required depends on the nature and depth of the interpretation.
Findings
IE is inherently ontology-driven, even in simple cases.
Deeper interpretation in IE requires more domain knowledge.
Biology exemplifies the critical need for content-based literature exploration.
Abstract
This report argues that, even in the simplest cases, IE is an ontology-driven process. It is not a mere text filtering method based on simple pattern matching and keywords, because the extracted pieces of text are interpreted with respect to a predefined partial domain model. The report shows that more or less knowledge must be involved, depending on the nature and depth of the interpretation needed to extract the information. The report is mainly illustrated with examples from biology, a domain with critical needs for content-based exploration of the scientific literature and one that is becoming a major application domain for IE.
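The contrast the abstract draws between keyword filtering and interpretation against a partial domain model can be sketched in a few lines of Python. This is a minimal illustration, not the authors' method: the lexicon, the `regulates` relation signature, the trigger verbs, and the example sentences are all toy assumptions made up for this sketch.

```python
import re
from dataclasses import dataclass

# Hypothetical partial domain model (all entries are illustrative assumptions).
# Lexicon: surface term -> ontology concept.
LEXICON = {"GerE": "Protein", "SigK": "Protein", "cotB": "Gene", "sporulation": "Process"}
# Relation signatures: relation -> (concept of agent, concept of target).
RELATIONS = {"regulates": ("Protein", "Gene")}
# Trigger verbs in text -> ontology relation they may express.
TRIGGERS = {"activates": "regulates", "represses": "regulates"}


@dataclass(frozen=True)
class Fact:
    relation: str
    agent: str
    target: str


def keyword_filter(sentences):
    """Plain text filtering: keep any sentence that contains a trigger word."""
    return [s for s in sentences if any(t in s for t in TRIGGERS)]


def extract(sentences):
    """Ontology-driven extraction: a pattern match is kept only if its
    arguments fit the relation signature in the domain model."""
    facts = []
    for sentence in sentences:
        tokens = re.findall(r"\w+", sentence)
        for i in range(1, len(tokens) - 1):
            relation = TRIGGERS.get(tokens[i])
            if relation is None:
                continue
            agent, target = tokens[i - 1], tokens[i + 1]
            found = (LEXICON.get(agent), LEXICON.get(target))
            if found == RELATIONS[relation]:
                facts.append(Fact(relation, agent, target))
    return facts


# Toy sentences written for this sketch, not taken from the report.
SENTENCES = [
    "GerE activates cotB transcription.",
    "Heat activates sporulation in some strains.",
    "SigK is required for late gene expression.",
]

filtered = keyword_filter(SENTENCES)
facts = extract(SENTENCES)
```

In this sketch the keyword filter keeps both sentences that contain "activates", while the ontology-driven step keeps only the one whose arguments are typed as a protein and a gene. Deeper interpretation would need a richer model (synonyms, concept hierarchies, syntactic analysis), which is the trade-off the abstract describes.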
Taxonomy
Topics: Biomedical Text Mining and Ontologies · Advanced Text Analysis Techniques
