Text Segmentation using Named Entity Recognition and Co-reference Resolution in English and Greek Texts

Pavlina Fragkou

arXiv:1610.09226 · cs.CL · October 31, 2016 · 1 citation

Open Access

TL;DR

This study investigates how combining named entity recognition and co-reference resolution can improve text segmentation in English and Greek texts, revealing that effectiveness varies with topic, entity frequency, and segment length.

Contribution

It demonstrates the impact of integrating NER and co-reference resolution on text segmentation performance across two languages, with detailed analysis of influencing factors.

Findings

01 Performance depends on segment topic and length

02 Higher entity frequency improves segmentation accuracy

03 Language-specific differences affect results

Abstract

In this paper we examine the benefit of performing named entity recognition (NER) and co-reference resolution on an English and a Greek corpus used for text segmentation. The aim is to examine whether combining text segmentation with information extraction helps identify the various topics that appear in a document. For the English corpus, NER was performed manually and compared with the output of publicly available annotation tools, while an existing tool was used for the Greek corpus. The annotations produced for both corpora were manually corrected and enriched to cover four types of named entities. Co-reference resolution, i.e., substitution of every reference to the same instance with the same named entity identifier, was subsequently performed. The evaluation, using five text segmentation algorithms for the English corpus and…
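The substitution step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's pipeline: the mention table, the identifier format (`PERSON_1`, `LOCATION_1`), the entity type names, and the function name `substitute_mentions` are all hypothetical, and the mapping is assumed to come from a manually corrected annotation rather than from an automatic resolver.

```python
import re

# Hypothetical, hand-built annotation: each surface mention maps to the
# identifier of the entity it refers to. Identifier names are illustrative.
MENTION_TO_ID = {
    "Barack Obama": "PERSON_1",
    "Obama": "PERSON_1",
    "the president": "PERSON_1",
    "Athens": "LOCATION_1",
    "the Greek capital": "LOCATION_1",
}


def substitute_mentions(text, mention_to_id):
    """Replace every listed mention with its entity identifier."""
    # Try longer mentions first so "Barack Obama" is preferred over "Obama".
    mentions = sorted(mention_to_id, key=len, reverse=True)
    pattern = re.compile(
        "|".join(r"\b" + re.escape(m) + r"\b" for m in mentions),
        re.IGNORECASE,
    )
    # Case-insensitive lookup from matched text back to the identifier.
    lookup = {m.lower(): ident for m, ident in mention_to_id.items()}
    return pattern.sub(lambda match: lookup[match.group(0).lower()], text)


if __name__ == "__main__":
    sample = "Barack Obama visited Athens. The president praised the Greek capital."
    print(substitute_mentions(sample, MENTION_TO_ID))
```

After substitution, sentences that refer to the same entity with different wording share an identical token, which is the kind of lexical repetition that word-overlap-based segmentation algorithms can exploit.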

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling · Text and Document Classification Technologies