TL;DR
This study measures how low-level NLP tasks such as named entity recognition (NER) and coreference resolution affect the quality of automatically extracted literary character networks, and compares traditional NLP pipelines with large language models (LLMs).
Contribution
Starting from gold-standard annotations and progressively injecting errors, it quantifies the impact of NER and coreference resolution on character network extraction, and compares traditional NLP pipelines with large language models.
Findings
NER performance depends on the novel tested and strongly affects character detection.
NER alone misses many co-occurrences; coreference resolution improves recall.
Traditional NLP pipelines outperform LLMs in recall for character network extraction.
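To make the second finding concrete, here is a minimal sketch, not the paper's implementation. It assumes a toy representation in which each mention is a (token position, character) pair, and two characters co-occur when their mentions fall within a hypothetical `window` of tokens. The function names, the window size, and the toy mentions are all illustrative assumptions.

```python
from itertools import combinations

def cooccurrence_edges(mentions, window=10):
    """Return the set of undirected character pairs whose mentions
    lie within `window` tokens of each other (assumed definition)."""
    edges = set()
    for (pos_a, char_a), (pos_b, char_b) in combinations(sorted(mentions), 2):
        if char_a != char_b and abs(pos_a - pos_b) <= window:
            edges.add(frozenset((char_a, char_b)))
    return edges

def edge_recall(predicted, gold):
    """Fraction of gold edges that also appear in the predicted edge set."""
    return len(predicted & gold) / len(gold) if gold else 1.0

# Toy data (assumed): named mentions that an NER system would find ...
ner_mentions = [(0, "Elizabeth"), (5, "Darcy"), (40, "Jane")]
# ... plus pronominal mentions that only coreference resolution
# can attach to a character.
coref_mentions = ner_mentions + [(45, "Elizabeth"), (80, "Darcy"), (85, "Jane")]

gold_edges = cooccurrence_edges(coref_mentions)
ner_only_edges = cooccurrence_edges(ner_mentions)
ner_recall = edge_recall(ner_only_edges, gold_edges)
```

In this toy setup the NER-only network recovers one of the three reference edges, because the other two co-occurrences involve mentions that are only reachable through coreference links.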
Abstract
The automatic extraction of character networks from literary texts is generally carried out using natural language processing (NLP) cascading pipelines. While this approach is widespread, no study exists on the impact of low-level NLP tasks on these pipelines' performance. In this article, we conduct such a study on a literary dataset, focusing on the role of named entity recognition (NER) and coreference resolution when extracting co-occurrence networks. To highlight the impact of these tasks' performance, we start with gold-standard annotations, progressively add uniformly distributed errors, and observe their impact in terms of character network quality. We demonstrate that NER performance depends on the tested novel and strongly affects character detection. We also show that NER-detected mentions alone miss many character co-occurrences, and that coreference resolution is needed to prevent…
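The degradation protocol described in the abstract can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the only error type simulated is dropping a gold mention independently with probability `error_rate`, and network quality is reduced to character (node) recall. `drop_mentions`, `node_recall`, and the toy mentions are hypothetical names and data.

```python
import random

def drop_mentions(gold_mentions, error_rate, rng):
    """Simulate detection errors: each gold mention is independently
    dropped with probability `error_rate` (assumed error model)."""
    return [m for m in gold_mentions if rng.random() >= error_rate]

def node_recall(mentions, gold_mentions):
    """Fraction of gold characters that still have at least one mention."""
    gold_chars = {char for _, char in gold_mentions}
    found = {char for _, char in mentions}
    return len(found & gold_chars) / len(gold_chars) if gold_chars else 1.0

# Toy gold annotations (assumed): (token position, character) pairs.
gold = [(0, "Elizabeth"), (5, "Darcy"), (40, "Jane"),
        (45, "Elizabeth"), (80, "Darcy")]

rng = random.Random(0)  # fixed seed so the sweep is repeatable
curve = {}
for error_rate in (0.0, 0.25, 0.5, 0.75, 1.0):
    noisy = drop_mentions(gold, error_rate, rng)
    curve[error_rate] = node_recall(noisy, gold)
```

Plotting `curve` would give a quality-versus-error-rate profile. A fuller version would average over several random seeds and would also score edges, not only nodes.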
