Improving Quotation Attribution with Fictional Character Embeddings
Gaspard Michel, Elena V. Epure, Romain Hennequin, Christophe Cerisara

TL;DR
This paper enhances quotation attribution in literary texts by integrating character embeddings derived from stylometric analysis into existing systems, significantly improving accuracy for complex attribution cases.
Contribution
It introduces a novel approach combining character embeddings with a popular attribution system, and creates a new annotated corpus and tailored stylometric models for literary character analysis.
Findings
Improved speaker identification for anaphoric and implicit quotes.
Achieved state-of-the-art performance on 28 novels.
Demonstrated the effectiveness of character embeddings in literary attribution tasks.
Abstract
Humans naturally attribute utterances of direct speech to their speaker in literary works. When attributing quotes, we process contextual information but also access mental representations of characters that we build and revise throughout the narrative. Recent methods to automatically attribute such utterances have explored simulating human logic with deterministic rules or learning new implicit rules with neural networks when processing contextual information. However, these systems inherently lack character representations, which often leads to errors in more challenging examples of attribution: anaphoric and implicit quotes. In this work, we propose to augment a popular quotation attribution system, BookNLP, with character embeddings that encode global stylistic information of characters derived from an off-the-shelf stylometric model, Universal Authorship Representation…
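The general idea in the abstract, combining a contextual attribution signal with a global stylistic representation of each character, can be illustrated with a toy sketch. This is not the paper's architecture and it does not call BookNLP or any stylometric model: the function names, the additive score with weight `alpha`, the three-dimensional vectors, and the character names are all assumptions made for illustration. It only shows how a character-level embedding might break a tie that contextual evidence alone gets wrong.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Average a list of vectors; stands in for a global character embedding."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def attribute_quote(quote_vec, context_scores, character_vecs, alpha=1.0):
    """Return the candidate with the highest combined score.

    context_scores: hypothetical per-candidate scores from a contextual model.
    character_vecs: hypothetical per-character style embeddings.
    alpha: assumed weight on the stylistic similarity term.
    """
    best_name, best_score = None, float("-inf")
    for name, context_score in context_scores.items():
        score = context_score + alpha * cosine(quote_vec, character_vecs[name])
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Toy data: style vectors of quotes already attributed to each character.
known_quotes = {
    "Alice": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "Bob": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
character_vecs = {name: mean_vector(vecs) for name, vecs in known_quotes.items()}

# An implicit quote: context slightly favors Bob, style clearly matches Alice.
context_scores = {"Alice": 0.50, "Bob": 0.55}
quote_vec = [0.85, 0.15, 0.05]

context_only = attribute_quote(quote_vec, context_scores, character_vecs, alpha=0.0)
with_style = attribute_quote(quote_vec, context_scores, character_vecs, alpha=1.0)
print(context_only, with_style)
```

In this toy setup the context-only decision picks Bob, while adding the stylistic term switches the decision to Alice. A learned system would replace the hand-set `alpha` and the additive score with trained parameters.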
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques
