On the Feasibility of Automated Detection of Allusive Text Reuse

Enrique Manjavacas; Brian Long; Mike Kestemont

arXiv:1905.02973·cs.CL·May 9, 2019·1 cites

TL;DR

This paper explores the feasibility of automatically detecting allusive text reuse by leveraging lexical semantics and information retrieval techniques, highlighting challenges and potential improvements in retrieval accuracy.

Contribution

The paper introduces a novel approach that combines lexical semantic information with IR methods, and it provides an inter-annotator agreement study for benchmark corpus creation.

Findings

01 Manual queries improve retrieval over windowing methods

02 Distributional semantics moderately boosts retrieval performance

03 Low inter-annotator agreement highlights annotation challenges

Abstract

The detection of allusive text reuse is particularly challenging because allusive references rely on sparse evidence, commonly few or no shared words. Lexical semantics can arguably help: uncovering semantic relations between words has the potential to increase the support underlying the allusion and alleviate the lexical sparsity. A further obstacle is the lack of evaluation benchmark corpora, largely due to the highly interpretative character of the annotation process. In the present paper, we aim to elucidate the feasibility of automated allusion detection. We approach the matter from an Information Retrieval perspective in which referencing texts act as queries and referenced texts as relevant documents to be retrieved, and estimate the difficulty of benchmark corpus compilation by a novel inter-annotator agreement study on query…
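To make the retrieval framing concrete, the following is a minimal sketch, not the authors' implementation. It assumes a soft-cosine style similarity as one possible way to bring lexical semantics into ranking. The `WORD_SIM` table, its values, and the function names are hypothetical; in practice the word similarities might come from word embeddings.

```python
from collections import Counter
from math import sqrt

# Hypothetical word-similarity table standing in for distributional
# semantics. The pairs and values are invented for illustration only.
WORD_SIM = {
    frozenset(["lamb", "sheep"]): 0.8,
    frozenset(["shepherd", "pastor"]): 0.7,
}


def word_sim(a, b):
    """Similarity between two words: 1 if identical, else a table lookup."""
    if a == b:
        return 1.0
    return WORD_SIM.get(frozenset([a, b]), 0.0)


def exact(a, b):
    """Exact-match similarity; reduces soft cosine to ordinary cosine."""
    return 1.0 if a == b else 0.0


def soft_dot(x, y, sim):
    """Dot product of two bags of words, weighted by word similarity."""
    return sum(cx * cy * sim(wx, wy)
               for wx, cx in x.items()
               for wy, cy in y.items())


def soft_cosine(x, y, sim=word_sim):
    """Soft cosine similarity between two bags of words."""
    denom = sqrt(soft_dot(x, x, sim)) * sqrt(soft_dot(y, y, sim))
    return soft_dot(x, y, sim) / denom if denom else 0.0


def rank(query, docs, sim=word_sim):
    """Rank candidate passages for a referencing text used as the query.

    Returns (score, document index) pairs, best score first.
    """
    q = Counter(query.lower().split())
    scored = [(soft_cosine(q, Counter(d.lower().split()), sim), i)
              for i, d in enumerate(docs)]
    return sorted(scored, reverse=True)


if __name__ == "__main__":
    docs = ["shepherd guards sheep", "king rides war"]
    print(rank("pastor tends lamb", docs))             # semantic matching
    print(rank("pastor tends lamb", docs, sim=exact))  # exact matching only
```

In this toy setup the query shares no word with either passage, so exact matching scores both at zero, while the similarity table lets the first passage rank above the second. That mirrors the lexical sparsity problem the abstract describes, under the stated assumptions.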

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Information Retrieval and Search Behavior