Mining Asymmetric Intertextuality

Pak Kin Lau; Stuart Michael McManus

arXiv:2410.15145 · cs.IR · October 22, 2024

Open Access

TL;DR

This paper presents a scalable, adaptive method for mining asymmetric intertextual relationships in texts, utilizing LLM-assisted normalization, vector similarity, and verification to handle explicit and implicit references across large, evolving corpora.

Contribution

It introduces a novel split-normalize-merge paradigm for detecting asymmetric intertextuality, suitable for dynamic and large-scale literary and historical datasets.

Findings

01

Effective detection of explicit and implicit intertextual links.

02

Scalable approach adaptable to growing corpora.

03

Utilizes LLMs for metadata extraction and verification.

Abstract

This paper introduces a new task in Natural Language Processing (NLP) and Digital Humanities (DH): Mining Asymmetric Intertextuality. Asymmetric intertextuality refers to one-sided relationships between texts, where one text cites, quotes, or borrows from another without reciprocation. These relationships are common in literature and historical texts, where a later work references a classical or older text that remains static. We propose a scalable and adaptive approach for mining asymmetric intertextuality, leveraging a split-normalize-merge paradigm. In this approach, documents are split into smaller chunks, normalized into structured data using LLM-assisted metadata extraction, and merged during querying to detect both explicit and implicit intertextual relationships. Our system handles intertextuality at various levels, from direct quotations to paraphrasing and cross-document…
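The split-normalize-merge flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. Every name below (`Chunk`, `split`, `normalize`, `build_index`, `query`), the chunk size, and the similarity threshold are hypothetical. A bag-of-words cosine similarity stands in for vector similarity, and a plain tokenizer stands in for LLM-assisted metadata extraction. The LLM verification step is omitted.

```python
import math
import re
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass
class Chunk:
    doc_id: str
    position: int
    text: str
    vector: Counter  # bag-of-words stand-in for a dense embedding


def split(doc_id, text, size=8):
    """Split a document into consecutive chunks of at most `size` words."""
    words = text.split()
    return [(doc_id, i // size, " ".join(words[i:i + size]))
            for i in range(0, len(words), size)]


def normalize(raw):
    """Turn a raw chunk into a structured record.

    Assumption: the paper uses LLM-assisted metadata extraction here;
    this sketch only lowercases and tokenizes the chunk text.
    """
    doc_id, position, text = raw
    tokens = re.findall(r"[a-z']+", text.lower())
    return Chunk(doc_id, position, text, Counter(tokens))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def build_index(sources, size=8):
    """Split and normalize every static source text once, ahead of querying."""
    return [normalize(raw)
            for doc_id, text in sources.items()
            for raw in split(doc_id, text, size)]


def query(index, later_text, size=8, threshold=0.5):
    """Merge chunk-level matches into one best score per source document.

    The lookup is one-directional: only the later text is queried against
    the static source index, never the reverse.
    """
    best = defaultdict(float)
    for raw in split("query", later_text, size):
        q = normalize(raw)
        for chunk in index:
            score = cosine(q.vector, chunk.vector)
            if score >= threshold:
                best[chunk.doc_id] = max(best[chunk.doc_id], score)
    return dict(best)
```

A possible usage: build the index from older texts, then call `query(index, later_text)` with a later work to get a dictionary mapping each matched source to its strongest chunk-level similarity. Because sources are indexed independently of one another, new texts can be appended to the index without reprocessing existing ones, which is one plausible reading of the "adaptive" property the abstract claims.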

Taxonomy

Topics: Natural Language Processing Techniques