RELiC: Retrieving Evidence for Literary Claims
Katherine Thai, Yapei Chang, Kalpesh Krishna, and Mohit Iyyer

TL;DR
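This paper introduces RELiC, a dataset of 78K literary quotations paired with their surrounding critical analysis, and uses it to define the task of literary evidence retrieval. A RoBERTa-based dense retriever outperforms pretrained information retrieval baselines on this task, but analysis by human domain experts shows that significant room for improvement remains.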
Contribution
The paper presents RELiC, a large-scale dataset for literary evidence retrieval, and proposes a RoBERTa-based dense retriever for this task.
Findings
The dense retriever outperforms existing pretrained information retrieval (IR) baselines.
Human experts identify significant room for improvement.
The dataset enables new research in literary text understanding.
Abstract
Humanities scholars commonly provide evidence for claims that they make about a work of literature (e.g., a novel) in the form of quotations from the work. We collect a large-scale dataset (RELiC) of 78K literary quotations and surrounding critical analysis and use it to formulate the novel task of literary evidence retrieval, in which models are given an excerpt of literary analysis surrounding a masked quotation and asked to retrieve the quoted passage from the set of all passages in the work. Solving this retrieval task requires a deep understanding of complex literary and linguistic phenomena, which proves challenging to methods that overwhelmingly rely on lexical and semantic similarity matching. We implement a RoBERTa-based dense passage retriever for this task that outperforms existing pretrained information retrieval baselines; however, experiments and analysis by human domain…
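The retrieval setup described in the abstract can be sketched roughly as follows: encode the critical-analysis context, encode every candidate passage from the work, and rank the candidates by similarity to the context. This is only an illustration of the task interface, not the paper's model. The `embed` function is a toy bag-of-words stand-in for a learned encoder such as RoBERTa, the `[MASK]` placeholder and the example texts are invented for illustration, and all function names are hypothetical.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a learned encoder: L2-normalized bag-of-words counts.

    Assumption: a real system would use a trained dense encoder here.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {tok: c / norm for tok, c in counts.items()} if norm else {}


def score(query_vec, passage_vec):
    """Dot product between two sparse vectors stored as dicts."""
    return sum(w * passage_vec.get(tok, 0.0) for tok, w in query_vec.items())


def retrieve(context, passages, k=1):
    """Return the indices of the k passages that score highest against the context."""
    q = embed(context)
    ranked = sorted(
        range(len(passages)),
        key=lambda i: score(q, embed(passages[i])),
        reverse=True,
    )
    return ranked[:k]


# Invented candidate passages standing in for "all passages in the work".
passages = [
    "a calm sea carried them home",
    "she read the letter twice before burning it",
    "ivy covered every garden wall",
]

# Invented analysis excerpt with the quotation masked out.
context = "the critic notes how the heroine destroys the letter [MASK] to hide her grief"

top = retrieve(context, passages, k=1)
print(top)
```

Because the toy encoder only counts shared words, it succeeds here through lexical overlap alone. The abstract notes that this kind of matching is exactly what proves insufficient on the real task, which is why a learned encoder is used in practice.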
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Text Analysis Techniques
