BOOKCOREF: Coreference Resolution at Book Scale
Giuliano Martinelli, Tommaso Bonomo, Pere-Lluís Huguet Cabot, Roberto Navigli

TL;DR
This paper introduces BOOKCOREF, a novel large-scale benchmark for coreference resolution on full-length books, along with an automatic annotation pipeline, revealing current models' limitations at this scale.
Contribution
The paper presents the first book-scale coreference benchmark and an automatic annotation pipeline, enabling evaluation of coreference systems on documents that average more than 200,000 tokens.
Findings
With BOOKCOREF, current long-document coreference systems gain up to +20 CoNLL-F1 points on full books.
Current models struggle with book-scale texts.
The benchmark reveals new challenges in long-document coreference.
Abstract
Coreference Resolution systems are typically evaluated on benchmarks containing small- to medium-scale documents. When it comes to evaluating long texts, however, existing benchmarks, such as LitBank, remain limited in length and do not adequately assess system capabilities at the book scale, i.e., when co-referring mentions span hundreds of thousands of tokens. To fill this gap, we first put forward a novel automatic pipeline that produces high-quality Coreference Resolution annotations on full narrative texts. Then, we adopt this pipeline to create the first book-scale coreference benchmark, BOOKCOREF, with an average document length of more than 200,000 tokens. We carry out a series of experiments showing the robustness of our automatic procedure and demonstrating the value of our resource, which enables current long-document coreference systems to gain up to +20 CoNLL-F1 points when…
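The gains above are reported in CoNLL-F1 points. By the usual convention, CoNLL-F1 is the unweighted average of the F1 scores of three coreference metrics: MUC, B-cubed, and CEAF-e (entity-based CEAF). The sketch below only illustrates that arithmetic. The function name `conll_f1` and all input scores are hypothetical and are not results from the paper.

```python
def conll_f1(muc_f1: float, b_cubed_f1: float, ceaf_e_f1: float) -> float:
    """Return CoNLL-F1: the unweighted mean of MUC, B-cubed, and CEAF-e F1."""
    return (muc_f1 + b_cubed_f1 + ceaf_e_f1) / 3.0


# Hypothetical per-metric F1 scores, for illustration only (not from the paper).
baseline = conll_f1(60.0, 45.0, 30.0)
improved = conll_f1(75.0, 66.0, 54.0)

# Difference between the two systems, in CoNLL-F1 points.
gain = improved - baseline
print(f"baseline={baseline:.1f} improved={improved:.1f} gain={gain:+.1f}")
```

Because the score is a plain average, a gain of a given size in CoNLL-F1 can come from uneven improvements across the three underlying metrics.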
