AbLit: A Resource for Analyzing and Generating Abridged Versions of English Literature
Melissa Roemmele, Kyle Shaffer, Katrina Olsen, Yiyi Wang, Steve DeNeefe

TL;DR
This paper introduces AbLit, a novel dataset and models for the challenging task of creating abridged versions of English literature, focusing on passage-level alignment and linguistic relation prediction.
Contribution
It presents the first NLP-focused resource for abridgement, including a dataset with alignments and models for relation prediction and text generation.
Findings
Abridgement is a complex NLP task.
The dataset enables new research in text simplification.
Automated models show promising results in predicting relations and generating abridged texts.
Abstract
Creating an abridged version of a text involves shortening it while maintaining its linguistic qualities. In this paper, we examine this task from an NLP perspective for the first time. We present a new resource, AbLit, which is derived from abridged versions of English literature books. The dataset captures passage-level alignments between the original and abridged texts. We characterize the linguistic relations of these alignments, and create automated models to predict these relations as well as to generate abridgements for new texts. Our findings establish abridgement as a challenging task, motivating future resources and research. The dataset is available at github.com/roemmele/AbLit.
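To make the idea of passage-level alignment concrete, here is a minimal sketch of how one might mark which words of an original passage survive in its aligned abridged passage. This is not the authors' annotation procedure or model: it is an illustrative assumption that uses Python's standard `difflib` on whitespace-split tokens, and the function name `label_preserved_tokens` and the example sentences are hypothetical. The dataset's actual format and labels are documented in the repository linked above.

```python
from difflib import SequenceMatcher


def label_preserved_tokens(original: str, abridged: str) -> list[tuple[str, bool]]:
    """Pair each original token with True if it is matched in the abridged text.

    Illustrative only: whitespace tokenization and exact, case-sensitive
    matching are simplifying assumptions, not the AbLit annotation scheme.
    """
    orig_tokens = original.split()
    abr_tokens = abridged.split()

    # autojunk=False disables difflib's heuristic that ignores frequent tokens.
    matcher = SequenceMatcher(a=orig_tokens, b=abr_tokens, autojunk=False)

    preserved = [False] * len(orig_tokens)
    for block in matcher.get_matching_blocks():
        # Each block is a run of `size` tokens shared by both passages.
        for i in range(block.a, block.a + block.size):
            preserved[i] = True

    return list(zip(orig_tokens, preserved))


if __name__ == "__main__":
    # Hypothetical passage pair, not taken from the dataset.
    original = "The old house stood silent on the hill"
    abridged = "The house stood on the hill"
    for token, kept in label_preserved_tokens(original, abridged):
        print(f"{token}\t{'preserved' if kept else 'removed'}")
```

A word-overlap heuristic like this only captures deletions; abridgements may also reword or add text, which is part of why the abstract describes the task as challenging.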
Taxonomy
Topics: Natural Language Processing Techniques · Digital Humanities and Scholarship · Text Readability and Simplification
