Simple is not Enough: Document-level Text Simplification using Readability and Coherence
Laura Vásquez-Rodríguez, Nhung T.H. Nguyen, Piotr Przybyła, Matthew Shardlow, Sophia Ananiadou

TL;DR
This paper introduces SimDoc, a document-level text simplification system that jointly optimizes for simplicity, readability, and coherence, addressing a gap in existing sentence-focused approaches.
Contribution
The paper extends professionally created simplification corpora by linking existing annotations into (complex text, simple text, readability label) triples, and evaluates a multi-objective training approach for document-level simplification.
Findings
Effective in zero-shot, few-shot, and fine-tuning settings
Improves coherence and readability in document simplification
Highlights challenges of document-level simplification
Abstract
In this paper, we present the SimDoc system, a simplification model considering simplicity, readability, and discourse aspects, such as coherence. In the past decade, the progress of the Text Simplification (TS) field has been mostly shown at a sentence level, rather than considering paragraphs or documents, a setting from which most TS audiences would benefit. We propose a simplification system that is initially fine-tuned with professionally created corpora. Further, we include multiple objectives during training, considering simplicity, readability, and coherence altogether. Our contributions include the extension of professionally annotated simplification corpora by the association of existing annotations into (complex text, simple text, readability label) triples to benefit from readability during training. Also, we present a comparative analysis in which we evaluate our proposed…
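The triple format and the idea of weighing several objectives together can be illustrated with a minimal sketch. Everything named here is hypothetical: the class `SimplificationTriple`, the function `combined_objective`, the example label string, and the normalized weighted sum are assumptions made for illustration, not the authors' data schema or training objective, which the abstract does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimplificationTriple:
    """One training example: (complex text, simple text, readability label).

    The field names and the label format are assumptions for illustration.
    """
    complex_text: str
    simple_text: str
    readability_label: str


def combined_objective(simplicity, readability, coherence,
                       weights=(1.0, 1.0, 1.0)):
    """Combine three per-example scores into a single scalar.

    A normalized weighted sum is only one possible way to merge several
    objectives; it is used here as an assumed stand-in.
    """
    w_s, w_r, w_c = weights
    total = w_s + w_r + w_c
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return (w_s * simplicity + w_r * readability + w_c * coherence) / total


example = SimplificationTriple(
    complex_text="The committee deliberated extensively before reaching a verdict.",
    simple_text="The committee talked for a long time. Then it decided.",
    readability_label="easy",  # hypothetical label value
)
score = combined_objective(simplicity=0.5, readability=1.0, coherence=0.0)
```

With equal weights the three scores are simply averaged; raising one weight makes that aspect count for more in the combined value.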
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques
Methods: Spatio-temporal stability analysis
