Explainable Prediction of Text Complexity: The Missing Preliminaries for Text Simplification
Cristina Garbacea, Mengtian Guo, Samuel Carton, Qiaozhu Mei

TL;DR
This paper argues that two preliminary steps in text simplification, complexity prediction and complex-part identification, are often neglected, and shows that adding these explainable stages improves the out-of-sample performance of black-box neural simplification models.
Contribution
It introduces a decomposed pipeline for text simplification that improves transparency and out-of-sample performance of existing models.
Findings
Preliminary complexity prediction boosts simplification accuracy.
Explicitly identifying complex parts improves model transparency.
The pipeline approach improves out-of-sample performance.
Abstract
Text simplification reduces the language complexity of professional content for accessibility purposes. End-to-end neural network models have been widely adopted to directly generate the simplified version of input text, usually functioning as a black box. We show that text simplification can be decomposed into a compact pipeline of tasks to ensure the transparency and explainability of the process. The first two steps in this pipeline are often neglected: 1) to predict whether a given piece of text needs to be simplified, and 2) if yes, to identify complex parts of the text. The two tasks can be solved separately using either lexical or deep learning methods, or solved jointly. Simply applying explainable complexity prediction as a preliminary step, the out-of-sample text simplification performance of the state-of-the-art, black-box simplification models can be improved by a large…
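The two preliminary steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' method: the word-length heuristic, the `max_len` and `threshold` values, and the names `complex_parts`, `needs_simplification`, and `gated_simplify` are all assumptions chosen for illustration, and the `simplify` argument stands in for any black-box simplification model.

```python
import re
from typing import Callable, List, Tuple


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


def complex_parts(text: str, max_len: int = 7) -> List[str]:
    """Step 2 (assumed heuristic): flag tokens longer than max_len as complex.

    A crude lexical stand-in; a real system might use a frequency
    lexicon or a trained model instead.
    """
    return [tok for tok in tokenize(text) if len(tok) > max_len]


def needs_simplification(text: str, threshold: float = 0.15) -> bool:
    """Step 1 (assumed heuristic): predict complexity from the share of flagged tokens."""
    tokens = tokenize(text)
    if not tokens:
        return False
    return len(complex_parts(text)) / len(tokens) >= threshold


def gated_simplify(
    text: str, simplify: Callable[[str], str]
) -> Tuple[str, List[str]]:
    """Run the black-box simplifier only on text predicted to be complex.

    Returns the output text and the flagged parts, which serve as a
    simple explanation of why the text was (or was not) rewritten.
    """
    if not needs_simplification(text):
        return text, []  # already simple: leave it untouched
    return simplify(text), complex_parts(text)


if __name__ == "__main__":
    # Hypothetical placeholder for a neural simplification model.
    stub_model = lambda t: "<simplified> " + t
    print(gated_simplify("The cat sat on the mat.", stub_model))
    print(gated_simplify("Myocardial infarction necessitates intervention.", stub_model))
```

The gate is the point of the sketch: text predicted to be simple bypasses the black-box model entirely, and the flagged tokens give a human-readable reason for each decision.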
Taxonomy
Topics: Text Readability and Simplification · Topic Modeling · Natural Language Processing Techniques
