Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification
Reno Kriz, João Sedoc, Marianna Apidianaki, Carolina Zheng, Gaurav Kumar, Eleni Miltsakaki, Chris Callison-Burch

TL;DR
This paper introduces a complexity-weighted loss and diverse reranking techniques to improve sentence simplification, resulting in simpler, more fluent outputs that outperform some existing models.
Contribution
The paper proposes two methods to improve sentence simplification quality: a training loss that incorporates predicted word complexity, and test-time reranking of a diverse set of candidate simplifications.
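As a rough illustration of what folding word complexity into a training loss could look like, the sketch below scales each reference token's negative log-likelihood by a weight derived from a per-word complexity score. The function name `complexity_weighted_nll`, the `alpha` parameter, the `1 + alpha * complexity` weighting, and the dictionary-based probability format are all assumptions made for this example; this summary does not give the paper's actual formula, and the direction and shape of the weighting may differ there.

```python
import math


def complexity_weighted_nll(step_probs, target_tokens, complexity, alpha=1.0):
    """Average negative log-likelihood with a per-token complexity weight.

    step_probs:    list of dicts, one per time step, mapping token -> probability
    target_tokens: reference tokens, one per time step
    complexity:    dict mapping token -> complexity score (hypothetical scale);
                   tokens missing from the dict are treated as complexity 0
    alpha:         strength of the weighting (assumed hyperparameter)
    """
    if len(step_probs) != len(target_tokens):
        raise ValueError("need one probability distribution per target token")
    total = 0.0
    for probs, token in zip(step_probs, target_tokens):
        # Assumed weighting: tokens with higher complexity scores get a larger
        # weight on their loss term. With alpha=0 this is plain NLL.
        weight = 1.0 + alpha * complexity.get(token, 0.0)
        total += -weight * math.log(probs[token])
    return total / len(target_tokens)


if __name__ == "__main__":
    probs = [{"the": 0.5, "feline": 0.5}, {"the": 0.25, "feline": 0.75}]
    targets = ["the", "feline"]
    print(complexity_weighted_nll(probs, targets, {"feline": 1.0}))
```

Setting `alpha=0` recovers the ordinary unweighted loss, which makes it easy to compare against a generic Seq2Seq baseline under this assumed formulation.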
Findings
Models generate simpler sentences with improved fluency and adequacy.
The approach outperforms baseline models on automatic and human evaluations.
Incorporating complexity measures leads to more effective simplification.
Abstract
Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity…
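The reranking step described in the abstract can be sketched as scoring each candidate on fluency, adequacy, and simplicity and sorting by a combined score. In the minimal example below, the three scores are assumed to be precomputed numbers attached to each candidate, and the linear combination, the `weights` parameter, and the function name `rerank` are hypothetical choices for illustration; the abstract does not specify how the paper computes or combines these scores.

```python
def rerank(candidates, weights=(1.0, 1.0, 1.0)):
    """Sort candidate simplifications by a weighted sum of three scores.

    candidates: list of dicts with keys 'text', 'fluency', 'adequacy',
                'simplicity' (scores assumed precomputed, higher is better)
    weights:    (fluency, adequacy, simplicity) weights, assumed hyperparameters
    """
    w_fluency, w_adequacy, w_simplicity = weights

    def combined(candidate):
        return (
            w_fluency * candidate["fluency"]
            + w_adequacy * candidate["adequacy"]
            + w_simplicity * candidate["simplicity"]
        )

    # Highest combined score first; sorted() keeps ties in input order.
    return sorted(candidates, key=combined, reverse=True)


if __name__ == "__main__":
    # Made-up scores purely for illustration.
    pool = [
        {"text": "near copy of the input", "fluency": 0.9, "adequacy": 0.9, "simplicity": 0.2},
        {"text": "shorter rewrite", "fluency": 0.8, "adequacy": 0.8, "simplicity": 0.9},
    ]
    for cand in rerank(pool):
        print(cand["text"])
```

Under this assumed scheme, a candidate that copies the input may score well on fluency and adequacy but lose to a simpler rewrite once the simplicity term is included, which is the behavior the abstract says reranking is meant to promote.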
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
Methods: Sigmoid Activation · Tanh Activation · Long Short-Term Memory · Sequence to Sequence
