Dancing Between Success and Failure: Edit-level Simplification Evaluation using SALSA
David Heineman, Yao Dou, Mounica Maddela, Wei Xu

TL;DR
This paper introduces SALSA, an edit-level human annotation framework for evaluating text simplification, along with a new automatic metric, LENS-SALSA. The annotations give detailed insight into how model and human simplifications differ.
Contribution
The paper presents SALSA, a novel fine-grained annotation framework for text simplification evaluation, and develops LENS-SALSA, an automatic metric trained on these annotations.
Findings
GPT-3.5 performs more quality edits than humans
Fine-grained annotations reveal differences in simplification strategies
Word-level quality estimation shows promising results
Abstract
Large language models (e.g., GPT-4) are uniquely capable of producing highly rated text simplification, yet current human evaluation methods fail to provide a clear understanding of systems' specific strengths and weaknesses. To address this limitation, we introduce SALSA, an edit-based human annotation framework that enables holistic and fine-grained text simplification evaluation. We develop twenty-one linguistically grounded edit types, covering the full spectrum of success and failure across dimensions of conceptual, syntactic and lexical simplicity. Using SALSA, we collect 19K edit annotations on 840 simplifications, revealing discrepancies in the distribution of simplification strategies performed by fine-tuned models, prompted LLMs and humans, and find that GPT-3.5 performs more quality edits than humans but still exhibits frequent errors. Using our fine-grained annotations, we…
Taxonomy
Topics: Text Readability and Simplification · Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Cosine Annealing · Softmax · Layer Normalization · Byte Pair Encoding · Dropout · Linear Layer
