Are LLM-generated plain language summaries truly understandable? A large-scale crowdsourced evaluation
Yue Guo, Jae Ho Sohn, Gondy Leroy, Trevor Cohen

TL;DR
This study evaluates large language model-generated plain language summaries (PLSs) in healthcare. LLM-generated PLSs are rated comparably to human-written summaries on subjective quality, but they improve reader comprehension less than human-written ones do, and automated metrics are unreliable indicators of PLS quality.
Contribution
First large-scale evaluation comparing LLM-generated PLSs to human ones using both subjective ratings and comprehension tests, highlighting gaps in current automated evaluation metrics.
Findings
LLM-generated PLSs are perceived as similar to human summaries in subjective assessments.
Human-written PLSs significantly improve reader comprehension over LLM-generated ones.
Automated evaluation metrics do not correlate well with human judgments of PLS quality.
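One common way to check whether an automated metric tracks human judgment is a rank correlation between the two sets of scores. The sketch below computes a Spearman correlation in plain Python. It is an illustration only: the function names, variable names, and input values are hypothetical placeholders, not the study's data or its analysis method, which this summary does not specify.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation applied to the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical placeholder values, NOT data from the study.
metric_scores = [0.42, 0.55, 0.31, 0.60, 0.48]  # one automated score per summary
human_ratings = [4, 3, 4, 2, 5]                 # one Likert rating per summary

rho = spearman(metric_scores, human_ratings)
print(f"Spearman rho = {rho:.2f}")
```

A value of rho near zero would mean the metric's ranking of summaries says little about how readers rated them. In practice the same check would be run per metric and per rating dimension, with a significance test on real data.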
Abstract
Plain language summaries (PLSs) are essential for facilitating effective communication between clinicians and patients by making complex medical information easier for laypeople to understand and act upon. Large language models (LLMs) have recently shown promise in automating PLS generation, but their effectiveness in supporting health information comprehension remains unclear. Prior evaluations have generally relied on automated scores that do not measure understandability directly, or subjective Likert-scale ratings from convenience samples with limited generalizability. To address these gaps, we conducted a large-scale crowdsourced evaluation of LLM-generated PLSs using Amazon Mechanical Turk with 150 participants. We assessed PLS quality through subjective Likert-scale ratings focusing on simplicity, informativeness, coherence, and faithfulness; and objective multiple-choice…
Taxonomy
Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling
