Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts

Sumit Asthana; Hannah Rashkin; Elizabeth Clark; Fantine Huot; Mirella Lapata

arXiv:2410.20763 · cs.CL · January 28, 2025

TL;DR

This paper introduces a new task called targeted concept simplification to help readers understand difficult domain-specific texts, presents a dataset, benchmarks models, and analyzes human preferences and evaluation challenges.

Contribution

The paper proposes the targeted concept simplification task, creates the WikiDomains dataset, and evaluates multiple LLMs and baselines, highlighting the gap between automated metrics and human judgments.

Findings

01

Humans prefer explanations about difficult concepts over phrase simplification.

02

No single model outperforms others across all quality metrics.

03

Automated metrics show low correlation (~0.2) with human judgments.
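To make the third finding concrete, here is a minimal sketch of how agreement between an automated metric and human ratings might be measured. This summary does not say which correlation coefficient the authors used, so Spearman rank correlation is an assumption here, and the function names and the example scores are hypothetical illustrations, not data from the paper.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same length")
    return pearson(average_ranks(xs), average_ranks(ys))


# Made-up illustrative values, NOT results from the paper.
metric_scores = [0.61, 0.74, 0.58, 0.80, 0.67, 0.70]  # hypothetical automated metric
human_ratings = [3, 2, 4, 3, 5, 1]                    # hypothetical human ratings
print(f"Spearman rho = {spearman(metric_scores, human_ratings):.2f}")
```

A coefficient near 0.2, as reported in the finding, would mean the metric's ranking of outputs only weakly tracks the ranking implied by human judgments.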

Abstract

One useful application of NLP models is to support people in reading complex text from unfamiliar domains (e.g., scientific articles). Simplifying the entire text makes it understandable but sometimes removes important details. On the contrary, helping adult readers understand difficult concepts in context can enhance their vocabulary and knowledge. In a preliminary human study, we first identify that lack of context and unfamiliarity with difficult concepts is a major reason for adult readers' difficulty with domain-specific text. We then introduce "targeted concept simplification," a simplification task for rewriting text to help readers comprehend text containing unfamiliar concepts. We also introduce WikiDomains, a new dataset of 22k definitions from 13 academic domains paired with a difficult concept within each definition. We benchmark the performance of open-source and commercial…

Code & Models

Repositories

google-deepmind/wikidomains · Official

Videos

Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts · underline

Taxonomy

Topics: Text Readability and Simplification · Natural Language Processing Techniques