Say It Another Way: Auditing LLMs with a User-Grounded Automated Paraphrasing Framework
Cléa Chataigner, Rebecca Ma, Prakhar Ganesh, Yuhao Chen, Afaf Taïk, Elliot Creager, Golnoosh Farnadi

TL;DR
AUGMENT is a framework that generates controlled, user-grounded paraphrases to improve the reliability of auditing large language models by uncovering systematic weaknesses.
Contribution
We introduce AUGMENT, a novel framework for producing linguistically informed, controlled paraphrases grounded in user behavior, making LLM audits more reliable.
Findings
Controlled paraphrases reveal weaknesses hidden under unconstrained variation.
AUGMENT improves the reliability of LLM auditing.
Case studies demonstrate effectiveness on BBQ and MMLU datasets.
Abstract
Large language models (LLMs) are highly sensitive to subtle changes in prompt phrasing, posing challenges for reliable auditing. Prior methods often apply unconstrained prompt paraphrasing, which risks missing the linguistic and demographic factors that shape authentic user interactions. We introduce AUGMENT (Automated User-Grounded Modeling and Evaluation of Natural Language Transformations), a framework for generating controlled paraphrases grounded in user behaviors. AUGMENT leverages linguistically informed rules and enforces quality through checks on instruction adherence, semantic similarity, and realism, ensuring paraphrases are both reliable and meaningful for auditing. Through case studies on the BBQ and MMLU datasets, we show that controlled paraphrases uncover systematic weaknesses that remain obscured under unconstrained variation. These results highlight the value of the…
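The three quality checks named in the abstract can be pictured as a filter over candidate paraphrases. The sketch below is only an illustration under stated assumptions, not the authors' implementation: `filter_paraphrases`, `token_overlap`, the two check callables, and the `min_similarity` threshold are all hypothetical names, and simple token overlap stands in for whatever semantic similarity measure the paper actually uses.

```python
from typing import Callable, List

# Hypothetical check signature: (original prompt, candidate paraphrase) -> pass/fail.
Check = Callable[[str, str], bool]


def token_overlap(a: str, b: str) -> float:
    """Jaccard overlap of lowercased whitespace tokens.

    A crude stand-in for semantic similarity (an assumption for this sketch).
    """
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def filter_paraphrases(
    original: str,
    candidates: List[str],
    follows_rule: Check,
    is_realistic: Check,
    min_similarity: float = 0.5,
) -> List[str]:
    """Keep only candidates that pass all three checks, in order."""
    kept = []
    for candidate in candidates:
        if not follows_rule(original, candidate):  # instruction adherence
            continue
        if token_overlap(original, candidate) < min_similarity:  # similarity
            continue
        if not is_realistic(original, candidate):  # realism
            continue
        kept.append(candidate)
    return kept


# Toy usage: the "rule" is lowercasing; realism here just means it is still a question.
original = "Who is most likely to be late?"
candidates = [
    "who is most likely to be late?",
    "WHO IS LATE",
    "bananas are yellow",
]
kept = filter_paraphrases(
    original,
    candidates,
    follows_rule=lambda o, c: c == c.lower(),
    is_realistic=lambda o, c: c.endswith("?"),
)
print(kept)
```

In this toy run, the second candidate fails the rule check and the third fails the similarity check, so only the first survives. A real audit would replace each callable with a stronger check, such as a rule-specific validator or an embedding-based similarity score.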
Taxonomy
Topics: Topic Modeling · Text Readability and Simplification · Natural Language Processing Techniques
