Exploring Robustness of LLMs to Paraphrasing Based on Sociodemographic Factors

Pulkit Arora; Akbar Karimi; Lucie Flek

arXiv:2501.08276 · cs.CL · July 8, 2025

TL;DR

This paper investigates how robust large language models are to paraphrasing conditioned on sociodemographic factors (age and gender), and finds that demographic variation significantly influences model performance.

Contribution

The paper extends the SocialIQA dataset with demographic-based paraphrases and analyzes how well LLMs handle diverse linguistic styles conditioned on sociodemographic factors.

Findings

01. Demographic paraphrasing impacts LLM performance

02. Linguistic diversity affects model robustness

03. Models struggle with subtle sociodemographic variations

Abstract

Despite their linguistic prowess, LLMs have been shown to be vulnerable to small input perturbations. While robustness to local adversarial changes has been studied, robustness to global modifications such as different linguistic styles remains underexplored. Therefore, we take a broader approach to explore a wider range of variations across sociodemographic dimensions. We extend the SocialIQA dataset to create diverse paraphrased sets conditioned on sociodemographic factors (age and gender). The assessment aims to provide a deeper understanding of LLMs in (a) their capability of generating demographic paraphrases with engineered prompts and (b) their capabilities in interpreting real-world, complex language scenarios. We also perform a reliability analysis of the generated paraphrases looking into linguistic diversity and perplexity as well as manual evaluation. We find that…
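
The abstract mentions engineered prompts for demographic paraphrasing and a linguistic-diversity check on the generated paraphrases, but this page reproduces neither. The following is a minimal sketch under stated assumptions, not the authors' implementation: the group labels, the prompt wording, the `build_prompt` helper, and the use of a distinct-n ratio as the diversity measure are all hypothetical choices made for illustration.

```python
from itertools import product

# Hypothetical demographic labels; the paper's actual categories may differ.
AGE_GROUPS = ["teenager", "older adult"]
GENDERS = ["female", "male"]


def build_prompt(text, age, gender):
    """Build a paraphrasing instruction conditioned on age and gender.

    The wording is an assumption, not the prompt used in the paper.
    """
    return (
        "Paraphrase the text below in the style of a writer with this profile: "
        f"age group = {age}, gender = {gender}. Keep the meaning unchanged.\n\n"
        f"Text: {text}"
    )


def distinct_n(texts, n=1):
    """Return the share of unique n-grams among all n-grams in `texts`.

    A simple lexical-diversity proxy (whitespace tokens, lowercased);
    higher values mean less repetition across the set.
    """
    ngrams = []
    for t in texts:
        tokens = t.lower().split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0


# One prompt per (age, gender) combination for a single source sentence.
source = "Alex helped Sam move into a new apartment."
prompts = [build_prompt(source, a, g) for a, g in product(AGE_GROUPS, GENDERS)]
```

Each prompt would be sent to an LLM of your choice, and `distinct_n` could then be computed over the returned paraphrases for each demographic group to compare how varied they are.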

Taxonomy

Topics: Natural Language Processing Techniques · Text Readability and Simplification · Translation Studies and Practices