Do Large Language Models Adapt to Language Variation across Socioeconomic Status?
Elisa Bassignana, Mike Zhang, Dirk Hovy, Amanda Cercas Curry

TL;DR
This study investigates how well large language models adapt their linguistic style to different socioeconomic groups on social media, revealing limited adaptation and potential reinforcement of social hierarchies.
Contribution
The paper introduces a novel SES-stratified social media dataset and systematically evaluates LLMs' style adaptation across socioeconomic contexts.
Findings
LLMs only minimally adapt to SES differences.
Models tend to emulate upper-SES styles more effectively.
LLMs may reinforce linguistic hierarchies.
Abstract
Humans adjust their linguistic style to the audience they are addressing. However, the extent to which LLMs adapt to different social contexts is largely unknown. As these models increasingly mediate human-to-human communication, their failure to adapt to diverse styles can perpetuate stereotypes and marginalize communities whose linguistic norms are less closely mirrored by the models, thereby reinforcing social stratification. We study the extent to which LLMs integrate into social media communication across different socioeconomic status (SES) communities. We collect a novel dataset from Reddit and YouTube, stratified by SES. We prompt four LLMs with incomplete text from that corpus and compare the LLM-generated completions to the originals along 94 sociolinguistic metrics, including syntactic, rhetorical, and lexical features. LLMs modulate their style with respect to SES to only a…
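As a rough illustration of the evaluation idea described in the abstract (comparing an LLM completion to the original text along stylistic metrics), here is a minimal Python sketch. It is not the authors' pipeline: the three features, the function names `style_features` and `feature_gaps`, and the tokenization rules are all assumptions chosen for brevity, standing in for the 94 sociolinguistic metrics the paper uses.

```python
import re
from statistics import mean


def style_features(text):
    """Compute a few toy style features (hypothetical stand-ins for the paper's metrics)."""
    # Naive word tokenization: runs of letters and apostrophes, lowercased.
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not tokens:
        return {
            "type_token_ratio": 0.0,
            "mean_word_length": 0.0,
            "mean_sentence_length": 0.0,
        }
    return {
        # Lexical diversity: distinct words divided by total words.
        "type_token_ratio": len(set(tokens)) / len(tokens),
        # Average number of characters per word.
        "mean_word_length": mean(len(t) for t in tokens),
        # Average number of words per sentence.
        "mean_sentence_length": len(tokens) / max(len(sentences), 1),
    }


def feature_gaps(original, completion):
    """Absolute per-feature difference between an original text and a completion."""
    a = style_features(original)
    b = style_features(completion)
    return {name: abs(a[name] - b[name]) for name in a}
```

Under these assumptions, a smaller gap on a feature would mean the completion sits closer to the original author's style on that feature; averaging the gaps within each SES group would then give a crude per-group picture of how closely a model's completions track that group's style.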
Taxonomy
Topics: Authorship Attribution and Profiling · Language and cultural evolution · Mental Health via Writing
