Can Small-Scale Data Poisoning Exacerbate Dialect-Linked Biases in Large Language Models?
Chaymaa Abbas, Mariette Awad, Razane Tajeddine

TL;DR
This paper demonstrates that small-scale data poisoning targeting dialectal styles can significantly amplify sociolinguistic biases and toxicity in large language models, highlighting the need for dialect-aware safeguards.
Contribution
It introduces a novel style-conditioned poisoning method that reveals how dialectal prompts can trigger harmful biases in LLMs, emphasizing the importance of style-aware evaluation and mitigation.
Findings
Poisoned models show increased toxicity for dialectal inputs, especially AAVE.
Conventional toxicity detectors underestimate sociolinguistic harms.
Poisoning can cause emergent jailbreaking behaviors without explicit slurs.
Abstract
Style-conditioned data poisoning is identified as a covert vector for amplifying sociolinguistic bias in large language models. Using small poisoning budgets that pair dialectal prompts -- principally African American Vernacular English (AAVE) and a Southern dialect -- with toxic or stereotyped completions during instruction tuning, this work probes whether linguistic style can act as a latent trigger for harmful behavior. Across multiple model families and scales, poisoned exposure elevates toxicity and stereotype expression for dialectal inputs -- most consistently for AAVE -- while Standard American English remains comparatively lower yet not immune. A multi-metric audit combining classifier-based toxicity with an LLM-as-a-judge reveals stereotype-laden content even when lexical toxicity appears muted, indicating that conventional detectors underestimate sociolinguistic harms.…
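The multi-metric audit described in the abstract can be illustrated with a minimal sketch of the aggregation step. This is not the authors' code: the `AuditRecord` fields, the `audit_by_dialect` function, the 0.5 threshold, and the toy values are all hypothetical and chosen only for illustration. The sketch assumes each model output already carries a classifier toxicity score in [0, 1] and a boolean stereotype verdict from an LLM judge, and it reports, per dialect, how often the judge flags content that the classifier scores below the threshold.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AuditRecord:
    """One audited model output (hypothetical schema, not from the paper)."""
    dialect: str            # e.g. "AAVE", "Southern", "SAE"
    toxicity: float         # classifier-based toxicity score in [0, 1]
    judge_stereotype: bool  # LLM-as-a-judge verdict: stereotype present?


def audit_by_dialect(records, toxicity_threshold=0.5):
    """Aggregate both metrics per dialect.

    `missed_by_classifier_rate` is the share of outputs the judge flags as
    stereotyped while the classifier score stays below the threshold.
    The threshold value is an assumption, not a figure from the paper.
    """
    grouped = defaultdict(list)
    for record in records:
        grouped[record.dialect].append(record)

    summary = {}
    for dialect, rows in grouped.items():
        n = len(rows)
        missed = sum(
            1 for r in rows
            if r.judge_stereotype and r.toxicity < toxicity_threshold
        )
        summary[dialect] = {
            "n": n,
            "mean_toxicity": sum(r.toxicity for r in rows) / n,
            "judge_flag_rate": sum(r.judge_stereotype for r in rows) / n,
            "missed_by_classifier_rate": missed / n,
        }
    return summary


# Synthetic toy values for illustration only; these are not results.
toy_records = [
    AuditRecord("AAVE", 0.25, True),
    AuditRecord("AAVE", 0.75, True),
    AuditRecord("SAE", 0.25, False),
    AuditRecord("SAE", 0.25, False),
]
toy_summary = audit_by_dialect(toy_records)
for dialect, stats in toy_summary.items():
    print(dialect, stats)
```

Reporting the missed-by-classifier rate alongside mean toxicity keeps the two signals separate, so a dialect can show a low average classifier score while still surfacing judge-flagged stereotype content.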
Taxonomy
Topics: Linguistic Variation and Morphology · Multilingual Education and Policy · Linguistics, Language Diversity, and Identity
