Robustness and Confounders in the Demographic Alignment of LLMs with Human Perceptions of Offensiveness
Shayan Alipour, Indira Sen, Mattia Samory, Tanushree Mitra

TL;DR
This study systematically evaluates demographic biases in LLMs' alignment with human perceptions of offensiveness across multiple datasets, highlighting the influence of confounders like document difficulty and annotator sensitivity.
Contribution
It introduces a comprehensive, confounder-aware analysis of demographic biases in LLMs' offensive language detection across diverse datasets.
Findings
Demographic traits, especially race, influence alignment, but these effects are inconsistent across datasets.
Confounders such as annotator sensitivity and document difficulty explain more variation than demographics.
Alignment increases with annotator sensitivity and within-group agreement, and decreases with document difficulty.
Abstract
Large language models (LLMs) are known to exhibit demographic biases, yet few studies systematically evaluate these biases across multiple datasets or account for confounding factors. In this work, we examine LLM alignment with human annotations in five offensive language datasets, comprising approximately 220K annotations. Our findings reveal that while demographic traits, particularly race, influence alignment, these effects are inconsistent across datasets and often entangled with other factors. Confounders -- such as document difficulty, annotator sensitivity, and within-group agreement -- account for more variation in alignment patterns than demographic traits alone. Specifically, alignment increases with higher annotator sensitivity and group agreement, while greater document difficulty corresponds to reduced alignment. Our results underscore the importance of multi-dataset…
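The confounders named in the abstract can be illustrated with a small sketch. The definitions below are assumptions made for illustration, not the paper's own measures: alignment is taken as the share of an annotator's labels that match the LLM's label, sensitivity as the share of documents the annotator marks offensive, and document difficulty as the share of annotators who disagree with the majority label. The function name `confounder_summary` and the input layout are hypothetical.

```python
from collections import defaultdict


def confounder_summary(annotations, llm_labels):
    """Per-annotator alignment, sensitivity, and mean document difficulty.

    annotations: list of (annotator_id, doc_id, label) with label in {0, 1}
    llm_labels:  dict mapping doc_id -> 0 or 1
    All three measures are illustrative proxies (assumptions), not the
    definitions used in the paper.
    """
    by_annotator = defaultdict(list)
    by_doc = defaultdict(list)
    for annotator, doc, label in annotations:
        by_annotator[annotator].append((doc, label))
        by_doc[doc].append(label)

    # Assumed difficulty proxy: fraction of annotators in the minority
    # (0.0 = unanimous, 0.5 = evenly split).
    difficulty = {}
    for doc, labels in by_doc.items():
        positive_share = sum(labels) / len(labels)
        difficulty[doc] = min(positive_share, 1 - positive_share)

    summary = {}
    for annotator, items in by_annotator.items():
        n = len(items)
        summary[annotator] = {
            # Share of this annotator's labels that match the LLM's label.
            "alignment": sum(label == llm_labels[doc] for doc, label in items) / n,
            # Share of documents this annotator marked offensive.
            "sensitivity": sum(label for _, label in items) / n,
            # Average difficulty of the documents this annotator saw.
            "mean_difficulty": sum(difficulty[doc] for doc, _ in items) / n,
        }
    return summary


# Toy data, invented purely for illustration.
toy_annotations = [
    ("a", "d1", 1), ("a", "d2", 1),
    ("b", "d1", 1), ("b", "d2", 0),
]
toy_llm_labels = {"d1": 1, "d2": 1}
toy_summary = confounder_summary(toy_annotations, toy_llm_labels)
```

With per-annotator rows like these, demographic group membership could be added as a further column and compared against the confounders in a regression, which is one way to check whether a demographic effect survives once sensitivity and difficulty are controlled for.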
Taxonomy
Topics: Open Source Software Innovations
