Uncovering Hidden Violent Tendencies in LLMs: A Demographic Analysis via Behavioral Vignettes
Quintin Myers, Yanjun Gao

TL;DR
This study evaluates large language models' tendencies towards violence using social science vignettes and demographic prompts, revealing biases and inconsistencies with established human behavioral research.
Contribution
It introduces a novel methodology combining social science instruments and demographic prompting to analyze biases in LLMs' violent response tendencies.
Findings
LLMs' surface responses often differ from their internal preferences.
Violent tendencies in LLMs vary across demographic prompts.
Results sometimes contradict established social science findings.
Abstract
Large language models (LLMs) are increasingly proposed for detecting and responding to violent content online, yet their ability to reason about morally ambiguous, real-world scenarios remains underexamined. We present the first study to evaluate LLMs using a validated social science instrument designed to measure human response to everyday conflict, namely the Violent Behavior Vignette Questionnaire (VBVQ). To assess potential bias, we introduce persona-based prompting that varies race, age, and geographic identity within the United States. Six LLMs developed across different geopolitical and organizational contexts are evaluated under a unified zero-shot setting. Our study reveals two key findings: (1) LLMs' surface-level text generation often diverges from their internal preference for violent responses; (2) their violent tendencies vary across demographics, frequently contradicting…
Taxonomy
Topics: Cybercrime and Law Enforcement Studies · Digital Economy and Work Transformation · Artificial Intelligence in Law
