Persuasiveness and Bias in LLM: Investigating the Impact of Persuasiveness and Reinforcement of Bias in Language Models
Saumya Roy

TL;DR
This research investigates how Large Language Models can persuade users and unintentionally reinforce social biases, highlighting both their potential and risks in spreading misinformation and stereotypes.
Contribution
It introduces a framework to measure persuasion and bias amplification in LLMs, emphasizing safety evaluation and policy implications.
Findings
LLMs can effectively shape narratives and mirror audience values.
They can also unintentionally promote misinformation and reinforce stereotypes.
The study highlights the need for guardrails and policies to prevent misuse.
Abstract
Warning: This research studies AI persuasion and bias amplification that could be misused; all experiments are for safety evaluation. Large Language Models (LLMs) now generate convincing, human-like text and are widely used in content creation, decision support, and user interactions. Yet the same systems can spread information or misinformation at scale and reflect social biases that arise from data, architecture, or training choices. This work examines how persuasion and bias interact in LLMs, focusing on how imperfect or skewed outputs affect persuasive impact. Specifically, we test whether persona-based models can persuade with fact-based claims while also, unintentionally, promoting misinformation or biased narratives. We introduce a convincer-skeptic framework: LLMs adopt personas to simulate realistic attitudes. Skeptic models serve as human proxies; we compare their beliefs…
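As a rough illustration of the convincer-skeptic setup, the sketch below runs one round with stand-in functions instead of real models. It assumes the comparison is between the skeptic's beliefs before and after reading the convincer's argument, and it uses Jensen-Shannon divergence as one possible measure of the shift. Neither detail is confirmed by the text above, and all names (`belief_shift`, `toy_convincer`, `toy_skeptic`, the stance labels) are hypothetical.

```python
import math
from typing import Callable, Dict, Optional

# A belief is a probability distribution over stances (assumed representation).
Belief = Dict[str, float]


def js_divergence(p: Belief, q: Belief) -> float:
    """Jensen-Shannon divergence in bits: 0 for identical beliefs, 1 for disjoint ones."""
    keys = set(p) | set(q)
    # Mixture distribution halfway between p and q.
    m = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a: Belief) -> float:
        # KL(a || m); terms with zero probability contribute nothing.
        return sum(
            a[k] * math.log2(a[k] / m[k]) for k in keys if a.get(k, 0.0) > 0
        )

    return 0.5 * kl(p) + 0.5 * kl(q)


def belief_shift(
    claim: str,
    convincer: Callable[[str], str],
    skeptic: Callable[[str, Optional[str]], Belief],
) -> float:
    """Run one round: query the skeptic, show it the argument, query again."""
    before = skeptic(claim, None)
    argument = convincer(claim)
    after = skeptic(claim, argument)
    return js_divergence(before, after)


# Hypothetical stand-ins for persona-prompted LLMs; a real setup would call a model.
def toy_convincer(claim: str) -> str:
    return f"As someone who shares your values, consider: {claim}"


def toy_skeptic(claim: str, argument: Optional[str]) -> Belief:
    if argument is None:
        return {"agree": 0.2, "neutral": 0.3, "disagree": 0.5}
    return {"agree": 0.4, "neutral": 0.3, "disagree": 0.3}


if __name__ == "__main__":
    shift = belief_shift("Example claim", toy_convincer, toy_skeptic)
    print(f"belief shift (JS divergence, bits): {shift:.4f}")
```

A symmetric, bounded measure is convenient here because shifts can then be compared across claims and personas on the same 0 to 1 scale. Any other distance between belief distributions could be substituted.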
