Large language models can effectively convince people to believe conspiracies

Thomas H. Costello; Kellin Pelrine; Matthew Kowal; Antonio A. Arechar; Jean-François Godbout; Adam Gleave; David Rand; Gordon Pennycook

arXiv:2601.05050·cs.AI·January 12, 2026

Open Access

TL;DR

This study shows that large language models such as GPT-4o can persuade people both toward and away from conspiracy beliefs, and it examines how that influence can be mitigated.

Contribution

It shows that LLMs can promote false beliefs as effectively as true ones, even with safety guardrails in place, and that targeted prompts can mitigate this influence.

Findings

01

A jailbroken GPT-4o (guardrails removed) is as effective at increasing conspiracy belief as at decreasing it

02

Standard GPT-4o produces very similar effects despite its safety guardrails

03

Corrective prompts can reverse induced beliefs

Abstract

Large language models (LLMs) have been shown to be persuasive across a variety of contexts. But it remains unclear whether this persuasive power advantages truth over falsehood, or if LLMs can promote misbeliefs just as easily as refuting them. Here, we investigate this question across three pre-registered experiments in which participants (N = 2,724 Americans) discussed a conspiracy theory they were uncertain about with GPT-4o, and the model was instructed to either argue against ("debunking") or for ("bunking") that conspiracy. When using a "jailbroken" GPT-4o variant with guardrails removed, the AI was as effective at increasing conspiracy belief as decreasing it. Concerningly, the bunking AI was rated more positively, and increased trust in AI, more than the debunking AI. Surprisingly, we found that using standard GPT-4o produced very similar effects, such that the guardrails…

Taxonomy

Topics: Misinformation and Its Impacts · Psychology of Moral and Emotional Judgment · Explainable Artificial Intelligence (XAI)