Large language models can effectively convince people to believe conspiracies
Thomas H. Costello, Kellin Pelrine, Matthew Kowal, Antonio A. Arechar, Jean-François Godbout, Adam Gleave, David Rand, Gordon Pennycook

TL;DR
This study shows that large language models such as GPT-4o can persuade people both to believe and to disbelieve conspiracy theories, which bears on how persuasive these models are and on how their influence might be mitigated.
Contribution
The paper shows that LLMs can promote false beliefs as well as true ones, even with safety guardrails in place, and that targeted prompts can mitigate this influence.
Findings
A jailbroken GPT-4o with guardrails removed is as effective at increasing conspiracy belief as at decreasing it
Standard GPT-4o produces very similar effects despite its guardrails
Corrective prompts can reverse induced beliefs
Abstract
Large language models (LLMs) have been shown to be persuasive across a variety of contexts. But it remains unclear whether this persuasive power advantages truth over falsehood, or if LLMs can promote misbeliefs just as easily as refuting them. Here, we investigate this question across three pre-registered experiments in which participants (N = 2,724 Americans) discussed a conspiracy theory they were uncertain about with GPT-4o, and the model was instructed to either argue against ("debunking") or for ("bunking") that conspiracy. When using a "jailbroken" GPT-4o variant with guardrails removed, the AI was as effective at increasing conspiracy belief as decreasing it. Concerningly, the bunking AI was rated more positively, and increased trust in AI, more than the debunking AI. Surprisingly, we found that using standard GPT-4o produced very similar effects, such that the guardrails…
Taxonomy
Topics: Misinformation and Its Impacts · Psychology of Moral and Emotional Judgment · Explainable Artificial Intelligence (XAI)
