Lies, Damned Lies, and Distributional Language Statistics: Persuasion and Deception with Large Language Models

Cameron R. Jones; Benjamin K. Bergen

arXiv:2412.17128·cs.CL·December 24, 2024·6 cites

TL;DR

This paper reviews recent empirical and theoretical research on large language models' abilities to persuade and deceive, highlighting risks, mitigation strategies, and open questions for future investigation.

Contribution

It synthesizes current findings on LLMs' persuasive and deceptive capabilities, analyzes potential risks, and discusses mitigation approaches and future research directions.

Findings

01 Current persuasive effects are relatively small.

02 Mechanisms like fine-tuning and multimodality could increase impact.

03 Open questions include the evolution of persuasive AI and mitigation effectiveness.

Abstract

Large Language Models (LLMs) can generate content that is as persuasive as human-written text and appear capable of selectively producing deceptive outputs. These capabilities raise concerns about potential misuse and unintended consequences as these systems become more widely deployed. This review synthesizes recent empirical work examining LLMs' capacity and proclivity for persuasion and deception, analyzes theoretical risks that could arise from these capabilities, and evaluates proposed mitigations. While current persuasive effects are relatively small, various mechanisms could increase their impact, including fine-tuning, multimodality, and social factors. We outline key open questions for future research, including how persuasive AI systems might become, whether truth enjoys an inherent advantage over falsehoods, and how effective different mitigation strategies may be in practice.

Taxonomy

Topics: Hate Speech and Cyberbullying Detection